Document (#33621)

Author
Efron, M.
Title
Shannon meets Shortz : a probabilistic model of crossword puzzle difficulty
Source
Journal of the American Society for Information Science and Technology. 59(2008) no.6, pp. 875-886
Year
2008
Abstract
This article is concerned with the difficulty of crossword puzzles. A model is proposed that quantifies the difficulty of a puzzle P with respect to its clues. Given a clue-answer pair (c, a), we model the difficulty of guessing a based on c using the conditional probability P(a | c); easier mappings should enjoy a higher conditional probability. The model is tested by two experiments, each of which involves estimating the difficulty of puzzles taken from The New York Times. Additionally, we discuss how the notion of information implicit in our model relates to more easily quantifiable types of information that figure into crossword puzzles.

Similar documents (author)

  1. Efron, M.: Eigenvalue-based model selection during Latent Semantic Indexing (2005) 6.10
    6.0972233 = sum of:
      6.0972233 = weight(author_txt:efron in 4685) [ClassicSimilarity], result of:
        6.0972233 = fieldWeight in 4685, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.755557 = idf(docFreq=6, maxDocs=44421)
          0.625 = fieldNorm(doc=4685)
    
  2. Efron, M.: Query expansion and dimensionality reduction : Notions of optimality in Rocchio relevance feedback and latent semantic indexing (2008) 6.10
    6.0972233 = sum of:
      6.0972233 = weight(author_txt:efron in 3020) [ClassicSimilarity], result of:
        6.0972233 = fieldWeight in 3020, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.755557 = idf(docFreq=6, maxDocs=44421)
          0.625 = fieldNorm(doc=3020)
    
  3. Efron, M.: Linear time series models for term weighting in information retrieval (2010) 6.10
    6.0972233 = sum of:
      6.0972233 = weight(author_txt:efron in 675) [ClassicSimilarity], result of:
        6.0972233 = fieldWeight in 675, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.755557 = idf(docFreq=6, maxDocs=44421)
          0.625 = fieldNorm(doc=675)
    
  4. Efron, M.: Information search and retrieval in microblogs (2011) 6.10
    6.0972233 = sum of:
      6.0972233 = weight(author_txt:efron in 455) [ClassicSimilarity], result of:
        6.0972233 = fieldWeight in 455, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.755557 = idf(docFreq=6, maxDocs=44421)
          0.625 = fieldNorm(doc=455)
    
  5. Efron, M.; Winget, M.: Query polyrepresentation for ranking retrieval systems without relevance judgments (2010) 4.88
    4.8777785 = sum of:
      4.8777785 = weight(author_txt:efron in 456) [ClassicSimilarity], result of:
        4.8777785 = fieldWeight in 456, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.755557 = idf(docFreq=6, maxDocs=44421)
          0.5 = fieldNorm(doc=456)
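
The author-match breakdowns above follow the layout of Lucene's ClassicSimilarity (TF-IDF) explanations, where fieldWeight is the product of tf, idf, and fieldNorm. The following is a minimal sketch of that arithmetic, not the catalog's actual code. The function names are hypothetical, and the tf and idf formulas (square root of the term frequency, and 1 + ln(maxDocs / (docFreq + 1))) are assumptions chosen because they agree with the figures listed.

```python
import math

def tf(freq):
    # Assumed classic TF-IDF term-frequency factor: sqrt(termFreq).
    return math.sqrt(freq)

def idf(doc_freq, max_docs):
    # Assumed inverse document frequency; consistent with
    # idf(docFreq=6, maxDocs=44421) = 9.755557 in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in the explain trees.
    return tf(freq) * idf(doc_freq, max_docs) * field_norm

# Entry 1 (doc 4685, fieldNorm 0.625) and entry 5 (doc 456, fieldNorm 0.5).
score_4685 = field_weight(1.0, 6, 44421, 0.625)
score_456 = field_weight(1.0, 6, 44421, 0.5)
print(score_4685, score_456)
```

Under these assumptions, the only factor that separates entries 1 to 4 from entry 5 is fieldNorm, which is lower for the two-author record.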
    

Similar documents (content)

  1. Bodoff, D.; Robertson, S.: A new unified probabilistic model (2004) 0.09
    0.0887019 = sum of:
      0.0887019 = product of:
        0.55438685 = sum of:
          0.083698764 = weight(abstract_txt:probabilistic in 3129) [ClassicSimilarity], result of:
            0.083698764 = score(doc=3129,freq=3.0), product of:
              0.09140556 = queryWeight, product of:
                1.1180505 = boost
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.01208135 = queryNorm
              0.91568565 = fieldWeight in 3129, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.078125 = fieldNorm(doc=3129)
          0.10162277 = weight(abstract_txt:probability in 3129) [ClassicSimilarity], result of:
            0.10162277 = score(doc=3129,freq=1.0), product of:
              0.18903303 = queryWeight, product of:
                2.273835 = boost
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.01208135 = queryNorm
              0.53759265 = fieldWeight in 3129, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.078125 = fieldNorm(doc=3129)
          0.14778218 = weight(abstract_txt:model in 3129) [ClassicSimilarity], result of:
            0.14778218 = score(doc=3129,freq=9.0), product of:
              0.1583158 = queryWeight, product of:
                3.2901993 = boost
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.01208135 = queryNorm
              0.9334645 = fieldWeight in 3129, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.078125 = fieldNorm(doc=3129)
          0.22128317 = weight(abstract_txt:difficulty in 3129) [ClassicSimilarity], result of:
            0.22128317 = score(doc=3129,freq=1.0), product of:
              0.43101192 = queryWeight, product of:
                5.428811 = boost
                6.571569 = idf(docFreq=168, maxDocs=44421)
                0.01208135 = queryNorm
              0.51340383 = fieldWeight in 3129, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.571569 = idf(docFreq=168, maxDocs=44421)
                0.078125 = fieldNorm(doc=3129)
        0.16 = coord(4/25)
    
  2. Bruza, P.D.; Huibers, T.W.C.: A study of aboutness in information retrieval (1996) 0.09
    0.08653411 = sum of:
      0.08653411 = product of:
        0.43267056 = sum of:
          0.012070442 = weight(abstract_txt:based in 774) [ClassicSimilarity], result of:
            0.012070442 = score(doc=774,freq=1.0), product of:
              0.040448736 = queryWeight, product of:
                1.0518228 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.01208135 = queryNorm
              0.2984133 = fieldWeight in 774, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.09375 = fieldNorm(doc=774)
          0.057988204 = weight(abstract_txt:probabilistic in 774) [ClassicSimilarity], result of:
            0.057988204 = score(doc=774,freq=1.0), product of:
              0.09140556 = queryWeight, product of:
                1.1180505 = boost
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.01208135 = queryNorm
              0.6344056 = fieldWeight in 774, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.09375 = fieldNorm(doc=774)
          0.07227975 = weight(abstract_txt:relates in 774) [ClassicSimilarity], result of:
            0.07227975 = score(doc=774,freq=1.0), product of:
              0.105866194 = queryWeight, product of:
                1.2032441 = boost
                7.282627 = idf(docFreq=82, maxDocs=44421)
                0.01208135 = queryNorm
              0.6827463 = fieldWeight in 774, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.282627 = idf(docFreq=82, maxDocs=44421)
                0.09375 = fieldNorm(doc=774)
          0.20673394 = weight(abstract_txt:conditional in 774) [ClassicSimilarity], result of:
            0.20673394 = score(doc=774,freq=1.0), product of:
              0.26875964 = queryWeight, product of:
                2.711266 = boost
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.01208135 = queryNorm
              0.769215 = fieldWeight in 774, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.09375 = fieldNorm(doc=774)
          0.083598234 = weight(abstract_txt:model in 774) [ClassicSimilarity], result of:
            0.083598234 = score(doc=774,freq=2.0), product of:
              0.1583158 = queryWeight, product of:
                3.2901993 = boost
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.01208135 = queryNorm
              0.5280473 = fieldWeight in 774, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.09375 = fieldNorm(doc=774)
        0.2 = coord(5/25)
    
  3. Liu, X.; Croft, W.B.: Statistical language modeling for information retrieval (2004) 0.08
    0.083809495 = sum of:
      0.083809495 = product of:
        0.34920624 = sum of:
          0.02654703 = weight(abstract_txt:involves in 5277) [ClassicSimilarity], result of:
            0.02654703 = score(doc=5277,freq=1.0), product of:
              0.07777003 = queryWeight, product of:
                1.031291 = boost
                6.2418823 = idf(docFreq=234, maxDocs=44421)
                0.01208135 = queryNorm
              0.34135294 = fieldWeight in 5277, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2418823 = idf(docFreq=234, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5277)
          0.03382645 = weight(abstract_txt:probabilistic in 5277) [ClassicSimilarity], result of:
            0.03382645 = score(doc=5277,freq=1.0), product of:
              0.09140556 = queryWeight, product of:
                1.1180505 = boost
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.01208135 = queryNorm
              0.37006995 = fieldWeight in 5277, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5277)
          0.08094997 = weight(abstract_txt:estimating in 5277) [ClassicSimilarity], result of:
            0.08094997 = score(doc=5277,freq=2.0), product of:
              0.12979844 = queryWeight, product of:
                1.3323239 = boost
                8.063882 = idf(docFreq=37, maxDocs=44421)
                0.01208135 = queryNorm
              0.623659 = fieldWeight in 5277, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.063882 = idf(docFreq=37, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5277)
          0.059641637 = weight(abstract_txt:shannon in 5277) [ClassicSimilarity], result of:
            0.059641637 = score(doc=5277,freq=1.0), product of:
              0.13340375 = queryWeight, product of:
                1.3507006 = boost
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.01208135 = queryNorm
              0.44707617 = fieldWeight in 5277, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5277)
          0.07113594 = weight(abstract_txt:probability in 5277) [ClassicSimilarity], result of:
            0.07113594 = score(doc=5277,freq=1.0), product of:
              0.18903303 = queryWeight, product of:
                2.273835 = boost
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.01208135 = queryNorm
              0.37631485 = fieldWeight in 5277, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5277)
          0.07710523 = weight(abstract_txt:model in 5277) [ClassicSimilarity], result of:
            0.07710523 = score(doc=5277,freq=5.0), product of:
              0.1583158 = queryWeight, product of:
                3.2901993 = boost
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.01208135 = queryNorm
              0.48703438 = fieldWeight in 5277, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5277)
        0.24 = coord(6/25)
    
  4. Dominich, S.; Góth, J.; Kiezer, T.; Szlávik, Z.: An entropy-based interpretation of retrieval status value-based retrieval, and its application to the computation of term and query discrimination value (2004) 0.07
    0.073540024 = sum of:
      0.073540024 = product of:
        0.30641678 = sum of:
          0.017247079 = weight(abstract_txt:based in 3237) [ClassicSimilarity], result of:
            0.017247079 = score(doc=3237,freq=6.0), product of:
              0.040448736 = queryWeight, product of:
                1.0518228 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.01208135 = queryNorm
              0.42639354 = fieldWeight in 3237, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3237)
          0.030896278 = weight(abstract_txt:easier in 3237) [ClassicSimilarity], result of:
            0.030896278 = score(doc=3237,freq=1.0), product of:
              0.08604767 = queryWeight, product of:
                1.0847875 = boost
                6.565669 = idf(docFreq=169, maxDocs=44421)
                0.01208135 = queryNorm
              0.35906002 = fieldWeight in 3237, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.565669 = idf(docFreq=169, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3237)
          0.03382645 = weight(abstract_txt:probabilistic in 3237) [ClassicSimilarity], result of:
            0.03382645 = score(doc=3237,freq=1.0), product of:
              0.09140556 = queryWeight, product of:
                1.1180505 = boost
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.01208135 = queryNorm
              0.37006995 = fieldWeight in 3237, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3237)
          0.08434601 = weight(abstract_txt:shannon in 3237) [ClassicSimilarity], result of:
            0.08434601 = score(doc=3237,freq=2.0), product of:
              0.13340375 = queryWeight, product of:
                1.3507006 = boost
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.01208135 = queryNorm
              0.63226116 = fieldWeight in 3237, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3237)
          0.07113594 = weight(abstract_txt:probability in 3237) [ClassicSimilarity], result of:
            0.07113594 = score(doc=3237,freq=1.0), product of:
              0.18903303 = queryWeight, product of:
                2.273835 = boost
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.01208135 = queryNorm
              0.37631485 = fieldWeight in 3237, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3237)
          0.06896502 = weight(abstract_txt:model in 3237) [ClassicSimilarity], result of:
            0.06896502 = score(doc=3237,freq=4.0), product of:
              0.1583158 = queryWeight, product of:
                3.2901993 = boost
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.01208135 = queryNorm
              0.4356168 = fieldWeight in 3237, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3237)
        0.24 = coord(6/25)
    
  5. Torvik, V.I.; Weeber, M.; Swanson, D.R.; Smalheiser, N.R.: A probabilistic similarity metric for Medline records : a model for author name disambiguation (2005) 0.07
    0.06708106 = sum of:
      0.06708106 = product of:
        0.3354053 = sum of:
          0.00804696 = weight(abstract_txt:based in 4308) [ClassicSimilarity], result of:
            0.00804696 = score(doc=4308,freq=1.0), product of:
              0.040448736 = queryWeight, product of:
                1.0518228 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.01208135 = queryNorm
              0.1989422 = fieldWeight in 4308, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0625 = fieldNorm(doc=4308)
          0.07871018 = weight(abstract_txt:pair in 4308) [ClassicSimilarity], result of:
            0.07871018 = score(doc=4308,freq=2.0), product of:
              0.116542496 = queryWeight, product of:
                1.2624589 = boost
                7.6410246 = idf(docFreq=57, maxDocs=44421)
                0.01208135 = queryNorm
              0.67537755 = fieldWeight in 4308, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.6410246 = idf(docFreq=57, maxDocs=44421)
                0.0625 = fieldNorm(doc=4308)
          0.06541745 = weight(abstract_txt:estimating in 4308) [ClassicSimilarity], result of:
            0.06541745 = score(doc=4308,freq=1.0), product of:
              0.12979844 = queryWeight, product of:
                1.3323239 = boost
                8.063882 = idf(docFreq=37, maxDocs=44421)
                0.01208135 = queryNorm
              0.5039926 = fieldWeight in 4308, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.063882 = idf(docFreq=37, maxDocs=44421)
                0.0625 = fieldNorm(doc=4308)
          0.11497304 = weight(abstract_txt:probability in 4308) [ClassicSimilarity], result of:
            0.11497304 = score(doc=4308,freq=2.0), product of:
              0.18903303 = queryWeight, product of:
                2.273835 = boost
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.01208135 = queryNorm
              0.60821664 = fieldWeight in 4308, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.0625 = fieldNorm(doc=4308)
          0.06825767 = weight(abstract_txt:model in 4308) [ClassicSimilarity], result of:
            0.06825767 = score(doc=4308,freq=3.0), product of:
              0.1583158 = queryWeight, product of:
                3.2901993 = boost
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.01208135 = queryNorm
              0.4311488 = fieldWeight in 4308, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.0625 = fieldNorm(doc=4308)
        0.2 = coord(5/25)
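
The content-similarity breakdowns above combine per-term scores (queryWeight times fieldWeight), sum them, and scale the sum by the coord factor (matched query terms over total query terms). As a hedged sketch of that arithmetic for the first content hit (doc 3129), the code below copies the boosts, frequencies, queryNorm, fieldNorm, and coord from the listing. The function names are hypothetical, and the tf and idf formulas are assumptions that happen to agree with the listed values.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.01208135  # queryNorm as listed in the explain output

def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed idf formula: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(boost, freq, doc_freq, field_norm):
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM             # queryWeight
    field_weight = math.sqrt(freq) * term_idf * field_norm   # fieldWeight
    return query_weight * field_weight

# (boost, termFreq, docFreq) for the four query terms matched in doc 3129.
terms_3129 = [
    (1.1180505, 3.0, 138),   # probabilistic
    (2.273835, 1.0, 123),    # probability
    (3.2901993, 9.0, 2249),  # model
    (5.428811, 1.0, 168),    # difficulty
]
FIELD_NORM_3129 = 0.078125
COORD_3129 = 4 / 25  # 4 of 25 query terms matched

raw_sum = sum(term_score(b, f, df, FIELD_NORM_3129) for b, f, df in terms_3129)
score_3129 = raw_sum * COORD_3129
print(f"{score_3129:.4f}")
```

The same function can be applied to the other four hits by substituting their listed boosts, frequencies, fieldNorm, and coord values.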