Document (#9896)

Author
Cox, K.
Title
¬An experiment to test the utility of repeating phrases for information retrieval systems
Source
Journal of information science. 20(1994) no.5, pp.348-355
Year
1994
Abstract
Describes a method of evaluating the utility of repeating phrases for information retrieval systems and of calculating recall and precision. The method compares 2 techniques by asking people to perform tasks and then examining the outcomes of those tasks. Shows that people found those phrases automatically generated by finding repeating content words in documents to be good content indicators and discriminators, and better than words alone or phrases generated randomly from content words.

Similar documents (content)

  1. Fagan, J.L.: ¬The effectiveness of a nonsyntactic approach to automatic phrase indexing for document retrieval (1989) 0.12
    0.1245931 = sum of:
      0.1245931 = product of:
        0.5191379 = sum of:
          0.0318838 = weight(abstract_txt:indicators in 2845) [ClassicSimilarity], result of:
            0.0318838 = score(doc=2845,freq=1.0), product of:
              0.0846265 = queryWeight, product of:
                1.0775232 = boost
                6.0281444 = idf(docFreq=290, maxDocs=44421)
                0.013028551 = queryNorm
              0.37675902 = fieldWeight in 2845, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0281444 = idf(docFreq=290, maxDocs=44421)
                0.0625 = fieldNorm(doc=2845)
          0.016340906 = weight(abstract_txt:systems in 2845) [ClassicSimilarity], result of:
            0.016340906 = score(doc=2845,freq=2.0), product of:
              0.05419723 = queryWeight, product of:
                1.2194865 = boost
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.013028551 = queryNorm
              0.30150813 = fieldWeight in 2845, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.0625 = fieldNorm(doc=2845)
          0.021185389 = weight(abstract_txt:retrieval in 2845) [ClassicSimilarity], result of:
            0.021185389 = score(doc=2845,freq=3.0), product of:
              0.056292895 = queryWeight, product of:
                1.24284 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.013028551 = queryNorm
              0.37634215 = fieldWeight in 2845, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=2845)
          0.055221453 = weight(abstract_txt:content in 2845) [ClassicSimilarity], result of:
            0.055221453 = score(doc=2845,freq=3.0), product of:
              0.12204826 = queryWeight, product of:
                2.2412994 = boost
                4.1796083 = idf(docFreq=1847, maxDocs=44421)
                0.013028551 = queryNorm
              0.45245588 = fieldWeight in 2845, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1796083 = idf(docFreq=1847, maxDocs=44421)
                0.0625 = fieldNorm(doc=2845)
          0.06708432 = weight(abstract_txt:words in 2845) [ClassicSimilarity], result of:
            0.06708432 = score(doc=2845,freq=1.0), product of:
              0.20040756 = queryWeight, product of:
                2.8720443 = boost
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.013028551 = queryNorm
              0.33473945 = fieldWeight in 2845, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.0625 = fieldNorm(doc=2845)
          0.32742205 = weight(abstract_txt:phrases in 2845) [ClassicSimilarity], result of:
            0.32742205 = score(doc=2845,freq=3.0), product of:
              0.4400593 = queryWeight, product of:
                4.9142704 = boost
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.013028551 = queryNorm
              0.7440407 = fieldWeight in 2845, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.0625 = fieldNorm(doc=2845)
        0.24 = coord(6/25)
    
  2. Kim, W.; Wilbur, W.J.: Corpus-based statistical screening for content-bearing terms (2001) 0.11
    0.11078997 = sum of:
      0.11078997 = product of:
        0.55394983 = sum of:
          0.020762082 = weight(abstract_txt:recall in 188) [ClassicSimilarity], result of:
            0.020762082 = score(doc=188,freq=1.0), product of:
              0.07701928 = queryWeight, product of:
                1.0279528 = boost
                5.750825 = idf(docFreq=383, maxDocs=44421)
                0.013028551 = queryNorm
              0.26956993 = fieldWeight in 188, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.750825 = idf(docFreq=383, maxDocs=44421)
                0.046875 = fieldNorm(doc=188)
          0.017465679 = weight(abstract_txt:those in 188) [ClassicSimilarity], result of:
            0.017465679 = score(doc=188,freq=1.0), product of:
              0.08647406 = queryWeight, product of:
                1.5403924 = boost
                4.3088202 = idf(docFreq=1623, maxDocs=44421)
                0.013028551 = queryNorm
              0.20197594 = fieldWeight in 188, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3088202 = idf(docFreq=1623, maxDocs=44421)
                0.046875 = fieldNorm(doc=188)
          0.05346794 = weight(abstract_txt:content in 188) [ClassicSimilarity], result of:
            0.05346794 = score(doc=188,freq=5.0), product of:
              0.12204826 = queryWeight, product of:
                2.2412994 = boost
                4.1796083 = idf(docFreq=1847, maxDocs=44421)
                0.013028551 = queryNorm
              0.4380885 = fieldWeight in 188, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.1796083 = idf(docFreq=1847, maxDocs=44421)
                0.046875 = fieldNorm(doc=188)
          0.08714508 = weight(abstract_txt:words in 188) [ClassicSimilarity], result of:
            0.08714508 = score(doc=188,freq=3.0), product of:
              0.20040756 = queryWeight, product of:
                2.8720443 = boost
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.013028551 = queryNorm
              0.43483928 = fieldWeight in 188, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.046875 = fieldNorm(doc=188)
          0.37510905 = weight(abstract_txt:phrases in 188) [ClassicSimilarity], result of:
            0.37510905 = score(doc=188,freq=7.0), product of:
              0.4400593 = queryWeight, product of:
                4.9142704 = boost
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.013028551 = queryNorm
              0.85240567 = fieldWeight in 188, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.046875 = fieldNorm(doc=188)
        0.2 = coord(5/25)
    
  3. Lin, X.: Searching and browsing on map displays (1995) 0.11
    0.10743056 = sum of:
      0.10743056 = product of:
        0.38368058 = sum of:
          0.043693244 = weight(abstract_txt:compares in 3920) [ClassicSimilarity], result of:
            0.043693244 = score(doc=3920,freq=1.0), product of:
              0.07967861 = queryWeight, product of:
                1.0455488 = boost
                5.849265 = idf(docFreq=347, maxDocs=44421)
                0.013028551 = queryNorm
              0.5483686 = fieldWeight in 3920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.849265 = idf(docFreq=347, maxDocs=44421)
                0.09375 = fieldNorm(doc=3920)
          0.050180957 = weight(abstract_txt:perform in 3920) [ClassicSimilarity], result of:
            0.050180957 = score(doc=3920,freq=1.0), product of:
              0.08738257 = queryWeight, product of:
                1.0949287 = boost
                6.1255183 = idf(docFreq=263, maxDocs=44421)
                0.013028551 = queryNorm
              0.5742673 = fieldWeight in 3920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1255183 = idf(docFreq=263, maxDocs=44421)
                0.09375 = fieldNorm(doc=3920)
          0.018347086 = weight(abstract_txt:retrieval in 3920) [ClassicSimilarity], result of:
            0.018347086 = score(doc=3920,freq=1.0), product of:
              0.056292895 = queryWeight, product of:
                1.24284 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.013028551 = queryNorm
              0.3259219 = fieldWeight in 3920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.09375 = fieldNorm(doc=3920)
          0.08696208 = weight(abstract_txt:randomly in 3920) [ClassicSimilarity], result of:
            0.08696208 = score(doc=3920,freq=1.0), product of:
              0.12607205 = queryWeight, product of:
                1.3151729 = boost
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.013028551 = queryNorm
              0.68978083 = fieldWeight in 3920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.09375 = fieldNorm(doc=3920)
          0.05173071 = weight(abstract_txt:people in 3920) [ClassicSimilarity], result of:
            0.05173071 = score(doc=3920,freq=1.0), product of:
              0.11235037 = queryWeight, product of:
                1.7558026 = boost
                4.9113703 = idf(docFreq=888, maxDocs=44421)
                0.013028551 = queryNorm
              0.46044096 = fieldWeight in 3920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9113703 = idf(docFreq=888, maxDocs=44421)
                0.09375 = fieldNorm(doc=3920)
          0.05975629 = weight(abstract_txt:tasks in 3920) [ClassicSimilarity], result of:
            0.05975629 = score(doc=3920,freq=1.0), product of:
              0.123689055 = queryWeight, product of:
                1.8422734 = boost
                5.1532483 = idf(docFreq=697, maxDocs=44421)
                0.013028551 = queryNorm
              0.48311704 = fieldWeight in 3920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1532483 = idf(docFreq=697, maxDocs=44421)
                0.09375 = fieldNorm(doc=3920)
          0.073010206 = weight(abstract_txt:generated in 3920) [ClassicSimilarity], result of:
            0.073010206 = score(doc=3920,freq=1.0), product of:
              0.14136153 = queryWeight, product of:
                1.9694914 = boost
                5.509105 = idf(docFreq=488, maxDocs=44421)
                0.013028551 = queryNorm
              0.5164786 = fieldWeight in 3920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.509105 = idf(docFreq=488, maxDocs=44421)
                0.09375 = fieldNorm(doc=3920)
        0.28 = coord(7/25)
    
  4. Sanderson, M.; Lawrie, D.: Building, testing, and applying concept hierarchies (2000) 0.09
    0.090763144 = sum of:
      0.090763144 = product of:
        0.45381573 = sum of:
          0.032969337 = weight(abstract_txt:experiment in 1037) [ClassicSimilarity], result of:
            0.032969337 = score(doc=1037,freq=1.0), product of:
              0.074574985 = queryWeight, product of:
                1.0115097 = boost
                5.658835 = idf(docFreq=420, maxDocs=44421)
                0.013028551 = queryNorm
              0.44209647 = fieldWeight in 1037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.658835 = idf(docFreq=420, maxDocs=44421)
                0.078125 = fieldNorm(doc=1037)
          0.060841843 = weight(abstract_txt:generated in 1037) [ClassicSimilarity], result of:
            0.060841843 = score(doc=1037,freq=1.0), product of:
              0.14136153 = queryWeight, product of:
                1.9694914 = boost
                5.509105 = idf(docFreq=488, maxDocs=44421)
                0.013028551 = queryNorm
              0.43039885 = fieldWeight in 1037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.509105 = idf(docFreq=488, maxDocs=44421)
                0.078125 = fieldNorm(doc=1037)
          0.03985265 = weight(abstract_txt:content in 1037) [ClassicSimilarity], result of:
            0.03985265 = score(doc=1037,freq=1.0), product of:
              0.12204826 = queryWeight, product of:
                2.2412994 = boost
                4.1796083 = idf(docFreq=1847, maxDocs=44421)
                0.013028551 = queryNorm
              0.3265319 = fieldWeight in 1037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1796083 = idf(docFreq=1847, maxDocs=44421)
                0.078125 = fieldNorm(doc=1037)
          0.0838554 = weight(abstract_txt:words in 1037) [ClassicSimilarity], result of:
            0.0838554 = score(doc=1037,freq=1.0), product of:
              0.20040756 = queryWeight, product of:
                2.8720443 = boost
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.013028551 = queryNorm
              0.4184243 = fieldWeight in 1037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.078125 = fieldNorm(doc=1037)
          0.2362965 = weight(abstract_txt:phrases in 1037) [ClassicSimilarity], result of:
            0.2362965 = score(doc=1037,freq=1.0), product of:
              0.4400593 = queryWeight, product of:
                4.9142704 = boost
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.013028551 = queryNorm
              0.53696513 = fieldWeight in 1037, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.078125 = fieldNorm(doc=1037)
        0.2 = coord(5/25)
    
  5. Pirkola, A.; Jarvelin, K.: ¬The effect of anaphor and ellipsis resolution on proximity searching in a text database (1995) 0.09
    0.086110674 = sum of:
      0.086110674 = product of:
        0.53819174 = sum of:
          0.047947973 = weight(abstract_txt:recall in 4156) [ClassicSimilarity], result of:
            0.047947973 = score(doc=4156,freq=3.0), product of:
              0.07701928 = queryWeight, product of:
                1.0279528 = boost
                5.750825 = idf(docFreq=383, maxDocs=44421)
                0.013028551 = queryNorm
              0.62254506 = fieldWeight in 4156, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.750825 = idf(docFreq=383, maxDocs=44421)
                0.0625 = fieldNorm(doc=4156)
          0.017297799 = weight(abstract_txt:retrieval in 4156) [ClassicSimilarity], result of:
            0.017297799 = score(doc=4156,freq=2.0), product of:
              0.056292895 = queryWeight, product of:
                1.24284 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.013028551 = queryNorm
              0.3072821 = fieldWeight in 4156, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=4156)
          0.09487155 = weight(abstract_txt:words in 4156) [ClassicSimilarity], result of:
            0.09487155 = score(doc=4156,freq=2.0), product of:
              0.20040756 = queryWeight, product of:
                2.8720443 = boost
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.013028551 = queryNorm
              0.47339305 = fieldWeight in 4156, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.0625 = fieldNorm(doc=4156)
          0.3780744 = weight(abstract_txt:phrases in 4156) [ClassicSimilarity], result of:
            0.3780744 = score(doc=4156,freq=4.0), product of:
              0.4400593 = queryWeight, product of:
                4.9142704 = boost
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.013028551 = queryNorm
              0.8591442 = fieldWeight in 4156, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.0625 = fieldNorm(doc=4156)
        0.16 = coord(4/25)
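
How the scores above are composed can be checked by hand. The following is a minimal sketch, not the database's own code: it assumes the standard Lucene ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the values listed in the trees. The function and variable names (`idf`, `tf`, `term_score`, `terms_doc_2845`) are hypothetical and chosen only for illustration. All input numbers are copied from the tree for result 1 (doc 2845).

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as listed in the trees,
    # e.g. idf(docFreq=290, maxDocs=44421) is about 6.028.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Term frequency factor: square root of the raw count.
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # One "weight(abstract_txt:term in doc)" node:
    # queryWeight (boost * idf * queryNorm) times
    # fieldWeight (tf * idf * fieldNorm).
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight

MAX_DOCS = 44421
QUERY_NORM = 0.013028551
FIELD_NORM_2845 = 0.0625

# term -> (freq, docFreq, boost), copied from the tree for doc 2845
terms_doc_2845 = {
    "indicators": (1.0, 290, 1.0775232),
    "systems": (2.0, 3984, 1.2194865),
    "retrieval": (3.0, 3732, 1.24284),
    "content": (3.0, 1847, 2.2412994),
    "words": (1.0, 569, 2.8720443),
    "phrases": (3.0, 124, 4.9142704),
}

total = sum(
    term_score(freq, doc_freq, MAX_DOCS, boost, QUERY_NORM, FIELD_NORM_2845)
    for freq, doc_freq, boost in terms_doc_2845.values()
)

# coord(6/25): 6 of the 25 query terms matched this document.
coord = 6 / 25
score = coord * total
print(round(score, 4))
```

The result should agree with the listed 0.1245931 up to small rounding differences, since Lucene computes in single precision while Python uses double precision. The same calculation applies to results 2 to 5 after swapping in each tree's freq, docFreq, boost, fieldNorm and coord values.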