Document (#16041)

Author
Sanderson, M.
Title
The Reuters test collection
Source
Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
Imprint
London : Taylor Graham
Year
1996
Pages
pp. 219-227
Abstract
Describes the Reuters test collection, which at 22,173 references is significantly larger than most traditional test collections. In addition, Reuters has none of the recall calculation problems normally associated with some of the larger test collections available. Explains the method derived by D.D. Lewis to perform retrieval experiments on the Reuters collection and illustrates the use of the collection with some simple retrieval experiments that compare the performance of stemming algorithms.
Theme
Retrieval studies (Retrievalstudien)

Similar documents (author)

  1. Sanderson, M.: Revisiting h measured on UK LIS and IR academics (2008) 5.35
    5.353733 = sum of:
      5.353733 = weight(author_txt:sanderson in 2867) [ClassicSimilarity], result of:
        5.353733 = fieldWeight in 2867, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.565973 = idf(docFreq=22, maxDocs=44421)
          0.625 = fieldNorm(doc=2867)
    
  2. Purves, R.S.; Sanderson, M.: A methodology to allow avalanche forecasting on an information retrieval system (1998) 4.28
    4.2829866 = sum of:
      4.2829866 = weight(author_txt:sanderson in 2073) [ClassicSimilarity], result of:
        4.2829866 = fieldWeight in 2073, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.565973 = idf(docFreq=22, maxDocs=44421)
          0.5 = fieldNorm(doc=2073)
    
  3. Sanderson, M.; Ruthven, I.: Report on the Glasgow IR group (glair4) submission (1997) 4.28
    4.2829866 = sum of:
      4.2829866 = weight(author_txt:sanderson in 4088) [ClassicSimilarity], result of:
        4.2829866 = fieldWeight in 4088, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.565973 = idf(docFreq=22, maxDocs=44421)
          0.5 = fieldNorm(doc=4088)
    
  4. Sanderson, M.; Lawrie, D.: Building, testing, and applying concept hierarchies (2000) 4.28
    4.2829866 = sum of:
      4.2829866 = weight(author_txt:sanderson in 1037) [ClassicSimilarity], result of:
        4.2829866 = fieldWeight in 1037, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.565973 = idf(docFreq=22, maxDocs=44421)
          0.5 = fieldNorm(doc=1037)
    
  5. Clough, P.; Sanderson, M.: User experiments with the Eurovision Cross-Language Image Retrieval System (2006) 4.28
    4.2829866 = sum of:
      4.2829866 = weight(author_txt:sanderson in 52) [ClassicSimilarity], result of:
        4.2829866 = fieldWeight in 52, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.565973 = idf(docFreq=22, maxDocs=44421)
          0.5 = fieldNorm(doc=52)
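
The author scores above can be checked by hand. The following is a minimal sketch, assuming the classic TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which appear to reproduce the values shown; the function names are illustrative and not part of any library. The fieldNorm values (0.625, 0.5) are copied from the listing rather than derived.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed classic form: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Assumed classic form: square root of the raw term frequency.
    return math.sqrt(freq)

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as laid out in the listing.
    return tf(freq) * idf(doc_freq, max_docs) * field_norm

# Values copied from the listing: docFreq=22, maxDocs=44421.
author_idf = idf(22, 44421)
top_hit = field_weight(1.0, 22, 44421, 0.625)     # doc 2867
other_hits = field_weight(1.0, 22, 44421, 0.5)    # docs 2073, 4088, 1037, 52

print(author_idf, top_hit, other_hits)
```

Because tf and idf are identical for all five hits, the ranking in this list is decided by fieldNorm alone. Small differences in the last decimal places are to be expected, since this sketch uses double precision.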
    

Similar documents (content)

  1. Debole, F.; Sebastiani, F.: An analysis of the relative hardness of Reuters-21578 subsets (2005) 0.17
    0.16758326 = sum of:
      0.16758326 = product of:
        0.83791625 = sum of:
          0.028974032 = weight(abstract_txt:compare in 4456) [ClassicSimilarity], result of:
            0.028974032 = score(doc=4456,freq=1.0), product of:
              0.08263306 = queryWeight, product of:
                1.1037688 = boost
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.01334445 = queryNorm
              0.35063484 = fieldWeight in 4456, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.0625 = fieldNorm(doc=4456)
          0.013789243 = weight(abstract_txt:retrieval in 4456) [ClassicSimilarity], result of:
            0.013789243 = score(doc=4456,freq=1.0), product of:
              0.063462645 = queryWeight, product of:
                1.3679659 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01334445 = queryNorm
              0.21728125 = fieldWeight in 4456, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=4456)
          0.09330584 = weight(abstract_txt:collection in 4456) [ClassicSimilarity], result of:
            0.09330584 = score(doc=4456,freq=2.0), product of:
              0.22703725 = queryWeight, product of:
                3.659146 = boost
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.01334445 = queryNorm
              0.41097152 = fieldWeight in 4456, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.0625 = fieldNorm(doc=4456)
          0.08433202 = weight(abstract_txt:test in 4456) [ClassicSimilarity], result of:
            0.08433202 = score(doc=4456,freq=1.0), product of:
              0.26740092 = queryWeight, product of:
                3.9711165 = boost
                5.046027 = idf(docFreq=776, maxDocs=44421)
                0.01334445 = queryNorm
              0.3153767 = fieldWeight in 4456, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.046027 = idf(docFreq=776, maxDocs=44421)
                0.0625 = fieldNorm(doc=4456)
          0.6175151 = weight(abstract_txt:reuters in 4456) [ClassicSimilarity], result of:
            0.6175151 = score(doc=4456,freq=3.0), product of:
              0.75311714 = queryWeight, product of:
                7.45105 = boost
                7.574333 = idf(docFreq=61, maxDocs=44421)
                0.01334445 = queryNorm
              0.81994563 = fieldWeight in 4456, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.574333 = idf(docFreq=61, maxDocs=44421)
                0.0625 = fieldNorm(doc=4456)
        0.2 = coord(5/25)
    
  2. Frei, H.P.; Stieger, D.: The use of semantic links in hypertext information retrieval (1995) 0.16
    0.16392116 = sum of:
      0.16392116 = product of:
        0.5854327 = sum of:
          0.044503856 = weight(abstract_txt:addition in 1237) [ClassicSimilarity], result of:
            0.044503856 = score(doc=1237,freq=1.0), product of:
              0.0692989 = queryWeight, product of:
                1.010798 = boost
                5.137612 = idf(docFreq=708, maxDocs=44421)
                0.01334445 = queryNorm
              0.6422015 = fieldWeight in 1237, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.137612 = idf(docFreq=708, maxDocs=44421)
                0.125 = fieldNorm(doc=1237)
          0.06077855 = weight(abstract_txt:algorithms in 1237) [ClassicSimilarity], result of:
            0.06077855 = score(doc=1237,freq=1.0), product of:
              0.08530244 = queryWeight, product of:
                1.1214552 = boost
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.01334445 = queryNorm
              0.7125066 = fieldWeight in 1237, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.125 = fieldNorm(doc=1237)
          0.047767337 = weight(abstract_txt:retrieval in 1237) [ClassicSimilarity], result of:
            0.047767337 = score(doc=1237,freq=3.0), product of:
              0.063462645 = queryWeight, product of:
                1.3679659 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01334445 = queryNorm
              0.7526843 = fieldWeight in 1237, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.125 = fieldNorm(doc=1237)
          0.032672405 = weight(abstract_txt:some in 1237) [ClassicSimilarity], result of:
            0.032672405 = score(doc=1237,freq=1.0), product of:
              0.07105456 = queryWeight, product of:
                1.4474787 = boost
                3.6785707 = idf(docFreq=3049, maxDocs=44421)
                0.01334445 = queryNorm
              0.45982134 = fieldWeight in 1237, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6785707 = idf(docFreq=3049, maxDocs=44421)
                0.125 = fieldNorm(doc=1237)
          0.09909213 = weight(abstract_txt:experiments in 1237) [ClassicSimilarity], result of:
            0.09909213 = score(doc=1237,freq=1.0), product of:
              0.14887805 = queryWeight, product of:
                2.0952291 = boost
                5.324741 = idf(docFreq=587, maxDocs=44421)
                0.01334445 = queryNorm
              0.6655926 = fieldWeight in 1237, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.324741 = idf(docFreq=587, maxDocs=44421)
                0.125 = fieldNorm(doc=1237)
          0.13195439 = weight(abstract_txt:collection in 1237) [ClassicSimilarity], result of:
            0.13195439 = score(doc=1237,freq=1.0), product of:
              0.22703725 = queryWeight, product of:
                3.659146 = boost
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.01334445 = queryNorm
              0.5812015 = fieldWeight in 1237, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.125 = fieldNorm(doc=1237)
          0.16866404 = weight(abstract_txt:test in 1237) [ClassicSimilarity], result of:
            0.16866404 = score(doc=1237,freq=1.0), product of:
              0.26740092 = queryWeight, product of:
                3.9711165 = boost
                5.046027 = idf(docFreq=776, maxDocs=44421)
                0.01334445 = queryNorm
              0.6307534 = fieldWeight in 1237, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.046027 = idf(docFreq=776, maxDocs=44421)
                0.125 = fieldNorm(doc=1237)
        0.28 = coord(7/25)
    
  3. Cathey, R.J.; Jensen, E.C.; Beitzel, S.M.; Frieder, O.; Grossman, D.: Exploiting parallelism to support scalable hierarchical clustering (2007) 0.15
    0.14501871 = sum of:
      0.14501871 = product of:
        0.45318347 = sum of:
          0.027135048 = weight(abstract_txt:significantly in 1448) [ClassicSimilarity], result of:
            0.027135048 = score(doc=1448,freq=1.0), product of:
              0.07909851 = queryWeight, product of:
                1.0799044 = boost
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.01334445 = queryNorm
              0.34305385 = fieldWeight in 1448, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.0625 = fieldNorm(doc=1448)
          0.042976923 = weight(abstract_txt:algorithms in 1448) [ClassicSimilarity], result of:
            0.042976923 = score(doc=1448,freq=2.0), product of:
              0.08530244 = queryWeight, product of:
                1.1214552 = boost
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.01334445 = queryNorm
              0.5038182 = fieldWeight in 1448, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.0625 = fieldNorm(doc=1448)
          0.013789243 = weight(abstract_txt:retrieval in 1448) [ClassicSimilarity], result of:
            0.013789243 = score(doc=1448,freq=1.0), product of:
              0.063462645 = queryWeight, product of:
                1.3679659 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01334445 = queryNorm
              0.21728125 = fieldWeight in 1448, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=1448)
          0.04810911 = weight(abstract_txt:collections in 1448) [ClassicSimilarity], result of:
            0.04810911 = score(doc=1448,freq=2.0), product of:
              0.115868695 = queryWeight, product of:
                1.848414 = boost
                4.6974936 = idf(docFreq=1100, maxDocs=44421)
                0.01334445 = queryNorm
              0.4152037 = fieldWeight in 1448, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6974936 = idf(docFreq=1100, maxDocs=44421)
                0.0625 = fieldNorm(doc=1448)
          0.049546067 = weight(abstract_txt:experiments in 1448) [ClassicSimilarity], result of:
            0.049546067 = score(doc=1448,freq=1.0), product of:
              0.14887805 = queryWeight, product of:
                2.0952291 = boost
                5.324741 = idf(docFreq=587, maxDocs=44421)
                0.01334445 = queryNorm
              0.3327963 = fieldWeight in 1448, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.324741 = idf(docFreq=587, maxDocs=44421)
                0.0625 = fieldNorm(doc=1448)
          0.073019214 = weight(abstract_txt:larger in 1448) [ClassicSimilarity], result of:
            0.073019214 = score(doc=1448,freq=1.0), product of:
              0.19280398 = queryWeight, product of:
                2.384373 = boost
                6.059561 = idf(docFreq=281, maxDocs=44421)
                0.01334445 = queryNorm
              0.37872255 = fieldWeight in 1448, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.059561 = idf(docFreq=281, maxDocs=44421)
                0.0625 = fieldNorm(doc=1448)
          0.11427585 = weight(abstract_txt:collection in 1448) [ClassicSimilarity], result of:
            0.11427585 = score(doc=1448,freq=3.0), product of:
              0.22703725 = queryWeight, product of:
                3.659146 = boost
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.01334445 = queryNorm
              0.50333524 = fieldWeight in 1448, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.0625 = fieldNorm(doc=1448)
          0.08433202 = weight(abstract_txt:test in 1448) [ClassicSimilarity], result of:
            0.08433202 = score(doc=1448,freq=1.0), product of:
              0.26740092 = queryWeight, product of:
                3.9711165 = boost
                5.046027 = idf(docFreq=776, maxDocs=44421)
                0.01334445 = queryNorm
              0.3153767 = fieldWeight in 1448, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.046027 = idf(docFreq=776, maxDocs=44421)
                0.0625 = fieldNorm(doc=1448)
        0.32 = coord(8/25)
    
  4. Witschel, H.F.: Global term weights in distributed environments (2008) 0.12
    0.12167316 = sum of:
      0.12167316 = product of:
        0.434547 = sum of:
          0.02781491 = weight(abstract_txt:addition in 3096) [ClassicSimilarity], result of:
            0.02781491 = score(doc=3096,freq=1.0), product of:
              0.0692989 = queryWeight, product of:
                1.010798 = boost
                5.137612 = idf(docFreq=708, maxDocs=44421)
                0.01334445 = queryNorm
              0.40137592 = fieldWeight in 3096, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.137612 = idf(docFreq=708, maxDocs=44421)
                0.078125 = fieldNorm(doc=3096)
          0.038957838 = weight(abstract_txt:derived in 3096) [ClassicSimilarity], result of:
            0.038957838 = score(doc=3096,freq=1.0), product of:
              0.08675033 = queryWeight, product of:
                1.1309327 = boost
                5.7482243 = idf(docFreq=384, maxDocs=44421)
                0.01334445 = queryNorm
              0.44908002 = fieldWeight in 3096, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7482243 = idf(docFreq=384, maxDocs=44421)
                0.078125 = fieldNorm(doc=3096)
          0.034473106 = weight(abstract_txt:retrieval in 3096) [ClassicSimilarity], result of:
            0.034473106 = score(doc=3096,freq=4.0), product of:
              0.063462645 = queryWeight, product of:
                1.3679659 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01334445 = queryNorm
              0.5432031 = fieldWeight in 3096, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.078125 = fieldNorm(doc=3096)
          0.020420251 = weight(abstract_txt:some in 3096) [ClassicSimilarity], result of:
            0.020420251 = score(doc=3096,freq=1.0), product of:
              0.07105456 = queryWeight, product of:
                1.4474787 = boost
                3.6785707 = idf(docFreq=3049, maxDocs=44421)
                0.01334445 = queryNorm
              0.28738832 = fieldWeight in 3096, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6785707 = idf(docFreq=3049, maxDocs=44421)
                0.078125 = fieldNorm(doc=3096)
          0.042522848 = weight(abstract_txt:collections in 3096) [ClassicSimilarity], result of:
            0.042522848 = score(doc=3096,freq=1.0), product of:
              0.115868695 = queryWeight, product of:
                1.848414 = boost
                4.6974936 = idf(docFreq=1100, maxDocs=44421)
                0.01334445 = queryNorm
              0.3669917 = fieldWeight in 3096, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6974936 = idf(docFreq=1100, maxDocs=44421)
                0.078125 = fieldNorm(doc=3096)
          0.164943 = weight(abstract_txt:collection in 3096) [ClassicSimilarity], result of:
            0.164943 = score(doc=3096,freq=4.0), product of:
              0.22703725 = queryWeight, product of:
                3.659146 = boost
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.01334445 = queryNorm
              0.7265019 = fieldWeight in 3096, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.078125 = fieldNorm(doc=3096)
          0.10541503 = weight(abstract_txt:test in 3096) [ClassicSimilarity], result of:
            0.10541503 = score(doc=3096,freq=1.0), product of:
              0.26740092 = queryWeight, product of:
                3.9711165 = boost
                5.046027 = idf(docFreq=776, maxDocs=44421)
                0.01334445 = queryNorm
              0.3942209 = fieldWeight in 3096, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.046027 = idf(docFreq=776, maxDocs=44421)
                0.078125 = fieldNorm(doc=3096)
        0.28 = coord(7/25)
    
  5. Sparck Jones, K.; Rijsbergen, C.J. van: Progress in documentation : Information retrieval test collection (1976) 0.12
    0.12150301 = sum of:
      0.12150301 = product of:
        0.60751504 = sum of:
          0.0292514 = weight(abstract_txt:retrieval in 4229) [ClassicSimilarity], result of:
            0.0292514 = score(doc=4229,freq=2.0), product of:
              0.063462645 = queryWeight, product of:
                1.3679659 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01334445 = queryNorm
              0.46092314 = fieldWeight in 4229, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.09375 = fieldNorm(doc=4229)
          0.11410078 = weight(abstract_txt:collections in 4229) [ClassicSimilarity], result of:
            0.11410078 = score(doc=4229,freq=5.0), product of:
              0.115868695 = queryWeight, product of:
                1.848414 = boost
                4.6974936 = idf(docFreq=1100, maxDocs=44421)
                0.01334445 = queryNorm
              0.98474205 = fieldWeight in 4229, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.6974936 = idf(docFreq=1100, maxDocs=44421)
                0.09375 = fieldNorm(doc=4229)
          0.105103076 = weight(abstract_txt:experiments in 4229) [ClassicSimilarity], result of:
            0.105103076 = score(doc=4229,freq=2.0), product of:
              0.14887805 = queryWeight, product of:
                2.0952291 = boost
                5.324741 = idf(docFreq=587, maxDocs=44421)
                0.01334445 = queryNorm
              0.70596755 = fieldWeight in 4229, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.324741 = idf(docFreq=587, maxDocs=44421)
                0.09375 = fieldNorm(doc=4229)
          0.13995877 = weight(abstract_txt:collection in 4229) [ClassicSimilarity], result of:
            0.13995877 = score(doc=4229,freq=2.0), product of:
              0.22703725 = queryWeight, product of:
                3.659146 = boost
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.01334445 = queryNorm
              0.6164573 = fieldWeight in 4229, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.09375 = fieldNorm(doc=4229)
          0.219101 = weight(abstract_txt:test in 4229) [ClassicSimilarity], result of:
            0.219101 = score(doc=4229,freq=3.0), product of:
              0.26740092 = queryWeight, product of:
                3.9711165 = boost
                5.046027 = idf(docFreq=776, maxDocs=44421)
                0.01334445 = queryNorm
              0.81937265 = fieldWeight in 4229, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.046027 = idf(docFreq=776, maxDocs=44421)
                0.09375 = fieldNorm(doc=4229)
        0.2 = coord(5/25)
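
The content scores above follow the same pattern with two extra factors. The following is a minimal sketch, assuming that each term score is queryWeight * fieldWeight, with queryWeight = boost * idf * queryNorm and fieldWeight = sqrt(freq) * idf * fieldNorm, and that the sum over matching terms is scaled by coord = matched terms / query terms. All numbers are copied from the listing; the function names are illustrative only.

```python
import math

QUERY_NORM = 0.01334445  # queryNorm as shown in the listing

def term_score(boost, idf, freq, field_norm, query_norm=QUERY_NORM):
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf * query_norm
    # fieldWeight = tf * idf * fieldNorm, with tf assumed to be sqrt(freq)
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

def hit_score(term_sum, matched, query_terms=25):
    # coord rewards documents that match more of the 25 query terms.
    return term_sum * (matched / query_terms)

# "reuters" in doc 4456: boost=7.45105, idf=7.574333, freq=3, fieldNorm=0.0625
reuters_4456 = term_score(boost=7.45105, idf=7.574333, freq=3.0, field_norm=0.0625)

# (sum of term scores, number of matched terms) per hit, from the listing
hits = {
    4456: (0.83791625, 5),
    1237: (0.5854327, 7),
    1448: (0.45318347, 8),
    3096: (0.434547, 7),
    4229: (0.60751504, 5),
}
final_scores = {doc: hit_score(total, matched) for doc, (total, matched) in hits.items()}
ranking = sorted(final_scores, key=final_scores.get, reverse=True)

print(reuters_4456, final_scores, ranking)
```

The coord factor explains why doc 4229 ranks last despite the second-largest term sum: it matches only 5 of the 25 query terms, whereas doc 1448 matches 8.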