Document (#11949)

Author
Srinivasan, P.
Title
On generalizing the Two-Poisson Model
Source
Journal of the American Society for Information Science. 41(1990) no.1, pp. 61-66.
Year
1990
Abstract
Automatic indexing is one of the important functions of a modern document retrieval system. Numerous techniques for this function have been proposed in the literature, ranging from purely statistical to linguistically complex mechanisms. Most result from examining properties of terms. Examines term distribution within the framework of the Poisson models. Specifically examines the effectiveness of the Two-Poisson and the Three-Poisson models to see whether generalisation results in increased effectiveness. The results show that the Two-Poisson model is only moderately effective in identifying index terms. In addition, generalisation to the Three-Poisson model does not give any additional power. The only Poisson model that consistently works well is the basic One-Poisson model. Also discusses term distribution information.
Theme
Automatic indexing

Similar documents (author)

  1. Srinivasan, P.: Expert interface to Library of Congress Subject Headings (1990/91) 5.41
    5.4105906 = sum of:
      5.4105906 = weight(author_txt:srinivasan in 2208) [ClassicSimilarity], result of:
        5.4105906 = fieldWeight in 2208, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.656945 = idf(docFreq=20, maxDocs=44421)
          0.625 = fieldNorm(doc=2208)
    
  2. Srinivasan, P.: Query expansion and MEDLINE (1996) 5.41
    5.4105906 = sum of:
      5.4105906 = weight(author_txt:srinivasan in 67) [ClassicSimilarity], result of:
        5.4105906 = fieldWeight in 67, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.656945 = idf(docFreq=20, maxDocs=44421)
          0.625 = fieldNorm(doc=67)
    
  3. Srinivasan, P.: Intelligent information retrieval using rough set approximations (1989) 5.41
    5.4105906 = sum of:
      5.4105906 = weight(author_txt:srinivasan in 2594) [ClassicSimilarity], result of:
        5.4105906 = fieldWeight in 2594, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.656945 = idf(docFreq=20, maxDocs=44421)
          0.625 = fieldNorm(doc=2594)
    
  4. Srinivasan, P.: Optimal document-indexing vocabulary for MEDLINE (1996) 5.41
    5.4105906 = sum of:
      5.4105906 = weight(author_txt:srinivasan in 6702) [ClassicSimilarity], result of:
        5.4105906 = fieldWeight in 6702, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.656945 = idf(docFreq=20, maxDocs=44421)
          0.625 = fieldNorm(doc=6702)
    
  5. Srinivasan, P.: Thesaurus construction (1992) 5.41
    5.4105906 = sum of:
      5.4105906 = weight(author_txt:srinivasan in 4504) [ClassicSimilarity], result of:
        5.4105906 = fieldWeight in 4504, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.656945 = idf(docFreq=20, maxDocs=44421)
          0.625 = fieldNorm(doc=4504)
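
The author scores above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of any library, and the fieldNorm value is copied from the listing rather than derived.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def classic_tf(freq):
    # Assumed ClassicSimilarity tf: square root of the term frequency
    return math.sqrt(freq)

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in the explain tree
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values taken from the author_txt:srinivasan entries above
author_score = field_weight(freq=1.0, doc_freq=20, max_docs=44421, field_norm=0.625)
print(author_score)  # close to the listed 5.4105906, up to float rounding
```

All five author matches share the same freq, docFreq, and fieldNorm, which is why they all receive the same score.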
    

Similar documents (content)

  1. Lassalle, E.; Lassalle, E.: Semantic models in information retrieval (2012) 0.16
    0.16032971 = sum of:
      0.16032971 = product of:
        0.6680405 = sum of:
          0.018401535 = weight(abstract_txt:properties in 1097) [ClassicSimilarity], result of:
            0.018401535 = score(doc=1097,freq=1.0), product of:
              0.05008565 = queryWeight, product of:
                1.0405431 = boost
                5.878422 = idf(docFreq=337, maxDocs=44421)
                0.008188277 = queryNorm
              0.36740136 = fieldWeight in 1097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.878422 = idf(docFreq=337, maxDocs=44421)
                0.0625 = fieldNorm(doc=1097)
          0.011979607 = weight(abstract_txt:terms in 1097) [ClassicSimilarity], result of:
            0.011979607 = score(doc=1097,freq=1.0), product of:
              0.047400434 = queryWeight, product of:
                1.43156 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.008188277 = queryNorm
              0.252732 = fieldWeight in 1097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.0625 = fieldNorm(doc=1097)
          0.0137693025 = weight(abstract_txt:only in 1097) [ClassicSimilarity], result of:
            0.0137693025 = score(doc=1097,freq=1.0), product of:
              0.052011 = queryWeight, product of:
                1.4995675 = boost
                4.235812 = idf(docFreq=1746, maxDocs=44421)
                0.008188277 = queryNorm
              0.26473826 = fieldWeight in 1097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.235812 = idf(docFreq=1746, maxDocs=44421)
                0.0625 = fieldNorm(doc=1097)
          0.070921615 = weight(abstract_txt:generalizing in 1097) [ClassicSimilarity], result of:
            0.070921615 = score(doc=1097,freq=1.0), product of:
              0.1231203 = queryWeight, product of:
                1.6314293 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.008188277 = queryNorm
              0.5760351 = fieldWeight in 1097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.0625 = fieldNorm(doc=1097)
          0.057231016 = weight(abstract_txt:model in 1097) [ClassicSimilarity], result of:
            0.057231016 = score(doc=1097,freq=4.0), product of:
              0.11495686 = queryWeight, product of:
                3.5249736 = boost
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.008188277 = queryNorm
              0.49784777 = fieldWeight in 1097, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.0625 = fieldNorm(doc=1097)
          0.49573743 = weight(abstract_txt:poisson in 1097) [ClassicSimilarity], result of:
            0.49573743 = score(doc=1097,freq=1.0), product of:
              0.9002057 = queryWeight, product of:
                12.477262 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.008188277 = queryNorm
              0.5506935 = fieldWeight in 1097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.0625 = fieldNorm(doc=1097)
        0.24 = coord(6/25)
    
  2. Kuperman, V.: Productivity in the Internet mailing lists : a bibliometric analysis (2006) 0.16
    0.15993284 = sum of:
      0.15993284 = product of:
        0.7996642 = sum of:
          0.0076266252 = weight(abstract_txt:results in 5907) [ClassicSimilarity], result of:
            0.0076266252 = score(doc=5907,freq=1.0), product of:
              0.035078596 = queryWeight, product of:
                1.2315145 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.008188277 = queryNorm
              0.21741535 = fieldWeight in 5907, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0625 = fieldNorm(doc=5907)
          0.017426867 = weight(abstract_txt:examines in 5907) [ClassicSimilarity], result of:
            0.017426867 = score(doc=5907,freq=1.0), product of:
              0.060855545 = queryWeight, product of:
                1.6220659 = boost
                4.581832 = idf(docFreq=1235, maxDocs=44421)
                0.008188277 = queryNorm
              0.2863645 = fieldWeight in 5907, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.581832 = idf(docFreq=1235, maxDocs=44421)
                0.0625 = fieldNorm(doc=5907)
          0.04491661 = weight(abstract_txt:distribution in 5907) [ClassicSimilarity], result of:
            0.04491661 = score(doc=5907,freq=2.0), product of:
              0.09079918 = queryWeight, product of:
                1.9813417 = boost
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.008188277 = queryNorm
              0.4946808 = fieldWeight in 5907, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.0625 = fieldNorm(doc=5907)
          0.028615508 = weight(abstract_txt:model in 5907) [ClassicSimilarity], result of:
            0.028615508 = score(doc=5907,freq=1.0), product of:
              0.11495686 = queryWeight, product of:
                3.5249736 = boost
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.008188277 = queryNorm
              0.24892388 = fieldWeight in 5907, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.0625 = fieldNorm(doc=5907)
          0.7010786 = weight(abstract_txt:poisson in 5907) [ClassicSimilarity], result of:
            0.7010786 = score(doc=5907,freq=2.0), product of:
              0.9002057 = queryWeight, product of:
                12.477262 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.008188277 = queryNorm
              0.7787982 = fieldWeight in 5907, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.0625 = fieldNorm(doc=5907)
        0.2 = coord(5/25)
    
  3. Kim, W.; Wilbur, W.J.: Corpus-based statistical screening for content-bearing terms (2001) 0.15
    0.15115303 = sum of:
      0.15115303 = product of:
        0.6298043 = sum of:
          0.017969409 = weight(abstract_txt:terms in 188) [ClassicSimilarity], result of:
            0.017969409 = score(doc=188,freq=4.0), product of:
              0.047400434 = queryWeight, product of:
                1.43156 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.008188277 = queryNorm
              0.379098 = fieldWeight in 188, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.046875 = fieldNorm(doc=188)
          0.010326977 = weight(abstract_txt:only in 188) [ClassicSimilarity], result of:
            0.010326977 = score(doc=188,freq=1.0), product of:
              0.052011 = queryWeight, product of:
                1.4995675 = boost
                4.235812 = idf(docFreq=1746, maxDocs=44421)
                0.008188277 = queryNorm
              0.1985537 = fieldWeight in 188, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.235812 = idf(docFreq=1746, maxDocs=44421)
                0.046875 = fieldNorm(doc=188)
          0.025935866 = weight(abstract_txt:three in 188) [ClassicSimilarity], result of:
            0.025935866 = score(doc=188,freq=5.0), product of:
              0.056198355 = queryWeight, product of:
                1.5587635 = boost
                4.4030223 = idf(docFreq=1477, maxDocs=44421)
                0.008188277 = queryNorm
              0.4615058 = fieldWeight in 188, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.4030223 = idf(docFreq=1477, maxDocs=44421)
                0.046875 = fieldNorm(doc=188)
          0.025942488 = weight(abstract_txt:term in 188) [ClassicSimilarity], result of:
            0.025942488 = score(doc=188,freq=3.0), product of:
              0.06664186 = queryWeight, product of:
                1.6974304 = boost
                4.794713 = idf(docFreq=998, maxDocs=44421)
                0.008188277 = queryNorm
              0.38928217 = fieldWeight in 188, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.794713 = idf(docFreq=998, maxDocs=44421)
                0.046875 = fieldNorm(doc=188)
          0.023820631 = weight(abstract_txt:distribution in 188) [ClassicSimilarity], result of:
            0.023820631 = score(doc=188,freq=1.0), product of:
              0.09079918 = queryWeight, product of:
                1.9813417 = boost
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.008188277 = queryNorm
              0.26234412 = fieldWeight in 188, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.046875 = fieldNorm(doc=188)
          0.52580893 = weight(abstract_txt:poisson in 188) [ClassicSimilarity], result of:
            0.52580893 = score(doc=188,freq=2.0), product of:
              0.9002057 = queryWeight, product of:
                12.477262 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.008188277 = queryNorm
              0.5840987 = fieldWeight in 188, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.046875 = fieldNorm(doc=188)
        0.24 = coord(6/25)
    
  4. Huber, J.C.: A new model that generated Lotka's law (2002) 0.12
    0.11850538 = sum of:
      0.11850538 = product of:
        0.74065864 = sum of:
          0.019331453 = weight(abstract_txt:three in 1248) [ClassicSimilarity], result of:
            0.019331453 = score(doc=1248,freq=1.0), product of:
              0.056198355 = queryWeight, product of:
                1.5587635 = boost
                4.4030223 = idf(docFreq=1477, maxDocs=44421)
                0.008188277 = queryNorm
              0.34398612 = fieldWeight in 1248, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4030223 = idf(docFreq=1477, maxDocs=44421)
                0.078125 = fieldNorm(doc=1248)
          0.039701052 = weight(abstract_txt:distribution in 1248) [ClassicSimilarity], result of:
            0.039701052 = score(doc=1248,freq=1.0), product of:
              0.09079918 = queryWeight, product of:
                1.9813417 = boost
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.008188277 = queryNorm
              0.43724018 = fieldWeight in 1248, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.078125 = fieldNorm(doc=1248)
          0.061954394 = weight(abstract_txt:model in 1248) [ClassicSimilarity], result of:
            0.061954394 = score(doc=1248,freq=3.0), product of:
              0.11495686 = queryWeight, product of:
                3.5249736 = boost
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.008188277 = queryNorm
              0.538936 = fieldWeight in 1248, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.078125 = fieldNorm(doc=1248)
          0.61967176 = weight(abstract_txt:poisson in 1248) [ClassicSimilarity], result of:
            0.61967176 = score(doc=1248,freq=1.0), product of:
              0.9002057 = queryWeight, product of:
                12.477262 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.008188277 = queryNorm
              0.6883669 = fieldWeight in 1248, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.078125 = fieldNorm(doc=1248)
        0.16 = coord(4/25)
    
  5. Lee, C.; Lee, G.G.: Probabilistic information retrieval model for a dependence structured indexing system (2005) 0.12
    0.115285605 = sum of:
      0.115285605 = product of:
        0.72053504 = sum of:
          0.014974508 = weight(abstract_txt:terms in 2004) [ClassicSimilarity], result of:
            0.014974508 = score(doc=2004,freq=1.0), product of:
              0.047400434 = queryWeight, product of:
                1.43156 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.008188277 = queryNorm
              0.31591502 = fieldWeight in 2004, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.078125 = fieldNorm(doc=2004)
          0.035303254 = weight(abstract_txt:term in 2004) [ClassicSimilarity], result of:
            0.035303254 = score(doc=2004,freq=2.0), product of:
              0.06664186 = queryWeight, product of:
                1.6974304 = boost
                4.794713 = idf(docFreq=998, maxDocs=44421)
                0.008188277 = queryNorm
              0.52974594 = fieldWeight in 2004, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.794713 = idf(docFreq=998, maxDocs=44421)
                0.078125 = fieldNorm(doc=2004)
          0.05058555 = weight(abstract_txt:model in 2004) [ClassicSimilarity], result of:
            0.05058555 = score(doc=2004,freq=2.0), product of:
              0.11495686 = queryWeight, product of:
                3.5249736 = boost
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.008188277 = queryNorm
              0.4400394 = fieldWeight in 2004, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.078125 = fieldNorm(doc=2004)
          0.61967176 = weight(abstract_txt:poisson in 2004) [ClassicSimilarity], result of:
            0.61967176 = score(doc=2004,freq=1.0), product of:
              0.9002057 = queryWeight, product of:
                12.477262 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.008188277 = queryNorm
              0.6883669 = fieldWeight in 2004, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.078125 = fieldNorm(doc=2004)
        0.16 = coord(4/25)
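
The content scores combine a query weight, a field weight, and a coordination factor. The following is a minimal sketch that recomputes the score of the first content match (doc 1097), assuming the standard ClassicSimilarity formulas; the boost, queryNorm, fieldNorm, and coord values are copied from the listing, and the variable names are illustrative.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.008188277   # copied from the explain output
FIELD_NORM = 0.0625        # fieldNorm(doc=1097), copied from the explain output

def classic_idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_score(boost, doc_freq, freq):
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM           # queryWeight
    field_weight = math.sqrt(freq) * idf * FIELD_NORM  # fieldWeight
    return query_weight * field_weight

# (term, boost, docFreq, freq) for the six terms matched in doc 1097
matched = [
    ("properties",   1.0405431,  337, 1.0),
    ("terms",        1.43156,   2116, 1.0),
    ("only",         1.4995675, 1746, 1.0),
    ("generalizing", 1.6314293,   11, 1.0),
    ("model",        3.5249736, 2249, 4.0),
    ("poisson",     12.477262,    17, 1.0),
]

raw_sum = sum(term_score(b, df, f) for _, b, df, f in matched)
coord = 6 / 25   # 6 of the 25 query terms matched
total = raw_sum * coord
print(raw_sum, total)  # close to the listed 0.6680405 and 0.16032971
```

The rare, heavily boosted term "poisson" contributes most of the raw sum in every entry, which explains why all five content matches contain it.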