Document (#11012)

Author
Story, R.E.
Title
¬An explanation of the effectiveness of latent semantic indexing by means of a Bayesian regression model
Source
Information processing and management. 32(1996) no.3, pp.329-344
Year
1996
Abstract
Latent Semantic Indexing (LSI) is an effective automated method for determining whether a document is relevant to a reader, based on a few words or an abstract describing the reader's needs. A particular feature of LSI is its ability to deal automatically with synonyms. Compares LSI to statistical regression and Bayesian methods. The relationships found can be useful in explaining the performance of LSI and in suggesting variations on the LSI approach.
Object
Latent Semantic Indexing

Similar documents (content)

  1. Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.22
    0.2171849 = sum of:
      0.2171849 = product of:
        0.77566034 = sum of:
          0.069063514 = weight(abstract_txt:means in 1690) [ClassicSimilarity], result of:
            0.069063514 = score(doc=1690,freq=3.0), product of:
              0.101759404 = queryWeight, product of:
                5.015607 = idf(docFreq=800, maxDocs=44421)
                0.020288553 = queryNorm
              0.6786942 = fieldWeight in 1690, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.015607 = idf(docFreq=800, maxDocs=44421)
                0.078125 = fieldNorm(doc=1690)
          0.041826278 = weight(abstract_txt:effectiveness in 1690) [ClassicSimilarity], result of:
            0.041826278 = score(doc=1690,freq=1.0), product of:
              0.10505466 = queryWeight, product of:
                1.0160624 = boost
                5.0961695 = idf(docFreq=738, maxDocs=44421)
                0.020288553 = queryNorm
              0.39813823 = fieldWeight in 1690, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0961695 = idf(docFreq=738, maxDocs=44421)
                0.078125 = fieldNorm(doc=1690)
          0.05392627 = weight(abstract_txt:statistical in 1690) [ClassicSimilarity], result of:
            0.05392627 = score(doc=1690,freq=1.0), product of:
              0.12444652 = queryWeight, product of:
                1.10587 = boost
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.020288553 = queryNorm
              0.43332887 = fieldWeight in 1690, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.078125 = fieldNorm(doc=1690)
          0.064982735 = weight(abstract_txt:feature in 1690) [ClassicSimilarity], result of:
            0.064982735 = score(doc=1690,freq=1.0), product of:
              0.14092277 = queryWeight, product of:
                1.1768017 = boost
                5.9023747 = idf(docFreq=329, maxDocs=44421)
                0.020288553 = queryNorm
              0.46112302 = fieldWeight in 1690, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9023747 = idf(docFreq=329, maxDocs=44421)
                0.078125 = fieldNorm(doc=1690)
          0.07359072 = weight(abstract_txt:indexing in 1690) [ClassicSimilarity], result of:
            0.07359072 = score(doc=1690,freq=2.0), product of:
              0.15310799 = queryWeight, product of:
                1.734709 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.020288553 = queryNorm
              0.48064584 = fieldWeight in 1690, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.078125 = fieldNorm(doc=1690)
          0.09807335 = weight(abstract_txt:semantic in 1690) [ClassicSimilarity], result of:
            0.09807335 = score(doc=1690,freq=3.0), product of:
              0.16197678 = queryWeight, product of:
                1.7842433 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.020288553 = queryNorm
              0.6054778 = fieldWeight in 1690, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.078125 = fieldNorm(doc=1690)
          0.37419748 = weight(abstract_txt:latent in 1690) [ClassicSimilarity], result of:
            0.37419748 = score(doc=1690,freq=3.0), product of:
              0.39550564 = queryWeight, product of:
                2.7880723 = boost
                6.9919376 = idf(docFreq=110, maxDocs=44421)
                0.020288553 = queryNorm
              0.94612426 = fieldWeight in 1690, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.9919376 = idf(docFreq=110, maxDocs=44421)
                0.078125 = fieldNorm(doc=1690)
        0.28 = coord(7/25)
    
  2. He, X.; Cai, D.; Liu, H.; Ma, W.Y.: Locality preserving indexing for document representation (2004) 0.17
    0.16620307 = sum of:
      0.16620307 = product of:
        1.3850256 = sum of:
          0.29436287 = weight(abstract_txt:indexing in 5079) [ClassicSimilarity], result of:
            0.29436287 = score(doc=5079,freq=2.0), product of:
              0.15310799 = queryWeight, product of:
                1.734709 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.020288553 = queryNorm
              1.9225833 = fieldWeight in 5079, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.3125 = fieldNorm(doc=5079)
          0.22649069 = weight(abstract_txt:semantic in 5079) [ClassicSimilarity], result of:
            0.22649069 = score(doc=5079,freq=1.0), product of:
              0.16197678 = queryWeight, product of:
                1.7842433 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.020288553 = queryNorm
              1.3982911 = fieldWeight in 5079, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.3125 = fieldNorm(doc=5079)
          0.86417204 = weight(abstract_txt:latent in 5079) [ClassicSimilarity], result of:
            0.86417204 = score(doc=5079,freq=1.0), product of:
              0.39550564 = queryWeight, product of:
                2.7880723 = boost
                6.9919376 = idf(docFreq=110, maxDocs=44421)
                0.020288553 = queryNorm
              2.1849804 = fieldWeight in 5079, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9919376 = idf(docFreq=110, maxDocs=44421)
                0.3125 = fieldNorm(doc=5079)
        0.12 = coord(3/25)
    
  3. Dumais, S.T.: Latent semantic analysis (2003) 0.13
    0.13287844 = sum of:
      0.13287844 = product of:
        0.36910677 = sum of:
          0.015949536 = weight(abstract_txt:means in 3462) [ClassicSimilarity], result of:
            0.015949536 = score(doc=3462,freq=1.0), product of:
              0.101759404 = queryWeight, product of:
                5.015607 = idf(docFreq=800, maxDocs=44421)
                0.020288553 = queryNorm
              0.15673772 = fieldWeight in 3462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.015607 = idf(docFreq=800, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.016730512 = weight(abstract_txt:effectiveness in 3462) [ClassicSimilarity], result of:
            0.016730512 = score(doc=3462,freq=1.0), product of:
              0.10505466 = queryWeight, product of:
                1.0160624 = boost
                5.0961695 = idf(docFreq=738, maxDocs=44421)
                0.020288553 = queryNorm
              0.1592553 = fieldWeight in 3462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0961695 = idf(docFreq=738, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.06727424 = weight(abstract_txt:words in 3462) [ClassicSimilarity], result of:
            0.06727424 = score(doc=3462,freq=12.0), product of:
              0.11603294 = queryWeight, product of:
                1.0678331 = boost
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.020288553 = queryNorm
              0.5797857 = fieldWeight in 3462, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.030505303 = weight(abstract_txt:statistical in 3462) [ClassicSimilarity], result of:
            0.030505303 = score(doc=3462,freq=2.0), product of:
              0.12444652 = queryWeight, product of:
                1.10587 = boost
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.020288553 = queryNorm
              0.24512781 = fieldWeight in 3462, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.02678443 = weight(abstract_txt:deal in 3462) [ClassicSimilarity], result of:
            0.02678443 = score(doc=3462,freq=1.0), product of:
              0.14376862 = queryWeight, product of:
                1.1886247 = boost
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.020288553 = queryNorm
              0.18630233 = fieldWeight in 3462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.056779888 = weight(abstract_txt:synonyms in 3462) [ClassicSimilarity], result of:
            0.056779888 = score(doc=3462,freq=1.0), product of:
              0.23724963 = queryWeight, product of:
                1.5269172 = boost
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.020288553 = queryNorm
              0.23932551 = fieldWeight in 3462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.029436285 = weight(abstract_txt:indexing in 3462) [ClassicSimilarity], result of:
            0.029436285 = score(doc=3462,freq=2.0), product of:
              0.15310799 = queryWeight, product of:
                1.734709 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.020288553 = queryNorm
              0.19225833 = fieldWeight in 3462, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.03922934 = weight(abstract_txt:semantic in 3462) [ClassicSimilarity], result of:
            0.03922934 = score(doc=3462,freq=3.0), product of:
              0.16197678 = queryWeight, product of:
                1.7842433 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.020288553 = queryNorm
              0.24219112 = fieldWeight in 3462, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.08641721 = weight(abstract_txt:latent in 3462) [ClassicSimilarity], result of:
            0.08641721 = score(doc=3462,freq=1.0), product of:
              0.39550564 = queryWeight, product of:
                2.7880723 = boost
                6.9919376 = idf(docFreq=110, maxDocs=44421)
                0.020288553 = queryNorm
              0.21849805 = fieldWeight in 3462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9919376 = idf(docFreq=110, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
        0.36 = coord(9/25)
    
  4. Gordon, M.D.; Dumais, S.: Using latent semantic indexing for literature based discovery (1998) 0.13
    0.13217318 = sum of:
      0.13217318 = product of:
        0.6608659 = sum of:
          0.07098155 = weight(abstract_txt:effectiveness in 5892) [ClassicSimilarity], result of:
            0.07098155 = score(doc=5892,freq=2.0), product of:
              0.10505466 = queryWeight, product of:
                1.0160624 = boost
                5.0961695 = idf(docFreq=738, maxDocs=44421)
                0.020288553 = queryNorm
              0.675663 = fieldWeight in 5892, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.0961695 = idf(docFreq=738, maxDocs=44421)
                0.09375 = fieldNorm(doc=5892)
          0.06471152 = weight(abstract_txt:statistical in 5892) [ClassicSimilarity], result of:
            0.06471152 = score(doc=5892,freq=1.0), product of:
              0.12444652 = queryWeight, product of:
                1.10587 = boost
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.020288553 = queryNorm
              0.5199946 = fieldWeight in 5892, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.09375 = fieldNorm(doc=5892)
          0.062443793 = weight(abstract_txt:indexing in 5892) [ClassicSimilarity], result of:
            0.062443793 = score(doc=5892,freq=1.0), product of:
              0.15310799 = queryWeight, product of:
                1.734709 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.020288553 = queryNorm
              0.4078415 = fieldWeight in 5892, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.09375 = fieldNorm(doc=5892)
          0.09609187 = weight(abstract_txt:semantic in 5892) [ClassicSimilarity], result of:
            0.09609187 = score(doc=5892,freq=2.0), product of:
              0.16197678 = queryWeight, product of:
                1.7842433 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.020288553 = queryNorm
              0.5932447 = fieldWeight in 5892, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.09375 = fieldNorm(doc=5892)
          0.3666372 = weight(abstract_txt:latent in 5892) [ClassicSimilarity], result of:
            0.3666372 = score(doc=5892,freq=2.0), product of:
              0.39550564 = queryWeight, product of:
                2.7880723 = boost
                6.9919376 = idf(docFreq=110, maxDocs=44421)
                0.020288553 = queryNorm
              0.92700875 = fieldWeight in 5892, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.9919376 = idf(docFreq=110, maxDocs=44421)
                0.09375 = fieldNorm(doc=5892)
        0.2 = coord(5/25)
    
  5. Ding, C.H.Q.: ¬A probabilistic model for Latent Semantic Indexing (2005) 0.13
    0.12582088 = sum of:
      0.12582088 = product of:
        0.6291044 = sum of:
          0.06866149 = weight(abstract_txt:words in 4459) [ClassicSimilarity], result of:
            0.06866149 = score(doc=4459,freq=2.0), product of:
              0.11603294 = queryWeight, product of:
                1.0678331 = boost
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.020288553 = queryNorm
              0.5917413 = fieldWeight in 4459, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.078125 = fieldNorm(doc=4459)
          0.07626326 = weight(abstract_txt:statistical in 4459) [ClassicSimilarity], result of:
            0.07626326 = score(doc=4459,freq=2.0), product of:
              0.12444652 = queryWeight, product of:
                1.10587 = boost
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.020288553 = queryNorm
              0.61281955 = fieldWeight in 4459, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.078125 = fieldNorm(doc=4459)
          0.052036494 = weight(abstract_txt:indexing in 4459) [ClassicSimilarity], result of:
            0.052036494 = score(doc=4459,freq=1.0), product of:
              0.15310799 = queryWeight, product of:
                1.734709 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.020288553 = queryNorm
              0.33986792 = fieldWeight in 4459, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.078125 = fieldNorm(doc=4459)
          0.12661214 = weight(abstract_txt:semantic in 4459) [ClassicSimilarity], result of:
            0.12661214 = score(doc=4459,freq=5.0), product of:
              0.16197678 = queryWeight, product of:
                1.7842433 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.020288553 = queryNorm
              0.7816685 = fieldWeight in 4459, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.078125 = fieldNorm(doc=4459)
          0.30553097 = weight(abstract_txt:latent in 4459) [ClassicSimilarity], result of:
            0.30553097 = score(doc=4459,freq=2.0), product of:
              0.39550564 = queryWeight, product of:
                2.7880723 = boost
                6.9919376 = idf(docFreq=110, maxDocs=44421)
                0.020288553 = queryNorm
              0.77250725 = fieldWeight in 4459, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.9919376 = idf(docFreq=110, maxDocs=44421)
                0.078125 = fieldNorm(doc=4459)
        0.2 = coord(5/25)
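
The score trees above follow the usual ClassicSimilarity (TF-IDF) decomposition: each matching term contributes queryWeight × fieldWeight, the contributions are summed, and the sum is scaled by the coord factor. As a minimal sketch, the snippet below recomputes the score of the first hit (doc 1690) from the freq, docFreq, boost, queryNorm and fieldNorm values printed in the listing. The function names `idf` and `term_score` are illustrative and not part of any library; the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumed here because they reproduce the printed tf and idf values. Small deviations in the last digits are expected, since the listing was produced with 32-bit floats.

```python
import math

# Constants copied from the explain output for doc 1690.
MAX_DOCS = 44421
QUERY_NORM = 0.020288553
FIELD_NORM = 0.078125
QUERY_TERMS = 25  # denominator of coord(7/25) in the listing


def idf(doc_freq, max_docs):
    """Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost=1.0):
    """Score of one term: queryWeight * fieldWeight."""
    i = idf(doc_freq, MAX_DOCS)
    query_weight = boost * i * QUERY_NORM           # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * i * FIELD_NORM  # tf * idf * fieldNorm
    return query_weight * field_weight


# (term, freq, docFreq, boost) as printed for doc 1690;
# "means" has no boost line in the listing, so 1.0 is assumed.
terms = [
    ("means", 3, 800, 1.0),
    ("effectiveness", 1, 738, 1.0160624),
    ("statistical", 1, 470, 1.10587),
    ("feature", 1, 329, 1.1768017),
    ("indexing", 2, 1557, 1.734709),
    ("semantic", 3, 1375, 1.7842433),
    ("latent", 3, 110, 2.7880723),
]

total = sum(term_score(freq, df, boost) for _, freq, df, boost in terms)
coord = len(terms) / QUERY_TERMS   # 7 of 25 query terms matched
score = total * coord

print(f"sum of term scores: {total:.7f}")
print(f"coord: {coord:.2f}")
print(f"final score: {score:.7f}")
```

The same calculation should apply to the other hits by substituting their own values, for example fieldNorm 0.3125 and coord 3/25 for doc 5079. This also shows why a short record with only three matching terms can still rank second: its larger fieldNorm raises every term's fieldWeight.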