Document (#26427)

Author
Bartell, B.T.
Cottrell, G.W.
Belew, R.K.
Title
Representing documents using an explicit model of their similarities
Source
Journal of the American Society for Information Science. 46(1995) no.4, pp.254-271
Year
1995
Abstract
Proposes a method for creating vector space representations of documents based on modelling target interdocument similarity values. The target similarity values are assumed to capture semantic relationships, or associations, between the documents. The vector representations are chosen so that the inner product similarities between document vector pairs closely match their target interdocument similarities. The method is closely related to the Latent Semantic Indexing approach.
Object
Latent Semantic Indexing
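
The idea stated in the abstract, choosing document vectors whose inner products approximate given target similarities, can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a plain squared-error objective minimised by gradient descent, and the names `fit_vectors`, `squared_error`, and the 3x3 `TARGET` matrix are hypothetical, invented for illustration only.

```python
import random

# Hypothetical target similarities for three documents (symmetric).
TARGET = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]

def inner(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def squared_error(vecs, target):
    """Sum over all pairs of (x_i . x_j - target[i][j])^2."""
    n = len(target)
    return sum((inner(vecs[i], vecs[j]) - target[i][j]) ** 2
               for i in range(n) for j in range(n))

def random_vectors(n, dim, seed=0):
    """Small random starting vectors (assumed initialisation)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n)]

def fit_vectors(target, dim, steps=2000, lr=0.01, seed=0):
    """Gradient descent on the squared error between inner products and targets."""
    n = len(target)
    vecs = random_vectors(n, dim, seed)
    for _ in range(steps):
        # resid[i][j] = current inner product minus target similarity
        resid = [[inner(vecs[i], vecs[j]) - target[i][j] for j in range(n)]
                 for i in range(n)]
        # For a symmetric target, the gradient w.r.t. x_i is 4 * sum_j resid[i][j] * x_j
        grads = [[4.0 * sum(resid[i][j] * vecs[j][d] for j in range(n))
                  for d in range(dim)] for i in range(n)]
        vecs = [[v - lr * g for v, g in zip(vecs[i], grads[i])]
                for i in range(n)]
    return vecs

before = squared_error(random_vectors(3, 2), TARGET)
fitted = fit_vectors(TARGET, dim=2)
after = squared_error(fitted, TARGET)
print(before, after)
```

With two dimensions for three documents the fit cannot be exact, so some residual error is expected. The connection to Latent Semantic Indexing is that a truncated eigendecomposition of the target matrix would give the best low-rank fit under this same squared-error criterion.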

Similar documents (content)

  1. Martin, D.I.; Berry, M.W.: Latent Semantic Indexing (2009) 0.23
    0.22562292 = sum of:
      0.22562292 = product of:
        0.80579615 = sum of:
          0.02031541 = weight(abstract_txt:between in 821) [ClassicSimilarity], result of:
            0.02031541 = score(doc=821,freq=1.0), product of:
              0.075211875 = queryWeight, product of:
                1.2499948 = boost
                3.4573963 = idf(docFreq=3804, maxDocs=44421)
                0.017403198 = queryNorm
              0.2701091 = fieldWeight in 821, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4573963 = idf(docFreq=3804, maxDocs=44421)
                0.078125 = fieldNorm(doc=821)
          0.11881076 = weight(abstract_txt:latent in 821) [ClassicSimilarity], result of:
            0.11881076 = score(doc=821,freq=2.0), product of:
              0.1537989 = queryWeight, product of:
                1.2639405 = boost
                6.9919376 = idf(docFreq=110, maxDocs=44421)
                0.017403198 = queryNorm
              0.77250725 = fieldWeight in 821, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.9919376 = idf(docFreq=110, maxDocs=44421)
                0.078125 = fieldNorm(doc=821)
          0.08807464 = weight(abstract_txt:semantic in 821) [ClassicSimilarity], result of:
            0.08807464 = score(doc=821,freq=4.0), product of:
              0.12597468 = queryWeight, product of:
                1.6177322 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.017403198 = queryNorm
              0.69914556 = fieldWeight in 821, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.078125 = fieldNorm(doc=821)
          0.06329729 = weight(abstract_txt:method in 821) [ClassicSimilarity], result of:
            0.06329729 = score(doc=821,freq=2.0), product of:
              0.12734525 = queryWeight, product of:
                1.6265086 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.017403198 = queryNorm
              0.4970526 = fieldWeight in 821, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.078125 = fieldNorm(doc=821)
          0.1033811 = weight(abstract_txt:documents in 821) [ClassicSimilarity], result of:
            0.1033811 = score(doc=821,freq=4.0), product of:
              0.16046262 = queryWeight, product of:
                2.2361326 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.017403198 = queryNorm
              0.64426905 = fieldWeight in 821, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.078125 = fieldNorm(doc=821)
          0.20433795 = weight(abstract_txt:vector in 821) [ClassicSimilarity], result of:
            0.20433795 = score(doc=821,freq=1.0), product of:
              0.4011737 = queryWeight, product of:
                3.5357118 = boost
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.017403198 = queryNorm
              0.5093503 = fieldWeight in 821, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.078125 = fieldNorm(doc=821)
          0.20757899 = weight(abstract_txt:similarities in 821) [ClassicSimilarity], result of:
            0.20757899 = score(doc=821,freq=1.0), product of:
              0.4054046 = queryWeight, product of:
                3.554307 = boost
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.017403198 = queryNorm
              0.5120292 = fieldWeight in 821, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.078125 = fieldNorm(doc=821)
        0.28 = coord(7/25)
    
  2. Bartell, B.T.; Cottrell, G.W.; Belew, R.K.: Optimizing similarity using multi-query relevance feedback (1998) 0.19
    0.18993904 = sum of:
      0.18993904 = product of:
        0.6783537 = sum of:
          0.077450074 = weight(abstract_txt:similarity in 2152) [ClassicSimilarity], result of:
            0.077450074 = score(doc=2152,freq=4.0), product of:
              0.106494516 = queryWeight, product of:
                1.0517527 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.017403198 = queryNorm
              0.72726816 = fieldWeight in 2152, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
          0.05079261 = weight(abstract_txt:match in 2152) [ClassicSimilarity], result of:
            0.05079261 = score(doc=2152,freq=1.0), product of:
              0.12760462 = queryWeight, product of:
                1.1512859 = boost
                6.3687487 = idf(docFreq=206, maxDocs=44421)
                0.017403198 = queryNorm
              0.3980468 = fieldWeight in 2152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3687487 = idf(docFreq=206, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
          0.016252328 = weight(abstract_txt:between in 2152) [ClassicSimilarity], result of:
            0.016252328 = score(doc=2152,freq=1.0), product of:
              0.075211875 = queryWeight, product of:
                1.2499948 = boost
                3.4573963 = idf(docFreq=3804, maxDocs=44421)
                0.017403198 = queryNorm
              0.21608727 = fieldWeight in 2152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4573963 = idf(docFreq=3804, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
          0.06201842 = weight(abstract_txt:method in 2152) [ClassicSimilarity], result of:
            0.06201842 = score(doc=2152,freq=3.0), product of:
              0.12734525 = queryWeight, product of:
                1.6265086 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.017403198 = queryNorm
              0.4870101 = fieldWeight in 2152, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
          0.071624525 = weight(abstract_txt:documents in 2152) [ClassicSimilarity], result of:
            0.071624525 = score(doc=2152,freq=3.0), product of:
              0.16046262 = queryWeight, product of:
                2.2361326 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.017403198 = queryNorm
              0.4463627 = fieldWeight in 2152, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
          0.16347036 = weight(abstract_txt:vector in 2152) [ClassicSimilarity], result of:
            0.16347036 = score(doc=2152,freq=1.0), product of:
              0.4011737 = queryWeight, product of:
                3.5357118 = boost
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.017403198 = queryNorm
              0.40748024 = fieldWeight in 2152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
          0.23674543 = weight(abstract_txt:target in 2152) [ClassicSimilarity], result of:
            0.23674543 = score(doc=2152,freq=2.0), product of:
              0.40758437 = queryWeight, product of:
                3.5638497 = boost
                6.571569 = idf(docFreq=168, maxDocs=44421)
                0.017403198 = queryNorm
              0.5808501 = fieldWeight in 2152, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.571569 = idf(docFreq=168, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
        0.28 = coord(7/25)
    
  3. Liddy, E.D.: An alternative representation for documents and queries (1993) 0.18
    0.1789966 = sum of:
      0.1789966 = product of:
        0.6392735 = sum of:
          0.048406295 = weight(abstract_txt:similarity in 7812) [ClassicSimilarity], result of:
            0.048406295 = score(doc=7812,freq=1.0), product of:
              0.106494516 = queryWeight, product of:
                1.0517527 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.017403198 = queryNorm
              0.4545426 = fieldWeight in 7812, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.078125 = fieldNorm(doc=7812)
          0.015399167 = weight(abstract_txt:their in 7812) [ClassicSimilarity], result of:
            0.015399167 = score(doc=7812,freq=1.0), product of:
              0.062526986 = queryWeight, product of:
                1.1397215 = boost
                3.1523883 = idf(docFreq=5161, maxDocs=44421)
                0.017403198 = queryNorm
              0.24628034 = fieldWeight in 7812, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1523883 = idf(docFreq=5161, maxDocs=44421)
                0.078125 = fieldNorm(doc=7812)
          0.04403732 = weight(abstract_txt:semantic in 7812) [ClassicSimilarity], result of:
            0.04403732 = score(doc=7812,freq=1.0), product of:
              0.12597468 = queryWeight, product of:
                1.6177322 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.017403198 = queryNorm
              0.34957278 = fieldWeight in 7812, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.078125 = fieldNorm(doc=7812)
          0.06329729 = weight(abstract_txt:method in 7812) [ClassicSimilarity], result of:
            0.06329729 = score(doc=7812,freq=2.0), product of:
              0.12734525 = queryWeight, product of:
                1.6265086 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.017403198 = queryNorm
              0.4970526 = fieldWeight in 7812, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.078125 = fieldNorm(doc=7812)
          0.1060545 = weight(abstract_txt:representations in 7812) [ClassicSimilarity], result of:
            0.1060545 = score(doc=7812,freq=1.0), product of:
              0.22633693 = queryWeight, product of:
                2.1684165 = boost
                5.997685 = idf(docFreq=299, maxDocs=44421)
                0.017403198 = queryNorm
              0.46856913 = fieldWeight in 7812, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.997685 = idf(docFreq=299, maxDocs=44421)
                0.078125 = fieldNorm(doc=7812)
          0.073101476 = weight(abstract_txt:documents in 7812) [ClassicSimilarity], result of:
            0.073101476 = score(doc=7812,freq=2.0), product of:
              0.16046262 = queryWeight, product of:
                2.2361326 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.017403198 = queryNorm
              0.455567 = fieldWeight in 7812, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.078125 = fieldNorm(doc=7812)
          0.2889775 = weight(abstract_txt:vector in 7812) [ClassicSimilarity], result of:
            0.2889775 = score(doc=7812,freq=2.0), product of:
              0.4011737 = queryWeight, product of:
                3.5357118 = boost
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.017403198 = queryNorm
              0.7203301 = fieldWeight in 7812, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.078125 = fieldNorm(doc=7812)
        0.28 = coord(7/25)
    
  4. Shibata, N.; Kajikawa, Y.; Sakata, I.: Measuring relatedness between communities in a citation network (2011) 0.18
    0.178436 = sum of:
      0.178436 = product of:
        0.6372714 = sum of:
          0.048406295 = weight(abstract_txt:similarity in 484) [ClassicSimilarity], result of:
            0.048406295 = score(doc=484,freq=1.0), product of:
              0.106494516 = queryWeight, product of:
                1.0517527 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.017403198 = queryNorm
              0.4545426 = fieldWeight in 484, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.078125 = fieldNorm(doc=484)
          0.062920064 = weight(abstract_txt:capture in 484) [ClassicSimilarity], result of:
            0.062920064 = score(doc=484,freq=1.0), product of:
              0.1268388 = queryWeight, product of:
                1.147826 = boost
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.017403198 = queryNorm
              0.49606323 = fieldWeight in 484, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.078125 = fieldNorm(doc=484)
          0.02031541 = weight(abstract_txt:between in 484) [ClassicSimilarity], result of:
            0.02031541 = score(doc=484,freq=1.0), product of:
              0.075211875 = queryWeight, product of:
                1.2499948 = boost
                3.4573963 = idf(docFreq=3804, maxDocs=44421)
                0.017403198 = queryNorm
              0.2701091 = fieldWeight in 484, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4573963 = idf(docFreq=3804, maxDocs=44421)
                0.078125 = fieldNorm(doc=484)
          0.04403732 = weight(abstract_txt:semantic in 484) [ClassicSimilarity], result of:
            0.04403732 = score(doc=484,freq=1.0), product of:
              0.12597468 = queryWeight, product of:
                1.6177322 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.017403198 = queryNorm
              0.34957278 = fieldWeight in 484, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.078125 = fieldNorm(doc=484)
          0.04475794 = weight(abstract_txt:method in 484) [ClassicSimilarity], result of:
            0.04475794 = score(doc=484,freq=1.0), product of:
              0.12734525 = queryWeight, product of:
                1.6265086 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.017403198 = queryNorm
              0.35146925 = fieldWeight in 484, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.078125 = fieldNorm(doc=484)
          0.20757899 = weight(abstract_txt:similarities in 484) [ClassicSimilarity], result of:
            0.20757899 = score(doc=484,freq=1.0), product of:
              0.4054046 = queryWeight, product of:
                3.554307 = boost
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.017403198 = queryNorm
              0.5120292 = fieldWeight in 484, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.078125 = fieldNorm(doc=484)
          0.20925538 = weight(abstract_txt:target in 484) [ClassicSimilarity], result of:
            0.20925538 = score(doc=484,freq=1.0), product of:
              0.40758437 = queryWeight, product of:
                3.5638497 = boost
                6.571569 = idf(docFreq=168, maxDocs=44421)
                0.017403198 = queryNorm
              0.51340383 = fieldWeight in 484, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.571569 = idf(docFreq=168, maxDocs=44421)
                0.078125 = fieldNorm(doc=484)
        0.28 = coord(7/25)
    
  5. Dominich, S.; Kiezer, T.: A measure theoretic approach to information retrieval (2007) 0.16
    0.16296703 = sum of:
      0.16296703 = product of:
        0.6790293 = sum of:
          0.05223347 = weight(abstract_txt:product in 1445) [ClassicSimilarity], result of:
            0.05223347 = score(doc=1445,freq=2.0), product of:
              0.11279328 = queryWeight, product of:
                1.0824095 = boost
                5.987735 = idf(docFreq=302, maxDocs=44421)
                0.017403198 = queryNorm
              0.46309024 = fieldWeight in 1445, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.987735 = idf(docFreq=302, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1445)
          0.014220788 = weight(abstract_txt:between in 1445) [ClassicSimilarity], result of:
            0.014220788 = score(doc=1445,freq=1.0), product of:
              0.075211875 = queryWeight, product of:
                1.2499948 = boost
                3.4573963 = idf(docFreq=3804, maxDocs=44421)
                0.017403198 = queryNorm
              0.18907636 = fieldWeight in 1445, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4573963 = idf(docFreq=3804, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1445)
          0.05880833 = weight(abstract_txt:latent in 1445) [ClassicSimilarity], result of:
            0.05880833 = score(doc=1445,freq=1.0), product of:
              0.1537989 = queryWeight, product of:
                1.2639405 = boost
                6.9919376 = idf(docFreq=110, maxDocs=44421)
                0.017403198 = queryNorm
              0.3823716 = fieldWeight in 1445, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9919376 = idf(docFreq=110, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1445)
          0.14450143 = weight(abstract_txt:inner in 1445) [ClassicSimilarity], result of:
            0.14450143 = score(doc=1445,freq=2.0), product of:
              0.22227918 = queryWeight, product of:
                1.5194954 = boost
                8.405631 = idf(docFreq=26, maxDocs=44421)
                0.017403198 = queryNorm
              0.65008986 = fieldWeight in 1445, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.405631 = idf(docFreq=26, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1445)
          0.030826125 = weight(abstract_txt:semantic in 1445) [ClassicSimilarity], result of:
            0.030826125 = score(doc=1445,freq=1.0), product of:
              0.12597468 = queryWeight, product of:
                1.6177322 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.017403198 = queryNorm
              0.24470095 = fieldWeight in 1445, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1445)
          0.37843916 = weight(abstract_txt:vector in 1445) [ClassicSimilarity], result of:
            0.37843916 = score(doc=1445,freq=7.0), product of:
              0.4011737 = queryWeight, product of:
                3.5357118 = boost
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.017403198 = queryNorm
              0.94332993 = fieldWeight in 1445, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1445)
        0.24 = coord(6/25)
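
The score trees above follow the classic Lucene TF-IDF pattern, which can be re-derived by hand. The sketch below recomputes the score of the first result (doc 821) from the values listed in its tree. It assumes the usual ClassicSimilarity formulas, tf = sqrt(termFreq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the listed numbers; the function names `idf` and `term_score` are illustrative and are not Lucene API calls.

```python
import math

# Values copied from the explain tree for the first result (doc 821).
MAX_DOCS = 44421
QUERY_NORM = 0.017403198
FIELD_NORM = 0.078125
COORD = 7 / 25  # 7 of the 25 query terms matched this document

# (term, termFreq, docFreq, boost), as listed in the tree
TERMS = [
    ("between", 1.0, 3804, 1.2499948),
    ("latent", 2.0, 110, 1.2639405),
    ("semantic", 4.0, 1375, 1.6177322),
    ("method", 2.0, 1342, 1.6265086),
    ("documents", 4.0, 1954, 2.2361326),
    ("vector", 1.0, 177, 3.5357118),
    ("similarities", 1.0, 171, 3.554307),
]

def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, boost):
    """score = queryWeight * fieldWeight for a single matching term."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM             # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM   # tf * idf * fieldNorm
    return query_weight * field_weight

# Document score = coord * sum of the per-term scores
total = COORD * sum(term_score(f, df, b) for _, f, df, b in TERMS)
print(total)
```

The printed value should agree with the 0.22562292 shown for doc 821 up to small rounding differences, since the listed values were computed in single precision. The same functions apply to the other results once their fieldNorm, coord, and term lists are substituted.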