Document (#37373)

Author
Huang, L.
Milne, D.
Frank, E.
Witten, I.H.
Title
Learning a concept-based document similarity measure
Source
Journal of the American Society for Information Science and Technology. 63(2012) no.8, pp. 1593-1608
Year
2012
Abstract
Document similarity measures are crucial components of many text-analysis tasks, including information retrieval, document classification, and document clustering. Conventional measures are brittle: They estimate the surface overlap between documents based on the words they mention and ignore deeper semantic connections. We propose a new measure that assesses similarity at both the lexical and semantic levels, and learns from human judgments how to combine them by using machine-learning techniques. Experiments show that the new measure produces values for documents that are more consistent with people's judgments than people are with each other. We also use it to classify and cluster large document sets covering different genres and topics, and find that it improves both classification and clustering performance.
Theme
Semantic context in indexing and retrieval

Similar documents (author)

  1. Witten, I.H.; Frank, E.: Data Mining : Praktische Werkzeuge und Techniken für das maschinelle Lernen (2000) 2.24
    2.2361298 = sum of:
      2.2361298 = product of:
        4.4722595 = sum of:
          1.9915707 = weight(author_txt:frank in 833) [ClassicSimilarity], result of:
            1.9915707 = score(doc=833,freq=1.0), product of:
              0.46499577 = queryWeight, product of:
                1.1930858 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.04549887 = queryNorm
              4.2829866 = fieldWeight in 833, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.5 = fieldNorm(doc=833)
          2.480689 = weight(author_txt:witten in 833) [ClassicSimilarity], result of:
            2.480689 = score(doc=833,freq=1.0), product of:
              0.5383112 = queryWeight, product of:
                1.2837011 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.04549887 = queryNorm
              4.6082807 = fieldWeight in 833, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.5 = fieldNorm(doc=833)
        0.5 = coord(2/4)
    
  2. Milne, R.J.: Hypertext and its implications for library services. (1994) 0.96
    0.96360314 = sum of:
      0.96360314 = product of:
        3.8544126 = sum of:
          3.8544126 = weight(author_txt:milne in 106) [ClassicSimilarity], result of:
            3.8544126 = score(doc=106,freq=1.0), product of:
              0.6223251 = queryWeight, product of:
                1.3802439 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.04549887 = queryNorm
              6.1935673 = fieldWeight in 106, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.625 = fieldNorm(doc=106)
        0.25 = coord(1/4)
    
  3. Milne, R.: The Google Library Project at Oxford (2005) 0.96
    0.96360314 = sum of:
      0.96360314 = product of:
        3.8544126 = sum of:
          3.8544126 = weight(author_txt:milne in 203) [ClassicSimilarity], result of:
            3.8544126 = score(doc=203,freq=1.0), product of:
              0.6223251 = queryWeight, product of:
                1.3802439 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.04549887 = queryNorm
              6.1935673 = fieldWeight in 203, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.625 = fieldNorm(doc=203)
        0.25 = coord(1/4)
    
  4. Milne, C.: Developing information architecture through records management classification techniques (2010) 0.96
    0.96360314 = sum of:
      0.96360314 = product of:
        3.8544126 = sum of:
          3.8544126 = weight(author_txt:milne in 929) [ClassicSimilarity], result of:
            3.8544126 = score(doc=929,freq=1.0), product of:
              0.6223251 = queryWeight, product of:
                1.3802439 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.04549887 = queryNorm
              6.1935673 = fieldWeight in 929, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.625 = fieldNorm(doc=929)
        0.25 = coord(1/4)
    
  5. Frank, S.; Frank, M.: Spannungsfeld zwischen Bibliotheks-/Informationswissenschaft und Bibliotheksinformatik : Teil 1 von: Bibliothekarische Studiengänge an der HTWK Leipzig: Veränderung und Kontinuität (2019) 0.70
    0.7041266 = sum of:
      0.7041266 = product of:
        2.8165064 = sum of:
          2.8165064 = weight(author_txt:frank in 366) [ClassicSimilarity], result of:
            2.8165064 = score(doc=366,freq=2.0), product of:
              0.46499577 = queryWeight, product of:
                1.1930858 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.04549887 = queryNorm
              6.057058 = fieldWeight in 366, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.5 = fieldNorm(doc=366)
        0.25 = coord(1/4)
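
The score trees above can be checked by hand. The following is a minimal Python sketch, not the catalog's actual code, that recomputes the first author hit (doc 833). It assumes the classic TF-IDF form that the listed numbers appear to follow: idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final multiplication by coord. The function and variable names are illustrative only; boost, queryNorm, fieldNorm and coord are copied from the listing rather than derived.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency, in the form the listed values seem to follow.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Term frequency is dampened with a square root.
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # score = queryWeight * fieldWeight for a single matching term.
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain tree for doc 833.
MAX_DOCS = 44421
QUERY_NORM = 0.04549887

frank = term_score(1.0, 22, MAX_DOCS, 1.1930858, QUERY_NORM, 0.5)
witten = term_score(1.0, 11, MAX_DOCS, 1.2837011, QUERY_NORM, 0.5)

# coord(2/4): two of the four author query terms matched this document.
coord = 2 / 4
total = (frank + witten) * coord

print(f"frank={frank:.4f} witten={witten:.4f} total={total:.4f}")
```

Because the listing shows single-precision values, the recomputed numbers should agree with it only to about four or five significant digits; the total should land close to the 2.236 reported for the first hit.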
    

Similar documents (content)

  1. Mens, G. Le; Kovács, B.; Hannan, M.T.; Pros, G.: Uncovering the semantics of concepts using GPT-4 (2023) 0.40
    0.3984316 = sum of:
      0.3984316 = product of:
        0.90552634 = sum of:
          0.09751363 = weight(abstract_txt:genres in 2305) [ClassicSimilarity], result of:
            0.09751363 = score(doc=2305,freq=2.0), product of:
              0.17480607 = queryWeight, product of:
                1.0736797 = boost
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.02257231 = queryNorm
              0.5578389 = fieldWeight in 2305, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2305)
          0.02338227 = weight(abstract_txt:classification in 2305) [ClassicSimilarity], result of:
            0.02338227 = score(doc=2305,freq=1.0), product of:
              0.10710031 = queryWeight, product of:
                1.1885209 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.02257231 = queryNorm
              0.21832122 = fieldWeight in 2305, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2305)
          0.036435083 = weight(abstract_txt:documents in 2305) [ClassicSimilarity], result of:
            0.036435083 = score(doc=2305,freq=2.0), product of:
              0.11425348 = queryWeight, product of:
                1.2275698 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.02257231 = queryNorm
              0.31889692 = fieldWeight in 2305, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2305)
          0.032923486 = weight(abstract_txt:semantic in 2305) [ClassicSimilarity], result of:
            0.032923486 = score(doc=2305,freq=1.0), product of:
              0.1345458 = queryWeight, product of:
                1.3321298 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.02257231 = queryNorm
              0.24470095 = fieldWeight in 2305, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2305)
          0.021738812 = weight(abstract_txt:that in 2305) [ClassicSimilarity], result of:
            0.021738812 = score(doc=2305,freq=5.0), product of:
              0.07516981 = queryWeight, product of:
                1.4081477 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.02257231 = queryNorm
              0.28919604 = fieldWeight in 2305, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2305)
          0.03918923 = weight(abstract_txt:learning in 2305) [ClassicSimilarity], result of:
            0.03918923 = score(doc=2305,freq=1.0), product of:
              0.15111612 = queryWeight, product of:
                1.4117795 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.02257231 = queryNorm
              0.2593319 = fieldWeight in 2305, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2305)
          0.1311913 = weight(abstract_txt:measures in 2305) [ClassicSimilarity], result of:
            0.1311913 = score(doc=2305,freq=5.0), product of:
              0.19776358 = queryWeight, product of:
                1.6150451 = boost
                5.424824 = idf(docFreq=531, maxDocs=44421)
                0.02257231 = queryNorm
              0.6633744 = fieldWeight in 2305, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.424824 = idf(docFreq=531, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2305)
          0.11974377 = weight(abstract_txt:judgments in 2305) [ClassicSimilarity], result of:
            0.11974377 = score(doc=2305,freq=1.0), product of:
              0.31820104 = queryWeight, product of:
                2.048624 = boost
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.02257231 = queryNorm
              0.37631485 = fieldWeight in 2305, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2305)
          0.17711793 = weight(abstract_txt:measure in 2305) [ClassicSimilarity], result of:
            0.17711793 = score(doc=2305,freq=4.0), product of:
              0.29788712 = queryWeight, product of:
                2.4276326 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.02257231 = queryNorm
              0.5945807 = fieldWeight in 2305, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2305)
          0.15354052 = weight(abstract_txt:similarity in 2305) [ClassicSimilarity], result of:
            0.15354052 = score(doc=2305,freq=2.0), product of:
              0.3412207 = queryWeight, product of:
                2.5982132 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.02257231 = queryNorm
              0.4499742 = fieldWeight in 2305, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2305)
          0.072750285 = weight(abstract_txt:document in 2305) [ClassicSimilarity], result of:
            0.072750285 = score(doc=2305,freq=1.0), product of:
              0.30979145 = queryWeight, product of:
                3.1960692 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.02257231 = queryNorm
              0.23483633 = fieldWeight in 2305, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2305)
        0.44 = coord(11/25)
    
  2. Bartell, B.T.; Cottrell, G.W.; Belew, R.K.: Optimizing similarity using multi-query relevance feedback (1998) 0.34
    0.3399763 = sum of:
      0.3399763 = product of:
        0.9443786 = sum of:
          0.11041723 = weight(abstract_txt:estimate in 2152) [ClassicSimilarity], result of:
            0.11041723 = score(doc=2152,freq=2.0), product of:
              0.17373057 = queryWeight, product of:
                1.0703716 = boost
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.02257231 = queryNorm
              0.63556594 = fieldWeight in 2152, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
          0.023156462 = weight(abstract_txt:both in 2152) [ClassicSimilarity], result of:
            0.023156462 = score(doc=2152,freq=1.0), product of:
              0.09734637 = queryWeight, product of:
                1.1331081 = boost
                3.8060317 = idf(docFreq=2684, maxDocs=44421)
                0.02257231 = queryNorm
              0.23787698 = fieldWeight in 2152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8060317 = idf(docFreq=2684, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
          0.050998494 = weight(abstract_txt:documents in 2152) [ClassicSimilarity], result of:
            0.050998494 = score(doc=2152,freq=3.0), product of:
              0.11425348 = queryWeight, product of:
                1.2275698 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.02257231 = queryNorm
              0.4463627 = fieldWeight in 2152, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
          0.019244352 = weight(abstract_txt:that in 2152) [ClassicSimilarity], result of:
            0.019244352 = score(doc=2152,freq=3.0), product of:
              0.07516981 = queryWeight, product of:
                1.4081477 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.02257231 = queryNorm
              0.25601172 = fieldWeight in 2152, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
          0.07757456 = weight(abstract_txt:learning in 2152) [ClassicSimilarity], result of:
            0.07757456 = score(doc=2152,freq=3.0), product of:
              0.15111612 = queryWeight, product of:
                1.4117795 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.02257231 = queryNorm
              0.51334405 = fieldWeight in 2152, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
          0.0948259 = weight(abstract_txt:measures in 2152) [ClassicSimilarity], result of:
            0.0948259 = score(doc=2152,freq=2.0), product of:
              0.19776358 = queryWeight, product of:
                1.6150451 = boost
                5.424824 = idf(docFreq=531, maxDocs=44421)
                0.02257231 = queryNorm
              0.47949123 = fieldWeight in 2152, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.424824 = idf(docFreq=531, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
          0.20242049 = weight(abstract_txt:measure in 2152) [ClassicSimilarity], result of:
            0.20242049 = score(doc=2152,freq=4.0), product of:
              0.29788712 = queryWeight, product of:
                2.4276326 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.02257231 = queryNorm
              0.6795208 = fieldWeight in 2152, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
          0.24815896 = weight(abstract_txt:similarity in 2152) [ClassicSimilarity], result of:
            0.24815896 = score(doc=2152,freq=4.0), product of:
              0.3412207 = queryWeight, product of:
                2.5982132 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.02257231 = queryNorm
              0.72726816 = fieldWeight in 2152, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
          0.11758222 = weight(abstract_txt:document in 2152) [ClassicSimilarity], result of:
            0.11758222 = score(doc=2152,freq=2.0), product of:
              0.30979145 = queryWeight, product of:
                3.1960692 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.02257231 = queryNorm
              0.3795528 = fieldWeight in 2152, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=2152)
        0.36 = coord(9/25)
    
  3. Rorvig, M.: Images of similarity : a visual exploration of optimal similarity metrics and scaling properties of TREC topic-document sets (1999) 0.28
    0.28314105 = sum of:
      0.28314105 = product of:
        1.0112181 = sum of:
          0.123259254 = weight(abstract_txt:overlap in 4767) [ClassicSimilarity], result of:
            0.123259254 = score(doc=4767,freq=1.0), product of:
              0.16219942 = queryWeight, product of:
                1.0342395 = boost
                6.9478774 = idf(docFreq=115, maxDocs=44421)
                0.02257231 = queryNorm
              0.7599241 = fieldWeight in 4767, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9478774 = idf(docFreq=115, maxDocs=44421)
                0.109375 = fieldNorm(doc=4767)
          0.072870165 = weight(abstract_txt:documents in 4767) [ClassicSimilarity], result of:
            0.072870165 = score(doc=4767,freq=2.0), product of:
              0.11425348 = queryWeight, product of:
                1.2275698 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.02257231 = queryNorm
              0.63779384 = fieldWeight in 4767, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.109375 = fieldNorm(doc=4767)
          0.019443782 = weight(abstract_txt:that in 4767) [ClassicSimilarity], result of:
            0.019443782 = score(doc=4767,freq=1.0), product of:
              0.07516981 = queryWeight, product of:
                1.4081477 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.02257231 = queryNorm
              0.2586648 = fieldWeight in 4767, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.109375 = fieldNorm(doc=4767)
          0.16594532 = weight(abstract_txt:measures in 4767) [ClassicSimilarity], result of:
            0.16594532 = score(doc=4767,freq=2.0), product of:
              0.19776358 = queryWeight, product of:
                1.6150451 = boost
                5.424824 = idf(docFreq=531, maxDocs=44421)
                0.02257231 = queryNorm
              0.83910966 = fieldWeight in 4767, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.424824 = idf(docFreq=531, maxDocs=44421)
                0.109375 = fieldNorm(doc=4767)
          0.17711793 = weight(abstract_txt:measure in 4767) [ClassicSimilarity], result of:
            0.17711793 = score(doc=4767,freq=1.0), product of:
              0.29788712 = queryWeight, product of:
                2.4276326 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.02257231 = queryNorm
              0.5945807 = fieldWeight in 4767, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.109375 = fieldNorm(doc=4767)
          0.30708104 = weight(abstract_txt:similarity in 4767) [ClassicSimilarity], result of:
            0.30708104 = score(doc=4767,freq=2.0), product of:
              0.3412207 = queryWeight, product of:
                2.5982132 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.02257231 = queryNorm
              0.8999484 = fieldWeight in 4767, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.109375 = fieldNorm(doc=4767)
          0.14550057 = weight(abstract_txt:document in 4767) [ClassicSimilarity], result of:
            0.14550057 = score(doc=4767,freq=1.0), product of:
              0.30979145 = queryWeight, product of:
                3.1960692 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.02257231 = queryNorm
              0.46967265 = fieldWeight in 4767, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.109375 = fieldNorm(doc=4767)
        0.28 = coord(7/25)
    
  4. Egghe, L.: Good properties of similarity measures and their complementarity (2010) 0.23
    0.23439656 = sum of:
      0.23439656 = product of:
        0.9766524 = sum of:
          0.15249377 = weight(abstract_txt:overlap in 980) [ClassicSimilarity], result of:
            0.15249377 = score(doc=980,freq=3.0), product of:
              0.16219942 = queryWeight, product of:
                1.0342395 = boost
                6.9478774 = idf(docFreq=115, maxDocs=44421)
                0.02257231 = queryNorm
              0.94016224 = fieldWeight in 980, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.9478774 = idf(docFreq=115, maxDocs=44421)
                0.078125 = fieldNorm(doc=980)
          0.05013521 = weight(abstract_txt:both in 980) [ClassicSimilarity], result of:
            0.05013521 = score(doc=980,freq=3.0), product of:
              0.09734637 = queryWeight, product of:
                1.1331081 = boost
                3.8060317 = idf(docFreq=2684, maxDocs=44421)
                0.02257231 = queryNorm
              0.51501876 = fieldWeight in 980, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.8060317 = idf(docFreq=2684, maxDocs=44421)
                0.078125 = fieldNorm(doc=980)
          0.027776832 = weight(abstract_txt:that in 980) [ClassicSimilarity], result of:
            0.027776832 = score(doc=980,freq=4.0), product of:
              0.07516981 = queryWeight, product of:
                1.4081477 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.02257231 = queryNorm
              0.3695211 = fieldWeight in 980, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.078125 = fieldNorm(doc=980)
          0.18741615 = weight(abstract_txt:measures in 980) [ClassicSimilarity], result of:
            0.18741615 = score(doc=980,freq=5.0), product of:
              0.19776358 = queryWeight, product of:
                1.6150451 = boost
                5.424824 = idf(docFreq=531, maxDocs=44421)
                0.02257231 = queryNorm
              0.9476778 = fieldWeight in 980, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.424824 = idf(docFreq=531, maxDocs=44421)
                0.078125 = fieldNorm(doc=980)
          0.17891611 = weight(abstract_txt:measure in 980) [ClassicSimilarity], result of:
            0.17891611 = score(doc=980,freq=2.0), product of:
              0.29788712 = queryWeight, product of:
                2.4276326 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.02257231 = queryNorm
              0.6006172 = fieldWeight in 980, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.078125 = fieldNorm(doc=980)
          0.37991428 = weight(abstract_txt:similarity in 980) [ClassicSimilarity], result of:
            0.37991428 = score(doc=980,freq=6.0), product of:
              0.3412207 = queryWeight, product of:
                2.5982132 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.02257231 = queryNorm
              1.1133975 = fieldWeight in 980, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.078125 = fieldNorm(doc=980)
        0.24 = coord(6/25)
    
  5. Alzahrani, S.; Palade, V.; Salim, N.; Abraham, A.: Using structural information and citation evidence to detect significant plagiarism cases in scientific publications (2012) 0.23
    0.22616161 = sum of:
      0.22616161 = product of:
        0.6282267 = sum of:
          0.027359394 = weight(abstract_txt:they in 982) [ClassicSimilarity], result of:
            0.027359394 = score(doc=982,freq=2.0), product of:
              0.09439028 = queryWeight, product of:
                1.1157712 = boost
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.02257231 = queryNorm
              0.28985393 = fieldWeight in 982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
          0.02865466 = weight(abstract_txt:both in 982) [ClassicSimilarity], result of:
            0.02865466 = score(doc=982,freq=2.0), product of:
              0.09734637 = queryWeight, product of:
                1.1331081 = boost
                3.8060317 = idf(docFreq=2684, maxDocs=44421)
                0.02257231 = queryNorm
              0.29435775 = fieldWeight in 982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8060317 = idf(docFreq=2684, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
          0.08752403 = weight(abstract_txt:ignore in 982) [ClassicSimilarity], result of:
            0.08752403 = score(doc=982,freq=1.0), product of:
              0.20493107 = queryWeight, product of:
                1.1625198 = boost
                7.809647 = idf(docFreq=48, maxDocs=44421)
                0.02257231 = queryNorm
              0.42709008 = fieldWeight in 982, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.809647 = idf(docFreq=48, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
          0.036435083 = weight(abstract_txt:documents in 982) [ClassicSimilarity], result of:
            0.036435083 = score(doc=982,freq=2.0), product of:
              0.11425348 = queryWeight, product of:
                1.2275698 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.02257231 = queryNorm
              0.31889692 = fieldWeight in 982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
          0.01374883 = weight(abstract_txt:that in 982) [ClassicSimilarity], result of:
            0.01374883 = score(doc=982,freq=2.0), product of:
              0.07516981 = queryWeight, product of:
                1.4081477 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.02257231 = queryNorm
              0.18290362 = fieldWeight in 982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
          0.08297266 = weight(abstract_txt:measures in 982) [ClassicSimilarity], result of:
            0.08297266 = score(doc=982,freq=2.0), product of:
              0.19776358 = queryWeight, product of:
                1.6150451 = boost
                5.424824 = idf(docFreq=531, maxDocs=44421)
                0.02257231 = queryNorm
              0.41955483 = fieldWeight in 982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.424824 = idf(docFreq=531, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
          0.12524128 = weight(abstract_txt:measure in 982) [ClassicSimilarity], result of:
            0.12524128 = score(doc=982,freq=2.0), product of:
              0.29788712 = queryWeight, product of:
                2.4276326 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.02257231 = queryNorm
              0.42043203 = fieldWeight in 982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
          0.15354052 = weight(abstract_txt:similarity in 982) [ClassicSimilarity], result of:
            0.15354052 = score(doc=982,freq=2.0), product of:
              0.3412207 = queryWeight, product of:
                2.5982132 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.02257231 = queryNorm
              0.4499742 = fieldWeight in 982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
          0.072750285 = weight(abstract_txt:document in 982) [ClassicSimilarity], result of:
            0.072750285 = score(doc=982,freq=1.0), product of:
              0.30979145 = queryWeight, product of:
                3.1960692 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.02257231 = queryNorm
              0.23483633 = fieldWeight in 982, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
        0.36 = coord(9/25)