Document (#28646)

Author
Needham, R.M.
Sparck Jones, K.
Title
Keywords and clumps
Source
Theory of subject analysis: a sourcebook. Ed.: L.M. Chan, et al
Imprint
Littleton, CO : Libraries Unlimited
Year
1985
Pages
pp. 262-272
Abstract
The selection that follows was chosen as it represents "a very early paper on the possibilities allowed by computers in documentation." In the early 1960s computers were being used to provide simple automatic indexing systems wherein keywords were extracted from documents. The problem with such systems was that they lacked vocabulary control; thus documents related in subject matter were not always collocated in retrieval. To improve retrieval by improving recall is the raison d'être of vocabulary control tools such as classifications and thesauri. The question arose whether it was possible by automatic means to construct classes of terms which, when substituted one for another, could be used to improve retrieval performance. One of the first theoretical approaches to this question was initiated by R. M. Needham and Karen Sparck Jones at the Cambridge Language Research Institute in England. The question was later pursued using experimental methodologies by Sparck Jones, who, as a Senior Research Associate in the Computer Laboratory at the University of Cambridge, has devoted her life's work to research in information retrieval and automatic natural language processing. Based on the principles of numerical taxonomy, automatic classification techniques start from the premise that two objects are similar to the degree that they share attributes in common. When these two objects are keywords, their similarity is measured in terms of the number of documents they index in common. Step 1 in automatic classification is to compute mathematically the degree to which two terms are similar. Step 2 is to group together those terms that are "most similar" to each other, forming equivalence classes of intersubstitutable terms. The technique for forming such classes varies and is the factor that characteristically distinguishes different approaches to automatic classification.
The technique used by Needham and Sparck Jones, that of clumping, is described in the selection that follows. Questions that must be asked are whether the use of automatically generated classes really does improve retrieval performance and whether there is a true economic advantage in substituting mechanical for manual labor. Several years after her work with clumping, Sparck Jones was to observe that while it was not wholly satisfactory in itself, it was valuable in that it stimulated research into automatic classification. To this it might be added that it was valuable in that it introduced to library/information science the methods of numerical taxonomy, thus stimulating us to think again about the fundamental nature and purpose of classification. In this connection it might be useful to review how automatically derived classes differ from those of manually constructed classifications: 1) the manner of their derivation is purely a posteriori, the ultimate operationalization of the principle of literary warrant; 2) the relationship between members forming such classes is essentially statistical; the members of a given class are similar to each other not because they possess the class-defining characteristic but by virtue of sharing a family resemblance; and finally, 3) automatically derived classes are not related meaningfully one to another, that is, they are not ordered in traditional hierarchical and precedence relationships.
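The two steps named in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the toy index, the Tanimoto-style normalization of the shared-document count, and the threshold-based single-link grouping, which is a simple stand-in and not the clumping technique of the paper.

```python
from itertools import combinations

# Hypothetical toy index: keyword -> set of ids of the documents it indexes.
index = {
    "thesaurus": {1, 2, 3},
    "vocabulary": {2, 3, 4},
    "retrieval": {1, 2, 3, 4, 5},
    "clump": {6, 7},
    "cluster": {6, 7, 8},
}

def similarity(a, b):
    """Step 1: documents indexed by both terms, divided by documents
    indexed by either term (a Tanimoto-style coefficient, assumed here)."""
    shared = len(index[a] & index[b])
    either = len(index[a] | index[b])
    return shared / either if either else 0.0

def group_terms(threshold):
    """Step 2 stand-in: link every pair of terms whose similarity meets
    the threshold and return the resulting connected groups."""
    parent = {term: term for term in index}

    def find(term):
        # Follow parent links up to the representative of the group.
        while parent[term] != term:
            term = parent[term]
        return term

    for a, b in combinations(sorted(index), 2):
        if similarity(a, b) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for term in index:
        groups.setdefault(find(term), set()).add(term)
    return sorted(sorted(group) for group in groups.values())

groups = group_terms(0.5)
print(groups)
```

With this toy data the terms fall into two groups, one around "retrieval" and one around "clump". A different grouping rule in step 2 would generally give different classes, which is the point the abstract makes about what distinguishes the approaches.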
Footnote
Reprint of the original article with commentary by the editors
Original in: Journal of documentation 20(1964) no.1, pp. 5-15.
Theme
Computational linguistics
Automatic indexing

Similar documents (author)

  1. Sparck Jones, K.: Fashionable trends and feasible strategies in information management (1988) 5.31
    5.3145585 = sum of:
      5.3145585 = sum of:
        2.0560415 = weight(author_txt:jones in 816) [ClassicSimilarity], result of:
          2.0560415 = score(doc=816,freq=1.0), product of:
            0.5925794 = queryWeight, product of:
              6.939294 = idf(docFreq=116, maxDocs=44421)
              0.08539477 = queryNorm
            3.469647 = fieldWeight in 816, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.939294 = idf(docFreq=116, maxDocs=44421)
              0.5 = fieldNorm(doc=816)
        3.258517 = weight(author_txt:sparck in 816) [ClassicSimilarity], result of:
          3.258517 = score(doc=816,freq=1.0), product of:
            0.80551195 = queryWeight, product of:
              1.1659038 = boost
              8.090549 = idf(docFreq=36, maxDocs=44421)
              0.08539477 = queryNorm
            4.0452747 = fieldWeight in 816, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.090549 = idf(docFreq=36, maxDocs=44421)
              0.5 = fieldNorm(doc=816)
    
  2. Sparck Jones, K.: Automatic classification (1976) 5.31
    5.3145585 = sum of:
      5.3145585 = sum of:
        2.0560415 = weight(author_txt:jones in 2907) [ClassicSimilarity], result of:
          2.0560415 = score(doc=2907,freq=1.0), product of:
            0.5925794 = queryWeight, product of:
              6.939294 = idf(docFreq=116, maxDocs=44421)
              0.08539477 = queryNorm
            3.469647 = fieldWeight in 2907, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.939294 = idf(docFreq=116, maxDocs=44421)
              0.5 = fieldNorm(doc=2907)
        3.258517 = weight(author_txt:sparck in 2907) [ClassicSimilarity], result of:
          3.258517 = score(doc=2907,freq=1.0), product of:
            0.80551195 = queryWeight, product of:
              1.1659038 = boost
              8.090549 = idf(docFreq=36, maxDocs=44421)
              0.08539477 = queryNorm
            4.0452747 = fieldWeight in 2907, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.090549 = idf(docFreq=36, maxDocs=44421)
              0.5 = fieldNorm(doc=2907)
    
  3. Sparck Jones, K.: The role of artificial intelligence in information retrieval (1991) 5.31
    5.3145585 = sum of:
      5.3145585 = sum of:
        2.0560415 = weight(author_txt:jones in 4810) [ClassicSimilarity], result of:
          2.0560415 = score(doc=4810,freq=1.0), product of:
            0.5925794 = queryWeight, product of:
              6.939294 = idf(docFreq=116, maxDocs=44421)
              0.08539477 = queryNorm
            3.469647 = fieldWeight in 4810, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.939294 = idf(docFreq=116, maxDocs=44421)
              0.5 = fieldNorm(doc=4810)
        3.258517 = weight(author_txt:sparck in 4810) [ClassicSimilarity], result of:
          3.258517 = score(doc=4810,freq=1.0), product of:
            0.80551195 = queryWeight, product of:
              1.1659038 = boost
              8.090549 = idf(docFreq=36, maxDocs=44421)
              0.08539477 = queryNorm
            4.0452747 = fieldWeight in 4810, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.090549 = idf(docFreq=36, maxDocs=44421)
              0.5 = fieldNorm(doc=4810)
    
  4. Sparck Jones, K.: Automatic keyword classification for information retrieval (1971) 5.31
    5.3145585 = sum of:
      5.3145585 = sum of:
        2.0560415 = weight(author_txt:jones in 5175) [ClassicSimilarity], result of:
          2.0560415 = score(doc=5175,freq=1.0), product of:
            0.5925794 = queryWeight, product of:
              6.939294 = idf(docFreq=116, maxDocs=44421)
              0.08539477 = queryNorm
            3.469647 = fieldWeight in 5175, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.939294 = idf(docFreq=116, maxDocs=44421)
              0.5 = fieldNorm(doc=5175)
        3.258517 = weight(author_txt:sparck in 5175) [ClassicSimilarity], result of:
          3.258517 = score(doc=5175,freq=1.0), product of:
            0.80551195 = queryWeight, product of:
              1.1659038 = boost
              8.090549 = idf(docFreq=36, maxDocs=44421)
              0.08539477 = queryNorm
            4.0452747 = fieldWeight in 5175, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.090549 = idf(docFreq=36, maxDocs=44421)
              0.5 = fieldNorm(doc=5175)
    
  5. Sparck Jones, K.: A statistical interpretation of term specificity and its application in retrieval (1972) 5.31
    5.3145585 = sum of:
      5.3145585 = sum of:
        2.0560415 = weight(author_txt:jones in 5186) [ClassicSimilarity], result of:
          2.0560415 = score(doc=5186,freq=1.0), product of:
            0.5925794 = queryWeight, product of:
              6.939294 = idf(docFreq=116, maxDocs=44421)
              0.08539477 = queryNorm
            3.469647 = fieldWeight in 5186, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.939294 = idf(docFreq=116, maxDocs=44421)
              0.5 = fieldNorm(doc=5186)
        3.258517 = weight(author_txt:sparck in 5186) [ClassicSimilarity], result of:
          3.258517 = score(doc=5186,freq=1.0), product of:
            0.80551195 = queryWeight, product of:
              1.1659038 = boost
              8.090549 = idf(docFreq=36, maxDocs=44421)
              0.08539477 = queryNorm
            4.0452747 = fieldWeight in 5186, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.090549 = idf(docFreq=36, maxDocs=44421)
              0.5 = fieldNorm(doc=5186)
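
The score breakdowns above can be recomputed by hand. The sketch below is a reading of the printed numbers, not the search engine's own code: it assumes tf is the square root of the term frequency and idf is 1 + ln(maxDocs / (docFreq + 1)), which reproduces the idf values shown. The function names are hypothetical.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed formula; it reproduces e.g. 6.939294 for docFreq=116, maxDocs=44421.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    """Score of one query term: queryWeight * fieldWeight, as in the trees above."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm          # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Inputs copied from the author_txt entries above.
jones = term_score(1.0, 116, 44421, 0.08539477, 0.5)
sparck = term_score(1.0, 36, 44421, 0.08539477, 0.5, boost=1.1659038)
total = jones + sparck

print(round(jones, 4), round(sparck, 4), round(total, 4))
```

The results agree with the listed values (2.0560415, 3.258517 and their sum 5.3145585) up to small rounding differences, since the listing was presumably produced in single precision.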
    

Similar documents (content)

  1. Borko, H.: Research in computer based classification systems (1985) 0.78
    0.78378 = sum of:
      0.78378 = product of:
        1.1526176 = sum of:
          0.020696593 = weight(abstract_txt:documents in 4647) [ClassicSimilarity], result of:
            0.020696593 = score(doc=4647,freq=3.0), product of:
              0.07418754 = queryWeight, product of:
                1.0103313 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.017808195 = queryNorm
              0.27897668 = fieldWeight in 4647, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.0071684304 = weight(abstract_txt:research in 4647) [ClassicSimilarity], result of:
            0.0071684304 = score(doc=4647,freq=1.0), product of:
              0.058081042 = queryWeight, product of:
                1.0322499 = boost
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.017808195 = queryNorm
              0.12342117 = fieldWeight in 4647, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.009099095 = weight(abstract_txt:such in 4647) [ClassicSimilarity], result of:
            0.009099095 = score(doc=4647,freq=1.0), product of:
              0.06809008 = queryWeight, product of:
                1.1176597 = boost
                3.42101 = idf(docFreq=3945, maxDocs=44421)
                0.017808195 = queryNorm
              0.1336332 = fieldWeight in 4647, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.42101 = idf(docFreq=3945, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.027397528 = weight(abstract_txt:whether in 4647) [ClassicSimilarity], result of:
            0.027397528 = score(doc=4647,freq=2.0), product of:
              0.102385014 = queryWeight, product of:
                1.1869065 = boost
                4.8439536 = idf(docFreq=950, maxDocs=44421)
                0.017808195 = queryNorm
              0.26759315 = fieldWeight in 4647, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.8439536 = idf(docFreq=950, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.04155164 = weight(abstract_txt:question in 4647) [ClassicSimilarity], result of:
            0.04155164 = score(doc=4647,freq=3.0), product of:
              0.118065715 = queryWeight, product of:
                1.2745597 = boost
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.017808195 = queryNorm
              0.35193652 = fieldWeight in 4647, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.057383377 = weight(abstract_txt:automatically in 4647) [ClassicSimilarity], result of:
            0.057383377 = score(doc=4647,freq=4.0), product of:
              0.13302794 = queryWeight, product of:
                1.3529127 = boost
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.017808195 = queryNorm
              0.43136334 = fieldWeight in 4647, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.01688056 = weight(abstract_txt:retrieval in 4647) [ClassicSimilarity], result of:
            0.01688056 = score(doc=4647,freq=2.0), product of:
              0.0878961 = queryWeight, product of:
                1.4197357 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.017808195 = queryNorm
              0.1920513 = fieldWeight in 4647, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.025902059 = weight(abstract_txt:they in 4647) [ClassicSimilarity], result of:
            0.025902059 = score(doc=4647,freq=3.0), product of:
              0.1021498 = queryWeight, product of:
                1.5305285 = boost
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.017808195 = queryNorm
              0.25356936 = fieldWeight in 4647, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.11058285 = weight(abstract_txt:clumping in 4647) [ClassicSimilarity], result of:
            0.11058285 = score(doc=4647,freq=1.0), product of:
              0.2856715 = queryWeight, product of:
                1.6187736 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.017808195 = queryNorm
              0.38709795 = fieldWeight in 4647, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.059946686 = weight(abstract_txt:classification in 4647) [ClassicSimilarity], result of:
            0.059946686 = score(doc=4647,freq=11.0), product of:
              0.11590467 = queryWeight, product of:
                1.630321 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017808195 = queryNorm
              0.51720685 = fieldWeight in 4647, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.032534678 = weight(abstract_txt:terms in 4647) [ClassicSimilarity], result of:
            0.032534678 = score(doc=4647,freq=3.0), product of:
              0.118917465 = queryWeight, product of:
                1.6513742 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.017808195 = queryNorm
              0.27359042 = fieldWeight in 4647, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.045353755 = weight(abstract_txt:similar in 4647) [ClassicSimilarity], result of:
            0.045353755 = score(doc=4647,freq=2.0), product of:
              0.15769474 = queryWeight, product of:
                1.7008902 = boost
                5.206202 = idf(docFreq=661, maxDocs=44421)
                0.017808195 = queryNorm
              0.28760475 = fieldWeight in 4647, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.206202 = idf(docFreq=661, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.15555292 = weight(abstract_txt:jones in 4647) [ClassicSimilarity], result of:
            0.15555292 = score(doc=4647,freq=2.0), product of:
              0.35864145 = queryWeight, product of:
                2.5650623 = boost
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.017808195 = queryNorm
              0.43372825 = fieldWeight in 4647, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.036446143 = weight(abstract_txt:that in 4647) [ClassicSimilarity], result of:
            0.036446143 = score(doc=4647,freq=12.0), product of:
              0.11388898 = queryWeight, product of:
                2.7042234 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.017808195 = queryNorm
              0.32001466 = fieldWeight in 4647, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.12473442 = weight(abstract_txt:automatic in 4647) [ClassicSimilarity], result of:
            0.12473442 = score(doc=4647,freq=5.0), product of:
              0.2748518 = queryWeight, product of:
                2.970544 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.017808195 = queryNorm
              0.45382428 = fieldWeight in 4647, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.24512771 = weight(abstract_txt:sparck in 4647) [ClassicSimilarity], result of:
            0.24512771 = score(doc=4647,freq=2.0), product of:
              0.4856648 = queryWeight, product of:
                2.9849427 = boost
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.017808195 = queryNorm
              0.5047261 = fieldWeight in 4647, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
          0.13625903 = weight(abstract_txt:classes in 4647) [ClassicSimilarity], result of:
            0.13625903 = score(doc=4647,freq=3.0), product of:
              0.3456481 = queryWeight, product of:
                3.3312235 = boost
                5.8265367 = idf(docFreq=355, maxDocs=44421)
                0.017808195 = queryNorm
              0.39421317 = fieldWeight in 4647, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.8265367 = idf(docFreq=355, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4647)
        0.68 = coord(17/25)
    
  2. Hjoerland, B.; Pedersen, K.N.: A substantive theory of classification for information retrieval (2005) 0.22
    0.21606435 = sum of:
      0.21606435 = product of:
        0.77165836 = sum of:
          0.011469488 = weight(abstract_txt:research in 2892) [ClassicSimilarity], result of:
            0.011469488 = score(doc=2892,freq=1.0), product of:
              0.058081042 = queryWeight, product of:
                1.0322499 = boost
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.017808195 = queryNorm
              0.19747387 = fieldWeight in 2892, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.0625 = fieldNorm(doc=2892)
          0.033079006 = weight(abstract_txt:retrieval in 2892) [ClassicSimilarity], result of:
            0.033079006 = score(doc=2892,freq=3.0), product of:
              0.0878961 = queryWeight, product of:
                1.4197357 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.017808195 = queryNorm
              0.37634215 = fieldWeight in 2892, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=2892)
          0.057838738 = weight(abstract_txt:classification in 2892) [ClassicSimilarity], result of:
            0.057838738 = score(doc=2892,freq=4.0), product of:
              0.11590467 = queryWeight, product of:
                1.630321 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017808195 = queryNorm
              0.49901992 = fieldWeight in 2892, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=2892)
          0.24888468 = weight(abstract_txt:jones in 2892) [ClassicSimilarity], result of:
            0.24888468 = score(doc=2892,freq=2.0), product of:
              0.35864145 = queryWeight, product of:
                2.5650623 = boost
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.017808195 = queryNorm
              0.6939652 = fieldWeight in 2892, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.0625 = fieldNorm(doc=2892)
          0.016833752 = weight(abstract_txt:that in 2892) [ClassicSimilarity], result of:
            0.016833752 = score(doc=2892,freq=1.0), product of:
              0.11388898 = queryWeight, product of:
                2.7042234 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.017808195 = queryNorm
              0.14780845 = fieldWeight in 2892, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=2892)
          0.12622236 = weight(abstract_txt:automatic in 2892) [ClassicSimilarity], result of:
            0.12622236 = score(doc=2892,freq=2.0), product of:
              0.2748518 = queryWeight, product of:
                2.970544 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.017808195 = queryNorm
              0.45923787 = fieldWeight in 2892, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.0625 = fieldNorm(doc=2892)
          0.27733034 = weight(abstract_txt:sparck in 2892) [ClassicSimilarity], result of:
            0.27733034 = score(doc=2892,freq=1.0), product of:
              0.4856648 = queryWeight, product of:
                2.9849427 = boost
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.017808195 = queryNorm
              0.5710324 = fieldWeight in 2892, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.0625 = fieldNorm(doc=2892)
        0.28 = coord(7/25)
    
  3. Robertson, M.; Willett, P.: An upperbound to the performance of ranked output searching : optimal weighting of query terms using a genetic algorithm (1996) 0.21
    0.20772183 = sum of:
      0.20772183 = product of:
        1.0386091 = sum of:
          0.03819635 = weight(abstract_txt:retrieval in 46) [ClassicSimilarity], result of:
            0.03819635 = score(doc=46,freq=1.0), product of:
              0.0878961 = queryWeight, product of:
                1.4197357 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.017808195 = queryNorm
              0.4345625 = fieldWeight in 46, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.125 = fieldNorm(doc=46)
          0.0601085 = weight(abstract_txt:terms in 46) [ClassicSimilarity], result of:
            0.0601085 = score(doc=46,freq=1.0), product of:
              0.118917465 = queryWeight, product of:
                1.6513742 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.017808195 = queryNorm
              0.505464 = fieldWeight in 46, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.125 = fieldNorm(doc=46)
          0.3519761 = weight(abstract_txt:jones in 46) [ClassicSimilarity], result of:
            0.3519761 = score(doc=46,freq=1.0), product of:
              0.35864145 = queryWeight, product of:
                2.5650623 = boost
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.017808195 = queryNorm
              0.981415 = fieldWeight in 46, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.125 = fieldNorm(doc=46)
          0.033667505 = weight(abstract_txt:that in 46) [ClassicSimilarity], result of:
            0.033667505 = score(doc=46,freq=1.0), product of:
              0.11388898 = queryWeight, product of:
                2.7042234 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.017808195 = queryNorm
              0.2956169 = fieldWeight in 46, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.125 = fieldNorm(doc=46)
          0.5546607 = weight(abstract_txt:sparck in 46) [ClassicSimilarity], result of:
            0.5546607 = score(doc=46,freq=1.0), product of:
              0.4856648 = queryWeight, product of:
                2.9849427 = boost
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.017808195 = queryNorm
              1.1420648 = fieldWeight in 46, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.125 = fieldNorm(doc=46)
        0.2 = coord(5/25)
    
  4. Robertson, S.E.: On relevance weight estimation and query expansion (1986) 0.20
    0.20099653 = sum of:
      0.20099653 = product of:
        1.2562283 = sum of:
          0.047796737 = weight(abstract_txt:documents in 3874) [ClassicSimilarity], result of:
            0.047796737 = score(doc=3874,freq=1.0), product of:
              0.07418754 = queryWeight, product of:
                1.0103313 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.017808195 = queryNorm
              0.64426905 = fieldWeight in 3874, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.15625 = fieldNorm(doc=3874)
          0.075135626 = weight(abstract_txt:terms in 3874) [ClassicSimilarity], result of:
            0.075135626 = score(doc=3874,freq=1.0), product of:
              0.118917465 = queryWeight, product of:
                1.6513742 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.017808195 = queryNorm
              0.63183004 = fieldWeight in 3874, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.15625 = fieldNorm(doc=3874)
          0.4399701 = weight(abstract_txt:jones in 3874) [ClassicSimilarity], result of:
            0.4399701 = score(doc=3874,freq=1.0), product of:
              0.35864145 = queryWeight, product of:
                2.5650623 = boost
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.017808195 = queryNorm
              1.2267687 = fieldWeight in 3874, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.15625 = fieldNorm(doc=3874)
          0.6933259 = weight(abstract_txt:sparck in 3874) [ClassicSimilarity], result of:
            0.6933259 = score(doc=3874,freq=1.0), product of:
              0.4856648 = queryWeight, product of:
                2.9849427 = boost
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.017808195 = queryNorm
              1.4275811 = fieldWeight in 3874, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.15625 = fieldNorm(doc=3874)
        0.16 = coord(4/25)
    
  5. Sjögårde, P.; Ahlgren, P.; Waltman, L.: Algorithmic labeling in hierarchical classifications of publications : evaluation of bibliographic fields and term weighting approaches (2021) 0.17
    0.16629235 = sum of:
      0.16629235 = product of:
        0.51966363 = sum of:
          0.07224537 = weight(abstract_txt:classifications in 1262) [ClassicSimilarity], result of:
            0.07224537 = score(doc=1262,freq=3.0), product of:
              0.109017104 = queryWeight, product of:
                6.121738 = idf(docFreq=264, maxDocs=44421)
                0.017808195 = queryNorm
              0.66269755 = fieldWeight in 1262, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.121738 = idf(docFreq=264, maxDocs=44421)
                0.0625 = fieldNorm(doc=1262)
          0.016220307 = weight(abstract_txt:research in 1262) [ClassicSimilarity], result of:
            0.016220307 = score(doc=1262,freq=2.0), product of:
              0.058081042 = queryWeight, product of:
                1.0322499 = boost
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.017808195 = queryNorm
              0.27927023 = fieldWeight in 1262, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.0625 = fieldNorm(doc=1262)
          0.014558553 = weight(abstract_txt:such in 1262) [ClassicSimilarity], result of:
            0.014558553 = score(doc=1262,freq=1.0), product of:
              0.06809008 = queryWeight, product of:
                1.1176597 = boost
                3.42101 = idf(docFreq=3945, maxDocs=44421)
                0.017808195 = queryNorm
              0.21381313 = fieldWeight in 1262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.42101 = idf(docFreq=3945, maxDocs=44421)
                0.0625 = fieldNorm(doc=1262)
          0.059036378 = weight(abstract_txt:keywords in 1262) [ClassicSimilarity], result of:
            0.059036378 = score(doc=1262,freq=1.0), product of:
              0.15731566 = queryWeight, product of:
                1.4712425 = boost
                6.004374 = idf(docFreq=297, maxDocs=44421)
                0.017808195 = queryNorm
              0.37527338 = fieldWeight in 1262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.004374 = idf(docFreq=297, maxDocs=44421)
                0.0625 = fieldNorm(doc=1262)
          0.028919369 = weight(abstract_txt:classification in 1262) [ClassicSimilarity], result of:
            0.028919369 = score(doc=1262,freq=1.0), product of:
              0.11590467 = queryWeight, product of:
                1.630321 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017808195 = queryNorm
              0.24950996 = fieldWeight in 1262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=1262)
          0.0601085 = weight(abstract_txt:terms in 1262) [ClassicSimilarity], result of:
            0.0601085 = score(doc=1262,freq=4.0), product of:
              0.118917465 = queryWeight, product of:
                1.6513742 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.017808195 = queryNorm
              0.505464 = fieldWeight in 1262, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.0625 = fieldNorm(doc=1262)
          0.016833752 = weight(abstract_txt:that in 1262) [ClassicSimilarity], result of:
            0.016833752 = score(doc=1262,freq=1.0), product of:
              0.11388898 = queryWeight, product of:
                2.7042234 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.017808195 = queryNorm
              0.14780845 = fieldWeight in 1262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=1262)
          0.2517414 = weight(abstract_txt:classes in 1262) [ClassicSimilarity], result of:
            0.2517414 = score(doc=1262,freq=4.0), product of:
              0.3456481 = queryWeight, product of:
                3.3312235 = boost
                5.8265367 = idf(docFreq=355, maxDocs=44421)
                0.017808195 = queryNorm
              0.7283171 = fieldWeight in 1262, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.8265367 = idf(docFreq=355, maxDocs=44421)
                0.0625 = fieldNorm(doc=1262)
        0.32 = coord(8/25)
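
In the content-based list, each final score is the sum of the per-term weights multiplied by a coordination factor: the fraction of the 25 query terms that the document matches. The short sketch below recomputes the five totals from the sums and coord values printed above; the function name and the short labels are hypothetical.

```python
def coord_score(term_sum, matched, total_terms):
    """Sum of per-term weights scaled by the share of query terms matched."""
    return term_sum * (matched / total_terms)

# (label, sum of term weights, number of matched terms), copied from the trees above.
hits = [
    ("Borko 1985", 1.1526176, 17),
    ("Hjoerland/Pedersen 2005", 0.77165836, 7),
    ("Robertson/Willett 1996", 1.0386091, 5),
    ("Robertson 1986", 1.2562283, 4),
    ("Sjögårde et al. 2021", 0.51966363, 8),
]

scores = {label: coord_score(term_sum, matched, 25) for label, term_sum, matched in hits}
for label, score in scores.items():
    print(f"{label}: {score:.4f}")
```

The coordination factor explains the ranking: the Robertson 1986 record has the largest raw sum but matches only 4 of 25 terms, so it ranks below records that match more of the query.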