Document (#34533)

Author
Pong, J.Y.-H.
Kwok, R.C.-W.
Lau, R.Y.-K.
Hao, J.-X.
Wong, P.C.-C.
Title
¬A comparative study of two automatic document classification methods in a library setting
Source
Journal of information science. 34(2008) no.2, pp.213-230
Year
2008
Abstract
In current library practice, trained human experts usually carry out document cataloguing and indexing manually. With the explosive growth in the number of electronic documents available on the Internet and in digital libraries, it is increasingly difficult for library practitioners to categorize both electronic documents and traditional library materials using a manual approach alone. To improve the effectiveness and efficiency of document categorization in a library setting, more in-depth studies of automatic document classification methods for categorizing library items are required. Machine learning research has advanced rapidly in recent years. However, applying machine learning techniques to improve library practice is still a relatively unexplored area. This paper illustrates the design and development of a machine-learning-based automatic document classification system to alleviate the manual categorization problem encountered in a library setting. Two supervised machine learning algorithms have been tested. Our empirical tests show that supervised machine learning algorithms in general, and the k-nearest neighbours (KNN) algorithm in particular, can be used to develop an effective document classification system to enhance current library practice. Moreover, some concrete recommendations are made on how to apply the KNN algorithm in practice to develop automatic document classification in a library setting. To the best of our knowledge, this is the first in-depth study of applying the KNN algorithm to automatic document classification based on the widely used LCC classification scheme adopted by many large libraries.
Theme
Automatic classification

Similar documents (author)

  1. Wu, H.C.; Luk, R.W.P.; Wong, K.F.; Kwok, K.L.: ¬A retrospective study of a hybrid document-context based retrieval model (2007) 3.80
    3.8004067 = sum of:
      3.8004067 = sum of:
        1.640315 = weight(author_txt:wong in 1936) [ClassicSimilarity], result of:
          1.640315 = score(doc=1936,freq=1.0), product of:
            0.639736 = queryWeight, product of:
              8.20496 = idf(docFreq=32, maxDocs=44421)
              0.077969424 = queryNorm
            2.56405 = fieldWeight in 1936, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.20496 = idf(docFreq=32, maxDocs=44421)
              0.3125 = fieldNorm(doc=1936)
        2.1600916 = weight(author_txt:kwok in 1936) [ClassicSimilarity], result of:
          2.1600916 = score(doc=1936,freq=1.0), product of:
            0.76859474 = queryWeight, product of:
              1.0960953 = boost
              8.993418 = idf(docFreq=14, maxDocs=44421)
              0.077969424 = queryNorm
            2.810443 = fieldWeight in 1936, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.993418 = idf(docFreq=14, maxDocs=44421)
              0.3125 = fieldNorm(doc=1936)
    
  2. Kwok, K.L.: ¬The use of titles and cited titles as document representations for automatic classification (1975) 2.16
    2.1600916 = sum of:
      2.1600916 = product of:
        4.3201833 = sum of:
          4.3201833 = weight(author_txt:kwok in 4346) [ClassicSimilarity], result of:
            4.3201833 = score(doc=4346,freq=1.0), product of:
              0.76859474 = queryWeight, product of:
                1.0960953 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.077969424 = queryNorm
              5.620886 = fieldWeight in 4346, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.625 = fieldNorm(doc=4346)
        0.5 = coord(1/2)
    
  3. Kwok, K.L.: Employing multiple representations for Chinese information retrieval (1999) 2.16
    2.1600916 = sum of:
      2.1600916 = product of:
        4.3201833 = sum of:
          4.3201833 = weight(author_txt:kwok in 4773) [ClassicSimilarity], result of:
            4.3201833 = score(doc=4773,freq=1.0), product of:
              0.76859474 = queryWeight, product of:
                1.0960953 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.077969424 = queryNorm
              5.620886 = fieldWeight in 4773, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.625 = fieldNorm(doc=4773)
        0.5 = coord(1/2)
    
  4. Kwok, K.L.: ¬A network approach to probabilistic information retrieval (1995) 2.16
    2.1600916 = sum of:
      2.1600916 = product of:
        4.3201833 = sum of:
          4.3201833 = weight(author_txt:kwok in 6696) [ClassicSimilarity], result of:
            4.3201833 = score(doc=6696,freq=1.0), product of:
              0.76859474 = queryWeight, product of:
                1.0960953 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.077969424 = queryNorm
              5.620886 = fieldWeight in 6696, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.625 = fieldNorm(doc=6696)
        0.5 = coord(1/2)
    
  5. Kwok, K.L.: Improving English and Chinese ad-hoc retrieval : a TIPSTER text phase 3 project report (2000) 2.16
    2.1600916 = sum of:
      2.1600916 = product of:
        4.3201833 = sum of:
          4.3201833 = weight(author_txt:kwok in 388) [ClassicSimilarity], result of:
            4.3201833 = score(doc=388,freq=1.0), product of:
              0.76859474 = queryWeight, product of:
                1.0960953 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.077969424 = queryNorm
              5.620886 = fieldWeight in 388, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.625 = fieldNorm(doc=388)
        0.5 = coord(1/2)
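
The score trees above can be retraced by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas that the listed numbers appear to follow: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. The function names are hypothetical and chosen for illustration only; queryNorm and fieldNorm are taken directly from the listing rather than derived.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Assumed idf formula; it reproduces 8.993418 for docFreq=14, maxDocs=44421.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq):
    # Assumed tf formula; the listing shows 2.6457512 for freq=7.0.
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    # score = queryWeight * fieldWeight, as in each "product of" node.
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm
    field_weight = classic_tf(freq) * idf * field_norm
    return query_weight * field_weight


# Values copied from entry 2 (author_txt:kwok in doc 4346).
kwok = term_score(
    freq=1.0,
    doc_freq=14,
    max_docs=44421,
    query_norm=0.077969424,
    field_norm=0.625,
    boost=1.0960953,
)

# coord(1/2): only one of the two query terms matched this document.
final = kwok * (1 / 2)

print(kwok, final)
```

Up to floating-point rounding, the two printed values should land close to the 4.3201833 and 2.1600916 shown for entry 2. Entry 1 ranks higher because both author terms (wong and kwok) match, so its two term scores are summed and no coord penalty is applied.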
    

Similar documents (content)

  1. Wang, J.: ¬An extensive study on automated Dewey Decimal Classification (2009) 0.41
    0.40934035 = sum of:
      0.40934035 = product of:
        0.930319 = sum of:
          0.03249748 = weight(abstract_txt:improve in 159) [ClassicSimilarity], result of:
            0.03249748 = score(doc=159,freq=1.0), product of:
              0.104885384 = queryWeight, product of:
                1.1964381 = boost
                4.9574084 = idf(docFreq=848, maxDocs=44421)
                0.017683575 = queryNorm
              0.30983803 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9574084 = idf(docFreq=848, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.05707435 = weight(abstract_txt:depth in 159) [ClassicSimilarity], result of:
            0.05707435 = score(doc=159,freq=1.0), product of:
              0.15267777 = queryWeight, product of:
                1.4435128 = boost
                5.981156 = idf(docFreq=304, maxDocs=44421)
                0.017683575 = queryNorm
              0.37382224 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.981156 = idf(docFreq=304, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.07632011 = weight(abstract_txt:categorization in 159) [ClassicSimilarity], result of:
            0.07632011 = score(doc=159,freq=1.0), product of:
              0.18531384 = queryWeight, product of:
                1.5903279 = boost
                6.58948 = idf(docFreq=165, maxDocs=44421)
                0.017683575 = queryNorm
              0.4118425 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.58948 = idf(docFreq=165, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.11106751 = weight(abstract_txt:supervised in 159) [ClassicSimilarity], result of:
            0.11106751 = score(doc=159,freq=1.0), product of:
              0.23797968 = queryWeight, product of:
                1.8021986 = boost
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.017683575 = queryNorm
              0.46671006 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.07429292 = weight(abstract_txt:algorithm in 159) [ClassicSimilarity], result of:
            0.07429292 = score(doc=159,freq=1.0), product of:
              0.20835818 = queryWeight, product of:
                2.0653024 = boost
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.017683575 = queryNorm
              0.35656348 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.07110977 = weight(abstract_txt:learning in 159) [ClassicSimilarity], result of:
            0.07110977 = score(doc=159,freq=1.0), product of:
              0.23992825 = queryWeight, product of:
                2.8611684 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.017683575 = queryNorm
              0.29637933 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.093530595 = weight(abstract_txt:automatic in 159) [ClassicSimilarity], result of:
            0.093530595 = score(doc=159,freq=1.0), product of:
              0.28802553 = queryWeight, product of:
                3.1348603 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.017683575 = queryNorm
              0.32473022 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.09787863 = weight(abstract_txt:machine in 159) [ClassicSimilarity], result of:
            0.09787863 = score(doc=159,freq=1.0), product of:
              0.2968842 = queryWeight, product of:
                3.1827042 = boost
                5.274979 = idf(docFreq=617, maxDocs=44421)
                0.017683575 = queryNorm
              0.3296862 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.274979 = idf(docFreq=617, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.15715432 = weight(abstract_txt:classification in 159) [ClassicSimilarity], result of:
            0.15715432 = score(doc=159,freq=7.0), product of:
              0.23806164 = queryWeight, product of:
                3.3721855 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017683575 = queryNorm
              0.6601413 = fieldWeight in 159, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.07490879 = weight(abstract_txt:library in 159) [ClassicSimilarity], result of:
            0.07490879 = score(doc=159,freq=3.0), product of:
              0.21699679 = queryWeight, product of:
                3.8480825 = boost
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.017683575 = queryNorm
              0.34520692 = fieldWeight in 159, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.084484525 = weight(abstract_txt:document in 159) [ClassicSimilarity], result of:
            0.084484525 = score(doc=159,freq=1.0), product of:
              0.3147893 = queryWeight, product of:
                4.1454597 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.017683575 = queryNorm
              0.26838437 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
        0.44 = coord(11/25)
    
  2. Dietterich, T.G.: Machine-learning research : four current directions (1997) 0.36
    0.3618115 = sum of:
      0.3618115 = product of:
        1.2921839 = sum of:
          0.0474372 = weight(abstract_txt:methods in 4321) [ClassicSimilarity], result of:
            0.0474372 = score(doc=4321,freq=1.0), product of:
              0.07327141 = queryWeight, product of:
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.017683575 = queryNorm
              0.6474176 = fieldWeight in 4321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.15625 = fieldNorm(doc=4321)
          0.052870072 = weight(abstract_txt:current in 4321) [ClassicSimilarity], result of:
            0.052870072 = score(doc=4321,freq=1.0), product of:
              0.078764126 = queryWeight, product of:
                1.0368047 = boost
                4.295972 = idf(docFreq=1644, maxDocs=44421)
                0.017683575 = queryNorm
              0.6712456 = fieldWeight in 4321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.295972 = idf(docFreq=1644, maxDocs=44421)
                0.15625 = fieldNorm(doc=4321)
          0.123498656 = weight(abstract_txt:algorithms in 4321) [ClassicSimilarity], result of:
            0.123498656 = score(doc=4321,freq=1.0), product of:
              0.13866387 = queryWeight, product of:
                1.3756704 = boost
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.017683575 = queryNorm
              0.8906332 = fieldWeight in 4321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.15625 = fieldNorm(doc=4321)
          0.27766877 = weight(abstract_txt:supervised in 4321) [ClassicSimilarity], result of:
            0.27766877 = score(doc=4321,freq=1.0), product of:
              0.23797968 = queryWeight, product of:
                1.8021986 = boost
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.017683575 = queryNorm
              1.1667751 = fieldWeight in 4321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.15625 = fieldNorm(doc=4321)
          0.39751568 = weight(abstract_txt:learning in 4321) [ClassicSimilarity], result of:
            0.39751568 = score(doc=4321,freq=5.0), product of:
              0.23992825 = queryWeight, product of:
                2.8611684 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.017683575 = queryNorm
              1.6568108 = fieldWeight in 4321, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.15625 = fieldNorm(doc=4321)
          0.24469656 = weight(abstract_txt:machine in 4321) [ClassicSimilarity], result of:
            0.24469656 = score(doc=4321,freq=1.0), product of:
              0.2968842 = queryWeight, product of:
                3.1827042 = boost
                5.274979 = idf(docFreq=617, maxDocs=44421)
                0.017683575 = queryNorm
              0.8242155 = fieldWeight in 4321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.274979 = idf(docFreq=617, maxDocs=44421)
                0.15625 = fieldNorm(doc=4321)
          0.14849687 = weight(abstract_txt:classification in 4321) [ClassicSimilarity], result of:
            0.14849687 = score(doc=4321,freq=1.0), product of:
              0.23806164 = queryWeight, product of:
                3.3721855 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017683575 = queryNorm
              0.6237749 = fieldWeight in 4321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.15625 = fieldNorm(doc=4321)
        0.28 = coord(7/25)
    
  3. Li, Y.; Shawe-Taylor, J.: Advanced learning algorithms for cross-language patent retrieval and classification (2007) 0.28
    0.28000703 = sum of:
      0.28000703 = product of:
        0.875022 = sum of:
          0.0237186 = weight(abstract_txt:methods in 1931) [ClassicSimilarity], result of:
            0.0237186 = score(doc=1931,freq=1.0), product of:
              0.07327141 = queryWeight, product of:
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.017683575 = queryNorm
              0.3237088 = fieldWeight in 1931, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.078125 = fieldNorm(doc=1931)
          0.016129749 = weight(abstract_txt:based in 1931) [ClassicSimilarity], result of:
            0.016129749 = score(doc=1931,freq=1.0), product of:
              0.06486205 = queryWeight, product of:
                1.1523216 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.017683575 = queryNorm
              0.24867775 = fieldWeight in 1931, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.078125 = fieldNorm(doc=1931)
          0.123498656 = weight(abstract_txt:algorithms in 1931) [ClassicSimilarity], result of:
            0.123498656 = score(doc=1931,freq=4.0), product of:
              0.13866387 = queryWeight, product of:
                1.3756704 = boost
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.017683575 = queryNorm
              0.8906332 = fieldWeight in 1931, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.078125 = fieldNorm(doc=1931)
          0.092866145 = weight(abstract_txt:algorithm in 1931) [ClassicSimilarity], result of:
            0.092866145 = score(doc=1931,freq=1.0), product of:
              0.20835818 = queryWeight, product of:
                2.0653024 = boost
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.017683575 = queryNorm
              0.44570434 = fieldWeight in 1931, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.078125 = fieldNorm(doc=1931)
          0.23517345 = weight(abstract_txt:learning in 1931) [ClassicSimilarity], result of:
            0.23517345 = score(doc=1931,freq=7.0), product of:
              0.23992825 = queryWeight, product of:
                2.8611684 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.017683575 = queryNorm
              0.9801824 = fieldWeight in 1931, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.078125 = fieldNorm(doc=1931)
          0.1730266 = weight(abstract_txt:machine in 1931) [ClassicSimilarity], result of:
            0.1730266 = score(doc=1931,freq=2.0), product of:
              0.2968842 = queryWeight, product of:
                3.1827042 = boost
                5.274979 = idf(docFreq=617, maxDocs=44421)
                0.017683575 = queryNorm
              0.5828084 = fieldWeight in 1931, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.274979 = idf(docFreq=617, maxDocs=44421)
                0.078125 = fieldNorm(doc=1931)
          0.10500314 = weight(abstract_txt:classification in 1931) [ClassicSimilarity], result of:
            0.10500314 = score(doc=1931,freq=2.0), product of:
              0.23806164 = queryWeight, product of:
                3.3721855 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017683575 = queryNorm
              0.44107544 = fieldWeight in 1931, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=1931)
          0.105605654 = weight(abstract_txt:document in 1931) [ClassicSimilarity], result of:
            0.105605654 = score(doc=1931,freq=1.0), product of:
              0.3147893 = queryWeight, product of:
                4.1454597 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.017683575 = queryNorm
              0.33548045 = fieldWeight in 1931, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.078125 = fieldNorm(doc=1931)
        0.32 = coord(8/25)
    
  4. Goller, C.; Löning, J.; Will, T.; Wolff, W.: Automatic document classification : a thorough evaluation of various methods (2000) 0.25
    0.24901567 = sum of:
      0.24901567 = product of:
        0.778174 = sum of:
          0.04108182 = weight(abstract_txt:methods in 6480) [ClassicSimilarity], result of:
            0.04108182 = score(doc=6480,freq=3.0), product of:
              0.07327141 = queryWeight, product of:
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.017683575 = queryNorm
              0.5606801 = fieldWeight in 6480, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.078125 = fieldNorm(doc=6480)
          0.016129749 = weight(abstract_txt:based in 6480) [ClassicSimilarity], result of:
            0.016129749 = score(doc=6480,freq=1.0), product of:
              0.06486205 = queryWeight, product of:
                1.1523216 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.017683575 = queryNorm
              0.24867775 = fieldWeight in 6480, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.078125 = fieldNorm(doc=6480)
          0.04062185 = weight(abstract_txt:improve in 6480) [ClassicSimilarity], result of:
            0.04062185 = score(doc=6480,freq=1.0), product of:
              0.104885384 = queryWeight, product of:
                1.1964381 = boost
                4.9574084 = idf(docFreq=848, maxDocs=44421)
                0.017683575 = queryNorm
              0.38729754 = fieldWeight in 6480, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9574084 = idf(docFreq=848, maxDocs=44421)
                0.078125 = fieldNorm(doc=6480)
          0.1257055 = weight(abstract_txt:learning in 6480) [ClassicSimilarity], result of:
            0.1257055 = score(doc=6480,freq=2.0), product of:
              0.23992825 = queryWeight, product of:
                2.8611684 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.017683575 = queryNorm
              0.52392954 = fieldWeight in 6480, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.078125 = fieldNorm(doc=6480)
          0.11691324 = weight(abstract_txt:automatic in 6480) [ClassicSimilarity], result of:
            0.11691324 = score(doc=6480,freq=1.0), product of:
              0.28802553 = queryWeight, product of:
                3.1348603 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.017683575 = queryNorm
              0.40591276 = fieldWeight in 6480, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.078125 = fieldNorm(doc=6480)
          0.12234828 = weight(abstract_txt:machine in 6480) [ClassicSimilarity], result of:
            0.12234828 = score(doc=6480,freq=1.0), product of:
              0.2968842 = queryWeight, product of:
                3.1827042 = boost
                5.274979 = idf(docFreq=617, maxDocs=44421)
                0.017683575 = queryNorm
              0.41210774 = fieldWeight in 6480, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.274979 = idf(docFreq=617, maxDocs=44421)
                0.078125 = fieldNorm(doc=6480)
          0.16602455 = weight(abstract_txt:classification in 6480) [ClassicSimilarity], result of:
            0.16602455 = score(doc=6480,freq=5.0), product of:
              0.23806164 = queryWeight, product of:
                3.3721855 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017683575 = queryNorm
              0.6974015 = fieldWeight in 6480, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=6480)
          0.14934896 = weight(abstract_txt:document in 6480) [ClassicSimilarity], result of:
            0.14934896 = score(doc=6480,freq=2.0), product of:
              0.3147893 = queryWeight, product of:
                4.1454597 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.017683575 = queryNorm
              0.47444102 = fieldWeight in 6480, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.078125 = fieldNorm(doc=6480)
        0.32 = coord(8/25)
    
  5. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.24
    0.24250545 = sum of:
      0.24250545 = product of:
        0.8660909 = sum of:
          0.02846232 = weight(abstract_txt:methods in 2595) [ClassicSimilarity], result of:
            0.02846232 = score(doc=2595,freq=1.0), product of:
              0.07327141 = queryWeight, product of:
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.017683575 = queryNorm
              0.38845056 = fieldWeight in 2595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
          0.0193557 = weight(abstract_txt:based in 2595) [ClassicSimilarity], result of:
            0.0193557 = score(doc=2595,freq=1.0), product of:
              0.06486205 = queryWeight, product of:
                1.1523216 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.017683575 = queryNorm
              0.2984133 = fieldWeight in 2595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
          0.1618994 = weight(abstract_txt:categorization in 2595) [ClassicSimilarity], result of:
            0.1618994 = score(doc=2595,freq=2.0), product of:
              0.18531384 = queryWeight, product of:
                1.5903279 = boost
                6.58948 = idf(docFreq=165, maxDocs=44421)
                0.017683575 = queryNorm
              0.87364984 = fieldWeight in 2595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.58948 = idf(docFreq=165, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
          0.15759908 = weight(abstract_txt:algorithm in 2595) [ClassicSimilarity], result of:
            0.15759908 = score(doc=2595,freq=2.0), product of:
              0.20835818 = queryWeight, product of:
                2.0653024 = boost
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.017683575 = queryNorm
              0.7563853 = fieldWeight in 2595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
          0.1508466 = weight(abstract_txt:learning in 2595) [ClassicSimilarity], result of:
            0.1508466 = score(doc=2595,freq=2.0), product of:
              0.23992825 = queryWeight, product of:
                2.8611684 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.017683575 = queryNorm
              0.62871546 = fieldWeight in 2595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
          0.1402959 = weight(abstract_txt:automatic in 2595) [ClassicSimilarity], result of:
            0.1402959 = score(doc=2595,freq=1.0), product of:
              0.28802553 = queryWeight, product of:
                3.1348603 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.017683575 = queryNorm
              0.48709533 = fieldWeight in 2595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
          0.20763192 = weight(abstract_txt:machine in 2595) [ClassicSimilarity], result of:
            0.20763192 = score(doc=2595,freq=2.0), product of:
              0.2968842 = queryWeight, product of:
                3.1827042 = boost
                5.274979 = idf(docFreq=617, maxDocs=44421)
                0.017683575 = queryNorm
              0.69937 = fieldWeight in 2595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.274979 = idf(docFreq=617, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
        0.28 = coord(7/25)