Document (#27235)

Author
Díaz, I.
Ranilla, J.
Montañes, E.
Fernández, J.
Combarro, E.F.
Title
Improving performance of text categorization by combining filtering and support vector machines
Source
Journal of the American Society for Information Science and Technology. 55(2004) no.7, S.579-592
Year
2004
Abstract
Text Categorization is the process of assigning documents to a set of previously fixed categories. A lot of research is going on with the goal of automating this time-consuming task. Several different algorithms have been applied, and Support Vector Machines (SVM) have shown very good results. In this report, we try to prove that a previous filtering of the words used by SVM in the classification can improve the overall performance. This hypothesis is systematically tested with three different measures of word relevance, on two different corpora (one of them considered in three different splits), and with both local and global vocabularies. The results show that filtering significantly improves the recall of the method, and that this in turn significantly improves the overall performance.
Theme
Automatisches Klassifizieren

Similar documents (author)

  1. Díaz, P.: Usability of hypermedia educational e-books (2003) 1.94
    1.9362745 = sum of:
      1.9362745 = product of:
        3.872549 = sum of:
          3.872549 = weight(author_txt:díaz in 2198) [ClassicSimilarity], result of:
            3.872549 = score(doc=2198,freq=1.0), product of:
              0.7233362 = queryWeight, product of:
                1.0235039 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.0825038 = queryNorm
              5.353733 = fieldWeight in 2198, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.625 = fieldNorm(doc=2198)
        0.5 = coord(1/2)
    
  2. Díaz, J.P. -> Pino-Díaz, J.: 1.92
    1.9168141 = sum of:
      1.9168141 = product of:
        3.8336282 = sum of:
          3.8336282 = weight(author_txt:díaz in 1052) [ClassicSimilarity], result of:
            3.8336282 = score(doc=1052,freq=2.0), product of:
              0.7233362 = queryWeight, product of:
                1.0235039 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.0825038 = queryNorm
              5.299926 = fieldWeight in 1052, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.4375 = fieldNorm(doc=1052)
        0.5 = coord(1/2)
    
  3. Moreno Fernández, L.M. -> Fernández, L.M.M.: 1.79
    1.7877691 = sum of:
      1.7877691 = product of:
        3.5755382 = sum of:
          3.5755382 = weight(author_txt:fernández in 5950) [ClassicSimilarity], result of:
            3.5755382 = score(doc=5950,freq=2.0), product of:
              0.690496 = queryWeight, product of:
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.0825038 = queryNorm
              5.178217 = fieldWeight in 5950, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.4375 = fieldNorm(doc=5950)
        0.5 = coord(1/2)
    
  4. Esteban, A. Díaz -> Díaz Esteban, A.: 1.64
    1.6429834 = sum of:
      1.6429834 = product of:
        3.2859669 = sum of:
          3.2859669 = weight(author_txt:díaz in 3747) [ClassicSimilarity], result of:
            3.2859669 = score(doc=3747,freq=2.0), product of:
              0.7233362 = queryWeight, product of:
                1.0235039 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.0825038 = queryNorm
              4.5427933 = fieldWeight in 3747, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.375 = fieldNorm(doc=3747)
        0.5 = coord(1/2)
    
  5. Díaz, N.P. Cruz -> Cruz Díaz, N.P.: 1.64
    1.6429834 = sum of:
      1.6429834 = product of:
        3.2859669 = sum of:
          3.2859669 = weight(author_txt:díaz in 1233) [ClassicSimilarity], result of:
            3.2859669 = score(doc=1233,freq=2.0), product of:
              0.7233362 = queryWeight, product of:
                1.0235039 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.0825038 = queryNorm
              4.5427933 = fieldWeight in 1233, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.375 = fieldNorm(doc=1233)
        0.5 = coord(1/2)
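
The score trees above all follow the same arithmetic, which can be sketched in a few lines of Python. This is a minimal sketch, not the search engine's own code: the function names are hypothetical, and the idf form `1 + ln(maxDocs / (docFreq + 1))` is an assumption chosen because it reproduces the idf values listed here (other Lucene versions use a slightly different expression). The `queryNorm` and `fieldNorm` values are taken from the listing as given rather than derived.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Assumed idf form; it matches idf(docFreq=22, maxDocs=44421) = 8.565973 in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    """Score of one matching term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                      # tf(freq) is the square root of termFreq
    query_weight = boost * idf * query_norm   # "queryWeight, product of"
    field_weight = tf * idf * field_norm      # "fieldWeight, product of"
    return query_weight * field_weight


# Entry 1 (doc 2198): one of two query terms matched, so coord = 1/2.
term_score_2198 = classic_term_score(
    freq=1.0, doc_freq=22, max_docs=44421,
    query_norm=0.0825038, field_norm=0.625, boost=1.0235039,
)
final_score_2198 = term_score_2198 * (1 / 2)
print(final_score_2198)
```

Because the listing rounds its inputs and the engine works in single precision, the result should agree with the listed 1.9362745 only up to small rounding differences.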
    

Similar documents (content)

  1. Yang, Y.; Liu, X.: A re-examination of text categorization methods (1999) 0.41
    0.41348273 = sum of:
      0.41348273 = product of:
        0.93973345 = sum of:
          0.026368625 = weight(abstract_txt:results in 4386) [ClassicSimilarity], result of:
            0.026368625 = score(doc=4386,freq=1.0), product of:
              0.08085484 = queryWeight, product of:
                1.0356408 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.022443296 = queryNorm
              0.32612303 = fieldWeight in 4386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.09375 = fieldNorm(doc=4386)
          0.017576033 = weight(abstract_txt:that in 4386) [ClassicSimilarity], result of:
            0.017576033 = score(doc=4386,freq=2.0), product of:
              0.05605513 = queryWeight, product of:
                1.0561107 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.022443296 = queryNorm
              0.31354907 = fieldWeight in 4386, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.09375 = fieldNorm(doc=4386)
          0.013090678 = weight(abstract_txt:this in 4386) [ClassicSimilarity], result of:
            0.013090678 = score(doc=4386,freq=1.0), product of:
              0.05803004 = queryWeight, product of:
                1.074554 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.022443296 = queryNorm
              0.2255845 = fieldWeight in 4386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.09375 = fieldNorm(doc=4386)
          0.020666638 = weight(abstract_txt:with in 4386) [ClassicSimilarity], result of:
            0.020666638 = score(doc=4386,freq=2.0), product of:
              0.062447447 = queryWeight, product of:
                1.1147029 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.022443296 = queryNorm
              0.33094448 = fieldWeight in 4386, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.09375 = fieldNorm(doc=4386)
          0.04133191 = weight(abstract_txt:text in 4386) [ClassicSimilarity], result of:
            0.04133191 = score(doc=4386,freq=1.0), product of:
              0.10910333 = queryWeight, product of:
                1.2030264 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.022443296 = queryNorm
              0.3788327 = fieldWeight in 4386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.09375 = fieldNorm(doc=4386)
          0.05238469 = weight(abstract_txt:support in 4386) [ClassicSimilarity], result of:
            0.05238469 = score(doc=4386,freq=1.0), product of:
              0.12777638 = queryWeight, product of:
                1.3019115 = boost
                4.37303 = idf(docFreq=1522, maxDocs=44421)
                0.022443296 = queryNorm
              0.4099716 = fieldWeight in 4386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.37303 = idf(docFreq=1522, maxDocs=44421)
                0.09375 = fieldNorm(doc=4386)
          0.10358664 = weight(abstract_txt:significantly in 4386) [ClassicSimilarity], result of:
            0.10358664 = score(doc=4386,freq=1.0), product of:
              0.20130296 = queryWeight, product of:
                1.6341099 = boost
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.022443296 = queryNorm
              0.5145808 = fieldWeight in 4386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.09375 = fieldNorm(doc=4386)
          0.17359471 = weight(abstract_txt:vector in 4386) [ClassicSimilarity], result of:
            0.17359471 = score(doc=4386,freq=1.0), product of:
              0.2840133 = queryWeight, product of:
                1.9409999 = boost
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.022443296 = queryNorm
              0.61122036 = fieldWeight in 4386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.09375 = fieldNorm(doc=4386)
          0.17922984 = weight(abstract_txt:categorization in 4386) [ClassicSimilarity], result of:
            0.17922984 = score(doc=4386,freq=1.0), product of:
              0.29012683 = queryWeight, product of:
                1.9617791 = boost
                6.58948 = idf(docFreq=165, maxDocs=44421)
                0.022443296 = queryNorm
              0.61776376 = fieldWeight in 4386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.58948 = idf(docFreq=165, maxDocs=44421)
                0.09375 = fieldNorm(doc=4386)
          0.092641614 = weight(abstract_txt:performance in 4386) [ClassicSimilarity], result of:
            0.092641614 = score(doc=4386,freq=1.0), product of:
              0.21390232 = queryWeight, product of:
                2.063049 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.022443296 = queryNorm
              0.43310243 = fieldWeight in 4386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.09375 = fieldNorm(doc=4386)
          0.21926205 = weight(abstract_txt:machines in 4386) [ClassicSimilarity], result of:
            0.21926205 = score(doc=4386,freq=1.0), product of:
              0.33186135 = queryWeight, product of:
                2.0981402 = boost
                7.0475073 = idf(docFreq=104, maxDocs=44421)
                0.022443296 = queryNorm
              0.6607038 = fieldWeight in 4386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0475073 = idf(docFreq=104, maxDocs=44421)
                0.09375 = fieldNorm(doc=4386)
        0.44 = coord(11/25)
    
  2. Peng, F.; Huang, X.: Machine learning for Asian language text classification (2007) 0.32
    0.31718466 = sum of:
      0.31718466 = product of:
        0.72087425 = sum of:
          0.01435077 = weight(abstract_txt:that in 1831) [ClassicSimilarity], result of:
            0.01435077 = score(doc=1831,freq=3.0), product of:
              0.05605513 = queryWeight, product of:
                1.0561107 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.022443296 = queryNorm
              0.25601172 = fieldWeight in 1831, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.01234201 = weight(abstract_txt:this in 1831) [ClassicSimilarity], result of:
            0.01234201 = score(doc=1831,freq=2.0), product of:
              0.05803004 = queryWeight, product of:
                1.074554 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.022443296 = queryNorm
              0.21268311 = fieldWeight in 1831, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.01687424 = weight(abstract_txt:with in 1831) [ClassicSimilarity], result of:
            0.01687424 = score(doc=1831,freq=3.0), product of:
              0.062447447 = queryWeight, product of:
                1.1147029 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.022443296 = queryNorm
              0.27021506 = fieldWeight in 1831, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.061613973 = weight(abstract_txt:text in 1831) [ClassicSimilarity], result of:
            0.061613973 = score(doc=1831,freq=5.0), product of:
              0.10910333 = queryWeight, product of:
                1.2030264 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.022443296 = queryNorm
              0.56473047 = fieldWeight in 1831, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.034923125 = weight(abstract_txt:support in 1831) [ClassicSimilarity], result of:
            0.034923125 = score(doc=1831,freq=1.0), product of:
              0.12777638 = queryWeight, product of:
                1.3019115 = boost
                4.37303 = idf(docFreq=1522, maxDocs=44421)
                0.022443296 = queryNorm
              0.2733144 = fieldWeight in 1831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.37303 = idf(docFreq=1522, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.069057755 = weight(abstract_txt:significantly in 1831) [ClassicSimilarity], result of:
            0.069057755 = score(doc=1831,freq=1.0), product of:
              0.20130296 = queryWeight, product of:
                1.6341099 = boost
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.022443296 = queryNorm
              0.34305385 = fieldWeight in 1831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.08534611 = weight(abstract_txt:improving in 1831) [ClassicSimilarity], result of:
            0.08534611 = score(doc=1831,freq=1.0), product of:
              0.23182718 = queryWeight, product of:
                1.7536315 = boost
                5.8903265 = idf(docFreq=333, maxDocs=44421)
                0.022443296 = queryNorm
              0.3681454 = fieldWeight in 1831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8903265 = idf(docFreq=333, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.11572981 = weight(abstract_txt:vector in 1831) [ClassicSimilarity], result of:
            0.11572981 = score(doc=1831,freq=1.0), product of:
              0.2840133 = queryWeight, product of:
                1.9409999 = boost
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.022443296 = queryNorm
              0.40748024 = fieldWeight in 1831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.12352215 = weight(abstract_txt:performance in 1831) [ClassicSimilarity], result of:
            0.12352215 = score(doc=1831,freq=4.0), product of:
              0.21390232 = queryWeight, product of:
                2.063049 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.022443296 = queryNorm
              0.5774699 = fieldWeight in 1831, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.1461747 = weight(abstract_txt:machines in 1831) [ClassicSimilarity], result of:
            0.1461747 = score(doc=1831,freq=1.0), product of:
              0.33186135 = queryWeight, product of:
                2.0981402 = boost
                7.0475073 = idf(docFreq=104, maxDocs=44421)
                0.022443296 = queryNorm
              0.4404692 = fieldWeight in 1831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0475073 = idf(docFreq=104, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.040939614 = weight(abstract_txt:different in 1831) [ClassicSimilarity], result of:
            0.040939614 = score(doc=1831,freq=1.0), product of:
              0.17898406 = queryWeight, product of:
                2.179106 = boost
                3.6597328 = idf(docFreq=3107, maxDocs=44421)
                0.022443296 = queryNorm
              0.2287333 = fieldWeight in 1831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6597328 = idf(docFreq=3107, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
        0.44 = coord(11/25)
    
  3. Mostafa, J.; Quiroga, L.M.; Palakal, M.: Filtering medical documents using automated and human classification methods (1998) 0.28
    0.2847877 = sum of:
      0.2847877 = product of:
        0.8899616 = sum of:
          0.079129554 = weight(abstract_txt:improves in 3326) [ClassicSimilarity], result of:
            0.079129554 = score(doc=3326,freq=1.0), product of:
              0.15077095 = queryWeight, product of:
                6.717861 = idf(docFreq=145, maxDocs=44421)
                0.022443296 = queryNorm
              0.5248329 = fieldWeight in 3326, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.717861 = idf(docFreq=145, maxDocs=44421)
                0.078125 = fieldNorm(doc=3326)
          0.031075722 = weight(abstract_txt:results in 3326) [ClassicSimilarity], result of:
            0.031075722 = score(doc=3326,freq=2.0), product of:
              0.08085484 = queryWeight, product of:
                1.0356408 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.022443296 = queryNorm
              0.38433966 = fieldWeight in 3326, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.078125 = fieldNorm(doc=3326)
          0.010356776 = weight(abstract_txt:that in 3326) [ClassicSimilarity], result of:
            0.010356776 = score(doc=3326,freq=1.0), product of:
              0.05605513 = queryWeight, product of:
                1.0561107 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.022443296 = queryNorm
              0.18476056 = fieldWeight in 3326, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.078125 = fieldNorm(doc=3326)
          0.010908898 = weight(abstract_txt:this in 3326) [ClassicSimilarity], result of:
            0.010908898 = score(doc=3326,freq=1.0), product of:
              0.05803004 = queryWeight, product of:
                1.074554 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.022443296 = queryNorm
              0.18798709 = fieldWeight in 3326, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.078125 = fieldNorm(doc=3326)
          0.0172222 = weight(abstract_txt:with in 3326) [ClassicSimilarity], result of:
            0.0172222 = score(doc=3326,freq=2.0), product of:
              0.062447447 = queryWeight, product of:
                1.1147029 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.022443296 = queryNorm
              0.2757871 = fieldWeight in 3326, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.078125 = fieldNorm(doc=3326)
          0.15440269 = weight(abstract_txt:performance in 3326) [ClassicSimilarity], result of:
            0.15440269 = score(doc=3326,freq=4.0), product of:
              0.21390232 = queryWeight, product of:
                2.063049 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.022443296 = queryNorm
              0.72183734 = fieldWeight in 3326, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.078125 = fieldNorm(doc=3326)
          0.051174518 = weight(abstract_txt:different in 3326) [ClassicSimilarity], result of:
            0.051174518 = score(doc=3326,freq=1.0), product of:
              0.17898406 = queryWeight, product of:
                2.179106 = boost
                3.6597328 = idf(docFreq=3107, maxDocs=44421)
                0.022443296 = queryNorm
              0.28591663 = fieldWeight in 3326, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6597328 = idf(docFreq=3107, maxDocs=44421)
                0.078125 = fieldNorm(doc=3326)
          0.53569126 = weight(abstract_txt:filtering in 3326) [ClassicSimilarity], result of:
            0.53569126 = score(doc=3326,freq=6.0), product of:
              0.42824426 = queryWeight, product of:
                2.9190905 = boost
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.022443296 = queryNorm
              1.2509012 = fieldWeight in 3326, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.078125 = fieldNorm(doc=3326)
        0.32 = coord(8/25)
    
  4. Quiroga, L.M.; Mostafa, J.: An experiment in building profiles in information filtering : the role of context of user relevance feedback (2002) 0.26
    0.2558528 = sum of:
      0.2558528 = product of:
        0.7107021 = sum of:
          0.024860578 = weight(abstract_txt:results in 3579) [ClassicSimilarity], result of:
            0.024860578 = score(doc=3579,freq=2.0), product of:
              0.08085484 = queryWeight, product of:
                1.0356408 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.022443296 = queryNorm
              0.30747172 = fieldWeight in 3579, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0625 = fieldNorm(doc=3579)
          0.016570844 = weight(abstract_txt:that in 3579) [ClassicSimilarity], result of:
            0.016570844 = score(doc=3579,freq=4.0), product of:
              0.05605513 = queryWeight, product of:
                1.0561107 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.022443296 = queryNorm
              0.2956169 = fieldWeight in 3579, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=3579)
          0.008727118 = weight(abstract_txt:this in 3579) [ClassicSimilarity], result of:
            0.008727118 = score(doc=3579,freq=1.0), product of:
              0.05803004 = queryWeight, product of:
                1.074554 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.022443296 = queryNorm
              0.15038967 = fieldWeight in 3579, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0625 = fieldNorm(doc=3579)
          0.0097423475 = weight(abstract_txt:with in 3579) [ClassicSimilarity], result of:
            0.0097423475 = score(doc=3579,freq=1.0), product of:
              0.062447447 = queryWeight, product of:
                1.1147029 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.022443296 = queryNorm
              0.15600874 = fieldWeight in 3579, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.0625 = fieldNorm(doc=3579)
          0.050411925 = weight(abstract_txt:three in 3579) [ClassicSimilarity], result of:
            0.050411925 = score(doc=3579,freq=2.0), product of:
              0.12953508 = queryWeight, product of:
                1.3108405 = boost
                4.4030223 = idf(docFreq=1477, maxDocs=44421)
                0.022443296 = queryNorm
              0.38917586 = fieldWeight in 3579, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4030223 = idf(docFreq=1477, maxDocs=44421)
                0.0625 = fieldNorm(doc=3579)
          0.069057755 = weight(abstract_txt:significantly in 3579) [ClassicSimilarity], result of:
            0.069057755 = score(doc=3579,freq=1.0), product of:
              0.20130296 = queryWeight, product of:
                1.6341099 = boost
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.022443296 = queryNorm
              0.34305385 = fieldWeight in 3579, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.0625 = fieldNorm(doc=3579)
          0.12352215 = weight(abstract_txt:performance in 3579) [ClassicSimilarity], result of:
            0.12352215 = score(doc=3579,freq=4.0), product of:
              0.21390232 = queryWeight, product of:
                2.063049 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.022443296 = queryNorm
              0.5774699 = fieldWeight in 3579, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.0625 = fieldNorm(doc=3579)
          0.05789736 = weight(abstract_txt:different in 3579) [ClassicSimilarity], result of:
            0.05789736 = score(doc=3579,freq=2.0), product of:
              0.17898406 = queryWeight, product of:
                2.179106 = boost
                3.6597328 = idf(docFreq=3107, maxDocs=44421)
                0.022443296 = queryNorm
              0.32347775 = fieldWeight in 3579, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6597328 = idf(docFreq=3107, maxDocs=44421)
                0.0625 = fieldNorm(doc=3579)
          0.34991205 = weight(abstract_txt:filtering in 3579) [ClassicSimilarity], result of:
            0.34991205 = score(doc=3579,freq=4.0), product of:
              0.42824426 = queryWeight, product of:
                2.9190905 = boost
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.022443296 = queryNorm
              0.8170852 = fieldWeight in 3579, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.0625 = fieldNorm(doc=3579)
        0.36 = coord(9/25)
    
  5. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.24
    0.24433348 = sum of:
      0.24433348 = product of:
        0.6787041 = sum of:
          0.09495547 = weight(abstract_txt:improves in 2595) [ClassicSimilarity], result of:
            0.09495547 = score(doc=2595,freq=1.0), product of:
              0.15077095 = queryWeight, product of:
                6.717861 = idf(docFreq=145, maxDocs=44421)
                0.022443296 = queryNorm
              0.6297995 = fieldWeight in 2595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.717861 = idf(docFreq=145, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
          0.026368625 = weight(abstract_txt:results in 2595) [ClassicSimilarity], result of:
            0.026368625 = score(doc=2595,freq=1.0), product of:
              0.08085484 = queryWeight, product of:
                1.0356408 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.022443296 = queryNorm
              0.32612303 = fieldWeight in 2595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
          0.021526156 = weight(abstract_txt:that in 2595) [ClassicSimilarity], result of:
            0.021526156 = score(doc=2595,freq=3.0), product of:
              0.05605513 = queryWeight, product of:
                1.0561107 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.022443296 = queryNorm
              0.3840176 = fieldWeight in 2595, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
          0.013090678 = weight(abstract_txt:this in 2595) [ClassicSimilarity], result of:
            0.013090678 = score(doc=2595,freq=1.0), product of:
              0.05803004 = queryWeight, product of:
                1.074554 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.022443296 = queryNorm
              0.2255845 = fieldWeight in 2595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
          0.014613521 = weight(abstract_txt:with in 2595) [ClassicSimilarity], result of:
            0.014613521 = score(doc=2595,freq=1.0), product of:
              0.062447447 = queryWeight, product of:
                1.1147029 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.022443296 = queryNorm
              0.23401311 = fieldWeight in 2595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
          0.058452144 = weight(abstract_txt:text in 2595) [ClassicSimilarity], result of:
            0.058452144 = score(doc=2595,freq=2.0), product of:
              0.10910333 = queryWeight, product of:
                1.2030264 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.022443296 = queryNorm
              0.5357503 = fieldWeight in 2595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
          0.10358664 = weight(abstract_txt:significantly in 2595) [ClassicSimilarity], result of:
            0.10358664 = score(doc=2595,freq=1.0), product of:
              0.20130296 = queryWeight, product of:
                1.6341099 = boost
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.022443296 = queryNorm
              0.5145808 = fieldWeight in 2595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
          0.25346926 = weight(abstract_txt:categorization in 2595) [ClassicSimilarity], result of:
            0.25346926 = score(doc=2595,freq=2.0), product of:
              0.29012683 = queryWeight, product of:
                1.9617791 = boost
                6.58948 = idf(docFreq=165, maxDocs=44421)
                0.022443296 = queryNorm
              0.87364984 = fieldWeight in 2595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.58948 = idf(docFreq=165, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
          0.092641614 = weight(abstract_txt:performance in 2595) [ClassicSimilarity], result of:
            0.092641614 = score(doc=2595,freq=1.0), product of:
              0.21390232 = queryWeight, product of:
                2.063049 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.022443296 = queryNorm
              0.43310243 = fieldWeight in 2595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.09375 = fieldNorm(doc=2595)
        0.36 = coord(9/25)
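
For the content-based matches, the final figure is the sum of the per-term weights multiplied by the coordination factor, that is, the fraction of the 25 query terms found in the document. The sketch below checks this for entry 5 (doc 2595) using the nine term weights copied from its listing; the function name `combine_with_coord` is hypothetical and used only for illustration.

```python
def combine_with_coord(term_weights, query_term_count):
    """Sum the matched term weights and scale by coord = matched / total query terms."""
    coord = len(term_weights) / query_term_count
    return sum(term_weights) * coord


# Per-term weights of doc 2595, in the order listed:
# improves, results, that, this, with, text, significantly, categorization, performance
weights_doc_2595 = [
    0.09495547, 0.026368625, 0.021526156, 0.013090678, 0.014613521,
    0.058452144, 0.10358664, 0.25346926, 0.092641614,
]

score_2595 = combine_with_coord(weights_doc_2595, query_term_count=25)
print(score_2595)
```

Nine matched terms out of 25 give coord = 0.36, so the result should be close to the 0.24433348 shown for this entry. The same factor explains why a document with a larger raw sum can still rank lower when it matches fewer query terms.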