Document (#40015)

Author
Salles, T.
Rocha, L.
Gonçalves, M.A.
Almeida, J.M.
Mourão, F.
Meira Jr., W.
Viegas, F.
Title
¬A quantitative analysis of the temporal effects on automatic text classification
Source
Journal of the Association for Information Science and Technology. 67(2016) no.7, pp. 1639-1667
Year
2016
Abstract
Automatic text classification (TC) continues to be a relevant research topic and several TC algorithms have been proposed. However, the majority of TC algorithms assume that the underlying data distribution does not change over time. In this work, we are concerned with the challenges imposed by the temporal dynamics observed in textual data sets. We provide evidence of the existence of temporal effects in three textual data sets, reflected by variations observed over time in the class distribution, in the pairwise class similarities, and in the relationships between terms and classes. We then quantify, using a series of full factorial design experiments, the impact of these effects on four well-known TC algorithms. We show that these temporal effects affect each analyzed data set differently and that they restrict the performance of each considered TC algorithm to different extents. The reported quantitative analyses, which are the original contributions of this article, provide valuable new insights to better understand the behavior of TC algorithms when faced with nonstatic (temporal) data distributions and highlight important requirements for the proposal of more accurate classification models.
Content
Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23452/abstract.
Theme
Automatic classification (Automatisches Klassifizieren)

Similar documents (author)

  1. Belém, F.M.; Almeida, J.M.; Gonçalves, M.A.: ¬A survey on tag recommendation methods : a review (2017) 2.19
    2.188672 = sum of:
      2.188672 = product of:
        3.283008 = sum of:
          1.5266947 = weight(author_txt:almeida in 4524) [ClassicSimilarity], result of:
            1.5266947 = score(doc=4524,freq=1.0), product of:
              0.49799785 = queryWeight, product of:
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.06091637 = queryNorm
              3.0656652 = fieldWeight in 4524, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.375 = fieldNorm(doc=4524)
          1.7563133 = weight(author_txt:gonçalves in 4524) [ClassicSimilarity], result of:
            1.7563133 = score(doc=4524,freq=1.0), product of:
              0.54675657 = queryWeight, product of:
                1.0478117 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.06091637 = queryNorm
              3.21224 = fieldWeight in 4524, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.375 = fieldNorm(doc=4524)
        0.6666667 = coord(2/3)
    
  2. Martins, E.F.; Belém, F.M.; Almeida, J.M.; Gonçalves, M.A.: On cold start for associative tag recommendation (2016) 1.82
    1.8238933 = sum of:
      1.8238933 = product of:
        2.7358398 = sum of:
          1.2722455 = weight(author_txt:almeida in 3494) [ClassicSimilarity], result of:
            1.2722455 = score(doc=3494,freq=1.0), product of:
              0.49799785 = queryWeight, product of:
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.06091637 = queryNorm
              2.5547209 = fieldWeight in 3494, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.3125 = fieldNorm(doc=3494)
          1.4635943 = weight(author_txt:gonçalves in 3494) [ClassicSimilarity], result of:
            1.4635943 = score(doc=3494,freq=1.0), product of:
              0.54675657 = queryWeight, product of:
                1.0478117 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.06091637 = queryNorm
              2.6768665 = fieldWeight in 3494, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.3125 = fieldNorm(doc=3494)
        0.6666667 = coord(2/3)
    
  3. Souza, R. Rocha => Rocha Souza, R.: 1.13
    1.1308844 = sum of:
      1.1308844 = product of:
        3.392653 = sum of:
          3.392653 = weight(author_txt:rocha in 708) [ClassicSimilarity], result of:
            3.392653 = score(doc=708,freq=2.0), product of:
              0.6730939 = queryWeight, product of:
                1.1625834 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.06091637 = queryNorm
              5.0403857 = fieldWeight in 708, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.375 = fieldNorm(doc=708)
        0.33333334 = coord(1/3)
    
  4. Silva, N.; Rocha, J.: Merging ontologies using a bottom-up lexical and structural approach (2003) 1.07
    1.0662081 = sum of:
      1.0662081 = product of:
        3.1986241 = sum of:
          3.1986241 = weight(author_txt:rocha in 3685) [ClassicSimilarity], result of:
            3.1986241 = score(doc=3685,freq=1.0), product of:
              0.6730939 = queryWeight, product of:
                1.1625834 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.06091637 = queryNorm
              4.7521214 = fieldWeight in 3685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.5 = fieldNorm(doc=3685)
        0.33333334 = coord(1/3)
    
  5. Rocha, R.; Cobo, A.: Automatización de procesos de categorización jerárquica documental en las organizaciones (2010) 1.07
    1.0662081 = sum of:
      1.0662081 = product of:
        3.1986241 = sum of:
          3.1986241 = weight(author_txt:rocha in 838) [ClassicSimilarity], result of:
            3.1986241 = score(doc=838,freq=1.0), product of:
              0.6730939 = queryWeight, product of:
                1.1625834 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.06091637 = queryNorm
              4.7521214 = fieldWeight in 838, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.5 = fieldNorm(doc=838)
        0.33333334 = coord(1/3)
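
The score trees above can be checked by hand. The following is a minimal sketch, assuming the trees follow Lucene's ClassicSimilarity (TF-IDF) scoring, as the [ClassicSimilarity] tag suggests. The function names `term_score` and `doc_score` are illustrative and not part of any library; the numeric inputs are copied from the first entry (doc=4524). A boost of 1.0 is assumed wherever the tree lists none.

```python
import math

def term_score(freq, idf, query_norm, field_norm, boost=1.0):
    """Score of one matching term: queryWeight * fieldWeight."""
    # queryWeight = boost * idf * queryNorm (boost assumed 1.0 if not listed)
    query_weight = boost * idf * query_norm
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(termFreq)
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

def doc_score(term_scores, matched, total):
    """Sum of term scores scaled by coord = matched / total query terms."""
    return sum(term_scores) * matched / total

QUERY_NORM = 0.06091637  # queryNorm shared by all author_txt terms above

# Values taken from entry 1 (doc=4524).
almeida = term_score(1.0, 8.175107, QUERY_NORM, 0.375)
goncalves = term_score(1.0, 8.565973, QUERY_NORM, 0.375, boost=1.0478117)
total = doc_score([almeida, goncalves], matched=2, total=3)

print(almeida, goncalves, total)
```

The three printed values should agree with the listed 1.5266947, 1.7563133 and 2.188672 up to single-precision rounding, since the trees report 32-bit floats while Python computes in double precision.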
    

Similar documents (content)

  1. Ren, P.; Chen, Z.; Ma, J.; Zhang, Z.; Si, L.; Wang, S.: Detecting temporal patterns of user queries (2017) 0.24
    0.24386002 = sum of:
      0.24386002 = product of:
        0.87092865 = sum of:
          0.025009692 = weight(abstract_txt:text in 4315) [ClassicSimilarity], result of:
            0.025009692 = score(doc=4315,freq=1.0), product of:
              0.07922133 = queryWeight, product of:
                1.1060276 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01772556 = queryNorm
              0.3156939 = fieldWeight in 4315, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=4315)
          0.04687899 = weight(abstract_txt:time in 4315) [ClassicSimilarity], result of:
            0.04687899 = score(doc=4315,freq=3.0), product of:
              0.08350548 = queryWeight, product of:
                1.1355399 = boost
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.01772556 = queryNorm
              0.5613882 = fieldWeight in 4315, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.078125 = fieldNorm(doc=4315)
          0.028947316 = weight(abstract_txt:over in 4315) [ClassicSimilarity], result of:
            0.028947316 = score(doc=4315,freq=1.0), product of:
              0.08733241 = queryWeight, product of:
                1.1612685 = boost
                4.242705 = idf(docFreq=1734, maxDocs=44421)
                0.01772556 = queryNorm
              0.3314613 = fieldWeight in 4315, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.242705 = idf(docFreq=1734, maxDocs=44421)
                0.078125 = fieldNorm(doc=4315)
          0.05275365 = weight(abstract_txt:sets in 4315) [ClassicSimilarity], result of:
            0.05275365 = score(doc=4315,freq=1.0), product of:
              0.13029815 = queryWeight, product of:
                1.41845 = boost
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.01772556 = queryNorm
              0.40486875 = fieldWeight in 4315, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.078125 = fieldNorm(doc=4315)
          0.062654935 = weight(abstract_txt:classification in 4315) [ClassicSimilarity], result of:
            0.062654935 = score(doc=4315,freq=3.0), product of:
              0.11598366 = queryWeight, product of:
                1.6390377 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.01772556 = queryNorm
              0.5402049 = fieldWeight in 4315, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=4315)
          0.0349978 = weight(abstract_txt:data in 4315) [ClassicSimilarity], result of:
            0.0349978 = score(doc=4315,freq=1.0), product of:
              0.13451697 = queryWeight, product of:
                2.2787855 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.01772556 = queryNorm
              0.26017386 = fieldWeight in 4315, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.078125 = fieldNorm(doc=4315)
          0.6196863 = weight(abstract_txt:temporal in 4315) [ClassicSimilarity], result of:
            0.6196863 = score(doc=4315,freq=4.0), product of:
              0.5756756 = queryWeight, product of:
                4.7141547 = boost
                6.889283 = idf(docFreq=122, maxDocs=44421)
                0.01772556 = queryNorm
              1.0764505 = fieldWeight in 4315, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.889283 = idf(docFreq=122, maxDocs=44421)
                0.078125 = fieldNorm(doc=4315)
        0.28 = coord(7/25)
    
  2. Tudhope, D.; Taylor, C.: Navigation via similarity (1997) 0.19
    0.18870938 = sum of:
      0.18870938 = product of:
        0.67396206 = sum of:
          0.02646352 = weight(abstract_txt:each in 1155) [ClassicSimilarity], result of:
            0.02646352 = score(doc=1155,freq=1.0), product of:
              0.08226245 = queryWeight, product of:
                1.1270566 = boost
                4.1177115 = idf(docFreq=1965, maxDocs=44421)
                0.01772556 = queryNorm
              0.32169622 = fieldWeight in 1155, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1177115 = idf(docFreq=1965, maxDocs=44421)
                0.078125 = fieldNorm(doc=1155)
          0.038276535 = weight(abstract_txt:time in 1155) [ClassicSimilarity], result of:
            0.038276535 = score(doc=1155,freq=2.0), product of:
              0.08350548 = queryWeight, product of:
                1.1355399 = boost
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.01772556 = queryNorm
              0.45837152 = fieldWeight in 1155, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.078125 = fieldNorm(doc=1155)
          0.028947316 = weight(abstract_txt:over in 1155) [ClassicSimilarity], result of:
            0.028947316 = score(doc=1155,freq=1.0), product of:
              0.08733241 = queryWeight, product of:
                1.1612685 = boost
                4.242705 = idf(docFreq=1734, maxDocs=44421)
                0.01772556 = queryNorm
              0.3314613 = fieldWeight in 1155, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.242705 = idf(docFreq=1734, maxDocs=44421)
                0.078125 = fieldNorm(doc=1155)
          0.05275365 = weight(abstract_txt:sets in 1155) [ClassicSimilarity], result of:
            0.05275365 = score(doc=1155,freq=1.0), product of:
              0.13029815 = queryWeight, product of:
                1.41845 = boost
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.01772556 = queryNorm
              0.40486875 = fieldWeight in 1155, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.078125 = fieldNorm(doc=1155)
          0.0531628 = weight(abstract_txt:automatic in 1155) [ClassicSimilarity], result of:
            0.0531628 = score(doc=1155,freq=1.0), product of:
              0.130971 = queryWeight, product of:
                1.4221077 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.01772556 = queryNorm
              0.40591276 = fieldWeight in 1155, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.078125 = fieldNorm(doc=1155)
          0.036173847 = weight(abstract_txt:classification in 1155) [ClassicSimilarity], result of:
            0.036173847 = score(doc=1155,freq=1.0), product of:
              0.11598366 = queryWeight, product of:
                1.6390377 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.01772556 = queryNorm
              0.31188744 = fieldWeight in 1155, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=1155)
          0.4381844 = weight(abstract_txt:temporal in 1155) [ClassicSimilarity], result of:
            0.4381844 = score(doc=1155,freq=2.0), product of:
              0.5756756 = queryWeight, product of:
                4.7141547 = boost
                6.889283 = idf(docFreq=122, maxDocs=44421)
                0.01772556 = queryNorm
              0.7611655 = fieldWeight in 1155, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.889283 = idf(docFreq=122, maxDocs=44421)
                0.078125 = fieldNorm(doc=1155)
        0.28 = coord(7/25)
    
  3. Fairthorne, R.A.: Temporal structure in bibliographic classification (1985) 0.15
    0.15130539 = sum of:
      0.15130539 = product of:
        0.47282934 = sum of:
          0.01323176 = weight(abstract_txt:each in 4651) [ClassicSimilarity], result of:
            0.01323176 = score(doc=4651,freq=1.0), product of:
              0.08226245 = queryWeight, product of:
                1.1270566 = boost
                4.1177115 = idf(docFreq=1965, maxDocs=44421)
                0.01772556 = queryNorm
              0.16084811 = fieldWeight in 4651, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1177115 = idf(docFreq=1965, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.030260257 = weight(abstract_txt:time in 4651) [ClassicSimilarity], result of:
            0.030260257 = score(doc=4651,freq=5.0), product of:
              0.08350548 = queryWeight, product of:
                1.1355399 = boost
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.01772556 = queryNorm
              0.3623745 = fieldWeight in 4651, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.014473658 = weight(abstract_txt:over in 4651) [ClassicSimilarity], result of:
            0.014473658 = score(doc=4651,freq=1.0), product of:
              0.08733241 = queryWeight, product of:
                1.1612685 = boost
                4.242705 = idf(docFreq=1734, maxDocs=44421)
                0.01772556 = queryNorm
              0.16573066 = fieldWeight in 4651, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.242705 = idf(docFreq=1734, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.05275365 = weight(abstract_txt:sets in 4651) [ClassicSimilarity], result of:
            0.05275365 = score(doc=4651,freq=4.0), product of:
              0.13029815 = queryWeight, product of:
                1.41845 = boost
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.01772556 = queryNorm
              0.40486875 = fieldWeight in 4651, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.054942798 = weight(abstract_txt:class in 4651) [ClassicSimilarity], result of:
            0.054942798 = score(doc=4651,freq=2.0), product of:
              0.16867618 = queryWeight, product of:
                1.6138821 = boost
                5.8963327 = idf(docFreq=331, maxDocs=44421)
                0.01772556 = queryNorm
              0.32572943 = fieldWeight in 4651, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8963327 = idf(docFreq=331, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.040221505 = weight(abstract_txt:textual in 4651) [ClassicSimilarity], result of:
            0.040221505 = score(doc=4651,freq=1.0), product of:
              0.17262173 = queryWeight, product of:
                1.6326482 = boost
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.01772556 = queryNorm
              0.23300372 = fieldWeight in 4651, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.0478535 = weight(abstract_txt:classification in 4651) [ClassicSimilarity], result of:
            0.0478535 = score(doc=4651,freq=7.0), product of:
              0.11598366 = queryWeight, product of:
                1.6390377 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.01772556 = queryNorm
              0.4125883 = fieldWeight in 4651, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.2190922 = weight(abstract_txt:temporal in 4651) [ClassicSimilarity], result of:
            0.2190922 = score(doc=4651,freq=2.0), product of:
              0.5756756 = queryWeight, product of:
                4.7141547 = boost
                6.889283 = idf(docFreq=122, maxDocs=44421)
                0.01772556 = queryNorm
              0.38058275 = fieldWeight in 4651, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.889283 = idf(docFreq=122, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
        0.32 = coord(8/25)
    
  4. Bose, I.; Chen, X.: ¬A method for extension of generative topographic mapping for fuzzy clustering (2009) 0.15
    0.14714997 = sum of:
      0.14714997 = product of:
        0.5255356 = sum of:
          0.02451257 = weight(abstract_txt:provide in 3711) [ClassicSimilarity], result of:
            0.02451257 = score(doc=3711,freq=1.0), product of:
              0.07816803 = queryWeight, product of:
                1.0986503 = boost
                4.013929 = idf(docFreq=2180, maxDocs=44421)
                0.01772556 = queryNorm
              0.3135882 = fieldWeight in 3711, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.013929 = idf(docFreq=2180, maxDocs=44421)
                0.078125 = fieldNorm(doc=3711)
          0.028947316 = weight(abstract_txt:over in 3711) [ClassicSimilarity], result of:
            0.028947316 = score(doc=3711,freq=1.0), product of:
              0.08733241 = queryWeight, product of:
                1.1612685 = boost
                4.242705 = idf(docFreq=1734, maxDocs=44421)
                0.01772556 = queryNorm
              0.3314613 = fieldWeight in 3711, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.242705 = idf(docFreq=1734, maxDocs=44421)
                0.078125 = fieldNorm(doc=3711)
          0.05275365 = weight(abstract_txt:sets in 3711) [ClassicSimilarity], result of:
            0.05275365 = score(doc=3711,freq=1.0), product of:
              0.13029815 = queryWeight, product of:
                1.41845 = boost
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.01772556 = queryNorm
              0.40486875 = fieldWeight in 3711, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.078125 = fieldNorm(doc=3711)
          0.06644616 = weight(abstract_txt:distribution in 3711) [ClassicSimilarity], result of:
            0.06644616 = score(doc=3711,freq=1.0), product of:
              0.1519672 = queryWeight, product of:
                1.5318627 = boost
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.01772556 = queryNorm
              0.43724018 = fieldWeight in 3711, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.078125 = fieldNorm(doc=3711)
          0.084334105 = weight(abstract_txt:observed in 3711) [ClassicSimilarity], result of:
            0.084334105 = score(doc=3711,freq=1.0), product of:
              0.17814435 = queryWeight, product of:
                1.6585591 = boost
                6.059561 = idf(docFreq=281, maxDocs=44421)
                0.01772556 = queryNorm
              0.4734032 = fieldWeight in 3711, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.059561 = idf(docFreq=281, maxDocs=44421)
                0.078125 = fieldNorm(doc=3711)
          0.0699956 = weight(abstract_txt:data in 3711) [ClassicSimilarity], result of:
            0.0699956 = score(doc=3711,freq=4.0), product of:
              0.13451697 = queryWeight, product of:
                2.2787855 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.01772556 = queryNorm
              0.5203477 = fieldWeight in 3711, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.078125 = fieldNorm(doc=3711)
          0.19854614 = weight(abstract_txt:algorithms in 3711) [ClassicSimilarity], result of:
            0.19854614 = score(doc=3711,freq=2.0), product of:
              0.3152663 = queryWeight, product of:
                3.1203167 = boost
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.01772556 = queryNorm
              0.6297728 = fieldWeight in 3711, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.078125 = fieldNorm(doc=3711)
        0.28 = coord(7/25)
    
  5. Stamatatos, E.: Author identification : using text sampling to handle the class imbalance problem (2008) 0.14
    0.13978608 = sum of:
      0.13978608 = product of:
        0.4368315 = sum of:
          0.019610057 = weight(abstract_txt:provide in 3063) [ClassicSimilarity], result of:
            0.019610057 = score(doc=3063,freq=1.0), product of:
              0.07816803 = queryWeight, product of:
                1.0986503 = boost
                4.013929 = idf(docFreq=2180, maxDocs=44421)
                0.01772556 = queryNorm
              0.25087056 = fieldWeight in 3063, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.013929 = idf(docFreq=2180, maxDocs=44421)
                0.0625 = fieldNorm(doc=3063)
          0.04900879 = weight(abstract_txt:text in 3063) [ClassicSimilarity], result of:
            0.04900879 = score(doc=3063,freq=6.0), product of:
              0.07922133 = queryWeight, product of:
                1.1060276 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01772556 = queryNorm
              0.61863124 = fieldWeight in 3063, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=3063)
          0.03275015 = weight(abstract_txt:over in 3063) [ClassicSimilarity], result of:
            0.03275015 = score(doc=3063,freq=2.0), product of:
              0.08733241 = queryWeight, product of:
                1.1612685 = boost
                4.242705 = idf(docFreq=1734, maxDocs=44421)
                0.01772556 = queryNorm
              0.37500566 = fieldWeight in 3063, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.242705 = idf(docFreq=1734, maxDocs=44421)
                0.0625 = fieldNorm(doc=3063)
          0.075175256 = weight(abstract_txt:distribution in 3063) [ClassicSimilarity], result of:
            0.075175256 = score(doc=3063,freq=2.0), product of:
              0.1519672 = queryWeight, product of:
                1.5318627 = boost
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.01772556 = queryNorm
              0.4946808 = fieldWeight in 3063, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.0625 = fieldNorm(doc=3063)
          0.13899551 = weight(abstract_txt:class in 3063) [ClassicSimilarity], result of:
            0.13899551 = score(doc=3063,freq=5.0), product of:
              0.16867618 = queryWeight, product of:
                1.6138821 = boost
                5.8963327 = idf(docFreq=331, maxDocs=44421)
                0.01772556 = queryNorm
              0.82403755 = fieldWeight in 3063, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.8963327 = idf(docFreq=331, maxDocs=44421)
                0.0625 = fieldNorm(doc=3063)
          0.064354405 = weight(abstract_txt:textual in 3063) [ClassicSimilarity], result of:
            0.064354405 = score(doc=3063,freq=1.0), product of:
              0.17262173 = queryWeight, product of:
                1.6326482 = boost
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.01772556 = queryNorm
              0.37280595 = fieldWeight in 3063, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.0625 = fieldNorm(doc=3063)
          0.028939078 = weight(abstract_txt:classification in 3063) [ClassicSimilarity], result of:
            0.028939078 = score(doc=3063,freq=1.0), product of:
              0.11598366 = queryWeight, product of:
                1.6390377 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.01772556 = queryNorm
              0.24950996 = fieldWeight in 3063, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=3063)
          0.02799824 = weight(abstract_txt:data in 3063) [ClassicSimilarity], result of:
            0.02799824 = score(doc=3063,freq=1.0), product of:
              0.13451697 = queryWeight, product of:
                2.2787855 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.01772556 = queryNorm
              0.20813909 = fieldWeight in 3063, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0625 = fieldNorm(doc=3063)
        0.32 = coord(8/25)
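
The idf and tf leaves in these trees can also be reproduced from the raw counts. This is a sketch under the same assumption that ClassicSimilarity is in use, where idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(termFreq). The helper names `classic_idf` and `classic_tf` are illustrative only; the inputs are copied from the "temporal" and "data" terms and from the freq=7.0 leaf in entry 3.

```python
import math

def classic_idf(doc_freq, max_docs):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    """Term-frequency factor: square root of the raw term count."""
    return math.sqrt(freq)

MAX_DOCS = 44421  # maxDocs reported in every idf line above

idf_temporal = classic_idf(122, MAX_DOCS)   # tree lists 6.889283
idf_data = classic_idf(4320, MAX_DOCS)      # tree lists 3.3302255
tf_seven = classic_tf(7.0)                  # tree lists 2.6457512

print(idf_temporal, idf_data, tf_seven)
```

This makes the ranking easier to read: "temporal" occurs in only 122 of 44421 records, so its idf is roughly twice that of "data", which is why it dominates each content-similarity score.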