Document (#33049)

Author
Lim, C.S.
Lee, K.J.
Kim, G.C.
Title
Multiple sets of features for automatic genre classification of web documents
Source
Information processing and management. 41(2005) no.5, p.1263-1276
Year
2005
Abstract
With the growth of information on the Web, it is difficult to quickly find the desired information among the documents retrieved by a search engine. One way to solve this problem is to classify web documents according to various criteria. Most document classification has focused on the subject or topic of a document. Genre, or style, is another view of a document, distinct from its subject or topic, and can also serve as a criterion for classifying documents. In this paper, we suggest multiple sets of features to classify genres of web documents. The basic set of features, which has been proposed in previous studies, is acquired from the textual properties of documents, such as the number of sentences, the number of occurrences of a certain word, etc. However, web documents differ from plain textual documents in that they contain URLs and HTML tags within the pages. We introduce new sets of features specific to web documents, which are extracted from URLs and HTML tags. The present work is an attempt to evaluate the performance of the proposed sets of features and to discuss their characteristics. Finally, we conclude which set of features is appropriate for automatic genre classification of web documents.
Theme
Automatic classification

Similar documents (content)

  1. Finn, A.; Kushmerick, N.: Learning to classify documents according to genre (2006) 0.69
    0.6872336 = sum of:
      0.6872336 = product of:
        1.4317367 = sum of:
          0.02746656 = weight(abstract_txt:different in 10) [ClassicSimilarity], result of:
            0.02746656 = score(doc=10,freq=2.0), product of:
              0.06792816 = queryWeight, product of:
                1.0454905 = boost
                3.6597328 = idf(docFreq=3107, maxDocs=44421)
                0.017753351 = queryNorm
              0.40434718 = fieldWeight in 10, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6597328 = idf(docFreq=3107, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
          0.028027225 = weight(abstract_txt:number in 10) [ClassicSimilarity], result of:
            0.028027225 = score(doc=10,freq=1.0), product of:
              0.08674486 = queryWeight, product of:
                1.1814547 = boost
                4.1356745 = idf(docFreq=1930, maxDocs=44421)
                0.017753351 = queryNorm
              0.32309955 = fieldWeight in 10, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1356745 = idf(docFreq=1930, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
          0.02079322 = weight(abstract_txt:which in 10) [ClassicSimilarity], result of:
            0.02079322 = score(doc=10,freq=2.0), product of:
              0.064589 = queryWeight, product of:
                1.2485907 = boost
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.017753351 = queryNorm
              0.32193127 = fieldWeight in 10, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
          0.07232774 = weight(abstract_txt:topic in 10) [ClassicSimilarity], result of:
            0.07232774 = score(doc=10,freq=2.0), product of:
              0.12953395 = queryWeight, product of:
                1.4437333 = boost
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.017753351 = queryNorm
              0.558369 = fieldWeight in 10, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
          0.07579993 = weight(abstract_txt:multiple in 10) [ClassicSimilarity], result of:
            0.07579993 = score(doc=10,freq=2.0), product of:
              0.1336471 = queryWeight, product of:
                1.466476 = boost
                5.1333895 = idf(docFreq=711, maxDocs=44421)
                0.017753351 = queryNorm
              0.5671648 = fieldWeight in 10, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1333895 = idf(docFreq=711, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
          0.078593045 = weight(abstract_txt:automatic in 10) [ClassicSimilarity], result of:
            0.078593045 = score(doc=10,freq=2.0), product of:
              0.1369104 = queryWeight, product of:
                1.4842716 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.017753351 = queryNorm
              0.5740473 = fieldWeight in 10, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
          0.053477485 = weight(abstract_txt:classification in 10) [ClassicSimilarity], result of:
            0.053477485 = score(doc=10,freq=2.0), product of:
              0.121243395 = queryWeight, product of:
                1.7106843 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017753351 = queryNorm
              0.44107544 = fieldWeight in 10, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
          0.047061305 = weight(abstract_txt:document in 10) [ClassicSimilarity], result of:
            0.047061305 = score(doc=10,freq=1.0), product of:
              0.14028032 = queryWeight, product of:
                1.8400905 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.017753351 = queryNorm
              0.33548045 = fieldWeight in 10, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
          0.11029194 = weight(abstract_txt:sets in 10) [ClassicSimilarity], result of:
            0.11029194 = score(doc=10,freq=1.0), product of:
              0.27241406 = queryWeight, product of:
                2.9609082 = boost
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.017753351 = queryNorm
              0.40486875 = fieldWeight in 10, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
          0.46642497 = weight(abstract_txt:genre in 10) [ClassicSimilarity], result of:
            0.46642497 = score(doc=10,freq=4.0), product of:
              0.4487814 = queryWeight, product of:
                3.8003852 = boost
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.017753351 = queryNorm
              1.0393144 = fieldWeight in 10, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
          0.11127776 = weight(abstract_txt:features in 10) [ClassicSimilarity], result of:
            0.11127776 = score(doc=10,freq=1.0), product of:
              0.31369168 = queryWeight, product of:
                3.8914127 = boost
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.017753351 = queryNorm
              0.3547361 = fieldWeight in 10, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
          0.34019545 = weight(abstract_txt:documents in 10) [ClassicSimilarity], result of:
            0.34019545 = score(doc=10,freq=6.0), product of:
              0.43113726 = queryWeight, product of:
                5.8896294 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.017753351 = queryNorm
              0.7890653 = fieldWeight in 10, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
        0.48 = coord(12/25)
    
  2. Ko, Y.; Park, J.; Seo, J.: Improving text categorization using the importance of sentences (2004) 0.31
    0.30592486 = sum of:
      0.30592486 = product of:
        0.8497913 = sum of:
          0.108769946 = weight(abstract_txt:sentences in 3557) [ClassicSimilarity], result of:
            0.108769946 = score(doc=3557,freq=4.0), product of:
              0.12429098 = queryWeight, product of:
                7.000987 = idf(docFreq=109, maxDocs=44421)
                0.017753351 = queryNorm
              0.8751234 = fieldWeight in 3557, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.000987 = idf(docFreq=109, maxDocs=44421)
                0.0625 = fieldNorm(doc=3557)
          0.015537431 = weight(abstract_txt:different in 3557) [ClassicSimilarity], result of:
            0.015537431 = score(doc=3557,freq=1.0), product of:
              0.06792816 = queryWeight, product of:
                1.0454905 = boost
                3.6597328 = idf(docFreq=3107, maxDocs=44421)
                0.017753351 = queryNorm
              0.2287333 = fieldWeight in 3557, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6597328 = idf(docFreq=3107, maxDocs=44421)
                0.0625 = fieldNorm(doc=3557)
          0.04445894 = weight(abstract_txt:automatic in 3557) [ClassicSimilarity], result of:
            0.04445894 = score(doc=3557,freq=1.0), product of:
              0.1369104 = queryWeight, product of:
                1.4842716 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.017753351 = queryNorm
              0.32473022 = fieldWeight in 3557, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.0625 = fieldNorm(doc=3557)
          0.013320011 = weight(abstract_txt:from in 3557) [ClassicSimilarity], result of:
            0.013320011 = score(doc=3557,freq=1.0), product of:
              0.077234276 = queryWeight, product of:
                1.5765771 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.017753351 = queryNorm
              0.17246243 = fieldWeight in 3557, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0625 = fieldNorm(doc=3557)
          0.07529809 = weight(abstract_txt:document in 3557) [ClassicSimilarity], result of:
            0.07529809 = score(doc=3557,freq=4.0), product of:
              0.14028032 = queryWeight, product of:
                1.8400905 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.017753351 = queryNorm
              0.53676873 = fieldWeight in 3557, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=3557)
          0.13245188 = weight(abstract_txt:classify in 3557) [ClassicSimilarity], result of:
            0.13245188 = score(doc=3557,freq=1.0), product of:
              0.32448864 = queryWeight, product of:
                2.7985983 = boost
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.017753351 = queryNorm
              0.40818647 = fieldWeight in 3557, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.0625 = fieldNorm(doc=3557)
          0.12478109 = weight(abstract_txt:sets in 3557) [ClassicSimilarity], result of:
            0.12478109 = score(doc=3557,freq=2.0), product of:
              0.27241406 = queryWeight, product of:
                2.9609082 = boost
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.017753351 = queryNorm
              0.45805672 = fieldWeight in 3557, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.0625 = fieldNorm(doc=3557)
          0.17804441 = weight(abstract_txt:features in 3557) [ClassicSimilarity], result of:
            0.17804441 = score(doc=3557,freq=4.0), product of:
              0.31369168 = queryWeight, product of:
                3.8914127 = boost
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.017753351 = queryNorm
              0.5675777 = fieldWeight in 3557, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.0625 = fieldNorm(doc=3557)
          0.15712954 = weight(abstract_txt:documents in 3557) [ClassicSimilarity], result of:
            0.15712954 = score(doc=3557,freq=2.0), product of:
              0.43113726 = queryWeight, product of:
                5.8896294 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.017753351 = queryNorm
              0.3644536 = fieldWeight in 3557, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=3557)
        0.36 = coord(9/25)
    
  3. Fairthorne, R.A.: Temporal structure in bibliographic classification (1985) 0.30
    0.2953016 = sum of:
      0.2953016 = product of:
        0.67113996 = sum of:
          0.016819762 = weight(abstract_txt:different in 4651) [ClassicSimilarity], result of:
            0.016819762 = score(doc=4651,freq=3.0), product of:
              0.06792816 = queryWeight, product of:
                1.0454905 = boost
                3.6597328 = idf(docFreq=3107, maxDocs=44421)
                0.017753351 = queryNorm
              0.24761105 = fieldWeight in 4651, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6597328 = idf(docFreq=3107, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.020510849 = weight(abstract_txt:subject in 4651) [ClassicSimilarity], result of:
            0.020510849 = score(doc=4651,freq=3.0), product of:
              0.07753403 = queryWeight, product of:
                1.1169696 = boost
                3.9099448 = idf(docFreq=2419, maxDocs=44421)
                0.017753351 = queryNorm
              0.26453996 = fieldWeight in 4651, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9099448 = idf(docFreq=2419, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.014013613 = weight(abstract_txt:number in 4651) [ClassicSimilarity], result of:
            0.014013613 = score(doc=4651,freq=1.0), product of:
              0.08674486 = queryWeight, product of:
                1.1814547 = boost
                4.1356745 = idf(docFreq=1930, maxDocs=44421)
                0.017753351 = queryNorm
              0.16154978 = fieldWeight in 4651, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1356745 = idf(docFreq=1930, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.018007457 = weight(abstract_txt:which in 4651) [ClassicSimilarity], result of:
            0.018007457 = score(doc=4651,freq=6.0), product of:
              0.064589 = queryWeight, product of:
                1.2485907 = boost
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.017753351 = queryNorm
              0.27880067 = fieldWeight in 4651, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.014419334 = weight(abstract_txt:from in 4651) [ClassicSimilarity], result of:
            0.014419334 = score(doc=4651,freq=3.0), product of:
              0.077234276 = queryWeight, product of:
                1.5765771 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.017753351 = queryNorm
              0.18669605 = fieldWeight in 4651, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.042045508 = weight(abstract_txt:textual in 4651) [ClassicSimilarity], result of:
            0.042045508 = score(doc=4651,freq=1.0), product of:
              0.18044993 = queryWeight, product of:
                1.7040155 = boost
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.017753351 = queryNorm
              0.23300372 = fieldWeight in 4651, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.050023604 = weight(abstract_txt:classification in 4651) [ClassicSimilarity], result of:
            0.050023604 = score(doc=4651,freq=7.0), product of:
              0.121243395 = queryWeight, product of:
                1.7106843 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017753351 = queryNorm
              0.4125883 = fieldWeight in 4651, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.03327737 = weight(abstract_txt:document in 4651) [ClassicSimilarity], result of:
            0.03327737 = score(doc=4651,freq=2.0), product of:
              0.14028032 = queryWeight, product of:
                1.8400905 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.017753351 = queryNorm
              0.23722051 = fieldWeight in 4651, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.082782425 = weight(abstract_txt:classify in 4651) [ClassicSimilarity], result of:
            0.082782425 = score(doc=4651,freq=1.0), product of:
              0.32448864 = queryWeight, product of:
                2.7985983 = boost
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.017753351 = queryNorm
              0.25511655 = fieldWeight in 4651, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.11029194 = weight(abstract_txt:sets in 4651) [ClassicSimilarity], result of:
            0.11029194 = score(doc=4651,freq=4.0), product of:
              0.27241406 = queryWeight, product of:
                2.9609082 = boost
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.017753351 = queryNorm
              0.40486875 = fieldWeight in 4651, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
          0.2689481 = weight(abstract_txt:documents in 4651) [ClassicSimilarity], result of:
            0.2689481 = score(doc=4651,freq=15.0), product of:
              0.43113726 = queryWeight, product of:
                5.8896294 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.017753351 = queryNorm
              0.6238108 = fieldWeight in 4651, product of:
                3.8729835 = tf(freq=15.0), with freq of:
                  15.0 = termFreq=15.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0390625 = fieldNorm(doc=4651)
        0.44 = coord(11/25)
    
  4. Liu, Y.; Xu, S.; Blanchard, E.: A local context-aware LDA model for topic modeling in a document network (2017) 0.28
    0.2779151 = sum of:
      0.2779151 = product of:
        0.69478774 = sum of:
          0.014967693 = weight(abstract_txt:been in 4642) [ClassicSimilarity], result of:
            0.014967693 = score(doc=4642,freq=1.0), product of:
              0.066257276 = queryWeight, product of:
                1.0325521 = boost
                3.614442 = idf(docFreq=3251, maxDocs=44421)
                0.017753351 = queryNorm
              0.22590263 = fieldWeight in 4642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.614442 = idf(docFreq=3251, maxDocs=44421)
                0.0625 = fieldNorm(doc=4642)
          0.011762422 = weight(abstract_txt:which in 4642) [ClassicSimilarity], result of:
            0.011762422 = score(doc=4642,freq=1.0), product of:
              0.064589 = queryWeight, product of:
                1.2485907 = boost
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.017753351 = queryNorm
              0.18211183 = fieldWeight in 4642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.0625 = fieldNorm(doc=4642)
          0.043863308 = weight(abstract_txt:proposed in 4642) [ClassicSimilarity], result of:
            0.043863308 = score(doc=4642,freq=2.0), product of:
              0.10769311 = queryWeight, product of:
                1.3164039 = boost
                4.608063 = idf(docFreq=1203, maxDocs=44421)
                0.017753351 = queryNorm
              0.4072991 = fieldWeight in 4642, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.608063 = idf(docFreq=1203, maxDocs=44421)
                0.0625 = fieldNorm(doc=4642)
          0.07086642 = weight(abstract_txt:topic in 4642) [ClassicSimilarity], result of:
            0.07086642 = score(doc=4642,freq=3.0), product of:
              0.12953395 = queryWeight, product of:
                1.4437333 = boost
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.017753351 = queryNorm
              0.5470876 = fieldWeight in 4642, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.0625 = fieldNorm(doc=4642)
          0.042878915 = weight(abstract_txt:multiple in 4642) [ClassicSimilarity], result of:
            0.042878915 = score(doc=4642,freq=1.0), product of:
              0.1336471 = queryWeight, product of:
                1.466476 = boost
                5.1333895 = idf(docFreq=711, maxDocs=44421)
                0.017753351 = queryNorm
              0.32083684 = fieldWeight in 4642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1333895 = idf(docFreq=711, maxDocs=44421)
                0.0625 = fieldNorm(doc=4642)
          0.013320011 = weight(abstract_txt:from in 4642) [ClassicSimilarity], result of:
            0.013320011 = score(doc=4642,freq=1.0), product of:
              0.077234276 = queryWeight, product of:
                1.5765771 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.017753351 = queryNorm
              0.17246243 = fieldWeight in 4642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0625 = fieldNorm(doc=4642)
          0.030251434 = weight(abstract_txt:classification in 4642) [ClassicSimilarity], result of:
            0.030251434 = score(doc=4642,freq=1.0), product of:
              0.121243395 = queryWeight, product of:
                1.7106843 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017753351 = queryNorm
              0.24950996 = fieldWeight in 4642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=4642)
          0.10648758 = weight(abstract_txt:document in 4642) [ClassicSimilarity], result of:
            0.10648758 = score(doc=4642,freq=8.0), product of:
              0.14028032 = queryWeight, product of:
                1.8400905 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.017753351 = queryNorm
              0.7591056 = fieldWeight in 4642, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=4642)
          0.08823355 = weight(abstract_txt:sets in 4642) [ClassicSimilarity], result of:
            0.08823355 = score(doc=4642,freq=1.0), product of:
              0.27241406 = queryWeight, product of:
                2.9609082 = boost
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.017753351 = queryNorm
              0.323895 = fieldWeight in 4642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.18232 = idf(docFreq=677, maxDocs=44421)
                0.0625 = fieldNorm(doc=4642)
          0.27215636 = weight(abstract_txt:documents in 4642) [ClassicSimilarity], result of:
            0.27215636 = score(doc=4642,freq=6.0), product of:
              0.43113726 = queryWeight, product of:
                5.8896294 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.017753351 = queryNorm
              0.6312522 = fieldWeight in 4642, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=4642)
        0.4 = coord(10/25)
    
  5. Ringlstetter, C.; Stubbe, A.: Practical aspects of automatic genre classification (2008) 0.26
    0.2569299 = sum of:
      0.2569299 = product of:
        0.91760683 = sum of:
          0.040914748 = weight(abstract_txt:topic in 2954) [ClassicSimilarity], result of:
            0.040914748 = score(doc=2954,freq=1.0), product of:
              0.12953395 = queryWeight, product of:
                1.4437333 = boost
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.017753351 = queryNorm
              0.3158612 = fieldWeight in 2954, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.0625 = fieldNorm(doc=2954)
          0.06287444 = weight(abstract_txt:automatic in 2954) [ClassicSimilarity], result of:
            0.06287444 = score(doc=2954,freq=2.0), product of:
              0.1369104 = queryWeight, product of:
                1.4842716 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.017753351 = queryNorm
              0.45923787 = fieldWeight in 2954, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.0625 = fieldNorm(doc=2954)
          0.01883734 = weight(abstract_txt:from in 2954) [ClassicSimilarity], result of:
            0.01883734 = score(doc=2954,freq=2.0), product of:
              0.077234276 = queryWeight, product of:
                1.5765771 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.017753351 = queryNorm
              0.2438987 = fieldWeight in 2954, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0625 = fieldNorm(doc=2954)
          0.05239702 = weight(abstract_txt:classification in 2954) [ClassicSimilarity], result of:
            0.05239702 = score(doc=2954,freq=3.0), product of:
              0.121243395 = queryWeight, product of:
                1.7106843 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017753351 = queryNorm
              0.43216392 = fieldWeight in 2954, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=2954)
          0.05324379 = weight(abstract_txt:document in 2954) [ClassicSimilarity], result of:
            0.05324379 = score(doc=2954,freq=2.0), product of:
              0.14028032 = queryWeight, product of:
                1.8400905 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.017753351 = queryNorm
              0.3795528 = fieldWeight in 2954, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=2954)
          0.41718316 = weight(abstract_txt:genre in 2954) [ClassicSimilarity], result of:
            0.41718316 = score(doc=2954,freq=5.0), product of:
              0.4487814 = queryWeight, product of:
                3.8003852 = boost
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.017753351 = queryNorm
              0.929591 = fieldWeight in 2954, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.0625 = fieldNorm(doc=2954)
          0.27215636 = weight(abstract_txt:documents in 2954) [ClassicSimilarity], result of:
            0.27215636 = score(doc=2954,freq=6.0), product of:
              0.43113726 = queryWeight, product of:
                5.8896294 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.017753351 = queryNorm
              0.6312522 = fieldWeight in 2954, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=2954)
        0.28 = coord(7/25)
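
The score trees above all follow the same arithmetic, which can be sketched in a few lines of Python. This is a minimal reconstruction under stated assumptions, not the catalog's actual code: the function names (`idf`, `term_score`, `document_score`) are hypothetical, and the idf formula `1 + ln(maxDocs / (docFreq + 1))` is assumed because it reproduces the idf values printed in the listing. The inputs (boost, freq, docFreq, fieldNorm, queryNorm, and the 25 query terms in `coord`) are copied from the explanation for the first similar document (doc 10).

```python
import math

MAX_DOCS = 44421          # maxDocs as printed in every idf(...) line
QUERY_NORM = 0.017753351  # queryNorm as printed in the listing


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed formula; it matches e.g. idf(docFreq=3107) = 3.6597328 above.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm           # "queryWeight" line
    field_weight = math.sqrt(freq) * term_idf * field_norm  # "fieldWeight" line
    return query_weight * field_weight


# (term, boost, freq, docFreq) copied from the explanation for doc 10
DOC_10_TERMS = [
    ("different", 1.0454905, 2.0, 3107),
    ("number", 1.1814547, 1.0, 1930),
    ("which", 1.2485907, 2.0, 6552),
    ("topic", 1.4437333, 2.0, 770),
    ("multiple", 1.466476, 2.0, 711),
    ("automatic", 1.4842716, 2.0, 668),
    ("classification", 1.7106843, 2.0, 2228),
    ("document", 1.8400905, 1.0, 1647),
    ("sets", 2.9609082, 1.0, 677),
    ("genre", 3.8003852, 4.0, 155),
    ("features", 3.8914127, 1.0, 1287),
    ("documents", 5.8896294, 6.0, 1954),
]


def document_score(terms, field_norm, query_terms=25):
    # Sum the per-term scores, then scale by coord = matched terms / query terms.
    total = sum(term_score(b, f, df, field_norm) for _, b, f, df in terms)
    coord = len(terms) / query_terms
    return coord * total


if __name__ == "__main__":
    print(document_score(DOC_10_TERMS, field_norm=0.078125))
```

Under these assumptions the result should land close to the 0.6872336 reported for the first entry. The sketch uses double precision, so small differences in the last digits from the single-precision values in the listing are expected.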