Document (#34114)

Author
Hu, G.
Zhou, S.
Guan, J.
Hu, X.
Title
Towards effective document clustering : a constrained K-means based approach
Source
Information processing and management. 44(2008) no.4, p.1397-1409
Year
2008
Abstract
Document clustering is an important tool for document collection organization and browsing. In real applications, some limited knowledge about the cluster membership of a small number of documents is often available, such as pairs of documents known to belong to the same cluster. This kind of prior knowledge can serve as constraints for the clustering process. We integrate the constraints into the trace formulation of the sum-of-squared-Euclidean-distances criterion function of K-means. The combined criterion function is then transformed into a trace maximization problem, which is optimized by eigen-decomposition. Our experimental evaluation shows that the proposed semi-supervised clustering method achieves better performance than three existing methods.
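The trace formulation mentioned in the abstract can be written out as follows. This is a sketch of the standard K-means identity only; the notation (X, Y, W, λ) is assumed here, and the constrained form in the last line is one plausible way to add must-link pairs, not necessarily the exact criterion used in the paper.

```latex
% Assumed notation: X = [x_1, ..., x_n] is the d x n term-document matrix,
% C_1, ..., C_k are the clusters with sizes n_j and centroids m_j, and
% Y is the n x k normalized indicator matrix:
%   Y_{ij} = 1/\sqrt{n_j} if x_i \in C_j, and 0 otherwise, so Y^\top Y = I_k.
\begin{align}
J &= \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - m_j \rVert^2
   = \operatorname{tr}(X^\top X) - \operatorname{tr}(Y^\top X^\top X Y) \\
\min_{Y} J &\iff \max_{Y^\top Y = I_k} \operatorname{tr}(Y^\top X^\top X Y) \\
% Hypothetical constrained variant: W_{ij} = 1 for a must-link pair (i, j),
% 0 otherwise, with a trade-off weight \lambda > 0.
&\max_{Y^\top Y = I_k} \operatorname{tr}\bigl(Y^\top (X^\top X + \lambda W)\, Y\bigr)
\end{align}
```

If Y is relaxed to any matrix with orthonormal columns, the maximizer of the last expression is spanned by the eigenvectors of X^T X + λW belonging to its k largest eigenvalues, which is where eigen-decomposition enters.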
Theme
Automatic classification

Similar documents (author)

  1. Bell, D.A.; Guan, J.W.: Computational methods for rough classification and discovery (1998) 2.06
    2.0595155 = sum of:
      2.0595155 = product of:
        4.119031 = sum of:
          4.119031 = weight(author_txt:guan in 3909) [ClassicSimilarity], result of:
            4.119031 = score(doc=3909,freq=1.0), product of:
              0.8444481 = queryWeight, product of:
                1.2555994 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.06893977 = queryNorm
              4.8777785 = fieldWeight in 3909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.5 = fieldNorm(doc=3909)
        0.5 = coord(1/2)
    
  2. Cowie, J.; Guan, Z.: CRL English routing system for TREC-5 (1997) 2.06
    2.0595155 = sum of:
      2.0595155 = product of:
        4.119031 = sum of:
          4.119031 = weight(author_txt:guan in 4106) [ClassicSimilarity], result of:
            4.119031 = score(doc=4106,freq=1.0), product of:
              0.8444481 = queryWeight, product of:
                1.2555994 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.06893977 = queryNorm
              4.8777785 = fieldWeight in 4106, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.5 = fieldNorm(doc=4106)
        0.5 = coord(1/2)
    
  3. Wang, J.; Guan, J.: The analysis and evaluation of knowledge efficiency in research groups (2005) 2.06
    2.0595155 = sum of:
      2.0595155 = product of:
        4.119031 = sum of:
          4.119031 = weight(author_txt:guan in 5238) [ClassicSimilarity], result of:
            4.119031 = score(doc=5238,freq=1.0), product of:
              0.8444481 = queryWeight, product of:
                1.2555994 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.06893977 = queryNorm
              4.8777785 = fieldWeight in 5238, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.5 = fieldNorm(doc=5238)
        0.5 = coord(1/2)
    
  4. Guan, J.C.; Gao, X.: Exploring the h-index at patent level (2009) 2.06
    2.0595155 = sum of:
      2.0595155 = product of:
        4.119031 = sum of:
          4.119031 = weight(author_txt:guan in 3696) [ClassicSimilarity], result of:
            4.119031 = score(doc=3696,freq=1.0), product of:
              0.8444481 = queryWeight, product of:
                1.2555994 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.06893977 = queryNorm
              4.8777785 = fieldWeight in 3696, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.5 = fieldNorm(doc=3696)
        0.5 = coord(1/2)
    
  5. Ma, N.; Guan, J.; Zhao, Y.: Bringing PageRank to the citation analysis (2008) 1.54
    1.5446365 = sum of:
      1.5446365 = product of:
        3.089273 = sum of:
          3.089273 = weight(author_txt:guan in 3064) [ClassicSimilarity], result of:
            3.089273 = score(doc=3064,freq=1.0), product of:
              0.8444481 = queryWeight, product of:
                1.2555994 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.06893977 = queryNorm
              3.6583338 = fieldWeight in 3064, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.375 = fieldNorm(doc=3064)
        0.5 = coord(1/2)
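
The score breakdowns above are Lucene ClassicSimilarity (TF-IDF) explanations. As a minimal sketch, the following Python reproduces the arithmetic for entry 1 (doc 3909). The function names are illustrative and are not Lucene API; the formulas used, idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq), agree with the values listed.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as used by ClassicSimilarity.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Term frequency factor: square root of the raw count.
    return math.sqrt(freq)

def term_score(boost, query_norm, freq, doc_freq, max_docs, field_norm):
    # queryWeight = boost * idf * queryNorm
    # fieldWeight = tf * idf * fieldNorm
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the breakdown for author_txt:guan in doc 3909.
weight = term_score(boost=1.2555994, query_norm=0.06893977, freq=1.0,
                    doc_freq=6, max_docs=44421, field_norm=0.5)
score = weight * (1 / 2)  # coord(1/2): one of two query terms matched
print(weight, score)
```

Up to floating-point rounding, the two printed values should match the listed 4.119031 and 2.0595155. Entry 5 differs only in its fieldNorm (0.375 instead of 0.5), which accounts for its lower score.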
    

Similar documents (content)

  1. AlQenaei, Z.M.; Monarchi, D.E.: The use of learning techniques to analyze the results of a manual classification system (2016) 0.22
    0.22356695 = sum of:
      0.22356695 = product of:
        0.6986467 = sum of:
          0.053505406 = weight(abstract_txt:pairs in 3836) [ClassicSimilarity], result of:
            0.053505406 = score(doc=3836,freq=1.0), product of:
              0.12610254 = queryWeight, product of:
                6.7888126 = idf(docFreq=135, maxDocs=44421)
                0.01857505 = queryNorm
              0.4243008 = fieldWeight in 3836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7888126 = idf(docFreq=135, maxDocs=44421)
                0.0625 = fieldNorm(doc=3836)
          0.07120617 = weight(abstract_txt:supervised in 3836) [ClassicSimilarity], result of:
            0.07120617 = score(doc=3836,freq=1.0), product of:
              0.15257046 = queryWeight, product of:
                1.0999509 = boost
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.01857505 = queryNorm
              0.46671006 = fieldWeight in 3836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.0625 = fieldNorm(doc=3836)
          0.0834465 = weight(abstract_txt:belonging in 3836) [ClassicSimilarity], result of:
            0.0834465 = score(doc=3836,freq=1.0), product of:
              0.16958891 = queryWeight, product of:
                1.1596764 = boost
                7.872826 = idf(docFreq=45, maxDocs=44421)
                0.01857505 = queryNorm
              0.49205163 = fieldWeight in 3836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.872826 = idf(docFreq=45, maxDocs=44421)
                0.0625 = fieldNorm(doc=3836)
          0.08486798 = weight(abstract_txt:decomposition in 3836) [ClassicSimilarity], result of:
            0.08486798 = score(doc=3836,freq=1.0), product of:
              0.1715094 = queryWeight, product of:
                1.1662242 = boost
                7.917278 = idf(docFreq=43, maxDocs=44421)
                0.01857505 = queryNorm
              0.49482986 = fieldWeight in 3836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.917278 = idf(docFreq=43, maxDocs=44421)
                0.0625 = fieldNorm(doc=3836)
          0.033908058 = weight(abstract_txt:documents in 3836) [ClassicSimilarity], result of:
            0.033908058 = score(doc=3836,freq=2.0), product of:
              0.09303807 = queryWeight, product of:
                1.2147403 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.01857505 = queryNorm
              0.3644536 = fieldWeight in 3836, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=3836)
          0.16641742 = weight(abstract_txt:euclidean in 3836) [ClassicSimilarity], result of:
            0.16641742 = score(doc=3836,freq=1.0), product of:
              0.26869395 = queryWeight, product of:
                1.4597116 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.01857505 = queryNorm
              0.61935675 = fieldWeight in 3836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.0625 = fieldNorm(doc=3836)
          0.040622726 = weight(abstract_txt:document in 3836) [ClassicSimilarity], result of:
            0.040622726 = score(doc=3836,freq=1.0), product of:
              0.15136026 = queryWeight, product of:
                1.8975999 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.01857505 = queryNorm
              0.26838437 = fieldWeight in 3836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=3836)
          0.16467242 = weight(abstract_txt:clustering in 3836) [ClassicSimilarity], result of:
            0.16467242 = score(doc=3836,freq=1.0), product of:
              0.42353824 = queryWeight, product of:
                3.6653411 = boost
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.01857505 = queryNorm
              0.38880178 = fieldWeight in 3836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.0625 = fieldNorm(doc=3836)
        0.32 = coord(8/25)
    
  2. Dunlavy, D.M.; O'Leary, D.P.; Conroy, J.M.; Schlesinger, J.D.: QCS: A system for querying, clustering and summarizing documents (2007) 0.16
    0.15578891 = sum of:
      0.15578891 = product of:
        0.556389 = sum of:
          0.015123022 = weight(abstract_txt:into in 1947) [ClassicSimilarity], result of:
            0.015123022 = score(doc=1947,freq=1.0), product of:
              0.07479784 = queryWeight, product of:
                1.0891749 = boost
                3.697102 = idf(docFreq=2993, maxDocs=44421)
                0.01857505 = queryNorm
              0.20218527 = fieldWeight in 1947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.697102 = idf(docFreq=2993, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
          0.074259475 = weight(abstract_txt:decomposition in 1947) [ClassicSimilarity], result of:
            0.074259475 = score(doc=1947,freq=1.0), product of:
              0.1715094 = queryWeight, product of:
                1.1662242 = boost
                7.917278 = idf(docFreq=43, maxDocs=44421)
                0.01857505 = queryNorm
              0.43297613 = fieldWeight in 1947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.917278 = idf(docFreq=43, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
          0.029669553 = weight(abstract_txt:documents in 1947) [ClassicSimilarity], result of:
            0.029669553 = score(doc=1947,freq=2.0), product of:
              0.09303807 = queryWeight, product of:
                1.2147403 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.01857505 = queryNorm
              0.31889692 = fieldWeight in 1947, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
          0.03775937 = weight(abstract_txt:means in 1947) [ClassicSimilarity], result of:
            0.03775937 = score(doc=1947,freq=1.0), product of:
              0.13766171 = queryWeight, product of:
                1.4776095 = boost
                5.015607 = idf(docFreq=800, maxDocs=44421)
                0.01857505 = queryNorm
              0.274291 = fieldWeight in 1947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.015607 = idf(docFreq=800, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
          0.05026806 = weight(abstract_txt:document in 1947) [ClassicSimilarity], result of:
            0.05026806 = score(doc=1947,freq=2.0), product of:
              0.15136026 = queryWeight, product of:
                1.8975999 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.01857505 = queryNorm
              0.3321087 = fieldWeight in 1947, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
          0.14553781 = weight(abstract_txt:cluster in 1947) [ClassicSimilarity], result of:
            0.14553781 = score(doc=1947,freq=3.0), product of:
              0.23464258 = queryWeight, product of:
                1.9291078 = boost
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.01857505 = queryNorm
              0.6202532 = fieldWeight in 1947, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
          0.20377171 = weight(abstract_txt:clustering in 1947) [ClassicSimilarity], result of:
            0.20377171 = score(doc=1947,freq=2.0), product of:
              0.42353824 = queryWeight, product of:
                3.6653411 = boost
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.01857505 = queryNorm
              0.48111764 = fieldWeight in 1947, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
        0.28 = coord(7/25)
    
  3. Zamir, O.; Etzioni, O.: Grouper : a dynamic clustering interface to Web search results (1999) 0.15
    0.14859577 = sum of:
      0.14859577 = product of:
        0.7429788 = sum of:
          0.021604316 = weight(abstract_txt:into in 207) [ClassicSimilarity], result of:
            0.021604316 = score(doc=207,freq=1.0), product of:
              0.07479784 = queryWeight, product of:
                1.0891749 = boost
                3.697102 = idf(docFreq=2993, maxDocs=44421)
                0.01857505 = queryNorm
              0.2888361 = fieldWeight in 207, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.697102 = idf(docFreq=2993, maxDocs=44421)
                0.078125 = fieldNorm(doc=207)
          0.029970774 = weight(abstract_txt:documents in 207) [ClassicSimilarity], result of:
            0.029970774 = score(doc=207,freq=1.0), product of:
              0.09303807 = queryWeight, product of:
                1.2147403 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.01857505 = queryNorm
              0.32213452 = fieldWeight in 207, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.078125 = fieldNorm(doc=207)
          0.07181151 = weight(abstract_txt:document in 207) [ClassicSimilarity], result of:
            0.07181151 = score(doc=207,freq=2.0), product of:
              0.15136026 = queryWeight, product of:
                1.8975999 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.01857505 = queryNorm
              0.47444102 = fieldWeight in 207, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.078125 = fieldNorm(doc=207)
          0.20791116 = weight(abstract_txt:cluster in 207) [ClassicSimilarity], result of:
            0.20791116 = score(doc=207,freq=3.0), product of:
              0.23464258 = queryWeight, product of:
                1.9291078 = boost
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.01857505 = queryNorm
              0.88607603 = fieldWeight in 207, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.078125 = fieldNorm(doc=207)
          0.41168106 = weight(abstract_txt:clustering in 207) [ClassicSimilarity], result of:
            0.41168106 = score(doc=207,freq=4.0), product of:
              0.42353824 = queryWeight, product of:
                3.6653411 = boost
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.01857505 = queryNorm
              0.9720045 = fieldWeight in 207, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.078125 = fieldNorm(doc=207)
        0.2 = coord(5/25)
    
  4. Rooney, N.; Patterson, D.; Galushka, M.; Dobrynin, V.; Smirnova, E.: An investigation into the stability of contextual document clustering (2008) 0.14
    0.14423169 = sum of:
      0.14423169 = product of:
        0.6009654 = sum of:
          0.017283453 = weight(abstract_txt:into in 2356) [ClassicSimilarity], result of:
            0.017283453 = score(doc=2356,freq=1.0), product of:
              0.07479784 = queryWeight, product of:
                1.0891749 = boost
                3.697102 = idf(docFreq=2993, maxDocs=44421)
                0.01857505 = queryNorm
              0.23106888 = fieldWeight in 2356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.697102 = idf(docFreq=2993, maxDocs=44421)
                0.0625 = fieldNorm(doc=2356)
          0.033908058 = weight(abstract_txt:documents in 2356) [ClassicSimilarity], result of:
            0.033908058 = score(doc=2356,freq=2.0), product of:
              0.09303807 = queryWeight, product of:
                1.2147403 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.01857505 = queryNorm
              0.3644536 = fieldWeight in 2356, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=2356)
          0.043153565 = weight(abstract_txt:means in 2356) [ClassicSimilarity], result of:
            0.043153565 = score(doc=2356,freq=1.0), product of:
              0.13766171 = queryWeight, product of:
                1.4776095 = boost
                5.015607 = idf(docFreq=800, maxDocs=44421)
                0.01857505 = queryNorm
              0.31347543 = fieldWeight in 2356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.015607 = idf(docFreq=800, maxDocs=44421)
                0.0625 = fieldNorm(doc=2356)
          0.08124545 = weight(abstract_txt:document in 2356) [ClassicSimilarity], result of:
            0.08124545 = score(doc=2356,freq=4.0), product of:
              0.15136026 = queryWeight, product of:
                1.8975999 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.01857505 = queryNorm
              0.53676873 = fieldWeight in 2356, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=2356)
          0.09603006 = weight(abstract_txt:cluster in 2356) [ClassicSimilarity], result of:
            0.09603006 = score(doc=2356,freq=1.0), product of:
              0.23464258 = queryWeight, product of:
                1.9291078 = boost
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.01857505 = queryNorm
              0.409261 = fieldWeight in 2356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.0625 = fieldNorm(doc=2356)
          0.32934484 = weight(abstract_txt:clustering in 2356) [ClassicSimilarity], result of:
            0.32934484 = score(doc=2356,freq=4.0), product of:
              0.42353824 = queryWeight, product of:
                3.6653411 = boost
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.01857505 = queryNorm
              0.77760357 = fieldWeight in 2356, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.0625 = fieldNorm(doc=2356)
        0.24 = coord(6/25)
    
  5. Na, S.-H.; Kang, I.-S.; Lee, J.-H.: Adaptive document clustering based on query-based similarity (2007) 0.14
    0.13942717 = sum of:
      0.13942717 = product of:
        0.6971358 = sum of:
          0.02397662 = weight(abstract_txt:documents in 1920) [ClassicSimilarity], result of:
            0.02397662 = score(doc=1920,freq=1.0), product of:
              0.09303807 = queryWeight, product of:
                1.2147403 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.01857505 = queryNorm
              0.25770763 = fieldWeight in 1920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=1920)
          0.043153565 = weight(abstract_txt:means in 1920) [ClassicSimilarity], result of:
            0.043153565 = score(doc=1920,freq=1.0), product of:
              0.13766171 = queryWeight, product of:
                1.4776095 = boost
                5.015607 = idf(docFreq=800, maxDocs=44421)
                0.01857505 = queryNorm
              0.31347543 = fieldWeight in 1920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.015607 = idf(docFreq=800, maxDocs=44421)
                0.0625 = fieldNorm(doc=1920)
          0.09083518 = weight(abstract_txt:document in 1920) [ClassicSimilarity], result of:
            0.09083518 = score(doc=1920,freq=5.0), product of:
              0.15136026 = queryWeight, product of:
                1.8975999 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.01857505 = queryNorm
              0.6001257 = fieldWeight in 1920, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=1920)
          0.13580701 = weight(abstract_txt:cluster in 1920) [ClassicSimilarity], result of:
            0.13580701 = score(doc=1920,freq=2.0), product of:
              0.23464258 = queryWeight, product of:
                1.9291078 = boost
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.01857505 = queryNorm
              0.57878244 = fieldWeight in 1920, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.0625 = fieldNorm(doc=1920)
          0.4033634 = weight(abstract_txt:clustering in 1920) [ClassicSimilarity], result of:
            0.4033634 = score(doc=1920,freq=6.0), product of:
              0.42353824 = queryWeight, product of:
                3.6653411 = boost
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.01857505 = queryNorm
              0.952366 = fieldWeight in 1920, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.0625 = fieldNorm(doc=1920)
        0.2 = coord(5/25)
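
For the content-based matches, each document's score is the sum of its per-term weights multiplied by coord, the fraction of the 25 query terms that matched. A rough sketch in Python, using the values listed for entry 3 (doc 207); the helper name and the tuple layout are hypothetical.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.01857505
QUERY_TERMS = 25          # denominator of coord(5/25)
FIELD_NORM = 0.078125     # fieldNorm(doc=207)

def term_weight(boost, doc_freq, freq, field_norm):
    # (boost * idf * queryNorm) * (sqrt(freq) * idf * fieldNorm)
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    return (boost * idf * QUERY_NORM) * (math.sqrt(freq) * idf * field_norm)

# (term, boost, docFreq, freq) for the five terms matched in doc 207.
matched = [
    ("into",       1.0891749, 2993, 1.0),
    ("documents",  1.2147403, 1954, 1.0),
    ("document",   1.8975999, 1647, 2.0),
    ("cluster",    1.9291078,  172, 3.0),
    ("clustering", 3.6653411,  239, 4.0),
]

total = sum(term_weight(boost, doc_freq, freq, FIELD_NORM)
            for _, boost, doc_freq, freq in matched)
score = total * len(matched) / QUERY_TERMS
print(total, score)
```

Up to rounding, the results should agree with the listed sum 0.7429788 and final score 0.14859577. The coord factor explains why entry 1, with 8 of 25 terms matched, ranks above documents whose individual term weights are larger.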