Document (#40898)

Author
Stevens, G.
Title
New metadata recipes for old cookbooks : creating and analyzing a digital collection using the HathiTrust Research Center Portal
Source
Code4Lib journal. Issue 37(2017), [http://journal.code4lib.org]
Year
2017
Abstract
The Early American Cookbooks digital project is a case study in analyzing collections as data using HathiTrust and the HathiTrust Research Center (HTRC) Portal. The purposes of the project are to create a freely available, searchable collection of full-text early American cookbooks within the HathiTrust Digital Library, to offer an overview of the scope and contents of the collection, and to analyze trends and patterns in the metadata and the full text of the collection. The digital project has two basic components: a collection of 1450 full-text cookbooks published in the United States between 1800 and 1920 and a website to present a guide to the collection and the results of the analysis. This article will focus on the workflow for analyzing the metadata and the full text of the collection. The workflow will cover: 1) creating a searchable public collection of full-text titles within the HathiTrust Digital Library and uploading it to the HTRC Portal, 2) analyzing and visualizing legacy MARC data for the collection using MarcEdit, OpenRefine and Tableau, and 3) using the text analysis tools in the HTRC Portal to look for trends and patterns in the full text of the collection.
Content
Cf.: http://journal.code4lib.org/articles/12548.
Theme
Metadata

Similar documents (author)

  1. Stevens, N.D.: ¬The flaw of subject access in the library catalog : an opinion (1984) 5.38
    5.3815155 = sum of:
      5.3815155 = weight(author_txt:stevens in 1877) [ClassicSimilarity], result of:
        5.3815155 = fieldWeight in 1877, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.610425 = idf(docFreq=21, maxDocs=44421)
          0.625 = fieldNorm(doc=1877)
    
  2. Stevens, N.D.: Evaluating reference books in theory and practice (1986) 5.38
    5.3815155 = sum of:
      5.3815155 = weight(author_txt:stevens in 4592) [ClassicSimilarity], result of:
        5.3815155 = fieldWeight in 4592, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.610425 = idf(docFreq=21, maxDocs=44421)
          0.625 = fieldNorm(doc=4592)
    
  3. Stevens, N.D.: Public libraries and the Internet / NREN : new challenges, new opportunities (1992) 5.38
    5.3815155 = sum of:
      5.3815155 = weight(author_txt:stevens in 6248) [ClassicSimilarity], result of:
        5.3815155 = fieldWeight in 6248, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.610425 = idf(docFreq=21, maxDocs=44421)
          0.625 = fieldNorm(doc=6248)
    
  4. Stevens, N.D.: First-hand reflections on the realities of reference service (1995) 5.38
    5.3815155 = sum of:
      5.3815155 = weight(author_txt:stevens in 1813) [ClassicSimilarity], result of:
        5.3815155 = fieldWeight in 1813, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.610425 = idf(docFreq=21, maxDocs=44421)
          0.625 = fieldNorm(doc=1813)
    
  5. Stevens, N.D.: ¬The importance of the verb in the reference question (1988) 5.38
    5.3815155 = sum of:
      5.3815155 = weight(author_txt:stevens in 2582) [ClassicSimilarity], result of:
        5.3815155 = fieldWeight in 2582, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.610425 = idf(docFreq=21, maxDocs=44421)
          0.625 = fieldNorm(doc=2582)
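
The author scores above all follow the same arithmetic: tf times idf times fieldNorm. A minimal Python sketch of that calculation is given here. The idf formula, 1 + ln(maxDocs / (docFreq + 1)), is an assumption inferred from the printed values (it reproduces 8.610425 for docFreq=21, maxDocs=44421), and the function names are illustrative, not part of any library.

```python
import math


def classic_tf(freq: float) -> float:
    # tf is the square root of the raw term frequency (tf(freq=1.0) = 1.0 above).
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed formula, chosen because it matches the idf values in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in each explain tree above.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the "author_txt:stevens" entries above.
author_idf = classic_idf(doc_freq=21, max_docs=44421)
author_score = field_weight(freq=1.0, doc_freq=21, max_docs=44421, field_norm=0.625)
print(f"idf={author_idf:.6f} score={author_score:.6f}")
```

Because every author match has the same term frequency, document frequency and fieldNorm, all five entries receive the same score. The listing uses single-precision floats, so the last digits may differ slightly from this double-precision sketch.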
    

Similar documents (content)

  1. Hill, L.L.; Janée, G.; Dolin, R.; Frew, J.; Larsgaard, M.: Collection metadata solutions for digital library applications (1999) 0.18
    0.18371214 = sum of:
      0.18371214 = product of:
        0.7654673 = sum of:
          0.022813939 = weight(abstract_txt:within in 5053) [ClassicSimilarity], result of:
            0.022813939 = score(doc=5053,freq=1.0), product of:
              0.06968598 = queryWeight, product of:
                4.19049 = idf(docFreq=1827, maxDocs=44421)
                0.016629554 = queryNorm
              0.32738203 = fieldWeight in 5053, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.19049 = idf(docFreq=1827, maxDocs=44421)
                0.078125 = fieldNorm(doc=5053)
          0.074486814 = weight(abstract_txt:center in 5053) [ClassicSimilarity], result of:
            0.074486814 = score(doc=5053,freq=1.0), product of:
              0.15336685 = queryWeight, product of:
                1.4835188 = boost
                6.2166705 = idf(docFreq=240, maxDocs=44421)
                0.016629554 = queryNorm
              0.4856774 = fieldWeight in 5053, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2166705 = idf(docFreq=240, maxDocs=44421)
                0.078125 = fieldNorm(doc=5053)
          0.039066605 = weight(abstract_txt:project in 5053) [ClassicSimilarity], result of:
            0.039066605 = score(doc=5053,freq=1.0), product of:
              0.11417722 = queryWeight, product of:
                1.567699 = boost
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.016629554 = queryNorm
              0.34215763 = fieldWeight in 5053, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.078125 = fieldNorm(doc=5053)
          0.108041756 = weight(abstract_txt:metadata in 5053) [ClassicSimilarity], result of:
            0.108041756 = score(doc=5053,freq=4.0), product of:
              0.1417153 = queryWeight, product of:
                1.7465512 = boost
                4.87927 = idf(docFreq=917, maxDocs=44421)
                0.016629554 = queryNorm
              0.76238596 = fieldWeight in 5053, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.87927 = idf(docFreq=917, maxDocs=44421)
                0.078125 = fieldNorm(doc=5053)
          0.10879461 = weight(abstract_txt:digital in 5053) [ClassicSimilarity], result of:
            0.10879461 = score(doc=5053,freq=3.0), product of:
              0.18579032 = queryWeight, product of:
                2.5817182 = boost
                4.3274655 = idf(docFreq=1593, maxDocs=44421)
                0.016629554 = queryNorm
              0.58557737 = fieldWeight in 5053, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.3274655 = idf(docFreq=1593, maxDocs=44421)
                0.078125 = fieldNorm(doc=5053)
          0.4122636 = weight(abstract_txt:collection in 5053) [ClassicSimilarity], result of:
            0.4122636 = score(doc=5053,freq=7.0), product of:
              0.4289624 = queryWeight, product of:
                5.5478144 = boost
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.016629554 = queryNorm
              0.9610716 = fieldWeight in 5053, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.078125 = fieldNorm(doc=5053)
        0.24 = coord(6/25)
    
  2. Lynch, J.D.; Gibson, J.; Han, M.-J.: Analyzing and normalizing type metadata for a large aggregated digital library (2020) 0.18
    0.18044248 = sum of:
      0.18044248 = product of:
        0.7518437 = sum of:
          0.15970176 = weight(abstract_txt:openrefine in 720) [ClassicSimilarity], result of:
            0.15970176 = score(doc=720,freq=1.0), product of:
              0.1792342 = queryWeight, product of:
                1.1340253 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.016629554 = queryNorm
              0.8910228 = fieldWeight in 720, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.09375 = fieldNorm(doc=720)
          0.06629823 = weight(abstract_txt:project in 720) [ClassicSimilarity], result of:
            0.06629823 = score(doc=720,freq=2.0), product of:
              0.11417722 = queryWeight, product of:
                1.567699 = boost
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.016629554 = queryNorm
              0.58066076 = fieldWeight in 720, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.09375 = fieldNorm(doc=720)
          0.03073736 = weight(abstract_txt:using in 720) [ClassicSimilarity], result of:
            0.03073736 = score(doc=720,freq=1.0), product of:
              0.0948445 = queryWeight, product of:
                1.6498648 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.016629554 = queryNorm
              0.32408163 = fieldWeight in 720, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.09375 = fieldNorm(doc=720)
          0.15878831 = weight(abstract_txt:metadata in 720) [ClassicSimilarity], result of:
            0.15878831 = score(doc=720,freq=6.0), product of:
              0.1417153 = queryWeight, product of:
                1.7465512 = boost
                4.87927 = idf(docFreq=917, maxDocs=44421)
                0.016629554 = queryNorm
              1.120474 = fieldWeight in 720, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.87927 = idf(docFreq=917, maxDocs=44421)
                0.09375 = fieldNorm(doc=720)
          0.1065965 = weight(abstract_txt:digital in 720) [ClassicSimilarity], result of:
            0.1065965 = score(doc=720,freq=2.0), product of:
              0.18579032 = queryWeight, product of:
                2.5817182 = boost
                4.3274655 = idf(docFreq=1593, maxDocs=44421)
                0.016629554 = queryNorm
              0.57374626 = fieldWeight in 720, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3274655 = idf(docFreq=1593, maxDocs=44421)
                0.09375 = fieldNorm(doc=720)
          0.22972158 = weight(abstract_txt:analyzing in 720) [ClassicSimilarity], result of:
            0.22972158 = score(doc=720,freq=2.0), product of:
              0.2877568 = queryWeight, product of:
                2.8737905 = boost
                6.021295 = idf(docFreq=292, maxDocs=44421)
                0.016629554 = queryNorm
              0.7983185 = fieldWeight in 720, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.021295 = idf(docFreq=292, maxDocs=44421)
                0.09375 = fieldNorm(doc=720)
        0.24 = coord(6/25)
    
  3. Larson, R.R.; Carson, C.: Information access for a digital library : Cheshire II and the Berkeley environment digital library (1999) 0.18
    0.17578848 = sum of:
      0.17578848 = product of:
        0.732452 = sum of:
          0.04687993 = weight(abstract_txt:project in 685) [ClassicSimilarity], result of:
            0.04687993 = score(doc=685,freq=1.0), product of:
              0.11417722 = queryWeight, product of:
                1.567699 = boost
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.016629554 = queryNorm
              0.41058916 = fieldWeight in 685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.09375 = fieldNorm(doc=685)
          0.04346919 = weight(abstract_txt:using in 685) [ClassicSimilarity], result of:
            0.04346919 = score(doc=685,freq=2.0), product of:
              0.0948445 = queryWeight, product of:
                1.6498648 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.016629554 = queryNorm
              0.45832062 = fieldWeight in 685, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.09375 = fieldNorm(doc=685)
          0.07537512 = weight(abstract_txt:digital in 685) [ClassicSimilarity], result of:
            0.07537512 = score(doc=685,freq=1.0), product of:
              0.18579032 = queryWeight, product of:
                2.5817182 = boost
                4.3274655 = idf(docFreq=1593, maxDocs=44421)
                0.016629554 = queryNorm
              0.4056999 = fieldWeight in 685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3274655 = idf(docFreq=1593, maxDocs=44421)
                0.09375 = fieldNorm(doc=685)
          0.14881419 = weight(abstract_txt:text in 685) [ClassicSimilarity], result of:
            0.14881419 = score(doc=685,freq=3.0), product of:
              0.22679643 = queryWeight, product of:
                3.375044 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.016629554 = queryNorm
              0.6561575 = fieldWeight in 685, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.09375 = fieldNorm(doc=685)
          0.23092836 = weight(abstract_txt:full in 685) [ClassicSimilarity], result of:
            0.23092836 = score(doc=685,freq=3.0), product of:
              0.2887637 = queryWeight, product of:
                3.5258126 = boost
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.016629554 = queryNorm
              0.79971397 = fieldWeight in 685, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.09375 = fieldNorm(doc=685)
          0.1869852 = weight(abstract_txt:collection in 685) [ClassicSimilarity], result of:
            0.1869852 = score(doc=685,freq=1.0), product of:
              0.4289624 = queryWeight, product of:
                5.5478144 = boost
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.016629554 = queryNorm
              0.4359011 = fieldWeight in 685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.09375 = fieldNorm(doc=685)
        0.24 = coord(6/25)
    
  4. Agnew, G.; Kniesner, D.; Weber, M.B.: Integrating MPEG-7 into the moving image collections portal (2007) 0.18
    0.17567718 = sum of:
      0.17567718 = product of:
        0.6274185 = sum of:
          0.031611916 = weight(abstract_txt:within in 1478) [ClassicSimilarity], result of:
            0.031611916 = score(doc=1478,freq=3.0), product of:
              0.06968598 = queryWeight, product of:
                4.19049 = idf(docFreq=1827, maxDocs=44421)
                0.016629554 = queryNorm
              0.45363382 = fieldWeight in 1478, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.19049 = idf(docFreq=1827, maxDocs=44421)
                0.0625 = fieldNorm(doc=1478)
          0.04419882 = weight(abstract_txt:project in 1478) [ClassicSimilarity], result of:
            0.04419882 = score(doc=1478,freq=2.0), product of:
              0.11417722 = queryWeight, product of:
                1.567699 = boost
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.016629554 = queryNorm
              0.38710716 = fieldWeight in 1478, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.0625 = fieldNorm(doc=1478)
          0.061117645 = weight(abstract_txt:metadata in 1478) [ClassicSimilarity], result of:
            0.061117645 = score(doc=1478,freq=2.0), product of:
              0.1417153 = queryWeight, product of:
                1.7465512 = boost
                4.87927 = idf(docFreq=917, maxDocs=44421)
                0.016629554 = queryNorm
              0.4312706 = fieldWeight in 1478, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.87927 = idf(docFreq=917, maxDocs=44421)
                0.0625 = fieldNorm(doc=1478)
          0.07106434 = weight(abstract_txt:digital in 1478) [ClassicSimilarity], result of:
            0.07106434 = score(doc=1478,freq=2.0), product of:
              0.18579032 = queryWeight, product of:
                2.5817182 = boost
                4.3274655 = idf(docFreq=1593, maxDocs=44421)
                0.016629554 = queryNorm
              0.38249752 = fieldWeight in 1478, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3274655 = idf(docFreq=1593, maxDocs=44421)
                0.0625 = fieldNorm(doc=1478)
          0.23749043 = weight(abstract_txt:portal in 1478) [ClassicSimilarity], result of:
            0.23749043 = score(doc=1478,freq=3.0), product of:
              0.33678463 = queryWeight, product of:
                3.1089835 = boost
                6.514082 = idf(docFreq=178, maxDocs=44421)
                0.016629554 = queryNorm
              0.70517004 = fieldWeight in 1478, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.514082 = idf(docFreq=178, maxDocs=44421)
                0.0625 = fieldNorm(doc=1478)
          0.057278603 = weight(abstract_txt:text in 1478) [ClassicSimilarity], result of:
            0.057278603 = score(doc=1478,freq=1.0), product of:
              0.22679643 = queryWeight, product of:
                3.375044 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.016629554 = queryNorm
              0.25255513 = fieldWeight in 1478, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=1478)
          0.1246568 = weight(abstract_txt:collection in 1478) [ClassicSimilarity], result of:
            0.1246568 = score(doc=1478,freq=1.0), product of:
              0.4289624 = queryWeight, product of:
                5.5478144 = boost
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.016629554 = queryNorm
              0.29060075 = fieldWeight in 1478, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.0625 = fieldNorm(doc=1478)
        0.28 = coord(7/25)
    
  5. Choudhury, G.S.; DiLauro, T.; Droettboom, M.; Fujinaga, I.; MacMillan, K.: Strike up the score : deriving searchable and playable digital formats from sheet music (2001) 0.16
    0.16295542 = sum of:
      0.16295542 = product of:
        0.5092357 = sum of:
          0.020993201 = weight(abstract_txt:creating in 2220) [ClassicSimilarity], result of:
            0.020993201 = score(doc=2220,freq=1.0), product of:
              0.12143886 = queryWeight, product of:
                1.3200979 = boost
                5.531857 = idf(docFreq=477, maxDocs=44421)
                0.016629554 = queryNorm
              0.17287053 = fieldWeight in 2220, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.531857 = idf(docFreq=477, maxDocs=44421)
                0.03125 = fieldNorm(doc=2220)
          0.037080057 = weight(abstract_txt:early in 2220) [ClassicSimilarity], result of:
            0.037080057 = score(doc=2220,freq=3.0), product of:
              0.123033985 = queryWeight, product of:
                1.3287395 = boost
                5.5680695 = idf(docFreq=460, maxDocs=44421)
                0.016629554 = queryNorm
              0.3013806 = fieldWeight in 2220, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.5680695 = idf(docFreq=460, maxDocs=44421)
                0.03125 = fieldNorm(doc=2220)
          0.029794725 = weight(abstract_txt:center in 2220) [ClassicSimilarity], result of:
            0.029794725 = score(doc=2220,freq=1.0), product of:
              0.15336685 = queryWeight, product of:
                1.4835188 = boost
                6.2166705 = idf(docFreq=240, maxDocs=44421)
                0.016629554 = queryNorm
              0.19427095 = fieldWeight in 2220, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2166705 = idf(docFreq=240, maxDocs=44421)
                0.03125 = fieldNorm(doc=2220)
          0.034942236 = weight(abstract_txt:project in 2220) [ClassicSimilarity], result of:
            0.034942236 = score(doc=2220,freq=5.0), product of:
              0.11417722 = queryWeight, product of:
                1.567699 = boost
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.016629554 = queryNorm
              0.3060351 = fieldWeight in 2220, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.03125 = fieldNorm(doc=2220)
          0.08762059 = weight(abstract_txt:workflow in 2220) [ClassicSimilarity], result of:
            0.08762059 = score(doc=2220,freq=5.0), product of:
              0.18409954 = queryWeight, product of:
                1.6253753 = boost
                6.8111186 = idf(docFreq=132, maxDocs=44421)
                0.016629554 = queryNorm
              0.4759414 = fieldWeight in 2220, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.8111186 = idf(docFreq=132, maxDocs=44421)
                0.03125 = fieldNorm(doc=2220)
          0.030558823 = weight(abstract_txt:metadata in 2220) [ClassicSimilarity], result of:
            0.030558823 = score(doc=2220,freq=2.0), product of:
              0.1417153 = queryWeight, product of:
                1.7465512 = boost
                4.87927 = idf(docFreq=917, maxDocs=44421)
                0.016629554 = queryNorm
              0.2156353 = fieldWeight in 2220, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.87927 = idf(docFreq=917, maxDocs=44421)
                0.03125 = fieldNorm(doc=2220)
          0.04351784 = weight(abstract_txt:digital in 2220) [ClassicSimilarity], result of:
            0.04351784 = score(doc=2220,freq=3.0), product of:
              0.18579032 = queryWeight, product of:
                2.5817182 = boost
                4.3274655 = idf(docFreq=1593, maxDocs=44421)
                0.016629554 = queryNorm
              0.23423094 = fieldWeight in 2220, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.3274655 = idf(docFreq=1593, maxDocs=44421)
                0.03125 = fieldNorm(doc=2220)
          0.22472823 = weight(abstract_txt:collection in 2220) [ClassicSimilarity], result of:
            0.22472823 = score(doc=2220,freq=13.0), product of:
              0.4289624 = queryWeight, product of:
                5.5478144 = boost
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.016629554 = queryNorm
              0.52388793 = fieldWeight in 2220, product of:
                3.6055512 = tf(freq=13.0), with freq of:
                  13.0 = termFreq=13.0
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.03125 = fieldNorm(doc=2220)
        0.32 = coord(8/25)
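
The content scores above combine per-term weights with a coordination factor: each matching term contributes queryWeight (boost times idf times queryNorm) multiplied by fieldWeight (tf times idf times fieldNorm), and the sum is scaled by coord (matched terms divided by query terms). A minimal Python sketch follows, recomputing the first content entry (doc 5053) from the figures printed in its tree. The idf formula is again an assumption inferred from the printed values, a boost of 1.0 is assumed where the listing shows none, and all names are illustrative.

```python
import math

MAX_DOCS = 44421          # maxDocs from the listing
QUERY_NORM = 0.016629554  # queryNorm from the listing


def classic_idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    # Assumed formula; it reproduces the idf values shown in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, doc_freq: int, freq: float, field_norm: float) -> float:
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


# (term, boost, docFreq, termFreq) for doc 5053; "within" has no boost line,
# so 1.0 is assumed.
terms_5053 = [
    ("within", 1.0, 1827, 1.0),
    ("center", 1.4835188, 240, 1.0),
    ("project", 1.567699, 1512, 1.0),
    ("metadata", 1.7465512, 917, 4.0),
    ("digital", 2.5817182, 1593, 3.0),
    ("collection", 5.5478144, 1154, 7.0),
]

field_norm_5053 = 0.078125
term_sum = sum(term_score(b, df, f, field_norm_5053) for _, b, df, f in terms_5053)
coord = len(terms_5053) / 25  # coord(6/25): 6 of 25 query terms matched
total_5053 = term_sum * coord
print(f"sum={term_sum:.6f} coord={coord:.2f} score={total_5053:.6f}")
```

The result can be compared with the 0.7654673 sum and 0.18371214 score printed for that entry; small differences in the last digits are expected because the listing uses single-precision floats.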