Document (#30368)

Author
Pappas, E.
Herendeen, A.
Title
Enhancing bibliographic records with tables of contents derived from OCR technologies at the American Museum of Natural History Library
Source
Cataloging and classification quarterly. 29(2000) no.4, pp. 61-72
Year
2000
Abstract
This paper reports on a project undertaken at the American Museum of Natural History Library in 1997 to enhance access to materials in the library's collection by using scanning and OCR software to digitize monograph tables of contents and add them to the OPAC bibliographic records. Initially, conference proceedings already in the collection were used; as the project developed, other types of materials were included as well. The rationale for the project is explained, the procedure developed is described, and the lessons learned from using this particular technology are outlined.
Theme
Catalog enrichment (Kataloganreicherung)

Similar documents (content)

  1. The National Digital Library (1994) 0.25
    0.24558347 = sum of:
      0.24558347 = product of:
        1.0232645 = sum of:
          0.15608184 = weight(abstract_txt:initially in 1832) [ClassicSimilarity], result of:
            0.15608184 = score(doc=1832,freq=1.0), product of:
              0.19753711 = queryWeight, product of:
                1.1022521 = boost
                7.2241306 = idf(docFreq=87, maxDocs=44421)
                0.024807451 = queryNorm
              0.7901393 = fieldWeight in 1832, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2241306 = idf(docFreq=87, maxDocs=44421)
                0.109375 = fieldNorm(doc=1832)
          0.061168 = weight(abstract_txt:developed in 1832) [ClassicSimilarity], result of:
            0.061168 = score(doc=1832,freq=1.0), product of:
              0.13328256 = queryWeight, product of:
                1.2804371 = boost
                4.1959753 = idf(docFreq=1817, maxDocs=44421)
                0.024807451 = queryNorm
              0.45893478 = fieldWeight in 1832, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1959753 = idf(docFreq=1817, maxDocs=44421)
                0.109375 = fieldNorm(doc=1832)
          0.30812487 = weight(abstract_txt:digitize in 1832) [ClassicSimilarity], result of:
            0.30812487 = score(doc=1832,freq=1.0), product of:
              0.31086007 = queryWeight, product of:
                1.3827354 = boost
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.024807451 = queryNorm
              0.99120116 = fieldWeight in 1832, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.109375 = fieldNorm(doc=1832)
          0.09829468 = weight(abstract_txt:materials in 1832) [ClassicSimilarity], result of:
            0.09829468 = score(doc=1832,freq=1.0), product of:
              0.18285653 = queryWeight, product of:
                1.4997774 = boost
                4.9147506 = idf(docFreq=885, maxDocs=44421)
                0.024807451 = queryNorm
              0.53755087 = fieldWeight in 1832, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9147506 = idf(docFreq=885, maxDocs=44421)
                0.109375 = fieldNorm(doc=1832)
          0.19092739 = weight(abstract_txt:american in 1832) [ClassicSimilarity], result of:
            0.19092739 = score(doc=1832,freq=2.0), product of:
              0.22593974 = queryWeight, product of:
                1.6671239 = boost
                5.463143 = idf(docFreq=511, maxDocs=44421)
                0.024807451 = queryNorm
              0.84503675 = fieldWeight in 1832, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.463143 = idf(docFreq=511, maxDocs=44421)
                0.109375 = fieldNorm(doc=1832)
          0.20866778 = weight(abstract_txt:project in 1832) [ClassicSimilarity], result of:
            0.20866778 = score(doc=1832,freq=4.0), product of:
              0.21780665 = queryWeight, product of:
                2.0047157 = boost
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.024807451 = queryNorm
              0.95804137 = fieldWeight in 1832, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.109375 = fieldNorm(doc=1832)
        0.24 = coord(6/25)
    
  2. DeVorsey, K.L.; Elson, C.; Gregorev, N.P.; Hansen, J.: The development of a local thesaurus to improve access to the anthropological collections of the American Museum of Natural History (2006) 0.21
    0.2121241 = sum of:
      0.2121241 = product of:
        0.5303102 = sum of:
          0.015664887 = weight(abstract_txt:used in 2174) [ClassicSimilarity], result of:
            0.015664887 = score(doc=2174,freq=1.0), product of:
              0.08532218 = queryWeight, product of:
                1.0244778 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.024807451 = queryNorm
              0.18359688 = fieldWeight in 2174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2174)
          0.028881216 = weight(abstract_txt:were in 2174) [ClassicSimilarity], result of:
            0.028881216 = score(doc=2174,freq=2.0), product of:
              0.101822585 = queryWeight, product of:
                1.1191638 = boost
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.024807451 = queryNorm
              0.28364253 = fieldWeight in 2174, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2174)
          0.031349033 = weight(abstract_txt:bibliographic in 2174) [ClassicSimilarity], result of:
            0.031349033 = score(doc=2174,freq=1.0), product of:
              0.13549602 = queryWeight, product of:
                1.2910256 = boost
                4.230674 = idf(docFreq=1755, maxDocs=44421)
                0.024807451 = queryNorm
              0.23136497 = fieldWeight in 2174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.230674 = idf(docFreq=1755, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2174)
          0.035851367 = weight(abstract_txt:records in 2174) [ClassicSimilarity], result of:
            0.035851367 = score(doc=2174,freq=1.0), product of:
              0.14817704 = queryWeight, product of:
                1.3500879 = boost
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.024807451 = queryNorm
              0.24194953 = fieldWeight in 2174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2174)
          0.041614596 = weight(abstract_txt:collection in 2174) [ClassicSimilarity], result of:
            0.041614596 = score(doc=2174,freq=1.0), product of:
              0.16365938 = queryWeight, product of:
                1.4188682 = boost
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.024807451 = queryNorm
              0.25427565 = fieldWeight in 2174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2174)
          0.05105812 = weight(abstract_txt:history in 2174) [ClassicSimilarity], result of:
            0.05105812 = score(doc=2174,freq=1.0), product of:
              0.1875658 = queryWeight, product of:
                1.5189673 = boost
                4.9776354 = idf(docFreq=831, maxDocs=44421)
                0.024807451 = queryNorm
              0.27221444 = fieldWeight in 2174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9776354 = idf(docFreq=831, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2174)
          0.05393642 = weight(abstract_txt:natural in 2174) [ClassicSimilarity], result of:
            0.05393642 = score(doc=2174,freq=1.0), product of:
              0.19455028 = queryWeight, product of:
                1.54699 = boost
                5.0694656 = idf(docFreq=758, maxDocs=44421)
                0.024807451 = queryNorm
              0.2772364 = fieldWeight in 2174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0694656 = idf(docFreq=758, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2174)
          0.06750303 = weight(abstract_txt:american in 2174) [ClassicSimilarity], result of:
            0.06750303 = score(doc=2174,freq=1.0), product of:
              0.22593974 = queryWeight, product of:
                1.6671239 = boost
                5.463143 = idf(docFreq=511, maxDocs=44421)
                0.024807451 = queryNorm
              0.29876563 = fieldWeight in 2174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.463143 = idf(docFreq=511, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2174)
          0.1522846 = weight(abstract_txt:museum in 2174) [ClassicSimilarity], result of:
            0.1522846 = score(doc=2174,freq=2.0), product of:
              0.308464 = queryWeight, product of:
                1.9479321 = boost
                6.3833475 = idf(docFreq=203, maxDocs=44421)
                0.024807451 = queryNorm
              0.49368683 = fieldWeight in 2174, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.3833475 = idf(docFreq=203, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2174)
          0.052166946 = weight(abstract_txt:project in 2174) [ClassicSimilarity], result of:
            0.052166946 = score(doc=2174,freq=1.0), product of:
              0.21780665 = queryWeight, product of:
                2.0047157 = boost
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.024807451 = queryNorm
              0.23951034 = fieldWeight in 2174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2174)
        0.4 = coord(10/25)
    
  3. Makinen, R.H.; Friesen, B.: Enhancing online bibliographic records to improve retrieval of reference collection monographs (1995) 0.19
    0.19032195 = sum of:
      0.19032195 = product of:
        0.95160973 = sum of:
          0.10243248 = weight(abstract_txt:records in 1768) [ClassicSimilarity], result of:
            0.10243248 = score(doc=1768,freq=1.0), product of:
              0.14817704 = queryWeight, product of:
                1.3500879 = boost
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.024807451 = queryNorm
              0.6912844 = fieldWeight in 1768, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.15625 = fieldNorm(doc=1768)
          0.118898846 = weight(abstract_txt:collection in 1768) [ClassicSimilarity], result of:
            0.118898846 = score(doc=1768,freq=1.0), product of:
              0.16365938 = queryWeight, product of:
                1.4188682 = boost
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.024807451 = queryNorm
              0.7265019 = fieldWeight in 1768, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.15625 = fieldNorm(doc=1768)
          0.22903262 = weight(abstract_txt:contents in 1768) [ClassicSimilarity], result of:
            0.22903262 = score(doc=1768,freq=1.0), product of:
              0.25336933 = queryWeight, product of:
                1.7654223 = boost
                5.7852654 = idf(docFreq=370, maxDocs=44421)
                0.024807451 = queryNorm
              0.9039477 = fieldWeight in 1768, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7852654 = idf(docFreq=370, maxDocs=44421)
                0.15625 = fieldNorm(doc=1768)
          0.14904842 = weight(abstract_txt:project in 1768) [ClassicSimilarity], result of:
            0.14904842 = score(doc=1768,freq=1.0), product of:
              0.21780665 = queryWeight, product of:
                2.0047157 = boost
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.024807451 = queryNorm
              0.68431526 = fieldWeight in 1768, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.15625 = fieldNorm(doc=1768)
          0.35219744 = weight(abstract_txt:tables in 1768) [ClassicSimilarity], result of:
            0.35219744 = score(doc=1768,freq=1.0), product of:
              0.3375566 = queryWeight, product of:
                2.0377219 = boost
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.024807451 = queryNorm
              1.043373 = fieldWeight in 1768, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.15625 = fieldNorm(doc=1768)
        0.2 = coord(5/25)
    
  4. Miksa, S.D.; Moen, W.E.; Snyder, G.; Polyakov, S.; Eklund, A.: Metadata assistance of the Functional Requirements for Bibliographic Records' four user tasks : a report on the MARC content designation utilization (MCDU) project (2006) 0.16
    0.15603468 = sum of:
      0.15603468 = product of:
        0.5572667 = sum of:
          0.02531828 = weight(abstract_txt:used in 1125) [ClassicSimilarity], result of:
            0.02531828 = score(doc=1125,freq=2.0), product of:
              0.08532218 = queryWeight, product of:
                1.0244778 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.024807451 = queryNorm
              0.29673737 = fieldWeight in 1125, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.0625 = fieldNorm(doc=1125)
          0.019545056 = weight(abstract_txt:using in 1125) [ClassicSimilarity], result of:
            0.019545056 = score(doc=1125,freq=1.0), product of:
              0.09046358 = queryWeight, product of:
                1.0548931 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.024807451 = queryNorm
              0.21605442 = fieldWeight in 1125, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.0625 = fieldNorm(doc=1125)
          0.10133538 = weight(abstract_txt:bibliographic in 1125) [ClassicSimilarity], result of:
            0.10133538 = score(doc=1125,freq=8.0), product of:
              0.13549602 = queryWeight, product of:
                1.2910256 = boost
                4.230674 = idf(docFreq=1755, maxDocs=44421)
                0.024807451 = queryNorm
              0.7478845 = fieldWeight in 1125, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.230674 = idf(docFreq=1755, maxDocs=44421)
                0.0625 = fieldNorm(doc=1125)
          0.09161839 = weight(abstract_txt:records in 1125) [ClassicSimilarity], result of:
            0.09161839 = score(doc=1125,freq=5.0), product of:
              0.14817704 = queryWeight, product of:
                1.3500879 = boost
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.024807451 = queryNorm
              0.61830354 = fieldWeight in 1125, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.0625 = fieldNorm(doc=1125)
          0.077146314 = weight(abstract_txt:american in 1125) [ClassicSimilarity], result of:
            0.077146314 = score(doc=1125,freq=1.0), product of:
              0.22593974 = queryWeight, product of:
                1.6671239 = boost
                5.463143 = idf(docFreq=511, maxDocs=44421)
                0.024807451 = queryNorm
              0.34144643 = fieldWeight in 1125, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.463143 = idf(docFreq=511, maxDocs=44421)
                0.0625 = fieldNorm(doc=1125)
          0.123064555 = weight(abstract_txt:museum in 1125) [ClassicSimilarity], result of:
            0.123064555 = score(doc=1125,freq=1.0), product of:
              0.308464 = queryWeight, product of:
                1.9479321 = boost
                6.3833475 = idf(docFreq=203, maxDocs=44421)
                0.024807451 = queryNorm
              0.39895922 = fieldWeight in 1125, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3833475 = idf(docFreq=203, maxDocs=44421)
                0.0625 = fieldNorm(doc=1125)
          0.119238734 = weight(abstract_txt:project in 1125) [ClassicSimilarity], result of:
            0.119238734 = score(doc=1125,freq=4.0), product of:
              0.21780665 = queryWeight, product of:
                2.0047157 = boost
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.024807451 = queryNorm
              0.5474522 = fieldWeight in 1125, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.0625 = fieldNorm(doc=1125)
        0.28 = coord(7/25)
    
  5. LaBarre, K.A.; Tilley, C.L.: The elusive tale : leveraging the study of information seeking and knowledge organization to improve access to and discovery of folktales (2012) 0.15
    0.1519358 = sum of:
      0.1519358 = product of:
        0.4747994 = sum of:
          0.017902726 = weight(abstract_txt:used in 1048) [ClassicSimilarity], result of:
            0.017902726 = score(doc=1048,freq=1.0), product of:
              0.08532218 = queryWeight, product of:
                1.0244778 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.024807451 = queryNorm
              0.20982501 = fieldWeight in 1048, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.0625 = fieldNorm(doc=1048)
          0.040425282 = weight(abstract_txt:were in 1048) [ClassicSimilarity], result of:
            0.040425282 = score(doc=1048,freq=3.0), product of:
              0.101822585 = queryWeight, product of:
                1.1191638 = boost
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.024807451 = queryNorm
              0.39701685 = fieldWeight in 1048, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.0625 = fieldNorm(doc=1048)
          0.035827465 = weight(abstract_txt:bibliographic in 1048) [ClassicSimilarity], result of:
            0.035827465 = score(doc=1048,freq=1.0), product of:
              0.13549602 = queryWeight, product of:
                1.2910256 = boost
                4.230674 = idf(docFreq=1755, maxDocs=44421)
                0.024807451 = queryNorm
              0.2644171 = fieldWeight in 1048, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.230674 = idf(docFreq=1755, maxDocs=44421)
                0.0625 = fieldNorm(doc=1048)
          0.04097299 = weight(abstract_txt:records in 1048) [ClassicSimilarity], result of:
            0.04097299 = score(doc=1048,freq=1.0), product of:
              0.14817704 = queryWeight, product of:
                1.3500879 = boost
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.024807451 = queryNorm
              0.27651376 = fieldWeight in 1048, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.0625 = fieldNorm(doc=1048)
          0.047559537 = weight(abstract_txt:collection in 1048) [ClassicSimilarity], result of:
            0.047559537 = score(doc=1048,freq=1.0), product of:
              0.16365938 = queryWeight, product of:
                1.4188682 = boost
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.024807451 = queryNorm
              0.29060075 = fieldWeight in 1048, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.649612 = idf(docFreq=1154, maxDocs=44421)
                0.0625 = fieldNorm(doc=1048)
          0.091613054 = weight(abstract_txt:contents in 1048) [ClassicSimilarity], result of:
            0.091613054 = score(doc=1048,freq=1.0), product of:
              0.25336933 = queryWeight, product of:
                1.7654223 = boost
                5.7852654 = idf(docFreq=370, maxDocs=44421)
                0.024807451 = queryNorm
              0.3615791 = fieldWeight in 1048, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7852654 = idf(docFreq=370, maxDocs=44421)
                0.0625 = fieldNorm(doc=1048)
          0.059619367 = weight(abstract_txt:project in 1048) [ClassicSimilarity], result of:
            0.059619367 = score(doc=1048,freq=1.0), product of:
              0.21780665 = queryWeight, product of:
                2.0047157 = boost
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.024807451 = queryNorm
              0.2737261 = fieldWeight in 1048, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3796177 = idf(docFreq=1512, maxDocs=44421)
                0.0625 = fieldNorm(doc=1048)
          0.14087898 = weight(abstract_txt:tables in 1048) [ClassicSimilarity], result of:
            0.14087898 = score(doc=1048,freq=1.0), product of:
              0.3375566 = queryWeight, product of:
                2.0377219 = boost
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.024807451 = queryNorm
              0.4173492 = fieldWeight in 1048, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.0625 = fieldNorm(doc=1048)
        0.32 = coord(8/25)
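
The score trees above all follow the same arithmetic. A minimal Python sketch of that arithmetic, using only the values printed in the listing, may help when reading them. The function names (`idf`, `term_score`, `document_score`) are hypothetical, and the idf formula `1 + ln(maxDocs / (docFreq + 1))` is an assumption inferred from the printed numbers rather than something the listing states.

```python
import math

# Constants copied from the explain output above.
MAX_DOCS = 44421
QUERY_NORM = 0.024807451


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed formula; it reproduces e.g. idf(docFreq=87) = 7.2241306.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq, field_norm,
               query_norm=QUERY_NORM, max_docs=MAX_DOCS):
    # score = queryWeight * fieldWeight, as in each "weight(...)" subtree.
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm             # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


def document_score(terms, field_norm, query_term_count):
    # Sum of the matching term scores, scaled by coord(matched/total).
    total = sum(term_score(boost, doc_freq, freq, field_norm)
                for boost, doc_freq, freq in terms)
    coord = len(terms) / query_term_count
    return coord * total


# (boost, docFreq, termFreq) for the five terms matched in doc 1768 (entry 3).
doc_1768_terms = [
    (1.3500879, 1446, 1.0),  # records
    (1.4188682, 1154, 1.0),  # collection
    (1.7654223, 370, 1.0),   # contents
    (2.0047157, 1512, 1.0),  # project
    (2.0377219, 151, 1.0),   # tables
]

# The listing reports 0.19032195 for this document.
score_1768 = document_score(doc_1768_terms, field_norm=0.15625,
                            query_term_count=25)
print(score_1768)
```

The same functions can be applied to any other entry by substituting its boosts, document frequencies, term frequencies, fieldNorm and coord denominator; small differences in the last digits are to be expected because the listing was computed in single precision.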