Document (#42210)

Author
Wiesenmüller, H.
Title
Maschinelle Indexierung am Beispiel der DNB : Analyse und Entwicklungsmöglichkeiten
Source
o-bib: Das offene Bibliotheksjournal. 5(2018) no.4, pp.141-153
Year
2018
Abstract
The article examines the results of the procedure used at the Deutsche Nationalbibliothek (DNB) for the automatic assignment of subject headings. Since 2017 this procedure has also been applied to print publications in series B and H of the Deutsche Nationalbibliografie. The central problem areas are described and illustrated with examples: for instance, not all words occurring in a table of contents actually express thematic aspects, and the software very often fails to recognize corporate bodies and other "named entities". The machine-generated results are currently very unsatisfactory. The article considers possible improvements and sensible strategies.
Content
Paper presented at the 107th Deutscher Bibliothekartag 2018 in Berlin, topic area "Fokus Erschließen & Bewahren". https://www.o-bib.de/article/view/5396. https://doi.org/10.5282/o-bib/2018H4S141-153.
Theme
Automatic indexing
Object
DNB
Location
D
Aid
Digitaler Assistent (Averbis)

Similar documents (author)

  1. Wiesenmüller, H.: Gewogen und für zu leicht befunden : die Ergebnisse des RDA Tests in den USA (2011) 4.93
    4.934253 = sum of:
      4.934253 = weight(author_txt:wiesenmüller in 6660) [ClassicSimilarity], result of:
        4.934253 = fieldWeight in 6660, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.894805 = idf(docFreq=44, maxDocs=44421)
          0.625 = fieldNorm(doc=6660)
    
  2. Wiesenmüller, H.: ¬Das Konzept der "Virtuellen Bibliothek" im deutschen Bibliothekswesen der 1990er Jahre (2000) 4.93
    4.934253 = sum of:
      4.934253 = weight(author_txt:wiesenmüller in 1123) [ClassicSimilarity], result of:
        4.934253 = fieldWeight in 1123, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.894805 = idf(docFreq=44, maxDocs=44421)
          0.625 = fieldNorm(doc=1123)
    
  3. Wiesenmüller, H.: Von Fröschen und Strategen : Ein kleiner Leitfaden zur AACR2-Debatte (2002) 4.93
    4.934253 = sum of:
      4.934253 = weight(author_txt:wiesenmüller in 1636) [ClassicSimilarity], result of:
        4.934253 = fieldWeight in 1636, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.894805 = idf(docFreq=44, maxDocs=44421)
          0.625 = fieldNorm(doc=1636)
    
  4. Wiesenmüller, H.: Versuch eines Fazits (2002) 4.93
    4.934253 = sum of:
      4.934253 = weight(author_txt:wiesenmüller in 2100) [ClassicSimilarity], result of:
        4.934253 = fieldWeight in 2100, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.894805 = idf(docFreq=44, maxDocs=44421)
          0.625 = fieldNorm(doc=2100)
    
  5. Wiesenmüller, H.: Langzeitarchivierung von Online-Publikationen an Regionalbibliotheken : Das Projekt 'Baden-Württembergisches Online-Archiv' (BOA) (2004) 4.93
    4.934253 = sum of:
      4.934253 = weight(author_txt:wiesenmüller in 3283) [ClassicSimilarity], result of:
        4.934253 = fieldWeight in 3283, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.894805 = idf(docFreq=44, maxDocs=44421)
          0.625 = fieldNorm(doc=3283)
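
  Note on the author scores: all five hits carry the same value because each explanation is the product of the same three factors (tf, idf, fieldNorm). The following is a minimal sketch of that arithmetic, not the search engine's own code. The function names are illustrative, and the idf formula 1 + ln(maxDocs / (docFreq + 1)) is an assumption that happens to reproduce the idf values printed in this listing.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Assumed idf formula; it matches the idf values shown in the listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explanation tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the author_txt:wiesenmüller explanations above.
author_score = field_weight(freq=1.0, doc_freq=44, max_docs=44421, field_norm=0.625)
print(author_score)
```

  Under these assumptions the result agrees with the listed 4.934253 up to floating-point rounding; small differences in the last digits are expected because the listing appears to use single-precision values.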
    

Similar documents (content)

  1. Ansorge, K.; Vierschilling, N.: http://dnb.ddb.de : Von dicken Wälzern zur Online-Verzeichnung (2003) 0.15
    0.15072332 = sum of:
      0.15072332 = product of:
        0.5382976 = sum of:
          0.02378641 = weight(abstract_txt:nicht in 2952) [ClassicSimilarity], result of:
            0.02378641 = score(doc=2952,freq=3.0), product of:
              0.08914965 = queryWeight, product of:
                1.007261 = boost
                3.9435613 = idf(docFreq=2339, maxDocs=44421)
                0.022443417 = queryNorm
              0.2668144 = fieldWeight in 2952, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9435613 = idf(docFreq=2339, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2952)
          0.07037244 = weight(abstract_txt:angestellt in 2952) [ClassicSimilarity], result of:
            0.07037244 = score(doc=2952,freq=1.0), product of:
              0.21031289 = queryWeight, product of:
                1.0939568 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.022443417 = queryNorm
              0.33460832 = fieldWeight in 2952, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2952)
          0.07387353 = weight(abstract_txt:reihen in 2952) [ClassicSimilarity], result of:
            0.07387353 = score(doc=2952,freq=1.0), product of:
              0.21723178 = queryWeight, product of:
                1.1118058 = boost
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.022443417 = queryNorm
              0.34006777 = fieldWeight in 2952, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2952)
          0.1841698 = weight(abstract_txt:nationalbibliografie in 2952) [ClassicSimilarity], result of:
            0.1841698 = score(doc=2952,freq=6.0), product of:
              0.21979913 = queryWeight, product of:
                1.1183563 = boost
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.022443417 = queryNorm
              0.8379005 = fieldWeight in 2952, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2952)
          0.03579509 = weight(abstract_txt:dass in 2952) [ClassicSimilarity], result of:
            0.03579509 = score(doc=2952,freq=3.0), product of:
              0.117071 = queryWeight, product of:
                1.1542686 = boost
                4.5191154 = idf(docFreq=1315, maxDocs=44421)
                0.022443417 = queryNorm
              0.30575538 = fieldWeight in 2952, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.5191154 = idf(docFreq=1315, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2952)
          0.11343207 = weight(abstract_txt:deutschen in 2952) [ClassicSimilarity], result of:
            0.11343207 = score(doc=2952,freq=14.0), product of:
              0.15114322 = queryWeight, product of:
                1.311525 = boost
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.022443417 = queryNorm
              0.75049394 = fieldWeight in 2952, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2952)
          0.03686827 = weight(abstract_txt:ergebnisse in 2952) [ClassicSimilarity], result of:
            0.03686827 = score(doc=2952,freq=1.0), product of:
              0.17220376 = queryWeight, product of:
                1.3999211 = boost
                5.4808774 = idf(docFreq=502, maxDocs=44421)
                0.022443417 = queryNorm
              0.21409677 = fieldWeight in 2952, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4808774 = idf(docFreq=502, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2952)
        0.28 = coord(7/25)
    
  2. Lepsky, K.: Automatische Indexierung des Reallexikons zur Deutschen Kunstgeschichte (2006) 0.11
    0.10681918 = sum of:
      0.10681918 = product of:
        0.38149709 = sum of:
          0.049515363 = weight(abstract_txt:nicht in 80) [ClassicSimilarity], result of:
            0.049515363 = score(doc=80,freq=13.0), product of:
              0.08914965 = queryWeight, product of:
                1.007261 = boost
                3.9435613 = idf(docFreq=2339, maxDocs=44421)
                0.022443417 = queryNorm
              0.5554185 = fieldWeight in 80, product of:
                3.6055512 = tf(freq=13.0), with freq of:
                  13.0 = termFreq=13.0
                3.9435613 = idf(docFreq=2339, maxDocs=44421)
                0.0390625 = fieldNorm(doc=80)
          0.057595827 = weight(abstract_txt:sinnvolle in 80) [ClassicSimilarity], result of:
            0.057595827 = score(doc=80,freq=1.0), product of:
              0.18401708 = queryWeight, product of:
                1.0232843 = boost
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.022443417 = queryNorm
              0.31299174 = fieldWeight in 80, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.0390625 = fieldNorm(doc=80)
          0.06184461 = weight(abstract_txt:maschinell in 80) [ClassicSimilarity], result of:
            0.06184461 = score(doc=80,freq=1.0), product of:
              0.19295914 = queryWeight, product of:
                1.0478519 = boost
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.022443417 = queryNorm
              0.32050624 = fieldWeight in 80, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.0390625 = fieldNorm(doc=80)
          0.041332606 = weight(abstract_txt:dass in 80) [ClassicSimilarity], result of:
            0.041332606 = score(doc=80,freq=4.0), product of:
              0.117071 = queryWeight, product of:
                1.1542686 = boost
                4.5191154 = idf(docFreq=1315, maxDocs=44421)
                0.022443417 = queryNorm
              0.3530559 = fieldWeight in 80, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.5191154 = idf(docFreq=1315, maxDocs=44421)
                0.0390625 = fieldNorm(doc=80)
          0.08765551 = weight(abstract_txt:inhaltsverzeichnis in 80) [ClassicSimilarity], result of:
            0.08765551 = score(doc=80,freq=1.0), product of:
              0.2434727 = queryWeight, product of:
                1.1770431 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.022443417 = queryNorm
              0.36002192 = fieldWeight in 80, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.0390625 = fieldNorm(doc=80)
          0.04287329 = weight(abstract_txt:deutschen in 80) [ClassicSimilarity], result of:
            0.04287329 = score(doc=80,freq=2.0), product of:
              0.15114322 = queryWeight, product of:
                1.311525 = boost
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.022443417 = queryNorm
              0.28366002 = fieldWeight in 80, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.0390625 = fieldNorm(doc=80)
          0.040679857 = weight(abstract_txt:sehr in 80) [ClassicSimilarity], result of:
            0.040679857 = score(doc=80,freq=1.0), product of:
              0.18387686 = queryWeight, product of:
                1.446591 = boost
                5.6635966 = idf(docFreq=418, maxDocs=44421)
                0.022443417 = queryNorm
              0.22123425 = fieldWeight in 80, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6635966 = idf(docFreq=418, maxDocs=44421)
                0.0390625 = fieldNorm(doc=80)
        0.28 = coord(7/25)
    
  3. Ansorge, K.: Deutsche Nationalbibliographie 2004 (2003) 0.11
    0.10658355 = sum of:
      0.10658355 = product of:
        0.53291774 = sum of:
          0.019226326 = weight(abstract_txt:nicht in 2796) [ClassicSimilarity], result of:
            0.019226326 = score(doc=2796,freq=1.0), product of:
              0.08914965 = queryWeight, product of:
                1.007261 = boost
                3.9435613 = idf(docFreq=2339, maxDocs=44421)
                0.022443417 = queryNorm
              0.21566351 = fieldWeight in 2796, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9435613 = idf(docFreq=2339, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2796)
          0.09852143 = weight(abstract_txt:angestellt in 2796) [ClassicSimilarity], result of:
            0.09852143 = score(doc=2796,freq=1.0), product of:
              0.21031289 = queryWeight, product of:
                1.0939568 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.022443417 = queryNorm
              0.46845168 = fieldWeight in 2796, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2796)
          0.14626212 = weight(abstract_txt:reihen in 2796) [ClassicSimilarity], result of:
            0.14626212 = score(doc=2796,freq=2.0), product of:
              0.21723178 = queryWeight, product of:
                1.1118058 = boost
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.022443417 = queryNorm
              0.67329985 = fieldWeight in 2796, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2796)
          0.14886267 = weight(abstract_txt:nationalbibliografie in 2796) [ClassicSimilarity], result of:
            0.14886267 = score(doc=2796,freq=2.0), product of:
              0.21979913 = queryWeight, product of:
                1.1183563 = boost
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.022443417 = queryNorm
              0.6772669 = fieldWeight in 2796, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2796)
          0.120045215 = weight(abstract_txt:deutschen in 2796) [ClassicSimilarity], result of:
            0.120045215 = score(doc=2796,freq=8.0), product of:
              0.15114322 = queryWeight, product of:
                1.311525 = boost
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.022443417 = queryNorm
              0.7942481 = fieldWeight in 2796, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2796)
        0.2 = coord(5/25)
    
  4. Ansorge, K.: Deutsche Nationalbibliographie 2004 (2003) 0.11
    0.10658355 = sum of:
      0.10658355 = product of:
        0.53291774 = sum of:
          0.019226326 = weight(abstract_txt:nicht in 3034) [ClassicSimilarity], result of:
            0.019226326 = score(doc=3034,freq=1.0), product of:
              0.08914965 = queryWeight, product of:
                1.007261 = boost
                3.9435613 = idf(docFreq=2339, maxDocs=44421)
                0.022443417 = queryNorm
              0.21566351 = fieldWeight in 3034, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9435613 = idf(docFreq=2339, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3034)
          0.09852143 = weight(abstract_txt:angestellt in 3034) [ClassicSimilarity], result of:
            0.09852143 = score(doc=3034,freq=1.0), product of:
              0.21031289 = queryWeight, product of:
                1.0939568 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.022443417 = queryNorm
              0.46845168 = fieldWeight in 3034, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3034)
          0.14626212 = weight(abstract_txt:reihen in 3034) [ClassicSimilarity], result of:
            0.14626212 = score(doc=3034,freq=2.0), product of:
              0.21723178 = queryWeight, product of:
                1.1118058 = boost
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.022443417 = queryNorm
              0.67329985 = fieldWeight in 3034, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3034)
          0.14886267 = weight(abstract_txt:nationalbibliografie in 3034) [ClassicSimilarity], result of:
            0.14886267 = score(doc=3034,freq=2.0), product of:
              0.21979913 = queryWeight, product of:
                1.1183563 = boost
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.022443417 = queryNorm
              0.6772669 = fieldWeight in 3034, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3034)
          0.120045215 = weight(abstract_txt:deutschen in 3034) [ClassicSimilarity], result of:
            0.120045215 = score(doc=3034,freq=8.0), product of:
              0.15114322 = queryWeight, product of:
                1.311525 = boost
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.022443417 = queryNorm
              0.7942481 = fieldWeight in 3034, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3034)
        0.2 = coord(5/25)
    
  5. Iglezakis, D.; Schembera, B.: Anforderungen der Ingenieurwissenschaften an das Forschungsdatenmanagement der Universität Stuttgart : Ergebnisse der Bedarfsanalyse des Projektes DIPL-ING (2018) 0.10
    0.10298606 = sum of:
      0.10298606 = product of:
        0.4291086 = sum of:
          0.031074435 = weight(abstract_txt:nicht in 488) [ClassicSimilarity], result of:
            0.031074435 = score(doc=488,freq=2.0), product of:
              0.08914965 = queryWeight, product of:
                1.007261 = boost
                3.9435613 = idf(docFreq=2339, maxDocs=44421)
                0.022443417 = queryNorm
              0.34856486 = fieldWeight in 488, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9435613 = idf(docFreq=2339, maxDocs=44421)
                0.0625 = fieldNorm(doc=488)
          0.09215332 = weight(abstract_txt:sinnvolle in 488) [ClassicSimilarity], result of:
            0.09215332 = score(doc=488,freq=1.0), product of:
              0.18401708 = queryWeight, product of:
                1.0232843 = boost
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.022443417 = queryNorm
              0.5007868 = fieldWeight in 488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.0625 = fieldNorm(doc=488)
          0.033066086 = weight(abstract_txt:dass in 488) [ClassicSimilarity], result of:
            0.033066086 = score(doc=488,freq=1.0), product of:
              0.117071 = queryWeight, product of:
                1.1542686 = boost
                4.5191154 = idf(docFreq=1315, maxDocs=44421)
                0.022443417 = queryNorm
              0.28244472 = fieldWeight in 488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5191154 = idf(docFreq=1315, maxDocs=44421)
                0.0625 = fieldNorm(doc=488)
          0.14873774 = weight(abstract_txt:problembereiche in 488) [ClassicSimilarity], result of:
            0.14873774 = score(doc=488,freq=1.0), product of:
              0.2532007 = queryWeight, product of:
                1.2003273 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.022443417 = queryNorm
              0.5874302 = fieldWeight in 488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.0625 = fieldNorm(doc=488)
          0.058989234 = weight(abstract_txt:ergebnisse in 488) [ClassicSimilarity], result of:
            0.058989234 = score(doc=488,freq=1.0), product of:
              0.17220376 = queryWeight, product of:
                1.3999211 = boost
                5.4808774 = idf(docFreq=502, maxDocs=44421)
                0.022443417 = queryNorm
              0.34255484 = fieldWeight in 488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4808774 = idf(docFreq=502, maxDocs=44421)
                0.0625 = fieldNorm(doc=488)
          0.06508777 = weight(abstract_txt:sehr in 488) [ClassicSimilarity], result of:
            0.06508777 = score(doc=488,freq=1.0), product of:
              0.18387686 = queryWeight, product of:
                1.446591 = boost
                5.6635966 = idf(docFreq=418, maxDocs=44421)
                0.022443417 = queryNorm
              0.3539748 = fieldWeight in 488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6635966 = idf(docFreq=418, maxDocs=44421)
                0.0625 = fieldNorm(doc=488)
        0.24 = coord(6/25)
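
  Note on the content scores: each term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is multiplied by coord (matched query terms divided by the 25 query terms). The sketch below recomputes hit 1 from the numbers in its explanation. It is an illustration under assumptions: the helper names are invented for this example, and the idf formula 1 + ln(maxDocs / (docFreq + 1)) is inferred from the printed idf values rather than taken from any documentation.

```python
import math

QUERY_NORM = 0.022443417  # queryNorm as printed in every explanation
MAX_DOCS = 44421          # maxDocs as printed in every idf line
QUERY_TERMS = 25          # denominator of the coord(...) lines


def idf(doc_freq: int) -> float:
    """Assumed idf formula; it reproduces the idf values in the listing."""
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))


def term_score(boost: float, doc_freq: int, freq: float, field_norm: float) -> float:
    """One term's contribution: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm: float) -> float:
    """Sum of term contributions, scaled by coord = matched / query terms."""
    raw = sum(term_score(b, df, f, field_norm) for b, df, f in terms)
    return raw * len(terms) / QUERY_TERMS


# (boost, docFreq, termFreq) copied from hit 1 (doc=2952).
hit1_terms = [
    (1.007261, 2339, 3.0),   # nicht
    (1.0939568, 22, 1.0),    # angestellt
    (1.1118058, 19, 1.0),    # reihen
    (1.1183563, 18, 6.0),    # nationalbibliografie
    (1.1542686, 1315, 3.0),  # dass
    (1.311525, 710, 14.0),   # deutschen
    (1.3999211, 502, 1.0),   # ergebnisse
]
hit1_score = document_score(hit1_terms, field_norm=0.0390625)

# Effect of coord on ranking, using the raw sums printed for hits 2 and 3.
hit2_final = 0.38149709 * 7 / QUERY_TERMS
hit3_final = 0.53291774 * 5 / QUERY_TERMS
print(hit1_score, hit2_final, hit3_final)
```

  With these inputs the recomputed value for hit 1 agrees with the listed 0.15072332 up to rounding. The last two lines show why hit 3 ranks below hit 2 despite its larger raw sum: it matches only 5 of the 25 query terms, so its coord factor (0.2) is smaller than that of hit 2 (0.28).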