Document (#42210)

Author
Wiesenmüller, H.
Title
Maschinelle Indexierung am Beispiel der DNB : Analyse und Entwicklungsmöglichkeiten
Source
o-bib: Das offene Bibliotheksjournal. 5(2018) no.4, pp.141-153
Year
2018
Abstract
The article examines the results of the procedure used at the Deutsche Nationalbibliothek (DNB) for automatically assigning subject headings. Since 2017 this procedure has also been applied to print publications in series B and H of the Deutsche Nationalbibliografie. The central problem areas are described and illustrated with examples, for instance that not every word occurring in a table of contents actually expresses a thematic aspect, and that the software very often fails to recognise corporate bodies and other "named entities". The machine-generated results are currently very unsatisfactory. The article considers possible improvements and sensible strategies.
Content
Paper presented at the 107th Deutscher Bibliothekartag 2018 in Berlin, topic area "Fokus Erschließen & Bewahren". https://www.o-bib.de/article/view/5396. https://doi.org/10.5282/o-bib/2018H4S141-153.
Theme
Automatisches Indexieren
Object
DNB
Location
D
Aid
Digitaler Assistent (Averbis)

Similar documents (author)

  1. Wiesenmüller, H.: Gewogen und für zu leicht befunden : die Ergebnisse des RDA Tests in den USA (2011) 4.95
    4.945436 = sum of:
      4.945436 = weight(author_txt:wiesenmüller in 5660) [ClassicSimilarity], result of:
        4.945436 = fieldWeight in 5660, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.912698 = idf(docFreq=43, maxDocs=44218)
          0.625 = fieldNorm(doc=5660)
    
  2. Wiesenmüller, H.: ¬Das Konzept der "Virtuellen Bibliothek" im deutschen Bibliothekswesen der 1990er Jahre (2000) 4.95
    4.945436 = sum of:
      4.945436 = weight(author_txt:wiesenmüller in 123) [ClassicSimilarity], result of:
        4.945436 = fieldWeight in 123, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.912698 = idf(docFreq=43, maxDocs=44218)
          0.625 = fieldNorm(doc=123)
    
  3. Wiesenmüller, H.: Von Fröschen und Strategen : Ein kleiner Leitfaden zur AACR2-Debatte (2002) 4.95
    4.945436 = sum of:
      4.945436 = weight(author_txt:wiesenmüller in 636) [ClassicSimilarity], result of:
        4.945436 = fieldWeight in 636, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.912698 = idf(docFreq=43, maxDocs=44218)
          0.625 = fieldNorm(doc=636)
    
  4. Wiesenmüller, H.: Versuch eines Fazits (2002) 4.95
    4.945436 = sum of:
      4.945436 = weight(author_txt:wiesenmüller in 1100) [ClassicSimilarity], result of:
        4.945436 = fieldWeight in 1100, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.912698 = idf(docFreq=43, maxDocs=44218)
          0.625 = fieldNorm(doc=1100)
    
  5. Wiesenmüller, H.: Langzeitarchivierung von Online-Publikationen an Regionalbibliotheken : Das Projekt 'Baden-Württembergisches Online-Archiv' (BOA) (2004) 4.95
    4.945436 = sum of:
      4.945436 = weight(author_txt:wiesenmüller in 2283) [ClassicSimilarity], result of:
        4.945436 = fieldWeight in 2283, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.912698 = idf(docFreq=43, maxDocs=44218)
          0.625 = fieldNorm(doc=2283)
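
All five author matches above report the same score of 4.945436. As a rough check, that figure can be reproduced from the numbers in the explain output, using the tf and idf formulas documented for Lucene's ClassicSimilarity: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names below are illustrative only and are not part of the Lucene API. Lucene itself computes in 32-bit floats, so the last digits can differ slightly.

```python
import math


def classic_tf(freq: float) -> float:
    # ClassicSimilarity term-frequency factor: square root of the raw count.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as itemised in the explain output.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the author_txt:wiesenmüller explanation above.
score = field_weight(freq=1.0, doc_freq=43, max_docs=44218, field_norm=0.625)
print(f"{score:.6f}")
```

Since the author query has a single term and the explain output shows no query weight or coord factor, the field weight is the whole score here.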
    

Similar documents (content)

  1. Ansorge, K.; Vierschilling, N.: http://dnb.ddb.de : Von dicken Wälzern zur Online-Verzeichnung (2003) 0.15
    0.15042014 = sum of:
      0.15042014 = product of:
        0.5372148 = sum of:
          0.023770291 = weight(abstract_txt:nicht in 1952) [ClassicSimilarity], result of:
            0.023770291 = score(doc=1952,freq=3.0), product of:
              0.08907657 = queryWeight, product of:
                1.007994 = boost
                3.9441223 = idf(docFreq=2327, maxDocs=44218)
                0.022405528 = queryNorm
              0.26685235 = fieldWeight in 1952, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9441223 = idf(docFreq=2327, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1952)
          0.07018205 = weight(abstract_txt:angestellt in 1952) [ClassicSimilarity], result of:
            0.07018205 = score(doc=1952,freq=1.0), product of:
              0.2098561 = queryWeight, product of:
                1.0940118 = boost
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.022405528 = queryNorm
              0.3344294 = fieldWeight in 1952, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1952)
          0.07367556 = weight(abstract_txt:reihen in 1952) [ClassicSimilarity], result of:
            0.07367556 = score(doc=1952,freq=1.0), product of:
              0.21676368 = queryWeight, product of:
                1.1118711 = boost
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.022405528 = queryNorm
              0.33988887 = fieldWeight in 1952, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1952)
          0.18367799 = weight(abstract_txt:nationalbibliografie in 1952) [ClassicSimilarity], result of:
            0.18367799 = score(doc=1952,freq=6.0), product of:
              0.21932688 = queryWeight, product of:
                1.1184257 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.022405528 = queryNorm
              0.83746225 = fieldWeight in 1952, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1952)
          0.03580977 = weight(abstract_txt:dass in 1952) [ClassicSimilarity], result of:
            0.03580977 = score(doc=1952,freq=3.0), product of:
              0.1170599 = queryWeight, product of:
                1.1555275 = boost
                4.5213976 = idf(docFreq=1306, maxDocs=44218)
                0.022405528 = queryNorm
              0.30590978 = fieldWeight in 1952, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.5213976 = idf(docFreq=1306, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1952)
          0.11328356 = weight(abstract_txt:deutschen in 1952) [ClassicSimilarity], result of:
            0.11328356 = score(doc=1952,freq=14.0), product of:
              0.15095568 = queryWeight, product of:
                1.3122027 = boost
                5.1344433 = idf(docFreq=707, maxDocs=44218)
                0.022405528 = queryNorm
              0.7504425 = fieldWeight in 1952, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                5.1344433 = idf(docFreq=707, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1952)
          0.036815558 = weight(abstract_txt:ergebnisse in 1952) [ClassicSimilarity], result of:
            0.036815558 = score(doc=1952,freq=1.0), product of:
              0.17197625 = queryWeight, product of:
                1.4005882 = boost
                5.4802814 = idf(docFreq=500, maxDocs=44218)
                0.022405528 = queryNorm
              0.2140735 = fieldWeight in 1952, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4802814 = idf(docFreq=500, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1952)
        0.28 = coord(7/25)
    
  2. Lepsky, K.: Automatische Indexierung des Reallexikons zur Deutschen Kunstgeschichte (2006) 0.11
    0.10683996 = sum of:
      0.10683996 = product of:
        0.3815713 = sum of:
          0.04948181 = weight(abstract_txt:nicht in 6080) [ClassicSimilarity], result of:
            0.04948181 = score(doc=6080,freq=13.0), product of:
              0.08907657 = queryWeight, product of:
                1.007994 = boost
                3.9441223 = idf(docFreq=2327, maxDocs=44218)
                0.022405528 = queryNorm
              0.55549747 = fieldWeight in 6080, product of:
                3.6055512 = tf(freq=13.0), with freq of:
                  13.0 = termFreq=13.0
                3.9441223 = idf(docFreq=2327, maxDocs=44218)
                0.0390625 = fieldNorm(doc=6080)
          0.057433635 = weight(abstract_txt:sinnvolle in 6080) [ClassicSimilarity], result of:
            0.057433635 = score(doc=6080,freq=1.0), product of:
              0.18360385 = queryWeight, product of:
                1.0232979 = boost
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.022405528 = queryNorm
              0.3128128 = fieldWeight in 6080, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.0390625 = fieldNorm(doc=6080)
          0.06236983 = weight(abstract_txt:maschinell in 6080) [ClassicSimilarity], result of:
            0.06236983 = score(doc=6080,freq=1.0), product of:
              0.19397865 = queryWeight, product of:
                1.0518122 = boost
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.022405528 = queryNorm
              0.32152936 = fieldWeight in 6080, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.0390625 = fieldNorm(doc=6080)
          0.041349556 = weight(abstract_txt:dass in 6080) [ClassicSimilarity], result of:
            0.041349556 = score(doc=6080,freq=4.0), product of:
              0.1170599 = queryWeight, product of:
                1.1555275 = boost
                4.5213976 = idf(docFreq=1306, maxDocs=44218)
                0.022405528 = queryNorm
              0.35323417 = fieldWeight in 6080, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.5213976 = idf(docFreq=1306, maxDocs=44218)
                0.0390625 = fieldNorm(doc=6080)
          0.08742828 = weight(abstract_txt:inhaltsverzeichnis in 6080) [ClassicSimilarity], result of:
            0.08742828 = score(doc=6080,freq=1.0), product of:
              0.24296227 = queryWeight, product of:
                1.1771468 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.022405528 = queryNorm
              0.35984302 = fieldWeight in 6080, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.0390625 = fieldNorm(doc=6080)
          0.042817157 = weight(abstract_txt:deutschen in 6080) [ClassicSimilarity], result of:
            0.042817157 = score(doc=6080,freq=2.0), product of:
              0.15095568 = queryWeight, product of:
                1.3122027 = boost
                5.1344433 = idf(docFreq=707, maxDocs=44218)
                0.022405528 = queryNorm
              0.2836406 = fieldWeight in 6080, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1344433 = idf(docFreq=707, maxDocs=44218)
                0.0390625 = fieldNorm(doc=6080)
          0.04069106 = weight(abstract_txt:sehr in 6080) [ClassicSimilarity], result of:
            0.04069106 = score(doc=6080,freq=1.0), product of:
              0.18384291 = queryWeight, product of:
                1.4481037 = boost
                5.666202 = idf(docFreq=415, maxDocs=44218)
                0.022405528 = queryNorm
              0.22133602 = fieldWeight in 6080, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.666202 = idf(docFreq=415, maxDocs=44218)
                0.0390625 = fieldNorm(doc=6080)
        0.28 = coord(7/25)
    
  3. Ansorge, K.: Deutsche Nationalbibliographie 2004 (2003) 0.11
    0.1063383 = sum of:
      0.1063383 = product of:
        0.5316915 = sum of:
          0.019213298 = weight(abstract_txt:nicht in 1796) [ClassicSimilarity], result of:
            0.019213298 = score(doc=1796,freq=1.0), product of:
              0.08907657 = queryWeight, product of:
                1.007994 = boost
                3.9441223 = idf(docFreq=2327, maxDocs=44218)
                0.022405528 = queryNorm
              0.21569419 = fieldWeight in 1796, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9441223 = idf(docFreq=2327, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1796)
          0.09825487 = weight(abstract_txt:angestellt in 1796) [ClassicSimilarity], result of:
            0.09825487 = score(doc=1796,freq=1.0), product of:
              0.2098561 = queryWeight, product of:
                1.0940118 = boost
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.022405528 = queryNorm
              0.46820116 = fieldWeight in 1796, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1796)
          0.14587016 = weight(abstract_txt:reihen in 1796) [ClassicSimilarity], result of:
            0.14587016 = score(doc=1796,freq=2.0), product of:
              0.21676368 = queryWeight, product of:
                1.1118711 = boost
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.022405528 = queryNorm
              0.6729456 = fieldWeight in 1796, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1796)
          0.14846513 = weight(abstract_txt:nationalbibliografie in 1796) [ClassicSimilarity], result of:
            0.14846513 = score(doc=1796,freq=2.0), product of:
              0.21932688 = queryWeight, product of:
                1.1184257 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.022405528 = queryNorm
              0.6769126 = fieldWeight in 1796, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1796)
          0.11988804 = weight(abstract_txt:deutschen in 1796) [ClassicSimilarity], result of:
            0.11988804 = score(doc=1796,freq=8.0), product of:
              0.15095568 = queryWeight, product of:
                1.3122027 = boost
                5.1344433 = idf(docFreq=707, maxDocs=44218)
                0.022405528 = queryNorm
              0.7941936 = fieldWeight in 1796, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.1344433 = idf(docFreq=707, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1796)
        0.2 = coord(5/25)
    
  4. Ansorge, K.: Deutsche Nationalbibliographie 2004 (2003) 0.11
    0.1063383 = sum of:
      0.1063383 = product of:
        0.5316915 = sum of:
          0.019213298 = weight(abstract_txt:nicht in 2034) [ClassicSimilarity], result of:
            0.019213298 = score(doc=2034,freq=1.0), product of:
              0.08907657 = queryWeight, product of:
                1.007994 = boost
                3.9441223 = idf(docFreq=2327, maxDocs=44218)
                0.022405528 = queryNorm
              0.21569419 = fieldWeight in 2034, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9441223 = idf(docFreq=2327, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2034)
          0.09825487 = weight(abstract_txt:angestellt in 2034) [ClassicSimilarity], result of:
            0.09825487 = score(doc=2034,freq=1.0), product of:
              0.2098561 = queryWeight, product of:
                1.0940118 = boost
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.022405528 = queryNorm
              0.46820116 = fieldWeight in 2034, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2034)
          0.14587016 = weight(abstract_txt:reihen in 2034) [ClassicSimilarity], result of:
            0.14587016 = score(doc=2034,freq=2.0), product of:
              0.21676368 = queryWeight, product of:
                1.1118711 = boost
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.022405528 = queryNorm
              0.6729456 = fieldWeight in 2034, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2034)
          0.14846513 = weight(abstract_txt:nationalbibliografie in 2034) [ClassicSimilarity], result of:
            0.14846513 = score(doc=2034,freq=2.0), product of:
              0.21932688 = queryWeight, product of:
                1.1184257 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.022405528 = queryNorm
              0.6769126 = fieldWeight in 2034, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2034)
          0.11988804 = weight(abstract_txt:deutschen in 2034) [ClassicSimilarity], result of:
            0.11988804 = score(doc=2034,freq=8.0), product of:
              0.15095568 = queryWeight, product of:
                1.3122027 = boost
                5.1344433 = idf(docFreq=707, maxDocs=44218)
                0.022405528 = queryNorm
              0.7941936 = fieldWeight in 2034, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.1344433 = idf(docFreq=707, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2034)
        0.2 = coord(5/25)
    
  5. Iglezakis, D.; Schembera, B.: Anforderungen der Ingenieurwissenschaften an das Forschungsdatenmanagement der Universität Stuttgart : Ergebnisse der Bedarfsanalyse des Projektes DIPL-ING (2018) 0.10
    0.10281452 = sum of:
      0.10281452 = product of:
        0.42839384 = sum of:
          0.031053381 = weight(abstract_txt:nicht in 4488) [ClassicSimilarity], result of:
            0.031053381 = score(doc=4488,freq=2.0), product of:
              0.08907657 = queryWeight, product of:
                1.007994 = boost
                3.9441223 = idf(docFreq=2327, maxDocs=44218)
                0.022405528 = queryNorm
              0.34861445 = fieldWeight in 4488, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9441223 = idf(docFreq=2327, maxDocs=44218)
                0.0625 = fieldNorm(doc=4488)
          0.09189382 = weight(abstract_txt:sinnvolle in 4488) [ClassicSimilarity], result of:
            0.09189382 = score(doc=4488,freq=1.0), product of:
              0.18360385 = queryWeight, product of:
                1.0232979 = boost
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.022405528 = queryNorm
              0.5005005 = fieldWeight in 4488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.0625 = fieldNorm(doc=4488)
          0.033079647 = weight(abstract_txt:dass in 4488) [ClassicSimilarity], result of:
            0.033079647 = score(doc=4488,freq=1.0), product of:
              0.1170599 = queryWeight, product of:
                1.1555275 = boost
                4.5213976 = idf(docFreq=1306, maxDocs=44218)
                0.022405528 = queryNorm
              0.28258735 = fieldWeight in 4488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5213976 = idf(docFreq=1306, maxDocs=44218)
                0.0625 = fieldNorm(doc=4488)
          0.14835642 = weight(abstract_txt:problembereiche in 4488) [ClassicSimilarity], result of:
            0.14835642 = score(doc=4488,freq=1.0), product of:
              0.25267473 = queryWeight, product of:
                1.2004446 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.022405528 = queryNorm
              0.5871439 = fieldWeight in 4488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.0625 = fieldNorm(doc=4488)
          0.05890489 = weight(abstract_txt:ergebnisse in 4488) [ClassicSimilarity], result of:
            0.05890489 = score(doc=4488,freq=1.0), product of:
              0.17197625 = queryWeight, product of:
                1.4005882 = boost
                5.4802814 = idf(docFreq=500, maxDocs=44218)
                0.022405528 = queryNorm
              0.34251758 = fieldWeight in 4488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4802814 = idf(docFreq=500, maxDocs=44218)
                0.0625 = fieldNorm(doc=4488)
          0.06510569 = weight(abstract_txt:sehr in 4488) [ClassicSimilarity], result of:
            0.06510569 = score(doc=4488,freq=1.0), product of:
              0.18384291 = queryWeight, product of:
                1.4481037 = boost
                5.666202 = idf(docFreq=415, maxDocs=44218)
                0.022405528 = queryNorm
              0.35413763 = fieldWeight in 4488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.666202 = idf(docFreq=415, maxDocs=44218)
                0.0625 = fieldNorm(doc=4488)
        0.24 = coord(6/25)
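
The content scores combine more factors than the author scores: each matching abstract_txt term contributes queryWeight * fieldWeight, and the sum is scaled by coord (matched terms / query terms). The sketch below recomputes the score of the last hit (doc 4488) from the values shown in its explanation. It assumes the ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The helper names are illustrative, and small deviations in the last digits are expected because Lucene works with 32-bit floats.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.022405528   # queryNorm from the explain output
FIELD_NORM = 0.0625        # fieldNorm(doc=4488)
COORD = 6 / 25             # 6 of 25 query terms matched

# (term, freq in doc 4488, docFreq, boost), copied from the explanation above.
TERMS = [
    ("nicht", 2.0, 2327, 1.007994),
    ("sinnvolle", 1.0, 39, 1.0232979),
    ("dass", 1.0, 1306, 1.1555275),
    ("problembereiche", 1.0, 9, 1.2004446),
    ("ergebnisse", 1.0, 500, 1.4005882),
    ("sehr", 1.0, 415, 1.4481037),
]


def classic_idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float) -> float:
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM          # query side
    field_weight = math.sqrt(freq) * idf * FIELD_NORM  # document side
    return query_weight * field_weight


term_sum = sum(term_score(freq, df, boost) for _, freq, df, boost in TERMS)
total = COORD * term_sum
print(f"sum of term scores: {term_sum:.6f}")
print(f"score after coord:  {total:.6f}")
```

The breakdown shows why rare terms dominate: "problembereiche" (docFreq=9) contributes the largest share of the sum, while frequent function words such as "nicht" and "dass" add little even when they occur more than once.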