Document (#43369)

Author
Steeg, F.
Pohl, A.
Title
Ein Protokoll für den Datenabgleich im Web am Beispiel von OpenRefine und der Gemeinsamen Normdatei (GND)
Source
Qualität in der Inhaltserschließung. Ed.: M. Franke-Maier et al.
Imprint
München : DeGruyter-Saur
Year
2021
Pages
pp. 259-278
Series
Bibliotheks- und Informationspraxis; 70
Abstract
Authority data play an important role in the quality of subject indexing of bibliographic and archival resources. One concrete goal of subject indexing is, for example, that all works about Hermann Hesse can be found in a uniform way. Authority data offer a solution here: during indexing, the GND number 11855042X, for example, is used consistently for Hermann Hesse. The result is higher-quality subject indexing, above all in terms of uniformity and unambiguity, and consequently better findability. When such entities are linked to one another, for example Hermann Hesse with one of his works, a knowledge graph emerges of the kind Google uses for indexing the content of the Web (Singhal 2012). The development of the Google Knowledge Graph and the protocol presented here are historically connected: OpenRefine was originally developed as Google Refine, and its functionality for matching against external data sources (reconciliation) was originally built to integrate Freebase, one of the data sources of the Google Knowledge Graph. Freebase was later integrated into Wikidata. Google Refine itself was already used for reconciliation with authority data, such as the Library of Congress Subject Headings (Hooland et al. 2013).
Theme
Authority files
Semantic interoperability
Object
OpenRefine
GND
Google Knowledge Graph

Similar documents (author)

  1. Pohl, M.: Hypertext und analoge Wissensrepräsentation : Wie Texte zu Bildern und Bilder zu Texten werden (2003) 5.66
    5.664006 = sum of:
      5.664006 = weight(author_txt:pohl in 4855) [ClassicSimilarity], result of:
        5.664006 = fieldWeight in 4855, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.06241 = idf(docFreq=13, maxDocs=44421)
          0.625 = fieldNorm(doc=4855)
    
  2. Pohl, A.: OCLC, WorldCat und die Metadaten-Kontroverse (2009) 5.66
    5.664006 = sum of:
      5.664006 = weight(author_txt:pohl in 3780) [ClassicSimilarity], result of:
        5.664006 = fieldWeight in 3780, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.06241 = idf(docFreq=13, maxDocs=44421)
          0.625 = fieldNorm(doc=3780)
    
  3. Pohl, O.: Konzept und prototypische Erstellung eines Informationssystems auf VuFind-Basis für die Bibliotheks- und Informationswissenschaft (2012) 5.66
    5.664006 = sum of:
      5.664006 = weight(author_txt:pohl in 2564) [ClassicSimilarity], result of:
        5.664006 = fieldWeight in 2564, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.06241 = idf(docFreq=13, maxDocs=44421)
          0.625 = fieldNorm(doc=2564)
    
  4. Pohl, O.: rdfedit: user supporting Web application for creating and manipulating RDF instance data (2014) 5.66
    5.664006 = sum of:
      5.664006 = weight(author_txt:pohl in 2571) [ClassicSimilarity], result of:
        5.664006 = fieldWeight in 2571, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.06241 = idf(docFreq=13, maxDocs=44421)
          0.625 = fieldNorm(doc=2571)
    
  5. Pohl, A.: Mit der DFG und CIB nach WorldShare und Alma (2013) 5.66
    5.664006 = sum of:
      5.664006 = weight(author_txt:pohl in 2829) [ClassicSimilarity], result of:
        5.664006 = fieldWeight in 2829, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.06241 = idf(docFreq=13, maxDocs=44421)
          0.625 = fieldNorm(doc=2829)
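
The author-match scores above all follow the same arithmetic: the field weight is the product of tf, idf and fieldNorm. The sketch below reproduces that product for the values shown (freq=1.0, docFreq=13, maxDocs=44421, fieldNorm=0.625). It assumes the standard Lucene ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are illustrative and not part of any library.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming the ClassicSimilarity formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain output."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the author_txt:pohl entries in the listing.
score = field_weight(freq=1.0, doc_freq=13, max_docs=44421, field_norm=0.625)
print(f"{score:.4f}")  # close to the 5.664006 reported in the listing
```

Because all five records match the single query term "pohl" once and share the same fieldNorm, they receive identical scores; small differences in the last digits can arise because Lucene computes in single precision.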
    

Similar documents (content)

  1. Lischka, K.: 128 Zeichen für die Welt : Vor 40 Jahren schrieben Fachleute das Alphabet des Computers - und schufen damit dem ASCII-Standard (2003) 0.10
    0.09521612 = sum of:
      0.09521612 = product of:
        0.29755038 = sum of:
          0.013416804 = weight(abstract_txt:hier in 1391) [ClassicSimilarity], result of:
            0.013416804 = score(doc=1391,freq=1.0), product of:
              0.08176309 = queryWeight, product of:
                1.1173662 = boost
                5.250997 = idf(docFreq=632, maxDocs=44421)
                0.013935417 = queryNorm
              0.16409366 = fieldWeight in 1391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.250997 = idf(docFreq=632, maxDocs=44421)
                0.03125 = fieldNorm(doc=1391)
          0.017993283 = weight(abstract_txt:etwa in 1391) [ClassicSimilarity], result of:
            0.017993283 = score(doc=1391,freq=1.0), product of:
              0.09943321 = queryWeight, product of:
                1.2322041 = boost
                5.790671 = idf(docFreq=368, maxDocs=44421)
                0.013935417 = queryNorm
              0.18095846 = fieldWeight in 1391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.790671 = idf(docFreq=368, maxDocs=44421)
                0.03125 = fieldNorm(doc=1391)
          0.027591953 = weight(abstract_txt:miteinander in 1391) [ClassicSimilarity], result of:
            0.027591953 = score(doc=1391,freq=1.0), product of:
              0.13222478 = queryWeight, product of:
                1.420932 = boost
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.013935417 = queryNorm
              0.2086746 = fieldWeight in 1391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.03125 = fieldNorm(doc=1391)
          0.020824187 = weight(abstract_txt:eine in 1391) [ClassicSimilarity], result of:
            0.020824187 = score(doc=1391,freq=7.0), product of:
              0.07219059 = queryWeight, product of:
                1.4848144 = boost
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.013935417 = queryNorm
              0.28846124 = fieldWeight in 1391, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.03125 = fieldNorm(doc=1391)
          0.042511504 = weight(abstract_txt:ursprünglich in 1391) [ClassicSimilarity], result of:
            0.042511504 = score(doc=1391,freq=1.0), product of:
              0.17638522 = queryWeight, product of:
                1.6411489 = boost
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.013935417 = queryNorm
              0.2410151 = fieldWeight in 1391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.03125 = fieldNorm(doc=1391)
          0.065965146 = weight(abstract_txt:einheitlich in 1391) [ClassicSimilarity], result of:
            0.065965146 = score(doc=1391,freq=1.0), product of:
              0.23641095 = queryWeight, product of:
                1.8999872 = boost
                8.928879 = idf(docFreq=15, maxDocs=44421)
                0.013935417 = queryNorm
              0.27902746 = fieldWeight in 1391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.928879 = idf(docFreq=15, maxDocs=44421)
                0.03125 = fieldNorm(doc=1391)
          0.06896916 = weight(abstract_txt:protokoll in 1391) [ClassicSimilarity], result of:
            0.06896916 = score(doc=1391,freq=1.0), product of:
              0.2435349 = queryWeight, product of:
                1.9284016 = boost
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.013935417 = queryNorm
              0.28320032 = fieldWeight in 1391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.03125 = fieldNorm(doc=1391)
          0.040278345 = weight(abstract_txt:wurde in 1391) [ClassicSimilarity], result of:
            0.040278345 = score(doc=1391,freq=4.0), product of:
              0.13505033 = queryWeight, product of:
                2.0308588 = boost
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.013935417 = queryNorm
              0.29824692 = fieldWeight in 1391, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.03125 = fieldNorm(doc=1391)
        0.32 = coord(8/25)
    
  2. Bohne-Lang, A.: Semantische Metadaten für den Webauftritt einer Bibliothek (2016) 0.09
    0.08596673 = sum of:
      0.08596673 = product of:
        0.3581947 = sum of:
          0.03130277 = weight(abstract_txt:entwickelt in 4337) [ClassicSimilarity], result of:
            0.03130277 = score(doc=4337,freq=1.0), product of:
              0.09060658 = queryWeight, product of:
                1.1762422 = boost
                5.5276814 = idf(docFreq=479, maxDocs=44421)
                0.013935417 = queryNorm
              0.34548008 = fieldWeight in 4337, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5276814 = idf(docFreq=479, maxDocs=44421)
                0.0625 = fieldNorm(doc=4337)
          0.022261992 = weight(abstract_txt:eine in 4337) [ClassicSimilarity], result of:
            0.022261992 = score(doc=4337,freq=2.0), product of:
              0.07219059 = queryWeight, product of:
                1.4848144 = boost
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.013935417 = queryNorm
              0.30837804 = fieldWeight in 4337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.0625 = fieldNorm(doc=4337)
          0.08502301 = weight(abstract_txt:ursprünglich in 4337) [ClassicSimilarity], result of:
            0.08502301 = score(doc=4337,freq=1.0), product of:
              0.17638522 = queryWeight, product of:
                1.6411489 = boost
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.013935417 = queryNorm
              0.4820302 = fieldWeight in 4337, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.0625 = fieldNorm(doc=4337)
          0.040278345 = weight(abstract_txt:wurde in 4337) [ClassicSimilarity], result of:
            0.040278345 = score(doc=4337,freq=1.0), product of:
              0.13505033 = queryWeight, product of:
                2.0308588 = boost
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.013935417 = queryNorm
              0.29824692 = fieldWeight in 4337, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.0625 = fieldNorm(doc=4337)
          0.07826345 = weight(abstract_txt:graph in 4337) [ClassicSimilarity], result of:
            0.07826345 = score(doc=4337,freq=1.0), product of:
              0.19106199 = queryWeight, product of:
                2.091942 = boost
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.013935417 = queryNorm
              0.40962332 = fieldWeight in 4337, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.0625 = fieldNorm(doc=4337)
          0.101065144 = weight(abstract_txt:google in 4337) [ClassicSimilarity], result of:
            0.101065144 = score(doc=4337,freq=2.0), product of:
              0.21321061 = queryWeight, product of:
                2.8529313 = boost
                5.3628736 = idf(docFreq=565, maxDocs=44421)
                0.013935417 = queryNorm
              0.47401553 = fieldWeight in 4337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.3628736 = idf(docFreq=565, maxDocs=44421)
                0.0625 = fieldNorm(doc=4337)
        0.24 = coord(6/25)
    
  3. Qualität in der Inhaltserschließung (2021) 0.08
    0.08134542 = sum of:
      0.08134542 = product of:
        0.6778785 = sum of:
          0.11431814 = weight(abstract_txt:qualität in 1754) [ClassicSimilarity], result of:
            0.11431814 = score(doc=1754,freq=3.0), product of:
              0.113696404 = queryWeight, product of:
                1.3176203 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.013935417 = queryNorm
              1.0054684 = fieldWeight in 1754, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.09375 = fieldNorm(doc=1754)
          0.2087573 = weight(abstract_txt:normdaten in 1754) [ClassicSimilarity], result of:
            0.2087573 = score(doc=1754,freq=1.0), product of:
              0.28043696 = queryWeight, product of:
                2.534429 = boost
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.013935417 = queryNorm
              0.7444001 = fieldWeight in 1754, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.09375 = fieldNorm(doc=1754)
          0.3548031 = weight(abstract_txt:inhaltserschließung in 1754) [ClassicSimilarity], result of:
            0.3548031 = score(doc=1754,freq=3.0), product of:
              0.30479294 = queryWeight, product of:
                3.0509448 = boost
                7.168868 = idf(docFreq=92, maxDocs=44421)
                0.013935417 = queryNorm
              1.1640791 = fieldWeight in 1754, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.168868 = idf(docFreq=92, maxDocs=44421)
                0.09375 = fieldNorm(doc=1754)
        0.12 = coord(3/25)
    
  4. Darstellung der CrissCross-Mappingrelationen im Rahmen des Semantic Web (2010) 0.08
    0.07738649 = sum of:
      0.07738649 = product of:
        0.32244372 = sum of:
          0.01956423 = weight(abstract_txt:entwickelt in 285) [ClassicSimilarity], result of:
            0.01956423 = score(doc=285,freq=1.0), product of:
              0.09060658 = queryWeight, product of:
                1.1762422 = boost
                5.5276814 = idf(docFreq=479, maxDocs=44421)
                0.013935417 = queryNorm
              0.21592505 = fieldWeight in 285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5276814 = idf(docFreq=479, maxDocs=44421)
                0.0390625 = fieldNorm(doc=285)
          0.03448994 = weight(abstract_txt:miteinander in 285) [ClassicSimilarity], result of:
            0.03448994 = score(doc=285,freq=1.0), product of:
              0.13222478 = queryWeight, product of:
                1.420932 = boost
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.013935417 = queryNorm
              0.26084325 = fieldWeight in 285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.0390625 = fieldNorm(doc=285)
          0.021999564 = weight(abstract_txt:eine in 285) [ClassicSimilarity], result of:
            0.021999564 = score(doc=285,freq=5.0), product of:
              0.07219059 = queryWeight, product of:
                1.4848144 = boost
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.013935417 = queryNorm
              0.3047428 = fieldWeight in 285, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.0390625 = fieldNorm(doc=285)
          0.025173964 = weight(abstract_txt:wurde in 285) [ClassicSimilarity], result of:
            0.025173964 = score(doc=285,freq=1.0), product of:
              0.13505033 = queryWeight, product of:
                2.0308588 = boost
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.013935417 = queryNorm
              0.18640432 = fieldWeight in 285, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.0390625 = fieldNorm(doc=285)
          0.07338139 = weight(abstract_txt:verwendet in 285) [ClassicSimilarity], result of:
            0.07338139 = score(doc=285,freq=2.0), product of:
              0.19872946 = queryWeight, product of:
                2.1335049 = boost
                6.684188 = idf(docFreq=150, maxDocs=44421)
                0.013935417 = queryNorm
              0.3692527 = fieldWeight in 285, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.684188 = idf(docFreq=150, maxDocs=44421)
                0.0390625 = fieldNorm(doc=285)
          0.14783461 = weight(abstract_txt:inhaltserschließung in 285) [ClassicSimilarity], result of:
            0.14783461 = score(doc=285,freq=3.0), product of:
              0.30479294 = queryWeight, product of:
                3.0509448 = boost
                7.168868 = idf(docFreq=92, maxDocs=44421)
                0.013935417 = queryNorm
              0.48503295 = fieldWeight in 285, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.168868 = idf(docFreq=92, maxDocs=44421)
                0.0390625 = fieldNorm(doc=285)
        0.24 = coord(6/25)
    
  5. Haffner, A.: Internationalisierung der GND durch das Semantic Web (2012) 0.07
    0.074810445 = sum of:
      0.074810445 = product of:
        0.3117102 = sum of:
          0.027500669 = weight(abstract_txt:qualität in 1318) [ClassicSimilarity], result of:
            0.027500669 = score(doc=1318,freq=1.0), product of:
              0.113696404 = queryWeight, product of:
                1.3176203 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.013935417 = queryNorm
              0.24187809 = fieldWeight in 1318, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.0390625 = fieldNorm(doc=1318)
          0.03448994 = weight(abstract_txt:miteinander in 1318) [ClassicSimilarity], result of:
            0.03448994 = score(doc=1318,freq=1.0), product of:
              0.13222478 = queryWeight, product of:
                1.420932 = boost
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.013935417 = queryNorm
              0.26084325 = fieldWeight in 1318, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.0390625 = fieldNorm(doc=1318)
          0.021999564 = weight(abstract_txt:eine in 1318) [ClassicSimilarity], result of:
            0.021999564 = score(doc=1318,freq=5.0), product of:
              0.07219059 = queryWeight, product of:
                1.4848144 = boost
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.013935417 = queryNorm
              0.3047428 = fieldWeight in 1318, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.0390625 = fieldNorm(doc=1318)
          0.025173964 = weight(abstract_txt:wurde in 1318) [ClassicSimilarity], result of:
            0.025173964 = score(doc=1318,freq=1.0), product of:
              0.13505033 = queryWeight, product of:
                2.0308588 = boost
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.013935417 = queryNorm
              0.18640432 = fieldWeight in 1318, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.0390625 = fieldNorm(doc=1318)
          0.051888477 = weight(abstract_txt:verwendet in 1318) [ClassicSimilarity], result of:
            0.051888477 = score(doc=1318,freq=1.0), product of:
              0.19872946 = queryWeight, product of:
                2.1335049 = boost
                6.684188 = idf(docFreq=150, maxDocs=44421)
                0.013935417 = queryNorm
              0.2611011 = fieldWeight in 1318, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.684188 = idf(docFreq=150, maxDocs=44421)
                0.0390625 = fieldNorm(doc=1318)
          0.1506576 = weight(abstract_txt:normdaten in 1318) [ClassicSimilarity], result of:
            0.1506576 = score(doc=1318,freq=3.0), product of:
              0.28043696 = queryWeight, product of:
                2.534429 = boost
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.013935417 = queryNorm
              0.5372245 = fieldWeight in 1318, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.0390625 = fieldNorm(doc=1318)
        0.24 = coord(6/25)
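
The content-similarity scores combine several terms: each term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is scaled by the coord factor (matched terms divided by query terms). The sketch below recomputes the score of entry 3 (doc 1754) from the values in its explain tree. It assumes the standard ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the helper names are illustrative only.

```python
import math

MAX_DOCS = 44421          # maxDocs from the listing
QUERY_NORM = 0.013935417  # queryNorm from the listing
FIELD_NORM = 0.09375      # fieldNorm(doc=1754)


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency, assuming the ClassicSimilarity formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float) -> float:
    """Contribution of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM
    return query_weight * field_weight


# (freq, docFreq, boost) for qualität, normdaten, inhaltserschließung in doc 1754.
terms = [
    (3.0, 246, 1.3176203),
    (1.0, 42, 2.534429),
    (3.0, 92, 3.0509448),
]

total = sum(term_score(freq, doc_freq, boost) for freq, doc_freq, boost in terms)
coord = 3 / 25  # 3 of the 25 query terms matched
score = coord * total
print(f"sum={total:.4f} coord={coord:.2f} score={score:.4f}")
```

The coord factor explains why this record ranks below entries 1 and 2 despite having the largest term sum: only 3 of 25 query terms match, so the sum is scaled by 0.12 rather than 0.32 or 0.24.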