Document (#38276)

Author
Wiesenmüller, H.
Pfeffer, M.
Title
Abgleichen, anreichern, verknüpfen : das Clustering-Verfahren - eine neue Möglichkeit für die Analyse und Verbesserung von Katalogdaten
Source
BuB. 65(2013) H.9, S. 625-629
Year
2013
Series
Lesesaal: Praxis
Abstract
A comparatively simple procedure forms the basis: by comparing just a few fields, those bibliographic records in a data pool (which may itself consist of several catalogs) that belong to the same work can be brought together with high reliability. Such a work cluster then comprises different editions and printings of a work as well as translations. A cluster contains all records that match in the uniform title, or in the title proper and other title information, and that have at least one linked person or corporate body in common.
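The matching rule described in the abstract can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the field names (`uniform_title`, `title`, `subtitle`, `agents`), the normalization (case folding and whitespace collapsing), the transitive merging via union-find, and the sample records are all assumptions that the abstract does not specify.

```python
from collections import defaultdict


def title_key(record):
    """Return the normalized title used for matching (assumed normalization)."""
    if record.get("uniform_title"):
        raw = record["uniform_title"]
    else:
        raw = f"{record.get('title', '')} {record.get('subtitle', '')}"
    # Case-fold and collapse whitespace so trivial variants compare equal.
    return " ".join(raw.casefold().split())


def cluster_works(records):
    """Group record ids into work clusters.

    Two records are merged when they have the same title key and share
    at least one linked person or corporate body (here: an agent id).
    Merging is transitive, which is an assumption of this sketch.
    """
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    first_seen = {}  # (title key, agent id) -> index of first record seen
    for idx, rec in enumerate(records):
        key = title_key(rec)
        for agent in rec.get("agents", ()):
            pair = (key, agent)
            if pair in first_seen:
                union(first_seen[pair], idx)
            else:
                first_seen[pair] = idx

    clusters = defaultdict(list)
    for idx, rec in enumerate(records):
        clusters[find(idx)].append(rec["id"])
    return sorted(clusters.values())


# Hypothetical sample records, invented purely for illustration.
records = [
    {"id": "a", "title": "Example Work", "subtitle": "", "agents": {"p1"}},
    {"id": "b", "uniform_title": "Example work", "title": "Beispielwerk",
     "agents": {"p1", "p2"}},  # a translation carrying the uniform title
    {"id": "c", "title": "Example Work", "subtitle": "", "agents": {"p9"}},
    {"id": "d", "title": "Another Work", "subtitle": "", "agents": {"p1"}},
]
print(cluster_works(records))
```

In this sketch, records `a` and `b` end up in one cluster because they agree on the title key and share the agent `p1`; `c` has the same title but no agent in common, and `d` shares an agent but not the title, so both stay separate.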
Footnote
In addition to the customary lecture sessions in large halls, the Leipziger Bibliothekskongress (Leipzig Library Congress) in March 2013 offered a new event format: various workshops provided the opportunity to examine topics in depth and to discuss them in small groups. One of these workshops was organized by the authors of the present article and was devoted to novel possibilities for analyzing and improving catalog data. Markus Geipel of the Deutsche Nationalbibliothek (DNB) joined virtually as the third speaker via Google Hangout. The event was initiated by the AG Bibliotheken of the Deutsche Gesellschaft für Klassifikation, which thereby followed up on its 2012 conference in Hildesheim. The most important results are summarized below.
Theme
Catalog enrichment (Kataloganreicherung)

Similar documents (author)

  1. Pfeffer, M.; Wiesenmüller, H.: Resource Discovery Systeme (2016) 6.02
    6.022142 = sum of:
      6.022142 = sum of:
        2.3616695 = weight(author_txt:wiesenmüller in 6843) [ClassicSimilarity], result of:
          2.3616695 = score(doc=6843,freq=1.0), product of:
            0.5982845 = queryWeight, product of:
              7.894805 = idf(docFreq=44, maxDocs=44421)
              0.075782046 = queryNorm
            3.9474025 = fieldWeight in 6843, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.894805 = idf(docFreq=44, maxDocs=44421)
              0.5 = fieldNorm(doc=6843)
        3.6604724 = weight(author_txt:pfeffer in 6843) [ClassicSimilarity], result of:
          3.6604724 = score(doc=6843,freq=1.0), product of:
            0.80128384 = queryWeight, product of:
              1.1572824 = boost
              9.1365185 = idf(docFreq=12, maxDocs=44421)
              0.075782046 = queryNorm
            4.5682592 = fieldWeight in 6843, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.1365185 = idf(docFreq=12, maxDocs=44421)
              0.5 = fieldNorm(doc=6843)
    
  2. Wiesenmüller, H.; Maylein, L.; Pfeffer, M.: Mehr aus der Schlagwortnormdatei herausholen : Implementierung einer geographischen Facette in den Online-Katalogen der UB Heidelberg und der UB Mannheim (2011) 4.52
    4.5166063 = sum of:
      4.5166063 = sum of:
        1.7712522 = weight(author_txt:wiesenmüller in 3563) [ClassicSimilarity], result of:
          1.7712522 = score(doc=3563,freq=1.0), product of:
            0.5982845 = queryWeight, product of:
              7.894805 = idf(docFreq=44, maxDocs=44421)
              0.075782046 = queryNorm
            2.9605517 = fieldWeight in 3563, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.894805 = idf(docFreq=44, maxDocs=44421)
              0.375 = fieldNorm(doc=3563)
        2.7453542 = weight(author_txt:pfeffer in 3563) [ClassicSimilarity], result of:
          2.7453542 = score(doc=3563,freq=1.0), product of:
            0.80128384 = queryWeight, product of:
              1.1572824 = boost
              9.1365185 = idf(docFreq=12, maxDocs=44421)
              0.075782046 = queryNorm
            3.4261944 = fieldWeight in 3563, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.1365185 = idf(docFreq=12, maxDocs=44421)
              0.375 = fieldNorm(doc=3563)
    
  3. Pfeffer, J.: Online-Tutorials an deutschen Universitäts- und Hochschulbibliotheken : Verbreitung, Typologie und Analyse am Beispiel von LOTSE, DISCUS und BibTutor (2005) 2.29
    2.2877953 = sum of:
      2.2877953 = product of:
        4.5755906 = sum of:
          4.5755906 = weight(author_txt:pfeffer in 5837) [ClassicSimilarity], result of:
            4.5755906 = score(doc=5837,freq=1.0), product of:
              0.80128384 = queryWeight, product of:
                1.1572824 = boost
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.075782046 = queryNorm
              5.7103243 = fieldWeight in 5837, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.625 = fieldNorm(doc=5837)
        0.5 = coord(1/2)
    
  4. Pfeffer, M.: Automatische Vergabe von RVK-Notationen anhand von bibliografischen Daten mittels fallbasiertem Schließen (2007) 2.29
    2.2877953 = sum of:
      2.2877953 = product of:
        4.5755906 = sum of:
          4.5755906 = weight(author_txt:pfeffer in 1558) [ClassicSimilarity], result of:
            4.5755906 = score(doc=1558,freq=1.0), product of:
              0.80128384 = queryWeight, product of:
                1.1572824 = boost
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.075782046 = queryNorm
              5.7103243 = fieldWeight in 1558, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.625 = fieldNorm(doc=1558)
        0.5 = coord(1/2)
    
  5. Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 2.29
    2.2877953 = sum of:
      2.2877953 = product of:
        4.5755906 = sum of:
          4.5755906 = weight(author_txt:pfeffer in 38) [ClassicSimilarity], result of:
            4.5755906 = score(doc=38,freq=1.0), product of:
              0.80128384 = queryWeight, product of:
                1.1572824 = boost
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.075782046 = queryNorm
              5.7103243 = fieldWeight in 38, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.625 = fieldNorm(doc=38)
        0.5 = coord(1/2)
    

Similar documents (content)

  1. Schaffner, V.: FRBR in MAB2 und Primo - ein kafkaesker Prozess? : Möglichkeiten der FRBRisierung von MAB2-Datensätzen in Primo exemplarisch dargestellt an Datensätzen zu Franz Kafkas "Der Process" (2011) 0.05
    0.05274657 = sum of:
      0.05274657 = product of:
        0.43955478 = sum of:
          0.07750298 = weight(abstract_txt:übersetzungen in 1907) [ClassicSimilarity], result of:
            0.07750298 = score(doc=1907,freq=1.0), product of:
              0.17141828 = queryWeight, product of:
                1.0472052 = boost
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.019799404 = queryNorm
              0.45212787 = fieldWeight in 1907, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1907)
          0.092102006 = weight(abstract_txt:werkes in 1907) [ClassicSimilarity], result of:
            0.092102006 = score(doc=1907,freq=1.0), product of:
              0.1923199 = queryWeight, product of:
                1.1092141 = boost
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.019799404 = queryNorm
              0.47890002 = fieldWeight in 1907, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1907)
          0.2699498 = weight(abstract_txt:datensätze in 1907) [ClassicSimilarity], result of:
            0.2699498 = score(doc=1907,freq=4.0), product of:
              0.31262487 = queryWeight, product of:
                2.0 = boost
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.019799404 = queryNorm
              0.8634943 = fieldWeight in 1907, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1907)
        0.12 = coord(3/25)
    
  2. Unser, M.; Wäckerlin, D.: Dienstleistung "Abstract/Index" im NEBIS-Katalog (2006) 0.05
    0.049024146 = sum of:
      0.049024146 = product of:
        0.30640092 = sum of:
          0.019162228 = weight(abstract_txt:einem in 30) [ClassicSimilarity], result of:
            0.019162228 = score(doc=30,freq=1.0), product of:
              0.09428662 = queryWeight, product of:
                1.0983564 = boost
                4.3356547 = idf(docFreq=1580, maxDocs=44421)
                0.019799404 = queryNorm
              0.20323381 = fieldWeight in 30, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3356547 = idf(docFreq=1580, maxDocs=44421)
                0.046875 = fieldNorm(doc=30)
          0.09466757 = weight(abstract_txt:datenpool in 30) [ClassicSimilarity], result of:
            0.09466757 = score(doc=30,freq=1.0), product of:
              0.2170752 = queryWeight, product of:
                1.1784424 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.019799404 = queryNorm
              0.43610495 = fieldWeight in 30, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.046875 = fieldNorm(doc=30)
          0.07687836 = weight(abstract_txt:gehören in 30) [ClassicSimilarity], result of:
            0.07687836 = score(doc=30,freq=1.0), product of:
              0.2380613 = queryWeight, product of:
                1.74527 = boost
                6.889283 = idf(docFreq=122, maxDocs=44421)
                0.019799404 = queryNorm
              0.32293516 = fieldWeight in 30, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.889283 = idf(docFreq=122, maxDocs=44421)
                0.046875 = fieldNorm(doc=30)
          0.115692765 = weight(abstract_txt:datensätze in 30) [ClassicSimilarity], result of:
            0.115692765 = score(doc=30,freq=1.0), product of:
              0.31262487 = queryWeight, product of:
                2.0 = boost
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.019799404 = queryNorm
              0.37006897 = fieldWeight in 30, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.046875 = fieldNorm(doc=30)
        0.16 = coord(4/25)
    
  3. Sen, W.: Social Media Measurement : Methoden zur automatischen Reichweitenmessung von Beiträgen in Webforen (2013) 0.05
    0.048342083 = sum of:
      0.048342083 = product of:
        0.30213803 = sum of:
          0.077128515 = weight(abstract_txt:solches in 1993) [ClassicSimilarity], result of:
            0.077128515 = score(doc=1993,freq=1.0), product of:
              0.15631244 = queryWeight, product of:
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.019799404 = queryNorm
              0.4934253 = fieldWeight in 1993, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.0625 = fieldNorm(doc=1993)
          0.085638896 = weight(abstract_txt:diejenigen in 1993) [ClassicSimilarity], result of:
            0.085638896 = score(doc=1993,freq=1.0), product of:
              0.1676091 = queryWeight, product of:
                1.0355046 = boost
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.019799404 = queryNorm
              0.5109442 = fieldWeight in 1993, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.0625 = fieldNorm(doc=1993)
          0.03613265 = weight(abstract_txt:einem in 1993) [ClassicSimilarity], result of:
            0.03613265 = score(doc=1993,freq=2.0), product of:
              0.09428662 = queryWeight, product of:
                1.0983564 = boost
                4.3356547 = idf(docFreq=1580, maxDocs=44421)
                0.019799404 = queryNorm
              0.38322136 = fieldWeight in 1993, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3356547 = idf(docFreq=1580, maxDocs=44421)
                0.0625 = fieldNorm(doc=1993)
          0.10323797 = weight(abstract_txt:werk in 1993) [ClassicSimilarity], result of:
            0.10323797 = score(doc=1993,freq=1.0), product of:
              0.2391956 = queryWeight, product of:
                1.749423 = boost
                6.905677 = idf(docFreq=120, maxDocs=44421)
                0.019799404 = queryNorm
              0.4316048 = fieldWeight in 1993, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.905677 = idf(docFreq=120, maxDocs=44421)
                0.0625 = fieldNorm(doc=1993)
        0.16 = coord(4/25)
    
  4. ¬Der Digitale Peters : Arno Peters' synchronoptische Weltgeschichte (2010) 0.05
    0.046607435 = sum of:
      0.046607435 = product of:
        0.3883953 = sum of:
          0.038324457 = weight(abstract_txt:einem in 783) [ClassicSimilarity], result of:
            0.038324457 = score(doc=783,freq=1.0), product of:
              0.09428662 = queryWeight, product of:
                1.0983564 = boost
                4.3356547 = idf(docFreq=1580, maxDocs=44421)
                0.019799404 = queryNorm
              0.40646762 = fieldWeight in 783, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3356547 = idf(docFreq=1580, maxDocs=44421)
                0.09375 = fieldNorm(doc=783)
          0.19521388 = weight(abstract_txt:auflagen in 783) [ClassicSimilarity], result of:
            0.19521388 = score(doc=783,freq=1.0), product of:
              0.22154564 = queryWeight, product of:
                1.1905149 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.019799404 = queryNorm
              0.88114524 = fieldWeight in 783, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.09375 = fieldNorm(doc=783)
          0.15485695 = weight(abstract_txt:werk in 783) [ClassicSimilarity], result of:
            0.15485695 = score(doc=783,freq=1.0), product of:
              0.2391956 = queryWeight, product of:
                1.749423 = boost
                6.905677 = idf(docFreq=120, maxDocs=44421)
                0.019799404 = queryNorm
              0.6474072 = fieldWeight in 783, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.905677 = idf(docFreq=120, maxDocs=44421)
                0.09375 = fieldNorm(doc=783)
        0.12 = coord(3/25)
    
  5. Ansorge, K.; Vierschilling, N.: http://dnb.ddb.de : Von dicken Wälzern zur Online-Verzeichnung (2003) 0.05
    0.04618544 = sum of:
      0.04618544 = product of:
        0.288659 = sum of:
          0.06817262 = weight(abstract_txt:bibliografischen in 2952) [ClassicSimilarity], result of:
            0.06817262 = score(doc=2952,freq=2.0), product of:
              0.15631244 = queryWeight, product of:
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.019799404 = queryNorm
              0.43613046 = fieldWeight in 2952, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2952)
          0.06817262 = weight(abstract_txt:ausgaben in 2952) [ClassicSimilarity], result of:
            0.06817262 = score(doc=2952,freq=2.0), product of:
              0.15631244 = queryWeight, product of:
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.019799404 = queryNorm
              0.43613046 = fieldWeight in 2952, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2952)
          0.015968526 = weight(abstract_txt:einem in 2952) [ClassicSimilarity], result of:
            0.015968526 = score(doc=2952,freq=1.0), product of:
              0.09428662 = queryWeight, product of:
                1.0983564 = boost
                4.3356547 = idf(docFreq=1580, maxDocs=44421)
                0.019799404 = queryNorm
              0.16936152 = fieldWeight in 2952, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3356547 = idf(docFreq=1580, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2952)
          0.13634524 = weight(abstract_txt:datensätze in 2952) [ClassicSimilarity], result of:
            0.13634524 = score(doc=2952,freq=2.0), product of:
              0.31262487 = queryWeight, product of:
                2.0 = boost
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.019799404 = queryNorm
              0.43613046 = fieldWeight in 2952, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2952)
        0.16 = coord(4/25)
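
The score listings above follow the TF-IDF arithmetic of Lucene's ClassicSimilarity: each term score is queryWeight (boost × idf × queryNorm) multiplied by fieldWeight (tf × idf × fieldNorm), with tf = sqrt(termFreq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The following Python sketch recomputes the total for the first author hit (doc 6843) from the values shown in the listing. The function names are hypothetical helpers chosen for illustration, and the boost, queryNorm and fieldNorm values are simply taken from the listing as given inputs rather than derived.

```python
import math


def idf(doc_freq, max_docs):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    """Score of one matching term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


# Values copied from the listing for doc 6843 (author_txt field).
MAX_DOCS = 44421
QUERY_NORM = 0.075782046

wiesenmueller = term_score(1.0, 44, MAX_DOCS, QUERY_NORM, 0.5)
pfeffer = term_score(1.0, 12, MAX_DOCS, QUERY_NORM, 0.5, boost=1.1572824)

print(round(wiesenmueller + pfeffer, 3))  # prints 6.022
```

Where a listing ends with a line such as coord(3/25), the summed term scores are additionally multiplied by that fraction, that is, by the share of query terms that matched the document.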