Document (#34676)

Author
Reiner, U.
Title
VZG-Projekt Colibri : Bewertung von automatisch DDC-klassifizierten Titeldatensätzen der Deutschen Nationalbibliothek (DNB)
Issue
August 2008 - February 2009.
Imprint
Göttingen : Verbundzentrale des Gemeinsamen Bibliotheksverbundes (VZG)
Year
2009
Pages
111 pp.
Series
VZG-Colibri-Bericht 1/2008
Abstract
Since 2003, the VZG project Colibri/DDC has been working on automatic methods for the Dewey Decimal Classification (DDC). The project aims at uniform DDC indexing of bibliographic title records and at supporting DDC experts and DDC laypersons, e.g. in the analysis and synthesis of DDC notations, in their quality control, and in DDC-based searching. This report concentrates on the first larger-scale automatic DDC classification and the first automatic and intellectual evaluation using the classification component vc_dcl1. It is based on the 25,653 title records (12 weekly/monthly deliveries) of series A, B and H of the Deutsche Nationalbibliografie, which the Deutsche Nationalbibliothek (DNB) made available in November 2007. After the automatic DDC classification and automatic evaluation are explained in Chapter 2, Chapter 3 addresses the DNB report "Colibri_Auswertung_DDC_Endbericht_Sommer_2008". It clarifies facts and raises questions whose answers will set the course for the further classification tests. Chapter 4 offers considerations that go beyond Chapter 3 on how to continue the automatic DDC classification. The report is intended to deepen understanding of the automatic methods.
Content
See: http://taipan.dyndns.org/~ul/colibri05.pdf.
Theme
Automatic classification
Object
Colibri
DDC

Similar documents (author)

  1. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 5.71
    5.7103243 = sum of:
      5.7103243 = weight(author_txt:reiner in 1611) [ClassicSimilarity], result of:
        5.7103243 = fieldWeight in 1611, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.1365185 = idf(docFreq=12, maxDocs=44421)
          0.625 = fieldNorm(doc=1611)
    
  2. Reiner, U.: Anfragesprachen für Informationssysteme (1991) 5.71
    5.7103243 = sum of:
      5.7103243 = weight(author_txt:reiner in 5553) [ClassicSimilarity], result of:
        5.7103243 = fieldWeight in 5553, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.1365185 = idf(docFreq=12, maxDocs=44421)
          0.625 = fieldNorm(doc=5553)
    
  3. Reiner, U.: DDC-basierte Suche in heterogenen digitalen Bibliotheks- und Wissensbeständen (2005) 5.71
    5.7103243 = sum of:
      5.7103243 = weight(author_txt:reiner in 5854) [ClassicSimilarity], result of:
        5.7103243 = fieldWeight in 5854, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.1365185 = idf(docFreq=12, maxDocs=44421)
          0.625 = fieldNorm(doc=5854)
    
  4. Reiner, U.: Automatic analysis of DDC notations (2007) 5.71
    5.7103243 = sum of:
      5.7103243 = weight(author_txt:reiner in 1118) [ClassicSimilarity], result of:
        5.7103243 = fieldWeight in 1118, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.1365185 = idf(docFreq=12, maxDocs=44421)
          0.625 = fieldNorm(doc=1118)
    
  5. Reiner, U.: DDC-based search in the data of the German National Bibliography (2008) 5.71
    5.7103243 = sum of:
      5.7103243 = weight(author_txt:reiner in 3166) [ClassicSimilarity], result of:
        5.7103243 = fieldWeight in 3166, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.1365185 = idf(docFreq=12, maxDocs=44421)
          0.625 = fieldNorm(doc=3166)
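
The author scores above all follow the same pattern: fieldWeight = tf x idf x fieldNorm. The following minimal Python sketch recomputes one of them. It assumes the classic Lucene TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are consistent with the figures shown here; the function names are illustrative and not part of any Lucene API. The fieldNorm value is taken from the listing as given, since it is a stored, lossily encoded length norm.

```python
import math

def classic_tf(freq: float) -> float:
    # Assumed term-frequency factor: square root of the raw frequency.
    return math.sqrt(freq)

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed inverse document frequency: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explain trees above.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values copied from the listing for author_txt:reiner.
score = field_weight(freq=1.0, doc_freq=12, max_docs=44421, field_norm=0.625)
print(round(score, 2))
```

Because all five hits share freq=1.0, docFreq=12 and fieldNorm=0.625, they receive the same score, which is why each is listed with 5.71.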
    

Similar documents (content)

  1. Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.48
    0.4801581 = sum of:
      0.4801581 = product of:
        1.2003952 = sum of:
          0.066556 = weight(abstract_txt:titeldatensätze in 271) [ClassicSimilarity], result of:
            0.066556 = score(doc=271,freq=1.0), product of:
              0.15106703 = queryWeight, product of:
                1.0915701 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.014724543 = queryNorm
              0.44057262 = fieldWeight in 271, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.046875 = fieldNorm(doc=271)
          0.023209507 = weight(abstract_txt:werden in 271) [ClassicSimilarity], result of:
            0.023209507 = score(doc=271,freq=5.0), product of:
              0.063125655 = queryWeight, product of:
                1.2221664 = boost
                3.507791 = idf(docFreq=3617, maxDocs=44421)
                0.014724543 = queryNorm
              0.36767155 = fieldWeight in 271, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.507791 = idf(docFreq=3617, maxDocs=44421)
                0.046875 = fieldNorm(doc=271)
          0.030616445 = weight(abstract_txt:projekt in 271) [ClassicSimilarity], result of:
            0.030616445 = score(doc=271,freq=1.0), product of:
              0.113420464 = queryWeight, product of:
                1.3376038 = boost
                5.7586684 = idf(docFreq=380, maxDocs=44421)
                0.014724543 = queryNorm
              0.26993757 = fieldWeight in 271, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7586684 = idf(docFreq=380, maxDocs=44421)
                0.046875 = fieldNorm(doc=271)
          0.06140092 = weight(abstract_txt:verfahren in 271) [ClassicSimilarity], result of:
            0.06140092 = score(doc=271,freq=4.0), product of:
              0.11362786 = queryWeight, product of:
                1.3388262 = boost
                5.7639313 = idf(docFreq=378, maxDocs=44421)
                0.014724543 = queryNorm
              0.54036856 = fieldWeight in 271, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.7639313 = idf(docFreq=378, maxDocs=44421)
                0.046875 = fieldNorm(doc=271)
          0.032179225 = weight(abstract_txt:dewey in 271) [ClassicSimilarity], result of:
            0.032179225 = score(doc=271,freq=1.0), product of:
              0.117247954 = queryWeight, product of:
                1.359986 = boost
                5.8550286 = idf(docFreq=345, maxDocs=44421)
                0.014724543 = queryNorm
              0.27445447 = fieldWeight in 271, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8550286 = idf(docFreq=345, maxDocs=44421)
                0.046875 = fieldNorm(doc=271)
          0.032557357 = weight(abstract_txt:deutschen in 271) [ClassicSimilarity], result of:
            0.032557357 = score(doc=271,freq=1.0), product of:
              0.13526478 = queryWeight, product of:
                1.7890389 = boost
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.014724543 = queryNorm
              0.24069352 = fieldWeight in 271, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.046875 = fieldNorm(doc=271)
          0.20197758 = weight(abstract_txt:titeldatensätzen in 271) [ClassicSimilarity], result of:
            0.20197758 = score(doc=271,freq=2.0), product of:
              0.31665063 = queryWeight, product of:
                2.2349713 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.014724543 = queryNorm
              0.63785625 = fieldWeight in 271, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.046875 = fieldNorm(doc=271)
          0.1560167 = weight(abstract_txt:colibri in 271) [ClassicSimilarity], result of:
            0.1560167 = score(doc=271,freq=1.0), product of:
              0.33586827 = queryWeight, product of:
                2.301793 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.014724543 = queryNorm
              0.46451756 = fieldWeight in 271, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.046875 = fieldNorm(doc=271)
          0.3038676 = weight(abstract_txt:klassifizierung in 271) [ClassicSimilarity], result of:
            0.3038676 = score(doc=271,freq=5.0), product of:
              0.35065895 = queryWeight, product of:
                2.8805132 = boost
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.014724543 = queryNorm
              0.86656165 = fieldWeight in 271, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.046875 = fieldNorm(doc=271)
          0.29201385 = weight(abstract_txt:automatischen in 271) [ClassicSimilarity], result of:
            0.29201385 = score(doc=271,freq=5.0), product of:
              0.40486836 = queryWeight, product of:
                3.9958458 = boost
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.014724543 = queryNorm
              0.72125626 = fieldWeight in 271, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.046875 = fieldNorm(doc=271)
        0.4 = coord(10/25)
    
  2. Jersek, T.: Automatische DDC-Klassifizierung mit Lingo : Vorgehensweise und Ergebnisse (2012) 0.24
    0.23540023 = sum of:
      0.23540023 = product of:
        1.1770011 = sum of:
          0.15529734 = weight(abstract_txt:titeldatensätze in 1122) [ClassicSimilarity], result of:
            0.15529734 = score(doc=1122,freq=1.0), product of:
              0.15106703 = queryWeight, product of:
                1.0915701 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.014724543 = queryNorm
              1.0280029 = fieldWeight in 1122, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.109375 = fieldNorm(doc=1122)
          0.07596716 = weight(abstract_txt:deutschen in 1122) [ClassicSimilarity], result of:
            0.07596716 = score(doc=1122,freq=1.0), product of:
              0.13526478 = queryWeight, product of:
                1.7890389 = boost
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.014724543 = queryNorm
              0.5616182 = fieldWeight in 1122, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.109375 = fieldNorm(doc=1122)
          0.33324602 = weight(abstract_txt:titeldatensätzen in 1122) [ClassicSimilarity], result of:
            0.33324602 = score(doc=1122,freq=1.0), product of:
              0.31665063 = queryWeight, product of:
                2.2349713 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.014724543 = queryNorm
              1.0524092 = fieldWeight in 1122, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.109375 = fieldNorm(doc=1122)
          0.1815572 = weight(abstract_txt:bewertung in 1122) [ClassicSimilarity], result of:
            0.1815572 = score(doc=1122,freq=1.0), product of:
              0.24179265 = queryWeight, product of:
                2.3919327 = boost
                6.8651857 = idf(docFreq=125, maxDocs=44421)
                0.014724543 = queryNorm
              0.7508797 = fieldWeight in 1122, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8651857 = idf(docFreq=125, maxDocs=44421)
                0.109375 = fieldNorm(doc=1122)
          0.43093342 = weight(abstract_txt:automatischen in 1122) [ClassicSimilarity], result of:
            0.43093342 = score(doc=1122,freq=2.0), product of:
              0.40486836 = queryWeight, product of:
                3.9958458 = boost
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.014724543 = queryNorm
              1.0643791 = fieldWeight in 1122, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.109375 = fieldNorm(doc=1122)
        0.2 = coord(5/25)
    
  3. Balakrishnan, U.; Krausz, A.; Voss, J.: Cocoda - ein Konkordanztool für bibliothekarische Klassifikationssysteme (2015) 0.16
    0.15834622 = sum of:
      0.15834622 = product of:
        0.7917311 = sum of:
          0.023970675 = weight(abstract_txt:werden in 3030) [ClassicSimilarity], result of:
            0.023970675 = score(doc=3030,freq=3.0), product of:
              0.063125655 = queryWeight, product of:
                1.2221664 = boost
                3.507791 = idf(docFreq=3617, maxDocs=44421)
                0.014724543 = queryNorm
              0.3797295 = fieldWeight in 3030, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.507791 = idf(docFreq=3617, maxDocs=44421)
                0.0625 = fieldNorm(doc=3030)
          0.042905632 = weight(abstract_txt:dewey in 3030) [ClassicSimilarity], result of:
            0.042905632 = score(doc=3030,freq=1.0), product of:
              0.117247954 = queryWeight, product of:
                1.359986 = boost
                5.8550286 = idf(docFreq=345, maxDocs=44421)
                0.014724543 = queryNorm
              0.3659393 = fieldWeight in 3030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8550286 = idf(docFreq=345, maxDocs=44421)
                0.0625 = fieldNorm(doc=3030)
          0.19042629 = weight(abstract_txt:titeldatensätzen in 3030) [ClassicSimilarity], result of:
            0.19042629 = score(doc=3030,freq=1.0), product of:
              0.31665063 = queryWeight, product of:
                2.2349713 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.014724543 = queryNorm
              0.60137665 = fieldWeight in 3030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.0625 = fieldNorm(doc=3030)
          0.36030516 = weight(abstract_txt:colibri in 3030) [ClassicSimilarity], result of:
            0.36030516 = score(doc=3030,freq=3.0), product of:
              0.33586827 = queryWeight, product of:
                2.301793 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.014724543 = queryNorm
              1.0727574 = fieldWeight in 3030, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.0625 = fieldNorm(doc=3030)
          0.1741234 = weight(abstract_txt:automatischen in 3030) [ClassicSimilarity], result of:
            0.1741234 = score(doc=3030,freq=1.0), product of:
              0.40486836 = queryWeight, product of:
                3.9958458 = boost
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.014724543 = queryNorm
              0.43007413 = fieldWeight in 3030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.0625 = fieldNorm(doc=3030)
        0.2 = coord(5/25)
    
  4. Reiner, U.: DDC-basierte Suche in heterogenen digitalen Bibliotheks- und Wissensbeständen (2005) 0.15
    0.15041676 = sum of:
      0.15041676 = product of:
        0.7520838 = sum of:
          0.08874133 = weight(abstract_txt:titeldatensätze in 5854) [ClassicSimilarity], result of:
            0.08874133 = score(doc=5854,freq=1.0), product of:
              0.15106703 = queryWeight, product of:
                1.0915701 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.014724543 = queryNorm
              0.5874302 = fieldWeight in 5854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.0625 = fieldNorm(doc=5854)
          0.013839476 = weight(abstract_txt:werden in 5854) [ClassicSimilarity], result of:
            0.013839476 = score(doc=5854,freq=1.0), product of:
              0.063125655 = queryWeight, product of:
                1.2221664 = boost
                3.507791 = idf(docFreq=3617, maxDocs=44421)
                0.014724543 = queryNorm
              0.21923694 = fieldWeight in 5854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.507791 = idf(docFreq=3617, maxDocs=44421)
                0.0625 = fieldNorm(doc=5854)
          0.29418793 = weight(abstract_txt:colibri in 5854) [ClassicSimilarity], result of:
            0.29418793 = score(doc=5854,freq=2.0), product of:
              0.33586827 = queryWeight, product of:
                2.301793 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.014724543 = queryNorm
              0.8759027 = fieldWeight in 5854, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.0625 = fieldNorm(doc=5854)
          0.18119164 = weight(abstract_txt:klassifizierung in 5854) [ClassicSimilarity], result of:
            0.18119164 = score(doc=5854,freq=1.0), product of:
              0.35065895 = queryWeight, product of:
                2.8805132 = boost
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.014724543 = queryNorm
              0.51671755 = fieldWeight in 5854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.0625 = fieldNorm(doc=5854)
          0.1741234 = weight(abstract_txt:automatischen in 5854) [ClassicSimilarity], result of:
            0.1741234 = score(doc=5854,freq=1.0), product of:
              0.40486836 = queryWeight, product of:
                3.9958458 = boost
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.014724543 = queryNorm
              0.43007413 = fieldWeight in 5854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.0625 = fieldNorm(doc=5854)
        0.2 = coord(5/25)
    
  5. Helmbrecht-Schaar, A.: Entwicklung eines Verfahrens der automatischen Klassifizierung für Textdokumente aus dem Fachbereich Informatik mithilfe eines fachspezifischen Klassifikationssystems (2007) 0.13
    0.13120008 = sum of:
      0.13120008 = product of:
        0.6560004 = sum of:
          0.03459869 = weight(abstract_txt:werden in 2410) [ClassicSimilarity], result of:
            0.03459869 = score(doc=2410,freq=4.0), product of:
              0.063125655 = queryWeight, product of:
                1.2221664 = boost
                3.507791 = idf(docFreq=3617, maxDocs=44421)
                0.014724543 = queryNorm
              0.54809237 = fieldWeight in 2410, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.507791 = idf(docFreq=3617, maxDocs=44421)
                0.078125 = fieldNorm(doc=2410)
          0.08862459 = weight(abstract_txt:verfahren in 2410) [ClassicSimilarity], result of:
            0.08862459 = score(doc=2410,freq=3.0), product of:
              0.11362786 = queryWeight, product of:
                1.3388262 = boost
                5.7639313 = idf(docFreq=378, maxDocs=44421)
                0.014724543 = queryNorm
              0.7799548 = fieldWeight in 2410, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7639313 = idf(docFreq=378, maxDocs=44421)
                0.078125 = fieldNorm(doc=2410)
          0.088633284 = weight(abstract_txt:automatische in 2410) [ClassicSimilarity], result of:
            0.088633284 = score(doc=2410,freq=1.0), product of:
              0.16389044 = queryWeight, product of:
                1.6078984 = boost
                6.922344 = idf(docFreq=118, maxDocs=44421)
                0.014724543 = queryNorm
              0.54080814 = fieldWeight in 2410, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.922344 = idf(docFreq=118, maxDocs=44421)
                0.078125 = fieldNorm(doc=2410)
          0.22648953 = weight(abstract_txt:klassifizierung in 2410) [ClassicSimilarity], result of:
            0.22648953 = score(doc=2410,freq=1.0), product of:
              0.35065895 = queryWeight, product of:
                2.8805132 = boost
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.014724543 = queryNorm
              0.6458969 = fieldWeight in 2410, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.078125 = fieldNorm(doc=2410)
          0.21765426 = weight(abstract_txt:automatischen in 2410) [ClassicSimilarity], result of:
            0.21765426 = score(doc=2410,freq=1.0), product of:
              0.40486836 = queryWeight, product of:
                3.9958458 = boost
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.014724543 = queryNorm
              0.53759265 = fieldWeight in 2410, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.078125 = fieldNorm(doc=2410)
        0.2 = coord(5/25)
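
The content scores above combine several matching terms: each term contributes queryWeight x fieldWeight, and the sum is multiplied by a coordination factor (matched terms / query terms). The sketch below recomputes the score of the second hit (doc=1122) from the boosts, frequencies and document frequencies shown in its explain tree. It is a hedged reconstruction that assumes the same tf and idf formulas as classic Lucene TF-IDF; the helper names are hypothetical, and queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.014724543   # copied from the listing
FIELD_NORM = 0.109375      # fieldNorm(doc=1122), copied from the listing
QUERY_TERMS = 25           # denominator of coord(5/25)

def idf(doc_freq: int) -> float:
    # Assumed idf: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))

def term_score(boost: float, freq: float, doc_freq: int) -> float:
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    # fieldWeight = sqrt(freq) * idf * fieldNorm
    field_weight = math.sqrt(freq) * idf(doc_freq) * FIELD_NORM
    return query_weight * field_weight

# (term, boost, freq, docFreq) for the five terms matched in doc 1122.
matched = [
    ("titeldatensätze", 1.0915701, 1.0, 9),
    ("deutschen", 1.7890389, 1.0, 710),
    ("titeldatensätzen", 2.2349713, 1.0, 7),
    ("bewertung", 2.3919327, 1.0, 125),
    ("automatischen", 3.9958458, 2.0, 123),
]

coord = len(matched) / QUERY_TERMS
total = coord * sum(term_score(b, f, df) for _, b, f, df in matched)
print(round(total, 2))
```

The result should come out close to the 0.23540023 reported for this hit; small deviations in the last digits are expected because the listing was computed in single precision.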