Document (#39471)

Author
Kasprzik, A.
Title
Automatisierte und semiautomatisierte Klassifizierung : eine Analyse aktueller Projekte
Source
Perspektive Bibliothek. 3(2014) no.1, pp.85-110
Year
2014
Abstract
The rapid growth in the volume of digitally available documents, combined with shortages of time and staff at academic libraries, suggests the use of semi-automatic or fully automatic methods for verbal and classificatory subject indexing. After a brief general introduction to the common methodology, this article examines a number of automated classification projects from the period 2007-2012 and from the German-speaking region. Most of the projects presented use machine learning methods from artificial intelligence, mostly work with adapted versions of commercial software, and generally refer to the Dewey Decimal Classification (DDC). The data basis consists of metadata records, abstracts, tables of contents, and full texts in various data formats. The concluding analysis arranges the projects according to a range of criteria and summarizes the current situation and the greatest challenges for automated classification methods.
Content
Cf.: https://journals.ub.uni-heidelberg.de/index.php/bibliothek/article/view/14022.
Theme
Automatisches Indexieren
Automatisches Klassifizieren

Similar documents (author)

  1. Kasprzik, A.: Implementierung eines Hierarchisierungsalgorithmus' für die Konstanzer Systematik : Projektbericht (2013) 5.87
    5.874302 = sum of:
      5.874302 = weight(author_txt:kasprzik in 2277) [ClassicSimilarity], result of:
        5.874302 = fieldWeight in 2277, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.398883 = idf(docFreq=9, maxDocs=44421)
          0.625 = fieldNorm(doc=2277)
    
  2. Kasprzik, A.: Vorläufer der Internationalen Katalogisierungsprinzipien (2014) 5.87
    5.874302 = sum of:
      5.874302 = weight(author_txt:kasprzik in 2619) [ClassicSimilarity], result of:
        5.874302 = fieldWeight in 2619, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.398883 = idf(docFreq=9, maxDocs=44421)
          0.625 = fieldNorm(doc=2619)
    
  3. Kasprzik, A.: Voraussetzungen und Anwendungspotentiale einer präzisen Sacherschließung aus Sicht der Wissenschaft (2018) 5.87
    5.874302 = sum of:
      5.874302 = weight(author_txt:kasprzik in 195) [ClassicSimilarity], result of:
        5.874302 = fieldWeight in 195, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.398883 = idf(docFreq=9, maxDocs=44421)
          0.625 = fieldNorm(doc=195)
    
  4. Kasprzik, A.: Aufbau eines produktiven Dienstes für die automatisierte Inhaltserschließung an der ZBW : ein Status- und Erfahrungsbericht. (2023) 5.87
    5.874302 = sum of:
      5.874302 = weight(author_txt:kasprzik in 1936) [ClassicSimilarity], result of:
        5.874302 = fieldWeight in 1936, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.398883 = idf(docFreq=9, maxDocs=44421)
          0.625 = fieldNorm(doc=1936)
    
  5. Kasprzik, A.; Kett, J.: Vorschläge für eine Weiterentwicklung der Sacherschließung und Schritte zur fortgesetzten strukturellen Aufwertung der GND (2018) 4.70
    4.6994414 = sum of:
      4.6994414 = weight(author_txt:kasprzik in 599) [ClassicSimilarity], result of:
        4.6994414 = fieldWeight in 599, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.398883 = idf(docFreq=9, maxDocs=44421)
          0.5 = fieldNorm(doc=599)
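
The author scores above can be checked by hand. The following is a minimal sketch, assuming the classic TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are consistent with the figures in these explain trees. The function names are illustrative and are not part of any Lucene API.

```python
import math


def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw term frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as listed in each explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the explain output: docFreq=9, maxDocs=44421.
single_author = field_weight(1.0, 9, 44421, 0.625)  # entries 1 to 4
two_authors = field_weight(1.0, 9, 44421, 0.5)      # entry 5

print(round(single_author, 4), round(two_authors, 4))
```

The only factor that differs between entries 1 to 4 and entry 5 is fieldNorm (0.625 versus 0.5), which is why the record with two authors scores lower for the same author term.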
    

Similar documents (content)

  1. Oberhauser, O.: Automatisches Klassifizieren : Verfahren zur Erschließung elektronischer Dokumente (2004) 0.16
    0.16181241 = sum of:
      0.16181241 = product of:
        0.6742184 = sum of:
          0.085775524 = weight(abstract_txt:klassifikatorische in 3487) [ClassicSimilarity], result of:
            0.085775524 = score(doc=3487,freq=1.0), product of:
              0.18016475 = queryWeight, product of:
                1.079596 = boost
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.019169161 = queryNorm
              0.4760949 = fieldWeight in 3487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3487)
          0.050133944 = weight(abstract_txt:analyse in 3487) [ClassicSimilarity], result of:
            0.050133944 = score(doc=3487,freq=1.0), product of:
              0.15868121 = queryWeight, product of:
                1.4328612 = boost
                5.7772117 = idf(docFreq=373, maxDocs=44421)
                0.019169161 = queryNorm
              0.31594127 = fieldWeight in 3487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7772117 = idf(docFreq=373, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3487)
          0.045653913 = weight(abstract_txt:einer in 3487) [ClassicSimilarity], result of:
            0.045653913 = score(doc=3487,freq=4.0), product of:
              0.10750616 = queryWeight, product of:
                1.4444538 = boost
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.019169161 = queryNorm
              0.42466322 = fieldWeight in 3487, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3487)
          0.022083724 = weight(abstract_txt:eine in 3487) [ClassicSimilarity], result of:
            0.022083724 = score(doc=3487,freq=1.0), product of:
              0.11574329 = queryWeight, product of:
                1.7306302 = boost
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.019169161 = queryNorm
              0.19079918 = fieldWeight in 3487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3487)
          0.20778374 = weight(abstract_txt:klassifizierung in 3487) [ClassicSimilarity], result of:
            0.20778374 = score(doc=3487,freq=2.0), product of:
              0.32496405 = queryWeight, product of:
                2.0504966 = boost
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.019169161 = queryNorm
              0.6394053 = fieldWeight in 3487, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3487)
          0.2627876 = weight(abstract_txt:projekte in 3487) [ClassicSimilarity], result of:
            0.2627876 = score(doc=3487,freq=5.0), product of:
              0.32053933 = queryWeight, product of:
                2.4941792 = boost
                6.704255 = idf(docFreq=147, maxDocs=44421)
                0.019169161 = queryNorm
              0.8198296 = fieldWeight in 3487, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.704255 = idf(docFreq=147, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3487)
        0.24 = coord(6/25)
    
  2. Oberhauser, O.: Automatisches Klassifizieren : Entwicklungsstand - Methodik - Anwendungsbereiche (2005) 0.16
    0.16181241 = sum of:
      0.16181241 = product of:
        0.6742184 = sum of:
          0.085775524 = weight(abstract_txt:klassifikatorische in 163) [ClassicSimilarity], result of:
            0.085775524 = score(doc=163,freq=1.0), product of:
              0.18016475 = queryWeight, product of:
                1.079596 = boost
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.019169161 = queryNorm
              0.4760949 = fieldWeight in 163, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.0546875 = fieldNorm(doc=163)
          0.050133944 = weight(abstract_txt:analyse in 163) [ClassicSimilarity], result of:
            0.050133944 = score(doc=163,freq=1.0), product of:
              0.15868121 = queryWeight, product of:
                1.4328612 = boost
                5.7772117 = idf(docFreq=373, maxDocs=44421)
                0.019169161 = queryNorm
              0.31594127 = fieldWeight in 163, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7772117 = idf(docFreq=373, maxDocs=44421)
                0.0546875 = fieldNorm(doc=163)
          0.045653913 = weight(abstract_txt:einer in 163) [ClassicSimilarity], result of:
            0.045653913 = score(doc=163,freq=4.0), product of:
              0.10750616 = queryWeight, product of:
                1.4444538 = boost
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.019169161 = queryNorm
              0.42466322 = fieldWeight in 163, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.0546875 = fieldNorm(doc=163)
          0.022083724 = weight(abstract_txt:eine in 163) [ClassicSimilarity], result of:
            0.022083724 = score(doc=163,freq=1.0), product of:
              0.11574329 = queryWeight, product of:
                1.7306302 = boost
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.019169161 = queryNorm
              0.19079918 = fieldWeight in 163, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.0546875 = fieldNorm(doc=163)
          0.20778374 = weight(abstract_txt:klassifizierung in 163) [ClassicSimilarity], result of:
            0.20778374 = score(doc=163,freq=2.0), product of:
              0.32496405 = queryWeight, product of:
                2.0504966 = boost
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.019169161 = queryNorm
              0.6394053 = fieldWeight in 163, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.0546875 = fieldNorm(doc=163)
          0.2627876 = weight(abstract_txt:projekte in 163) [ClassicSimilarity], result of:
            0.2627876 = score(doc=163,freq=5.0), product of:
              0.32053933 = queryWeight, product of:
                2.4941792 = boost
                6.704255 = idf(docFreq=147, maxDocs=44421)
                0.019169161 = queryNorm
              0.8198296 = fieldWeight in 163, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.704255 = idf(docFreq=147, maxDocs=44421)
                0.0546875 = fieldNorm(doc=163)
        0.24 = coord(6/25)
    
  3. Panzer, M.: Semantische Integration heterogener und unterschiedlichsprachiger Wissensorganisationssysteme : CrissCross und jenseits (2008) 0.09
    0.09372031 = sum of:
      0.09372031 = product of:
        0.3905013 = sum of:
          0.09935457 = weight(abstract_txt:verbale in 335) [ClassicSimilarity], result of:
            0.09935457 = score(doc=335,freq=1.0), product of:
              0.15665762 = queryWeight, product of:
                1.0067048 = boost
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.019169161 = queryNorm
              0.63421476 = fieldWeight in 335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.078125 = fieldNorm(doc=335)
          0.11178837 = weight(abstract_txt:automatisierten in 335) [ClassicSimilarity], result of:
            0.11178837 = score(doc=335,freq=1.0), product of:
              0.16946918 = queryWeight, product of:
                1.0470604 = boost
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.019169161 = queryNorm
              0.65963835 = fieldWeight in 335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.078125 = fieldNorm(doc=335)
          0.030512622 = weight(abstract_txt:nach in 335) [ClassicSimilarity], result of:
            0.030512622 = score(doc=335,freq=1.0), product of:
              0.08984405 = queryWeight, product of:
                1.078167 = boost
                4.3471055 = idf(docFreq=1562, maxDocs=44421)
                0.019169161 = queryNorm
              0.3396176 = fieldWeight in 335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3471055 = idf(docFreq=1562, maxDocs=44421)
                0.078125 = fieldNorm(doc=335)
          0.07161992 = weight(abstract_txt:analyse in 335) [ClassicSimilarity], result of:
            0.07161992 = score(doc=335,freq=1.0), product of:
              0.15868121 = queryWeight, product of:
                1.4328612 = boost
                5.7772117 = idf(docFreq=373, maxDocs=44421)
                0.019169161 = queryNorm
              0.45134467 = fieldWeight in 335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7772117 = idf(docFreq=373, maxDocs=44421)
                0.078125 = fieldNorm(doc=335)
          0.03260994 = weight(abstract_txt:einer in 335) [ClassicSimilarity], result of:
            0.03260994 = score(doc=335,freq=1.0), product of:
              0.10750616 = queryWeight, product of:
                1.4444538 = boost
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.019169161 = queryNorm
              0.30333087 = fieldWeight in 335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.078125 = fieldNorm(doc=335)
          0.044615857 = weight(abstract_txt:eine in 335) [ClassicSimilarity], result of:
            0.044615857 = score(doc=335,freq=2.0), product of:
              0.11574329 = queryWeight, product of:
                1.7306302 = boost
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.019169161 = queryNorm
              0.38547254 = fieldWeight in 335, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.078125 = fieldNorm(doc=335)
        0.24 = coord(6/25)
    
  4. Müller, C.; Sternitzke, N.; Stratmann, R.; Parschik, T.: Kataloganreicherung und Zeitschriftenerschließung mit MyBib eDoc und C-3 am Ibero-Amerikanischen Institut, Preußischer Kulturbesitz : Neue Verfahren zur Optimierung der bibliografischen Nachweissituation in einer großen Spezialbibliothek (2010) 0.09
    0.092368394 = sum of:
      0.092368394 = product of:
        0.38486832 = sum of:
          0.06707302 = weight(abstract_txt:automatisierten in 486) [ClassicSimilarity], result of:
            0.06707302 = score(doc=486,freq=1.0), product of:
              0.16946918 = queryWeight, product of:
                1.0470604 = boost
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.019169161 = queryNorm
              0.395783 = fieldWeight in 486, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.046875 = fieldNorm(doc=486)
          0.12320678 = weight(abstract_txt:inhaltsverzeichnisse in 486) [ClassicSimilarity], result of:
            0.12320678 = score(doc=486,freq=3.0), product of:
              0.17624147 = queryWeight, product of:
                1.0677767 = boost
                8.610425 = idf(docFreq=21, maxDocs=44421)
                0.019169161 = queryNorm
              0.6990794 = fieldWeight in 486, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.610425 = idf(docFreq=21, maxDocs=44421)
                0.046875 = fieldNorm(doc=486)
          0.025890818 = weight(abstract_txt:nach in 486) [ClassicSimilarity], result of:
            0.025890818 = score(doc=486,freq=2.0), product of:
              0.08984405 = queryWeight, product of:
                1.078167 = boost
                4.3471055 = idf(docFreq=1562, maxDocs=44421)
                0.019169161 = queryNorm
              0.2881751 = fieldWeight in 486, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3471055 = idf(docFreq=1562, maxDocs=44421)
                0.046875 = fieldNorm(doc=486)
          0.073521875 = weight(abstract_txt:gängige in 486) [ClassicSimilarity], result of:
            0.073521875 = score(doc=486,freq=1.0), product of:
              0.18016475 = queryWeight, product of:
                1.079596 = boost
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.019169161 = queryNorm
              0.40808135 = fieldWeight in 486, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.046875 = fieldNorm(doc=486)
          0.03388924 = weight(abstract_txt:einer in 486) [ClassicSimilarity], result of:
            0.03388924 = score(doc=486,freq=3.0), product of:
              0.10750616 = queryWeight, product of:
                1.4444538 = boost
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.019169161 = queryNorm
              0.31523067 = fieldWeight in 486, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.046875 = fieldNorm(doc=486)
          0.06128661 = weight(abstract_txt:reihe in 486) [ClassicSimilarity], result of:
            0.06128661 = score(doc=486,freq=1.0), product of:
              0.20105392 = queryWeight, product of:
                1.6128635 = boost
                6.5029707 = idf(docFreq=180, maxDocs=44421)
                0.019169161 = queryNorm
              0.30482674 = fieldWeight in 486, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5029707 = idf(docFreq=180, maxDocs=44421)
                0.046875 = fieldNorm(doc=486)
        0.24 = coord(6/25)
    
  5. Kaizik, A.; Gödert, W.; Milanesi, C.: Erfahrungen und Ergebnisse aus der Evaluierung des EU-Projektes EULER im Rahmen des an der FH Köln angesiedelten Projektes EJECT (Evaluation von Subject Gateways des World Wide Web) (2001) 0.08
    0.077494614 = sum of:
      0.077494614 = product of:
        0.38747308 = sum of:
          0.1025837 = weight(abstract_txt:methodik in 6801) [ClassicSimilarity], result of:
            0.1025837 = score(doc=6801,freq=1.0), product of:
              0.16003385 = queryWeight, product of:
                1.017495 = boost
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.019169161 = queryNorm
              0.6410125 = fieldWeight in 6801, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.078125 = fieldNorm(doc=6801)
          0.07161992 = weight(abstract_txt:analyse in 6801) [ClassicSimilarity], result of:
            0.07161992 = score(doc=6801,freq=1.0), product of:
              0.15868121 = queryWeight, product of:
                1.4328612 = boost
                5.7772117 = idf(docFreq=373, maxDocs=44421)
                0.019169161 = queryNorm
              0.45134467 = fieldWeight in 6801, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7772117 = idf(docFreq=373, maxDocs=44421)
                0.078125 = fieldNorm(doc=6801)
          0.05648207 = weight(abstract_txt:einer in 6801) [ClassicSimilarity], result of:
            0.05648207 = score(doc=6801,freq=3.0), product of:
              0.10750616 = queryWeight, product of:
                1.4444538 = boost
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.019169161 = queryNorm
              0.5253845 = fieldWeight in 6801, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.078125 = fieldNorm(doc=6801)
          0.10214436 = weight(abstract_txt:reihe in 6801) [ClassicSimilarity], result of:
            0.10214436 = score(doc=6801,freq=1.0), product of:
              0.20105392 = queryWeight, product of:
                1.6128635 = boost
                6.5029707 = idf(docFreq=180, maxDocs=44421)
                0.019169161 = queryNorm
              0.5080446 = fieldWeight in 6801, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5029707 = idf(docFreq=180, maxDocs=44421)
                0.078125 = fieldNorm(doc=6801)
          0.054643042 = weight(abstract_txt:eine in 6801) [ClassicSimilarity], result of:
            0.054643042 = score(doc=6801,freq=3.0), product of:
              0.11574329 = queryWeight, product of:
                1.7306302 = boost
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.019169161 = queryNorm
              0.4721055 = fieldWeight in 6801, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.078125 = fieldNorm(doc=6801)
        0.2 = coord(5/25)
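
The content scores combine more factors than the author scores. The sketch below recomputes the last entry (doc 6801) under the assumption that each term score is queryWeight * fieldWeight, with queryWeight = boost * idf * queryNorm and fieldWeight = sqrt(freq) * idf * fieldNorm, and that the summed term scores are then multiplied by the coord factor (matching terms / query terms). All numbers are copied from the explain tree above; the helper name `term_score` is hypothetical.

```python
import math

QUERY_NORM = 0.019169161  # queryNorm, identical for every term of the query
FIELD_NORM = 0.078125     # fieldNorm of abstract_txt in doc 6801


def term_score(boost: float, idf: float, freq: float) -> float:
    # queryWeight: how much the term matters within the query.
    query_weight = boost * idf * QUERY_NORM
    # fieldWeight: how strongly the term is represented in the document field.
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight


# (term, boost, idf, freq) as listed for doc 6801.
terms = [
    ("methodik", 1.017495, 8.20496, 1.0),
    ("analyse", 1.4328612, 5.7772117, 1.0),
    ("einer", 1.4444538, 3.882635, 3.0),
    ("reihe", 1.6128635, 6.5029707, 1.0),
    ("eine", 1.7306302, 3.4888992, 3.0),
]

raw_sum = sum(term_score(boost, idf, freq) for _, boost, idf, freq in terms)
coord = 5 / 25  # 5 of the 25 query terms occur in this document
total = raw_sum * coord

print(round(raw_sum, 4), round(total, 4))
```

Under these assumptions the result agrees with the listed total of about 0.0775. Because idf enters both queryWeight and fieldWeight, rare terms such as "methodik" contribute far more than frequent function words such as "eine" or "einer".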