Document (#39471)

Author
Kasprzik, A.
Title
Automatisierte und semiautomatisierte Klassifizierung : eine Analyse aktueller Projekte
Source
Perspektive Bibliothek. 3(2014) no.1, pp.85-110
Year
2014
Abstract
The rapid growth in the number of digitally available documents, combined with the shortage of time and staff at academic libraries, suggests the use of semi-automatic or fully automatic methods for verbal and classificatory subject indexing. After a brief general introduction to the common methodology, this article examines a number of automated classification projects from the period 2007-2012 in the German-speaking countries. Most of the projects presented use machine learning methods from artificial intelligence, mostly work with adapted versions of commercial software, and generally refer to the Dewey Decimal Classification (DDC). The underlying data consist of metadata records, abstracts, tables of contents, and full texts in various data formats. The concluding analysis arranges the projects according to a range of criteria and summarizes the current situation and the greatest challenges for automated classification methods.
Content
Cf.: https://journals.ub.uni-heidelberg.de/index.php/bibliothek/article/view/14022.
Theme
Automatisches Indexieren
Automatisches Klassifizieren

Similar documents (author)

  1. Kasprzik, A.: Implementierung eines Hierarchisierungsalgorithmus' für die Konstanzer Systematik : Projektbericht (2013) 5.87
    5.871439 = sum of:
      5.871439 = weight(author_txt:kasprzik in 1277) [ClassicSimilarity], result of:
        5.871439 = fieldWeight in 1277, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.394302 = idf(docFreq=9, maxDocs=44218)
          0.625 = fieldNorm(doc=1277)
    
  2. Kasprzik, A.: Vorläufer der Internationalen Katalogisierungsprinzipien (2014) 5.87
    5.871439 = sum of:
      5.871439 = weight(author_txt:kasprzik in 1619) [ClassicSimilarity], result of:
        5.871439 = fieldWeight in 1619, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.394302 = idf(docFreq=9, maxDocs=44218)
          0.625 = fieldNorm(doc=1619)
    
  3. Kasprzik, A.: Voraussetzungen und Anwendungspotentiale einer präzisen Sacherschließung aus Sicht der Wissenschaft (2018) 5.87
    5.871439 = sum of:
      5.871439 = weight(author_txt:kasprzik in 5195) [ClassicSimilarity], result of:
        5.871439 = fieldWeight in 5195, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.394302 = idf(docFreq=9, maxDocs=44218)
          0.625 = fieldNorm(doc=5195)
    
  4. Kasprzik, A.: Aufbau eines produktiven Dienstes für die automatisierte Inhaltserschließung an der ZBW : ein Status- und Erfahrungsbericht (2023) 5.87
    5.871439 = sum of:
      5.871439 = weight(author_txt:kasprzik in 935) [ClassicSimilarity], result of:
        5.871439 = fieldWeight in 935, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.394302 = idf(docFreq=9, maxDocs=44218)
          0.625 = fieldNorm(doc=935)
    
  5. Kasprzik, A.; Kett, J.: Vorschläge für eine Weiterentwicklung der Sacherschließung und Schritte zur fortgesetzten strukturellen Aufwertung der GND (2018) 4.70
    4.697151 = sum of:
      4.697151 = weight(author_txt:kasprzik in 4599) [ClassicSimilarity], result of:
        4.697151 = fieldWeight in 4599, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.394302 = idf(docFreq=9, maxDocs=44218)
          0.5 = fieldNorm(doc=4599)
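
The author matches above all follow the same pattern, so the arithmetic can be checked by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative and are not part of any Lucene API.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    return math.sqrt(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values copied from the explain output for doc 1277 and doc 4599.
score_1277 = field_weight(1.0, 9, 44218, 0.625)
score_4599 = field_weight(1.0, 9, 44218, 0.5)
print(score_1277, score_4599)
```

Under these assumptions the results should agree with the listed 5.871439 and 4.697151 up to single-precision rounding. The only input that differs between hits 1-4 and hit 5 is fieldNorm (0.625 versus 0.5), presumably because the author field of hit 5 holds two names.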
    

Similar documents (content)

  1. Oberhauser, O.: Automatisches Klassifizieren : Verfahren zur Erschließung elektronischer Dokumente (2004) 0.16
    0.16181396 = sum of:
      0.16181396 = product of:
        0.67422485 = sum of:
          0.085652165 = weight(abstract_txt:klassifikatorische in 2487) [ClassicSimilarity], result of:
            0.085652165 = score(doc=2487,freq=1.0), product of:
              0.18000036 = queryWeight, product of:
                1.0796413 = boost
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.019160949 = queryNorm
              0.47584438 = fieldWeight in 2487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2487)
          0.05023143 = weight(abstract_txt:analyse in 2487) [ClassicSimilarity], result of:
            0.05023143 = score(doc=2487,freq=1.0), product of:
              0.15889426 = queryWeight, product of:
                1.4345375 = boost
                5.780685 = idf(docFreq=370, maxDocs=44218)
                0.019160949 = queryNorm
              0.3161312 = fieldWeight in 2487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.780685 = idf(docFreq=370, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2487)
          0.045697868 = weight(abstract_txt:einer in 2487) [ClassicSimilarity], result of:
            0.045697868 = score(doc=2487,freq=4.0), product of:
              0.10758016 = queryWeight, product of:
                1.4456712 = boost
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.019160949 = queryNorm
              0.42477968 = fieldWeight in 2487, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2487)
          0.022092761 = weight(abstract_txt:eine in 2487) [ClassicSimilarity], result of:
            0.022092761 = score(doc=2487,freq=1.0), product of:
              0.11578025 = queryWeight, product of:
                1.7317693 = boost
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.019160949 = queryNorm
              0.19081633 = fieldWeight in 2487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2487)
          0.20746757 = weight(abstract_txt:klassifizierung in 2487) [ClassicSimilarity], result of:
            0.20746757 = score(doc=2487,freq=2.0), product of:
              0.32464942 = queryWeight, product of:
                2.0505252 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.019160949 = queryNorm
              0.6390511 = fieldWeight in 2487, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2487)
          0.26308307 = weight(abstract_txt:projekte in 2487) [ClassicSimilarity], result of:
            0.26308307 = score(doc=2487,freq=5.0), product of:
              0.32079443 = queryWeight, product of:
                2.4964154 = boost
                6.7064548 = idf(docFreq=146, maxDocs=44218)
                0.019160949 = queryNorm
              0.82009864 = fieldWeight in 2487, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.7064548 = idf(docFreq=146, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2487)
        0.24 = coord(6/25)
    
  2. Oberhauser, O.: Automatisches Klassifizieren : Entwicklungsstand - Methodik - Anwendungsbereiche (2005) 0.16
    0.16181396 = sum of:
      0.16181396 = product of:
        0.67422485 = sum of:
          0.085652165 = weight(abstract_txt:klassifikatorische in 38) [ClassicSimilarity], result of:
            0.085652165 = score(doc=38,freq=1.0), product of:
              0.18000036 = queryWeight, product of:
                1.0796413 = boost
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.019160949 = queryNorm
              0.47584438 = fieldWeight in 38, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.0546875 = fieldNorm(doc=38)
          0.05023143 = weight(abstract_txt:analyse in 38) [ClassicSimilarity], result of:
            0.05023143 = score(doc=38,freq=1.0), product of:
              0.15889426 = queryWeight, product of:
                1.4345375 = boost
                5.780685 = idf(docFreq=370, maxDocs=44218)
                0.019160949 = queryNorm
              0.3161312 = fieldWeight in 38, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.780685 = idf(docFreq=370, maxDocs=44218)
                0.0546875 = fieldNorm(doc=38)
          0.045697868 = weight(abstract_txt:einer in 38) [ClassicSimilarity], result of:
            0.045697868 = score(doc=38,freq=4.0), product of:
              0.10758016 = queryWeight, product of:
                1.4456712 = boost
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.019160949 = queryNorm
              0.42477968 = fieldWeight in 38, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.0546875 = fieldNorm(doc=38)
          0.022092761 = weight(abstract_txt:eine in 38) [ClassicSimilarity], result of:
            0.022092761 = score(doc=38,freq=1.0), product of:
              0.11578025 = queryWeight, product of:
                1.7317693 = boost
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.019160949 = queryNorm
              0.19081633 = fieldWeight in 38, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.0546875 = fieldNorm(doc=38)
          0.20746757 = weight(abstract_txt:klassifizierung in 38) [ClassicSimilarity], result of:
            0.20746757 = score(doc=38,freq=2.0), product of:
              0.32464942 = queryWeight, product of:
                2.0505252 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.019160949 = queryNorm
              0.6390511 = fieldWeight in 38, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.0546875 = fieldNorm(doc=38)
          0.26308307 = weight(abstract_txt:projekte in 38) [ClassicSimilarity], result of:
            0.26308307 = score(doc=38,freq=5.0), product of:
              0.32079443 = queryWeight, product of:
                2.4964154 = boost
                6.7064548 = idf(docFreq=146, maxDocs=44218)
                0.019160949 = queryNorm
              0.82009864 = fieldWeight in 38, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.7064548 = idf(docFreq=146, maxDocs=44218)
                0.0546875 = fieldNorm(doc=38)
        0.24 = coord(6/25)
    
  3. Panzer, M.: Semantische Integration heterogener und unterschiedlichsprachiger Wissensorganisationssysteme : CrissCross und jenseits (2008) 0.09
    0.094080836 = sum of:
      0.094080836 = product of:
        0.39200348 = sum of:
          0.09920034 = weight(abstract_txt:verbale in 4335) [ClassicSimilarity], result of:
            0.09920034 = score(doc=4335,freq=1.0), product of:
              0.15650274 = queryWeight, product of:
                1.0067086 = boost
                8.113368 = idf(docFreq=35, maxDocs=44218)
                0.019160949 = queryNorm
              0.6338569 = fieldWeight in 4335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.113368 = idf(docFreq=35, maxDocs=44218)
                0.078125 = fieldNorm(doc=4335)
          0.11318571 = weight(abstract_txt:automatisierten in 4335) [ClassicSimilarity], result of:
            0.11318571 = score(doc=4335,freq=1.0), product of:
              0.17088643 = queryWeight, product of:
                1.0519536 = boost
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.019160949 = queryNorm
              0.66234463 = fieldWeight in 4335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.078125 = fieldNorm(doc=4335)
          0.030582776 = weight(abstract_txt:nach in 4335) [ClassicSimilarity], result of:
            0.030582776 = score(doc=4335,freq=1.0), product of:
              0.0899859 = queryWeight, product of:
                1.0795556 = boost
                4.350232 = idf(docFreq=1550, maxDocs=44218)
                0.019160949 = queryNorm
              0.33986187 = fieldWeight in 4335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.350232 = idf(docFreq=1550, maxDocs=44218)
                0.078125 = fieldNorm(doc=4335)
          0.071759194 = weight(abstract_txt:analyse in 4335) [ClassicSimilarity], result of:
            0.071759194 = score(doc=4335,freq=1.0), product of:
              0.15889426 = queryWeight, product of:
                1.4345375 = boost
                5.780685 = idf(docFreq=370, maxDocs=44218)
                0.019160949 = queryNorm
              0.45161602 = fieldWeight in 4335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.780685 = idf(docFreq=370, maxDocs=44218)
                0.078125 = fieldNorm(doc=4335)
          0.032641333 = weight(abstract_txt:einer in 4335) [ClassicSimilarity], result of:
            0.032641333 = score(doc=4335,freq=1.0), product of:
              0.10758016 = queryWeight, product of:
                1.4456712 = boost
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.019160949 = queryNorm
              0.30341405 = fieldWeight in 4335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.078125 = fieldNorm(doc=4335)
          0.04463412 = weight(abstract_txt:eine in 4335) [ClassicSimilarity], result of:
            0.04463412 = score(doc=4335,freq=2.0), product of:
              0.11578025 = queryWeight, product of:
                1.7317693 = boost
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.019160949 = queryNorm
              0.3855072 = fieldWeight in 4335, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.078125 = fieldNorm(doc=4335)
        0.24 = coord(6/25)
    
  4. Müller, C.; Sternitzke, N.; Stratmann, R.; Parschik, T.: Kataloganreicherung und Zeitschriftenerschließung mit MyBib eDoc und C-3 am Ibero-Amerikanischen Institut, Preußischer Kulturbesitz : Neue Verfahren zur Optimierung der bibliografischen Nachweissituation in einer großen Spezialbibliothek (2010) 0.09
    0.09253189 = sum of:
      0.09253189 = product of:
        0.38554955 = sum of:
          0.067911424 = weight(abstract_txt:automatisierten in 3499) [ClassicSimilarity], result of:
            0.067911424 = score(doc=3499,freq=1.0), product of:
              0.17088643 = queryWeight, product of:
                1.0519536 = boost
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.019160949 = queryNorm
              0.39740676 = fieldWeight in 3499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.046875 = fieldNorm(doc=3499)
          0.12302744 = weight(abstract_txt:inhaltsverzeichnisse in 3499) [ClassicSimilarity], result of:
            0.12302744 = score(doc=3499,freq=3.0), product of:
              0.1760786 = queryWeight, product of:
                1.0678152 = boost
                8.6058445 = idf(docFreq=21, maxDocs=44218)
                0.019160949 = queryNorm
              0.69870746 = fieldWeight in 3499, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.6058445 = idf(docFreq=21, maxDocs=44218)
                0.046875 = fieldNorm(doc=3499)
          0.025950348 = weight(abstract_txt:nach in 3499) [ClassicSimilarity], result of:
            0.025950348 = score(doc=3499,freq=2.0), product of:
              0.0899859 = queryWeight, product of:
                1.0795556 = boost
                4.350232 = idf(docFreq=1550, maxDocs=44218)
                0.019160949 = queryNorm
              0.28838238 = fieldWeight in 3499, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.350232 = idf(docFreq=1550, maxDocs=44218)
                0.046875 = fieldNorm(doc=3499)
          0.07341614 = weight(abstract_txt:gängige in 3499) [ClassicSimilarity], result of:
            0.07341614 = score(doc=3499,freq=1.0), product of:
              0.18000036 = queryWeight, product of:
                1.0796413 = boost
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.019160949 = queryNorm
              0.40786663 = fieldWeight in 3499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.046875 = fieldNorm(doc=3499)
          0.03392187 = weight(abstract_txt:einer in 3499) [ClassicSimilarity], result of:
            0.03392187 = score(doc=3499,freq=3.0), product of:
              0.10758016 = queryWeight, product of:
                1.4456712 = boost
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.019160949 = queryNorm
              0.31531715 = fieldWeight in 3499, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.046875 = fieldNorm(doc=3499)
          0.06132232 = weight(abstract_txt:reihe in 3499) [ClassicSimilarity], result of:
            0.06132232 = score(doc=3499,freq=1.0), product of:
              0.20114137 = queryWeight, product of:
                1.6140184 = boost
                6.5039306 = idf(docFreq=179, maxDocs=44218)
                0.019160949 = queryNorm
              0.30487174 = fieldWeight in 3499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5039306 = idf(docFreq=179, maxDocs=44218)
                0.046875 = fieldNorm(doc=3499)
        0.24 = coord(6/25)
    
  5. Kaizik, A.; Gödert, W.; Milanesi, C.: Erfahrungen und Ergebnisse aus der Evaluierung des EU-Projektes EULER im Rahmen des an der FH Köln angesiedelten Projektes EJECT (Evaluation von Subject Gateways des World Wide Web) (2001) 0.08
    0.07751825 = sum of:
      0.07751825 = product of:
        0.38759124 = sum of:
          0.102426305 = weight(abstract_txt:methodik in 5801) [ClassicSimilarity], result of:
            0.102426305 = score(doc=5801,freq=1.0), product of:
              0.15987757 = queryWeight, product of:
                1.017505 = boost
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.019160949 = queryNorm
              0.6406546 = fieldWeight in 5801, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.078125 = fieldNorm(doc=5801)
          0.071759194 = weight(abstract_txt:analyse in 5801) [ClassicSimilarity], result of:
            0.071759194 = score(doc=5801,freq=1.0), product of:
              0.15889426 = queryWeight, product of:
                1.4345375 = boost
                5.780685 = idf(docFreq=370, maxDocs=44218)
                0.019160949 = queryNorm
              0.45161602 = fieldWeight in 5801, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.780685 = idf(docFreq=370, maxDocs=44218)
                0.078125 = fieldNorm(doc=5801)
          0.056536447 = weight(abstract_txt:einer in 5801) [ClassicSimilarity], result of:
            0.056536447 = score(doc=5801,freq=3.0), product of:
              0.10758016 = queryWeight, product of:
                1.4456712 = boost
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.019160949 = queryNorm
              0.52552855 = fieldWeight in 5801, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.8837 = idf(docFreq=2472, maxDocs=44218)
                0.078125 = fieldNorm(doc=5801)
          0.10220387 = weight(abstract_txt:reihe in 5801) [ClassicSimilarity], result of:
            0.10220387 = score(doc=5801,freq=1.0), product of:
              0.20114137 = queryWeight, product of:
                1.6140184 = boost
                6.5039306 = idf(docFreq=179, maxDocs=44218)
                0.019160949 = queryNorm
              0.5081196 = fieldWeight in 5801, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5039306 = idf(docFreq=179, maxDocs=44218)
                0.078125 = fieldNorm(doc=5801)
          0.054665405 = weight(abstract_txt:eine in 5801) [ClassicSimilarity], result of:
            0.054665405 = score(doc=5801,freq=3.0), product of:
              0.11578025 = queryWeight, product of:
                1.7317693 = boost
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.019160949 = queryNorm
              0.47214794 = fieldWeight in 5801, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.078125 = fieldNorm(doc=5801)
        0.2 = coord(5/25)
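
The content scores above combine several term scores with a coordination factor. As a rough sketch of that arithmetic, assuming the ClassicSimilarity formulas queryWeight = boost * idf * queryNorm, fieldWeight = sqrt(freq) * idf * fieldNorm, and a final multiplication by coord, the score of the first hit (doc 2487) can be recomputed from the values in its explain tree. The helper names are hypothetical.

```python
import math

QUERY_NORM = 0.019160949  # queryNorm as printed in the explain output
MAX_DOCS = 44218          # maxDocs as printed in the explain output

def idf(doc_freq: int) -> float:
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))

def term_score(boost: float, doc_freq: int, freq: float, field_norm: float) -> float:
    # Per-term score = queryWeight * fieldWeight
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight

# (term, boost, docFreq, freq) copied from the explain tree of doc 2487.
terms_2487 = [
    ("klassifikatorische", 1.0796413, 19, 1.0),
    ("analyse", 1.4345375, 370, 1.0),
    ("einer", 1.4456712, 2472, 4.0),
    ("eine", 1.7317693, 3668, 1.0),
    ("klassifizierung", 2.0505252, 30, 2.0),
    ("projekte", 2.4964154, 146, 5.0),
]
FIELD_NORM_2487 = 0.0546875
COORD_2487 = 6 / 25  # 6 of the 25 query terms matched

term_sum = sum(term_score(b, df, f, FIELD_NORM_2487) for _, b, df, f in terms_2487)
score_2487 = COORD_2487 * term_sum
print(term_sum, score_2487)
```

Under these assumptions the sum should land close to the listed 0.67422485 and the final score close to 0.16181396, with small deviations from single-precision rounding. The same function applies to the other hits once their boost, docFreq, freq, fieldNorm, and coord values are substituted.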