Document (#37123)

Author
Jersek, T.
Title
Automatische DDC-Klassifizierung mit Lingo : Vorgehensweise und Ergebnisse
Imprint
Köln : Fachhochschule
Year
2012
Pages
V, 56 p.
Abstract
This thesis covers the implementation and execution of automatic DDC classification with the indexing system Lingo. The classification draws on relations from the DFG project CrissCross, which Lingo uses to classify bibliographic title records automatically. The approach is compared with the usual methodology of automatic classification systems. The classification procedure is then tested on a collection of bibliographic title records from the German National Library (Deutsche Nationalbibliothek, DNB). A discussion of the results and an evaluation of the classification system follow.
Content
Diploma thesis (Diplomarbeit), Library Science degree programme, Faculty of Information and Communication Sciences, Fachhochschule Köln.
Theme
Automatisches Klassifizieren
Object
Lingo
DDC

Similar documents (content)

  1. Glaesener, L.: Automatisches Indexieren einer informationswissenschaftlichen Datenbank mit Mehrwortgruppen (2012) 0.23
    0.2296261 = sum of:
      0.2296261 = product of:
        1.1481305 = sum of:
          0.049393013 = weight(abstract_txt:einer in 1401) [ClassicSimilarity], result of:
            0.049393013 = score(doc=1401,freq=2.0), product of:
              0.07196377 = queryWeight, product of:
                1.0358148 = boost
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.01789391 = queryNorm
              0.6863594 = fieldWeight in 1401, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.125 = fieldNorm(doc=1401)
          0.06460671 = weight(abstract_txt:durch in 1401) [ClassicSimilarity], result of:
            0.06460671 = score(doc=1401,freq=2.0), product of:
              0.0860707 = queryWeight, product of:
                1.1327988 = boost
                4.246169 = idf(docFreq=1728, maxDocs=44421)
                0.01789391 = queryNorm
              0.7506237 = fieldWeight in 1401, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.246169 = idf(docFreq=1728, maxDocs=44421)
                0.125 = fieldNorm(doc=1401)
          0.13894281 = weight(abstract_txt:ergebnisse in 1401) [ClassicSimilarity], result of:
            0.13894281 = score(doc=1401,freq=2.0), product of:
              0.1434039 = queryWeight, product of:
                1.462196 = boost
                5.4808774 = idf(docFreq=502, maxDocs=44421)
                0.01789391 = queryNorm
              0.9688914 = fieldWeight in 1401, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4808774 = idf(docFreq=502, maxDocs=44421)
                0.125 = fieldNorm(doc=1401)
          0.19442883 = weight(abstract_txt:automatischen in 1401) [ClassicSimilarity], result of:
            0.19442883 = score(doc=1401,freq=1.0), product of:
              0.22604108 = queryWeight, product of:
                1.8357723 = boost
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.01789391 = queryNorm
              0.86014825 = fieldWeight in 1401, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.125 = fieldNorm(doc=1401)
          0.7007592 = weight(abstract_txt:lingo in 1401) [ClassicSimilarity], result of:
            0.7007592 = score(doc=1401,freq=1.0), product of:
              0.60826087 = queryWeight, product of:
                3.6882105 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.01789391 = queryNorm
              1.1520702 = fieldWeight in 1401, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.125 = fieldNorm(doc=1401)
        0.2 = coord(5/25)
    
  2. Lepsky, K.; Vorhauer, J.: Lingo - ein open source System für die Automatische Indexierung deutschsprachiger Dokumente (2006) 0.21
    0.21080685 = sum of:
      0.21080685 = product of:
        1.3175428 = sum of:
          0.03398633 = weight(abstract_txt:wird in 4581) [ClassicSimilarity], result of:
            0.03398633 = score(doc=4581,freq=2.0), product of:
              0.06794626 = queryWeight, product of:
                1.0064864 = boost
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.01789391 = queryNorm
              0.50019425 = fieldWeight in 4581, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.09375 = fieldNorm(doc=4581)
          0.0261946 = weight(abstract_txt:einer in 4581) [ClassicSimilarity], result of:
            0.0261946 = score(doc=4581,freq=1.0), product of:
              0.07196377 = queryWeight, product of:
                1.0358148 = boost
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.01789391 = queryNorm
              0.36399704 = fieldWeight in 4581, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.09375 = fieldNorm(doc=4581)
          0.20622292 = weight(abstract_txt:automatischen in 4581) [ClassicSimilarity], result of:
            0.20622292 = score(doc=4581,freq=2.0), product of:
              0.22604108 = queryWeight, product of:
                1.8357723 = boost
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.01789391 = queryNorm
              0.91232497 = fieldWeight in 4581, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.09375 = fieldNorm(doc=4581)
          1.0511389 = weight(abstract_txt:lingo in 4581) [ClassicSimilarity], result of:
            1.0511389 = score(doc=4581,freq=4.0), product of:
              0.60826087 = queryWeight, product of:
                3.6882105 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.01789391 = queryNorm
              1.7281053 = fieldWeight in 4581, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.09375 = fieldNorm(doc=4581)
        0.16 = coord(4/25)
    
  3. Bredack, J.: Terminologieextraktion von Mehrwortgruppen in kunsthistorischen Fachtexten (2013) 0.19
    0.19262117 = sum of:
      0.19262117 = product of:
        0.6879327 = sum of:
          0.022390462 = weight(abstract_txt:wird in 2054) [ClassicSimilarity], result of:
            0.022390462 = score(doc=2054,freq=5.0), product of:
              0.06794626 = queryWeight, product of:
                1.0064864 = boost
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.01789391 = queryNorm
              0.3295319 = fieldWeight in 2054, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.024405377 = weight(abstract_txt:einer in 2054) [ClassicSimilarity], result of:
            0.024405377 = score(doc=2054,freq=5.0), product of:
              0.07196377 = queryWeight, product of:
                1.0358148 = boost
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.01789391 = queryNorm
              0.33913422 = fieldWeight in 2054, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.882635 = idf(docFreq=2486, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.020189596 = weight(abstract_txt:durch in 2054) [ClassicSimilarity], result of:
            0.020189596 = score(doc=2054,freq=2.0), product of:
              0.0860707 = queryWeight, product of:
                1.1327988 = boost
                4.246169 = idf(docFreq=1728, maxDocs=44421)
                0.01789391 = queryNorm
              0.2345699 = fieldWeight in 2054, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.246169 = idf(docFreq=1728, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.08004669 = weight(abstract_txt:indexierungssystem in 2054) [ClassicSimilarity], result of:
            0.08004669 = score(doc=2054,freq=1.0), product of:
              0.21560848 = queryWeight, product of:
                1.2677776 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.01789391 = queryNorm
              0.37125948 = fieldWeight in 2054, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.043419626 = weight(abstract_txt:ergebnisse in 2054) [ClassicSimilarity], result of:
            0.043419626 = score(doc=2054,freq=2.0), product of:
              0.1434039 = queryWeight, product of:
                1.462196 = boost
                5.4808774 = idf(docFreq=502, maxDocs=44421)
                0.01789391 = queryNorm
              0.30277854 = fieldWeight in 2054, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4808774 = idf(docFreq=502, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.05950646 = weight(abstract_txt:anhand in 2054) [ClassicSimilarity], result of:
            0.05950646 = score(doc=2054,freq=3.0), product of:
              0.15456669 = queryWeight, product of:
                1.5180395 = boost
                5.6902003 = idf(docFreq=407, maxDocs=44421)
                0.01789391 = queryNorm
              0.3849889 = fieldWeight in 2054, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.6902003 = idf(docFreq=407, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.43797448 = weight(abstract_txt:lingo in 2054) [ClassicSimilarity], result of:
            0.43797448 = score(doc=2054,freq=4.0), product of:
              0.60826087 = queryWeight, product of:
                3.6882105 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.01789391 = queryNorm
              0.72004384 = fieldWeight in 2054, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
        0.28 = coord(7/25)
    
  4. Reiner, U.: DDC-basierte Suche in heterogenen digitalen Bibliotheks- und Wissensbeständen (2005) 0.15
    0.15369055 = sum of:
      0.15369055 = product of:
        0.5488948 = sum of:
          0.039244037 = weight(abstract_txt:wird in 5854) [ClassicSimilarity], result of:
            0.039244037 = score(doc=5854,freq=6.0), product of:
              0.06794626 = queryWeight, product of:
                1.0064864 = boost
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.01789391 = queryNorm
              0.5775746 = fieldWeight in 5854, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.0625 = fieldNorm(doc=5854)
          0.08430055 = weight(abstract_txt:klassifizierung in 5854) [ClassicSimilarity], result of:
            0.08430055 = score(doc=5854,freq=1.0), product of:
              0.16314629 = queryWeight, product of:
                1.102805 = boost
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.01789391 = queryNorm
              0.51671755 = fieldWeight in 5854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.0625 = fieldNorm(doc=5854)
          0.10018002 = weight(abstract_txt:klassifiziert in 5854) [ClassicSimilarity], result of:
            0.10018002 = score(doc=5854,freq=1.0), product of:
              0.1830393 = queryWeight, product of:
                1.1681061 = boost
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.01789391 = queryNorm
              0.5473143 = fieldWeight in 5854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.0625 = fieldNorm(doc=5854)
          0.12386241 = weight(abstract_txt:titeldatensätze in 5854) [ClassicSimilarity], result of:
            0.12386241 = score(doc=5854,freq=1.0), product of:
              0.2108547 = queryWeight, product of:
                1.2537235 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.01789391 = queryNorm
              0.5874302 = fieldWeight in 5854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.0625 = fieldNorm(doc=5854)
          0.0491237 = weight(abstract_txt:ergebnisse in 5854) [ClassicSimilarity], result of:
            0.0491237 = score(doc=5854,freq=1.0), product of:
              0.1434039 = queryWeight, product of:
                1.462196 = boost
                5.4808774 = idf(docFreq=502, maxDocs=44421)
                0.01789391 = queryNorm
              0.34255484 = fieldWeight in 5854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4808774 = idf(docFreq=502, maxDocs=44421)
                0.0625 = fieldNorm(doc=5854)
          0.054969713 = weight(abstract_txt:anhand in 5854) [ClassicSimilarity], result of:
            0.054969713 = score(doc=5854,freq=1.0), product of:
              0.15456669 = queryWeight, product of:
                1.5180395 = boost
                5.6902003 = idf(docFreq=407, maxDocs=44421)
                0.01789391 = queryNorm
              0.35563752 = fieldWeight in 5854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6902003 = idf(docFreq=407, maxDocs=44421)
                0.0625 = fieldNorm(doc=5854)
          0.097214416 = weight(abstract_txt:automatischen in 5854) [ClassicSimilarity], result of:
            0.097214416 = score(doc=5854,freq=1.0), product of:
              0.22604108 = queryWeight, product of:
                1.8357723 = boost
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.01789391 = queryNorm
              0.43007413 = fieldWeight in 5854, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.0625 = fieldNorm(doc=5854)
        0.28 = coord(7/25)
    
  5. Bredack, J.; Lepsky, K.: Automatische Extraktion von Fachterminologie aus Volltexten (2014) 0.15
    0.1522078 = sum of:
      0.1522078 = product of:
        0.95129883 = sum of:
          0.028037291 = weight(abstract_txt:wird in 872) [ClassicSimilarity], result of:
            0.028037291 = score(doc=872,freq=1.0), product of:
              0.06794626 = queryWeight, product of:
                1.0064864 = boost
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.01789391 = queryNorm
              0.4126392 = fieldWeight in 872, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.109375 = fieldNorm(doc=872)
          0.22413075 = weight(abstract_txt:indexierungssystem in 872) [ClassicSimilarity], result of:
            0.22413075 = score(doc=872,freq=1.0), product of:
              0.21560848 = queryWeight, product of:
                1.2677776 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.01789391 = queryNorm
              1.0395266 = fieldWeight in 872, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.109375 = fieldNorm(doc=872)
          0.085966475 = weight(abstract_txt:ergebnisse in 872) [ClassicSimilarity], result of:
            0.085966475 = score(doc=872,freq=1.0), product of:
              0.1434039 = queryWeight, product of:
                1.462196 = boost
                5.4808774 = idf(docFreq=502, maxDocs=44421)
                0.01789391 = queryNorm
              0.599471 = fieldWeight in 872, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4808774 = idf(docFreq=502, maxDocs=44421)
                0.109375 = fieldNorm(doc=872)
          0.6131643 = weight(abstract_txt:lingo in 872) [ClassicSimilarity], result of:
            0.6131643 = score(doc=872,freq=1.0), product of:
              0.60826087 = queryWeight, product of:
                3.6882105 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.01789391 = queryNorm
              1.0080614 = fieldWeight in 872, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.109375 = fieldNorm(doc=872)
        0.16 = coord(4/25)
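
The score trees above all follow the same arithmetic, which can be sketched for result 1 (doc 1401) as follows. This is a minimal sketch and not the system's own code: the function names are hypothetical, the input numbers are copied from the tree, and the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are an assumption about the ClassicSimilarity variant in use, chosen because they reproduce the printed tf and idf values.

```python
import math

MAX_DOCS = 44421          # maxDocs as printed in every idf line
QUERY_NORM = 0.01789391   # queryNorm as printed in the trees
FIELD_NORM = 0.125        # fieldNorm(doc=1401), copied from result 1


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Assumed idf formula: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq: float) -> float:
    """Assumed tf formula: square root of the term frequency."""
    return math.sqrt(freq)


def term_score(freq: float, doc_freq: int, boost: float,
               field_norm: float = FIELD_NORM) -> float:
    """score = queryWeight * fieldWeight, as in each 'product of' node."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


# (term, freq, docFreq, boost) copied from the tree of result 1
TERMS_DOC_1401 = [
    ("einer", 2.0, 2486, 1.0358148),
    ("durch", 2.0, 1728, 1.1327988),
    ("ergebnisse", 2.0, 502, 1.462196),
    ("automatischen", 1.0, 123, 1.8357723),
    ("lingo", 1.0, 11, 3.6882105),
]

term_sum = sum(term_score(freq, doc_freq, boost)
               for _, freq, doc_freq, boost in TERMS_DOC_1401)
coord = 5 / 25            # 5 of the 25 query terms matched
score = coord * term_sum

print(f"sum of term scores: {term_sum:.7f}")
print(f"final score:        {score:.7f}")
```

The result should agree with the printed 1.1481305 and 0.2296261 up to rounding, since the trees show single-precision values. The same function applies to the other results once fieldNorm, the matched terms, and the coord factor are replaced with the values from the corresponding tree.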