Document (#36873)

Author
Bredack, J.
Lepsky, K.
Title
Automatische Extraktion von Fachterminologie aus Volltexten
Source
ABI-Technik. 34(2014) no.1, pp.2-8
Year
2014
Abstract
In scholarly texts, subject terminology often takes the form of phrases or multi-word groups. The paper presents an algorithmic method for identifying and extracting terminological multi-word groups. Particular emphasis is placed on incorporating German function words so that complex multi-word constructions can be extracted. The automatic indexing system Lingo was used. Results from extracting art-historical terminology from the Reallexikon zur Deutschen Kunstgeschichte demonstrate the suitability of the method.
Theme
Automatisches Indexieren
Field
Kunst
Object
Lingo
RDK

Similar documents (author)

  1. Lepsky, K.: Art and language : Ernst H. Gombrich and Karl Bühler's theory of language (1996) 5.04
    5.039926 = sum of:
      5.039926 = weight(author_txt:lepsky in 5228) [ClassicSimilarity], result of:
        5.039926 = fieldWeight in 5228, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.063882 = idf(docFreq=37, maxDocs=44421)
          0.625 = fieldNorm(doc=5228)
    
  2. Lepsky, K.: Maschinelle Indexierung von Titelaufnahmen zur Verbesserung der sachlichen Erschließung in Online-Publikumskatalogen (1994) 5.04
    5.039926 = sum of:
      5.039926 = weight(author_txt:lepsky in 7063) [ClassicSimilarity], result of:
        5.039926 = fieldWeight in 7063, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.063882 = idf(docFreq=37, maxDocs=44421)
          0.625 = fieldNorm(doc=7063)
    
  3. Lepsky, K.: RSWK - und was noch? : Stellungnahme zum Bericht 'Sacherschließung in Online-Katalogen' der Expertengruppe Online-Kataloge (1995) 5.04
    5.039926 = sum of:
      5.039926 = weight(author_txt:lepsky in 840) [ClassicSimilarity], result of:
        5.039926 = fieldWeight in 840, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.063882 = idf(docFreq=37, maxDocs=44421)
          0.625 = fieldNorm(doc=840)
    
  4. Lepsky, K.: Bild und Wirklichkeit : die Wirklichkeit im Bild (1987) 5.04
    5.039926 = sum of:
      5.039926 = weight(author_txt:lepsky in 1414) [ClassicSimilarity], result of:
        5.039926 = fieldWeight in 1414, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.063882 = idf(docFreq=37, maxDocs=44421)
          0.625 = fieldNorm(doc=1414)
    
  5. Lepsky, K.: Ernst H. Gombrich : Theorie und Methode (1991) 5.04
    5.039926 = sum of:
      5.039926 = weight(author_txt:lepsky in 1753) [ClassicSimilarity], result of:
        5.039926 = fieldWeight in 1753, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.063882 = idf(docFreq=37, maxDocs=44421)
          0.625 = fieldNorm(doc=1753)
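
The author scores above all reduce to the same product, tf × idf × fieldNorm. The sketch below recomputes that value under the assumption that the index uses the standard Lucene ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative, not taken from the record, and fieldNorm is passed in as printed because it is a stored, length-dependent value that cannot be derived from the listing.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as ClassicSimilarity is assumed to define it."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor: square root of the raw term count."""
    return math.sqrt(freq)


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, mirroring the explain tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the explain output for author_txt:lepsky.
idf = classic_idf(doc_freq=37, max_docs=44421)
score = field_weight(freq=1.0, doc_freq=37, max_docs=44421, field_norm=0.625)
print(f"idf={idf:.4f} score={score:.4f}")
```

Because every author match has freq=1.0 and the same fieldNorm, all five entries receive the same score, which is why the list is effectively unranked.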
    

Similar documents (content)

  1. Bredack, J.: Terminologieextraktion von Mehrwortgruppen in kunsthistorischen Fachtexten (2013) 0.61
    0.6093641 = sum of:
      0.6093641 = product of:
        1.3849185 = sum of:
          0.027703214 = weight(abstract_txt:verfahren in 2054) [ClassicSimilarity], result of:
            0.027703214 = score(doc=2054,freq=4.0), product of:
              0.06152071 = queryWeight, product of:
                5.7639313 = idf(docFreq=378, maxDocs=44421)
                0.010673394 = queryNorm
              0.45030713 = fieldWeight in 2054, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.7639313 = idf(docFreq=378, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.026851896 = weight(abstract_txt:sprache in 2054) [ClassicSimilarity], result of:
            0.026851896 = score(doc=2054,freq=3.0), product of:
              0.06631791 = queryWeight, product of:
                1.0382566 = boost
                5.98444 = idf(docFreq=303, maxDocs=44421)
                0.010673394 = queryNorm
              0.40489662 = fieldWeight in 2054, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.98444 = idf(docFreq=303, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.027017256 = weight(abstract_txt:texten in 2054) [ClassicSimilarity], result of:
            0.027017256 = score(doc=2054,freq=1.0), product of:
              0.09603924 = queryWeight, product of:
                1.2494351 = boost
                7.201658 = idf(docFreq=89, maxDocs=44421)
                0.010673394 = queryNorm
              0.28131476 = fieldWeight in 2054, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.201658 = idf(docFreq=89, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.053429876 = weight(abstract_txt:einbindung in 2054) [ClassicSimilarity], result of:
            0.053429876 = score(doc=2054,freq=3.0), product of:
              0.10491484 = queryWeight, product of:
                1.3058935 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.010673394 = queryNorm
              0.509269 = fieldWeight in 2054, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.03869748 = weight(abstract_txt:verfahrens in 2054) [ClassicSimilarity], result of:
            0.03869748 = score(doc=2054,freq=1.0), product of:
              0.12203274 = queryWeight, product of:
                1.4084048 = boost
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.010673394 = queryNorm
              0.31710738 = fieldWeight in 2054, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.11326104 = weight(abstract_txt:lingo in 2054) [ClassicSimilarity], result of:
            0.11326104 = score(doc=2054,freq=4.0), product of:
              0.15729742 = queryWeight, product of:
                1.599006 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.010673394 = queryNorm
              0.72004384 = fieldWeight in 2054, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.06210068 = weight(abstract_txt:indexierungssystem in 2054) [ClassicSimilarity], result of:
            0.06210068 = score(doc=2054,freq=1.0), product of:
              0.16727029 = queryWeight, product of:
                1.6489167 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.010673394 = queryNorm
              0.37125948 = fieldWeight in 2054, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.02769862 = weight(abstract_txt:deutschen in 2054) [ClassicSimilarity], result of:
            0.02769862 = score(doc=2054,freq=2.0), product of:
              0.09764724 = queryWeight, product of:
                1.781699 = boost
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.010673394 = queryNorm
              0.28366002 = fieldWeight in 2054, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.5076075 = weight(abstract_txt:mehrwortgruppen in 2054) [ClassicSimilarity], result of:
            0.5076075 = score(doc=2054,freq=13.0), product of:
              0.36369345 = queryWeight, product of:
                3.438524 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.010673394 = queryNorm
              1.3957016 = fieldWeight in 2054, product of:
                3.6055512 = tf(freq=13.0), with freq of:
                  13.0 = termFreq=13.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.16989157 = weight(abstract_txt:fachterminologie in 2054) [ClassicSimilarity], result of:
            0.16989157 = score(doc=2054,freq=1.0), product of:
              0.4718923 = queryWeight, product of:
                4.7970185 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.010673394 = queryNorm
              0.36002192 = fieldWeight in 2054, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
          0.3306593 = weight(abstract_txt:extraktion in 2054) [ClassicSimilarity], result of:
            0.3306593 = score(doc=2054,freq=3.0), product of:
              0.56137705 = queryWeight, product of:
                6.041526 = boost
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.010673394 = queryNorm
              0.58901465 = fieldWeight in 2054, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2054)
        0.44 = coord(11/25)
    
  2. Grün, S.: Bildung von Komposita-Indextermen auf der Basis einer algorithmischen Mehrwortgruppenanalyse mit Lingo (2015) 0.23
    0.22567824 = sum of:
      0.22567824 = product of:
        1.1283911 = sum of:
          0.0310059 = weight(abstract_txt:sprache in 2335) [ClassicSimilarity], result of:
            0.0310059 = score(doc=2335,freq=1.0), product of:
              0.06631791 = queryWeight, product of:
                1.0382566 = boost
                5.98444 = idf(docFreq=303, maxDocs=44421)
                0.010673394 = queryNorm
              0.46753436 = fieldWeight in 2335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.98444 = idf(docFreq=303, maxDocs=44421)
                0.078125 = fieldNorm(doc=2335)
          0.11326104 = weight(abstract_txt:lingo in 2335) [ClassicSimilarity], result of:
            0.11326104 = score(doc=2335,freq=1.0), product of:
              0.15729742 = queryWeight, product of:
                1.599006 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.010673394 = queryNorm
              0.72004384 = fieldWeight in 2335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.078125 = fieldNorm(doc=2335)
          0.039171766 = weight(abstract_txt:deutschen in 2335) [ClassicSimilarity], result of:
            0.039171766 = score(doc=2335,freq=1.0), product of:
              0.09764724 = queryWeight, product of:
                1.781699 = boost
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.010673394 = queryNorm
              0.4011559 = fieldWeight in 2335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.134795 = idf(docFreq=710, maxDocs=44421)
                0.078125 = fieldNorm(doc=2335)
          0.56314 = weight(abstract_txt:mehrwortgruppen in 2335) [ClassicSimilarity], result of:
            0.56314 = score(doc=2335,freq=4.0), product of:
              0.36369345 = queryWeight, product of:
                3.438524 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.010673394 = queryNorm
              1.5483918 = fieldWeight in 2335, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.078125 = fieldNorm(doc=2335)
          0.38181248 = weight(abstract_txt:extraktion in 2335) [ClassicSimilarity], result of:
            0.38181248 = score(doc=2335,freq=1.0), product of:
              0.56137705 = queryWeight, product of:
                6.041526 = boost
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.010673394 = queryNorm
              0.68013555 = fieldWeight in 2335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.078125 = fieldNorm(doc=2335)
        0.2 = coord(5/25)
    
  3. Glaesener, L.: Automatisches Indexieren einer informationswissenschaftlichen Datenbank mit Mehrwortgruppen (2012) 0.19
    0.19224708 = sum of:
      0.19224708 = product of:
        1.2015443 = sum of:
          0.08645522 = weight(abstract_txt:texten in 1401) [ClassicSimilarity], result of:
            0.08645522 = score(doc=1401,freq=1.0), product of:
              0.09603924 = queryWeight, product of:
                1.2494351 = boost
                7.201658 = idf(docFreq=89, maxDocs=44421)
                0.010673394 = queryNorm
              0.9002072 = fieldWeight in 1401, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.201658 = idf(docFreq=89, maxDocs=44421)
                0.125 = fieldNorm(doc=1401)
          0.18121766 = weight(abstract_txt:lingo in 1401) [ClassicSimilarity], result of:
            0.18121766 = score(doc=1401,freq=1.0), product of:
              0.15729742 = queryWeight, product of:
                1.599006 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.010673394 = queryNorm
              1.1520702 = fieldWeight in 1401, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.125 = fieldNorm(doc=1401)
          0.15356183 = weight(abstract_txt:automatische in 1401) [ClassicSimilarity], result of:
            0.15356183 = score(doc=1401,freq=1.0), product of:
              0.17746802 = queryWeight, product of:
                2.4019523 = boost
                6.922344 = idf(docFreq=118, maxDocs=44421)
                0.010673394 = queryNorm
              0.865293 = fieldWeight in 1401, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.922344 = idf(docFreq=118, maxDocs=44421)
                0.125 = fieldNorm(doc=1401)
          0.7803096 = weight(abstract_txt:mehrwortgruppen in 1401) [ClassicSimilarity], result of:
            0.7803096 = score(doc=1401,freq=3.0), product of:
              0.36369345 = queryWeight, product of:
                3.438524 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.010673394 = queryNorm
              2.1455147 = fieldWeight in 1401, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.125 = fieldNorm(doc=1401)
        0.16 = coord(4/25)
    
  4. Witschel, H.F.: Text, Wörter, Morpheme : Möglichkeiten einer automatischen Terminologie-Extraktion (2004) 0.17
    0.1657811 = sum of:
      0.1657811 = product of:
        0.82890546 = sum of:
          0.0383867 = weight(abstract_txt:verfahren in 1126) [ClassicSimilarity], result of:
            0.0383867 = score(doc=1126,freq=3.0), product of:
              0.06152071 = queryWeight, product of:
                5.7639313 = idf(docFreq=378, maxDocs=44421)
                0.010673394 = queryNorm
              0.62396383 = fieldWeight in 1126, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7639313 = idf(docFreq=378, maxDocs=44421)
                0.0625 = fieldNorm(doc=1126)
          0.02480472 = weight(abstract_txt:sprache in 1126) [ClassicSimilarity], result of:
            0.02480472 = score(doc=1126,freq=1.0), product of:
              0.06631791 = queryWeight, product of:
                1.0382566 = boost
                5.98444 = idf(docFreq=303, maxDocs=44421)
                0.010673394 = queryNorm
              0.3740275 = fieldWeight in 1126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.98444 = idf(docFreq=303, maxDocs=44421)
                0.0625 = fieldNorm(doc=1126)
          0.061915968 = weight(abstract_txt:verfahrens in 1126) [ClassicSimilarity], result of:
            0.061915968 = score(doc=1126,freq=1.0), product of:
              0.12203274 = queryWeight, product of:
                1.4084048 = boost
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.010673394 = queryNorm
              0.5073718 = fieldWeight in 1126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.0625 = fieldNorm(doc=1126)
          0.2718265 = weight(abstract_txt:fachterminologie in 1126) [ClassicSimilarity], result of:
            0.2718265 = score(doc=1126,freq=1.0), product of:
              0.4718923 = queryWeight, product of:
                4.7970185 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.010673394 = queryNorm
              0.5760351 = fieldWeight in 1126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.0625 = fieldNorm(doc=1126)
          0.43197152 = weight(abstract_txt:extraktion in 1126) [ClassicSimilarity], result of:
            0.43197152 = score(doc=1126,freq=2.0), product of:
              0.56137705 = queryWeight, product of:
                6.041526 = boost
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.010673394 = queryNorm
              0.76948553 = fieldWeight in 1126, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.0625 = fieldNorm(doc=1126)
        0.2 = coord(5/25)
    
  5. Witschel, H.F.: Terminologie-Extraktion : Möglichkeiten der Kombination statistischer und musterbasierter Verfahren (2004) 0.14
    0.14119251 = sum of:
      0.14119251 = product of:
        0.70596254 = sum of:
          0.022162572 = weight(abstract_txt:verfahren in 1123) [ClassicSimilarity], result of:
            0.022162572 = score(doc=1123,freq=1.0), product of:
              0.06152071 = queryWeight, product of:
                5.7639313 = idf(docFreq=378, maxDocs=44421)
                0.010673394 = queryNorm
              0.3602457 = fieldWeight in 1123, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7639313 = idf(docFreq=378, maxDocs=44421)
                0.0625 = fieldNorm(doc=1123)
          0.034736592 = weight(abstract_txt:liegt in 1123) [ClassicSimilarity], result of:
            0.034736592 = score(doc=1123,freq=2.0), product of:
              0.06588543 = queryWeight, product of:
                1.0348657 = boost
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.010673394 = queryNorm
              0.5272272 = fieldWeight in 1123, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.0625 = fieldNorm(doc=1123)
          0.04322761 = weight(abstract_txt:texten in 1123) [ClassicSimilarity], result of:
            0.04322761 = score(doc=1123,freq=1.0), product of:
              0.09603924 = queryWeight, product of:
                1.2494351 = boost
                7.201658 = idf(docFreq=89, maxDocs=44421)
                0.010673394 = queryNorm
              0.4501036 = fieldWeight in 1123, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.201658 = idf(docFreq=89, maxDocs=44421)
                0.0625 = fieldNorm(doc=1123)
          0.076780915 = weight(abstract_txt:automatische in 1123) [ClassicSimilarity], result of:
            0.076780915 = score(doc=1123,freq=1.0), product of:
              0.17746802 = queryWeight, product of:
                2.4019523 = boost
                6.922344 = idf(docFreq=118, maxDocs=44421)
                0.010673394 = queryNorm
              0.4326465 = fieldWeight in 1123, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.922344 = idf(docFreq=118, maxDocs=44421)
                0.0625 = fieldNorm(doc=1123)
          0.5290549 = weight(abstract_txt:extraktion in 1123) [ClassicSimilarity], result of:
            0.5290549 = score(doc=1123,freq=3.0), product of:
              0.56137705 = queryWeight, product of:
                6.041526 = boost
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.010673394 = queryNorm
              0.94242346 = fieldWeight in 1123, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.0625 = fieldNorm(doc=1123)
        0.2 = coord(5/25)
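
The content scores above combine three layers: a per-term score (queryWeight × fieldWeight), the sum over matching terms, and a coord factor that scales by the share of the 25 query terms that matched. The sketch below reproduces the total for result 3 (doc 1401) from the printed parameters. It assumes the standard Lucene ClassicSimilarity formulas; the function names and the tuple layout are illustrative only. queryNorm and fieldNorm are taken as printed rather than derived, since the listing does not contain the full query or the field lengths.

```python
import math

MAX_DOCS = 44421          # maxDocs as printed in the explain output
QUERY_NORM = 0.010673394  # queryNorm as printed; not recomputed here
QUERY_TERMS = 25          # denominator of the coord factor


def classic_idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """idf = 1 + ln(maxDocs / (docFreq + 1)), assumed ClassicSimilarity form."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float,
               boost: float = 1.0, query_norm: float = QUERY_NORM) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


def document_score(matches, field_norm: float,
                   query_terms: int = QUERY_TERMS) -> float:
    """coord(matched/query_terms) times the sum of the term scores."""
    total = sum(term_score(freq, doc_freq, field_norm, boost)
                for freq, doc_freq, boost in matches)
    return (len(matches) / query_terms) * total


# (freq, docFreq, boost) for the four terms matched in doc 1401.
DOC_1401 = [
    (1.0, 89, 1.2494351),   # texten
    (1.0, 11, 1.599006),    # lingo
    (1.0, 118, 2.4019523),  # automatische
    (3.0, 5, 3.438524),     # mehrwortgruppen
]

score = document_score(DOC_1401, field_norm=0.125)
print(f"{score:.4f}")
```

The coord factor explains the gap between the first result and the rest: doc 2054 matches 11 of 25 query terms (coord 0.44), while the others match only 4 or 5, so their sums are scaled down to 0.16 or 0.2 of their value.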