Document (#36639)

Author
Perera, P.
Witte, R.
Title
A self-learning context-aware lemmatizer for German
Source
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP 2005), October 6-8, 2005
Imprint
Vancouver : Association for Computational Linguistics
Year
2005
Pages
pp. 636-643
Abstract
Accurate lemmatization of German nouns mandates the use of a lexicon. Comprehensive lexicons, however, are expensive to build and maintain. We present a self-learning lemmatizer capable of automatically creating a full-form lexicon by processing German documents.
Content
See: http://acl.ldc.upenn.edu//H/H05/H05-1080.pdf.
Theme
Computational linguistics

Similar documents (author)

  1. Witte, L.: Sehnsucht nach Unsterblichkeit (2014) 6.01
    6.0137663 = sum of:
      6.0137663 = weight(author_txt:witte in 2144) [ClassicSimilarity], result of:
        6.0137663 = fieldWeight in 2144, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.622026 = idf(docFreq=7, maxDocs=44421)
          0.625 = fieldNorm(doc=2144)
    
  2. Witte-Petit, K.: Mal schnell die Welt retten (2021) 4.81
    4.811013 = sum of:
      4.811013 = weight(author_txt:witte in 1241) [ClassicSimilarity], result of:
        4.811013 = fieldWeight in 1241, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.622026 = idf(docFreq=7, maxDocs=44421)
          0.5 = fieldNorm(doc=1241)
    
  3. Witte, R.; Gitzinger, T.: Semantic assistants : user-centric Natural Language Processing services for desktop clients (2009) 4.81
    4.811013 = sum of:
      4.811013 = weight(author_txt:witte in 652) [ClassicSimilarity], result of:
        4.811013 = fieldWeight in 652, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.622026 = idf(docFreq=7, maxDocs=44421)
          0.5 = fieldNorm(doc=652)
    
  4. Witte-Petit, K.: Digitaler Handschlag : Corona-App (2019) 4.81
    4.811013 = sum of:
      4.811013 = weight(author_txt:witte in 957) [ClassicSimilarity], result of:
        4.811013 = fieldWeight in 957, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.622026 = idf(docFreq=7, maxDocs=44421)
          0.5 = fieldNorm(doc=957)
    
  5. Witte-Petit, K.: Der menschliche Kurswert (2017) 4.81
    4.811013 = sum of:
      4.811013 = weight(author_txt:witte in 1226) [ClassicSimilarity], result of:
        4.811013 = fieldWeight in 1226, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.622026 = idf(docFreq=7, maxDocs=44421)
          0.5 = fieldNorm(doc=1226)
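
The author scores above can be recomputed by hand. The following is a minimal sketch, assuming the classic TF-IDF form idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(termFreq); this form reproduces the values listed here, but other Lucene versions may compute idf slightly differently. The function names are illustrative and not part of any Lucene API.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Assumed idf form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(term_freq: float, doc_freq: int, max_docs: int,
                 field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, with tf = sqrt(termFreq)."""
    tf = math.sqrt(term_freq)
    return tf * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the explanations above (author_txt:witte).
score_doc_2144 = field_weight(1.0, doc_freq=7, max_docs=44421, field_norm=0.625)
score_doc_1241 = field_weight(1.0, doc_freq=7, max_docs=44421, field_norm=0.5)

print(round(classic_idf(7, 44421), 6))
print(round(score_doc_2144, 6), round(score_doc_1241, 6))
```

All five hits share the same tf and idf, so the ranking is decided by fieldNorm alone: 0.625 for the first hit and 0.5 for the other four.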
    

Similar documents (content)

  1. Stede, M.: Lexicalization in natural language generation (2002) 0.14
    0.14096028 = sum of:
      0.14096028 = product of:
        0.5638411 = sum of:
          0.020649198 = weight(abstract_txt:however in 5245) [ClassicSimilarity], result of:
            0.020649198 = score(doc=5245,freq=2.0), product of:
              0.06351376 = queryWeight, product of:
                1.0194949 = boost
                4.203706 = idf(docFreq=1803, maxDocs=44421)
                0.014820077 = queryNorm
              0.32511377 = fieldWeight in 5245, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.203706 = idf(docFreq=1803, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5245)
          0.023480078 = weight(abstract_txt:processing in 5245) [ClassicSimilarity], result of:
            0.023480078 = score(doc=5245,freq=1.0), product of:
              0.087178364 = queryWeight, product of:
                1.1944157 = boost
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.014820077 = queryNorm
              0.26933378 = fieldWeight in 5245, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5245)
          0.033086512 = weight(abstract_txt:automatically in 5245) [ClassicSimilarity], result of:
            0.033086512 = score(doc=5245,freq=1.0), product of:
              0.10957454 = queryWeight, product of:
                1.3390783 = boost
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.014820077 = queryNorm
              0.30195436 = fieldWeight in 5245, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5245)
          0.03957016 = weight(abstract_txt:build in 5245) [ClassicSimilarity], result of:
            0.03957016 = score(doc=5245,freq=1.0), product of:
              0.12345847 = queryWeight, product of:
                1.4213845 = boost
                5.860826 = idf(docFreq=343, maxDocs=44421)
                0.014820077 = queryNorm
              0.32051393 = fieldWeight in 5245, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.860826 = idf(docFreq=343, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5245)
          0.1901499 = weight(abstract_txt:lexicons in 5245) [ClassicSimilarity], result of:
            0.1901499 = score(doc=5245,freq=2.0), product of:
              0.27903783 = queryWeight, product of:
                2.1368926 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.014820077 = queryNorm
              0.68144846 = fieldWeight in 5245, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5245)
          0.25690523 = weight(abstract_txt:lexicon in 5245) [ClassicSimilarity], result of:
            0.25690523 = score(doc=5245,freq=2.0), product of:
              0.4296594 = queryWeight, product of:
                3.7499743 = boost
                7.731176 = idf(docFreq=52, maxDocs=44421)
                0.014820077 = queryNorm
              0.59792763 = fieldWeight in 5245, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.731176 = idf(docFreq=52, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5245)
        0.25 = coord(6/24)
    
  2. Hubain, R.; Wilde, M. De; Hooland, S. van: Automated SKOS vocabulary design for the biopharmaceutical industry (2016) 0.12
    0.119456604 = sum of:
      0.119456604 = product of:
        0.4095655 = sum of:
          0.040914465 = weight(abstract_txt:documents in 132) [ClassicSimilarity], result of:
            0.040914465 = score(doc=132,freq=3.0), product of:
              0.061107952 = queryWeight, product of:
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.014820077 = queryNorm
              0.66954404 = fieldWeight in 132, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.09375 = fieldNorm(doc=132)
          0.02503061 = weight(abstract_txt:however in 132) [ClassicSimilarity], result of:
            0.02503061 = score(doc=132,freq=1.0), product of:
              0.06351376 = queryWeight, product of:
                1.0194949 = boost
                4.203706 = idf(docFreq=1803, maxDocs=44421)
                0.014820077 = queryNorm
              0.39409742 = fieldWeight in 132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.203706 = idf(docFreq=1803, maxDocs=44421)
                0.09375 = fieldNorm(doc=132)
          0.04025156 = weight(abstract_txt:processing in 132) [ClassicSimilarity], result of:
            0.04025156 = score(doc=132,freq=1.0), product of:
              0.087178364 = queryWeight, product of:
                1.1944157 = boost
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.014820077 = queryNorm
              0.46171504 = fieldWeight in 132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.09375 = fieldNorm(doc=132)
          0.04025156 = weight(abstract_txt:full in 132) [ClassicSimilarity], result of:
            0.04025156 = score(doc=132,freq=1.0), product of:
              0.087178364 = queryWeight, product of:
                1.1944157 = boost
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.014820077 = queryNorm
              0.46171504 = fieldWeight in 132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.09375 = fieldNorm(doc=132)
          0.05704103 = weight(abstract_txt:creating in 132) [ClassicSimilarity], result of:
            0.05704103 = score(doc=132,freq=1.0), product of:
              0.10998795 = queryWeight, product of:
                1.341602 = boost
                5.531857 = idf(docFreq=477, maxDocs=44421)
                0.014820077 = queryNorm
              0.5186116 = fieldWeight in 132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.531857 = idf(docFreq=477, maxDocs=44421)
                0.09375 = fieldNorm(doc=132)
          0.13421282 = weight(abstract_txt:expensive in 132) [ClassicSimilarity], result of:
            0.13421282 = score(doc=132,freq=1.0), product of:
              0.19457315 = queryWeight, product of:
                1.7844015 = boost
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.014820077 = queryNorm
              0.68978083 = fieldWeight in 132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.09375 = fieldNorm(doc=132)
          0.07186347 = weight(abstract_txt:learning in 132) [ClassicSimilarity], result of:
            0.07186347 = score(doc=132,freq=1.0), product of:
              0.16164751 = queryWeight, product of:
                2.3001208 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.014820077 = queryNorm
              0.444569 = fieldWeight in 132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.09375 = fieldNorm(doc=132)
        0.29166666 = coord(7/24)
    
  3. Xing, F.Z.; Pallucchini, F.; Cambria, E.: Cognitive-inspired domain adaptation of sentiment lexicons (2019) 0.12
    0.1189852 = sum of:
      0.1189852 = product of:
        0.7139112 = sum of:
          0.04522304 = weight(abstract_txt:build in 104) [ClassicSimilarity], result of:
            0.04522304 = score(doc=104,freq=1.0), product of:
              0.12345847 = queryWeight, product of:
                1.4213845 = boost
                5.860826 = idf(docFreq=343, maxDocs=44421)
                0.014820077 = queryNorm
              0.36630163 = fieldWeight in 104, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.860826 = idf(docFreq=343, maxDocs=44421)
                0.0625 = fieldNorm(doc=104)
          0.30732864 = weight(abstract_txt:lexicons in 104) [ClassicSimilarity], result of:
            0.30732864 = score(doc=104,freq=4.0), product of:
              0.27903783 = queryWeight, product of:
                2.1368926 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.014820077 = queryNorm
              1.101387 = fieldWeight in 104, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.0625 = fieldNorm(doc=104)
          0.06775353 = weight(abstract_txt:learning in 104) [ClassicSimilarity], result of:
            0.06775353 = score(doc=104,freq=2.0), product of:
              0.16164751 = queryWeight, product of:
                2.3001208 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.014820077 = queryNorm
              0.41914365 = fieldWeight in 104, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.0625 = fieldNorm(doc=104)
          0.29360595 = weight(abstract_txt:lexicon in 104) [ClassicSimilarity], result of:
            0.29360595 = score(doc=104,freq=2.0), product of:
              0.4296594 = queryWeight, product of:
                3.7499743 = boost
                7.731176 = idf(docFreq=52, maxDocs=44421)
                0.014820077 = queryNorm
              0.68334585 = fieldWeight in 104, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.731176 = idf(docFreq=52, maxDocs=44421)
                0.0625 = fieldNorm(doc=104)
        0.16666667 = coord(4/24)
    
  4. Witten, I.H.; Bainbridge, M.; Nichols, D.M.: How to build a digital library (2010) 0.10
    0.1044604 = sum of:
      0.1044604 = product of:
        0.3133812 = sum of:
          0.016132753 = weight(abstract_txt:present in 27) [ClassicSimilarity], result of:
            0.016132753 = score(doc=27,freq=1.0), product of:
              0.067880966 = queryWeight, product of:
                1.0539625 = boost
                4.3458266 = idf(docFreq=1564, maxDocs=44421)
                0.014820077 = queryNorm
              0.23766239 = fieldWeight in 27, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3458266 = idf(docFreq=1564, maxDocs=44421)
                0.0546875 = fieldNorm(doc=27)
          0.033205844 = weight(abstract_txt:full in 27) [ClassicSimilarity], result of:
            0.033205844 = score(doc=27,freq=2.0), product of:
              0.087178364 = queryWeight, product of:
                1.1944157 = boost
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.014820077 = queryNorm
              0.38089547 = fieldWeight in 27, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.0546875 = fieldNorm(doc=27)
          0.0325041 = weight(abstract_txt:comprehensive in 27) [ClassicSimilarity], result of:
            0.0325041 = score(doc=27,freq=1.0), product of:
              0.10828487 = queryWeight, product of:
                1.3311746 = boost
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.014820077 = queryNorm
              0.30017212 = fieldWeight in 27, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.0546875 = fieldNorm(doc=27)
          0.033086512 = weight(abstract_txt:automatically in 27) [ClassicSimilarity], result of:
            0.033086512 = score(doc=27,freq=1.0), product of:
              0.10957454 = queryWeight, product of:
                1.3390783 = boost
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.014820077 = queryNorm
              0.30195436 = fieldWeight in 27, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.0546875 = fieldNorm(doc=27)
          0.03327393 = weight(abstract_txt:creating in 27) [ClassicSimilarity], result of:
            0.03327393 = score(doc=27,freq=1.0), product of:
              0.10998795 = queryWeight, product of:
                1.341602 = boost
                5.531857 = idf(docFreq=477, maxDocs=44421)
                0.014820077 = queryNorm
              0.30252343 = fieldWeight in 27, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.531857 = idf(docFreq=477, maxDocs=44421)
                0.0546875 = fieldNorm(doc=27)
          0.03957016 = weight(abstract_txt:build in 27) [ClassicSimilarity], result of:
            0.03957016 = score(doc=27,freq=1.0), product of:
              0.12345847 = queryWeight, product of:
                1.4213845 = boost
                5.860826 = idf(docFreq=343, maxDocs=44421)
                0.014820077 = queryNorm
              0.32051393 = fieldWeight in 27, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.860826 = idf(docFreq=343, maxDocs=44421)
                0.0546875 = fieldNorm(doc=27)
          0.05852615 = weight(abstract_txt:maintain in 27) [ClassicSimilarity], result of:
            0.05852615 = score(doc=27,freq=1.0), product of:
              0.16026634 = queryWeight, product of:
                1.6194677 = boost
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.014820077 = queryNorm
              0.36518055 = fieldWeight in 27, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.0546875 = fieldNorm(doc=27)
          0.067081705 = weight(abstract_txt:self in 27) [ClassicSimilarity], result of:
            0.067081705 = score(doc=27,freq=1.0), product of:
              0.22115074 = queryWeight, product of:
                2.6903596 = boost
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.014820077 = queryNorm
              0.3033302 = fieldWeight in 27, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.0546875 = fieldNorm(doc=27)
        0.33333334 = coord(8/24)
    
  5. Ko, Y.; Seo, J.: Text classification from unlabeled documents with bootstrapping and feature projection techniques (2009) 0.10
    0.09578587 = sum of:
      0.09578587 = product of:
        0.38314348 = sum of:
          0.03149597 = weight(abstract_txt:documents in 3452) [ClassicSimilarity], result of:
            0.03149597 = score(doc=3452,freq=4.0), product of:
              0.061107952 = queryWeight, product of:
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.014820077 = queryNorm
              0.51541525 = fieldWeight in 3452, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=3452)
          0.016687073 = weight(abstract_txt:however in 3452) [ClassicSimilarity], result of:
            0.016687073 = score(doc=3452,freq=1.0), product of:
              0.06351376 = queryWeight, product of:
                1.0194949 = boost
                4.203706 = idf(docFreq=1803, maxDocs=44421)
                0.014820077 = queryNorm
              0.2627316 = fieldWeight in 3452, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.203706 = idf(docFreq=1803, maxDocs=44421)
                0.0625 = fieldNorm(doc=3452)
          0.053475875 = weight(abstract_txt:automatically in 3452) [ClassicSimilarity], result of:
            0.053475875 = score(doc=3452,freq=2.0), product of:
              0.10957454 = queryWeight, product of:
                1.3390783 = boost
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.014820077 = queryNorm
              0.48803192 = fieldWeight in 3452, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.0625 = fieldNorm(doc=3452)
          0.056502275 = weight(abstract_txt:accurate in 3452) [ClassicSimilarity], result of:
            0.056502275 = score(doc=3452,freq=1.0), product of:
              0.14321604 = queryWeight, product of:
                1.5309006 = boost
                6.312396 = idf(docFreq=218, maxDocs=44421)
                0.014820077 = queryNorm
              0.39452475 = fieldWeight in 3452, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.312396 = idf(docFreq=218, maxDocs=44421)
                0.0625 = fieldNorm(doc=3452)
          0.08947522 = weight(abstract_txt:expensive in 3452) [ClassicSimilarity], result of:
            0.08947522 = score(doc=3452,freq=1.0), product of:
              0.19457315 = queryWeight, product of:
                1.7844015 = boost
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.014820077 = queryNorm
              0.4598539 = fieldWeight in 3452, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.0625 = fieldNorm(doc=3452)
          0.13550706 = weight(abstract_txt:learning in 3452) [ClassicSimilarity], result of:
            0.13550706 = score(doc=3452,freq=8.0), product of:
              0.16164751 = queryWeight, product of:
                2.3001208 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.014820077 = queryNorm
              0.8382873 = fieldWeight in 3452, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.0625 = fieldNorm(doc=3452)
        0.25 = coord(6/24)
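
The content scores combine per-term weights with a coordination factor. As a rough sketch, the first hit (doc 5245) can be recomputed from the numbers in its explanation, assuming queryWeight = boost * idf * queryNorm, fieldWeight = sqrt(termFreq) * idf * fieldNorm, and a final multiplication by coord = matched terms / query terms. The helper names are hypothetical; only the numeric inputs are taken from the listing.

```python
import math

QUERY_NORM = 0.014820077   # queryNorm from the explanation
FIELD_NORM = 0.0546875     # fieldNorm(doc=5245)

# (term, boost, idf, termFreq) copied from the explanation for doc 5245.
TERMS_DOC_5245 = [
    ("however",       1.0194949, 4.203706,  2.0),
    ("processing",    1.1944157, 4.9249606, 1.0),
    ("automatically", 1.3390783, 5.521451,  1.0),
    ("build",         1.4213845, 5.860826,  1.0),
    ("lexicons",      2.1368926, 8.811096,  2.0),
    ("lexicon",       3.7499743, 7.731176,  2.0),
]


def term_score(boost: float, idf: float, term_freq: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(term_freq) * idf * FIELD_NORM
    return query_weight * field_weight


def document_score(terms, query_term_count: int) -> float:
    """Sum of term scores, scaled by coord = matched / total query terms."""
    raw_sum = sum(term_score(b, idf, tf) for _, b, idf, tf in terms)
    coord = len(terms) / query_term_count
    return raw_sum * coord


raw_sum_5245 = sum(term_score(b, idf, tf) for _, b, idf, tf in TERMS_DOC_5245)
score_5245 = document_score(TERMS_DOC_5245, query_term_count=24)
print(round(raw_sum_5245, 6), round(score_5245, 6))
```

The coord factor explains why doc 27 ranks below doc 104 despite matching more terms (8 of 24 versus 4 of 24): its raw sum of term scores is much lower, because it lacks the rare, highly boosted terms "lexicon" and "lexicons".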