Document (#34038)

Author
Savoy, J.
Title
Searching strategies for the Hungarian language
Source
Information processing and management. 44(2008) no.1, pp. 310-324
Year
2008
Abstract
This paper reports on the underlying IR problems encountered when dealing with the complex morphology and compound constructions found in the Hungarian language. It describes evaluations carried out on two general stemming strategies for this language, and also demonstrates that a light stemming approach could be quite effective. Based on searches done on the CLEF test collection, we find that a more aggressive suffix-stripping approach may produce better MAP. When compared to an IR scheme without stemming or one based on only a light stemmer, we find the differences to be statistically significant. When compared with probabilistic, vector-space and language models, we find that the Okapi model results in the best retrieval effectiveness. The resulting MAP is found to be about 35% better than the classical tf-idf approach, particularly for very short requests. Finally, we demonstrate that applying an automatic decompounding procedure for both queries and documents significantly improves IR performance (+10%), compared to word-based indexing strategies.
Theme
Computational linguistics

Similar documents (author)

  1. Savoy, J.: Stemming of French words based on grammatical categories (1993) 5.21
    5.2088575 = sum of:
      5.2088575 = weight(author_txt:savoy in 4649) [ClassicSimilarity], result of:
        5.2088575 = fieldWeight in 4649, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.334172 = idf(docFreq=28, maxDocs=44421)
          0.625 = fieldNorm(doc=4649)
    
  2. Savoy, J.: Effectiveness of information retrieval systems used in a hypertext environment (1993) 5.21
    5.2088575 = sum of:
      5.2088575 = weight(author_txt:savoy in 6510) [ClassicSimilarity], result of:
        5.2088575 = fieldWeight in 6510, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.334172 = idf(docFreq=28, maxDocs=44421)
          0.625 = fieldNorm(doc=6510)
    
  3. Savoy, J.: A learning scheme for information retrieval in hypertext (1994) 5.21
    5.2088575 = sum of:
      5.2088575 = weight(author_txt:savoy in 7291) [ClassicSimilarity], result of:
        5.2088575 = fieldWeight in 7291, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.334172 = idf(docFreq=28, maxDocs=44421)
          0.625 = fieldNorm(doc=7291)
    
  4. Savoy, J.: Bayesian inference networks and spreading activation in hypertext systems (1992) 5.21
    5.2088575 = sum of:
      5.2088575 = weight(author_txt:savoy in 260) [ClassicSimilarity], result of:
        5.2088575 = fieldWeight in 260, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.334172 = idf(docFreq=28, maxDocs=44421)
          0.625 = fieldNorm(doc=260)
    
  5. Savoy, J.: Searching information in legal hypertext systems (1993/94) 5.21
    5.2088575 = sum of:
      5.2088575 = weight(author_txt:savoy in 825) [ClassicSimilarity], result of:
        5.2088575 = fieldWeight in 825, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.334172 = idf(docFreq=28, maxDocs=44421)
          0.625 = fieldNorm(doc=825)
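
The author scores above all follow the same pattern: fieldWeight = tf x idf x fieldNorm. The following is a minimal sketch that recomputes one of them. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); that idf form was chosen because it reproduces the values printed in this listing, and other Lucene versions may use a slightly different expression. The function names are illustrative and are not part of any Lucene API.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, in the form that matches this listing (assumed)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as laid out in the explain tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Inputs copied from the author_txt:savoy entries above.
author_idf = classic_idf(doc_freq=28, max_docs=44421)
author_score = field_weight(freq=1.0, doc_freq=28, max_docs=44421, field_norm=0.625)

print(f"idf   = {author_idf:.6f}")
print(f"score = {author_score:.6f}")
```

The results should agree with the listed 8.334172 and 5.2088575 up to single-precision rounding. All five author matches share these inputs, which is why they receive identical scores.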
    

Similar documents (content)

  1. Dolamic, L.; Savoy, J.: Indexing and searching strategies for the Russian language (2009) 0.87
    0.86509776 = sum of:
      0.86509776 = product of:
        1.5448174 = sum of:
          0.04968221 = weight(abstract_txt:probabilistic in 288) [ClassicSimilarity], result of:
            0.04968221 = score(doc=288,freq=1.0), product of:
              0.117469504 = queryWeight, product of:
                1.0073137 = boost
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.017233148 = queryNorm
              0.4229371 = fieldWeight in 288, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.0625 = fieldNorm(doc=288)
          0.0718831 = weight(abstract_txt:statistically in 288) [ClassicSimilarity], result of:
            0.0718831 = score(doc=288,freq=2.0), product of:
              0.119270325 = queryWeight, product of:
                1.0150055 = boost
                6.8186655 = idf(docFreq=131, maxDocs=44421)
                0.017233148 = queryNorm
              0.6026906 = fieldWeight in 288, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8186655 = idf(docFreq=131, maxDocs=44421)
                0.0625 = fieldNorm(doc=288)
          0.116639905 = weight(abstract_txt:okapi in 288) [ClassicSimilarity], result of:
            0.116639905 = score(doc=288,freq=2.0), product of:
              0.16469458 = queryWeight, product of:
                1.1927291 = boost
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.017233148 = queryNorm
              0.70821947 = fieldWeight in 288, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.0625 = fieldNorm(doc=288)
          0.1775151 = weight(abstract_txt:stemmer in 288) [ClassicSimilarity], result of:
            0.1775151 = score(doc=288,freq=2.0), product of:
              0.21790712 = queryWeight, product of:
                1.3719487 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.017233148 = queryNorm
              0.8146366 = fieldWeight in 288, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.0625 = fieldNorm(doc=288)
          0.1775151 = weight(abstract_txt:aggressive in 288) [ClassicSimilarity], result of:
            0.1775151 = score(doc=288,freq=2.0), product of:
              0.21790712 = queryWeight, product of:
                1.3719487 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.017233148 = queryNorm
              0.8146366 = fieldWeight in 288, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.0625 = fieldNorm(doc=288)
          0.008482646 = weight(abstract_txt:that in 288) [ClassicSimilarity], result of:
            0.008482646 = score(doc=288,freq=1.0), product of:
              0.057389453 = queryWeight, product of:
                1.4081477 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.017233148 = queryNorm
              0.14780845 = fieldWeight in 288, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=288)
          0.034610294 = weight(abstract_txt:better in 288) [ClassicSimilarity], result of:
            0.034610294 = score(doc=288,freq=1.0), product of:
              0.11630669 = queryWeight, product of:
                1.4174885 = boost
                4.7612453 = idf(docFreq=1032, maxDocs=44421)
                0.017233148 = queryNorm
              0.29757783 = fieldWeight in 288, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7612453 = idf(docFreq=1032, maxDocs=44421)
                0.0625 = fieldNorm(doc=288)
          0.04362232 = weight(abstract_txt:approach in 288) [ClassicSimilarity], result of:
            0.04362232 = score(doc=288,freq=3.0), product of:
              0.107711904 = queryWeight, product of:
                1.6706853 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.017233148 = queryNorm
              0.4049907 = fieldWeight in 288, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.0625 = fieldNorm(doc=288)
          0.11527328 = weight(abstract_txt:light in 288) [ClassicSimilarity], result of:
            0.11527328 = score(doc=288,freq=3.0), product of:
              0.17985114 = queryWeight, product of:
                1.7626812 = boost
                5.920724 = idf(docFreq=323, maxDocs=44421)
                0.017233148 = queryNorm
              0.64093715 = fieldWeight in 288, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.920724 = idf(docFreq=323, maxDocs=44421)
                0.0625 = fieldNorm(doc=288)
          0.0342807 = weight(abstract_txt:when in 288) [ClassicSimilarity], result of:
            0.0342807 = score(doc=288,freq=1.0), product of:
              0.13229133 = queryWeight, product of:
                1.8515204 = boost
                4.1460857 = idf(docFreq=1910, maxDocs=44421)
                0.017233148 = queryNorm
              0.25913036 = fieldWeight in 288, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1460857 = idf(docFreq=1910, maxDocs=44421)
                0.0625 = fieldNorm(doc=288)
          0.055985197 = weight(abstract_txt:find in 288) [ClassicSimilarity], result of:
            0.055985197 = score(doc=288,freq=1.0), product of:
              0.1834624 = queryWeight, product of:
                2.1804008 = boost
                4.8825436 = idf(docFreq=914, maxDocs=44421)
                0.017233148 = queryNorm
              0.30515897 = fieldWeight in 288, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8825436 = idf(docFreq=914, maxDocs=44421)
                0.0625 = fieldNorm(doc=288)
          0.09097269 = weight(abstract_txt:strategies in 288) [ClassicSimilarity], result of:
            0.09097269 = score(doc=288,freq=2.0), product of:
              0.20126224 = queryWeight, product of:
                2.2837257 = boost
                5.113918 = idf(docFreq=725, maxDocs=44421)
                0.017233148 = queryNorm
              0.45201075 = fieldWeight in 288, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.113918 = idf(docFreq=725, maxDocs=44421)
                0.0625 = fieldNorm(doc=288)
          0.080602944 = weight(abstract_txt:language in 288) [ClassicSimilarity], result of:
            0.080602944 = score(doc=288,freq=3.0), product of:
              0.17851363 = queryWeight, product of:
                2.4835212 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.017233148 = queryNorm
              0.45152265 = fieldWeight in 288, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.0625 = fieldNorm(doc=288)
          0.487752 = weight(abstract_txt:stemming in 288) [ClassicSimilarity], result of:
            0.487752 = score(doc=288,freq=6.0), product of:
              0.42747813 = queryWeight, product of:
                3.3282793 = boost
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.017233148 = queryNorm
              1.1409987 = fieldWeight in 288, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.0625 = fieldNorm(doc=288)
        0.56 = coord(14/25)
    
  2. Brychcín, T.; Konopík, M.: HPS: High precision stemmer (2015) 0.42
    0.42457318 = sum of:
      0.42457318 = product of:
        1.1793699 = sum of:
          0.12552214 = weight(abstract_txt:stemmer in 3686) [ClassicSimilarity], result of:
            0.12552214 = score(doc=3686,freq=1.0), product of:
              0.21790712 = queryWeight, product of:
                1.3719487 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.017233148 = queryNorm
              0.5760351 = fieldWeight in 3686, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.0625 = fieldNorm(doc=3686)
          0.014692373 = weight(abstract_txt:that in 3686) [ClassicSimilarity], result of:
            0.014692373 = score(doc=3686,freq=3.0), product of:
              0.057389453 = queryWeight, product of:
                1.4081477 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.017233148 = queryNorm
              0.25601172 = fieldWeight in 3686, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=3686)
          0.021937672 = weight(abstract_txt:based in 3686) [ClassicSimilarity], result of:
            0.021937672 = score(doc=3686,freq=2.0), product of:
              0.07797379 = queryWeight, product of:
                1.4214681 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.017233148 = queryNorm
              0.28134674 = fieldWeight in 3686, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0625 = fieldNorm(doc=3686)
          0.04362232 = weight(abstract_txt:approach in 3686) [ClassicSimilarity], result of:
            0.04362232 = score(doc=3686,freq=3.0), product of:
              0.107711904 = queryWeight, product of:
                1.6706853 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.017233148 = queryNorm
              0.4049907 = fieldWeight in 3686, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.0625 = fieldNorm(doc=3686)
          0.04848023 = weight(abstract_txt:when in 3686) [ClassicSimilarity], result of:
            0.04848023 = score(doc=3686,freq=2.0), product of:
              0.13229133 = queryWeight, product of:
                1.8515204 = boost
                4.1460857 = idf(docFreq=1910, maxDocs=44421)
                0.017233148 = queryNorm
              0.36646566 = fieldWeight in 3686, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1460857 = idf(docFreq=1910, maxDocs=44421)
                0.0625 = fieldNorm(doc=3686)
          0.085507214 = weight(abstract_txt:compared in 3686) [ClassicSimilarity], result of:
            0.085507214 = score(doc=3686,freq=2.0), product of:
              0.1931183 = queryWeight, product of:
                2.2370439 = boost
                5.0093837 = idf(docFreq=805, maxDocs=44421)
                0.017233148 = queryNorm
              0.44277114 = fieldWeight in 3686, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.0093837 = idf(docFreq=805, maxDocs=44421)
                0.0625 = fieldNorm(doc=3686)
          0.046536133 = weight(abstract_txt:language in 3686) [ClassicSimilarity], result of:
            0.046536133 = score(doc=3686,freq=1.0), product of:
              0.17851363 = queryWeight, product of:
                2.4835212 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.017233148 = queryNorm
              0.26068673 = fieldWeight in 3686, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.0625 = fieldNorm(doc=3686)
          0.2662394 = weight(abstract_txt:hungarian in 3686) [ClassicSimilarity], result of:
            0.2662394 = score(doc=3686,freq=1.0), product of:
              0.4532273 = queryWeight, product of:
                2.7981772 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.017233148 = queryNorm
              0.5874302 = fieldWeight in 3686, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.0625 = fieldNorm(doc=3686)
          0.5268324 = weight(abstract_txt:stemming in 3686) [ClassicSimilarity], result of:
            0.5268324 = score(doc=3686,freq=7.0), product of:
              0.42747813 = queryWeight, product of:
                3.3282793 = boost
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.017233148 = queryNorm
              1.2324195 = fieldWeight in 3686, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.0625 = fieldNorm(doc=3686)
        0.36 = coord(9/25)
    
  3. Nagy T., I.: Detecting multiword expressions and named entities in natural language texts (2014) 0.29
    0.2931904 = sum of:
      0.2931904 = product of:
        0.73297596 = sum of:
          0.04221989 = weight(abstract_txt:compound in 2536) [ClassicSimilarity], result of:
            0.04221989 = score(doc=2536,freq=1.0), product of:
              0.14417255 = queryWeight, product of:
                1.1159468 = boost
                7.496775 = idf(docFreq=66, maxDocs=44421)
                0.017233148 = queryNorm
              0.29284278 = fieldWeight in 2536, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.496775 = idf(docFreq=66, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
          0.13676164 = weight(abstract_txt:constructions in 2536) [ClassicSimilarity], result of:
            0.13676164 = score(doc=2536,freq=5.0), product of:
              0.18458258 = queryWeight, product of:
                1.2626923 = boost
                8.482592 = idf(docFreq=24, maxDocs=44421)
                0.017233148 = queryNorm
              0.7409239 = fieldWeight in 2536, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                8.482592 = idf(docFreq=24, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
          0.012986347 = weight(abstract_txt:that in 2536) [ClassicSimilarity], result of:
            0.012986347 = score(doc=2536,freq=6.0), product of:
              0.057389453 = queryWeight, product of:
                1.4081477 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.017233148 = queryNorm
              0.22628456 = fieldWeight in 2536, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
          0.021631433 = weight(abstract_txt:better in 2536) [ClassicSimilarity], result of:
            0.021631433 = score(doc=2536,freq=1.0), product of:
              0.11630669 = queryWeight, product of:
                1.4174885 = boost
                4.7612453 = idf(docFreq=1032, maxDocs=44421)
                0.017233148 = queryNorm
              0.18598615 = fieldWeight in 2536, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7612453 = idf(docFreq=1032, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
          0.027422091 = weight(abstract_txt:based in 2536) [ClassicSimilarity], result of:
            0.027422091 = score(doc=2536,freq=8.0), product of:
              0.07797379 = queryWeight, product of:
                1.4214681 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.017233148 = queryNorm
              0.35168344 = fieldWeight in 2536, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
          0.022260921 = weight(abstract_txt:approach in 2536) [ClassicSimilarity], result of:
            0.022260921 = score(doc=2536,freq=2.0), product of:
              0.107711904 = queryWeight, product of:
                1.6706853 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.017233148 = queryNorm
              0.20667094 = fieldWeight in 2536, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
          0.09301074 = weight(abstract_txt:light in 2536) [ClassicSimilarity], result of:
            0.09301074 = score(doc=2536,freq=5.0), product of:
              0.17985114 = queryWeight, product of:
                1.7626812 = boost
                5.920724 = idf(docFreq=323, maxDocs=44421)
                0.017233148 = queryNorm
              0.517154 = fieldWeight in 2536, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.920724 = idf(docFreq=323, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
          0.030300144 = weight(abstract_txt:when in 2536) [ClassicSimilarity], result of:
            0.030300144 = score(doc=2536,freq=2.0), product of:
              0.13229133 = queryWeight, product of:
                1.8515204 = boost
                4.1460857 = idf(docFreq=1910, maxDocs=44421)
                0.017233148 = queryNorm
              0.22904104 = fieldWeight in 2536, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1460857 = idf(docFreq=1910, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
          0.05817017 = weight(abstract_txt:language in 2536) [ClassicSimilarity], result of:
            0.05817017 = score(doc=2536,freq=4.0), product of:
              0.17851363 = queryWeight, product of:
                2.4835212 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.017233148 = queryNorm
              0.3258584 = fieldWeight in 2536, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
          0.2882126 = weight(abstract_txt:hungarian in 2536) [ClassicSimilarity], result of:
            0.2882126 = score(doc=2536,freq=3.0), product of:
              0.4532273 = queryWeight, product of:
                2.7981772 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.017233148 = queryNorm
              0.6359118 = fieldWeight in 2536, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
        0.4 = coord(10/25)
    
  4. Kettunen, K.; Kunttu, T.; Järvelin, K.: To stem or lemmatize a highly inflectional language in a probabilistic IR environment? (2005) 0.26
    0.2649697 = sum of:
      0.2649697 = product of:
        0.73602694 = sum of:
          0.061478596 = weight(abstract_txt:probabilistic in 5395) [ClassicSimilarity], result of:
            0.061478596 = score(doc=5395,freq=2.0), product of:
              0.117469504 = queryWeight, product of:
                1.0073137 = boost
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.017233148 = queryNorm
              0.5233579 = fieldWeight in 5395, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5395)
          0.0444754 = weight(abstract_txt:statistically in 5395) [ClassicSimilarity], result of:
            0.0444754 = score(doc=5395,freq=1.0), product of:
              0.119270325 = queryWeight, product of:
                1.0150055 = boost
                6.8186655 = idf(docFreq=131, maxDocs=44421)
                0.017233148 = queryNorm
              0.37289578 = fieldWeight in 5395, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8186655 = idf(docFreq=131, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5395)
          0.059107844 = weight(abstract_txt:compound in 5395) [ClassicSimilarity], result of:
            0.059107844 = score(doc=5395,freq=1.0), product of:
              0.14417255 = queryWeight, product of:
                1.1159468 = boost
                7.496775 = idf(docFreq=66, maxDocs=44421)
                0.017233148 = queryNorm
              0.40997988 = fieldWeight in 5395, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.496775 = idf(docFreq=66, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5395)
          0.15532571 = weight(abstract_txt:stemmer in 5395) [ClassicSimilarity], result of:
            0.15532571 = score(doc=5395,freq=2.0), product of:
              0.21790712 = queryWeight, product of:
                1.3719487 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.017233148 = queryNorm
              0.712807 = fieldWeight in 5395, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5395)
          0.012855826 = weight(abstract_txt:that in 5395) [ClassicSimilarity], result of:
            0.012855826 = score(doc=5395,freq=3.0), product of:
              0.057389453 = queryWeight, product of:
                1.4081477 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.017233148 = queryNorm
              0.22401026 = fieldWeight in 5395, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5395)
          0.022037188 = weight(abstract_txt:approach in 5395) [ClassicSimilarity], result of:
            0.022037188 = score(doc=5395,freq=1.0), product of:
              0.107711904 = queryWeight, product of:
                1.6706853 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.017233148 = queryNorm
              0.20459381 = fieldWeight in 5395, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5395)
          0.05290489 = weight(abstract_txt:compared in 5395) [ClassicSimilarity], result of:
            0.05290489 = score(doc=5395,freq=1.0), product of:
              0.1931183 = queryWeight, product of:
                2.2370439 = boost
                5.0093837 = idf(docFreq=805, maxDocs=44421)
                0.017233148 = queryNorm
              0.27395067 = fieldWeight in 5395, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0093837 = idf(docFreq=805, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5395)
          0.081438236 = weight(abstract_txt:language in 5395) [ClassicSimilarity], result of:
            0.081438236 = score(doc=5395,freq=4.0), product of:
              0.17851363 = queryWeight, product of:
                2.4835212 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.017233148 = queryNorm
              0.45620176 = fieldWeight in 5395, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5395)
          0.24640328 = weight(abstract_txt:stemming in 5395) [ClassicSimilarity], result of:
            0.24640328 = score(doc=5395,freq=2.0), product of:
              0.42747813 = queryWeight, product of:
                3.3282793 = boost
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.017233148 = queryNorm
              0.5764114 = fieldWeight in 5395, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.0546875 = fieldNorm(doc=5395)
        0.36 = coord(9/25)
    
  5. Fautsch, C.; Savoy, J.: Algorithmic stemmers or morphological analysis? : an evaluation (2009) 0.20
    0.20416728 = sum of:
      0.20416728 = product of:
        0.72916883 = sum of:
          0.060759854 = weight(abstract_txt:improves in 3950) [ClassicSimilarity], result of:
            0.060759854 = score(doc=3950,freq=1.0), product of:
              0.1157699 = queryWeight, product of:
                6.717861 = idf(docFreq=145, maxDocs=44421)
                0.017233148 = queryNorm
              0.5248329 = fieldWeight in 3950, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.717861 = idf(docFreq=145, maxDocs=44421)
                0.078125 = fieldNorm(doc=3950)
          0.15690267 = weight(abstract_txt:stemmer in 3950) [ClassicSimilarity], result of:
            0.15690267 = score(doc=3950,freq=1.0), product of:
              0.21790712 = queryWeight, product of:
                1.3719487 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.017233148 = queryNorm
              0.72004384 = fieldWeight in 3950, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.078125 = fieldNorm(doc=3950)
          0.014995342 = weight(abstract_txt:that in 3950) [ClassicSimilarity], result of:
            0.014995342 = score(doc=3950,freq=2.0), product of:
              0.057389453 = queryWeight, product of:
                1.4081477 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.017233148 = queryNorm
              0.2612909 = fieldWeight in 3950, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.078125 = fieldNorm(doc=3950)
          0.019390346 = weight(abstract_txt:based in 3950) [ClassicSimilarity], result of:
            0.019390346 = score(doc=3950,freq=1.0), product of:
              0.07797379 = queryWeight, product of:
                1.4214681 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.017233148 = queryNorm
              0.24867775 = fieldWeight in 3950, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.078125 = fieldNorm(doc=3950)
          0.042850874 = weight(abstract_txt:when in 3950) [ClassicSimilarity], result of:
            0.042850874 = score(doc=3950,freq=1.0), product of:
              0.13229133 = queryWeight, product of:
                1.8515204 = boost
                4.1460857 = idf(docFreq=1910, maxDocs=44421)
                0.017233148 = queryNorm
              0.32391295 = fieldWeight in 3950, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1460857 = idf(docFreq=1910, maxDocs=44421)
                0.078125 = fieldNorm(doc=3950)
          0.08226504 = weight(abstract_txt:language in 3950) [ClassicSimilarity], result of:
            0.08226504 = score(doc=3950,freq=2.0), product of:
              0.17851363 = queryWeight, product of:
                2.4835212 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.017233148 = queryNorm
              0.46083337 = fieldWeight in 3950, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.078125 = fieldNorm(doc=3950)
          0.35200468 = weight(abstract_txt:stemming in 3950) [ClassicSimilarity], result of:
            0.35200468 = score(doc=3950,freq=2.0), product of:
              0.42747813 = queryWeight, product of:
                3.3282793 = boost
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.017233148 = queryNorm
              0.82344484 = fieldWeight in 3950, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.078125 = fieldNorm(doc=3950)
        0.28 = coord(7/25)
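
The content scores above combine three steps: each matching term contributes queryWeight x fieldWeight, the contributions are summed, and the sum is scaled by coord (matched query terms over total query terms). The sketch below recomputes the "stemming" contribution and the final score of the first entry from the numbers shown there. It assumes idf = 1 + ln(maxDocs / (docFreq + 1)), which matches the printed values; the function names are illustrative and are not Lucene's own.

```python
import math


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """One term's contribution: queryWeight * fieldWeight."""
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # assumed idf form
    query_weight = boost * idf * query_norm           # query side
    field_weight = math.sqrt(freq) * idf * field_norm  # document side
    return query_weight * field_weight


def coord_scaled(term_sum, matched_terms, total_terms):
    """Scale the summed term scores by coord = matched / total."""
    return term_sum * matched_terms / total_terms


# abstract_txt:stemming in doc 288 (values copied from the first entry).
stemming = term_score(
    freq=6.0,
    doc_freq=69,
    max_docs=44421,
    boost=3.3282793,
    query_norm=0.017233148,
    field_norm=0.0625,
)

# Final score of the first entry: listed term sum times coord(14/25).
final_score = coord_scaled(1.5448174, matched_terms=14, total_terms=25)

print(f"stemming term score = {stemming:.6f}")
print(f"final score         = {final_score:.6f}")
```

The values should agree with the listed 0.487752 and 0.86509776 up to single-precision rounding. Because idf enters both queryWeight and fieldWeight, rare terms such as "stemming" or "hungarian" dominate the ranking, while frequent words such as "that" contribute very little.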