Document (#30602)

Author
Talvensaari, T.
Laurikkala, J.
Järvelin, K.
Juhola, M.
Title
A study on automatic creation of a comparable document collection in cross-language information retrieval
Source
Journal of documentation. 62(2006) no.3, pp. 372-387
Year
2006
Abstract
Purpose - To present a method for creating a comparable document collection from two document collections in different languages. Design/methodology/approach - The best query keys were extracted from a Finnish source collection (articles of the newspaper Aamulehti) with the relative average term frequency formula. The keys were translated into English with a dictionary-based query translation program. The resulting lists of words were used as queries that were run against the target collection (Los Angeles Times articles) with the nearest neighbor method. The documents were aligned with unrestricted and date-restricted alignment schemes, which were also combined. Findings - The combined alignment scheme was found to be the best when the relatedness of the document pairs was assessed with a five-degree relevance scale. Of the 400 document pairs, roughly 40 percent were highly or fairly related and 75 percent included at least lexical similarity. Research limitations/implications - The number of alignment pairs was small due to the short common time period of the two collections and their geographical (and thus topical) remoteness. In the future, our aim is to build larger comparable corpora in various languages and use them as a source of translation knowledge for the purposes of cross-language information retrieval (CLIR). Practical implications - Readily available parallel corpora are scarce. With this method, two unrelated document collections can relatively easily be aligned to create a CLIR resource. Originality/value - The method can be applied to weakly linked collections and morphologically complex languages, such as Finnish.
Theme
Retrieval studies
Multilingual problems

Similar documents (author)

  1. Järvelin, K.: An analysis of two approaches in information retrieval : from frameworks to study designs (2007) 4.99
    4.989572 = sum of:
      4.989572 = weight(author_txt:järvelin in 326) [ClassicSimilarity], result of:
        4.989572 = fieldWeight in 326, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.983315 = idf(docFreq=40, maxDocs=44218)
          0.625 = fieldNorm(doc=326)
    
  2. Järvelin, K.: Evaluation (2011) 4.99
    4.989572 = sum of:
      4.989572 = weight(author_txt:järvelin in 548) [ClassicSimilarity], result of:
        4.989572 = fieldWeight in 548, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.983315 = idf(docFreq=40, maxDocs=44218)
          0.625 = fieldNorm(doc=548)
    
  3. Järvelin, K.; Vakkari, P.: The evolution of library and information science 1965-1985 : a content analysis of journal titles (1993) 3.99
    3.9916575 = sum of:
      3.9916575 = weight(author_txt:järvelin in 4649) [ClassicSimilarity], result of:
        3.9916575 = fieldWeight in 4649, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.983315 = idf(docFreq=40, maxDocs=44218)
          0.5 = fieldNorm(doc=4649)
    
  4. Kristensen, J.; Järvelin, K.: The effectiveness of a searching thesaurus in free-text searching in a full-text database (1990) 3.99
    3.9916575 = sum of:
      3.9916575 = weight(author_txt:järvelin in 2043) [ClassicSimilarity], result of:
        3.9916575 = fieldWeight in 2043, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.983315 = idf(docFreq=40, maxDocs=44218)
          0.5 = fieldNorm(doc=2043)
    
  5. Pirkola, A.; Järvelin, K.: Employing the resolution power of search keys (2001) 3.99
    3.9916575 = sum of:
      3.9916575 = weight(author_txt:järvelin in 5907) [ClassicSimilarity], result of:
        3.9916575 = fieldWeight in 5907, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.983315 = idf(docFreq=40, maxDocs=44218)
          0.5 = fieldNorm(doc=5907)
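
The author-match scores above follow the usual Lucene ClassicSimilarity pattern: fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The sketch below recomputes the first entry as an illustration only. The function names are hypothetical, and Python's double-precision arithmetic will differ from Lucene's 32-bit floats in the last digits.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm (hypothetical helper name)."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the tree for doc 326 (author_txt:järvelin).
weight_326 = field_weight(freq=1.0, doc_freq=40, max_docs=44218, field_norm=0.625)
# Same term with the lower fieldNorm listed for doc 4649.
weight_4649 = field_weight(freq=1.0, doc_freq=40, max_docs=44218, field_norm=0.5)

print(weight_326, weight_4649)  # close to the listed 4.989572 and 3.9916575
```

The only factor that differs between the 4.99 and 3.99 entries is fieldNorm (0.625 versus 0.5), which is how the tree encodes the length of the author field.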
    

Similar documents (content)
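
In the content-based trees of this section, each matching abstract term contributes queryWeight * fieldWeight, where queryWeight = boost * idf * queryNorm and fieldWeight = tf * idf * fieldNorm. The per-term contributions are summed and multiplied by coord(matched/total). The sketch below recomputes the "query" term and the final score of the first entry (doc 3296) from the numbers printed in its tree. The helper names are hypothetical, and the results agree with the listing only up to float rounding.

```python
import math

MAX_DOCS = 44218          # maxDocs as printed in the trees
QUERY_NORM = 0.018217541  # queryNorm as printed in the trees


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    """Contribution of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    fld_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * fld_weight


def final_score(sum_of_terms: float, matched: int, total: int) -> float:
    """Apply the coordination factor coord(matched/total)."""
    return sum_of_terms * (matched / total)


# "query" in doc 3296: freq=8, docFreq=1035, boost=1.0607182, fieldNorm=0.0546875
query_term = term_score(8.0, 1035, 1.0607182, 0.0546875)

# Sum of the 13 matching term scores as printed (1.1635505), coord(13/25)
score_3296 = final_score(1.1635505, 13, 25)

print(query_term, score_3296)  # close to the listed 0.06754577 and 0.6050462
```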

  1. Dadashkarimi, J.; Shakery, A.; Faili, H.; Zamani, H.: An expectation-maximization algorithm for query translation based on pseudo-relevant documents (2017) 0.61
    0.6050462 = sum of:
      0.6050462 = product of:
        1.1635505 = sum of:
          0.06754577 = weight(abstract_txt:query in 3296) [ClassicSimilarity], result of:
            0.06754577 = score(doc=3296,freq=8.0), product of:
              0.09186021 = queryWeight, product of:
                1.0607182 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.018217541 = queryNorm
              0.73531044 = fieldWeight in 3296, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3296)
          0.051067583 = weight(abstract_txt:source in 3296) [ClassicSimilarity], result of:
            0.051067583 = score(doc=3296,freq=3.0), product of:
              0.105717875 = queryWeight, product of:
                1.1379168 = boost
                5.0997415 = idf(docFreq=732, maxDocs=44218)
                0.018217541 = queryNorm
              0.48305532 = fieldWeight in 3296, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.0997415 = idf(docFreq=732, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3296)
          0.039061733 = weight(abstract_txt:cross in 3296) [ClassicSimilarity], result of:
            0.039061733 = score(doc=3296,freq=1.0), product of:
              0.12752432 = queryWeight, product of:
                1.2497778 = boost
                5.601063 = idf(docFreq=443, maxDocs=44218)
                0.018217541 = queryNorm
              0.30630812 = fieldWeight in 3296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.601063 = idf(docFreq=443, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3296)
          0.15705884 = weight(abstract_txt:translation in 3296) [ClassicSimilarity], result of:
            0.15705884 = score(doc=3296,freq=9.0), product of:
              0.1550194 = queryWeight, product of:
                1.3779368 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.018217541 = queryNorm
              1.0131559 = fieldWeight in 3296, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3296)
          0.10894806 = weight(abstract_txt:corpora in 3296) [ClassicSimilarity], result of:
            0.10894806 = score(doc=3296,freq=2.0), product of:
              0.20055261 = queryWeight, product of:
                1.567294 = boost
                7.0240583 = idf(docFreq=106, maxDocs=44218)
                0.018217541 = queryNorm
              0.5432393 = fieldWeight in 3296, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.0240583 = idf(docFreq=106, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3296)
          0.020833993 = weight(abstract_txt:with in 3296) [ClassicSimilarity], result of:
            0.020833993 = score(doc=3296,freq=4.0), product of:
              0.076201014 = queryWeight, product of:
                1.6733135 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.018217541 = queryNorm
              0.27340835 = fieldWeight in 3296, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3296)
          0.04656515 = weight(abstract_txt:languages in 3296) [ClassicSimilarity], result of:
            0.04656515 = score(doc=3296,freq=1.0), product of:
              0.1641206 = queryWeight, product of:
                1.7364547 = boost
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.018217541 = queryNorm
              0.2837252 = fieldWeight in 3296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3296)
          0.17336322 = weight(abstract_txt:clir in 3296) [ClassicSimilarity], result of:
            0.17336322 = score(doc=3296,freq=2.0), product of:
              0.27335057 = queryWeight, product of:
                1.8297691 = boost
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.018217541 = queryNorm
              0.6342157 = fieldWeight in 3296, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3296)
          0.18892775 = weight(abstract_txt:aligned in 3296) [ClassicSimilarity], result of:
            0.18892775 = score(doc=3296,freq=2.0), product of:
              0.28947595 = queryWeight, product of:
                1.8829663 = boost
                8.43879 = idf(docFreq=25, maxDocs=44218)
                0.018217541 = queryNorm
              0.6526544 = fieldWeight in 3296, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.43879 = idf(docFreq=25, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3296)
          0.0810798 = weight(abstract_txt:method in 3296) [ClassicSimilarity], result of:
            0.0810798 = score(doc=3296,freq=4.0), product of:
              0.16469881 = queryWeight, product of:
                2.008614 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.018217541 = queryNorm
              0.4922914 = fieldWeight in 3296, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3296)
          0.044658996 = weight(abstract_txt:collection in 3296) [ClassicSimilarity], result of:
            0.044658996 = score(doc=3296,freq=1.0), product of:
              0.1756742 = queryWeight, product of:
                2.074461 = boost
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.018217541 = queryNorm
              0.25421488 = fieldWeight in 3296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3296)
          0.079636216 = weight(abstract_txt:collections in 3296) [ClassicSimilarity], result of:
            0.079636216 = score(doc=3296,freq=3.0), product of:
              0.17911637 = queryWeight, product of:
                2.094686 = boost
                4.693822 = idf(docFreq=1099, maxDocs=44218)
                0.018217541 = queryNorm
              0.444606 = fieldWeight in 3296, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.693822 = idf(docFreq=1099, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3296)
          0.10480347 = weight(abstract_txt:pairs in 3296) [ClassicSimilarity], result of:
            0.10480347 = score(doc=3296,freq=1.0), product of:
              0.28186396 = queryWeight, product of:
                2.2756302 = boost
                6.7990475 = idf(docFreq=133, maxDocs=44218)
                0.018217541 = queryNorm
              0.3718229 = fieldWeight in 3296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7990475 = idf(docFreq=133, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3296)
        0.52 = coord(13/25)
    
  2. Talvensaari, T.; Juhola, M.; Laurikkala, J.; Järvelin, K.: Corpus-based cross-language information retrieval in retrieval of highly relevant documents (2007) 0.57
    0.5718282 = sum of:
      0.5718282 = product of:
        1.0996696 = sum of:
          0.047272194 = weight(abstract_txt:query in 139) [ClassicSimilarity], result of:
            0.047272194 = score(doc=139,freq=3.0), product of:
              0.09186021 = queryWeight, product of:
                1.0607182 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.018217541 = queryNorm
              0.5146101 = fieldWeight in 139, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.0625 = fieldNorm(doc=139)
          0.033695865 = weight(abstract_txt:source in 139) [ClassicSimilarity], result of:
            0.033695865 = score(doc=139,freq=1.0), product of:
              0.105717875 = queryWeight, product of:
                1.1379168 = boost
                5.0997415 = idf(docFreq=732, maxDocs=44218)
                0.018217541 = queryNorm
              0.31873384 = fieldWeight in 139, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0997415 = idf(docFreq=732, maxDocs=44218)
                0.0625 = fieldNorm(doc=139)
          0.044641983 = weight(abstract_txt:cross in 139) [ClassicSimilarity], result of:
            0.044641983 = score(doc=139,freq=1.0), product of:
              0.12752432 = queryWeight, product of:
                1.2497778 = boost
                5.601063 = idf(docFreq=443, maxDocs=44218)
                0.018217541 = queryNorm
              0.35006642 = fieldWeight in 139, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.601063 = idf(docFreq=443, maxDocs=44218)
                0.0625 = fieldNorm(doc=139)
          0.054058466 = weight(abstract_txt:combined in 139) [ClassicSimilarity], result of:
            0.054058466 = score(doc=139,freq=1.0), product of:
              0.14487936 = queryWeight, product of:
                1.3321084 = boost
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.018217541 = queryNorm
              0.37312746 = fieldWeight in 139, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.0625 = fieldNorm(doc=139)
          0.13378827 = weight(abstract_txt:translation in 139) [ClassicSimilarity], result of:
            0.13378827 = score(doc=139,freq=5.0), product of:
              0.1550194 = queryWeight, product of:
                1.3779368 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.018217541 = queryNorm
              0.8630421 = fieldWeight in 139, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=139)
          0.016836409 = weight(abstract_txt:with in 139) [ClassicSimilarity], result of:
            0.016836409 = score(doc=139,freq=2.0), product of:
              0.076201014 = queryWeight, product of:
                1.6733135 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.018217541 = queryNorm
              0.22094731 = fieldWeight in 139, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=139)
          0.16822161 = weight(abstract_txt:finnish in 139) [ClassicSimilarity], result of:
            0.16822161 = score(doc=139,freq=2.0), product of:
              0.24509919 = queryWeight, product of:
                1.7326356 = boost
                7.7650614 = idf(docFreq=50, maxDocs=44218)
                0.018217541 = queryNorm
              0.6863409 = fieldWeight in 139, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.7650614 = idf(docFreq=50, maxDocs=44218)
                0.0625 = fieldNorm(doc=139)
          0.14009865 = weight(abstract_txt:clir in 139) [ClassicSimilarity], result of:
            0.14009865 = score(doc=139,freq=1.0), product of:
              0.27335057 = queryWeight, product of:
                1.8297691 = boost
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.018217541 = queryNorm
              0.5125237 = fieldWeight in 139, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.0625 = fieldNorm(doc=139)
          0.051038854 = weight(abstract_txt:collection in 139) [ClassicSimilarity], result of:
            0.051038854 = score(doc=139,freq=1.0), product of:
              0.1756742 = queryWeight, product of:
                2.074461 = boost
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.018217541 = queryNorm
              0.2905313 = fieldWeight in 139, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.0625 = fieldNorm(doc=139)
          0.074311644 = weight(abstract_txt:collections in 139) [ClassicSimilarity], result of:
            0.074311644 = score(doc=139,freq=2.0), product of:
              0.17911637 = queryWeight, product of:
                2.094686 = boost
                4.693822 = idf(docFreq=1099, maxDocs=44218)
                0.018217541 = queryNorm
              0.41487914 = fieldWeight in 139, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.693822 = idf(docFreq=1099, maxDocs=44218)
                0.0625 = fieldNorm(doc=139)
          0.17713028 = weight(abstract_txt:comparable in 139) [ClassicSimilarity], result of:
            0.17713028 = score(doc=139,freq=2.0), product of:
              0.29038867 = queryWeight, product of:
                2.309786 = boost
                6.901097 = idf(docFreq=120, maxDocs=44218)
                0.018217541 = queryNorm
              0.60997653 = fieldWeight in 139, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.901097 = idf(docFreq=120, maxDocs=44218)
                0.0625 = fieldNorm(doc=139)
          0.09828926 = weight(abstract_txt:were in 139) [ClassicSimilarity], result of:
            0.09828926 = score(doc=139,freq=5.0), product of:
              0.19163172 = queryWeight, product of:
                2.8661838 = boost
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.018217541 = queryNorm
              0.512907 = fieldWeight in 139, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.0625 = fieldNorm(doc=139)
          0.060286097 = weight(abstract_txt:document in 139) [ClassicSimilarity], result of:
            0.060286097 = score(doc=139,freq=1.0), product of:
              0.22470663 = queryWeight, product of:
                2.8734581 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.018217541 = queryNorm
              0.26828802 = fieldWeight in 139, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=139)
        0.52 = coord(13/25)
    
  3. Li, K.W.; Yang, C.C.: Conceptual analysis of parallel corpus collected from the Web (2006) 0.44
    0.43913883 = sum of:
      0.43913883 = product of:
        1.21983 = sum of:
          0.044641983 = weight(abstract_txt:cross in 5051) [ClassicSimilarity], result of:
            0.044641983 = score(doc=5051,freq=1.0), product of:
              0.12752432 = queryWeight, product of:
                1.2497778 = boost
                5.601063 = idf(docFreq=443, maxDocs=44218)
                0.018217541 = queryNorm
              0.35006642 = fieldWeight in 5051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.601063 = idf(docFreq=443, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.059831932 = weight(abstract_txt:translation in 5051) [ClassicSimilarity], result of:
            0.059831932 = score(doc=5051,freq=1.0), product of:
              0.1550194 = queryWeight, product of:
                1.3779368 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.018217541 = queryNorm
              0.38596416 = fieldWeight in 5051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.19687086 = weight(abstract_txt:corpora in 5051) [ClassicSimilarity], result of:
            0.19687086 = score(doc=5051,freq=5.0), product of:
              0.20055261 = queryWeight, product of:
                1.567294 = boost
                7.0240583 = idf(docFreq=106, maxDocs=44218)
                0.018217541 = queryNorm
              0.981642 = fieldWeight in 5051, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.0240583 = idf(docFreq=106, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.10643463 = weight(abstract_txt:languages in 5051) [ClassicSimilarity], result of:
            0.10643463 = score(doc=5051,freq=4.0), product of:
              0.1641206 = queryWeight, product of:
                1.7364547 = boost
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.018217541 = queryNorm
              0.64851475 = fieldWeight in 5051, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.15267667 = weight(abstract_txt:aligned in 5051) [ClassicSimilarity], result of:
            0.15267667 = score(doc=5051,freq=1.0), product of:
              0.28947595 = queryWeight, product of:
                1.8829663 = boost
                8.43879 = idf(docFreq=25, maxDocs=44218)
                0.018217541 = queryNorm
              0.5274244 = fieldWeight in 5051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.43879 = idf(docFreq=25, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.10359998 = weight(abstract_txt:method in 5051) [ClassicSimilarity], result of:
            0.10359998 = score(doc=5051,freq=5.0), product of:
              0.16469881 = queryWeight, product of:
                2.008614 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.018217541 = queryNorm
              0.6290269 = fieldWeight in 5051, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.169388 = weight(abstract_txt:pairs in 5051) [ClassicSimilarity], result of:
            0.169388 = score(doc=5051,freq=2.0), product of:
              0.28186396 = queryWeight, product of:
                2.2756302 = boost
                6.7990475 = idf(docFreq=133, maxDocs=44218)
                0.018217541 = queryNorm
              0.60095656 = fieldWeight in 5051, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.7990475 = idf(docFreq=133, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.34242976 = weight(abstract_txt:alignment in 5051) [ClassicSimilarity], result of:
            0.34242976 = score(doc=5051,freq=5.0), product of:
              0.3320362 = queryWeight, product of:
                2.4698732 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.018217541 = queryNorm
              1.0313025 = fieldWeight in 5051, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.043956287 = weight(abstract_txt:were in 5051) [ClassicSimilarity], result of:
            0.043956287 = score(doc=5051,freq=1.0), product of:
              0.19163172 = queryWeight, product of:
                2.8661838 = boost
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.018217541 = queryNorm
              0.22937898 = fieldWeight in 5051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
        0.36 = coord(9/25)
    
  4. Yang, C.C.; Li, K.W.: Automatic construction of English/Chinese parallel corpora (2003) 0.38
    0.37960076 = sum of:
      0.37960076 = product of:
        0.9490019 = sum of:
          0.042108733 = weight(abstract_txt:articles in 1683) [ClassicSimilarity], result of:
            0.042108733 = score(doc=1683,freq=3.0), product of:
              0.09296076 = queryWeight, product of:
                1.0670533 = boost
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.018217541 = queryNorm
              0.4529732 = fieldWeight in 1683, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.078123465 = weight(abstract_txt:cross in 1683) [ClassicSimilarity], result of:
            0.078123465 = score(doc=1683,freq=4.0), product of:
              0.12752432 = queryWeight, product of:
                1.2497778 = boost
                5.601063 = idf(docFreq=443, maxDocs=44218)
                0.018217541 = queryNorm
              0.61261624 = fieldWeight in 1683, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.601063 = idf(docFreq=443, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.090677954 = weight(abstract_txt:translation in 1683) [ClassicSimilarity], result of:
            0.090677954 = score(doc=1683,freq=3.0), product of:
              0.1550194 = queryWeight, product of:
                1.3779368 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.018217541 = queryNorm
              0.58494586 = fieldWeight in 1683, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.13343358 = weight(abstract_txt:corpora in 1683) [ClassicSimilarity], result of:
            0.13343358 = score(doc=1683,freq=3.0), product of:
              0.20055261 = queryWeight, product of:
                1.567294 = boost
                7.0240583 = idf(docFreq=106, maxDocs=44218)
                0.018217541 = queryNorm
              0.6653296 = fieldWeight in 1683, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.0240583 = idf(docFreq=106, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.010416997 = weight(abstract_txt:with in 1683) [ClassicSimilarity], result of:
            0.010416997 = score(doc=1683,freq=1.0), product of:
              0.076201014 = queryWeight, product of:
                1.6733135 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.018217541 = queryNorm
              0.13670418 = fieldWeight in 1683, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.06585307 = weight(abstract_txt:languages in 1683) [ClassicSimilarity], result of:
            0.06585307 = score(doc=1683,freq=2.0), product of:
              0.1641206 = queryWeight, product of:
                1.7364547 = boost
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.018217541 = queryNorm
              0.40124804 = fieldWeight in 1683, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.0810798 = weight(abstract_txt:method in 1683) [ClassicSimilarity], result of:
            0.0810798 = score(doc=1683,freq=4.0), product of:
              0.16469881 = queryWeight, product of:
                2.008614 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.018217541 = queryNorm
              0.4922914 = fieldWeight in 1683, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.1482145 = weight(abstract_txt:pairs in 1683) [ClassicSimilarity], result of:
            0.1482145 = score(doc=1683,freq=2.0), product of:
              0.28186396 = queryWeight, product of:
                2.2756302 = boost
                6.7990475 = idf(docFreq=133, maxDocs=44218)
                0.018217541 = queryNorm
              0.525837 = fieldWeight in 1683, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.7990475 = idf(docFreq=133, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.10959377 = weight(abstract_txt:comparable in 1683) [ClassicSimilarity], result of:
            0.10959377 = score(doc=1683,freq=1.0), product of:
              0.29038867 = queryWeight, product of:
                2.309786 = boost
                6.901097 = idf(docFreq=120, maxDocs=44218)
                0.018217541 = queryNorm
              0.37740374 = fieldWeight in 1683, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.901097 = idf(docFreq=120, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.18950012 = weight(abstract_txt:alignment in 1683) [ClassicSimilarity], result of:
            0.18950012 = score(doc=1683,freq=2.0), product of:
              0.3320362 = queryWeight, product of:
                2.4698732 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.018217541 = queryNorm
              0.57072127 = fieldWeight in 1683, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
        0.4 = coord(10/25)
    
  5. Pirkola, A.; Puolamäki, D.; Järvelin, K.: Applying query structuring in cross-language retrieval (2003) 0.35
    0.35119307 = sum of:
      0.35119307 = product of:
        0.9755363 = sum of:
          0.06823154 = weight(abstract_txt:query in 1074) [ClassicSimilarity], result of:
            0.06823154 = score(doc=1074,freq=4.0), product of:
              0.09186021 = queryWeight, product of:
                1.0607182 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.018217541 = queryNorm
              0.74277574 = fieldWeight in 1074, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.078125 = fieldNorm(doc=1074)
          0.034730695 = weight(abstract_txt:articles in 1074) [ClassicSimilarity], result of:
            0.034730695 = score(doc=1074,freq=1.0), product of:
              0.09296076 = queryWeight, product of:
                1.0670533 = boost
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.018217541 = queryNorm
              0.37360597 = fieldWeight in 1074, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.078125 = fieldNorm(doc=1074)
          0.055802476 = weight(abstract_txt:cross in 1074) [ClassicSimilarity], result of:
            0.055802476 = score(doc=1074,freq=1.0), product of:
              0.12752432 = queryWeight, product of:
                1.2497778 = boost
                5.601063 = idf(docFreq=443, maxDocs=44218)
                0.018217541 = queryNorm
              0.43758303 = fieldWeight in 1074, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.601063 = idf(docFreq=443, maxDocs=44218)
                0.078125 = fieldNorm(doc=1074)
          0.12953994 = weight(abstract_txt:translation in 1074) [ClassicSimilarity], result of:
            0.12953994 = score(doc=1074,freq=3.0), product of:
              0.1550194 = queryWeight, product of:
                1.3779368 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.018217541 = queryNorm
              0.8356369 = fieldWeight in 1074, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.078125 = fieldNorm(doc=1074)
          0.2575357 = weight(abstract_txt:finnish in 1074) [ClassicSimilarity], result of:
            0.2575357 = score(doc=1074,freq=3.0), product of:
              0.24509919 = queryWeight, product of:
                1.7326356 = boost
                7.7650614 = idf(docFreq=50, maxDocs=44218)
                0.018217541 = queryNorm
              1.0507407 = fieldWeight in 1074, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.7650614 = idf(docFreq=50, maxDocs=44218)
                0.078125 = fieldNorm(doc=1074)
          0.066521645 = weight(abstract_txt:languages in 1074) [ClassicSimilarity], result of:
            0.066521645 = score(doc=1074,freq=1.0), product of:
              0.1641206 = queryWeight, product of:
                1.7364547 = boost
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.018217541 = queryNorm
              0.40532172 = fieldWeight in 1074, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.078125 = fieldNorm(doc=1074)
          0.17138064 = weight(abstract_txt:keys in 1074) [ClassicSimilarity], result of:
            0.17138064 = score(doc=1074,freq=1.0), product of:
              0.26944193 = queryWeight, product of:
                1.81664 = boost
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.018217541 = queryNorm
              0.6360578 = fieldWeight in 1074, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.078125 = fieldNorm(doc=1074)
          0.081902966 = weight(abstract_txt:method in 1074) [ClassicSimilarity], result of:
            0.081902966 = score(doc=1074,freq=2.0), product of:
              0.16469881 = queryWeight, product of:
                2.008614 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.018217541 = queryNorm
              0.49728936 = fieldWeight in 1074, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.078125 = fieldNorm(doc=1074)
          0.10989072 = weight(abstract_txt:were in 1074) [ClassicSimilarity], result of:
            0.10989072 = score(doc=1074,freq=4.0), product of:
              0.19163172 = queryWeight, product of:
                2.8661838 = boost
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.018217541 = queryNorm
              0.57344747 = fieldWeight in 1074, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.078125 = fieldNorm(doc=1074)
        0.36 = coord(9/25)
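
The arithmetic in the explanation tree for doc 1074 can be retraced with a short sketch. It assumes the usual ClassicSimilarity (TF-IDF) definitions: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm. The idf form was chosen because it reproduces the idf values listed above; the function and variable names are illustrative and not part of the search system. The constants are copied from the tree.

```python
import math

# Constants copied from the explanation tree for doc 1074.
MAX_DOCS = 44218
QUERY_NORM = 0.018217541
FIELD_NORM = 0.078125  # fieldNorm(doc=1074)


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed idf form; it matches the idf values shown in the tree.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost,
               field_norm=FIELD_NORM, query_norm=QUERY_NORM):
    # score = queryWeight * fieldWeight, as in each "product of" node.
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (term, termFreq, docFreq, boost) for the nine matching terms.
TERMS = [
    ("query", 4.0, 1035, 1.0607182),
    ("articles", 1.0, 1006, 1.0670533),
    ("cross", 1.0, 443, 1.2497778),
    ("translation", 3.0, 249, 1.3779368),
    ("finnish", 3.0, 50, 1.7326356),
    ("languages", 1.0, 670, 1.7364547),
    ("keys", 1.0, 34, 1.81664),
    ("method", 2.0, 1333, 2.008614),
    ("were", 4.0, 3061, 2.8661838),
]


def document_score(terms, matched, total_clauses):
    # coord(matched/total) scales the sum of the per-term scores.
    coord = matched / total_clauses
    return coord * sum(term_score(f, df, b) for _, f, df, b in terms)


score = document_score(TERMS, matched=9, total_clauses=25)
print(round(score, 2))
```

Because the sketch uses double precision while the tree shows single-precision values, the result should agree with the listed 0.35119307 only up to small rounding differences.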