Document (#16057)

Author
Greengrass, M.
Title
Conflation methods for searching databases of Latin text
Imprint
London : British Library
Year
1996
Pages
75 p.
Series
Research and innovation report; 18
Abstract
Describes the results of a project to develop conflation tools for searching databases of Latin text. Reports on the results of a questionnaire sent to 64 users of Latin text retrieval systems. Describes a Latin stemming algorithm that uses a simple longest match with some recoding but differs from most stemmers in its use of 2 separate suffix dictionaries for processing query and database words. Describes a retrieval system in which a user inputs the principal components of their search term; these components are stemmed and the resulting stems are matched against the noun-based and verb-based stem dictionaries. Evaluates the system, describing its limitations, and a more complex system
Theme
Computational linguistics
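
The abstract mentions longest-match suffix stripping with some recoding and separate suffix dictionaries for query and database words. The following is only a minimal sketch of that general idea, not the algorithm from the report: the suffix lists, the j/v recoding rule, the minimum stem length, and all function names are assumptions made for illustration.

```python
# Hypothetical suffix lists, invented for illustration only.
# They are NOT the dictionaries used in the report.
QUERY_SUFFIXES = ["ibus", "orum", "arum", "us", "um", "ae", "is", "am", "a", "i", "o", "e"]
DATABASE_SUFFIXES = QUERY_SUFFIXES + ["que"]


def recode(word):
    """Normalise spelling before stemming (assumed rule: lowercase, j -> i, v -> u)."""
    return word.lower().replace("j", "i").replace("v", "u")


def longest_match_stem(word, suffixes, min_stem_len=2):
    """Strip the longest suffix from `suffixes` that leaves at least `min_stem_len` letters."""
    word = recode(word)
    # Try longer suffixes first so that "orum" wins over "um".
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: len(word) - len(suffix)]
    return word  # no suffix matched; the word is its own stem


# A query word and a database word conflate when their stems are equal.
query_stem = longest_match_stem("dominus", QUERY_SUFFIXES)
database_stem = longest_match_stem("dominorum", DATABASE_SUFFIXES)
print(query_stem, database_stem, query_stem == database_stem)
```

Keeping the two suffix lists as separate arguments makes it possible to stem query words more or less aggressively than database words without touching the matching logic.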

Similar documents (content)

  1. Galvez, C.; Moya-Anegón, F. de: An evaluation of conflation accuracy using finite-state transducers (2006) 0.17
    0.17445312 = sum of:
      0.17445312 = product of:
        0.87226564 = sum of:
          0.026614685 = weight(abstract_txt:retrieval in 599) [ClassicSimilarity], result of:
            0.026614685 = score(doc=599,freq=3.0), product of:
              0.056575507 = queryWeight, product of:
                1.0328814 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01575563 = queryNorm
              0.4704277 = fieldWeight in 599, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.078125 = fieldNorm(doc=599)
          0.015394463 = weight(abstract_txt:results in 599) [ClassicSimilarity], result of:
            0.015394463 = score(doc=599,freq=1.0), product of:
              0.056645356 = queryWeight, product of:
                1.0335188 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.01575563 = queryNorm
              0.2717692 = fieldWeight in 599, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.078125 = fieldNorm(doc=599)
          0.14724931 = weight(abstract_txt:stemmed in 599) [ClassicSimilarity], result of:
            0.14724931 = score(doc=599,freq=1.0), product of:
              0.20258789 = queryWeight, product of:
                1.3820634 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.01575563 = queryNorm
              0.7268416 = fieldWeight in 599, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.078125 = fieldNorm(doc=599)
          0.13919719 = weight(abstract_txt:dictionaries in 599) [ClassicSimilarity], result of:
            0.13919719 = score(doc=599,freq=1.0), product of:
              0.24585266 = queryWeight, product of:
                2.153147 = boost
                7.2471204 = idf(docFreq=85, maxDocs=44421)
                0.01575563 = queryNorm
              0.5661813 = fieldWeight in 599, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2471204 = idf(docFreq=85, maxDocs=44421)
                0.078125 = fieldNorm(doc=599)
          0.54381 = weight(abstract_txt:conflation in 599) [ClassicSimilarity], result of:
            0.54381 = score(doc=599,freq=3.0), product of:
              0.42284286 = queryWeight, product of:
                2.8237467 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.01575563 = queryNorm
              1.2860806 = fieldWeight in 599, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.078125 = fieldNorm(doc=599)
        0.2 = coord(5/25)
    
  2. Kraaij, W.; Pohlmann, R.: Evaluation of a Dutch stemming algorithm (1995) 0.17
    0.16543812 = sum of:
      0.16543812 = product of:
        0.5908504 = sum of:
          0.0153659955 = weight(abstract_txt:retrieval in 5866) [ClassicSimilarity], result of:
            0.0153659955 = score(doc=5866,freq=1.0), product of:
              0.056575507 = queryWeight, product of:
                1.0328814 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01575563 = queryNorm
              0.27160156 = fieldWeight in 5866, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.078125 = fieldNorm(doc=5866)
          0.13111523 = weight(abstract_txt:stemming in 5866) [ClassicSimilarity], result of:
            0.13111523 = score(doc=5866,freq=3.0), product of:
              0.13000886 = queryWeight, product of:
                1.1071532 = boost
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.01575563 = queryNorm
              1.0085099 = fieldWeight in 5866, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.078125 = fieldNorm(doc=5866)
          0.094063796 = weight(abstract_txt:stem in 5866) [ClassicSimilarity], result of:
            0.094063796 = score(doc=5866,freq=1.0), product of:
              0.15026562 = queryWeight, product of:
                1.1902852 = boost
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.01575563 = queryNorm
              0.6259835 = fieldWeight in 5866, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.078125 = fieldNorm(doc=5866)
          0.1360928 = weight(abstract_txt:stemmers in 5866) [ClassicSimilarity], result of:
            0.1360928 = score(doc=5866,freq=1.0), product of:
              0.19222125 = queryWeight, product of:
                1.3462383 = boost
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.01575563 = queryNorm
              0.7080008 = fieldWeight in 5866, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.078125 = fieldNorm(doc=5866)
          0.14724931 = weight(abstract_txt:stemmed in 5866) [ClassicSimilarity], result of:
            0.14724931 = score(doc=5866,freq=1.0), product of:
              0.20258789 = queryWeight, product of:
                1.3820634 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.01575563 = queryNorm
              0.7268416 = fieldWeight in 5866, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.078125 = fieldNorm(doc=5866)
          0.030767823 = weight(abstract_txt:describes in 5866) [ClassicSimilarity], result of:
            0.030767823 = score(doc=5866,freq=1.0), product of:
              0.10288441 = queryWeight, product of:
                1.7059121 = boost
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.01575563 = queryNorm
              0.29905233 = fieldWeight in 5866, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.078125 = fieldNorm(doc=5866)
          0.036195435 = weight(abstract_txt:text in 5866) [ClassicSimilarity], result of:
            0.036195435 = score(doc=5866,freq=1.0), product of:
              0.11465357 = queryWeight, product of:
                1.8008422 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01575563 = queryNorm
              0.3156939 = fieldWeight in 5866, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=5866)
        0.28 = coord(7/25)
    
  3. Hafer, M.A.; Weiss, S.F.: Word segmentation by letter successor varieties (1974) 0.13
    0.13281436 = sum of:
      0.13281436 = product of:
        0.474337 = sum of:
          0.02607696 = weight(abstract_txt:retrieval in 5065) [ClassicSimilarity], result of:
            0.02607696 = score(doc=5065,freq=2.0), product of:
              0.056575507 = queryWeight, product of:
                1.0328814 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01575563 = queryNorm
              0.46092314 = fieldWeight in 5065, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.09375 = fieldNorm(doc=5065)
          0.026125267 = weight(abstract_txt:results in 5065) [ClassicSimilarity], result of:
            0.026125267 = score(doc=5065,freq=2.0), product of:
              0.056645356 = queryWeight, product of:
                1.0335188 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.01575563 = queryNorm
              0.46120757 = fieldWeight in 5065, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.09375 = fieldNorm(doc=5065)
          0.12846616 = weight(abstract_txt:stemming in 5065) [ClassicSimilarity], result of:
            0.12846616 = score(doc=5065,freq=2.0), product of:
              0.13000886 = queryWeight, product of:
                1.1071532 = boost
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.01575563 = queryNorm
              0.98813385 = fieldWeight in 5065, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.09375 = fieldNorm(doc=5065)
          0.11287656 = weight(abstract_txt:stem in 5065) [ClassicSimilarity], result of:
            0.11287656 = score(doc=5065,freq=1.0), product of:
              0.15026562 = queryWeight, product of:
                1.1902852 = boost
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.01575563 = queryNorm
              0.7511802 = fieldWeight in 5065, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.09375 = fieldNorm(doc=5065)
          0.11861442 = weight(abstract_txt:stems in 5065) [ClassicSimilarity], result of:
            0.11861442 = score(doc=5065,freq=1.0), product of:
              0.15531573 = queryWeight, product of:
                1.2101214 = boost
                8.146119 = idf(docFreq=34, maxDocs=44421)
                0.01575563 = queryNorm
              0.7636987 = fieldWeight in 5065, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.146119 = idf(docFreq=34, maxDocs=44421)
                0.09375 = fieldNorm(doc=5065)
          0.025256237 = weight(abstract_txt:system in 5065) [ClassicSimilarity], result of:
            0.025256237 = score(doc=5065,freq=1.0), product of:
              0.079874836 = queryWeight, product of:
                1.5030965 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.01575563 = queryNorm
              0.31619766 = fieldWeight in 5065, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.09375 = fieldNorm(doc=5065)
          0.03692139 = weight(abstract_txt:describes in 5065) [ClassicSimilarity], result of:
            0.03692139 = score(doc=5065,freq=1.0), product of:
              0.10288441 = queryWeight, product of:
                1.7059121 = boost
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.01575563 = queryNorm
              0.35886282 = fieldWeight in 5065, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.09375 = fieldNorm(doc=5065)
        0.28 = coord(7/25)
    
  4. Vickery, B.C.; Vickery, A.: An application of language processing for a search interface (1992) 0.13
    0.13059348 = sum of:
      0.13059348 = product of:
        0.5441395 = sum of:
          0.105979174 = weight(abstract_txt:stemming in 3757) [ClassicSimilarity], result of:
            0.105979174 = score(doc=3757,freq=1.0), product of:
              0.13000886 = queryWeight, product of:
                1.1071532 = boost
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.01575563 = queryNorm
              0.81516886 = fieldWeight in 3757, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.109375 = fieldNorm(doc=3757)
          0.12007014 = weight(abstract_txt:noun in 3757) [ClassicSimilarity], result of:
            0.12007014 = score(doc=3757,freq=1.0), product of:
              0.14129147 = queryWeight, product of:
                1.1541951 = boost
                7.769642 = idf(docFreq=50, maxDocs=44421)
                0.01575563 = queryNorm
              0.8498046 = fieldWeight in 3757, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.769642 = idf(docFreq=50, maxDocs=44421)
                0.109375 = fieldNorm(doc=3757)
          0.029465608 = weight(abstract_txt:system in 3757) [ClassicSimilarity], result of:
            0.029465608 = score(doc=3757,freq=1.0), product of:
              0.079874836 = queryWeight, product of:
                1.5030965 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.01575563 = queryNorm
              0.36889726 = fieldWeight in 3757, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.109375 = fieldNorm(doc=3757)
          0.043074954 = weight(abstract_txt:describes in 3757) [ClassicSimilarity], result of:
            0.043074954 = score(doc=3757,freq=1.0), product of:
              0.10288441 = queryWeight, product of:
                1.7059121 = boost
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.01575563 = queryNorm
              0.41867328 = fieldWeight in 3757, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.109375 = fieldNorm(doc=3757)
          0.050673608 = weight(abstract_txt:text in 3757) [ClassicSimilarity], result of:
            0.050673608 = score(doc=3757,freq=1.0), product of:
              0.11465357 = queryWeight, product of:
                1.8008422 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01575563 = queryNorm
              0.44197148 = fieldWeight in 3757, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.109375 = fieldNorm(doc=3757)
          0.19487604 = weight(abstract_txt:dictionaries in 3757) [ClassicSimilarity], result of:
            0.19487604 = score(doc=3757,freq=1.0), product of:
              0.24585266 = queryWeight, product of:
                2.153147 = boost
                7.2471204 = idf(docFreq=85, maxDocs=44421)
                0.01575563 = queryNorm
              0.7926538 = fieldWeight in 3757, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2471204 = idf(docFreq=85, maxDocs=44421)
                0.109375 = fieldNorm(doc=3757)
        0.24 = coord(6/25)
    
  5. Maturana, M.T.I.: Beneficios de la utilizacion de lenguajes controlados en el analisis y recuperacion de informacion (1997) 0.12
    0.12113377 = sum of:
      0.12113377 = product of:
        0.75708604 = sum of:
          0.08786532 = weight(abstract_txt:differs in 3055) [ClassicSimilarity], result of:
            0.08786532 = score(doc=3055,freq=1.0), product of:
              0.1271556 = queryWeight, product of:
                1.0949366 = boost
                7.370734 = idf(docFreq=75, maxDocs=44421)
                0.01575563 = queryNorm
              0.6910063 = fieldWeight in 3055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.370734 = idf(docFreq=75, maxDocs=44421)
                0.09375 = fieldNorm(doc=3055)
          0.034558956 = weight(abstract_txt:searching in 3055) [ClassicSimilarity], result of:
            0.034558956 = score(doc=3055,freq=1.0), product of:
              0.0860018 = queryWeight, product of:
                1.2734737 = boost
                4.2862926 = idf(docFreq=1660, maxDocs=44421)
                0.01575563 = queryNorm
              0.4018399 = fieldWeight in 3055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2862926 = idf(docFreq=1660, maxDocs=44421)
                0.09375 = fieldNorm(doc=3055)
          0.043434523 = weight(abstract_txt:text in 3055) [ClassicSimilarity], result of:
            0.043434523 = score(doc=3055,freq=1.0), product of:
              0.11465357 = queryWeight, product of:
                1.8008422 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01575563 = queryNorm
              0.3788327 = fieldWeight in 3055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.09375 = fieldNorm(doc=3055)
          0.59122723 = weight(abstract_txt:latin in 3055) [ClassicSimilarity], result of:
            0.59122723 = score(doc=3055,freq=2.0), product of:
              0.5710009 = queryWeight, product of:
                4.6405516 = boost
                7.809647 = idf(docFreq=48, maxDocs=44421)
                0.01575563 = queryNorm
              1.0354227 = fieldWeight in 3055, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.809647 = idf(docFreq=48, maxDocs=44421)
                0.09375 = fieldNorm(doc=3055)
        0.16 = coord(4/25)
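
The score trees above follow the classic TF-IDF pattern: each term contributes queryWeight × fieldWeight, the contributions are summed, and the sum is scaled by the coord factor. As a rough check, the sketch below recomputes the figures for entry 1 (doc 599) from the numbers listed in its tree. The idf formula 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq) are assumed from Lucene's ClassicSimilarity and agree with the listed values; the function names are illustrative and not part of any library.

```python
import math


def idf(doc_freq, max_docs):
    """Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm            # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Term "retrieval" in doc 599, using the values from the first tree.
retrieval = term_score(
    freq=3.0,
    doc_freq=3732,
    max_docs=44421,
    boost=1.0328814,
    query_norm=0.01575563,
    field_norm=0.078125,
)

# The five per-term scores listed for doc 599, scaled by coord(5/25).
term_scores = [0.026614685, 0.015394463, 0.14724931, 0.13919719, 0.54381]
total = sum(term_scores) * (5 / 25)

print(round(retrieval, 6), round(total, 6))
```

The results agree with the listed 0.026614685 and 0.17445312 up to rounding, since the original values were computed in single precision.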