Document (#16057)

Author
Greengrass, M.
Title
Conflation methods for searching databases of Latin text
Imprint
London : British Library
Year
1996
Pages
75 pp.
Series
Research and innovation report; 18
Abstract
Describes the results of a project to develop conflation tools for searching databases of Latin text. Reports on the results of a questionnaire sent to 64 users of Latin text retrieval systems. Describes a Latin stemming algorithm that uses a simple longest match with some recoding but differs from most stemmers in its use of 2 separate suffix dictionaries for processing query and database words. Describes a retrieval system in which a user inputs the principal components of their search term; these components are stemmed, and the resulting stems are matched against the noun-based and verb-based stem dictionaries. Evaluates the system, describing its limitations, and a more complex system.
Theme
Computational linguistics
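
The longest-match idea mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the report: the two short suffix lists, the `recode` step (lower-casing and mapping j to i and v to u), the minimum stem length, and the function names are all hypothetical. The sketch only shows how one word can yield two stems when two separate suffix lists are applied.

```python
# Hypothetical suffix lists, for illustration only; not the report's dictionaries.
NOUN_SUFFIXES = ["ibus", "ius", "ae", "am", "as", "em", "es", "is",
                 "os", "um", "us", "a", "e", "i", "o", "u"]
VERB_SUFFIXES = ["ntur", "mini", "mur", "mus", "ris", "tis", "tur",
                 "nt", "m", "s", "t"]


def recode(word):
    """Assumed recoding step: lower-case, then map j->i and v->u."""
    return word.lower().replace("j", "i").replace("v", "u")


def strip_longest_suffix(word, suffixes, min_stem=2):
    """Remove the longest listed suffix that leaves at least min_stem letters."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word  # no listed suffix applies


def stem_both(word):
    """Return (noun-based stem, verb-based stem) for one word."""
    recoded = recode(word)
    return (strip_longest_suffix(recoded, NOUN_SUFFIXES),
            strip_longest_suffix(recoded, VERB_SUFFIXES))


if __name__ == "__main__":
    for w in ["regibus", "amant", "Julius"]:
        print(w, stem_both(w))
```

Sorting the candidates by length before testing is what makes this a longest match: "ibus" is tried before "us" or "s", so "regibus" loses the whole ending in the noun pass rather than only its last letter.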

Similar documents (content)

  1. Galvez, C.; Moya-Anegón, F. de: An evaluation of conflation accuracy using finite-state transducers (2006) 0.17
    0.17434283 = sum of:
      0.17434283 = product of:
        0.8717141 = sum of:
          0.026604705 = weight(abstract_txt:retrieval in 5599) [ClassicSimilarity], result of:
            0.026604705 = score(doc=5599,freq=3.0), product of:
              0.056576435 = queryWeight, product of:
                1.03318 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.015757501 = queryNorm
              0.47024357 = fieldWeight in 5599, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.078125 = fieldNorm(doc=5599)
          0.015457007 = weight(abstract_txt:results in 5599) [ClassicSimilarity], result of:
            0.015457007 = score(doc=5599,freq=1.0), product of:
              0.056813814 = queryWeight, product of:
                1.0353452 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.015757501 = queryNorm
              0.27206424 = fieldWeight in 5599, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.078125 = fieldNorm(doc=5599)
          0.1471495 = weight(abstract_txt:stemmed in 5599) [ClassicSimilarity], result of:
            0.1471495 = score(doc=5599,freq=1.0), product of:
              0.2025503 = queryWeight, product of:
                1.3823234 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.015757501 = queryNorm
              0.72648376 = fieldWeight in 5599, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.078125 = fieldNorm(doc=5599)
          0.13904452 = weight(abstract_txt:dictionaries in 5599) [ClassicSimilarity], result of:
            0.13904452 = score(doc=5599,freq=1.0), product of:
              0.24573836 = queryWeight, product of:
                2.1532512 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.015757501 = queryNorm
              0.56582344 = fieldWeight in 5599, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.078125 = fieldNorm(doc=5599)
          0.5434584 = weight(abstract_txt:conflation in 5599) [ClassicSimilarity], result of:
            0.5434584 = score(doc=5599,freq=3.0), product of:
              0.42277324 = queryWeight, product of:
                2.8243074 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.015757501 = queryNorm
              1.2854607 = fieldWeight in 5599, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.078125 = fieldNorm(doc=5599)
        0.2 = coord(5/25)
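
The arithmetic in the explanation tree above can be reproduced with a short sketch. It assumes the standard Lucene ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and copies the boost, queryNorm, and fieldNorm values directly from the listing for doc 5599. The function and variable names are illustrative only.

```python
import math

MAX_DOCS = 44218          # maxDocs, as shown in the listing
QUERY_NORM = 0.015757501  # queryNorm, as shown in the listing


def idf(doc_freq, max_docs=MAX_DOCS):
    """ClassicSimilarity inverse document frequency."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm, query_norm=QUERY_NORM):
    """score = queryWeight * fieldWeight for one matching term."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (term, freq, docFreq, boost) for doc 5599; fieldNorm is 0.078125 throughout.
TERMS_5599 = [
    ("retrieval", 3.0, 3720, 1.03318),
    ("results", 1.0, 3693, 1.0353452),
    ("stemmed", 1.0, 10, 1.3823234),
    ("dictionaries", 1.0, 85, 2.1532512),
    ("conflation", 3.0, 8, 2.8243074),
]

total = sum(term_score(f, df, b, 0.078125) for _, f, df, b in TERMS_5599)
score = total * (5 / 25)  # coord(5/25): 5 of the 25 query terms matched

if __name__ == "__main__":
    print(f"sum of term scores: {total:.4f}")
    print(f"final score:        {score:.4f}")
```

Small differences in the last digits are expected because Lucene works in 32-bit floats while Python uses 64-bit floats. The same two functions apply to the other entries once their own freq, docFreq, boost, fieldNorm, and coord values are substituted.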
    
  2. Kraaij, W.; Pohlmann, R.: Evaluation of a Dutch stemming algorithm (1995) 0.17
    0.16533048 = sum of:
      0.16533048 = product of:
        0.590466 = sum of:
          0.015360233 = weight(abstract_txt:retrieval in 5798) [ClassicSimilarity], result of:
            0.015360233 = score(doc=5798,freq=1.0), product of:
              0.056576435 = queryWeight, product of:
                1.03318 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.015757501 = queryNorm
              0.27149525 = fieldWeight in 5798, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.078125 = fieldNorm(doc=5798)
          0.13097832 = weight(abstract_txt:stemming in 5798) [ClassicSimilarity], result of:
            0.13097832 = score(doc=5798,freq=3.0), product of:
              0.12995297 = queryWeight, product of:
                1.1072261 = boost
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.015757501 = queryNorm
              1.0078901 = fieldWeight in 5798, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.078125 = fieldNorm(doc=5798)
          0.09397766 = weight(abstract_txt:stem in 5798) [ClassicSimilarity], result of:
            0.09397766 = score(doc=5798,freq=1.0), product of:
              0.1502139 = queryWeight, product of:
                1.1904147 = boost
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.015757501 = queryNorm
              0.6256256 = fieldWeight in 5798, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.078125 = fieldNorm(doc=5798)
          0.13599522 = weight(abstract_txt:stemmers in 5798) [ClassicSimilarity], result of:
            0.13599522 = score(doc=5798,freq=1.0), product of:
              0.19218056 = queryWeight, product of:
                1.3464739 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.015757501 = queryNorm
              0.707643 = fieldWeight in 5798, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.078125 = fieldNorm(doc=5798)
          0.1471495 = weight(abstract_txt:stemmed in 5798) [ClassicSimilarity], result of:
            0.1471495 = score(doc=5798,freq=1.0), product of:
              0.2025503 = queryWeight, product of:
                1.3823234 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.015757501 = queryNorm
              0.72648376 = fieldWeight in 5798, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.078125 = fieldNorm(doc=5798)
          0.030700363 = weight(abstract_txt:describes in 5798) [ClassicSimilarity], result of:
            0.030700363 = score(doc=5798,freq=1.0), product of:
              0.10276134 = queryWeight, product of:
                1.7053704 = boost
                3.8240511 = idf(docFreq=2624, maxDocs=44218)
                0.015757501 = queryNorm
              0.298754 = fieldWeight in 5798, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8240511 = idf(docFreq=2624, maxDocs=44218)
                0.078125 = fieldNorm(doc=5798)
          0.0363047 = weight(abstract_txt:text in 5798) [ClassicSimilarity], result of:
            0.0363047 = score(doc=5798,freq=1.0), product of:
              0.11491481 = queryWeight, product of:
                1.8033991 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.015757501 = queryNorm
              0.3159271 = fieldWeight in 5798, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=5798)
        0.28 = coord(7/25)
    
  3. Hafer, M.A.; Weiss, S.F.: Word segmentation by letter successor varieties (1974) 0.13
    0.13272542 = sum of:
      0.13272542 = product of:
        0.47401935 = sum of:
          0.02606718 = weight(abstract_txt:retrieval in 4997) [ClassicSimilarity], result of:
            0.02606718 = score(doc=4997,freq=2.0), product of:
              0.056576435 = queryWeight, product of:
                1.03318 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.015757501 = queryNorm
              0.4607427 = fieldWeight in 4997, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.09375 = fieldNorm(doc=4997)
          0.02623141 = weight(abstract_txt:results in 4997) [ClassicSimilarity], result of:
            0.02623141 = score(doc=4997,freq=2.0), product of:
              0.056813814 = queryWeight, product of:
                1.0353452 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.015757501 = queryNorm
              0.4617083 = fieldWeight in 4997, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.09375 = fieldNorm(doc=4997)
          0.128332 = weight(abstract_txt:stemming in 4997) [ClassicSimilarity], result of:
            0.128332 = score(doc=4997,freq=2.0), product of:
              0.12995297 = queryWeight, product of:
                1.1072261 = boost
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.015757501 = queryNorm
              0.9875266 = fieldWeight in 4997, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.09375 = fieldNorm(doc=4997)
          0.1127732 = weight(abstract_txt:stem in 4997) [ClassicSimilarity], result of:
            0.1127732 = score(doc=4997,freq=1.0), product of:
              0.1502139 = queryWeight, product of:
                1.1904147 = boost
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.015757501 = queryNorm
              0.7507508 = fieldWeight in 4997, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.09375 = fieldNorm(doc=4997)
          0.11850918 = weight(abstract_txt:stems in 4997) [ClassicSimilarity], result of:
            0.11850918 = score(doc=4997,freq=1.0), product of:
              0.15526523 = queryWeight, product of:
                1.2102646 = boost
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.015757501 = queryNorm
              0.7632693 = fieldWeight in 4997, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.09375 = fieldNorm(doc=4997)
          0.025265943 = weight(abstract_txt:system in 4997) [ClassicSimilarity], result of:
            0.025265943 = score(doc=4997,freq=1.0), product of:
              0.07991659 = queryWeight, product of:
                1.5039116 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.015757501 = queryNorm
              0.3161539 = fieldWeight in 4997, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.09375 = fieldNorm(doc=4997)
          0.036840435 = weight(abstract_txt:describes in 4997) [ClassicSimilarity], result of:
            0.036840435 = score(doc=4997,freq=1.0), product of:
              0.10276134 = queryWeight, product of:
                1.7053704 = boost
                3.8240511 = idf(docFreq=2624, maxDocs=44218)
                0.015757501 = queryNorm
              0.3585048 = fieldWeight in 4997, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8240511 = idf(docFreq=2624, maxDocs=44218)
                0.09375 = fieldNorm(doc=4997)
        0.28 = coord(7/25)
    
  4. Vickery, B.C.; Vickery, A.: An application of language processing for a search interface (1992) 0.13
    0.13050447 = sum of:
      0.13050447 = product of:
        0.54376864 = sum of:
          0.10586851 = weight(abstract_txt:stemming in 2757) [ClassicSimilarity], result of:
            0.10586851 = score(doc=2757,freq=1.0), product of:
              0.12995297 = queryWeight, product of:
                1.1072261 = boost
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.015757501 = queryNorm
              0.8146679 = fieldWeight in 2757, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.448392 = idf(docFreq=69, maxDocs=44218)
                0.109375 = fieldNorm(doc=2757)
          0.11995377 = weight(abstract_txt:noun in 2757) [ClassicSimilarity], result of:
            0.11995377 = score(doc=2757,freq=1.0), product of:
              0.1412378 = queryWeight, product of:
                1.1543 = boost
                7.7650614 = idf(docFreq=50, maxDocs=44218)
                0.015757501 = queryNorm
              0.8493036 = fieldWeight in 2757, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7650614 = idf(docFreq=50, maxDocs=44218)
                0.109375 = fieldNorm(doc=2757)
          0.029476933 = weight(abstract_txt:system in 2757) [ClassicSimilarity], result of:
            0.029476933 = score(doc=2757,freq=1.0), product of:
              0.07991659 = queryWeight, product of:
                1.5039116 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.015757501 = queryNorm
              0.36884624 = fieldWeight in 2757, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.109375 = fieldNorm(doc=2757)
          0.042980507 = weight(abstract_txt:describes in 2757) [ClassicSimilarity], result of:
            0.042980507 = score(doc=2757,freq=1.0), product of:
              0.10276134 = queryWeight, product of:
                1.7053704 = boost
                3.8240511 = idf(docFreq=2624, maxDocs=44218)
                0.015757501 = queryNorm
              0.4182556 = fieldWeight in 2757, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8240511 = idf(docFreq=2624, maxDocs=44218)
                0.109375 = fieldNorm(doc=2757)
          0.05082658 = weight(abstract_txt:text in 2757) [ClassicSimilarity], result of:
            0.05082658 = score(doc=2757,freq=1.0), product of:
              0.11491481 = queryWeight, product of:
                1.8033991 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.015757501 = queryNorm
              0.4422979 = fieldWeight in 2757, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.109375 = fieldNorm(doc=2757)
          0.19466233 = weight(abstract_txt:dictionaries in 2757) [ClassicSimilarity], result of:
            0.19466233 = score(doc=2757,freq=1.0), product of:
              0.24573836 = queryWeight, product of:
                2.1532512 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.015757501 = queryNorm
              0.7921528 = fieldWeight in 2757, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.109375 = fieldNorm(doc=2757)
        0.24 = coord(6/25)
    
  5. Maturana, M.T.I.: Beneficios de la utilizacion de lenguajes controlados en el analisis y recuperacion de informacion (1997) 0.12
    0.12104731 = sum of:
      0.12104731 = product of:
        0.7565457 = sum of:
          0.08777174 = weight(abstract_txt:differs in 2055) [ClassicSimilarity], result of:
            0.08777174 = score(doc=2055,freq=1.0), product of:
              0.12709916 = queryWeight, product of:
                1.0950011 = boost
                7.3661537 = idf(docFreq=75, maxDocs=44218)
                0.015757501 = queryNorm
              0.6905769 = fieldWeight in 2055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3661537 = idf(docFreq=75, maxDocs=44218)
                0.09375 = fieldNorm(doc=2055)
          0.034548715 = weight(abstract_txt:searching in 2055) [ClassicSimilarity], result of:
            0.034548715 = score(doc=2055,freq=1.0), product of:
              0.08600772 = queryWeight, product of:
                1.2738754 = boost
                4.284727 = idf(docFreq=1655, maxDocs=44218)
                0.015757501 = queryNorm
              0.40169317 = fieldWeight in 2055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.284727 = idf(docFreq=1655, maxDocs=44218)
                0.09375 = fieldNorm(doc=2055)
          0.04356564 = weight(abstract_txt:text in 2055) [ClassicSimilarity], result of:
            0.04356564 = score(doc=2055,freq=1.0), product of:
              0.11491481 = queryWeight, product of:
                1.8033991 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.015757501 = queryNorm
              0.37911248 = fieldWeight in 2055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.09375 = fieldNorm(doc=2055)
          0.5906596 = weight(abstract_txt:latin in 2055) [ClassicSimilarity], result of:
            0.5906596 = score(doc=2055,freq=2.0), product of:
              0.57078743 = queryWeight, product of:
                4.640988 = boost
                7.805067 = idf(docFreq=48, maxDocs=44218)
                0.015757501 = queryNorm
              1.0348154 = fieldWeight in 2055, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.805067 = idf(docFreq=48, maxDocs=44218)
                0.09375 = fieldNorm(doc=2055)
        0.16 = coord(4/25)