Document (#23335)

Author
Wolff, J.G.
Title
¬A scalable technique for best-match retrieval of sequential information using metrics-guided search
Source
Journal of information science. 20(1994) no.1, pp.16-28
Year
1994
Abstract
Describes a new technique for retrieving information by finding the best match or matches between a textual query and a textual database. The technique uses principles of beam search, with a measure of probability to guide the search and prune the search tree. Unlike many methods for comparing strings, the method gives a set of alternative matches, graded by the quality of the matching. The new technique is embodied in a software simulation, SP21, which runs on a conventional computer. Presents examples showing best-match retrieval of information from a textual database. Presents analytic and empirical evidence on the performance of the technique. It lends itself well to parallel processing. Discusses planned developments.
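The general idea of beam search over partial matches, as named in the abstract, can be illustrated with a minimal sketch. It is not SP21 and does not reproduce the paper's method: the names `Hypothesis`, `beam_match`, `beam_width` and `gap_penalty` are hypothetical, and the simple additive score is only a stand-in for the probability measure the abstract mentions.

```python
from typing import List, NamedTuple, Tuple


class Hypothesis(NamedTuple):
    score: float                        # higher is better (stand-in metric)
    last_pos: int                       # text index of the last match, -1 if none
    pairs: Tuple[Tuple[int, int], ...]  # (query index, text index) matches


def beam_match(query: str, text: str, beam_width: int = 3,
               gap_penalty: float = 0.1) -> List[Hypothesis]:
    """Return up to beam_width alternative alignments, best first."""
    beam = [Hypothesis(0.0, -1, ())]
    for qi, symbol in enumerate(query):
        candidates = []
        for hyp in beam:
            # Option 1: leave this query symbol unmatched.
            candidates.append(hyp)
            # Option 2: match it at any later position in the text.
            for ti in range(hyp.last_pos + 1, len(text)):
                if text[ti] == symbol:
                    # Penalise text symbols skipped since the previous match.
                    gap = ti - hyp.last_pos - 1 if hyp.pairs else 0
                    candidates.append(Hypothesis(
                        hyp.score + 1.0 - gap_penalty * gap,
                        ti,
                        hyp.pairs + ((qi, ti),),
                    ))
        # Prune the search tree: keep only the best few partial matches.
        candidates.sort(key=lambda h: h.score, reverse=True)
        beam = candidates[:beam_width]
    return beam


results = beam_match("cat", "the cat sat")
for hyp in results:
    print(hyp.score, hyp.pairs)
```

The returned list is a graded set of alternatives rather than a single answer. Widening `beam_width` trades more computation for a lower chance of pruning a good match early.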
Theme
Retrieval algorithms

Similar documents (content)

  1. Loughran, H.: ¬A review of nearest neighbour information retrieval (1994) 0.20
    0.19766392 = sum of:
      0.19766392 = product of:
        0.8235997 = sum of:
          0.02943905 = weight(abstract_txt:retrieval in 684) [ClassicSimilarity], result of:
            0.02943905 = score(doc=684,freq=1.0), product of:
              0.06774411 = queryWeight, product of:
                1.0673798 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.018256197 = queryNorm
              0.4345625 = fieldWeight in 684, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.125 = fieldNorm(doc=684)
          0.0148744825 = weight(abstract_txt:information in 684) [ClassicSimilarity], result of:
            0.0148744825 = score(doc=684,freq=1.0), product of:
              0.049194213 = queryWeight, product of:
                1.1140018 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.018256197 = queryNorm
              0.30236244 = fieldWeight in 684, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.125 = fieldNorm(doc=684)
          0.06839839 = weight(abstract_txt:search in 684) [ClassicSimilarity], result of:
            0.06839839 = score(doc=684,freq=1.0), product of:
              0.14972568 = queryWeight, product of:
                2.2441216 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.018256197 = queryNorm
              0.45682475 = fieldWeight in 684, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.125 = fieldNorm(doc=684)
          0.13270377 = weight(abstract_txt:best in 684) [ClassicSimilarity], result of:
            0.13270377 = score(doc=684,freq=1.0), product of:
              0.21161264 = queryWeight, product of:
                2.3104665 = boost
                5.0168557 = idf(docFreq=799, maxDocs=44421)
                0.018256197 = queryNorm
              0.62710696 = fieldWeight in 684, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0168557 = idf(docFreq=799, maxDocs=44421)
                0.125 = fieldNorm(doc=684)
          0.27148807 = weight(abstract_txt:match in 684) [ClassicSimilarity], result of:
            0.27148807 = score(doc=684,freq=1.0), product of:
              0.34102532 = queryWeight, product of:
                2.9330683 = boost
                6.3687487 = idf(docFreq=206, maxDocs=44421)
                0.018256197 = queryNorm
              0.7960936 = fieldWeight in 684, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3687487 = idf(docFreq=206, maxDocs=44421)
                0.125 = fieldNorm(doc=684)
          0.30669597 = weight(abstract_txt:technique in 684) [ClassicSimilarity], result of:
            0.30669597 = score(doc=684,freq=1.0), product of:
              0.43857217 = queryWeight, product of:
                4.294116 = boost
                5.5944448 = idf(docFreq=448, maxDocs=44421)
                0.018256197 = queryNorm
              0.6993056 = fieldWeight in 684, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5944448 = idf(docFreq=448, maxDocs=44421)
                0.125 = fieldNorm(doc=684)
        0.24 = coord(6/25)
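
The score breakdown above follows Lucene's ClassicSimilarity (TF-IDF) formula: each term contributes queryWeight × fieldWeight, and the sum is scaled by the coord factor. The sketch below recomputes the score for doc 684 from the values shown in the tree, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function and variable names are illustrative and are not part of Lucene's API.

```python
import math

MAX_DOCS = 44421          # maxDocs from the explain output
QUERY_NORM = 0.018256197  # queryNorm from the explain output
FIELD_NORM = 0.125        # fieldNorm(doc=684)


def idf(doc_freq: int, max_docs: int) -> float:
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float) -> float:
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = boost * term_idf * QUERY_NORM            # queryWeight
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM  # fieldWeight
    return query_weight * field_weight


# (term, termFreq, docFreq, boost) copied from the tree for doc 684.
terms = [
    ("retrieval", 1.0, 3732, 1.0673798),
    ("information", 1.0, 10748, 1.1140018),
    ("search", 1.0, 3123, 2.2441216),
    ("best", 1.0, 799, 2.3104665),
    ("match", 1.0, 206, 2.9330683),
    ("technique", 1.0, 448, 4.294116),
]

coord = 6 / 25  # 6 of the 25 query terms matched
total = coord * sum(term_score(f, df, b) for _, f, df, b in terms)
print(round(total, 8))  # compare with 0.19766392 in the tree
```

Small differences in the last digits are to be expected, since Lucene computes these values in single precision.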
    
  2. Sakai, T.: On the reliability of information retrieval metrics based on graded relevance (2007) 0.17
    0.16866285 = sum of:
      0.16866285 = product of:
        0.7027619 = sum of:
          0.14046872 = weight(abstract_txt:metrics in 1910) [ClassicSimilarity], result of:
            0.14046872 = score(doc=1910,freq=5.0), product of:
              0.12191452 = queryWeight, product of:
                1.0125022 = boost
                6.595522 = idf(docFreq=164, maxDocs=44421)
                0.018256197 = queryNorm
              1.1521902 = fieldWeight in 1910, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.595522 = idf(docFreq=164, maxDocs=44421)
                0.078125 = fieldNorm(doc=1910)
          0.018399406 = weight(abstract_txt:retrieval in 1910) [ClassicSimilarity], result of:
            0.018399406 = score(doc=1910,freq=1.0), product of:
              0.06774411 = queryWeight, product of:
                1.0673798 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.018256197 = queryNorm
              0.27160156 = fieldWeight in 1910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.078125 = fieldNorm(doc=1910)
          0.009296551 = weight(abstract_txt:information in 1910) [ClassicSimilarity], result of:
            0.009296551 = score(doc=1910,freq=1.0), product of:
              0.049194213 = queryWeight, product of:
                1.1140018 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.018256197 = queryNorm
              0.18897653 = fieldWeight in 1910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.078125 = fieldNorm(doc=1910)
          0.10684128 = weight(abstract_txt:runs in 1910) [ClassicSimilarity], result of:
            0.10684128 = score(doc=1910,freq=1.0), product of:
              0.17370743 = queryWeight, product of:
                1.2085856 = boost
                7.872826 = idf(docFreq=45, maxDocs=44421)
                0.018256197 = queryNorm
              0.61506456 = fieldWeight in 1910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.872826 = idf(docFreq=45, maxDocs=44421)
                0.078125 = fieldNorm(doc=1910)
          0.2840999 = weight(abstract_txt:graded in 1910) [ClassicSimilarity], result of:
            0.2840999 = score(doc=1910,freq=4.0), product of:
              0.21003246 = queryWeight, product of:
                1.3289586 = boost
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.018256197 = queryNorm
              1.3526477 = fieldWeight in 1910, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.078125 = fieldNorm(doc=1910)
          0.14365605 = weight(abstract_txt:best in 1910) [ClassicSimilarity], result of:
            0.14365605 = score(doc=1910,freq=3.0), product of:
              0.21161264 = queryWeight, product of:
                2.3104665 = boost
                5.0168557 = idf(docFreq=799, maxDocs=44421)
                0.018256197 = queryNorm
              0.6788632 = fieldWeight in 1910, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.0168557 = idf(docFreq=799, maxDocs=44421)
                0.078125 = fieldNorm(doc=1910)
        0.24 = coord(6/25)
    
  3. Sormunen, E.; Kekäläinen, J.; Koivisto, J.; Järvelin, K.: Document text characteristics affect the ranking of the most relevant documents by expanded structured queries (2001) 0.15
    0.15032019 = sum of:
      0.15032019 = product of:
        0.5368578 = sum of:
          0.020816553 = weight(abstract_txt:retrieval in 5487) [ClassicSimilarity], result of:
            0.020816553 = score(doc=5487,freq=2.0), product of:
              0.06774411 = queryWeight, product of:
                1.0673798 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.018256197 = queryNorm
              0.3072821 = fieldWeight in 5487, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=5487)
          0.0148744825 = weight(abstract_txt:information in 5487) [ClassicSimilarity], result of:
            0.0148744825 = score(doc=5487,freq=4.0), product of:
              0.049194213 = queryWeight, product of:
                1.1140018 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.018256197 = queryNorm
              0.30236244 = fieldWeight in 5487, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0625 = fieldNorm(doc=5487)
          0.034199197 = weight(abstract_txt:search in 5487) [ClassicSimilarity], result of:
            0.034199197 = score(doc=5487,freq=1.0), product of:
              0.14972568 = queryWeight, product of:
                2.2441216 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.018256197 = queryNorm
              0.22841237 = fieldWeight in 5487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.0625 = fieldNorm(doc=5487)
          0.06635188 = weight(abstract_txt:best in 5487) [ClassicSimilarity], result of:
            0.06635188 = score(doc=5487,freq=1.0), product of:
              0.21161264 = queryWeight, product of:
                2.3104665 = boost
                5.0168557 = idf(docFreq=799, maxDocs=44421)
                0.018256197 = queryNorm
              0.31355348 = fieldWeight in 5487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0168557 = idf(docFreq=799, maxDocs=44421)
                0.0625 = fieldNorm(doc=5487)
          0.11152362 = weight(abstract_txt:textual in 5487) [ClassicSimilarity], result of:
            0.11152362 = score(doc=5487,freq=1.0), product of:
              0.29914656 = queryWeight, product of:
                2.7470772 = boost
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.018256197 = queryNorm
              0.37280595 = fieldWeight in 5487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.0625 = fieldNorm(doc=5487)
          0.13574404 = weight(abstract_txt:match in 5487) [ClassicSimilarity], result of:
            0.13574404 = score(doc=5487,freq=1.0), product of:
              0.34102532 = queryWeight, product of:
                2.9330683 = boost
                6.3687487 = idf(docFreq=206, maxDocs=44421)
                0.018256197 = queryNorm
              0.3980468 = fieldWeight in 5487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3687487 = idf(docFreq=206, maxDocs=44421)
                0.0625 = fieldNorm(doc=5487)
          0.15334798 = weight(abstract_txt:technique in 5487) [ClassicSimilarity], result of:
            0.15334798 = score(doc=5487,freq=1.0), product of:
              0.43857217 = queryWeight, product of:
                4.294116 = boost
                5.5944448 = idf(docFreq=448, maxDocs=44421)
                0.018256197 = queryNorm
              0.3496528 = fieldWeight in 5487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5944448 = idf(docFreq=448, maxDocs=44421)
                0.0625 = fieldNorm(doc=5487)
        0.28 = coord(7/25)
    
  4. Hoad, T.C.; Zobel, J.: Methods for identifying versioned and plagiarized documents (2003) 0.12
    0.12243855 = sum of:
      0.12243855 = product of:
        0.5101606 = sum of:
          0.014719525 = weight(abstract_txt:retrieval in 159) [ClassicSimilarity], result of:
            0.014719525 = score(doc=159,freq=1.0), product of:
              0.06774411 = queryWeight, product of:
                1.0673798 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.018256197 = queryNorm
              0.21728125 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.0074372413 = weight(abstract_txt:information in 159) [ClassicSimilarity], result of:
            0.0074372413 = score(doc=159,freq=1.0), product of:
              0.049194213 = queryWeight, product of:
                1.1140018 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.018256197 = queryNorm
              0.15118122 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.06904115 = weight(abstract_txt:strings in 159) [ClassicSimilarity], result of:
            0.06904115 = score(doc=159,freq=1.0), product of:
              0.15066221 = queryWeight, product of:
                1.1255646 = boost
                7.33202 = idf(docFreq=78, maxDocs=44421)
                0.018256197 = queryNorm
              0.45825124 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.33202 = idf(docFreq=78, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.06635188 = weight(abstract_txt:best in 159) [ClassicSimilarity], result of:
            0.06635188 = score(doc=159,freq=1.0), product of:
              0.21161264 = queryWeight, product of:
                2.3104665 = boost
                5.0168557 = idf(docFreq=799, maxDocs=44421)
                0.018256197 = queryNorm
              0.31355348 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0168557 = idf(docFreq=799, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.13574404 = weight(abstract_txt:match in 159) [ClassicSimilarity], result of:
            0.13574404 = score(doc=159,freq=1.0), product of:
              0.34102532 = queryWeight, product of:
                2.9330683 = boost
                6.3687487 = idf(docFreq=206, maxDocs=44421)
                0.018256197 = queryNorm
              0.3980468 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3687487 = idf(docFreq=206, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.21686679 = weight(abstract_txt:technique in 159) [ClassicSimilarity], result of:
            0.21686679 = score(doc=159,freq=2.0), product of:
              0.43857217 = queryWeight, product of:
                4.294116 = boost
                5.5944448 = idf(docFreq=448, maxDocs=44421)
                0.018256197 = queryNorm
              0.4944837 = fieldWeight in 159, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5944448 = idf(docFreq=448, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
        0.24 = coord(6/25)
    
  5. He, W.; Erdelez, S.; Wang, F.-K.; Shyu, C.-R.: ¬The effects of conceptual description and search practice on users' mental models and information seeking in a case-based reasoning retrieval system (2008) 0.12
    0.121623136 = sum of:
      0.121623136 = product of:
        0.6081157 = sum of:
          0.020816553 = weight(abstract_txt:retrieval in 3036) [ClassicSimilarity], result of:
            0.020816553 = score(doc=3036,freq=2.0), product of:
              0.06774411 = queryWeight, product of:
                1.0673798 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.018256197 = queryNorm
              0.3072821 = fieldWeight in 3036, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=3036)
          0.0074372413 = weight(abstract_txt:information in 3036) [ClassicSimilarity], result of:
            0.0074372413 = score(doc=3036,freq=1.0), product of:
              0.049194213 = queryWeight, product of:
                1.1140018 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.018256197 = queryNorm
              0.15118122 = fieldWeight in 3036, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0625 = fieldNorm(doc=3036)
          0.12796168 = weight(abstract_txt:search in 3036) [ClassicSimilarity], result of:
            0.12796168 = score(doc=3036,freq=14.0), product of:
              0.14972568 = queryWeight, product of:
                2.2441216 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.018256197 = queryNorm
              0.8546409 = fieldWeight in 3036, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.0625 = fieldNorm(doc=3036)
          0.14836732 = weight(abstract_txt:best in 3036) [ClassicSimilarity], result of:
            0.14836732 = score(doc=3036,freq=5.0), product of:
              0.21161264 = queryWeight, product of:
                2.3104665 = boost
                5.0168557 = idf(docFreq=799, maxDocs=44421)
                0.018256197 = queryNorm
              0.70112693 = fieldWeight in 3036, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.0168557 = idf(docFreq=799, maxDocs=44421)
                0.0625 = fieldNorm(doc=3036)
          0.3035329 = weight(abstract_txt:match in 3036) [ClassicSimilarity], result of:
            0.3035329 = score(doc=3036,freq=5.0), product of:
              0.34102532 = queryWeight, product of:
                2.9330683 = boost
                6.3687487 = idf(docFreq=206, maxDocs=44421)
                0.018256197 = queryNorm
              0.8900597 = fieldWeight in 3036, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.3687487 = idf(docFreq=206, maxDocs=44421)
                0.0625 = fieldNorm(doc=3036)
        0.2 = coord(5/25)