Document (#6275)

Author
Uratani, N.
Takeda, M.
Title
¬A fast string-searching algorithm for multiple patterns
Source
Information processing and management. 29(1993) no.6, S.775-791
Year
1993
Abstract
The string-searching problem is to find all occurrences of pattern(s) in a text string. The Aho-Corasick string searching algorithm simultaneously finds all occurrences of multiple patterns in one pass through the text. The Boyer-Moore algorithm is the fastest algorithm for a single pattern. By combining the ideas of these two algorithms, presents an efficient string searching algorithm for multiple patterns. The algorithm runs in sublinear time, on the average, as the BM algorithm achieves, and its preprocessing time is linear proportional to the sum of the lengths of the patterns like the AC algorithm
Theme
Retrievalalgorithmen

Similar documents (content)

  1. Baeza-Yates, R.A.: String searching algorithms (1992) 0.19
    0.19122241 = sum of:
      0.19122241 = product of:
        1.1951401 = sum of:
          0.041691948 = weight(abstract_txt:text in 4505) [ClassicSimilarity], result of:
            0.041691948 = score(doc=4505,freq=1.0), product of:
              0.06603223 = queryWeight, product of:
                1.3976067 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.011692162 = queryNorm
              0.6313878 = fieldWeight in 4505, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.15625 = fieldNorm(doc=4505)
          0.09951741 = weight(abstract_txt:searching in 4505) [ClassicSimilarity], result of:
            0.09951741 = score(doc=4505,freq=1.0), product of:
              0.14859262 = queryWeight, product of:
                2.9649723 = boost
                4.2862926 = idf(docFreq=1660, maxDocs=44421)
                0.011692162 = queryNorm
              0.6697332 = fieldWeight in 4505, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2862926 = idf(docFreq=1660, maxDocs=44421)
                0.15625 = fieldNorm(doc=4505)
          0.58462685 = weight(abstract_txt:string in 4505) [ClassicSimilarity], result of:
            0.58462685 = score(doc=4505,freq=1.0), product of:
              0.52113914 = queryWeight, product of:
                6.2080307 = boost
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.011692162 = queryNorm
              1.1218249 = fieldWeight in 4505, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.15625 = fieldNorm(doc=4505)
          0.4693039 = weight(abstract_txt:algorithm in 4505) [ClassicSimilarity], result of:
            0.4693039 = score(doc=4505,freq=1.0), product of:
              0.5264745 = queryWeight, product of:
                7.892701 = boost
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.011692162 = queryNorm
              0.8914087 = fieldWeight in 4505, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.15625 = fieldNorm(doc=4505)
        0.16 = coord(4/25)
    
  2. Cui, H.; Boufford, D.; Selden, P.: Semantic annotation of biosystematics literature without training examples (2010) 0.16
    0.16023676 = sum of:
      0.16023676 = product of:
        0.6676532 = sum of:
          0.03751736 = weight(abstract_txt:linear in 409) [ClassicSimilarity], result of:
            0.03751736 = score(doc=409,freq=1.0), product of:
              0.08998278 = queryWeight, product of:
                1.1536437 = boost
                6.6710296 = idf(docFreq=152, maxDocs=44421)
                0.011692162 = queryNorm
              0.41693935 = fieldWeight in 409, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6710296 = idf(docFreq=152, maxDocs=44421)
                0.0625 = fieldNorm(doc=409)
          0.06166598 = weight(abstract_txt:runs in 409) [ClassicSimilarity], result of:
            0.06166598 = score(doc=409,freq=1.0), product of:
              0.1253242 = queryWeight, product of:
                1.3614744 = boost
                7.872826 = idf(docFreq=45, maxDocs=44421)
                0.011692162 = queryNorm
              0.49205163 = fieldWeight in 409, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.872826 = idf(docFreq=45, maxDocs=44421)
                0.0625 = fieldNorm(doc=409)
          0.016676778 = weight(abstract_txt:text in 409) [ClassicSimilarity], result of:
            0.016676778 = score(doc=409,freq=1.0), product of:
              0.06603223 = queryWeight, product of:
                1.3976067 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.011692162 = queryNorm
              0.25255513 = fieldWeight in 409, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=409)
          0.018047685 = weight(abstract_txt:time in 409) [ClassicSimilarity], result of:
            0.018047685 = score(doc=409,freq=1.0), product of:
              0.069603145 = queryWeight, product of:
                1.4348993 = boost
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.011692162 = queryNorm
              0.2592941 = fieldWeight in 409, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.0625 = fieldNorm(doc=409)
          0.07392331 = weight(abstract_txt:patterns in 409) [ClassicSimilarity], result of:
            0.07392331 = score(doc=409,freq=1.0), product of:
              0.22449782 = queryWeight, product of:
                3.644417 = boost
                5.2685275 = idf(docFreq=621, maxDocs=44421)
                0.011692162 = queryNorm
              0.32928297 = fieldWeight in 409, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2685275 = idf(docFreq=621, maxDocs=44421)
                0.0625 = fieldNorm(doc=409)
          0.45982206 = weight(abstract_txt:algorithm in 409) [ClassicSimilarity], result of:
            0.45982206 = score(doc=409,freq=6.0), product of:
              0.5264745 = queryWeight, product of:
                7.892701 = boost
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.011692162 = queryNorm
              0.8733986 = fieldWeight in 409, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.0625 = fieldNorm(doc=409)
        0.24 = coord(6/25)
    
  3. Ackermann, J.: Knuth-Morris-Pratt (2005) 0.15
    0.15331373 = sum of:
      0.15331373 = product of:
        0.6388072 = sum of:
          0.021380862 = weight(abstract_txt:fast in 990) [ClassicSimilarity], result of:
            0.021380862 = score(doc=990,freq=1.0), product of:
              0.067610785 = queryWeight, product of:
                5.7825737 = idf(docFreq=371, maxDocs=44421)
                0.011692162 = queryNorm
              0.3162345 = fieldWeight in 990, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7825737 = idf(docFreq=371, maxDocs=44421)
                0.0546875 = fieldNorm(doc=990)
          0.014592182 = weight(abstract_txt:text in 990) [ClassicSimilarity], result of:
            0.014592182 = score(doc=990,freq=1.0), product of:
              0.06603223 = queryWeight, product of:
                1.3976067 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.011692162 = queryNorm
              0.22098574 = fieldWeight in 990, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0546875 = fieldNorm(doc=990)
          0.0985057 = weight(abstract_txt:moore in 990) [ClassicSimilarity], result of:
            0.0985057 = score(doc=990,freq=1.0), product of:
              0.18720038 = queryWeight, product of:
                1.6639695 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.011692162 = queryNorm
              0.5262046 = fieldWeight in 990, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.0546875 = fieldNorm(doc=990)
          0.10760793 = weight(abstract_txt:boyer in 990) [ClassicSimilarity], result of:
            0.10760793 = score(doc=990,freq=1.0), product of:
              0.19856164 = queryWeight, product of:
                1.7137192 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.011692162 = queryNorm
              0.5419372 = fieldWeight in 990, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.0546875 = fieldNorm(doc=990)
          0.10734497 = weight(abstract_txt:pattern in 990) [ClassicSimilarity], result of:
            0.10734497 = score(doc=990,freq=4.0), product of:
              0.15734163 = queryWeight, product of:
                2.1573908 = boost
                6.2376356 = idf(docFreq=235, maxDocs=44421)
                0.011692162 = queryNorm
              0.6822414 = fieldWeight in 990, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2376356 = idf(docFreq=235, maxDocs=44421)
                0.0546875 = fieldNorm(doc=990)
          0.2893755 = weight(abstract_txt:string in 990) [ClassicSimilarity], result of:
            0.2893755 = score(doc=990,freq=2.0), product of:
              0.52113914 = queryWeight, product of:
                6.2080307 = boost
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.011692162 = queryNorm
              0.55527496 = fieldWeight in 990, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.0546875 = fieldNorm(doc=990)
        0.24 = coord(6/25)
    
  4. Mustafa, S.H.: Word-oriented approximate string matching using occurrence heuristic tables : a heuristic for searching Arabic text (2005) 0.11
    0.111267745 = sum of:
      0.111267745 = product of:
        0.5563387 = sum of:
          0.020845974 = weight(abstract_txt:text in 2715) [ClassicSimilarity], result of:
            0.020845974 = score(doc=2715,freq=1.0), product of:
              0.06603223 = queryWeight, product of:
                1.3976067 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.011692162 = queryNorm
              0.3156939 = fieldWeight in 2715, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=2715)
          0.12931478 = weight(abstract_txt:occurrences in 2715) [ClassicSimilarity], result of:
            0.12931478 = score(doc=2715,freq=1.0), product of:
              0.22293246 = queryWeight, product of:
                2.567992 = boost
                7.4248013 = idf(docFreq=71, maxDocs=44421)
                0.011692162 = queryNorm
              0.5800626 = fieldWeight in 2715, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4248013 = idf(docFreq=71, maxDocs=44421)
                0.078125 = fieldNorm(doc=2715)
          0.06410584 = weight(abstract_txt:multiple in 2715) [ClassicSimilarity], result of:
            0.06410584 = score(doc=2715,freq=1.0), product of:
              0.15984657 = queryWeight, product of:
                2.6632032 = boost
                5.1333895 = idf(docFreq=711, maxDocs=44421)
                0.011692162 = queryNorm
              0.40104604 = fieldWeight in 2715, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1333895 = idf(docFreq=711, maxDocs=44421)
                0.078125 = fieldNorm(doc=2715)
          0.049758706 = weight(abstract_txt:searching in 2715) [ClassicSimilarity], result of:
            0.049758706 = score(doc=2715,freq=1.0), product of:
              0.14859262 = queryWeight, product of:
                2.9649723 = boost
                4.2862926 = idf(docFreq=1660, maxDocs=44421)
                0.011692162 = queryNorm
              0.3348666 = fieldWeight in 2715, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2862926 = idf(docFreq=1660, maxDocs=44421)
                0.078125 = fieldNorm(doc=2715)
          0.29231343 = weight(abstract_txt:string in 2715) [ClassicSimilarity], result of:
            0.29231343 = score(doc=2715,freq=1.0), product of:
              0.52113914 = queryWeight, product of:
                6.2080307 = boost
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.011692162 = queryNorm
              0.56091243 = fieldWeight in 2715, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.078125 = fieldNorm(doc=2715)
        0.2 = coord(5/25)
    
  5. Wang, P.; Hao, T.; Yan, J.; Jin, L.: Large-scale extraction of drug-disease pairs from the medical literature (2017) 0.11
    0.10620717 = sum of:
      0.10620717 = product of:
        0.53103584 = sum of:
          0.052620515 = weight(abstract_txt:achieves in 4927) [ClassicSimilarity], result of:
            0.052620515 = score(doc=4927,freq=1.0), product of:
              0.112747766 = queryWeight, product of:
                1.291356 = boost
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.011692162 = queryNorm
              0.46671006 = fieldWeight in 4927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.0625 = fieldNorm(doc=4927)
          0.023584526 = weight(abstract_txt:text in 4927) [ClassicSimilarity], result of:
            0.023584526 = score(doc=4927,freq=2.0), product of:
              0.06603223 = queryWeight, product of:
                1.3976067 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.011692162 = queryNorm
              0.3571669 = fieldWeight in 4927, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=4927)
          0.018047685 = weight(abstract_txt:time in 4927) [ClassicSimilarity], result of:
            0.018047685 = score(doc=4927,freq=1.0), product of:
              0.069603145 = queryWeight, product of:
                1.4348993 = boost
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.011692162 = queryNorm
              0.2592941 = fieldWeight in 4927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.0625 = fieldNorm(doc=4927)
          0.061339986 = weight(abstract_txt:pattern in 4927) [ClassicSimilarity], result of:
            0.061339986 = score(doc=4927,freq=1.0), product of:
              0.15734163 = queryWeight, product of:
                2.1573908 = boost
                6.2376356 = idf(docFreq=235, maxDocs=44421)
                0.011692162 = queryNorm
              0.38985223 = fieldWeight in 4927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2376356 = idf(docFreq=235, maxDocs=44421)
                0.0625 = fieldNorm(doc=4927)
          0.37544313 = weight(abstract_txt:algorithm in 4927) [ClassicSimilarity], result of:
            0.37544313 = score(doc=4927,freq=4.0), product of:
              0.5264745 = queryWeight, product of:
                7.892701 = boost
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.011692162 = queryNorm
              0.71312696 = fieldWeight in 4927, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.0625 = fieldNorm(doc=4927)
        0.2 = coord(5/25)