Document (#19144)

Author
Brooks, T.A.
Title
Orthography as a fundamental impediment to online information retrieval
Source
Journal of the American Society for Information Science. 49(1998) no.8, pp.731-741
Year
1998
Abstract
Orthography is the linguistic study of written language: elements of text such as letters, punctuation marks, and spelling. Information retrieval systems operate in the orthographic realm, matching some text strings (i.e., index entries) from documents with other text strings (i.e., query terms) from patrons. During the early history of information retrieval, it was convenient to assume the rationality and uniformity of orthography in order to concentrate effort on building information retrieval systems. Fundamental orthographic problems have persisted into modern information retrieval systems, however, where white-space normalization and the arbitrary treatment of punctuation have exacerbated the orthographic impediment to information retrieval.
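
As a rough illustration of the string-matching problem the abstract describes, the sketch below applies one common, arbitrary normalization (lowercasing, turning every punctuation mark into a space, then splitting on white space) and compares the resulting tokens. The function name `naive_tokens` and the sample strings are hypothetical and are not taken from the article; a real retrieval system may normalize differently.

```python
import re


def naive_tokens(text: str) -> list[str]:
    """Lowercase, replace each run of non-alphanumeric characters
    with a single space, then split on white space.

    This is one arbitrary policy among many, chosen for illustration.
    """
    cleaned = re.sub(r"[^0-9a-z]+", " ", text.lower())
    return cleaned.split()


# Punctuation inside a name is lost, so the index entry becomes two tokens.
print(naive_tokens("AT&T"))                               # ['at', 't']

# Distinct strings collapse to the same token.
print(naive_tokens("C++") == naive_tokens("C"))           # True

# Variant spellings of one word no longer match each other.
print(naive_tokens("on-line") == naive_tokens("online"))  # False
```

Under this policy a query can match entries the patron did not intend and miss entries the patron did intend, purely because of how punctuation and white space were treated.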

Similar documents (author)

  1. Brooks, T.A.: The model of science and scientific models in librarianship (1989) 5.11
    5.1094418 = sum of:
      5.1094418 = weight(author_txt:brooks in 417) [ClassicSimilarity], result of:
        5.1094418 = fieldWeight in 417, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.175107 = idf(docFreq=33, maxDocs=44421)
          0.625 = fieldNorm(doc=417)
    
  2. Brooks, T.A.: Private acts and public objects : an investigation of citer motivations (1985) 5.11
    5.1094418 = sum of:
      5.1094418 = weight(author_txt:brooks in 648) [ClassicSimilarity], result of:
        5.1094418 = fieldWeight in 648, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.175107 = idf(docFreq=33, maxDocs=44421)
          0.625 = fieldNorm(doc=648)
    
  3. Brooks, T.A.: Evidence of complex citer motivation (1986) 5.11
    5.1094418 = sum of:
      5.1094418 = weight(author_txt:brooks in 649) [ClassicSimilarity], result of:
        5.1094418 = fieldWeight in 649, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.175107 = idf(docFreq=33, maxDocs=44421)
          0.625 = fieldNorm(doc=649)
    
  4. Brooks, D.: System-system interaction in computerized indexing of visual materials : a selected review (1988) 5.11
    5.1094418 = sum of:
      5.1094418 = weight(author_txt:brooks in 655) [ClassicSimilarity], result of:
        5.1094418 = fieldWeight in 655, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.175107 = idf(docFreq=33, maxDocs=44421)
          0.625 = fieldNorm(doc=655)
    
  5. Brooks, L.: Nonanalytic concept formation and memory for instances (1978) 5.11
    5.1094418 = sum of:
      5.1094418 = weight(author_txt:brooks in 793) [ClassicSimilarity], result of:
        5.1094418 = fieldWeight in 793, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.175107 = idf(docFreq=33, maxDocs=44421)
          0.625 = fieldNorm(doc=793)
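
The author-match scores above can be reproduced by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); these formulas are inferred from the numbers shown in the explain trees rather than stated on this page, and the function names are illustrative.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency (assumed form, consistent with the values listed)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values from the author_txt:brooks entries: freq=1, docFreq=33, maxDocs=44421, fieldNorm=0.625
idf = classic_idf(33, 44421)
weight = field_weight(1.0, 33, 44421, 0.625)
print(f"{idf:.4f}")     # 8.1751
print(f"{weight:.4f}")  # 5.1094
```

Because all five author matches share the same frequency, document frequency, and field norm, they receive the same score of about 5.11.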
    

Similar documents (content)

  1. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.12
    0.11770158 = sum of:
      0.11770158 = product of:
        0.49042326 = sum of:
          0.010975839 = weight(abstract_txt:have in 3541) [ClassicSimilarity], result of:
            0.010975839 = score(doc=3541,freq=2.0), product of:
              0.04435757 = queryWeight, product of:
                3.199388 = idf(docFreq=4924, maxDocs=44421)
                0.013864392 = queryNorm
              0.24744003 = fieldWeight in 3541, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.199388 = idf(docFreq=4924, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3541)
          0.09218696 = weight(abstract_txt:spelling in 3541) [ClassicSimilarity], result of:
            0.09218696 = score(doc=3541,freq=3.0), product of:
              0.12708135 = queryWeight, product of:
                1.1968564 = boost
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.013864392 = queryNorm
              0.7254169 = fieldWeight in 3541, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3541)
          0.014109944 = weight(abstract_txt:systems in 3541) [ClassicSimilarity], result of:
            0.014109944 = score(doc=3541,freq=1.0), product of:
              0.07563681 = queryWeight, product of:
                1.5992942 = boost
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.013864392 = queryNorm
              0.18654864 = fieldWeight in 3541, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3541)
          0.014230222 = weight(abstract_txt:information in 3541) [ClassicSimilarity], result of:
            0.014230222 = score(doc=3541,freq=2.0), product of:
              0.07606604 = queryWeight, product of:
                2.268152 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.013864392 = queryNorm
              0.18707721 = fieldWeight in 3541, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3541)
          0.042245943 = weight(abstract_txt:retrieval in 3541) [ClassicSimilarity], result of:
            0.042245943 = score(doc=3541,freq=2.0), product of:
              0.15712297 = queryWeight, product of:
                3.2598424 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.013864392 = queryNorm
              0.26887184 = fieldWeight in 3541, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3541)
          0.31667435 = weight(abstract_txt:orthographic in 3541) [ClassicSimilarity], result of:
            0.31667435 = score(doc=3541,freq=1.0), product of:
              0.6018084 = queryWeight, product of:
                4.5111876 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.013864392 = queryNorm
              0.5262046 = fieldWeight in 3541, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3541)
        0.24 = coord(6/25)
    
  2. Levow, G.-A.; Oard, D.W.; Resnik, P.: Dictionary-based techniques for cross-language information retrieval (2005) 0.11
    0.10905565 = sum of:
      0.10905565 = product of:
        0.54527825 = sum of:
          0.01567977 = weight(abstract_txt:have in 2025) [ClassicSimilarity], result of:
            0.01567977 = score(doc=2025,freq=2.0), product of:
              0.04435757 = queryWeight, product of:
                3.199388 = idf(docFreq=4924, maxDocs=44421)
                0.013864392 = queryNorm
              0.35348576 = fieldWeight in 2025, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.199388 = idf(docFreq=4924, maxDocs=44421)
                0.078125 = fieldNorm(doc=2025)
          0.020157063 = weight(abstract_txt:systems in 2025) [ClassicSimilarity], result of:
            0.020157063 = score(doc=2025,freq=1.0), product of:
              0.07563681 = queryWeight, product of:
                1.5992942 = boost
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.013864392 = queryNorm
              0.26649806 = fieldWeight in 2025, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.078125 = fieldNorm(doc=2025)
          0.014374696 = weight(abstract_txt:information in 2025) [ClassicSimilarity], result of:
            0.014374696 = score(doc=2025,freq=1.0), product of:
              0.07606604 = queryWeight, product of:
                2.268152 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.013864392 = queryNorm
              0.18897653 = fieldWeight in 2025, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.078125 = fieldNorm(doc=2025)
          0.042674843 = weight(abstract_txt:retrieval in 2025) [ClassicSimilarity], result of:
            0.042674843 = score(doc=2025,freq=1.0), product of:
              0.15712297 = queryWeight, product of:
                3.2598424 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.013864392 = queryNorm
              0.27160156 = fieldWeight in 2025, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.078125 = fieldNorm(doc=2025)
          0.4523919 = weight(abstract_txt:orthographic in 2025) [ClassicSimilarity], result of:
            0.4523919 = score(doc=2025,freq=1.0), product of:
              0.6018084 = queryWeight, product of:
                4.5111876 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.013864392 = queryNorm
              0.7517208 = fieldWeight in 2025, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.078125 = fieldNorm(doc=2025)
        0.2 = coord(5/25)
    
  3. Whitney, C.; Schiff, L.: The Melvyl Recommender Project : developing library recommendation services (2006) 0.09
    0.08780092 = sum of:
      0.08780092 = product of:
        0.3135747 = sum of:
          0.019203719 = weight(abstract_txt:have in 2173) [ClassicSimilarity], result of:
            0.019203719 = score(doc=2173,freq=3.0), product of:
              0.04435757 = queryWeight, product of:
                3.199388 = idf(docFreq=4924, maxDocs=44421)
                0.013864392 = queryNorm
              0.43292987 = fieldWeight in 2173, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.199388 = idf(docFreq=4924, maxDocs=44421)
                0.078125 = fieldNorm(doc=2173)
          0.07564222 = weight(abstract_txt:patrons in 2173) [ClassicSimilarity], result of:
            0.07564222 = score(doc=2173,freq=2.0), product of:
              0.1005173 = queryWeight, product of:
                1.0644408 = boost
                6.8111186 = idf(docFreq=132, maxDocs=44421)
                0.013864392 = queryNorm
              0.7525294 = fieldWeight in 2173, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8111186 = idf(docFreq=132, maxDocs=44421)
                0.078125 = fieldNorm(doc=2173)
          0.076034516 = weight(abstract_txt:spelling in 2173) [ClassicSimilarity], result of:
            0.076034516 = score(doc=2173,freq=1.0), product of:
              0.12708135 = queryWeight, product of:
                1.1968564 = boost
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.013864392 = queryNorm
              0.59831375 = fieldWeight in 2173, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.078125 = fieldNorm(doc=2173)
          0.028506393 = weight(abstract_txt:systems in 2173) [ClassicSimilarity], result of:
            0.028506393 = score(doc=2173,freq=2.0), product of:
              0.07563681 = queryWeight, product of:
                1.5992942 = boost
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.013864392 = queryNorm
              0.37688518 = fieldWeight in 2173, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.078125 = fieldNorm(doc=2173)
          0.033507634 = weight(abstract_txt:text in 2173) [ClassicSimilarity], result of:
            0.033507634 = score(doc=2173,freq=1.0), product of:
              0.10613962 = queryWeight, product of:
                1.8945258 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.013864392 = queryNorm
              0.3156939 = fieldWeight in 2173, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=2173)
          0.020328889 = weight(abstract_txt:information in 2173) [ClassicSimilarity], result of:
            0.020328889 = score(doc=2173,freq=2.0), product of:
              0.07606604 = queryWeight, product of:
                2.268152 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.013864392 = queryNorm
              0.26725316 = fieldWeight in 2173, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.078125 = fieldNorm(doc=2173)
          0.060351342 = weight(abstract_txt:retrieval in 2173) [ClassicSimilarity], result of:
            0.060351342 = score(doc=2173,freq=2.0), product of:
              0.15712297 = queryWeight, product of:
                3.2598424 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.013864392 = queryNorm
              0.3841026 = fieldWeight in 2173, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.078125 = fieldNorm(doc=2173)
        0.28 = coord(7/25)
    
  4. Taghva, K.; Borsack, J.; Condit, A.: Evaluation of model-based retrieval effectiveness with OCR text (1996) 0.08
    0.076730505 = sum of:
      0.076730505 = product of:
        0.3836525 = sum of:
          0.034207672 = weight(abstract_txt:systems in 4553) [ClassicSimilarity], result of:
            0.034207672 = score(doc=4553,freq=2.0), product of:
              0.07563681 = queryWeight, product of:
                1.5992942 = boost
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.013864392 = queryNorm
              0.4522622 = fieldWeight in 4553, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.09375 = fieldNorm(doc=4553)
          0.06964431 = weight(abstract_txt:text in 4553) [ClassicSimilarity], result of:
            0.06964431 = score(doc=4553,freq=3.0), product of:
              0.10613962 = queryWeight, product of:
                1.8945258 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.013864392 = queryNorm
              0.6561575 = fieldWeight in 4553, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.09375 = fieldNorm(doc=4553)
          0.017249634 = weight(abstract_txt:information in 4553) [ClassicSimilarity], result of:
            0.017249634 = score(doc=4553,freq=1.0), product of:
              0.07606604 = queryWeight, product of:
                2.268152 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.013864392 = queryNorm
              0.22677183 = fieldWeight in 4553, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.09375 = fieldNorm(doc=4553)
          0.16013125 = weight(abstract_txt:strings in 4553) [ClassicSimilarity], result of:
            0.16013125 = score(doc=4553,freq=1.0), product of:
              0.23295991 = queryWeight, product of:
                2.2916944 = boost
                7.33202 = idf(docFreq=78, maxDocs=44421)
                0.013864392 = queryNorm
              0.68737686 = fieldWeight in 4553, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.33202 = idf(docFreq=78, maxDocs=44421)
                0.09375 = fieldNorm(doc=4553)
          0.10241963 = weight(abstract_txt:retrieval in 4553) [ClassicSimilarity], result of:
            0.10241963 = score(doc=4553,freq=4.0), product of:
              0.15712297 = queryWeight, product of:
                3.2598424 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.013864392 = queryNorm
              0.6518438 = fieldWeight in 4553, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.09375 = fieldNorm(doc=4553)
        0.2 = coord(5/25)
    
  5. Young, C.W.; Eastman, C.M.; Oakman, R.L.: An analysis of ill-formed input in natural language queries to document retrieval systems (1991) 0.08
    0.07603218 = sum of:
      0.07603218 = product of:
        0.47520113 = sum of:
          0.10752905 = weight(abstract_txt:spelling in 6263) [ClassicSimilarity], result of:
            0.10752905 = score(doc=6263,freq=2.0), product of:
              0.12708135 = queryWeight, product of:
                1.1968564 = boost
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.013864392 = queryNorm
              0.8461434 = fieldWeight in 6263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.078125 = fieldNorm(doc=6263)
          0.014374696 = weight(abstract_txt:information in 6263) [ClassicSimilarity], result of:
            0.014374696 = score(doc=6263,freq=1.0), product of:
              0.07606604 = queryWeight, product of:
                2.268152 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.013864392 = queryNorm
              0.18897653 = fieldWeight in 6263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.078125 = fieldNorm(doc=6263)
          0.31062254 = weight(abstract_txt:punctuation in 6263) [ClassicSimilarity], result of:
            0.31062254 = score(doc=6263,freq=2.0), product of:
              0.32476056 = queryWeight, product of:
                2.705813 = boost
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.013864392 = queryNorm
              0.9564663 = fieldWeight in 6263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.078125 = fieldNorm(doc=6263)
          0.042674843 = weight(abstract_txt:retrieval in 6263) [ClassicSimilarity], result of:
            0.042674843 = score(doc=6263,freq=1.0), product of:
              0.15712297 = queryWeight, product of:
                3.2598424 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.013864392 = queryNorm
              0.27160156 = fieldWeight in 6263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.078125 = fieldNorm(doc=6263)
        0.16 = coord(4/25)
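
The content-similarity scores combine several matching terms. As a sketch under the same assumptions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), both inferred from the listed values), the code below recomputes the score of result 5 (doc 6263) from the frequencies, boosts, field norm, queryNorm, and coord factor shown in its explain tree. The helper names are illustrative.

```python
import math

QUERY_NORM = 0.013864392  # queryNorm as listed in the explain trees
MAX_DOCS = 44421


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Assumed idf form, consistent with the listed values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for one matching term."""
    i = idf(doc_freq)
    query_weight = boost * i * QUERY_NORM
    field_weight = math.sqrt(freq) * i * field_norm
    return query_weight * field_weight


# (term, freq, docFreq, boost, fieldNorm) copied from result 5 (doc 6263)
terms = [
    ("spelling",    2.0,    56, 1.1968564, 0.078125),
    ("information", 1.0, 10748, 2.268152,  0.078125),
    ("punctuation", 2.0,    20, 2.705813,  0.078125),
    ("retrieval",   1.0,  3732, 3.2598424, 0.078125),
]

raw_sum = sum(term_score(f, df, b, n) for _, f, df, b, n in terms)
coord = 4 / 25  # 4 of the 25 query terms matched this document
score = raw_sum * coord
print(f"{raw_sum:.4f}")  # 0.4752
print(f"{score:.4f}")    # 0.0760
```

The coord factor scales the summed term scores by the fraction of query terms found in the document, which is why a document matching rare terms such as "punctuation" can still end up with a low overall score.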