Document (#23264)

Author
Young, C.W.
Eastman, C.M.
Oakman, R.L.
Title
¬An analysis of ill-formed input in natural language queries to document retrieval systems
Source
Information processing and management. 27(1991) no.6, pp. 615-622
Year
1991
Abstract
Natural language document retrieval queries from the Thomas Cooper Library, South Carolina Univ. were analysed in order to investigate the frequency of various types of ill-formed input, such as spelling errors, cooccurrence violations, conjunctions, ellipsis, and missing or incorrect punctuation. Users were requested to write out their requests for information in complete sentences on the form normally used by the library. The primary reason for analysing ill-formed inputs was to determine whether there is a significant need to study ill-formed inputs in detail. Results indicated that most of the queries were sentence fragments and that many of them contained some type of ill-formed input. Conjunctions caused the most problems. The next most serious problem was caused by punctuation errors. Spelling errors occurred in a small number of queries. The remaining types of ill-formed input considered, ellipsis and cooccurrence violations, were not found in the queries.
Theme
User studies
Language retrieval

Similar documents (author)

  1. Eastman, C.M.: Overlaps in postings to thesaurus terms : a preliminary study (1988) 2.27
    2.267544 = sum of:
      2.267544 = product of:
        4.535088 = sum of:
          4.535088 = weight(author_txt:eastman in 3623) [ClassicSimilarity], result of:
            4.535088 = score(doc=3623,freq=1.0), product of:
              0.7799306 = queryWeight, product of:
                1.1163163 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.07509637 = queryNorm
              5.814733 = fieldWeight in 3623, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.625 = fieldNorm(doc=3623)
        0.5 = coord(1/2)
    
  2. Eastman, C.M.: 30,000 hits may be better than 300 : precision anomalies in Internet searches (2002) 2.27
    2.267544 = sum of:
      2.267544 = product of:
        4.535088 = sum of:
          4.535088 = weight(author_txt:eastman in 231) [ClassicSimilarity], result of:
            4.535088 = score(doc=231,freq=1.0), product of:
              0.7799306 = queryWeight, product of:
                1.1163163 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.07509637 = queryNorm
              5.814733 = fieldWeight in 231, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.625 = fieldNorm(doc=231)
        0.5 = coord(1/2)
    
  3. Chang, Y.F.; Eastman, C.M.: ¬An information retrieval system for reusable software (1993) 1.81
    1.8140352 = sum of:
      1.8140352 = product of:
        3.6280704 = sum of:
          3.6280704 = weight(author_txt:eastman in 6347) [ClassicSimilarity], result of:
            3.6280704 = score(doc=6347,freq=1.0), product of:
              0.7799306 = queryWeight, product of:
                1.1163163 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.07509637 = queryNorm
              4.6517863 = fieldWeight in 6347, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.5 = fieldNorm(doc=6347)
        0.5 = coord(1/2)
    
  4. Eastman, C.M.; Carter, R.M.: Anthropological perspectives on classification schemes (1994) 1.81
    1.8140352 = sum of:
      1.8140352 = product of:
        3.6280704 = sum of:
          3.6280704 = weight(author_txt:eastman in 502) [ClassicSimilarity], result of:
            3.6280704 = score(doc=502,freq=1.0), product of:
              0.7799306 = queryWeight, product of:
                1.1163163 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.07509637 = queryNorm
              4.6517863 = fieldWeight in 502, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.5 = fieldNorm(doc=502)
        0.5 = coord(1/2)
    
  5. Rose, J.R.; Eastman, C.M.: Hierarchical classification as an aid to browsing (1994) 1.81
    1.8140352 = sum of:
      1.8140352 = product of:
        3.6280704 = sum of:
          3.6280704 = weight(author_txt:eastman in 508) [ClassicSimilarity], result of:
            3.6280704 = score(doc=508,freq=1.0), product of:
              0.7799306 = queryWeight, product of:
                1.1163163 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.07509637 = queryNorm
              4.6517863 = fieldWeight in 508, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.5 = fieldNorm(doc=508)
        0.5 = coord(1/2)
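
The score trees above can be reproduced by hand. The following is a minimal sketch, assuming the standard ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the tf and idf values printed in the listing. The function names are illustrative, not part of Lucene; boost, queryNorm and fieldNorm are copied from the first entry (doc 3623) rather than derived.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as used by ClassicSimilarity.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Term frequency factor: square root of the raw count.
    return math.sqrt(freq)

def term_score(boost, query_norm, freq, doc_freq, max_docs, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree.
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the first author match (doc 3623).
eastman = term_score(boost=1.1163163, query_norm=0.07509637, freq=1.0,
                     doc_freq=10, max_docs=44421, field_norm=0.625)
coord = 1 / 2  # one of two query clauses matched
total = eastman * coord
print(round(eastman, 4), round(total, 4))
```

Entries 3 to 5 differ from entries 1 and 2 only in fieldNorm (0.5 instead of 0.625), which is why their scores are lower although the same term matched.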
    

Similar documents (content)

  1. Lee, Y.-H.; Evens, M.W.: Natural language interface for an expert system (1998) 0.19
    0.19267017 = sum of:
      0.19267017 = product of:
        0.6881077 = sum of:
          0.08637304 = weight(abstract_txt:fragments in 6108) [ClassicSimilarity], result of:
            0.08637304 = score(doc=6108,freq=1.0), product of:
              0.11349079 = queryWeight, product of:
                1.0974053 = boost
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.01273935 = queryNorm
              0.7610577 = fieldWeight in 6108, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.09375 = fieldNorm(doc=6108)
          0.033136073 = weight(abstract_txt:language in 6108) [ClassicSimilarity], result of:
            0.033136073 = score(doc=6108,freq=2.0), product of:
              0.059920557 = queryWeight, product of:
                1.1276898 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.01273935 = queryNorm
              0.5530001 = fieldWeight in 6108, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.09375 = fieldNorm(doc=6108)
          0.028745584 = weight(abstract_txt:types in 6108) [ClassicSimilarity], result of:
            0.028745584 = score(doc=6108,freq=1.0), product of:
              0.06866982 = queryWeight, product of:
                1.2072152 = boost
                4.4651284 = idf(docFreq=1388, maxDocs=44421)
                0.01273935 = queryNorm
              0.4186058 = fieldWeight in 6108, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4651284 = idf(docFreq=1388, maxDocs=44421)
                0.09375 = fieldNorm(doc=6108)
          0.051400054 = weight(abstract_txt:most in 6108) [ClassicSimilarity], result of:
            0.051400054 = score(doc=6108,freq=3.0), product of:
              0.08029421 = queryWeight, product of:
                1.5987829 = boost
                3.94228 = idf(docFreq=2342, maxDocs=44421)
                0.01273935 = queryNorm
              0.6401465 = fieldWeight in 6108, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.94228 = idf(docFreq=2342, maxDocs=44421)
                0.09375 = fieldNorm(doc=6108)
          0.2051168 = weight(abstract_txt:spelling in 6108) [ClassicSimilarity], result of:
            0.2051168 = score(doc=6108,freq=2.0), product of:
              0.20201144 = queryWeight, product of:
                2.070569 = boost
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.01273935 = queryNorm
              1.0153722 = fieldWeight in 6108, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.09375 = fieldNorm(doc=6108)
          0.13599458 = weight(abstract_txt:errors in 6108) [ClassicSimilarity], result of:
            0.13599458 = score(doc=6108,freq=1.0), product of:
              0.2215287 = queryWeight, product of:
                2.6555982 = boost
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.01273935 = queryNorm
              0.6138915 = fieldWeight in 6108, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.09375 = fieldNorm(doc=6108)
          0.14734162 = weight(abstract_txt:input in 6108) [ClassicSimilarity], result of:
            0.14734162 = score(doc=6108,freq=1.0), product of:
              0.25720465 = queryWeight, product of:
                3.3041224 = boost
                6.110481 = idf(docFreq=267, maxDocs=44421)
                0.01273935 = queryNorm
              0.57285756 = fieldWeight in 6108, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.110481 = idf(docFreq=267, maxDocs=44421)
                0.09375 = fieldNorm(doc=6108)
        0.28 = coord(7/25)
    
  2. Drabenstott, K.M.; Weller, M.S.: Handling spelling errors in online catalog searches (1996) 0.12
    0.12142735 = sum of:
      0.12142735 = product of:
        0.6071367 = sum of:
          0.01978389 = weight(abstract_txt:most in 6973) [ClassicSimilarity], result of:
            0.01978389 = score(doc=6973,freq=1.0), product of:
              0.08029421 = queryWeight, product of:
                1.5987829 = boost
                3.94228 = idf(docFreq=2342, maxDocs=44421)
                0.01273935 = queryNorm
              0.2463925 = fieldWeight in 6973, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.94228 = idf(docFreq=2342, maxDocs=44421)
                0.0625 = fieldNorm(doc=6973)
          0.030035023 = weight(abstract_txt:were in 6973) [ClassicSimilarity], result of:
            0.030035023 = score(doc=6973,freq=2.0), product of:
              0.0926541 = queryWeight, product of:
                1.9831203 = boost
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.01273935 = queryNorm
              0.3241629 = fieldWeight in 6973, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.0625 = fieldNorm(doc=6973)
          0.21621206 = weight(abstract_txt:spelling in 6973) [ClassicSimilarity], result of:
            0.21621206 = score(doc=6973,freq=5.0), product of:
              0.20201144 = queryWeight, product of:
                2.070569 = boost
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.01273935 = queryNorm
              1.0702962 = fieldWeight in 6973, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.0625 = fieldNorm(doc=6973)
          0.1813261 = weight(abstract_txt:errors in 6973) [ClassicSimilarity], result of:
            0.1813261 = score(doc=6973,freq=4.0), product of:
              0.2215287 = queryWeight, product of:
                2.6555982 = boost
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.01273935 = queryNorm
              0.818522 = fieldWeight in 6973, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.0625 = fieldNorm(doc=6973)
          0.15977968 = weight(abstract_txt:queries in 6973) [ClassicSimilarity], result of:
            0.15977968 = score(doc=6973,freq=5.0), product of:
              0.22410439 = queryWeight, product of:
                3.4482355 = boost
                5.1015973 = idf(docFreq=734, maxDocs=44421)
                0.01273935 = queryNorm
              0.7129699 = fieldWeight in 6973, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.1015973 = idf(docFreq=734, maxDocs=44421)
                0.0625 = fieldNorm(doc=6973)
        0.2 = coord(5/25)
    
  3. Soricut, R.; Marcu, D.: Abstractive headline generation using WIDL-expressions (2007) 0.10
    0.1049909 = sum of:
      0.1049909 = product of:
        0.5249545 = sum of:
          0.03381936 = weight(abstract_txt:language in 1943) [ClassicSimilarity], result of:
            0.03381936 = score(doc=1943,freq=3.0), product of:
              0.059920557 = queryWeight, product of:
                1.1276898 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.01273935 = queryNorm
              0.5644033 = fieldWeight in 1943, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.078125 = fieldNorm(doc=1943)
          0.042613737 = weight(abstract_txt:document in 1943) [ClassicSimilarity], result of:
            0.042613737 = score(doc=1943,freq=4.0), product of:
              0.063511506 = queryWeight, product of:
                1.1609886 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.01273935 = queryNorm
              0.6709609 = fieldWeight in 1943, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.078125 = fieldNorm(doc=1943)
          0.024729865 = weight(abstract_txt:most in 1943) [ClassicSimilarity], result of:
            0.024729865 = score(doc=1943,freq=1.0), product of:
              0.08029421 = queryWeight, product of:
                1.5987829 = boost
                3.94228 = idf(docFreq=2342, maxDocs=44421)
                0.01273935 = queryNorm
              0.30799064 = fieldWeight in 1943, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.94228 = idf(docFreq=2342, maxDocs=44421)
                0.078125 = fieldNorm(doc=1943)
          0.17364377 = weight(abstract_txt:input in 1943) [ClassicSimilarity], result of:
            0.17364377 = score(doc=1943,freq=2.0), product of:
              0.25720465 = queryWeight, product of:
                3.3041224 = boost
                6.110481 = idf(docFreq=267, maxDocs=44421)
                0.01273935 = queryNorm
              0.6751191 = fieldWeight in 1943, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.110481 = idf(docFreq=267, maxDocs=44421)
                0.078125 = fieldNorm(doc=1943)
          0.25014776 = weight(abstract_txt:formed in 1943) [ClassicSimilarity], result of:
            0.25014776 = score(doc=1943,freq=1.0), product of:
              0.47316304 = queryWeight, product of:
                5.488678 = boost
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.01273935 = queryNorm
              0.5286714 = fieldWeight in 1943, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.078125 = fieldNorm(doc=1943)
        0.2 = coord(5/25)
    
  4. Spink, A.; Wolfram, D.; Jansen, B.J.; Saracevic, T.: Searching the Web : the public and their queries (2001) 0.09
    0.090851754 = sum of:
      0.090851754 = product of:
        0.37854898 = sum of:
          0.01171537 = weight(abstract_txt:language in 980) [ClassicSimilarity], result of:
            0.01171537 = score(doc=980,freq=1.0), product of:
              0.059920557 = queryWeight, product of:
                1.1276898 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.01273935 = queryNorm
              0.19551504 = fieldWeight in 980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.046875 = fieldNorm(doc=980)
          0.020983985 = weight(abstract_txt:most in 980) [ClassicSimilarity], result of:
            0.020983985 = score(doc=980,freq=2.0), product of:
              0.08029421 = queryWeight, product of:
                1.5987829 = boost
                3.94228 = idf(docFreq=2342, maxDocs=44421)
                0.01273935 = queryNorm
              0.2613387 = fieldWeight in 980, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.94228 = idf(docFreq=2342, maxDocs=44421)
                0.046875 = fieldNorm(doc=980)
          0.02758893 = weight(abstract_txt:were in 980) [ClassicSimilarity], result of:
            0.02758893 = score(doc=980,freq=3.0), product of:
              0.0926541 = queryWeight, product of:
                1.9831203 = boost
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.01273935 = queryNorm
              0.29776263 = fieldWeight in 980, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.046875 = fieldNorm(doc=980)
          0.072519735 = weight(abstract_txt:spelling in 980) [ClassicSimilarity], result of:
            0.072519735 = score(doc=980,freq=1.0), product of:
              0.20201144 = queryWeight, product of:
                2.070569 = boost
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.01273935 = queryNorm
              0.35898826 = fieldWeight in 980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.046875 = fieldNorm(doc=980)
          0.06799729 = weight(abstract_txt:errors in 980) [ClassicSimilarity], result of:
            0.06799729 = score(doc=980,freq=1.0), product of:
              0.2215287 = queryWeight, product of:
                2.6555982 = boost
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.01273935 = queryNorm
              0.30694574 = fieldWeight in 980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.046875 = fieldNorm(doc=980)
          0.17774369 = weight(abstract_txt:queries in 980) [ClassicSimilarity], result of:
            0.17774369 = score(doc=980,freq=11.0), product of:
              0.22410439 = queryWeight, product of:
                3.4482355 = boost
                5.1015973 = idf(docFreq=734, maxDocs=44421)
                0.01273935 = queryNorm
              0.79312897 = fieldWeight in 980, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                5.1015973 = idf(docFreq=734, maxDocs=44421)
                0.046875 = fieldNorm(doc=980)
        0.24 = coord(6/25)
    
  5. Dewdney, P.; Michell, G.: Oranges and peaches : understanding communication accidents in the reference interview (1996) 0.09
    0.08704963 = sum of:
      0.08704963 = product of:
        0.54406023 = sum of:
          0.10484252 = weight(abstract_txt:caused in 6491) [ClassicSimilarity], result of:
            0.10484252 = score(doc=6491,freq=1.0), product of:
              0.16270846 = queryWeight, product of:
                1.8582615 = boost
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.01273935 = queryNorm
              0.64435816 = fieldWeight in 6491, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.09375 = fieldNorm(doc=6491)
          0.031856954 = weight(abstract_txt:were in 6491) [ClassicSimilarity], result of:
            0.031856954 = score(doc=6491,freq=1.0), product of:
              0.0926541 = queryWeight, product of:
                1.9831203 = boost
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.01273935 = queryNorm
              0.3438267 = fieldWeight in 6491, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.09375 = fieldNorm(doc=6491)
          0.10718347 = weight(abstract_txt:queries in 6491) [ClassicSimilarity], result of:
            0.10718347 = score(doc=6491,freq=1.0), product of:
              0.22410439 = queryWeight, product of:
                3.4482355 = boost
                5.1015973 = idf(docFreq=734, maxDocs=44421)
                0.01273935 = queryNorm
              0.47827476 = fieldWeight in 6491, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1015973 = idf(docFreq=734, maxDocs=44421)
                0.09375 = fieldNorm(doc=6491)
          0.30017728 = weight(abstract_txt:formed in 6491) [ClassicSimilarity], result of:
            0.30017728 = score(doc=6491,freq=1.0), product of:
              0.47316304 = queryWeight, product of:
                5.488678 = boost
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.01273935 = queryNorm
              0.6344056 = fieldWeight in 6491, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.09375 = fieldNorm(doc=6491)
        0.16 = coord(4/25)
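
For the content-based matches, the per-term scores are summed and then scaled by the coord factor (matched terms divided by the 25 query terms). Below is a minimal sketch for the last entry (doc 6491), assuming the same tf and idf definitions as ClassicSimilarity; the constant and variable names are illustrative, and the boost, queryNorm and fieldNorm values are copied from the listing.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.01273935
FIELD_NORM = 0.09375   # fieldNorm(doc=6491)
QUERY_TERMS = 25       # denominator of coord(4/25)

# (term, boost, docFreq, freq) copied from the explain tree for doc 6491.
matched = [
    ("caused", 1.8582615, 124, 1.0),
    ("were", 1.9831203, 3083, 1.0),
    ("queries", 3.4482355, 734, 1.0),
    ("formed", 5.488678, 138, 1.0),
]

def term_score(boost, doc_freq, freq):
    # queryWeight * fieldWeight for a single matched term.
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight

raw_sum = sum(term_score(b, df, f) for _, b, df, f in matched)
coord = len(matched) / QUERY_TERMS
final_score = raw_sum * coord
print(f"{raw_sum:.4f} * {coord:.2f} = {final_score:.4f}")
```

The coord factor explains why entry 4 (6 of 25 terms matched) outranks entry 5 even though entry 5 has the larger raw sum.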