Document (#44187)

Author
Zakaria, M.S.
Title
Measuring typographical errors in online catalogs of academic libraries using Ballard's list : a case study from Egypt
Source
Cataloging and classification quarterly. 61(2023) no.7-8, pp.848-870
Year
2023
Abstract
Typographical errors in the bibliographic records of online library catalogs are a common and troublesome phenomenon worldwide. They can affect the retrieval and identification of items in information retrieval systems and thus prevent users from finding the documents they need. The present study was conducted to measure typographical errors in the online catalog of the Egyptian Universities Libraries Consortium (EULC). The investigation relied on Terry Ballard's list of typographical error terms. The EULC catalog was searched to identify matching erroneous records. The study found that the total number of erroneous records reached 1686, and that the mean error rate for each record was 11.24, which is very high. A total of 396 erroneous records (23.49%) were retrieved from Section C of Ballard's list (Moderate Probability). Typographical errors found within the abstracts of the study's sample records represented 35.82%. Omissions were the most common type of error at 54.51%, followed by transpositions at 17.08%. Regarding the analysis of parts of speech, the study found that 63.46% of errors occurred in noun terms. The results of the study indicated that typographical errors still pose a serious challenge for information retrieval systems, especially for library systems in the Arab environment. The study proposes some solutions to help Egyptian university libraries avoid typographical mistakes in the future.
Content
Cf.: https://www.tandfonline.com/doi/full/10.1080/01639374.2023.2282579.
Theme
Descriptive cataloging
Location
Egypt
Aid
Ballard's list

Similar documents (content)

  1. Beall, J.; Kafadar, K.: ¬The effectiveness of copy cataloging at eliminating typographical errors in shared bibliographic records (2004) 0.78
    0.7795655 = sum of:
      0.7795655 = product of:
        1.9489138 = sum of:
          0.01645673 = weight(abstract_txt:retrieval in 5849) [ClassicSimilarity], result of:
            0.01645673 = score(doc=5849,freq=1.0), product of:
              0.050492868 = queryWeight, product of:
                1.2895949 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.011262491 = queryNorm
              0.3259219 = fieldWeight in 5849, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.09375 = fieldNorm(doc=5849)
          0.019274632 = weight(abstract_txt:online in 5849) [ClassicSimilarity], result of:
            0.019274632 = score(doc=5849,freq=1.0), product of:
              0.056103725 = queryWeight, product of:
                1.3593589 = boost
                3.6645708 = idf(docFreq=3092, maxDocs=44421)
                0.011262491 = queryNorm
              0.3435535 = fieldWeight in 5849, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6645708 = idf(docFreq=3092, maxDocs=44421)
                0.09375 = fieldNorm(doc=5849)
          0.030027032 = weight(abstract_txt:libraries in 5849) [ClassicSimilarity], result of:
            0.030027032 = score(doc=5849,freq=2.0), product of:
              0.059841063 = queryWeight, product of:
                1.4039056 = boost
                3.78466 = idf(docFreq=2742, maxDocs=44421)
                0.011262491 = queryNorm
              0.50177974 = fieldWeight in 5849, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.78466 = idf(docFreq=2742, maxDocs=44421)
                0.09375 = fieldNorm(doc=5849)
          0.08230524 = weight(abstract_txt:catalogs in 5849) [ClassicSimilarity], result of:
            0.08230524 = score(doc=5849,freq=2.0), product of:
              0.10238717 = queryWeight, product of:
                1.4993927 = boost
                6.0631127 = idf(docFreq=280, maxDocs=44421)
                0.011262491 = queryNorm
              0.8038628 = fieldWeight in 5849, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.0631127 = idf(docFreq=280, maxDocs=44421)
                0.09375 = fieldNorm(doc=5849)
          0.035019826 = weight(abstract_txt:found in 5849) [ClassicSimilarity], result of:
            0.035019826 = score(doc=5849,freq=1.0), product of:
              0.08353663 = queryWeight, product of:
                1.6587341 = boost
                4.4716287 = idf(docFreq=1379, maxDocs=44421)
                0.011262491 = queryNorm
              0.4192152 = fieldWeight in 5849, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4716287 = idf(docFreq=1379, maxDocs=44421)
                0.09375 = fieldNorm(doc=5849)
          0.08419403 = weight(abstract_txt:error in 5849) [ClassicSimilarity], result of:
            0.08419403 = score(doc=5849,freq=1.0), product of:
              0.13096586 = queryWeight, product of:
                1.6957885 = boost
                6.8572807 = idf(docFreq=126, maxDocs=44421)
                0.011262491 = queryNorm
              0.64287007 = fieldWeight in 5849, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8572807 = idf(docFreq=126, maxDocs=44421)
                0.09375 = fieldNorm(doc=5849)
          0.044127513 = weight(abstract_txt:study in 5849) [ClassicSimilarity], result of:
            0.044127513 = score(doc=5849,freq=2.0), product of:
              0.09745571 = queryWeight, product of:
                2.53371 = boost
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.011262491 = queryNorm
              0.45279557 = fieldWeight in 5849, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.09375 = fieldNorm(doc=5849)
          0.0979121 = weight(abstract_txt:records in 5849) [ClassicSimilarity], result of:
            0.0979121 = score(doc=5849,freq=3.0), product of:
              0.13629118 = queryWeight, product of:
                2.735247 = boost
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.011262491 = queryNorm
              0.7184038 = fieldWeight in 5849, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.09375 = fieldNorm(doc=5849)
          0.5737723 = weight(abstract_txt:errors in 5849) [ClassicSimilarity], result of:
            0.5737723 = score(doc=5849,freq=5.0), product of:
              0.4179872 = queryWeight, product of:
                5.667717 = boost
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.011262491 = queryNorm
              1.3727031 = fieldWeight in 5849, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.09375 = fieldNorm(doc=5849)
          0.9658244 = weight(abstract_txt:typographical in 5849) [ClassicSimilarity], result of:
            0.9658244 = score(doc=5849,freq=3.0), product of:
              0.6661459 = queryWeight, product of:
                6.624269 = boost
                8.928879 = idf(docFreq=15, maxDocs=44421)
                0.011262491 = queryNorm
              1.4498692 = fieldWeight in 5849, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.928879 = idf(docFreq=15, maxDocs=44421)
                0.09375 = fieldNorm(doc=5849)
        0.4 = coord(10/25)
    
  2. Beall, J.; Kafadar, K.: Measuring typographical errors' impact on retrieval in bibliographic databases (2007) 0.51
    0.5056784 = sum of:
      0.5056784 = product of:
        1.4046621 = sum of:
          0.006857709 = weight(abstract_txt:from in 386) [ClassicSimilarity], result of:
            0.006857709 = score(doc=386,freq=1.0), product of:
              0.031810794 = queryWeight, product of:
                1.0235888 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.011262491 = queryNorm
              0.21557805 = fieldWeight in 386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.078125 = fieldNorm(doc=386)
          0.013713942 = weight(abstract_txt:retrieval in 386) [ClassicSimilarity], result of:
            0.013713942 = score(doc=386,freq=1.0), product of:
              0.050492868 = queryWeight, product of:
                1.2895949 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.011262491 = queryNorm
              0.27160156 = fieldWeight in 386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.078125 = fieldNorm(doc=386)
          0.016062193 = weight(abstract_txt:online in 386) [ClassicSimilarity], result of:
            0.016062193 = score(doc=386,freq=1.0), product of:
              0.056103725 = queryWeight, product of:
                1.3593589 = boost
                3.6645708 = idf(docFreq=3092, maxDocs=44421)
                0.011262491 = queryNorm
              0.28629458 = fieldWeight in 386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6645708 = idf(docFreq=3092, maxDocs=44421)
                0.078125 = fieldNorm(doc=386)
          0.048498824 = weight(abstract_txt:catalogs in 386) [ClassicSimilarity], result of:
            0.048498824 = score(doc=386,freq=1.0), product of:
              0.10238717 = queryWeight, product of:
                1.4993927 = boost
                6.0631127 = idf(docFreq=280, maxDocs=44421)
                0.011262491 = queryNorm
              0.47368068 = fieldWeight in 386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0631127 = idf(docFreq=280, maxDocs=44421)
                0.078125 = fieldNorm(doc=386)
          0.07016169 = weight(abstract_txt:error in 386) [ClassicSimilarity], result of:
            0.07016169 = score(doc=386,freq=1.0), product of:
              0.13096586 = queryWeight, product of:
                1.6957885 = boost
                6.8572807 = idf(docFreq=126, maxDocs=44421)
                0.011262491 = queryNorm
              0.53572506 = fieldWeight in 386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8572807 = idf(docFreq=126, maxDocs=44421)
                0.078125 = fieldNorm(doc=386)
          0.036772925 = weight(abstract_txt:study in 386) [ClassicSimilarity], result of:
            0.036772925 = score(doc=386,freq=2.0), product of:
              0.09745571 = queryWeight, product of:
                2.53371 = boost
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.011262491 = queryNorm
              0.37732962 = fieldWeight in 386, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.078125 = fieldNorm(doc=386)
          0.105336644 = weight(abstract_txt:records in 386) [ClassicSimilarity], result of:
            0.105336644 = score(doc=386,freq=5.0), product of:
              0.13629118 = queryWeight, product of:
                2.735247 = boost
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.011262491 = queryNorm
              0.7728794 = fieldWeight in 386, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.078125 = fieldNorm(doc=386)
          0.30240458 = weight(abstract_txt:errors in 386) [ClassicSimilarity], result of:
            0.30240458 = score(doc=386,freq=2.0), product of:
              0.4179872 = queryWeight, product of:
                5.667717 = boost
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.011262491 = queryNorm
              0.7234781 = fieldWeight in 386, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.078125 = fieldNorm(doc=386)
          0.8048537 = weight(abstract_txt:typographical in 386) [ClassicSimilarity], result of:
            0.8048537 = score(doc=386,freq=3.0), product of:
              0.6661459 = queryWeight, product of:
                6.624269 = boost
                8.928879 = idf(docFreq=15, maxDocs=44421)
                0.011262491 = queryNorm
              1.2082243 = fieldWeight in 386, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.928879 = idf(docFreq=15, maxDocs=44421)
                0.078125 = fieldNorm(doc=386)
        0.36 = coord(9/25)
    
  3. Ojala, M.: Troubleshooting your search : whatever can go wrong, will go wrong (1995) 0.27
    0.2677219 = sum of:
      0.2677219 = product of:
        1.6732619 = sum of:
          0.02569951 = weight(abstract_txt:online in 4145) [ClassicSimilarity], result of:
            0.02569951 = score(doc=4145,freq=1.0), product of:
              0.056103725 = queryWeight, product of:
                1.3593589 = boost
                3.6645708 = idf(docFreq=3092, maxDocs=44421)
                0.011262491 = queryNorm
              0.45807135 = fieldWeight in 4145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6645708 = idf(docFreq=3092, maxDocs=44421)
                0.125 = fieldNorm(doc=4145)
          0.11225871 = weight(abstract_txt:error in 4145) [ClassicSimilarity], result of:
            0.11225871 = score(doc=4145,freq=1.0), product of:
              0.13096586 = queryWeight, product of:
                1.6957885 = boost
                6.8572807 = idf(docFreq=126, maxDocs=44421)
                0.011262491 = queryNorm
              0.8571601 = fieldWeight in 4145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8572807 = idf(docFreq=126, maxDocs=44421)
                0.125 = fieldNorm(doc=4145)
          0.4838473 = weight(abstract_txt:errors in 4145) [ClassicSimilarity], result of:
            0.4838473 = score(doc=4145,freq=2.0), product of:
              0.4179872 = queryWeight, product of:
                5.667717 = boost
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.011262491 = queryNorm
              1.1575649 = fieldWeight in 4145, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.125 = fieldNorm(doc=4145)
          1.0514565 = weight(abstract_txt:typographical in 4145) [ClassicSimilarity], result of:
            1.0514565 = score(doc=4145,freq=2.0), product of:
              0.6661459 = queryWeight, product of:
                6.624269 = boost
                8.928879 = idf(docFreq=15, maxDocs=44421)
                0.011262491 = queryNorm
              1.5784177 = fieldWeight in 4145, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.928879 = idf(docFreq=15, maxDocs=44421)
                0.125 = fieldNorm(doc=4145)
        0.16 = coord(4/25)
    
  4. Zeng, L.: Quality control of Chinese-language records using a rule-based data validation system : Part 1: an evaluation of the quality of Chinese-language records in the OCLC OLUC database (1993) 0.22
    0.22310773 = sum of:
      0.22310773 = product of:
        1.1155386 = sum of:
          0.006857709 = weight(abstract_txt:from in 705) [ClassicSimilarity], result of:
            0.006857709 = score(doc=705,freq=1.0), product of:
              0.031810794 = queryWeight, product of:
                1.0235888 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.011262491 = queryNorm
              0.21557805 = fieldWeight in 705, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.078125 = fieldNorm(doc=705)
          0.026002387 = weight(abstract_txt:study in 705) [ClassicSimilarity], result of:
            0.026002387 = score(doc=705,freq=1.0), product of:
              0.09745571 = queryWeight, product of:
                2.53371 = boost
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.011262491 = queryNorm
              0.26681235 = fieldWeight in 705, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.078125 = fieldNorm(doc=705)
          0.09421597 = weight(abstract_txt:records in 705) [ClassicSimilarity], result of:
            0.09421597 = score(doc=705,freq=4.0), product of:
              0.13629118 = queryWeight, product of:
                2.735247 = boost
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.011262491 = queryNorm
              0.6912844 = fieldWeight in 705, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.078125 = fieldNorm(doc=705)
          0.52378005 = weight(abstract_txt:errors in 705) [ClassicSimilarity], result of:
            0.52378005 = score(doc=705,freq=6.0), product of:
              0.4179872 = queryWeight, product of:
                5.667717 = boost
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.011262491 = queryNorm
              1.2531008 = fieldWeight in 705, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.078125 = fieldNorm(doc=705)
          0.46468252 = weight(abstract_txt:typographical in 705) [ClassicSimilarity], result of:
            0.46468252 = score(doc=705,freq=1.0), product of:
              0.6661459 = queryWeight, product of:
                6.624269 = boost
                8.928879 = idf(docFreq=15, maxDocs=44421)
                0.011262491 = queryNorm
              0.69756866 = fieldWeight in 705, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.928879 = idf(docFreq=15, maxDocs=44421)
                0.078125 = fieldNorm(doc=705)
        0.2 = coord(5/25)
    
  5. Shin, H.-s.: Quality of Korean cataloging records in shared databases (2003) 0.20
    0.20395324 = sum of:
      0.20395324 = product of:
        0.7284044 = sum of:
          0.014387488 = weight(abstract_txt:terms in 498) [ClassicSimilarity], result of:
            0.014387488 = score(doc=498,freq=1.0), product of:
              0.045542274 = queryWeight, product of:
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.011262491 = queryNorm
              0.31591502 = fieldWeight in 498, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.078125 = fieldNorm(doc=498)
          0.013713942 = weight(abstract_txt:retrieval in 498) [ClassicSimilarity], result of:
            0.013713942 = score(doc=498,freq=1.0), product of:
              0.050492868 = queryWeight, product of:
                1.2895949 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.011262491 = queryNorm
              0.27160156 = fieldWeight in 498, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.078125 = fieldNorm(doc=498)
          0.029183187 = weight(abstract_txt:found in 498) [ClassicSimilarity], result of:
            0.029183187 = score(doc=498,freq=1.0), product of:
              0.08353663 = queryWeight, product of:
                1.6587341 = boost
                4.4716287 = idf(docFreq=1379, maxDocs=44421)
                0.011262491 = queryNorm
              0.34934598 = fieldWeight in 498, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4716287 = idf(docFreq=1379, maxDocs=44421)
                0.078125 = fieldNorm(doc=498)
          0.14032339 = weight(abstract_txt:error in 498) [ClassicSimilarity], result of:
            0.14032339 = score(doc=498,freq=4.0), product of:
              0.13096586 = queryWeight, product of:
                1.6957885 = boost
                6.8572807 = idf(docFreq=126, maxDocs=44421)
                0.011262491 = queryNorm
              1.0714501 = fieldWeight in 498, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.8572807 = idf(docFreq=126, maxDocs=44421)
                0.078125 = fieldNorm(doc=498)
          0.045037456 = weight(abstract_txt:study in 498) [ClassicSimilarity], result of:
            0.045037456 = score(doc=498,freq=3.0), product of:
              0.09745571 = queryWeight, product of:
                2.53371 = boost
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.011262491 = queryNorm
              0.46213254 = fieldWeight in 498, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.078125 = fieldNorm(doc=498)
          0.11539052 = weight(abstract_txt:records in 498) [ClassicSimilarity], result of:
            0.11539052 = score(doc=498,freq=6.0), product of:
              0.13629118 = queryWeight, product of:
                2.735247 = boost
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.011262491 = queryNorm
              0.846647 = fieldWeight in 498, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.42422 = idf(docFreq=1446, maxDocs=44421)
                0.078125 = fieldNorm(doc=498)
          0.37036845 = weight(abstract_txt:errors in 498) [ClassicSimilarity], result of:
            0.37036845 = score(doc=498,freq=3.0), product of:
              0.4179872 = queryWeight, product of:
                5.667717 = boost
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.011262491 = queryNorm
              0.88607603 = fieldWeight in 498, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.078125 = fieldNorm(doc=498)
        0.28 = coord(7/25)
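
The score trees above all follow the same arithmetic, which can be reproduced by hand. The sketch below is an illustration, not the catalog's actual implementation. It assumes the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), which agree with the tf and idf values printed in the listing. The function names and the TERMS_5849 table are hypothetical helpers introduced here; the numeric inputs are copied from the explain output for the first entry (doc 5849).

```python
import math

def idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, boost, max_docs, query_norm, field_norm):
    # One "weight(abstract_txt:term in doc)" node of the explain tree
    tf = math.sqrt(freq)                          # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm  # "queryWeight, product of"
    field_weight = tf * term_idf * field_norm     # "fieldWeight, product of"
    return query_weight * field_weight

# Constants copied from the explain output above
MAX_DOCS = 44421
QUERY_NORM = 0.011262491
FIELD_NORM_5849 = 0.09375
QUERY_TERMS = 25  # denominator of coord(10/25)

# (term, freq, docFreq, boost) for doc 5849, copied from the first entry
TERMS_5849 = [
    ("retrieval", 1, 3732, 1.2895949),
    ("online", 1, 3092, 1.3593589),
    ("libraries", 2, 2742, 1.4039056),
    ("catalogs", 2, 280, 1.4993927),
    ("found", 1, 1379, 1.6587341),
    ("error", 1, 126, 1.6957885),
    ("study", 2, 3968, 2.53371),
    ("records", 3, 1446, 2.735247),
    ("errors", 5, 172, 5.667717),
    ("typographical", 3, 15, 6.624269),
]

def document_score(terms, field_norm):
    # Sum the per-term scores, then scale by coord = matched terms / query terms
    total = sum(
        term_score(freq, doc_freq, boost, MAX_DOCS, QUERY_NORM, field_norm)
        for _, freq, doc_freq, boost in terms
    )
    coord = len(terms) / QUERY_TERMS
    return coord * total

score_5849 = document_score(TERMS_5849, FIELD_NORM_5849)
print(f"{score_5849:.4f}")
```

Under these assumptions the result should land very close to the 0.7795655 reported for the first entry; small differences in the last digits are expected because Lucene computes with 32-bit floats. The same functions apply to the other entries once their freq, docFreq, boost and fieldNorm values are substituted. A term listed without a boost line, such as "terms" in entry 5, would take a boost of 1.0.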