Document (#11257)

Author
Cheng, P.T.K.
Wu, A.K.W.
Title
ACS: an automatic classification system
Source
Journal of information science. 21(1995) no.4, pp.289-299
Year
1995
Abstract
In this paper, we introduce ACS, an automatic classification system for school libraries. First, various approaches towards automatic classification, namely (i) rule-based, (ii) browse and search, and (iii) partial match, are critically reviewed. The central issues of scheme selection, text analysis and similarity measures are discussed. A novel approach towards detecting book-class similarity with Modified Overlap Coefficient (MOC) is also proposed. Finally, the design and implementation of ACS are presented. The test result of over 80% correctness in automatic classification and a cost reduction of 75% compared to manual classification suggest that ACS is highly adoptable.
Theme
Automatic classification
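
Note on the similarity measure named in the abstract: the record does not define the Modified Overlap Coefficient, so the sketch below shows only the standard (unmodified) overlap coefficient between two term sets, |A ∩ B| / min(|A|, |B|), as background. The function name and the sample book and class terms are hypothetical and are not taken from the paper.

```python
def overlap_coefficient(a, b):
    """Standard overlap coefficient between two term collections.

    This is the textbook measure, NOT the paper's Modified Overlap
    Coefficient, whose definition is not given in this record.
    """
    a, b = set(a), set(b)
    if not a or not b:
        # Convention chosen here: an empty set has no overlap with anything.
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Hypothetical terms extracted from a book and from a class description.
book_terms = {"algebra", "equations", "geometry", "school"}
class_terms = {"algebra", "geometry", "trigonometry"}

score = overlap_coefficient(book_terms, class_terms)
print(f"overlap = {score:.3f}")
```

Dividing by the smaller set means a short class description that is fully contained in a book's terms scores 1.0, which is one plausible reason to start from this measure when matching books to classes.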

Similar documents (author)

  1. Cheng, L.R.L.: Beyond bilingualism : a quest for communicative competence (1996) 5.21
    5.2088575 = sum of:
      5.2088575 = weight(author_txt:cheng in 5291) [ClassicSimilarity], result of:
        5.2088575 = fieldWeight in 5291, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.334172 = idf(docFreq=28, maxDocs=44421)
          0.625 = fieldNorm(doc=5291)
    
  2. Cheng, K.-H.: Automatic identification for topics of electronic documents (1997) 4.17
    4.167086 = sum of:
      4.167086 = weight(author_txt:cheng in 2811) [ClassicSimilarity], result of:
        4.167086 = fieldWeight in 2811, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.334172 = idf(docFreq=28, maxDocs=44421)
          0.5 = fieldNorm(doc=2811)
    
  3. Cheng, L.-y.: On bibliographic(al) control (1998) 4.17
    4.167086 = sum of:
      4.167086 = weight(author_txt:cheng in 4376) [ClassicSimilarity], result of:
        4.167086 = fieldWeight in 4376, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.334172 = idf(docFreq=28, maxDocs=44421)
          0.5 = fieldNorm(doc=4376)
    
  4. Harter, S.P.; Cheng, Y.-R.: Colinked descriptors : improving vocabulary selection for end-user searching (1996) 3.65
    3.6462004 = sum of:
      3.6462004 = weight(author_txt:cheng in 4284) [ClassicSimilarity], result of:
        3.6462004 = fieldWeight in 4284, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.334172 = idf(docFreq=28, maxDocs=44421)
          0.4375 = fieldNorm(doc=4284)
    
  5. Cheng, W.-N.; Khoo, C.S.G.: Information and argument structures in Sociology research abstracts (2018) 3.65
    3.6462004 = sum of:
      3.6462004 = weight(author_txt:cheng in 750) [ClassicSimilarity], result of:
        3.6462004 = fieldWeight in 750, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.334172 = idf(docFreq=28, maxDocs=44421)
          0.4375 = fieldNorm(doc=750)
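
The author scores above can be re-derived by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the numbers in this listing; other versions of the scoring library may use a slightly different idf. The function names are illustrative, and fieldNorm is taken as given from the listing rather than recomputed.

```python
import math


def classic_tf(freq):
    # Term-frequency factor: square root of the raw frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq, max_docs):
    # Inverse document frequency in the form that matches this listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in the explanation tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from entry 1 (doc 5291): freq=1.0, docFreq=28,
# maxDocs=44421, fieldNorm=0.625.
idf_cheng = classic_idf(28, 44421)
weight_5291 = field_weight(1.0, 28, 44421, 0.625)

print(f"idf         = {idf_cheng:.6f}")
print(f"fieldWeight = {weight_5291:.6f}")
```

Since tf and idf are identical for all five entries, the ranking among them is decided by fieldNorm alone (0.625, 0.5, 0.4375), which reflects the length of each document's author field.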
    

Similar documents (content)

  1. Gowtham, M.S.; Kamat, S.K.: An expert system as a tool to classification (1995) 0.18
    0.1795833 = sum of:
      0.1795833 = product of:
        0.6413689 = sum of:
          0.09957369 = weight(abstract_txt:scheme in 3803) [ClassicSimilarity], result of:
            0.09957369 = score(doc=3803,freq=4.0), product of:
              0.11614506 = queryWeight, product of:
                5.4868593 = idf(docFreq=499, maxDocs=44421)
                0.021167858 = queryNorm
              0.85732174 = fieldWeight in 3803, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.4868593 = idf(docFreq=499, maxDocs=44421)
                0.078125 = fieldNorm(doc=3803)
          0.08737839 = weight(abstract_txt:class in 3803) [ClassicSimilarity], result of:
            0.08737839 = score(doc=3803,freq=2.0), product of:
              0.13412726 = queryWeight, product of:
                1.074628 = boost
                5.8963327 = idf(docFreq=331, maxDocs=44421)
                0.021167858 = queryNorm
              0.65145886 = fieldWeight in 3803, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8963327 = idf(docFreq=331, maxDocs=44421)
                0.078125 = fieldNorm(doc=3803)
          0.063657016 = weight(abstract_txt:manual in 3803) [ClassicSimilarity], result of:
            0.063657016 = score(doc=3803,freq=1.0), product of:
              0.13682176 = queryWeight, product of:
                1.0853685 = boost
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.021167858 = queryNorm
              0.46525505 = fieldWeight in 3803, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.078125 = fieldNorm(doc=3803)
          0.08744391 = weight(abstract_txt:rule in 3803) [ClassicSimilarity], result of:
            0.08744391 = score(doc=3803,freq=1.0), product of:
              0.16907422 = queryWeight, product of:
                1.2065306 = boost
                6.6200633 = idf(docFreq=160, maxDocs=44421)
                0.021167858 = queryNorm
              0.5171924 = fieldWeight in 3803, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6200633 = idf(docFreq=160, maxDocs=44421)
                0.078125 = fieldNorm(doc=3803)
          0.04005861 = weight(abstract_txt:system in 3803) [ClassicSimilarity], result of:
            0.04005861 = score(doc=3803,freq=3.0), product of:
              0.08777237 = queryWeight, product of:
                1.229401 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.021167858 = queryNorm
              0.45639202 = fieldWeight in 3803, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.078125 = fieldNorm(doc=3803)
          0.097184926 = weight(abstract_txt:modified in 3803) [ClassicSimilarity], result of:
            0.097184926 = score(doc=3803,freq=1.0), product of:
              0.18140821 = queryWeight, product of:
                1.2497643 = boost
                6.8572807 = idf(docFreq=126, maxDocs=44421)
                0.021167858 = queryNorm
              0.53572506 = fieldWeight in 3803, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8572807 = idf(docFreq=126, maxDocs=44421)
                0.078125 = fieldNorm(doc=3803)
          0.16607235 = weight(abstract_txt:classification in 3803) [ClassicSimilarity], result of:
            0.16607235 = score(doc=3803,freq=3.0), product of:
              0.30742475 = queryWeight, product of:
                3.6379275 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.021167858 = queryNorm
              0.5402049 = fieldWeight in 3803, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=3803)
        0.28 = coord(7/25)
    
  2. Pong, J.Y.-H.; Kwok, R.C.-W.; Lau, R.Y.-K.; Hao, J.-X.; Wong, P.C.-C.: A comparative study of two automatic document classification methods in a library setting (2008) 0.12
    0.12252863 = sum of:
      0.12252863 = product of:
        0.6126431 = sum of:
          0.039829474 = weight(abstract_txt:scheme in 3532) [ClassicSimilarity], result of:
            0.039829474 = score(doc=3532,freq=1.0), product of:
              0.11614506 = queryWeight, product of:
                5.4868593 = idf(docFreq=499, maxDocs=44421)
                0.021167858 = queryNorm
              0.3429287 = fieldWeight in 3532, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4868593 = idf(docFreq=499, maxDocs=44421)
                0.0625 = fieldNorm(doc=3532)
          0.08820574 = weight(abstract_txt:manual in 3532) [ClassicSimilarity], result of:
            0.08820574 = score(doc=3532,freq=3.0), product of:
              0.13682176 = queryWeight, product of:
                1.0853685 = boost
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.021167858 = queryNorm
              0.64467627 = fieldWeight in 3532, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.0625 = fieldNorm(doc=3532)
          0.026166173 = weight(abstract_txt:system in 3532) [ClassicSimilarity], result of:
            0.026166173 = score(doc=3532,freq=2.0), product of:
              0.08777237 = queryWeight, product of:
                1.229401 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.021167858 = queryNorm
              0.298114 = fieldWeight in 3532, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.0625 = fieldNorm(doc=3532)
          0.18788944 = weight(abstract_txt:classification in 3532) [ClassicSimilarity], result of:
            0.18788944 = score(doc=3532,freq=6.0), product of:
              0.30742475 = queryWeight, product of:
                3.6379275 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.021167858 = queryNorm
              0.61117214 = fieldWeight in 3532, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=3532)
          0.27055225 = weight(abstract_txt:automatic in 3532) [ClassicSimilarity], result of:
            0.27055225 = score(doc=3532,freq=4.0), product of:
              0.41658002 = queryWeight, product of:
                3.7877285 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.021167858 = queryNorm
              0.64946043 = fieldWeight in 3532, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.0625 = fieldNorm(doc=3532)
        0.2 = coord(5/25)
    
  3. Larson, R.R.: Experiments in automatic Library of Congress Classification (1992) 0.11
    0.108861186 = sum of:
      0.108861186 = product of:
        0.68038243 = sum of:
          0.11010838 = weight(abstract_txt:match in 1053) [ClassicSimilarity], result of:
            0.11010838 = score(doc=1053,freq=2.0), product of:
              0.15648088 = queryWeight, product of:
                1.1607275 = boost
                6.3687487 = idf(docFreq=206, maxDocs=44421)
                0.021167858 = queryNorm
              0.70365393 = fieldWeight in 1053, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.3687487 = idf(docFreq=206, maxDocs=44421)
                0.078125 = fieldNorm(doc=1053)
          0.13937353 = weight(abstract_txt:partial in 1053) [ClassicSimilarity], result of:
            0.13937353 = score(doc=1053,freq=2.0), product of:
              0.18310541 = queryWeight, product of:
                1.2555969 = boost
                6.889283 = idf(docFreq=122, maxDocs=44421)
                0.021167858 = queryNorm
              0.7611655 = fieldWeight in 1053, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.889283 = idf(docFreq=122, maxDocs=44421)
                0.078125 = fieldNorm(doc=1053)
          0.19176385 = weight(abstract_txt:classification in 1053) [ClassicSimilarity], result of:
            0.19176385 = score(doc=1053,freq=4.0), product of:
              0.30742475 = queryWeight, product of:
                3.6379275 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.021167858 = queryNorm
              0.6237749 = fieldWeight in 1053, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=1053)
          0.23913665 = weight(abstract_txt:automatic in 1053) [ClassicSimilarity], result of:
            0.23913665 = score(doc=1053,freq=2.0), product of:
              0.41658002 = queryWeight, product of:
                3.7877285 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.021167858 = queryNorm
              0.5740473 = fieldWeight in 1053, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.078125 = fieldNorm(doc=1053)
        0.16 = coord(4/25)
    
  4. Adamson, G.W.; Boreham, J.: The use of an association measure based on character structure to identify semantically related pairs of words and document titles (1974) 0.10
    0.10081233 = sum of:
      0.10081233 = product of:
        0.63007706 = sum of:
          0.16963965 = weight(abstract_txt:coefficient in 1398) [ClassicSimilarity], result of:
            0.16963965 = score(doc=1398,freq=1.0), product of:
              0.23289227 = queryWeight, product of:
                1.4160454 = boost
                7.769642 = idf(docFreq=50, maxDocs=44421)
                0.021167858 = queryNorm
              0.7284039 = fieldWeight in 1398, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.769642 = idf(docFreq=50, maxDocs=44421)
                0.09375 = fieldNorm(doc=1398)
          0.14246494 = weight(abstract_txt:similarity in 1398) [ClassicSimilarity], result of:
            0.14246494 = score(doc=1398,freq=1.0), product of:
              0.26118737 = queryWeight, product of:
                2.1207561 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.021167858 = queryNorm
              0.5454511 = fieldWeight in 1398, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.09375 = fieldNorm(doc=1398)
          0.11505831 = weight(abstract_txt:classification in 1398) [ClassicSimilarity], result of:
            0.11505831 = score(doc=1398,freq=1.0), product of:
              0.30742475 = queryWeight, product of:
                3.6379275 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.021167858 = queryNorm
              0.37426496 = fieldWeight in 1398, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.09375 = fieldNorm(doc=1398)
          0.20291418 = weight(abstract_txt:automatic in 1398) [ClassicSimilarity], result of:
            0.20291418 = score(doc=1398,freq=1.0), product of:
              0.41658002 = queryWeight, product of:
                3.7877285 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.021167858 = queryNorm
              0.48709533 = fieldWeight in 1398, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.09375 = fieldNorm(doc=1398)
        0.16 = coord(4/25)
    
  5. Wang, S.; Koopman, R.: Embed first, then predict (2019) 0.10
    0.09961786 = sum of:
      0.09961786 = product of:
        0.4980893 = sum of:
          0.05073443 = weight(abstract_txt:novel in 400) [ClassicSimilarity], result of:
            0.05073443 = score(doc=400,freq=1.0), product of:
              0.11761414 = queryWeight, product of:
                1.0063045 = boost
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.021167858 = queryNorm
              0.43136334 = fieldWeight in 400, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.078125 = fieldNorm(doc=400)
          0.063657016 = weight(abstract_txt:manual in 400) [ClassicSimilarity], result of:
            0.063657016 = score(doc=400,freq=1.0), product of:
              0.13682176 = queryWeight, product of:
                1.0853685 = boost
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.021167858 = queryNorm
              0.46525505 = fieldWeight in 400, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.078125 = fieldNorm(doc=400)
          0.11872079 = weight(abstract_txt:similarity in 400) [ClassicSimilarity], result of:
            0.11872079 = score(doc=400,freq=1.0), product of:
              0.26118737 = queryWeight, product of:
                2.1207561 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.021167858 = queryNorm
              0.4545426 = fieldWeight in 400, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.078125 = fieldNorm(doc=400)
          0.095881924 = weight(abstract_txt:classification in 400) [ClassicSimilarity], result of:
            0.095881924 = score(doc=400,freq=1.0), product of:
              0.30742475 = queryWeight, product of:
                3.6379275 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.021167858 = queryNorm
              0.31188744 = fieldWeight in 400, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=400)
          0.16909514 = weight(abstract_txt:automatic in 400) [ClassicSimilarity], result of:
            0.16909514 = score(doc=400,freq=1.0), product of:
              0.41658002 = queryWeight, product of:
                3.7877285 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.021167858 = queryNorm
              0.40591276 = fieldWeight in 400, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.078125 = fieldNorm(doc=400)
        0.2 = coord(5/25)
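
The content scores combine one more layer than the author scores: each matching term contributes queryWeight * fieldWeight, and the sum is scaled by coord (matched query terms / total query terms). The sketch below re-derives the score of entry 1 in the content list (doc 3803) from the values printed in its explanation. It assumes the same tf and idf formulas that reproduce this listing, treats a term with no printed boost as boost 1.0, and uses illustrative function and variable names.

```python
import math

QUERY_NORM = 0.021167858  # queryNorm as printed in the listing
MAX_DOCS = 44421          # maxDocs as printed in the listing
FIELD_NORM = 0.078125     # fieldNorm(doc=3803)


def idf(doc_freq, max_docs=MAX_DOCS):
    # idf form that matches the numbers in this listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_score(freq, doc_freq, field_norm, boost=1.0):
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    # fieldWeight = sqrt(freq) * idf * fieldNorm
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight


# (term, freq, docFreq, boost) copied from the doc 3803 explanation.
# "scheme" shows no boost line, so 1.0 is assumed.
terms = [
    ("scheme", 4.0, 499, 1.0),
    ("class", 2.0, 331, 1.074628),
    ("manual", 1.0, 312, 1.0853685),
    ("rule", 1.0, 160, 1.2065306),
    ("system", 3.0, 4140, 1.229401),
    ("modified", 1.0, 126, 1.2497643),
    ("classification", 3.0, 2228, 3.6379275),
]

total = sum(term_score(f, df, FIELD_NORM, b) for _, f, df, b in terms)
coord = 7 / 25  # 7 of the 25 query terms matched this document
score = coord * total

print(f"sum over terms = {total:.7f}")
print(f"final score    = {score:.7f}")
```

The result should come out close to the 0.1795833 shown for entry 1; small differences in the last digits are expected because the listing was computed in single precision.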