Document (#11288)

Author
Savic, D.
Title
Automatic classification of office documents : review of available methods and techniques
Source
Records management quarterly. 29(1995) no.4, pp.3-18
Year
1995
Abstract
Classification of office documents is one of the administrative functions carried out by almost every organization and institution that sends and receives correspondence. Processing this increasing amount of information, both incoming and outgoing mail, and in particular its classification, is time-consuming and expensive. More and more organizations are seeking a solution for meeting this challenge by designing computer-based systems for automatic classification. Examines the present status of available knowledge and methodology that can be used for automatic classification of office documents. Besides a review of classic methods and techniques, the focus is also placed on the application of artificial intelligence.
Theme
Document management
Automatic classification

Similar documents (content)

  1. Savic, D.: Designing an expert system for classifying office documents (1994) 0.17
    0.17353521 = sum of:
      0.17353521 = product of:
        0.867676 = sum of:
          0.093465924 = weight(abstract_txt:artificial in 2654) [ClassicSimilarity], result of:
            0.093465924 = score(doc=2654,freq=1.0), product of:
              0.12375525 = queryWeight, product of:
                6.0419855 = idf(docFreq=286, maxDocs=44421)
                0.020482546 = queryNorm
              0.7552482 = fieldWeight in 2654, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0419855 = idf(docFreq=286, maxDocs=44421)
                0.125 = fieldNorm(doc=2654)
          0.08912043 = weight(abstract_txt:documents in 2654) [ClassicSimilarity], result of:
            0.08912043 = score(doc=2654,freq=1.0), product of:
              0.17290996 = queryWeight, product of:
                2.0473347 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.020482546 = queryNorm
              0.51541525 = fieldWeight in 2654, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.125 = fieldNorm(doc=2654)
          0.17830501 = weight(abstract_txt:automatic in 2654) [ClassicSimilarity], result of:
            0.17830501 = score(doc=2654,freq=1.0), product of:
              0.27454332 = queryWeight, product of:
                2.5797894 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.020482546 = queryNorm
              0.64946043 = fieldWeight in 2654, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.125 = fieldNorm(doc=2654)
          0.37197912 = weight(abstract_txt:office in 2654) [ClassicSimilarity], result of:
            0.37197912 = score(doc=2654,freq=1.0), product of:
              0.4482437 = queryWeight, product of:
                3.2963698 = boost
                6.6388726 = idf(docFreq=157, maxDocs=44421)
                0.020482546 = queryNorm
              0.8298591 = fieldWeight in 2654, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6388726 = idf(docFreq=157, maxDocs=44421)
                0.125 = fieldNorm(doc=2654)
          0.13480558 = weight(abstract_txt:classification in 2654) [ClassicSimilarity], result of:
            0.13480558 = score(doc=2654,freq=1.0), product of:
              0.27014068 = queryWeight, product of:
                3.3036816 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.020482546 = queryNorm
              0.49901992 = fieldWeight in 2654, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.125 = fieldNorm(doc=2654)
        0.2 = coord(5/25)
    
  2. Pong, J.Y.-H.; Kwok, R.C.-W.; Lau, R.Y.-K.; Hao, J.-X.; Wong, P.C.-C.: A comparative study of two automatic document classification methods in a library setting (2008) 0.15
    0.1473961 = sum of:
      0.1473961 = product of:
        0.52641463 = sum of:
          0.016599912 = weight(abstract_txt:more in 3532) [ClassicSimilarity], result of:
            0.016599912 = score(doc=3532,freq=1.0), product of:
              0.07820393 = queryWeight, product of:
                1.1242101 = boost
                3.3962307 = idf(docFreq=4044, maxDocs=44421)
                0.020482546 = queryNorm
              0.21226442 = fieldWeight in 3532, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3962307 = idf(docFreq=4044, maxDocs=44421)
                0.0625 = fieldNorm(doc=3532)
          0.030144474 = weight(abstract_txt:methods in 3532) [ClassicSimilarity], result of:
            0.030144474 = score(doc=3532,freq=1.0), product of:
              0.11640274 = queryWeight, product of:
                1.37156 = boost
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.020482546 = queryNorm
              0.25896704 = fieldWeight in 3532, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.0625 = fieldNorm(doc=3532)
          0.033797782 = weight(abstract_txt:available in 3532) [ClassicSimilarity], result of:
            0.033797782 = score(doc=3532,freq=1.0), product of:
              0.12562716 = queryWeight, product of:
                1.4248691 = boost
                4.304519 = idf(docFreq=1630, maxDocs=44421)
                0.020482546 = queryNorm
              0.26903245 = fieldWeight in 3532, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.304519 = idf(docFreq=1630, maxDocs=44421)
                0.0625 = fieldNorm(doc=3532)
          0.039447337 = weight(abstract_txt:techniques in 3532) [ClassicSimilarity], result of:
            0.039447337 = score(doc=3532,freq=1.0), product of:
              0.13926326 = queryWeight, product of:
                1.5002079 = boost
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.020482546 = queryNorm
              0.28325734 = fieldWeight in 3532, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.0625 = fieldNorm(doc=3532)
          0.06301766 = weight(abstract_txt:documents in 3532) [ClassicSimilarity], result of:
            0.06301766 = score(doc=3532,freq=2.0), product of:
              0.17290996 = queryWeight, product of:
                2.0473347 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.020482546 = queryNorm
              0.3644536 = fieldWeight in 3532, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=3532)
          0.17830501 = weight(abstract_txt:automatic in 3532) [ClassicSimilarity], result of:
            0.17830501 = score(doc=3532,freq=4.0), product of:
              0.27454332 = queryWeight, product of:
                2.5797894 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.020482546 = queryNorm
              0.64946043 = fieldWeight in 3532, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.0625 = fieldNorm(doc=3532)
          0.16510245 = weight(abstract_txt:classification in 3532) [ClassicSimilarity], result of:
            0.16510245 = score(doc=3532,freq=6.0), product of:
              0.27014068 = queryWeight, product of:
                3.3036816 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.020482546 = queryNorm
              0.61117214 = fieldWeight in 3532, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=3532)
        0.28 = coord(7/25)
    
  3. Malak, P.: Is the Artificial Intelligence applicable for the libraries purposes? (2005) 0.12
    0.12373866 = sum of:
      0.12373866 = product of:
        0.38668332 = sum of:
          0.035049725 = weight(abstract_txt:artificial in 4006) [ClassicSimilarity], result of:
            0.035049725 = score(doc=4006,freq=1.0), product of:
              0.12375525 = queryWeight, product of:
                6.0419855 = idf(docFreq=286, maxDocs=44421)
                0.020482546 = queryNorm
              0.2832181 = fieldWeight in 4006, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0419855 = idf(docFreq=286, maxDocs=44421)
                0.046875 = fieldNorm(doc=4006)
          0.061644107 = weight(abstract_txt:meeting in 4006) [ClassicSimilarity], result of:
            0.061644107 = score(doc=4006,freq=2.0), product of:
              0.14311713 = queryWeight, product of:
                1.0753851 = boost
                6.497461 = idf(docFreq=181, maxDocs=44421)
                0.020482546 = queryNorm
              0.43072486 = fieldWeight in 4006, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.497461 = idf(docFreq=181, maxDocs=44421)
                0.046875 = fieldNorm(doc=4006)
          0.012449934 = weight(abstract_txt:more in 4006) [ClassicSimilarity], result of:
            0.012449934 = score(doc=4006,freq=1.0), product of:
              0.07820393 = queryWeight, product of:
                1.1242101 = boost
                3.3962307 = idf(docFreq=4044, maxDocs=44421)
                0.020482546 = queryNorm
              0.15919831 = fieldWeight in 4006, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3962307 = idf(docFreq=4044, maxDocs=44421)
                0.046875 = fieldNorm(doc=4006)
          0.03197304 = weight(abstract_txt:methods in 4006) [ClassicSimilarity], result of:
            0.03197304 = score(doc=4006,freq=2.0), product of:
              0.11640274 = queryWeight, product of:
                1.37156 = boost
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.020482546 = queryNorm
              0.27467602 = fieldWeight in 4006, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.046875 = fieldNorm(doc=4006)
          0.025348336 = weight(abstract_txt:available in 4006) [ClassicSimilarity], result of:
            0.025348336 = score(doc=4006,freq=1.0), product of:
              0.12562716 = queryWeight, product of:
                1.4248691 = boost
                4.304519 = idf(docFreq=1630, maxDocs=44421)
                0.020482546 = queryNorm
              0.20177433 = fieldWeight in 4006, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.304519 = idf(docFreq=1630, maxDocs=44421)
                0.046875 = fieldNorm(doc=4006)
          0.081862345 = weight(abstract_txt:documents in 4006) [ClassicSimilarity], result of:
            0.081862345 = score(doc=4006,freq=6.0), product of:
              0.17290996 = queryWeight, product of:
                2.0473347 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.020482546 = queryNorm
              0.47343916 = fieldWeight in 4006, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.046875 = fieldNorm(doc=4006)
          0.066864386 = weight(abstract_txt:automatic in 4006) [ClassicSimilarity], result of:
            0.066864386 = score(doc=4006,freq=1.0), product of:
              0.27454332 = queryWeight, product of:
                2.5797894 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.020482546 = queryNorm
              0.24354766 = fieldWeight in 4006, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.046875 = fieldNorm(doc=4006)
          0.07149146 = weight(abstract_txt:classification in 4006) [ClassicSimilarity], result of:
            0.07149146 = score(doc=4006,freq=2.0), product of:
              0.27014068 = queryWeight, product of:
                3.3036816 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.020482546 = queryNorm
              0.26464528 = fieldWeight in 4006, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.046875 = fieldNorm(doc=4006)
        0.32 = coord(8/25)
    
  4. Desale, S.K.; Kumbhar, R.: Research on automatic classification of documents in library environment : a literature review (2013) 0.11
    0.10522871 = sum of:
      0.10522871 = product of:
        0.65767944 = sum of:
          0.10387045 = weight(abstract_txt:review in 2071) [ClassicSimilarity], result of:
            0.10387045 = score(doc=2071,freq=3.0), product of:
              0.15867396 = queryWeight, product of:
                1.6013491 = boost
                4.837664 = idf(docFreq=956, maxDocs=44421)
                0.020482546 = queryNorm
              0.65461564 = fieldWeight in 2071, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.837664 = idf(docFreq=956, maxDocs=44421)
                0.078125 = fieldNorm(doc=2071)
          0.12454959 = weight(abstract_txt:documents in 2071) [ClassicSimilarity], result of:
            0.12454959 = score(doc=2071,freq=5.0), product of:
              0.17290996 = queryWeight, product of:
                2.0473347 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.020482546 = queryNorm
              0.72031474 = fieldWeight in 2071, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.078125 = fieldNorm(doc=2071)
          0.22288127 = weight(abstract_txt:automatic in 2071) [ClassicSimilarity], result of:
            0.22288127 = score(doc=2071,freq=4.0), product of:
              0.27454332 = queryWeight, product of:
                2.5797894 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.020482546 = queryNorm
              0.8118255 = fieldWeight in 2071, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.078125 = fieldNorm(doc=2071)
          0.20637807 = weight(abstract_txt:classification in 2071) [ClassicSimilarity], result of:
            0.20637807 = score(doc=2071,freq=6.0), product of:
              0.27014068 = queryWeight, product of:
                3.3036816 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.020482546 = queryNorm
              0.7639652 = fieldWeight in 2071, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=2071)
        0.16 = coord(4/25)
    
  5. Hendley, T.: Document image processing : going beyond the black-and-white barrier (1995) 0.10
    0.10344346 = sum of:
      0.10344346 = product of:
        0.5172173 = sum of:
          0.02074989 = weight(abstract_txt:more in 2502) [ClassicSimilarity], result of:
            0.02074989 = score(doc=2502,freq=1.0), product of:
              0.07820393 = queryWeight, product of:
                1.1242101 = boost
                3.3962307 = idf(docFreq=4044, maxDocs=44421)
                0.020482546 = queryNorm
              0.26533052 = fieldWeight in 2502, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3962307 = idf(docFreq=4044, maxDocs=44421)
                0.078125 = fieldNorm(doc=2502)
          0.04224723 = weight(abstract_txt:available in 2502) [ClassicSimilarity], result of:
            0.04224723 = score(doc=2502,freq=1.0), product of:
              0.12562716 = queryWeight, product of:
                1.4248691 = boost
                4.304519 = idf(docFreq=1630, maxDocs=44421)
                0.020482546 = queryNorm
              0.33629057 = fieldWeight in 2502, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.304519 = idf(docFreq=1630, maxDocs=44421)
                0.078125 = fieldNorm(doc=2502)
          0.0697337 = weight(abstract_txt:techniques in 2502) [ClassicSimilarity], result of:
            0.0697337 = score(doc=2502,freq=2.0), product of:
              0.13926326 = queryWeight, product of:
                1.5002079 = boost
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.020482546 = queryNorm
              0.50073296 = fieldWeight in 2502, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.078125 = fieldNorm(doc=2502)
          0.05570027 = weight(abstract_txt:documents in 2502) [ClassicSimilarity], result of:
            0.05570027 = score(doc=2502,freq=1.0), product of:
              0.17290996 = queryWeight, product of:
                2.0473347 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.020482546 = queryNorm
              0.32213452 = fieldWeight in 2502, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.078125 = fieldNorm(doc=2502)
          0.32878616 = weight(abstract_txt:office in 2502) [ClassicSimilarity], result of:
            0.32878616 = score(doc=2502,freq=2.0), product of:
              0.4482437 = queryWeight, product of:
                3.2963698 = boost
                6.6388726 = idf(docFreq=157, maxDocs=44421)
                0.020482546 = queryNorm
              0.7334987 = fieldWeight in 2502, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6388726 = idf(docFreq=157, maxDocs=44421)
                0.078125 = fieldNorm(doc=2502)
        0.2 = coord(5/25)
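
Every score tree above follows the same arithmetic, so a short sketch may help when reading them. The Python below is an illustration only, not the search engine's own code: the function names (`idf`, `term_score`) and the way the inputs are laid out are assumptions made for this example. The formulas are the ones the numbers in the listing are consistent with: idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final multiplication of the summed term scores by coord (matching terms / query terms). The inputs are copied from the first hit (doc 2654).

```python
import math

MAX_DOCS = 44421          # maxDocs, as printed in the listing
QUERY_NORM = 0.020482546  # queryNorm, as printed in the listing


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency in the form that matches the listed values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_score(freq, doc_freq, field_norm, boost=1.0):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (term, freq, docFreq, boost) for doc 2654; fieldNorm is 0.125 for every term.
# "artificial" shows no boost line in the listing, so a boost of 1.0 is assumed.
terms_2654 = [
    ("artificial", 1.0, 286, 1.0),
    ("documents", 1.0, 1954, 2.0473347),
    ("automatic", 1.0, 668, 2.5797894),
    ("office", 1.0, 157, 3.2963698),
    ("classification", 1.0, 2228, 3.3036816),
]

term_sum = sum(term_score(f, df, 0.125, b) for _, f, df, b in terms_2654)
coord = 5 / 25            # 5 of the 25 query terms matched
score = term_sum * coord  # close to the 0.17353521 reported for doc 2654

print(f"sum of term scores: {term_sum:.6f}")
print(f"final score:        {score:.6f}")
```

The same function covers the other hits by swapping in their freq, docFreq, boost, and fieldNorm values. Small differences in the last digits are expected, because the listing appears to use single-precision arithmetic.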