Document (#34561)

Author
Yi, K.
Title
Automatic text classification using library classification schemes : trends, issues and challenges
Source
International cataloguing and bibliographic control. 36(2007) no.4, p.78-82
Year
2007
Abstract
The proliferation of digital resources and their integration into a traditional library setting have created a pressing need for an automated tool that organizes textual information based on library classification schemes. Automated text classification is the research field that develops tools, methods, and models for classifying text automatically. This article describes the currently popular approach to text classification and the major text classification projects and applications that are based on library classification schemes. Related issues and challenges are discussed, and several considerations for addressing those challenges are examined.
Theme
Automatic classification

Similar documents (content)

  1. Yi, K.: Challenges in automated classification using library classification schemes (2006) 0.43
    0.4347733 = sum of:
      0.4347733 = product of:
        1.3586665 = sum of:
          0.055852138 = weight(abstract_txt:tool in 810) [ClassicSimilarity], result of:
            0.055852138 = score(doc=810,freq=1.0), product of:
              0.09028072 = queryWeight, product of:
                1.0449309 = boost
                4.9491973 = idf(docFreq=855, maxDocs=44421)
                0.017457122 = queryNorm
              0.61864966 = fieldWeight in 810, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9491973 = idf(docFreq=855, maxDocs=44421)
                0.125 = fieldNorm(doc=810)
          0.06838044 = weight(abstract_txt:projects in 810) [ClassicSimilarity], result of:
            0.06838044 = score(doc=810,freq=1.0), product of:
              0.10332127 = queryWeight, product of:
                1.1178536 = boost
                5.2945876 = idf(docFreq=605, maxDocs=44421)
                0.017457122 = queryNorm
              0.66182345 = fieldWeight in 810, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2945876 = idf(docFreq=605, maxDocs=44421)
                0.125 = fieldNorm(doc=810)
          0.08871314 = weight(abstract_txt:popular in 810) [ClassicSimilarity], result of:
            0.08871314 = score(doc=810,freq=1.0), product of:
              0.122902416 = queryWeight, product of:
                1.2191869 = boost
                5.7745414 = idf(docFreq=374, maxDocs=44421)
                0.017457122 = queryNorm
              0.7218177 = fieldWeight in 810, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7745414 = idf(docFreq=374, maxDocs=44421)
                0.125 = fieldNorm(doc=810)
          0.1035077 = weight(abstract_txt:library in 810) [ClassicSimilarity], result of:
            0.1035077 = score(doc=810,freq=3.0), product of:
              0.14992124 = queryWeight, product of:
                2.6930947 = boost
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.017457122 = queryNorm
              0.69041383 = fieldWeight in 810, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.125 = fieldNorm(doc=810)
          0.18696891 = weight(abstract_txt:challenges in 810) [ClassicSimilarity], result of:
            0.18696891 = score(doc=810,freq=1.0), product of:
              0.29137692 = queryWeight, product of:
                3.251459 = boost
                5.1333895 = idf(docFreq=711, maxDocs=44421)
                0.017457122 = queryNorm
              0.6416737 = fieldWeight in 810, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1333895 = idf(docFreq=711, maxDocs=44421)
                0.125 = fieldNorm(doc=810)
          0.23424089 = weight(abstract_txt:schemes in 810) [ClassicSimilarity], result of:
            0.23424089 = score(doc=810,freq=1.0), product of:
              0.3386237 = queryWeight, product of:
                3.5051725 = boost
                5.5339513 = idf(docFreq=476, maxDocs=44421)
                0.017457122 = queryNorm
              0.6917439 = fieldWeight in 810, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5339513 = idf(docFreq=476, maxDocs=44421)
                0.125 = fieldNorm(doc=810)
          0.15199664 = weight(abstract_txt:text in 810) [ClassicSimilarity], result of:
            0.15199664 = score(doc=810,freq=1.0), product of:
              0.30091774 = queryWeight, product of:
                4.265785 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.017457122 = queryNorm
              0.50511026 = fieldWeight in 810, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.125 = fieldNorm(doc=810)
          0.4690067 = weight(abstract_txt:classification in 810) [ClassicSimilarity], result of:
            0.4690067 = score(doc=810,freq=4.0), product of:
              0.46992782 = queryWeight, product of:
                6.742961 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017457122 = queryNorm
              0.99803984 = fieldWeight in 810, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.125 = fieldNorm(doc=810)
        0.32 = coord(8/25)
    
  2. Kumbhar, R.: Library classification trends in the 21st century (2012) 0.24
    0.23589389 = sum of:
      0.23589389 = product of:
        0.84247816 = sum of:
          0.030595466 = weight(abstract_txt:applications in 1736) [ClassicSimilarity], result of:
            0.030595466 = score(doc=1736,freq=1.0), product of:
              0.08268369 = queryWeight, product of:
                4.7363873 = idf(docFreq=1058, maxDocs=44421)
                0.017457122 = queryNorm
              0.37003025 = fieldWeight in 1736, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7363873 = idf(docFreq=1058, maxDocs=44421)
                0.078125 = fieldNorm(doc=1736)
          0.034907587 = weight(abstract_txt:tool in 1736) [ClassicSimilarity], result of:
            0.034907587 = score(doc=1736,freq=1.0), product of:
              0.09028072 = queryWeight, product of:
                1.0449309 = boost
                4.9491973 = idf(docFreq=855, maxDocs=44421)
                0.017457122 = queryNorm
              0.38665605 = fieldWeight in 1736, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9491973 = idf(docFreq=855, maxDocs=44421)
                0.078125 = fieldNorm(doc=1736)
          0.04038718 = weight(abstract_txt:automatic in 1736) [ClassicSimilarity], result of:
            0.04038718 = score(doc=1736,freq=1.0), product of:
              0.09949719 = queryWeight, product of:
                1.0969719 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.017457122 = queryNorm
              0.40591276 = fieldWeight in 1736, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.078125 = fieldNorm(doc=1736)
          0.050357968 = weight(abstract_txt:trends in 1736) [ClassicSimilarity], result of:
            0.050357968 = score(doc=1736,freq=1.0), product of:
              0.11526406 = queryWeight, product of:
                1.1806931 = boost
                5.59222 = idf(docFreq=449, maxDocs=44421)
                0.017457122 = queryNorm
              0.43689218 = fieldWeight in 1736, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.59222 = idf(docFreq=449, maxDocs=44421)
                0.078125 = fieldNorm(doc=1736)
          0.08351741 = weight(abstract_txt:library in 1736) [ClassicSimilarity], result of:
            0.08351741 = score(doc=1736,freq=5.0), product of:
              0.14992124 = queryWeight, product of:
                2.6930947 = boost
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.017457122 = queryNorm
              0.55707526 = fieldWeight in 1736, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.078125 = fieldNorm(doc=1736)
          0.0949979 = weight(abstract_txt:text in 1736) [ClassicSimilarity], result of:
            0.0949979 = score(doc=1736,freq=1.0), product of:
              0.30091774 = queryWeight, product of:
                4.265785 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.017457122 = queryNorm
              0.3156939 = fieldWeight in 1736, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=1736)
          0.5077146 = weight(abstract_txt:classification in 1736) [ClassicSimilarity], result of:
            0.5077146 = score(doc=1736,freq=12.0), product of:
              0.46992782 = queryWeight, product of:
                6.742961 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017457122 = queryNorm
              1.0804098 = fieldWeight in 1736, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=1736)
        0.28 = coord(7/25)
    
  3. Wang, J.: An extensive study on automated Dewey Decimal Classification (2009) 0.23
    0.23242906 = sum of:
      0.23242906 = product of:
        0.72634083 = sum of:
          0.024476374 = weight(abstract_txt:applications in 159) [ClassicSimilarity], result of:
            0.024476374 = score(doc=159,freq=1.0), product of:
              0.08268369 = queryWeight, product of:
                4.7363873 = idf(docFreq=1058, maxDocs=44421)
                0.017457122 = queryNorm
              0.2960242 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7363873 = idf(docFreq=1058, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.032309745 = weight(abstract_txt:automatic in 159) [ClassicSimilarity], result of:
            0.032309745 = score(doc=159,freq=1.0), product of:
              0.09949719 = queryWeight, product of:
                1.0969719 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.017457122 = queryNorm
              0.32473022 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.033504024 = weight(abstract_txt:created in 159) [ClassicSimilarity], result of:
            0.033504024 = score(doc=159,freq=1.0), product of:
              0.10193417 = queryWeight, product of:
                1.1103246 = boost
                5.2589273 = idf(docFreq=627, maxDocs=44421)
                0.017457122 = queryNorm
              0.32868296 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2589273 = idf(docFreq=627, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.0809593 = weight(abstract_txt:automated in 159) [ClassicSimilarity], result of:
            0.0809593 = score(doc=159,freq=1.0), product of:
              0.23126484 = queryWeight, product of:
                2.3651564 = boost
                5.6011486 = idf(docFreq=445, maxDocs=44421)
                0.017457122 = queryNorm
              0.3500718 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6011486 = idf(docFreq=445, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.05175385 = weight(abstract_txt:library in 159) [ClassicSimilarity], result of:
            0.05175385 = score(doc=159,freq=3.0), product of:
              0.14992124 = queryWeight, product of:
                2.6930947 = boost
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.017457122 = queryNorm
              0.34520692 = fieldWeight in 159, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.117120445 = weight(abstract_txt:schemes in 159) [ClassicSimilarity], result of:
            0.117120445 = score(doc=159,freq=1.0), product of:
              0.3386237 = queryWeight, product of:
                3.5051725 = boost
                5.5339513 = idf(docFreq=476, maxDocs=44421)
                0.017457122 = queryNorm
              0.34587196 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5339513 = idf(docFreq=476, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.07599832 = weight(abstract_txt:text in 159) [ClassicSimilarity], result of:
            0.07599832 = score(doc=159,freq=1.0), product of:
              0.30091774 = queryWeight, product of:
                4.265785 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.017457122 = queryNorm
              0.25255513 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
          0.31021875 = weight(abstract_txt:classification in 159) [ClassicSimilarity], result of:
            0.31021875 = score(doc=159,freq=7.0), product of:
              0.46992782 = queryWeight, product of:
                6.742961 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017457122 = queryNorm
              0.6601413 = fieldWeight in 159, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=159)
        0.32 = coord(8/25)
    
  4. Batley, S.: Classification in theory and practice (2005) 0.22
    0.22187911 = sum of:
      0.22187911 = product of:
        0.7924254 = sum of:
          0.03485607 = weight(abstract_txt:examined in 2170) [ClassicSimilarity], result of:
            0.03485607 = score(doc=2170,freq=3.0), product of:
              0.099269 = queryWeight, product of:
                1.0957133 = boost
                5.189722 = idf(docFreq=672, maxDocs=44421)
                0.017457122 = queryNorm
              0.35112742 = fieldWeight in 2170, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.189722 = idf(docFreq=672, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2170)
          0.027722856 = weight(abstract_txt:popular in 2170) [ClassicSimilarity], result of:
            0.027722856 = score(doc=2170,freq=1.0), product of:
              0.122902416 = queryWeight, product of:
                1.2191869 = boost
                5.7745414 = idf(docFreq=374, maxDocs=44421)
                0.017457122 = queryNorm
              0.22556803 = fieldWeight in 2170, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7745414 = idf(docFreq=374, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2170)
          0.02288798 = weight(abstract_txt:issues in 2170) [ClassicSimilarity], result of:
            0.02288798 = score(doc=2170,freq=1.0), product of:
              0.13627519 = queryWeight, product of:
                1.8155719 = boost
                4.299626 = idf(docFreq=1638, maxDocs=44421)
                0.017457122 = queryNorm
              0.16795413 = fieldWeight in 2170, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.299626 = idf(docFreq=1638, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2170)
          0.041758705 = weight(abstract_txt:library in 2170) [ClassicSimilarity], result of:
            0.041758705 = score(doc=2170,freq=5.0), product of:
              0.14992124 = queryWeight, product of:
                2.6930947 = boost
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.017457122 = queryNorm
              0.27853763 = fieldWeight in 2170, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2170)
          0.23147961 = weight(abstract_txt:schemes in 2170) [ClassicSimilarity], result of:
            0.23147961 = score(doc=2170,freq=10.0), product of:
              0.3386237 = queryWeight, product of:
                3.5051725 = boost
                5.5339513 = idf(docFreq=476, maxDocs=44421)
                0.017457122 = queryNorm
              0.6835895 = fieldWeight in 2170, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                5.5339513 = idf(docFreq=476, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2170)
          0.08227059 = weight(abstract_txt:text in 2170) [ClassicSimilarity], result of:
            0.08227059 = score(doc=2170,freq=3.0), product of:
              0.30091774 = queryWeight, product of:
                4.265785 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.017457122 = queryNorm
              0.27339894 = fieldWeight in 2170, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2170)
          0.35144955 = weight(abstract_txt:classification in 2170) [ClassicSimilarity], result of:
            0.35144955 = score(doc=2170,freq=23.0), product of:
              0.46992782 = queryWeight, product of:
                6.742961 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017457122 = queryNorm
              0.74787986 = fieldWeight in 2170, product of:
                4.7958317 = tf(freq=23.0), with freq of:
                  23.0 = termFreq=23.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2170)
        0.28 = coord(7/25)
    
  5. Hurt, C.D.: Classification and subject analysis : looking to the future at a distance (1997) 0.20
    0.2020686 = sum of:
      0.2020686 = product of:
        1.010343 = sum of:
          0.14919648 = weight(abstract_txt:proliferation in 6998) [ClassicSimilarity], result of:
            0.14919648 = score(doc=6998,freq=1.0), product of:
              0.18999207 = queryWeight, product of:
                1.5158556 = boost
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.017457122 = queryNorm
              0.78527737 = fieldWeight in 6998, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.109375 = fieldNorm(doc=6998)
          0.052290175 = weight(abstract_txt:library in 6998) [ClassicSimilarity], result of:
            0.052290175 = score(doc=6998,freq=1.0), product of:
              0.14992124 = queryWeight, product of:
                2.6930947 = boost
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.017457122 = queryNorm
              0.3487843 = fieldWeight in 6998, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.109375 = fieldNorm(doc=6998)
          0.16359779 = weight(abstract_txt:challenges in 6998) [ClassicSimilarity], result of:
            0.16359779 = score(doc=6998,freq=1.0), product of:
              0.29137692 = queryWeight, product of:
                3.251459 = boost
                5.1333895 = idf(docFreq=711, maxDocs=44421)
                0.017457122 = queryNorm
              0.5614645 = fieldWeight in 6998, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1333895 = idf(docFreq=711, maxDocs=44421)
                0.109375 = fieldNorm(doc=6998)
          0.2898583 = weight(abstract_txt:schemes in 6998) [ClassicSimilarity], result of:
            0.2898583 = score(doc=6998,freq=2.0), product of:
              0.3386237 = queryWeight, product of:
                3.5051725 = boost
                5.5339513 = idf(docFreq=476, maxDocs=44421)
                0.017457122 = queryNorm
              0.85598946 = fieldWeight in 6998, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5339513 = idf(docFreq=476, maxDocs=44421)
                0.109375 = fieldNorm(doc=6998)
          0.35540023 = weight(abstract_txt:classification in 6998) [ClassicSimilarity], result of:
            0.35540023 = score(doc=6998,freq=3.0), product of:
              0.46992782 = queryWeight, product of:
                6.742961 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017457122 = queryNorm
              0.75628686 = fieldWeight in 6998, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.109375 = fieldNorm(doc=6998)
        0.2 = coord(5/25)
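
The score trees above all follow the same arithmetic, which can be sketched in a few lines of Python. This is a minimal sketch, not the search engine's own code: the function and variable names are hypothetical, and the idf formula `1 + ln(maxDocs / (docFreq + 1))` is an assumption that happens to reproduce the idf values printed in the trees. The sketch recomputes the fifth hit (doc 6998) from its raw inputs and then compares the ordering of all five hits with and without the coord factor.

```python
import math

MAX_DOCS = 44421          # maxDocs, as printed in every idf line
QUERY_NORM = 0.017457122  # queryNorm, identical for all terms of this query


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed formula; it matches the idf values shown in the trees.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm):
    # score = queryWeight * fieldWeight, as in each "product of" node.
    tf = math.sqrt(freq)                          # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM  # boost is 1.0 when no boost line is shown
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


# Inputs for hit 5 (doc 6998): term -> (freq, docFreq, boost)
hit5_terms = {
    "proliferation": (1.0, 91, 1.5158556),
    "library": (1.0, 4976, 2.6930947),
    "challenges": (1.0, 711, 3.251459),
    "schemes": (2.0, 476, 3.5051725),
    "classification": (3.0, 2228, 6.742961),
}
FIELD_NORM_6998 = 0.109375

raw_sum = sum(
    term_score(freq, doc_freq, boost, FIELD_NORM_6998)
    for freq, doc_freq, boost in hit5_terms.values()
)
final = raw_sum * (5 / 25)  # coord(5/25): 5 of the 25 query terms matched

# Raw sums and matched-term counts for all five hits, copied from the trees.
hits = {
    1: (1.3586665, 8),
    2: (0.84247816, 7),
    3: (0.72634083, 8),
    4: (0.7924254, 7),
    5: (1.010343, 5),
}
by_raw = sorted(hits, key=lambda h: hits[h][0], reverse=True)
by_final = sorted(hits, key=lambda h: hits[h][0] * hits[h][1] / 25, reverse=True)

print(f"hit 5 recomputed: {final:.7f} (listed: 0.2020686)")
print("order by raw sum:    ", by_raw)
print("order by final score:", by_final)
```

Comparing the two orderings shows the effect of coord: hit 5 has the second-largest raw sum but matches only 5 of the 25 query terms, so it ends up last in the displayed ranking.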