Document (#30274)

Author
Yoon, Y.
Lee, C.
Lee, G.G.
Title
An effective procedure for constructing a hierarchical text classification system
Source
Journal of the American Society for Information Science and Technology. 57(2006) no.3, pp. 431-442
Year
2006
Abstract
In text categorization tasks, classification on some class hierarchies yields better results than classification without a hierarchy. Because large numbers of documents are now commonly divided into several subgroups within a hierarchy, a hierarchical classification method is an appropriate choice. However, there is no systematic method for building a hierarchical classification system that performs well on large collections of practical data. In this article, we introduce a new evaluation scheme for internal node classifiers, which can be used effectively to develop a hierarchical classification system. We also show that our method for constructing the hierarchical classification system is very effective, especially when constructing classifiers for a hierarchy tree with many levels.
Theme
Automatic classification

Similar documents (author)

  1. Yoon, L.L.: The performance of cited references as an approach to information retrieval (1994) 5.58
    5.5805492 = sum of:
      5.5805492 = weight(author_txt:yoon in 8218) [ClassicSimilarity], result of:
        5.5805492 = fieldWeight in 8218, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.928879 = idf(docFreq=15, maxDocs=44421)
          0.625 = fieldNorm(doc=8218)
    
  2. Yoon, J.W.: Utilizing quantitative users' reactions to represent affective meanings of an image (2010) 5.58
    5.5805492 = sum of:
      5.5805492 = weight(author_txt:yoon in 571) [ClassicSimilarity], result of:
        5.5805492 = fieldWeight in 571, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.928879 = idf(docFreq=15, maxDocs=44421)
          0.625 = fieldNorm(doc=571)
    
  3. Yoon, J.W.: Towards a user-oriented thesaurus for non-domain-specific image collections (2009) 5.58
    5.5805492 = sum of:
      5.5805492 = weight(author_txt:yoon in 221) [ClassicSimilarity], result of:
        5.5805492 = fieldWeight in 221, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.928879 = idf(docFreq=15, maxDocs=44421)
          0.625 = fieldNorm(doc=221)
    
  4. Yoon, K.: Conceptual syntagmatic associations in user tagging (2012) 5.58
    5.5805492 = sum of:
      5.5805492 = weight(author_txt:yoon in 1240) [ClassicSimilarity], result of:
        5.5805492 = fieldWeight in 1240, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.928879 = idf(docFreq=15, maxDocs=44421)
          0.625 = fieldNorm(doc=1240)
    
  5. Yoon, A.: Data reusers' trust development (2017) 5.58
    5.5805492 = sum of:
      5.5805492 = weight(author_txt:yoon in 4532) [ClassicSimilarity], result of:
        5.5805492 = fieldWeight in 4532, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.928879 = idf(docFreq=15, maxDocs=44421)
          0.625 = fieldNorm(doc=4532)
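
The author scores above all follow the same arithmetic, so a short sketch may help in reading them. It assumes the standard Lucene ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are illustrative and not part of any Lucene API. Lucene computes in 32-bit floats, so the last digits can differ slightly from the listing.

```python
import math

def classic_tf(freq: float) -> float:
    # ClassicSimilarity term frequency: square root of the raw frequency.
    return math.sqrt(freq)

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values copied from the listing: author_txt:yoon, freq=1, docFreq=15,
# maxDocs=44421, fieldNorm=0.625.
author_idf = classic_idf(15, 44421)
author_score = field_weight(1.0, 15, 44421, 0.625)
print(author_idf, author_score)  # close to 8.928879 and 5.5805492
```

Because every author match has freq=1 and the same fieldNorm, all five entries receive the same score.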
    

Similar documents (content)

  1. Gauch, S.; Chandramouli, A.; Ranganathan, S.: Training a hierarchical classifier using inter document relationships (2009) 0.28
    0.27993953 = sum of:
      0.27993953 = product of:
        1.1664147 = sum of:
          0.0395844 = weight(abstract_txt:text in 3697) [ClassicSimilarity], result of:
            0.0395844 = score(doc=3697,freq=2.0), product of:
              0.08866309 = queryWeight, product of:
                1.3857216 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.015834002 = queryNorm
              0.4464586 = fieldWeight in 3697, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=3697)
          0.052593756 = weight(abstract_txt:large in 3697) [ClassicSimilarity], result of:
            0.052593756 = score(doc=3697,freq=2.0), product of:
              0.10715595 = queryWeight, product of:
                1.5233955 = boost
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.015834002 = queryNorm
              0.4908151 = fieldWeight in 3697, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.078125 = fieldNorm(doc=3697)
          0.31334358 = weight(abstract_txt:classifiers in 3697) [ClassicSimilarity], result of:
            0.31334358 = score(doc=3697,freq=3.0), product of:
              0.30764055 = queryWeight, product of:
                2.5812278 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.015834002 = queryNorm
              1.018538 = fieldWeight in 3697, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.078125 = fieldNorm(doc=3697)
          0.25333822 = weight(abstract_txt:hierarchy in 3697) [ClassicSimilarity], result of:
            0.25333822 = score(doc=3697,freq=2.0), product of:
              0.34985736 = queryWeight, product of:
                3.3712866 = boost
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.015834002 = queryNorm
              0.7241186 = fieldWeight in 3697, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.078125 = fieldNorm(doc=3697)
          0.16194047 = weight(abstract_txt:classification in 3697) [ClassicSimilarity], result of:
            0.16194047 = score(doc=3697,freq=4.0), product of:
              0.25961363 = queryWeight, product of:
                4.10704 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.015834002 = queryNorm
              0.6237749 = fieldWeight in 3697, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=3697)
          0.3456143 = weight(abstract_txt:hierarchical in 3697) [ClassicSimilarity], result of:
            0.3456143 = score(doc=3697,freq=3.0), product of:
              0.445729 = queryWeight, product of:
                4.912584 = boost
                5.7302055 = idf(docFreq=391, maxDocs=44421)
                0.015834002 = queryNorm
              0.77539116 = fieldWeight in 3697, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7302055 = idf(docFreq=391, maxDocs=44421)
                0.078125 = fieldNorm(doc=3697)
        0.24 = coord(6/25)
    
  2. Sun, A.; Lim, E.-P.; Ng, W.-K.: Performance measurement framework for hierarchical text classification (2003) 0.24
    0.2391781 = sum of:
      0.2391781 = product of:
        0.9965755 = sum of:
          0.05452619 = weight(abstract_txt:tree in 2808) [ClassicSimilarity], result of:
            0.05452619 = score(doc=2808,freq=1.0), product of:
              0.12737091 = queryWeight, product of:
                1.1744233 = boost
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.015834002 = queryNorm
              0.42808983 = fieldWeight in 2808, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.0625 = fieldNorm(doc=2808)
          0.022392318 = weight(abstract_txt:text in 2808) [ClassicSimilarity], result of:
            0.022392318 = score(doc=2808,freq=1.0), product of:
              0.08866309 = queryWeight, product of:
                1.3857216 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.015834002 = queryNorm
              0.25255513 = fieldWeight in 2808, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=2808)
          0.046350423 = weight(abstract_txt:method in 2808) [ClassicSimilarity], result of:
            0.046350423 = score(doc=2808,freq=1.0), product of:
              0.16484523 = queryWeight, product of:
                2.3141334 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.015834002 = queryNorm
              0.2811754 = fieldWeight in 2808, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.0625 = fieldNorm(doc=2808)
          0.32361987 = weight(abstract_txt:classifiers in 2808) [ClassicSimilarity], result of:
            0.32361987 = score(doc=2808,freq=5.0), product of:
              0.30764055 = queryWeight, product of:
                2.5812278 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.015834002 = queryNorm
              1.0519415 = fieldWeight in 2808, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.0625 = fieldNorm(doc=2808)
          0.15866862 = weight(abstract_txt:classification in 2808) [ClassicSimilarity], result of:
            0.15866862 = score(doc=2808,freq=6.0), product of:
              0.25961363 = queryWeight, product of:
                4.10704 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.015834002 = queryNorm
              0.61117214 = fieldWeight in 2808, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=2808)
          0.391018 = weight(abstract_txt:hierarchical in 2808) [ClassicSimilarity], result of:
            0.391018 = score(doc=2808,freq=6.0), product of:
              0.445729 = queryWeight, product of:
                4.912584 = boost
                5.7302055 = idf(docFreq=391, maxDocs=44421)
                0.015834002 = queryNorm
              0.877255 = fieldWeight in 2808, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.7302055 = idf(docFreq=391, maxDocs=44421)
                0.0625 = fieldNorm(doc=2808)
        0.24 = coord(6/25)
    
  3. Liu, R.-L.: Context recognition for hierarchical text classification (2009) 0.24
    0.23863964 = sum of:
      0.23863964 = product of:
        0.99433184 = sum of:
          0.07885829 = weight(abstract_txt:performs in 3760) [ClassicSimilarity], result of:
            0.07885829 = score(doc=3760,freq=1.0), product of:
              0.14037563 = queryWeight, product of:
                1.2329215 = boost
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.015834002 = queryNorm
              0.56176627 = fieldWeight in 3760, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.078125 = fieldNorm(doc=3760)
          0.068562195 = weight(abstract_txt:text in 3760) [ClassicSimilarity], result of:
            0.068562195 = score(doc=3760,freq=6.0), product of:
              0.08866309 = queryWeight, product of:
                1.3857216 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.015834002 = queryNorm
              0.7732891 = fieldWeight in 3760, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=3760)
          0.032551624 = weight(abstract_txt:system in 3760) [ClassicSimilarity], result of:
            0.032551624 = score(doc=3760,freq=1.0), product of:
              0.12353648 = queryWeight, product of:
                2.3132212 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.015834002 = queryNorm
              0.26349807 = fieldWeight in 3760, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.078125 = fieldNorm(doc=3760)
          0.25333822 = weight(abstract_txt:hierarchy in 3760) [ClassicSimilarity], result of:
            0.25333822 = score(doc=3760,freq=2.0), product of:
              0.34985736 = queryWeight, product of:
                3.3712866 = boost
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.015834002 = queryNorm
              0.7241186 = fieldWeight in 3760, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.078125 = fieldNorm(doc=3760)
          0.16194047 = weight(abstract_txt:classification in 3760) [ClassicSimilarity], result of:
            0.16194047 = score(doc=3760,freq=4.0), product of:
              0.25961363 = queryWeight, product of:
                4.10704 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.015834002 = queryNorm
              0.6237749 = fieldWeight in 3760, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=3760)
          0.39908105 = weight(abstract_txt:hierarchical in 3760) [ClassicSimilarity], result of:
            0.39908105 = score(doc=3760,freq=4.0), product of:
              0.445729 = queryWeight, product of:
                4.912584 = boost
                5.7302055 = idf(docFreq=391, maxDocs=44421)
                0.015834002 = queryNorm
              0.8953446 = fieldWeight in 3760, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.7302055 = idf(docFreq=391, maxDocs=44421)
                0.078125 = fieldNorm(doc=3760)
        0.24 = coord(6/25)
    
  4. Li, T.; Zhu, S.; Ogihara, M.: Hierarchical document classification using automatically generated hierarchy (2007) 0.23
    0.23290156 = sum of:
      0.23290156 = product of:
        0.8317913 = sum of:
          0.105114974 = weight(abstract_txt:categorization in 797) [ClassicSimilarity], result of:
            0.105114974 = score(doc=797,freq=3.0), product of:
              0.11788615 = queryWeight, product of:
                1.1298504 = boost
                6.58948 = idf(docFreq=165, maxDocs=44421)
                0.015834002 = queryNorm
              0.89166516 = fieldWeight in 797, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.58948 = idf(docFreq=165, maxDocs=44421)
                0.078125 = fieldNorm(doc=797)
          0.131841 = weight(abstract_txt:hierarchies in 797) [ClassicSimilarity], result of:
            0.131841 = score(doc=797,freq=3.0), product of:
              0.13710503 = queryWeight, product of:
                1.2184739 = boost
                7.1063476 = idf(docFreq=98, maxDocs=44421)
                0.015834002 = queryNorm
              0.96160585 = fieldWeight in 797, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.1063476 = idf(docFreq=98, maxDocs=44421)
                0.078125 = fieldNorm(doc=797)
          0.0395844 = weight(abstract_txt:text in 797) [ClassicSimilarity], result of:
            0.0395844 = score(doc=797,freq=2.0), product of:
              0.08866309 = queryWeight, product of:
                1.3857216 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.015834002 = queryNorm
              0.4464586 = fieldWeight in 797, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=797)
          0.037189405 = weight(abstract_txt:large in 797) [ClassicSimilarity], result of:
            0.037189405 = score(doc=797,freq=1.0), product of:
              0.10715595 = queryWeight, product of:
                1.5233955 = boost
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.015834002 = queryNorm
              0.3470587 = fieldWeight in 797, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.078125 = fieldNorm(doc=797)
          0.057938028 = weight(abstract_txt:method in 797) [ClassicSimilarity], result of:
            0.057938028 = score(doc=797,freq=1.0), product of:
              0.16484523 = queryWeight, product of:
                2.3141334 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.015834002 = queryNorm
              0.35146925 = fieldWeight in 797, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.078125 = fieldNorm(doc=797)
          0.114509195 = weight(abstract_txt:classification in 797) [ClassicSimilarity], result of:
            0.114509195 = score(doc=797,freq=2.0), product of:
              0.25961363 = queryWeight, product of:
                4.10704 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.015834002 = queryNorm
              0.44107544 = fieldWeight in 797, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=797)
          0.3456143 = weight(abstract_txt:hierarchical in 797) [ClassicSimilarity], result of:
            0.3456143 = score(doc=797,freq=3.0), product of:
              0.445729 = queryWeight, product of:
                4.912584 = boost
                5.7302055 = idf(docFreq=391, maxDocs=44421)
                0.015834002 = queryNorm
              0.77539116 = fieldWeight in 797, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7302055 = idf(docFreq=391, maxDocs=44421)
                0.078125 = fieldNorm(doc=797)
        0.28 = coord(7/25)
    
  5. Pons-Porrata, A.; Berlanga-Llavori, R.; Ruiz-Shulcloper, J.: Topic discovery based on text mining techniques (2007) 0.19
    0.19108298 = sum of:
      0.19108298 = product of:
        0.6824392 = sum of:
          0.042699914 = weight(abstract_txt:build in 1916) [ClassicSimilarity], result of:
            0.042699914 = score(doc=1916,freq=1.0), product of:
              0.09325629 = queryWeight, product of:
                1.0049133 = boost
                5.860826 = idf(docFreq=343, maxDocs=44421)
                0.015834002 = queryNorm
              0.45787704 = fieldWeight in 1916, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.860826 = idf(docFreq=343, maxDocs=44421)
                0.078125 = fieldNorm(doc=1916)
          0.07611844 = weight(abstract_txt:hierarchies in 1916) [ClassicSimilarity], result of:
            0.07611844 = score(doc=1916,freq=1.0), product of:
              0.13710503 = queryWeight, product of:
                1.2184739 = boost
                7.1063476 = idf(docFreq=98, maxDocs=44421)
                0.015834002 = queryNorm
              0.5551834 = fieldWeight in 1916, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1063476 = idf(docFreq=98, maxDocs=44421)
                0.078125 = fieldNorm(doc=1916)
          0.046034947 = weight(abstract_txt:system in 1916) [ClassicSimilarity], result of:
            0.046034947 = score(doc=1916,freq=2.0), product of:
              0.12353648 = queryWeight, product of:
                2.3132212 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.015834002 = queryNorm
              0.37264252 = fieldWeight in 1916, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.078125 = fieldNorm(doc=1916)
          0.057938028 = weight(abstract_txt:method in 1916) [ClassicSimilarity], result of:
            0.057938028 = score(doc=1916,freq=1.0), product of:
              0.16484523 = queryWeight, product of:
                2.3141334 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.015834002 = queryNorm
              0.35146925 = fieldWeight in 1916, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.078125 = fieldNorm(doc=1916)
          0.17913717 = weight(abstract_txt:hierarchy in 1916) [ClassicSimilarity], result of:
            0.17913717 = score(doc=1916,freq=1.0), product of:
              0.34985736 = queryWeight, product of:
                3.3712866 = boost
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.015834002 = queryNorm
              0.5120292 = fieldWeight in 1916, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.078125 = fieldNorm(doc=1916)
          0.080970235 = weight(abstract_txt:classification in 1916) [ClassicSimilarity], result of:
            0.080970235 = score(doc=1916,freq=1.0), product of:
              0.25961363 = queryWeight, product of:
                4.10704 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.015834002 = queryNorm
              0.31188744 = fieldWeight in 1916, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=1916)
          0.19954053 = weight(abstract_txt:hierarchical in 1916) [ClassicSimilarity], result of:
            0.19954053 = score(doc=1916,freq=1.0), product of:
              0.445729 = queryWeight, product of:
                4.912584 = boost
                5.7302055 = idf(docFreq=391, maxDocs=44421)
                0.015834002 = queryNorm
              0.4476723 = fieldWeight in 1916, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7302055 = idf(docFreq=391, maxDocs=44421)
                0.078125 = fieldNorm(doc=1916)
        0.28 = coord(7/25)
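
The content scores combine several per-term scores and a coordination factor. As a rough sketch under the same assumptions (standard ClassicSimilarity tf and idf formulas; helper names are hypothetical), the following recomputes the total for entry 5 (doc=1916) from the boosts, frequencies, and document frequencies shown in its explain tree. The queryNorm, fieldNorm, and coord values are copied from the listing rather than derived.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.015834002   # copied from the listing
FIELD_NORM = 0.078125      # fieldNorm(doc=1916)
COORD = 7 / 25             # 7 of 25 query terms matched

def classic_idf(doc_freq: int) -> float:
    # idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))

def term_score(boost: float, freq: float, doc_freq: int) -> float:
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM             # queryWeight
    field_weight = math.sqrt(freq) * idf * FIELD_NORM   # fieldWeight
    return query_weight * field_weight

# (term, boost, freq, docFreq) as listed for doc=1916.
terms = [
    ("build",          1.0049133, 1.0, 343),
    ("hierarchies",    1.2184739, 1.0, 98),
    ("system",         2.3132212, 2.0, 4140),
    ("method",         2.3141334, 1.0, 1342),
    ("hierarchy",      3.3712866, 1.0, 171),
    ("classification", 4.10704,   1.0, 2228),
    ("hierarchical",   4.912584,  1.0, 391),
]

term_sum = sum(term_score(b, f, df) for _, b, f, df in terms)
total = term_sum * COORD
print(term_sum, total)  # close to 0.6824392 and 0.19108298
```

The coord factor explains why entry 4 outranks entry 5 by less than its raw term sum would suggest, and why entries 1 to 3, which match only 6 of 25 terms, are scaled by 0.24 instead of 0.28.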