Document (#20912)

Deogun, J.S.
Feature selection and effective classifiers
Journal of the American Society for Information Science. 49(1998) no.5, pp.423-434
Develops and analyzes 4 algorithms for feature selection in the context of rough set methodology. Develops the notion of accuracy of classification that can be used for upper or lower classification methods and defines the feature selection problem. Presents a discussion of upper classifiers, develops 4 feature selection heuristics, and discusses the family of stepwise backward selection algorithms. Analyzes the worst-case time complexity of all algorithms presented. Discusses details of the experiments and results of using a family of stepwise backward selection learning data sets and a duodenal ulcer data set. Includes the experimental setup and results of a comparison of lower classifiers and upper classifiers on the duodenal ulcer data set. Discusses extended decision tables.
Contribution to a special issue devoted to knowledge discovery and data mining
Data Mining

Similar documents (content)

  1. Aphinyanaphongs, Y.; Fu, L.D.; Li, Z.; Peskin, E.R.; Efstathiadis, E.; Aliferis, C.F.; Statnikov, A.: A comprehensive empirical comparison of modern supervised classification and feature selection methods for text categorization (2014) 0.15
    0.14974259 = sum of:
      0.14974259 = product of:
        0.7487129 = sum of:
          0.04018067 = weight(abstract_txt:classification in 2496) [ClassicSimilarity], result of:
            0.04018067 = score(doc=2496,freq=4.0), product of:
              0.064415336 = queryWeight, product of:
                1.3716558 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.011763492 = queryNorm
              0.6237749 = fieldWeight in 2496, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=2496)
          0.01749346 = weight(abstract_txt:data in 2496) [ClassicSimilarity], result of:
            0.01749346 = score(doc=2496,freq=1.0), product of:
              0.06723758 = queryWeight, product of:
                1.7163355 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.011763492 = queryNorm
              0.26017386 = fieldWeight in 2496, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.078125 = fieldNorm(doc=2496)
          0.19478932 = weight(abstract_txt:feature in 2496) [ClassicSimilarity], result of:
            0.19478932 = score(doc=2496,freq=4.0), product of:
              0.21121188 = queryWeight, product of:
                3.0419726 = boost
                5.9023747 = idf(docFreq=329, maxDocs=44421)
                0.011763492 = queryNorm
              0.92224604 = fieldWeight in 2496, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.9023747 = idf(docFreq=329, maxDocs=44421)
                0.078125 = fieldNorm(doc=2496)
          0.20199215 = weight(abstract_txt:classifiers in 2496) [ClassicSimilarity], result of:
            0.20199215 = score(doc=2496,freq=1.0), product of:
              0.343493 = queryWeight, product of:
                3.8793154 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.011763492 = queryNorm
              0.58805317 = fieldWeight in 2496, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.078125 = fieldNorm(doc=2496)
          0.2942573 = weight(abstract_txt:selection in 2496) [ClassicSimilarity], result of:
            0.2942573 = score(doc=2496,freq=4.0), product of:
              0.35035077 = queryWeight, product of:
                5.540675 = boost
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.011763492 = queryNorm
              0.83989346 = fieldWeight in 2496, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.078125 = fieldNorm(doc=2496)
        0.2 = coord(5/25)
  2. Dietterich, T.G.: Machine-learning research : four current directions (1997) 0.15
    0.14893104 = sum of:
      0.14893104 = product of:
        0.7446552 = sum of:
          0.06669092 = weight(abstract_txt:accuracy in 4321) [ClassicSimilarity], result of:
            0.06669092 = score(doc=4321,freq=1.0), product of:
              0.07167136 = queryWeight, product of:
                1.023077 = boost
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.011763492 = queryNorm
              0.9305101 = fieldWeight in 4321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.15625 = fieldNorm(doc=4321)
          0.04018067 = weight(abstract_txt:classification in 4321) [ClassicSimilarity], result of:
            0.04018067 = score(doc=4321,freq=1.0), product of:
              0.064415336 = queryWeight, product of:
                1.3716558 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.011763492 = queryNorm
              0.6237749 = fieldWeight in 4321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.15625 = fieldNorm(doc=4321)
          0.058362205 = weight(abstract_txt:discusses in 4321) [ClassicSimilarity], result of:
            0.058362205 = score(doc=4321,freq=1.0), product of:
              0.09457202 = queryWeight, product of:
                2.0355299 = boost
                3.9495623 = idf(docFreq=2325, maxDocs=44421)
                0.011763492 = queryNorm
              0.61711913 = fieldWeight in 4321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9495623 = idf(docFreq=2325, maxDocs=44421)
                0.15625 = fieldNorm(doc=4321)
          0.17543711 = weight(abstract_txt:algorithms in 4321) [ClassicSimilarity], result of:
            0.17543711 = score(doc=4321,freq=1.0), product of:
              0.1969802 = queryWeight, product of:
                2.9376998 = boost
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.011763492 = queryNorm
              0.8906332 = fieldWeight in 4321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.15625 = fieldNorm(doc=4321)
          0.4039843 = weight(abstract_txt:classifiers in 4321) [ClassicSimilarity], result of:
            0.4039843 = score(doc=4321,freq=1.0), product of:
              0.343493 = queryWeight, product of:
                3.8793154 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.011763492 = queryNorm
              1.1761063 = fieldWeight in 4321, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.15625 = fieldNorm(doc=4321)
        0.2 = coord(5/25)
  3. Goller, C.; Löning, J.; Will, T.; Wolff, W.: Automatic document classification : a thorough evaluation of various methods (2000) 0.14
    0.13793674 = sum of:
      0.13793674 = product of:
        0.6896837 = sum of:
          0.013292116 = weight(abstract_txt:results in 6480) [ClassicSimilarity], result of:
            0.013292116 = score(doc=6480,freq=1.0), product of:
              0.048909575 = queryWeight, product of:
                1.1952189 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.011763492 = queryNorm
              0.2717692 = fieldWeight in 6480, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.078125 = fieldNorm(doc=6480)
          0.044923354 = weight(abstract_txt:classification in 6480) [ClassicSimilarity], result of:
            0.044923354 = score(doc=6480,freq=5.0), product of:
              0.064415336 = queryWeight, product of:
                1.3716558 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.011763492 = queryNorm
              0.6974015 = fieldWeight in 6480, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=6480)
          0.13773684 = weight(abstract_txt:feature in 6480) [ClassicSimilarity], result of:
            0.13773684 = score(doc=6480,freq=2.0), product of:
              0.21121188 = queryWeight, product of:
                3.0419726 = boost
                5.9023747 = idf(docFreq=329, maxDocs=44421)
                0.011763492 = queryNorm
              0.65212643 = fieldWeight in 6480, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9023747 = idf(docFreq=329, maxDocs=44421)
                0.078125 = fieldNorm(doc=6480)
          0.28566003 = weight(abstract_txt:classifiers in 6480) [ClassicSimilarity], result of:
            0.28566003 = score(doc=6480,freq=2.0), product of:
              0.343493 = queryWeight, product of:
                3.8793154 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.011763492 = queryNorm
              0.83163273 = fieldWeight in 6480, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.078125 = fieldNorm(doc=6480)
          0.20807135 = weight(abstract_txt:selection in 6480) [ClassicSimilarity], result of:
            0.20807135 = score(doc=6480,freq=2.0), product of:
              0.35035077 = queryWeight, product of:
                5.540675 = boost
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.011763492 = queryNorm
              0.59389436 = fieldWeight in 6480, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.078125 = fieldNorm(doc=6480)
        0.2 = coord(5/25)
  4. Yoon, Y.; Lee, G.G.: Efficient implementation of associative classifiers for document classification (2007) 0.13
    0.13328171 = sum of:
      0.13328171 = product of:
        0.55534047 = sum of:
          0.02667637 = weight(abstract_txt:accuracy in 1909) [ClassicSimilarity], result of:
            0.02667637 = score(doc=1909,freq=1.0), product of:
              0.07167136 = queryWeight, product of:
                1.023077 = boost
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.011763492 = queryNorm
              0.37220404 = fieldWeight in 1909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.0625 = fieldNorm(doc=1909)
          0.010633692 = weight(abstract_txt:results in 1909) [ClassicSimilarity], result of:
            0.010633692 = score(doc=1909,freq=1.0), product of:
              0.048909575 = queryWeight, product of:
                1.1952189 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.011763492 = queryNorm
              0.21741535 = fieldWeight in 1909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0625 = fieldNorm(doc=1909)
          0.042523224 = weight(abstract_txt:classification in 1909) [ClassicSimilarity], result of:
            0.042523224 = score(doc=1909,freq=7.0), product of:
              0.064415336 = queryWeight, product of:
                1.3716558 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.011763492 = queryNorm
              0.6601413 = fieldWeight in 1909, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=1909)
          0.07791573 = weight(abstract_txt:feature in 1909) [ClassicSimilarity], result of:
            0.07791573 = score(doc=1909,freq=1.0), product of:
              0.21121188 = queryWeight, product of:
                3.0419726 = boost
                5.9023747 = idf(docFreq=329, maxDocs=44421)
                0.011763492 = queryNorm
              0.36889842 = fieldWeight in 1909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9023747 = idf(docFreq=329, maxDocs=44421)
                0.0625 = fieldNorm(doc=1909)
          0.27988854 = weight(abstract_txt:classifiers in 1909) [ClassicSimilarity], result of:
            0.27988854 = score(doc=1909,freq=3.0), product of:
              0.343493 = queryWeight, product of:
                3.8793154 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.011763492 = queryNorm
              0.81483036 = fieldWeight in 1909, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.0625 = fieldNorm(doc=1909)
          0.11770292 = weight(abstract_txt:selection in 1909) [ClassicSimilarity], result of:
            0.11770292 = score(doc=1909,freq=1.0), product of:
              0.35035077 = queryWeight, product of:
                5.540675 = boost
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.011763492 = queryNorm
              0.33595738 = fieldWeight in 1909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.0625 = fieldNorm(doc=1909)
        0.24 = coord(6/25)
  5. Mengle, S.S.R.; Goharian, N.: Ambiguity measure feature-selection algorithm (2009) 0.13
    0.13169561 = sum of:
      0.13169561 = product of:
        0.54873174 = sum of:
          0.026011996 = weight(abstract_txt:complexity in 3804) [ClassicSimilarity], result of:
            0.026011996 = score(doc=3804,freq=1.0), product of:
              0.07047638 = queryWeight, product of:
                1.0145123 = boost
                5.90541 = idf(docFreq=328, maxDocs=44421)
                0.011763492 = queryNorm
              0.3690881 = fieldWeight in 3804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.90541 = idf(docFreq=328, maxDocs=44421)
                0.0625 = fieldNorm(doc=3804)
          0.02667637 = weight(abstract_txt:accuracy in 3804) [ClassicSimilarity], result of:
            0.02667637 = score(doc=3804,freq=1.0), product of:
              0.07167136 = queryWeight, product of:
                1.023077 = boost
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.011763492 = queryNorm
              0.37220404 = fieldWeight in 3804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.0625 = fieldNorm(doc=3804)
          0.010633692 = weight(abstract_txt:results in 3804) [ClassicSimilarity], result of:
            0.010633692 = score(doc=3804,freq=1.0), product of:
              0.048909575 = queryWeight, product of:
                1.1952189 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.011763492 = queryNorm
              0.21741535 = fieldWeight in 3804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0625 = fieldNorm(doc=3804)
          0.016072268 = weight(abstract_txt:classification in 3804) [ClassicSimilarity], result of:
            0.016072268 = score(doc=3804,freq=1.0), product of:
              0.064415336 = queryWeight, product of:
                1.3716558 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.011763492 = queryNorm
              0.24950996 = fieldWeight in 3804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=3804)
          0.20614564 = weight(abstract_txt:feature in 3804) [ClassicSimilarity], result of:
            0.20614564 = score(doc=3804,freq=7.0), product of:
              0.21121188 = queryWeight, product of:
                3.0419726 = boost
                5.9023747 = idf(docFreq=329, maxDocs=44421)
                0.011763492 = queryNorm
              0.9760135 = fieldWeight in 3804, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.9023747 = idf(docFreq=329, maxDocs=44421)
                0.0625 = fieldNorm(doc=3804)
          0.26319176 = weight(abstract_txt:selection in 3804) [ClassicSimilarity], result of:
            0.26319176 = score(doc=3804,freq=5.0), product of:
              0.35035077 = queryWeight, product of:
                5.540675 = boost
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.011763492 = queryNorm
              0.75122356 = fieldWeight in 3804, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.0625 = fieldNorm(doc=3804)
        0.24 = coord(6/25)
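
The score trees above all follow the same arithmetic, which can be reproduced by hand. The sketch below is a minimal illustration, not the search engine's own code: the function names (`classic_idf`, `term_score`, `document_score`) are hypothetical, and it assumes the usual ClassicSimilarity definitions, namely tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. The input numbers are copied from the explanation for doc 2496 (entry 1).

```python
import math


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency, assuming the ClassicSimilarity formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, boost, idf, query_norm, field_norm):
    """Score of one matching term: queryWeight * fieldWeight."""
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


def document_score(terms, query_norm, field_norm, matched, total):
    """Sum of the term scores, scaled by coord = matched / total query terms."""
    total_score = sum(
        term_score(freq, boost, idf, query_norm, field_norm)
        for freq, boost, idf in terms
    )
    return total_score * (matched / total)


# (freq, boost, idf) for each matching term of doc 2496, copied from the listing.
terms_2496 = [
    (4.0, 1.3716558, 3.9921594),  # classification
    (1.0, 1.7163355, 3.3302255),  # data
    (4.0, 3.0419726, 5.9023747),  # feature
    (1.0, 3.8793154, 7.5270805),  # classifiers
    (4.0, 5.540675, 5.375318),    # selection
]

score_2496 = document_score(
    terms_2496,
    query_norm=0.011763492,
    field_norm=0.078125,
    matched=5,
    total=25,
)
print(round(score_2496, 4))
```

The result agrees with the listed 0.14974259 up to rounding, since the values in the listing are printed at single precision. The same functions apply to the other entries once their `fieldNorm` and `coord` values are substituted.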