Document (#43993)

Author
Tao, J.
Zhou, L.
Hickey, K.
Title
Making sense of the black-boxes : toward interpretable text classification using deep learning models
Source
Journal of the Association for Information Science and Technology. 74(2023) no.6, pp. 685-700
Year
2023
Abstract
Text classification is a common task in data science. Despite the superior performances of deep learning based models in various text classification tasks, their black-box nature poses significant challenges for wide adoption. The knowledge-to-action framework emphasizes several principles concerning the application and use of knowledge, such as ease-of-use, customization, and feedback. With the guidance of the above principles and the properties of interpretable machine learning, we identify the design requirements for and propose an interpretable deep learning (IDeL) based framework for text classification models. IDeL comprises three main components: feature penetration, instance aggregation, and feature perturbation. We evaluate our implementation of the framework with two distinct case studies: fake news detection and social question categorization. The experiment results provide evidence for the efficacy of IDeL components in enhancing the interpretability of text classification models. Moreover, the findings are generalizable across binary and multi-label, multi-class classification problems. The proposed IDeL framework introduces a unique iField perspective for building trusted models in data science by improving the transparency of and access to advanced black-box models.
Content
Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24642.
Theme
Computational linguistics

Similar documents (author)

  1. Hickey, D.J.: Subject analysis: an interpretative survey (1976/77) 2.05
    2.048608 = sum of:
      2.048608 = product of:
        4.097216 = sum of:
          4.097216 = weight(author_txt:hickey in 1709) [ClassicSimilarity], result of:
            4.097216 = score(doc=1709,freq=1.0), product of:
              0.76912206 = queryWeight, product of:
                1.097015 = boost
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.08225629 = queryNorm
              5.3271337 = fieldWeight in 1709, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.625 = fieldNorm(doc=1709)
        0.5 = coord(1/2)
    
  2. Hickey, R.: Datenbankverwaltung auf dem PC : eine praxisorientierte Einführung für jeden Anwender (1993) 2.05
    2.048608 = sum of:
      2.048608 = product of:
        4.097216 = sum of:
          4.097216 = weight(author_txt:hickey in 6592) [ClassicSimilarity], result of:
            4.097216 = score(doc=6592,freq=1.0), product of:
              0.76912206 = queryWeight, product of:
                1.097015 = boost
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.08225629 = queryNorm
              5.3271337 = fieldWeight in 6592, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.625 = fieldNorm(doc=6592)
        0.5 = coord(1/2)
    
  3. Hickey, T.B.: ¬The Experimental Library System (XLS) (1989) 2.05
    2.048608 = sum of:
      2.048608 = product of:
        4.097216 = sum of:
          4.097216 = weight(author_txt:hickey in 2943) [ClassicSimilarity], result of:
            4.097216 = score(doc=2943,freq=1.0), product of:
              0.76912206 = queryWeight, product of:
                1.097015 = boost
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.08225629 = queryNorm
              5.3271337 = fieldWeight in 2943, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.625 = fieldNorm(doc=2943)
        0.5 = coord(1/2)
    
  4. Hickey, T.B.: Present and future capabilities of the online journal (1995) 2.05
    2.048608 = sum of:
      2.048608 = product of:
        4.097216 = sum of:
          4.097216 = weight(author_txt:hickey in 3097) [ClassicSimilarity], result of:
            4.097216 = score(doc=3097,freq=1.0), product of:
              0.76912206 = queryWeight, product of:
                1.097015 = boost
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.08225629 = queryNorm
              5.3271337 = fieldWeight in 3097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.625 = fieldNorm(doc=3097)
        0.5 = coord(1/2)
    
  5. Hickey, D.D.: Doralyn Joanne Hickey, 1929-1987 : a brother librarian's perspective (1998) 2.05
    2.048608 = sum of:
      2.048608 = product of:
        4.097216 = sum of:
          4.097216 = weight(author_txt:hickey in 3464) [ClassicSimilarity], result of:
            4.097216 = score(doc=3464,freq=1.0), product of:
              0.76912206 = queryWeight, product of:
                1.097015 = boost
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.08225629 = queryNorm
              5.3271337 = fieldWeight in 3464, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.625 = fieldNorm(doc=3464)
        0.5 = coord(1/2)
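
The arithmetic behind the author matches above can be checked by hand. The sketch below is a minimal illustration, not the catalog's own code: it assumes the usual ClassicSimilarity definitions, idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq), and takes boost, queryNorm, fieldNorm, and coord directly from the listing. The function names `classic_idf` and `single_term_score` are hypothetical.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as assumed for ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def single_term_score(boost, idf, query_norm, freq, field_norm, coord):
    """Score of a one-term match: coord * queryWeight * fieldWeight."""
    query_weight = boost * idf * query_norm          # query side
    field_weight = math.sqrt(freq) * idf * field_norm  # document side
    return coord * query_weight * field_weight


# Values copied from the "author_txt:hickey" entries in the listing.
idf = classic_idf(doc_freq=23, max_docs=44421)
score = single_term_score(
    boost=1.097015,
    idf=idf,
    query_norm=0.08225629,
    freq=1.0,
    field_norm=0.625,
    coord=0.5,  # coord(1/2): one of two query clauses matched
)
print(idf, score)
```

Because all five author matches share the same term, frequency, and fieldNorm, they receive the same score; the result should agree with the listed 2.048608 up to floating-point rounding.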
    

Similar documents (content)

  1. Wang, P.; Li, X.: Assessing the quality of information on Wikipedia : a deep-learning approach (2020) 0.19
    0.19274701 = sum of:
      0.19274701 = product of:
        0.6883822 = sum of:
          0.103383556 = weight(abstract_txt:feature in 505) [ClassicSimilarity], result of:
            0.103383556 = score(doc=505,freq=4.0), product of:
              0.1401247 = queryWeight, product of:
                1.6155357 = boost
                5.9023747 = idf(docFreq=329, maxDocs=44421)
                0.014695059 = queryNorm
              0.73779684 = fieldWeight in 505, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.9023747 = idf(docFreq=329, maxDocs=44421)
                0.0625 = fieldNorm(doc=505)
          0.06673408 = weight(abstract_txt:framework in 505) [ClassicSimilarity], result of:
            0.06673408 = score(doc=505,freq=2.0), product of:
              0.16613667 = queryWeight, product of:
                2.4877515 = boost
                4.5445113 = idf(docFreq=1282, maxDocs=44421)
                0.014695059 = queryNorm
              0.40168184 = fieldWeight in 505, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5445113 = idf(docFreq=1282, maxDocs=44421)
                0.0625 = fieldNorm(doc=505)
          0.07582105 = weight(abstract_txt:learning in 505) [ClassicSimilarity], result of:
            0.07582105 = score(doc=505,freq=2.0), product of:
              0.18089513 = queryWeight, product of:
                2.5958984 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.014695059 = queryNorm
              0.41914365 = fieldWeight in 505, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.0625 = fieldNorm(doc=505)
          0.18636243 = weight(abstract_txt:deep in 505) [ClassicSimilarity], result of:
            0.18636243 = score(doc=505,freq=3.0), product of:
              0.26149455 = queryWeight, product of:
                2.7029386 = boost
                6.5834737 = idf(docFreq=166, maxDocs=44421)
                0.014695059 = queryNorm
              0.7126819 = fieldWeight in 505, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.5834737 = idf(docFreq=166, maxDocs=44421)
                0.0625 = fieldNorm(doc=505)
          0.041467674 = weight(abstract_txt:text in 505) [ClassicSimilarity], result of:
            0.041467674 = score(doc=505,freq=1.0), product of:
              0.16419256 = queryWeight, product of:
                2.765069 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.014695059 = queryNorm
              0.25255513 = fieldWeight in 505, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=505)
          0.04798285 = weight(abstract_txt:classification in 505) [ClassicSimilarity], result of:
            0.04798285 = score(doc=505,freq=1.0), product of:
              0.19230835 = queryWeight, product of:
                3.2780755 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.014695059 = queryNorm
              0.24950996 = fieldWeight in 505, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=505)
          0.16663055 = weight(abstract_txt:models in 505) [ClassicSimilarity], result of:
            0.16663055 = score(doc=505,freq=5.0), product of:
              0.2579015 = queryWeight, product of:
                3.79618 = boost
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.014695059 = queryNorm
              0.64610153 = fieldWeight in 505, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.0625 = fieldNorm(doc=505)
        0.28 = coord(7/25)
    
  2. Li, H.; Wu, H.; Li, D.; Lin, S.; Su, Z.; Luo, X.: PSI: A probabilistic semantic interpretable framework for fine-grained image ranking (2018) 0.15
    0.14664884 = sum of:
      0.14664884 = product of:
        0.7332442 = sum of:
          0.017940367 = weight(abstract_txt:science in 577) [ClassicSimilarity], result of:
            0.017940367 = score(doc=577,freq=1.0), product of:
              0.05963683 = queryWeight, product of:
                1.0539415 = boost
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.014695059 = queryNorm
              0.30082697 = fieldWeight in 577, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.078125 = fieldNorm(doc=577)
          0.0834176 = weight(abstract_txt:framework in 577) [ClassicSimilarity], result of:
            0.0834176 = score(doc=577,freq=2.0), product of:
              0.16613667 = queryWeight, product of:
                2.4877515 = boost
                4.5445113 = idf(docFreq=1282, maxDocs=44421)
                0.014695059 = queryNorm
              0.5021023 = fieldWeight in 577, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5445113 = idf(docFreq=1282, maxDocs=44421)
                0.078125 = fieldNorm(doc=577)
          0.0947763 = weight(abstract_txt:learning in 577) [ClassicSimilarity], result of:
            0.0947763 = score(doc=577,freq=2.0), product of:
              0.18089513 = queryWeight, product of:
                2.5958984 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.014695059 = queryNorm
              0.52392954 = fieldWeight in 577, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.078125 = fieldNorm(doc=577)
          0.3508113 = weight(abstract_txt:interpretable in 577) [ClassicSimilarity], result of:
            0.3508113 = score(doc=577,freq=1.0), product of:
              0.49549565 = queryWeight, product of:
                3.7207012 = boost
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.014695059 = queryNorm
              0.7080008 = fieldWeight in 577, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.078125 = fieldNorm(doc=577)
          0.18629861 = weight(abstract_txt:models in 577) [ClassicSimilarity], result of:
            0.18629861 = score(doc=577,freq=4.0), product of:
              0.2579015 = queryWeight, product of:
                3.79618 = boost
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.014695059 = queryNorm
              0.7223635 = fieldWeight in 577, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.078125 = fieldNorm(doc=577)
        0.2 = coord(5/25)
    
  3. Angelini, M.; Fazzini, V.; Ferro, N.; Santucci, G.; Silvello, G.: CLAIRE: A combinatorial visual analytics system for information retrieval evaluation (2018) 0.14
    0.13834237 = sum of:
      0.13834237 = product of:
        0.5764265 = sum of:
          0.05458168 = weight(abstract_txt:performances in 49) [ClassicSimilarity], result of:
            0.05458168 = score(doc=49,freq=1.0), product of:
              0.12606163 = queryWeight, product of:
                1.083517 = boost
                7.917278 = idf(docFreq=43, maxDocs=44421)
                0.014695059 = queryNorm
              0.43297613 = fieldWeight in 49, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.917278 = idf(docFreq=43, maxDocs=44421)
                0.0546875 = fieldNorm(doc=49)
          0.08610452 = weight(abstract_txt:boxes in 49) [ClassicSimilarity], result of:
            0.08610452 = score(doc=49,freq=1.0), product of:
              0.17083189 = queryWeight, product of:
                1.26133 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.014695059 = queryNorm
              0.5040307 = fieldWeight in 49, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.0546875 = fieldNorm(doc=49)
          0.10165614 = weight(abstract_txt:components in 49) [ClassicSimilarity], result of:
            0.10165614 = score(doc=49,freq=7.0), product of:
              0.12568536 = queryWeight, product of:
                1.5300359 = boost
                5.59 = idf(docFreq=450, maxDocs=44421)
                0.014695059 = queryNorm
              0.80881447 = fieldWeight in 49, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.59 = idf(docFreq=450, maxDocs=44421)
                0.0546875 = fieldNorm(doc=49)
          0.094146855 = weight(abstract_txt:deep in 49) [ClassicSimilarity], result of:
            0.094146855 = score(doc=49,freq=1.0), product of:
              0.26149455 = queryWeight, product of:
                2.7029386 = boost
                6.5834737 = idf(docFreq=166, maxDocs=44421)
                0.014695059 = queryNorm
              0.36003372 = fieldWeight in 49, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5834737 = idf(docFreq=166, maxDocs=44421)
                0.0546875 = fieldNorm(doc=49)
          0.17473283 = weight(abstract_txt:black in 49) [ClassicSimilarity], result of:
            0.17473283 = score(doc=49,freq=1.0), product of:
              0.39491937 = queryWeight, product of:
                3.3216898 = boost
                8.090549 = idf(docFreq=36, maxDocs=44421)
                0.014695059 = queryNorm
              0.44245192 = fieldWeight in 49, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.090549 = idf(docFreq=36, maxDocs=44421)
                0.0546875 = fieldNorm(doc=49)
          0.06520451 = weight(abstract_txt:models in 49) [ClassicSimilarity], result of:
            0.06520451 = score(doc=49,freq=1.0), product of:
              0.2579015 = queryWeight, product of:
                3.79618 = boost
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.014695059 = queryNorm
              0.2528272 = fieldWeight in 49, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.0546875 = fieldNorm(doc=49)
        0.24 = coord(6/25)
    
  4. Yan, X.; Li, X.; Song, D.: ¬A correlation analysis on LSA and HAL semantic space models (2004) 0.11
    0.10751279 = sum of:
      0.10751279 = product of:
        0.67195493 = sum of:
          0.1739574 = weight(abstract_txt:boxes in 3152) [ClassicSimilarity], result of:
            0.1739574 = score(doc=3152,freq=2.0), product of:
              0.17083189 = queryWeight, product of:
                1.26133 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.014695059 = queryNorm
              1.0182958 = fieldWeight in 3152, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.078125 = fieldNorm(doc=3152)
          0.05183459 = weight(abstract_txt:text in 3152) [ClassicSimilarity], result of:
            0.05183459 = score(doc=3152,freq=1.0), product of:
              0.16419256 = queryWeight, product of:
                2.765069 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.014695059 = queryNorm
              0.3156939 = fieldWeight in 3152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=3152)
          0.35301363 = weight(abstract_txt:black in 3152) [ClassicSimilarity], result of:
            0.35301363 = score(doc=3152,freq=2.0), product of:
              0.39491937 = queryWeight, product of:
                3.3216898 = boost
                8.090549 = idf(docFreq=36, maxDocs=44421)
                0.014695059 = queryNorm
              0.8938879 = fieldWeight in 3152, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.090549 = idf(docFreq=36, maxDocs=44421)
                0.078125 = fieldNorm(doc=3152)
          0.093149304 = weight(abstract_txt:models in 3152) [ClassicSimilarity], result of:
            0.093149304 = score(doc=3152,freq=1.0), product of:
              0.2579015 = queryWeight, product of:
                3.79618 = boost
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.014695059 = queryNorm
              0.36118174 = fieldWeight in 3152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.078125 = fieldNorm(doc=3152)
        0.16 = coord(4/25)
    
  5. Singh, V.K.; Ghosh, I.; Sonagara, D.: Detecting fake news stories via multimodal analysis (2021) 0.11
    0.105188325 = sum of:
      0.105188325 = product of:
        0.4382847 = sum of:
          0.05006459 = weight(abstract_txt:poses in 1089) [ClassicSimilarity], result of:
            0.05006459 = score(doc=1089,freq=1.0), product of:
              0.10887065 = queryWeight, product of:
                1.0069308 = boost
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.014695059 = queryNorm
              0.4598539 = fieldWeight in 1089, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.0625 = fieldNorm(doc=1089)
          0.020297207 = weight(abstract_txt:science in 1089) [ClassicSimilarity], result of:
            0.020297207 = score(doc=1089,freq=2.0), product of:
              0.05963683 = queryWeight, product of:
                1.0539415 = boost
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.014695059 = queryNorm
              0.34034684 = fieldWeight in 1089, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.0625 = fieldNorm(doc=1089)
          0.1679658 = weight(abstract_txt:fake in 1089) [ClassicSimilarity], result of:
            0.1679658 = score(doc=1089,freq=7.0), product of:
              0.1275474 = queryWeight, product of:
                1.0898834 = boost
                7.963798 = idf(docFreq=41, maxDocs=44421)
                0.014695059 = queryNorm
              1.3168893 = fieldWeight in 1089, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.963798 = idf(docFreq=41, maxDocs=44421)
                0.0625 = fieldNorm(doc=1089)
          0.053613577 = weight(abstract_txt:learning in 1089) [ClassicSimilarity], result of:
            0.053613577 = score(doc=1089,freq=1.0), product of:
              0.18089513 = queryWeight, product of:
                2.5958984 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.014695059 = queryNorm
              0.29637933 = fieldWeight in 1089, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.0625 = fieldNorm(doc=1089)
          0.07182411 = weight(abstract_txt:text in 1089) [ClassicSimilarity], result of:
            0.07182411 = score(doc=1089,freq=3.0), product of:
              0.16419256 = queryWeight, product of:
                2.765069 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.014695059 = queryNorm
              0.4374383 = fieldWeight in 1089, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=1089)
          0.07451944 = weight(abstract_txt:models in 1089) [ClassicSimilarity], result of:
            0.07451944 = score(doc=1089,freq=1.0), product of:
              0.2579015 = queryWeight, product of:
                3.79618 = boost
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.014695059 = queryNorm
              0.28894538 = fieldWeight in 1089, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.0625 = fieldNorm(doc=1089)
        0.24 = coord(6/25)
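
For the content matches, each score is the sum of the per-term weights scaled by the coordination factor. The following sketch recomputes the score of the first content match (doc 505) under the same assumed ClassicSimilarity definitions (idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq)). The boosts, document frequencies, term frequencies, queryNorm, and fieldNorm are copied from the listing; the names `TERMS` and `term_score` are hypothetical.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.014695059
FIELD_NORM = 0.0625   # fieldNorm(doc=505)
COORD = 7 / 25        # 7 of 25 query terms matched

# (term, boost, docFreq, freq in doc 505), copied from the listing.
TERMS = [
    ("feature", 1.6155357, 329, 4.0),
    ("framework", 2.4877515, 1282, 2.0),
    ("learning", 2.5958984, 1052, 2.0),
    ("deep", 2.7029386, 166, 3.0),
    ("text", 2.765069, 2122, 1.0),
    ("classification", 3.2780755, 2228, 1.0),
    ("models", 3.79618, 1185, 5.0),
]


def term_score(boost: float, doc_freq: int, freq: float) -> float:
    """Weight of one matching term: queryWeight * fieldWeight."""
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight


total = sum(term_score(boost, df, freq) for _, boost, df, freq in TERMS)
doc_score = COORD * total
print(total, doc_score)
```

The sum should come out close to the listed 0.6883822 and the final score close to 0.19274701. The coordination factor explains why a document matching fewer terms more strongly (such as entry 2, with the rare term "interpretable") can still rank below one matching more terms.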