Document (#43992)

Author
Tao, J.
Zhou, L.
Hickey, K.
Title
Making sense of the black-boxes : toward interpretable text classification using deep learning models
Source
Journal of the Association for Information Science and Technology. 74(2023) no.6, pp. 685-700
Year
2023
Abstract
Text classification is a common task in data science. Despite the superior performances of deep learning based models in various text classification tasks, their black-box nature poses significant challenges for wide adoption. The knowledge-to-action framework emphasizes several principles concerning the application and use of knowledge, such as ease-of-use, customization, and feedback. With the guidance of the above principles and the properties of interpretable machine learning, we identify the design requirements for and propose an interpretable deep learning (IDeL) based framework for text classification models. IDeL comprises three main components: feature penetration, instance aggregation, and feature perturbation. We evaluate our implementation of the framework with two distinct case studies: fake news detection and social question categorization. The experimental results provide evidence for the efficacy of IDeL components in enhancing the interpretability of text classification models. Moreover, the findings are generalizable across binary and multi-label, multi-class classification problems. The proposed IDeL framework introduces a unique iField perspective for building trusted models in data science by improving the transparency of and access to advanced black-box models.
Content
Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24642.
Theme
Computational linguistics

Similar documents (author)

  1. Hickey, D.J.: Subject analysis: an interpretative survey (1976/77) 2.03
    2.0345094 = sum of:
      2.0345094 = product of:
        4.069019 = sum of:
          4.069019 = weight(author_txt:hickey in 1710) [ClassicSimilarity], result of:
            4.069019 = score(doc=1710,freq=1.0), product of:
              0.76423967 = queryWeight, product of:
                1.0885733 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.08241225 = queryNorm
              5.3242707 = fieldWeight in 1710, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.625 = fieldNorm(doc=1710)
        0.5 = coord(1/2)
    
  2. Hickey, R.: Datenbankverwaltung auf dem PC : eine praxisorientierte Einführung für jeden Anwender (1993) 2.03
    2.0345094 = sum of:
      2.0345094 = product of:
        4.069019 = sum of:
          4.069019 = weight(author_txt:hickey in 6593) [ClassicSimilarity], result of:
            4.069019 = score(doc=6593,freq=1.0), product of:
              0.76423967 = queryWeight, product of:
                1.0885733 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.08241225 = queryNorm
              5.3242707 = fieldWeight in 6593, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.625 = fieldNorm(doc=6593)
        0.5 = coord(1/2)
    
  3. Hickey, T.B.: The Experimental Library System (XLS) (1989) 2.03
    2.0345094 = sum of:
      2.0345094 = product of:
        4.069019 = sum of:
          4.069019 = weight(author_txt:hickey in 2875) [ClassicSimilarity], result of:
            4.069019 = score(doc=2875,freq=1.0), product of:
              0.76423967 = queryWeight, product of:
                1.0885733 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.08241225 = queryNorm
              5.3242707 = fieldWeight in 2875, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.625 = fieldNorm(doc=2875)
        0.5 = coord(1/2)
    
  4. Hickey, T.B.: Present and future capabilities of the online journal (1995) 2.03
    2.0345094 = sum of:
      2.0345094 = product of:
        4.069019 = sum of:
          4.069019 = weight(author_txt:hickey in 3029) [ClassicSimilarity], result of:
            4.069019 = score(doc=3029,freq=1.0), product of:
              0.76423967 = queryWeight, product of:
                1.0885733 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.08241225 = queryNorm
              5.3242707 = fieldWeight in 3029, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.625 = fieldNorm(doc=3029)
        0.5 = coord(1/2)
    
  5. Hickey, D.D.: Doralyn Joanne Hickey, 1929-1987 : a brother librarian's perspective (1998) 2.03
    2.0345094 = sum of:
      2.0345094 = product of:
        4.069019 = sum of:
          4.069019 = weight(author_txt:hickey in 2464) [ClassicSimilarity], result of:
            4.069019 = score(doc=2464,freq=1.0), product of:
              0.76423967 = queryWeight, product of:
                1.0885733 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.08241225 = queryNorm
              5.3242707 = fieldWeight in 2464, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.625 = fieldNorm(doc=2464)
        0.5 = coord(1/2)
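
The arithmetic in the author listings above can be retraced by hand. The following is a minimal sketch, not the search engine's own code: it copies the factors shown for doc 1710 and multiplies them in the order the explain tree nests them. The variable names are illustrative. All five author hits carry the same factors, so they share the same score.

```python
# Factors copied from the explain listing for doc 1710 (author_txt:hickey).
boost = 1.0885733
idf = 8.518833          # idf(docFreq=23, maxDocs=44218)
query_norm = 0.08241225
tf = 1.0                # tf(freq=1.0)
field_norm = 0.625
coord = 0.5             # coord(1/2): one of two query clauses matched

# queryWeight and fieldWeight are each a product of three factors.
query_weight = boost * idf * query_norm
field_weight = tf * idf * field_norm

# The term score is their product; coord scales it to the final score.
term_score = query_weight * field_weight
final_score = term_score * coord

print(f"{query_weight:.6f} {field_weight:.6f} {term_score:.5f} {final_score:.5f}")
```

The results should agree with the listed 0.76423967, 5.3242707, 4.069019 and 2.0345094 up to rounding, since the listing appears to use single-precision floats while Python uses doubles.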
    

Similar documents (content)

  1. Wang, P.; Li, X.: Assessing the quality of information on Wikipedia : a deep-learning approach (2020) 0.19
    0.1925749 = sum of:
      0.1925749 = product of:
        0.6877675 = sum of:
          0.102786586 = weight(abstract_txt:feature in 5505) [ClassicSimilarity], result of:
            0.102786586 = score(doc=5505,freq=4.0), product of:
              0.13935205 = queryWeight, product of:
                1.6161258 = boost
                5.9008293 = idf(docFreq=328, maxDocs=44218)
                0.01461252 = queryNorm
              0.73760366 = fieldWeight in 5505, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.9008293 = idf(docFreq=328, maxDocs=44218)
                0.0625 = fieldNorm(doc=5505)
          0.06668144 = weight(abstract_txt:framework in 5505) [ClassicSimilarity], result of:
            0.06668144 = score(doc=5505,freq=2.0), product of:
              0.16577245 = queryWeight, product of:
                2.4928129 = boost
                4.550903 = idf(docFreq=1268, maxDocs=44218)
                0.01461252 = queryNorm
              0.40224677 = fieldWeight in 5505, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.550903 = idf(docFreq=1268, maxDocs=44218)
                0.0625 = fieldNorm(doc=5505)
          0.07586344 = weight(abstract_txt:learning in 5505) [ClassicSimilarity], result of:
            0.07586344 = score(doc=5505,freq=2.0), product of:
              0.18066087 = queryWeight, product of:
                2.602349 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.01461252 = queryNorm
              0.41992182 = fieldWeight in 5505, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=5505)
          0.18555239 = weight(abstract_txt:deep in 5505) [ClassicSimilarity], result of:
            0.18555239 = score(doc=5505,freq=3.0), product of:
              0.26030156 = queryWeight, product of:
                2.7052195 = boost
                6.5848994 = idf(docFreq=165, maxDocs=44218)
                0.01461252 = queryNorm
              0.71283627 = fieldWeight in 5505, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.5848994 = idf(docFreq=165, maxDocs=44218)
                0.0625 = fieldNorm(doc=5505)
          0.041352116 = weight(abstract_txt:text in 5505) [ClassicSimilarity], result of:
            0.041352116 = score(doc=5505,freq=1.0), product of:
              0.16361417 = queryWeight, product of:
                2.768847 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.01461252 = queryNorm
              0.25274166 = fieldWeight in 5505, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=5505)
          0.047740247 = weight(abstract_txt:classification in 5505) [ClassicSimilarity], result of:
            0.047740247 = score(doc=5505,freq=1.0), product of:
              0.19134007 = queryWeight, product of:
                3.2800622 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.01461252 = queryNorm
              0.2495047 = fieldWeight in 5505, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=5505)
          0.1677913 = weight(abstract_txt:models in 5505) [ClassicSimilarity], result of:
            0.1677913 = score(doc=5505,freq=5.0), product of:
              0.25866586 = queryWeight, product of:
                3.8137188 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.01461252 = queryNorm
              0.64867973 = fieldWeight in 5505, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.0625 = fieldNorm(doc=5505)
        0.28 = coord(7/25)
    
  2. Li, H.; Wu, H.; Li, D.; Lin, S.; Su, Z.; Luo, X.: PSI: A probabilistic semantic interpretable framework for fine-grained image ranking (2018) 0.15
    0.14818557 = sum of:
      0.14818557 = product of:
        0.7409278 = sum of:
          0.017994808 = weight(abstract_txt:science in 4577) [ClassicSimilarity], result of:
            0.017994808 = score(doc=4577,freq=1.0), product of:
              0.059657793 = queryWeight, product of:
                1.0574311 = boost
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.01461252 = queryNorm
              0.3016338 = fieldWeight in 4577, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.078125 = fieldNorm(doc=4577)
          0.08335179 = weight(abstract_txt:framework in 4577) [ClassicSimilarity], result of:
            0.08335179 = score(doc=4577,freq=2.0), product of:
              0.16577245 = queryWeight, product of:
                2.4928129 = boost
                4.550903 = idf(docFreq=1268, maxDocs=44218)
                0.01461252 = queryNorm
              0.50280845 = fieldWeight in 4577, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.550903 = idf(docFreq=1268, maxDocs=44218)
                0.078125 = fieldNorm(doc=4577)
          0.094829306 = weight(abstract_txt:learning in 4577) [ClassicSimilarity], result of:
            0.094829306 = score(doc=4577,freq=2.0), product of:
              0.18066087 = queryWeight, product of:
                2.602349 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.01461252 = queryNorm
              0.5249023 = fieldWeight in 4577, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.078125 = fieldNorm(doc=4577)
          0.35715553 = weight(abstract_txt:interpretable in 4577) [ClassicSimilarity], result of:
            0.35715553 = score(doc=4577,freq=1.0), product of:
              0.5006156 = queryWeight, product of:
                3.7515981 = boost
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.01461252 = queryNorm
              0.71343267 = fieldWeight in 4577, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.078125 = fieldNorm(doc=4577)
          0.18759638 = weight(abstract_txt:models in 4577) [ClassicSimilarity], result of:
            0.18759638 = score(doc=4577,freq=4.0), product of:
              0.25866586 = queryWeight, product of:
                3.8137188 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.01461252 = queryNorm
              0.725246 = fieldWeight in 4577, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.078125 = fieldNorm(doc=4577)
        0.2 = coord(5/25)
    
  3. Angelini, M.; Fazzini, V.; Ferro, N.; Santucci, G.; Silvello, G.: CLAIRE: A combinatorial visual analytics system for information retrieval evaluation (2018) 0.14
    0.13769856 = sum of:
      0.13769856 = product of:
        0.573744 = sum of:
          0.05421495 = weight(abstract_txt:performances in 5049) [ClassicSimilarity], result of:
            0.05421495 = score(doc=5049,freq=1.0), product of:
              0.12528712 = queryWeight, product of:
                1.0835693 = boost
                7.912698 = idf(docFreq=43, maxDocs=44218)
                0.01461252 = queryNorm
              0.43272567 = fieldWeight in 5049, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.912698 = idf(docFreq=43, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5049)
          0.08554691 = weight(abstract_txt:boxes in 5049) [ClassicSimilarity], result of:
            0.08554691 = score(doc=5049,freq=1.0), product of:
              0.16981 = queryWeight, product of:
                1.2614938 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.01461252 = queryNorm
              0.5037802 = fieldWeight in 5049, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5049)
          0.101020485 = weight(abstract_txt:components in 5049) [ClassicSimilarity], result of:
            0.101020485 = score(doc=5049,freq=7.0), product of:
              0.12495223 = queryWeight, product of:
                1.530349 = boost
                5.58764 = idf(docFreq=449, maxDocs=44218)
                0.01461252 = queryNorm
              0.8084729 = fieldWeight in 5049, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.58764 = idf(docFreq=449, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5049)
          0.09373763 = weight(abstract_txt:deep in 5049) [ClassicSimilarity], result of:
            0.09373763 = score(doc=5049,freq=1.0), product of:
              0.26030156 = queryWeight, product of:
                2.7052195 = boost
                6.5848994 = idf(docFreq=165, maxDocs=44218)
                0.01461252 = queryNorm
              0.36011168 = fieldWeight in 5049, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5848994 = idf(docFreq=165, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5049)
          0.17356525 = weight(abstract_txt:black in 5049) [ClassicSimilarity], result of:
            0.17356525 = score(doc=5049,freq=1.0), product of:
              0.3925027 = queryWeight, product of:
                3.3218915 = boost
                8.085969 = idf(docFreq=36, maxDocs=44218)
                0.01461252 = queryNorm
              0.44220144 = fieldWeight in 5049, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.085969 = idf(docFreq=36, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5049)
          0.06565873 = weight(abstract_txt:models in 5049) [ClassicSimilarity], result of:
            0.06565873 = score(doc=5049,freq=1.0), product of:
              0.25866586 = queryWeight, product of:
                3.8137188 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.01461252 = queryNorm
              0.2538361 = fieldWeight in 5049, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5049)
        0.24 = coord(6/25)
    
  4. Yan, X.; Li, X.; Song, D.: A correlation analysis on LSA and HAL semantic space models (2004) 0.11
    0.10703582 = sum of:
      0.10703582 = product of:
        0.6689739 = sum of:
          0.17283086 = weight(abstract_txt:boxes in 2152) [ClassicSimilarity], result of:
            0.17283086 = score(doc=2152,freq=2.0), product of:
              0.16981 = queryWeight, product of:
                1.2614938 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.01461252 = queryNorm
              1.0177897 = fieldWeight in 2152, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.078125 = fieldNorm(doc=2152)
          0.051690146 = weight(abstract_txt:text in 2152) [ClassicSimilarity], result of:
            0.051690146 = score(doc=2152,freq=1.0), product of:
              0.16361417 = queryWeight, product of:
                2.768847 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.01461252 = queryNorm
              0.3159271 = fieldWeight in 2152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=2152)
          0.35065475 = weight(abstract_txt:black in 2152) [ClassicSimilarity], result of:
            0.35065475 = score(doc=2152,freq=2.0), product of:
              0.3925027 = queryWeight, product of:
                3.3218915 = boost
                8.085969 = idf(docFreq=36, maxDocs=44218)
                0.01461252 = queryNorm
              0.8933818 = fieldWeight in 2152, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.085969 = idf(docFreq=36, maxDocs=44218)
                0.078125 = fieldNorm(doc=2152)
          0.09379819 = weight(abstract_txt:models in 2152) [ClassicSimilarity], result of:
            0.09379819 = score(doc=2152,freq=1.0), product of:
              0.25866586 = queryWeight, product of:
                3.8137188 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.01461252 = queryNorm
              0.362623 = fieldWeight in 2152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.078125 = fieldNorm(doc=2152)
        0.16 = coord(4/25)
    
  5. Singh, V.K.; Ghosh, I.; Sonagara, D.: Detecting fake news stories via multimodal analysis (2021) 0.10
    0.10499785 = sum of:
      0.10499785 = product of:
        0.43749106 = sum of:
          0.04998729 = weight(abstract_txt:poses in 88) [ClassicSimilarity], result of:
            0.04998729 = score(doc=88,freq=1.0), product of:
              0.108577244 = queryWeight, product of:
                1.0087253 = boost
                7.3661537 = idf(docFreq=75, maxDocs=44218)
                0.01461252 = queryNorm
              0.4603846 = fieldWeight in 88, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3661537 = idf(docFreq=75, maxDocs=44218)
                0.0625 = fieldNorm(doc=88)
          0.0203588 = weight(abstract_txt:science in 88) [ClassicSimilarity], result of:
            0.0203588 = score(doc=88,freq=2.0), product of:
              0.059657793 = queryWeight, product of:
                1.0574311 = boost
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.01461252 = queryNorm
              0.3412597 = fieldWeight in 88, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.0625 = fieldNorm(doc=88)
          0.16683891 = weight(abstract_txt:fake in 88) [ClassicSimilarity], result of:
            0.16683891 = score(doc=88,freq=7.0), product of:
              0.1267646 = queryWeight, product of:
                1.0899397 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.01461252 = queryNorm
              1.3161318 = fieldWeight in 88, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.0625 = fieldNorm(doc=88)
          0.053643554 = weight(abstract_txt:learning in 88) [ClassicSimilarity], result of:
            0.053643554 = score(doc=88,freq=1.0), product of:
              0.18066087 = queryWeight, product of:
                2.602349 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.01461252 = queryNorm
              0.29692957 = fieldWeight in 88, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=88)
          0.071623966 = weight(abstract_txt:text in 88) [ClassicSimilarity], result of:
            0.071623966 = score(doc=88,freq=3.0), product of:
              0.16361417 = queryWeight, product of:
                2.768847 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.01461252 = queryNorm
              0.4377614 = fieldWeight in 88, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=88)
          0.07503855 = weight(abstract_txt:models in 88) [ClassicSimilarity], result of:
            0.07503855 = score(doc=88,freq=1.0), product of:
              0.25866586 = queryWeight, product of:
                3.8137188 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.01461252 = queryNorm
              0.2900984 = fieldWeight in 88, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.0625 = fieldNorm(doc=88)
        0.24 = coord(6/25)
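
The content listings combine several matching terms per document. As a hedged sketch, the score of entry 4 (doc 2152) can be recomputed from the raw inputs alone, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). Both definitions are consistent with every tf and idf value printed above. The helper names `idf` and `term_score` are illustrative and are not part of any library API.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.01461252  # queryNorm shown in the content listings


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq, field_norm):
    """Score of one term: queryWeight * fieldWeight, as nested in the listing."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (boost, docFreq, freq, fieldNorm) copied from the listing for doc 2152.
terms = {
    "boxes": (1.2614938, 11, 2.0, 0.078125),
    "text": (2.768847, 2106, 1.0, 0.078125),
    "black": (3.3218915, 36, 2.0, 0.078125),
    "models": (3.8137188, 1158, 1.0, 0.078125),
}

raw_sum = sum(term_score(*factors) for factors in terms.values())
coord = 4 / 25  # 4 of the 25 query terms matched
total = raw_sum * coord

print(f"sum={raw_sum:.5f} total={total:.5f}")
```

The sum and total should land close to the listed 0.6689739 and 0.10703582. Because coord multiplies the whole sum, a document that matches few query terms is penalized even when the individual term scores are high.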