Document (#37239)

Author
Alberts, I.
Forest, D.
Title
Email pragmatics and automatic classification : a study in the organizational context
Source
Journal of the American Society for Information Science and Technology. 63(2012) no.5, pp. 904-922
Year
2012
Abstract
This paper presents a two-phased research project aiming to improve email triage for public administration managers. The first phase developed a typology of email classification patterns through a qualitative study involving 34 participants. Inspired by the fields of pragmatics and speech act theory, this typology, comprising four top-level categories and 13 subcategories, represents the typical email triage behaviors of managers in an organizational context. The second study phase was conducted on a corpus of 1,703 messages using email samples of two managers. Using the k-NN (k-nearest neighbor) algorithm, statistical treatments automatically classified the email according to lexical and nonlexical features representative of managers' triage patterns. The automatic classification of email according to the lexicon of the messages was found to be substantially more efficient when k = 2 and n = 2,000. For four categories, the average recall rate was 94.32%, the average precision rate was 94.50%, and the accuracy rate was 94.54%. For 13 categories, the average recall rate was 91.09%, the average precision rate was 84.18%, and the accuracy rate was 88.70%. It appears that a message's nonlexical features are also deeply influenced by email pragmatics. Features related to the recipient and the sender were the most relevant for characterizing email.
Theme
Automatic classification (Automatisches Klassifizieren)
Object
Email
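
The k-NN step mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the paper's feature weighting, distance measure, or category names, so everything below is an assumption made for illustration: whitespace-tokenized bag-of-words vectors, cosine similarity, similarity-weighted voting, and the hypothetical labels "action" and "information". Only k = 2 is taken from the abstract.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Bag-of-words term counts (assumed lexical representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(message, training, k=2):
    """Label a message by similarity-weighted vote of its k nearest neighbors."""
    query = vectorize(message)
    ranked = sorted(
        ((cosine(query, vectorize(text)), label) for text, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    scores = defaultdict(float)
    for similarity, label in ranked[:k]:
        scores[label] += similarity
    return max(scores, key=scores.get)


# Hypothetical toy training set; not data from the study.
training = [
    ("please approve the attached budget request", "action"),
    ("please review and sign the contract", "action"),
    ("fyi minutes of the meeting attached", "information"),
    ("fyi newsletter for this month", "information"),
]

predicted = knn_classify("please approve the contract", training, k=2)
```

Weighting votes by similarity is one way to avoid ties when k is even; a plain majority vote would need a separate tie-breaking rule.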

Similar documents (content)

  1. Gao, N.; Dredze, M.; Oard, D.W.: Person entity linking in email with NIL detection (2017) 0.14
    0.13600178 = sum of:
      0.13600178 = product of:
        0.8500111 = sum of:
          0.049557745 = weight(abstract_txt:accuracy in 4830) [ClassicSimilarity], result of:
            0.049557745 = score(doc=4830,freq=2.0), product of:
              0.09414894 = queryWeight, product of:
                1.3430523 = boost
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.01177122 = queryNorm
              0.526376 = fieldWeight in 4830, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.0625 = fieldNorm(doc=4830)
          0.023298642 = weight(abstract_txt:features in 4830) [ClassicSimilarity], result of:
            0.023298642 = score(doc=4830,freq=1.0), product of:
              0.08209851 = queryWeight, product of:
                1.5360255 = boost
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.01177122 = queryNorm
              0.28378886 = fieldWeight in 4830, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.0625 = fieldNorm(doc=4830)
          0.054445047 = weight(abstract_txt:messages in 4830) [ClassicSimilarity], result of:
            0.054445047 = score(doc=4830,freq=1.0), product of:
              0.12629612 = queryWeight, product of:
                1.5555364 = boost
                6.8974466 = idf(docFreq=121, maxDocs=44421)
                0.01177122 = queryNorm
              0.4310904 = fieldWeight in 4830, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8974466 = idf(docFreq=121, maxDocs=44421)
                0.0625 = fieldNorm(doc=4830)
          0.72270966 = weight(abstract_txt:email in 4830) [ClassicSimilarity], result of:
            0.72270966 = score(doc=4830,freq=4.0), product of:
              0.7363956 = queryWeight, product of:
                7.967958 = boost
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.01177122 = queryNorm
              0.981415 = fieldWeight in 4830, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.0625 = fieldNorm(doc=4830)
        0.16 = coord(4/25)
    
  2. Jörgensen, P.: Incorporating context in text analysis by interactive activation with competition artificial neural networks (2005) 0.13
    0.1280149 = sum of:
      0.1280149 = product of:
        0.6400745 = sum of:
          0.009913621 = weight(abstract_txt:study in 2039) [ClassicSimilarity], result of:
            0.009913621 = score(doc=2039,freq=1.0), product of:
              0.04644472 = queryWeight, product of:
                1.1553112 = boost
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.01177122 = queryNorm
              0.21344988 = fieldWeight in 2039, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.0625 = fieldNorm(doc=2039)
          0.023914704 = weight(abstract_txt:according in 2039) [ClassicSimilarity], result of:
            0.023914704 = score(doc=2039,freq=1.0), product of:
              0.072978415 = queryWeight, product of:
                1.182449 = boost
                5.2431293 = idf(docFreq=637, maxDocs=44421)
                0.01177122 = queryNorm
              0.32769558 = fieldWeight in 2039, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2431293 = idf(docFreq=637, maxDocs=44421)
                0.0625 = fieldNorm(doc=2039)
          0.040768176 = weight(abstract_txt:phase in 2039) [ClassicSimilarity], result of:
            0.040768176 = score(doc=2039,freq=1.0), product of:
              0.10414345 = queryWeight, product of:
                1.4125414 = boost
                6.263388 = idf(docFreq=229, maxDocs=44421)
                0.01177122 = queryNorm
              0.39146176 = fieldWeight in 2039, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.263388 = idf(docFreq=229, maxDocs=44421)
                0.0625 = fieldNorm(doc=2039)
          0.054445047 = weight(abstract_txt:messages in 2039) [ClassicSimilarity], result of:
            0.054445047 = score(doc=2039,freq=1.0), product of:
              0.12629612 = queryWeight, product of:
                1.5555364 = boost
                6.8974466 = idf(docFreq=121, maxDocs=44421)
                0.01177122 = queryNorm
              0.4310904 = fieldWeight in 2039, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8974466 = idf(docFreq=121, maxDocs=44421)
                0.0625 = fieldNorm(doc=2039)
          0.51103294 = weight(abstract_txt:email in 2039) [ClassicSimilarity], result of:
            0.51103294 = score(doc=2039,freq=2.0), product of:
              0.7363956 = queryWeight, product of:
                7.967958 = boost
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.01177122 = queryNorm
              0.6939652 = fieldWeight in 2039, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.0625 = fieldNorm(doc=2039)
        0.2 = coord(5/25)
    
  3. Rodriguez-Esteban, R.; Vishnyakova, D.; Rinaldi, F.: Revisiting the decay of scientific email addresses (2022) 0.12
    0.123440966 = sum of:
      0.123440966 = product of:
        1.5430121 = sum of:
          0.009913621 = weight(abstract_txt:study in 1450) [ClassicSimilarity], result of:
            0.009913621 = score(doc=1450,freq=1.0), product of:
              0.04644472 = queryWeight, product of:
                1.1553112 = boost
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.01177122 = queryNorm
              0.21344988 = fieldWeight in 1450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.0625 = fieldNorm(doc=1450)
          1.5330986 = weight(abstract_txt:email in 1450) [ClassicSimilarity], result of:
            1.5330986 = score(doc=1450,freq=18.0), product of:
              0.7363956 = queryWeight, product of:
                7.967958 = boost
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.01177122 = queryNorm
              2.0818954 = fieldWeight in 1450, product of:
                4.2426405 = tf(freq=18.0), with freq of:
                  18.0 = termFreq=18.0
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.0625 = fieldNorm(doc=1450)
        0.08 = coord(2/25)
    
  4. Na, J.-C.; Sui, H.; Khoo, C.; Chan, S.; Zhou, Y.: Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews (2004) 0.08
    0.08064539 = sum of:
      0.08064539 = product of:
        0.33602247 = sum of:
          0.02216753 = weight(abstract_txt:study in 3624) [ClassicSimilarity], result of:
            0.02216753 = score(doc=3624,freq=5.0), product of:
              0.04644472 = queryWeight, product of:
                1.1553112 = boost
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.01177122 = queryNorm
              0.47728845 = fieldWeight in 3624, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.0625 = fieldNorm(doc=3624)
          0.023271339 = weight(abstract_txt:automatic in 3624) [ClassicSimilarity], result of:
            0.023271339 = score(doc=3624,freq=1.0), product of:
              0.07166361 = queryWeight, product of:
                1.1717489 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.01177122 = queryNorm
              0.32473022 = fieldWeight in 3624, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.0625 = fieldNorm(doc=3624)
          0.070085235 = weight(abstract_txt:accuracy in 3624) [ClassicSimilarity], result of:
            0.070085235 = score(doc=3624,freq=4.0), product of:
              0.09414894 = queryWeight, product of:
                1.3430523 = boost
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.01177122 = queryNorm
              0.7444081 = fieldWeight in 3624, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.0625 = fieldNorm(doc=3624)
          0.03540733 = weight(abstract_txt:classification in 3624) [ClassicSimilarity], result of:
            0.03540733 = score(doc=3624,freq=5.0), product of:
              0.06346296 = queryWeight, product of:
                1.3504888 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.01177122 = queryNorm
              0.55792123 = fieldWeight in 3624, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=3624)
          0.023298642 = weight(abstract_txt:features in 3624) [ClassicSimilarity], result of:
            0.023298642 = score(doc=3624,freq=1.0), product of:
              0.08209851 = queryWeight, product of:
                1.5360255 = boost
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.01177122 = queryNorm
              0.28378886 = fieldWeight in 3624, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.0625 = fieldNorm(doc=3624)
          0.1617924 = weight(abstract_txt:rate in 3624) [ClassicSimilarity], result of:
            0.1617924 = score(doc=3624,freq=2.0), product of:
              0.2988273 = queryWeight, product of:
                4.1443453 = boost
                6.1255183 = idf(docFreq=263, maxDocs=44421)
                0.01177122 = queryNorm
              0.54142445 = fieldWeight in 3624, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1255183 = idf(docFreq=263, maxDocs=44421)
                0.0625 = fieldNorm(doc=3624)
        0.24 = coord(6/25)
    
  5. Kim, Y.H.; Kim, H.H.: Development and validation of evaluation indicators for a consortium of institutional repositories : a case study of dcollection (2008) 0.08
    0.0791331 = sum of:
      0.0791331 = product of:
        0.3956655 = sum of:
          0.009913621 = weight(abstract_txt:study in 2882) [ClassicSimilarity], result of:
            0.009913621 = score(doc=2882,freq=1.0), product of:
              0.04644472 = queryWeight, product of:
                1.1553112 = boost
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.01177122 = queryNorm
              0.21344988 = fieldWeight in 2882, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.0625 = fieldNorm(doc=2882)
          0.023131758 = weight(abstract_txt:four in 2882) [ClassicSimilarity], result of:
            0.023131758 = score(doc=2882,freq=1.0), product of:
              0.07137676 = queryWeight, product of:
                1.1694014 = boost
                5.1852746 = idf(docFreq=675, maxDocs=44421)
                0.01177122 = queryNorm
              0.32407966 = fieldWeight in 2882, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1852746 = idf(docFreq=675, maxDocs=44421)
                0.0625 = fieldNorm(doc=2882)
          0.04886094 = weight(abstract_txt:categories in 2882) [ClassicSimilarity], result of:
            0.04886094 = score(doc=2882,freq=2.0), product of:
              0.10676103 = queryWeight, product of:
                1.7516091 = boost
                5.177905 = idf(docFreq=680, maxDocs=44421)
                0.01177122 = queryNorm
              0.45766646 = fieldWeight in 2882, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.177905 = idf(docFreq=680, maxDocs=44421)
                0.0625 = fieldNorm(doc=2882)
          0.08495019 = weight(abstract_txt:managers in 2882) [ClassicSimilarity], result of:
            0.08495019 = score(doc=2882,freq=1.0), product of:
              0.21406089 = queryWeight, product of:
                2.8639724 = boost
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.01177122 = queryNorm
              0.3968506 = fieldWeight in 2882, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.0625 = fieldNorm(doc=2882)
          0.228809 = weight(abstract_txt:rate in 2882) [ClassicSimilarity], result of:
            0.228809 = score(doc=2882,freq=4.0), product of:
              0.2988273 = queryWeight, product of:
                4.1443453 = boost
                6.1255183 = idf(docFreq=263, maxDocs=44421)
                0.01177122 = queryNorm
              0.7656898 = fieldWeight in 2882, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.1255183 = idf(docFreq=263, maxDocs=44421)
                0.0625 = fieldNorm(doc=2882)
        0.2 = coord(5/25)
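
The score trees above follow the classic TF-IDF pattern that Lucene's ClassicSimilarity reports: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), each term's score is queryWeight times fieldWeight, and the sum over matching terms is scaled by the coord factor. As a rough check, the sketch below recomputes the total for entry 1 (doc 4830) from the values printed in its tree. The function and variable names are illustrative, and Python's double-precision arithmetic will differ from Lucene's single-precision floats in the last digits.

```python
import math

# Constants copied from the explanation tree above.
MAX_DOCS = 44421
QUERY_NORM = 0.01177122
FIELD_NORM = 0.0625


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    """Term frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def term_score(freq, doc_freq, boost):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = tf(freq) * term_idf * FIELD_NORM
    return query_weight * field_weight


# (term, termFreq, docFreq, boost) for entry 1, doc 4830.
matches = [
    ("accuracy", 2.0, 312, 1.3430523),
    ("features", 1.0, 1287, 1.5360255),
    ("messages", 1.0, 121, 1.5555364),
    ("email", 4.0, 46, 7.967958),
]

coord = 4 / 25  # 4 of the 25 query terms matched
total = coord * sum(term_score(f, df, b) for _, f, df, b in matches)
print(f"{total:.4f}")  # close to the 0.136 reported for entry 1
```

The same function applies to the other entries once their termFreq, docFreq, boost, and coord values are substituted.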