Document (#42004)

Author
Xu, L.
Qiu, J.
Title
Unsupervised multi-class sentiment classification approach
Source
Knowledge organization. 46(2019) no.1, pp. 15-32
Year
2019
Abstract
Real-time and accurate multi-class sentiment classification serves as a tool to gauge public user experiences and provides a decision-making basis for timely analysis. In the field of sentiment classification, there is an urgent need for an accurate and efficient multi-class sentiment classification method. To overcome the drawbacks of the existing methods, we propose a novel, unsupervised multi-class sentiment classification method called Gaussian mixture model of multi-class sentiment classification (GMSC). Based on the Gaussian mixture model (GMM), the GMSC consists of the following essential phases: first, combining a dictionary with microblog texts to calculate and construct the feature matrix of sentiment for each sample; second, introducing a dimension reduction method to avoid the influence of a sparse feature matrix on the results; third, modeling the multi-class sentiment classification procedure based on GMM; and lastly, computing the probability distribution of different categories of sentiment by using GMM to partition sentiments in microblogs into distinct components and classifying them via a Gaussian process regression. The results indicate that the GMSC approach is more accurate and requires less manual tagging time than semi-supervised and unsupervised sentiment classification methods run with the same parameters.
Content
DOI:10.5771/0943-7444-2019-1-15.

Similar documents (content)

  1. Chen, Z.; Huang, Y.; Tian, J.; Liu, X.; Fu, K.; Huang, T.: Joint model for subsentence-level sentiment analysis with Markov logic (2015) 0.23
    0.22502394 = sum of:
      0.22502394 = product of:
        1.1251197 = sum of:
          0.009950568 = weight(abstract_txt:model in 3210) [ClassicSimilarity], result of:
            0.009950568 = score(doc=3210,freq=1.0), product of:
              0.03997434 = queryWeight, product of:
                1.0667175 = boost
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.00940904 = queryNorm
              0.24892388 = fieldWeight in 3210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.0625 = fieldNorm(doc=3210)
          0.07618387 = weight(abstract_txt:sentiments in 3210) [ClassicSimilarity], result of:
            0.07618387 = score(doc=3210,freq=2.0), product of:
              0.097822346 = queryWeight, product of:
                1.1799479 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.00940904 = queryNorm
              0.7787982 = fieldWeight in 3210, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.0625 = fieldNorm(doc=3210)
          0.032386836 = weight(abstract_txt:feature in 3210) [ClassicSimilarity], result of:
            0.032386836 = score(doc=3210,freq=1.0), product of:
              0.08779337 = queryWeight, product of:
                1.5808462 = boost
                5.9023747 = idf(docFreq=329, maxDocs=44421)
                0.00940904 = queryNorm
              0.36889842 = fieldWeight in 3210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9023747 = idf(docFreq=329, maxDocs=44421)
                0.0625 = fieldNorm(doc=3210)
          0.056687437 = weight(abstract_txt:classification in 3210) [ClassicSimilarity], result of:
            0.056687437 = score(doc=3210,freq=2.0), product of:
              0.16065119 = queryWeight, product of:
                4.276916 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.00940904 = queryNorm
              0.35286036 = fieldWeight in 3210, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=3210)
          0.94991094 = weight(abstract_txt:sentiment in 3210) [ClassicSimilarity], result of:
            0.94991094 = score(doc=3210,freq=8.0), product of:
              0.71389 = queryWeight, product of:
                10.079974 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.00940904 = queryNorm
              1.3306124 = fieldWeight in 3210, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.0625 = fieldNorm(doc=3210)
        0.2 = coord(5/25)
    
  2. Melo, P.F.; Dalip, D.H.; Junior, M.M.; Gonçalves, M.A.; Benevenuto, F.: 10SENT : a stable sentiment analysis method based on the combination of off-the-shelf approaches (2019) 0.21
    0.2104381 = sum of:
      0.2104381 = product of:
        0.87682545 = sum of:
          0.032791372 = weight(abstract_txt:supervised in 990) [ClassicSimilarity], result of:
            0.032791372 = score(doc=990,freq=1.0), product of:
              0.070260696 = queryWeight, product of:
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.00940904 = queryNorm
              0.46671006 = fieldWeight in 990, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.0625 = fieldNorm(doc=990)
          0.019406278 = weight(abstract_txt:methods in 990) [ClassicSimilarity], result of:
            0.019406278 = score(doc=990,freq=3.0), product of:
              0.04326504 = queryWeight, product of:
                1.1097555 = boost
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.00940904 = queryNorm
              0.44854406 = fieldWeight in 990, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.0625 = fieldNorm(doc=990)
          0.021511532 = weight(abstract_txt:method in 990) [ClassicSimilarity], result of:
            0.021511532 = score(doc=990,freq=1.0), product of:
              0.07650574 = queryWeight, product of:
                1.8073882 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.00940904 = queryNorm
              0.2811754 = fieldWeight in 990, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.0625 = fieldNorm(doc=990)
          0.18133298 = weight(abstract_txt:unsupervised in 990) [ClassicSimilarity], result of:
            0.18133298 = score(doc=990,freq=3.0), product of:
              0.21971375 = queryWeight, product of:
                3.0629013 = boost
                7.62393 = idf(docFreq=58, maxDocs=44421)
                0.00940904 = queryNorm
              0.82531464 = fieldWeight in 990, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.62393 = idf(docFreq=58, maxDocs=44421)
                0.0625 = fieldNorm(doc=990)
          0.04008407 = weight(abstract_txt:classification in 990) [ClassicSimilarity], result of:
            0.04008407 = score(doc=990,freq=1.0), product of:
              0.16065119 = queryWeight, product of:
                4.276916 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.00940904 = queryNorm
              0.24950996 = fieldWeight in 990, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=990)
          0.58169925 = weight(abstract_txt:sentiment in 990) [ClassicSimilarity], result of:
            0.58169925 = score(doc=990,freq=3.0), product of:
              0.71389 = queryWeight, product of:
                10.079974 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.00940904 = queryNorm
              0.81483036 = fieldWeight in 990, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.0625 = fieldNorm(doc=990)
        0.24 = coord(6/25)
    
  3. Jansen, B.J.; Zhang, M.; Sobel, K.; Chowdury, A.: Twitter power : tweets as electronic word of mouth (2009) 0.16
    0.15606177 = sum of:
      0.15606177 = product of:
        0.78030884 = sum of:
          0.011204219 = weight(abstract_txt:methods in 144) [ClassicSimilarity], result of:
            0.011204219 = score(doc=144,freq=1.0), product of:
              0.04326504 = queryWeight, product of:
                1.1097555 = boost
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.00940904 = queryNorm
              0.25896704 = fieldWeight in 144, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.0625 = fieldNorm(doc=144)
          0.07618387 = weight(abstract_txt:sentiments in 144) [ClassicSimilarity], result of:
            0.07618387 = score(doc=144,freq=2.0), product of:
              0.097822346 = queryWeight, product of:
                1.1799479 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.00940904 = queryNorm
              0.7787982 = fieldWeight in 144, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.0625 = fieldNorm(doc=144)
          0.08719266 = weight(abstract_txt:microblog in 144) [ClassicSimilarity], result of:
            0.08719266 = score(doc=144,freq=2.0), product of:
              0.10703259 = queryWeight, product of:
                1.2342461 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.00940904 = queryNorm
              0.8146366 = fieldWeight in 144, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.0625 = fieldNorm(doc=144)
          0.13077264 = weight(abstract_txt:microblogs in 144) [ClassicSimilarity], result of:
            0.13077264 = score(doc=144,freq=4.0), product of:
              0.11130909 = queryWeight, product of:
                1.2586619 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.00940904 = queryNorm
              1.1748604 = fieldWeight in 144, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.0625 = fieldNorm(doc=144)
          0.47495547 = weight(abstract_txt:sentiment in 144) [ClassicSimilarity], result of:
            0.47495547 = score(doc=144,freq=2.0), product of:
              0.71389 = queryWeight, product of:
                10.079974 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.00940904 = queryNorm
              0.6653062 = fieldWeight in 144, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.0625 = fieldNorm(doc=144)
        0.2 = coord(5/25)
    
  4. Billal, B.; Fonseca, A.; Sadat, F.; Lounis, H.: Semi-supervised learning and social media text analysis towards multi-labeling categorization (2017) 0.16
    0.15565349 = sum of:
      0.15565349 = product of:
        0.48641717 = sum of:
          0.064158276 = weight(abstract_txt:supervised in 95) [ClassicSimilarity], result of:
            0.064158276 = score(doc=95,freq=5.0), product of:
              0.070260696 = queryWeight, product of:
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.00940904 = queryNorm
              0.913146 = fieldWeight in 95, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.0546875 = fieldNorm(doc=95)
          0.008706747 = weight(abstract_txt:model in 95) [ClassicSimilarity], result of:
            0.008706747 = score(doc=95,freq=1.0), product of:
              0.03997434 = queryWeight, product of:
                1.0667175 = boost
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.00940904 = queryNorm
              0.2178084 = fieldWeight in 95, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.0546875 = fieldNorm(doc=95)
          0.0138645135 = weight(abstract_txt:methods in 95) [ClassicSimilarity], result of:
            0.0138645135 = score(doc=95,freq=2.0), product of:
              0.04326504 = queryWeight, product of:
                1.1097555 = boost
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.00940904 = queryNorm
              0.32045534 = fieldWeight in 95, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.0546875 = fieldNorm(doc=95)
          0.009840883 = weight(abstract_txt:time in 95) [ClassicSimilarity], result of:
            0.009840883 = score(doc=95,freq=1.0), product of:
              0.04337439 = queryWeight, product of:
                1.1111571 = boost
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.00940904 = queryNorm
              0.22688234 = fieldWeight in 95, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.0546875 = fieldNorm(doc=95)
          0.026619164 = weight(abstract_txt:method in 95) [ClassicSimilarity], result of:
            0.026619164 = score(doc=95,freq=2.0), product of:
              0.07650574 = queryWeight, product of:
                1.8073882 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.00940904 = queryNorm
              0.3479368 = fieldWeight in 95, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.0546875 = fieldNorm(doc=95)
          0.1052207 = weight(abstract_txt:classification in 95) [ClassicSimilarity], result of:
            0.1052207 = score(doc=95,freq=9.0), product of:
              0.16065119 = queryWeight, product of:
                4.276916 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.00940904 = queryNorm
              0.6549637 = fieldWeight in 95, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0546875 = fieldNorm(doc=95)
          0.084754646 = weight(abstract_txt:class in 95) [ClassicSimilarity], result of:
            0.084754646 = score(doc=95,freq=1.0), product of:
              0.2628412 = queryWeight, product of:
                4.7376842 = boost
                5.8963327 = idf(docFreq=331, maxDocs=44421)
                0.00940904 = queryNorm
              0.3224557 = fieldWeight in 95, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8963327 = idf(docFreq=331, maxDocs=44421)
                0.0546875 = fieldNorm(doc=95)
          0.17325224 = weight(abstract_txt:multi in 95) [ClassicSimilarity], result of:
            0.17325224 = score(doc=95,freq=4.0), product of:
              0.2666963 = queryWeight, product of:
                4.7723017 = boost
                5.9394164 = idf(docFreq=317, maxDocs=44421)
                0.00940904 = queryNorm
              0.6496237 = fieldWeight in 95, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.9394164 = idf(docFreq=317, maxDocs=44421)
                0.0546875 = fieldNorm(doc=95)
        0.32 = coord(8/25)
    
  5. Abdi, A.; Shamsuddin, S.M.; Aliguliyev, R.M.: QMOS: Query-based multi-documents opinion-oriented summarization (2018) 0.16
    0.15508926 = sum of:
      0.15508926 = product of:
        0.9693079 = sum of:
          0.0138645135 = weight(abstract_txt:methods in 89) [ClassicSimilarity], result of:
            0.0138645135 = score(doc=89,freq=2.0), product of:
              0.04326504 = queryWeight, product of:
                1.1097555 = boost
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.00940904 = queryNorm
              0.32045534 = fieldWeight in 89, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.0546875 = fieldNorm(doc=89)
          0.037645184 = weight(abstract_txt:method in 89) [ClassicSimilarity], result of:
            0.037645184 = score(doc=89,freq=4.0), product of:
              0.07650574 = queryWeight, product of:
                1.8073882 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.00940904 = queryNorm
              0.49205697 = fieldWeight in 89, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.0546875 = fieldNorm(doc=89)
          0.08662612 = weight(abstract_txt:multi in 89) [ClassicSimilarity], result of:
            0.08662612 = score(doc=89,freq=1.0), product of:
              0.2666963 = queryWeight, product of:
                4.7723017 = boost
                5.9394164 = idf(docFreq=317, maxDocs=44421)
                0.00940904 = queryNorm
              0.32481185 = fieldWeight in 89, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9394164 = idf(docFreq=317, maxDocs=44421)
                0.0546875 = fieldNorm(doc=89)
          0.83117205 = weight(abstract_txt:sentiment in 89) [ClassicSimilarity], result of:
            0.83117205 = score(doc=89,freq=8.0), product of:
              0.71389 = queryWeight, product of:
                10.079974 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.00940904 = queryNorm
              1.1642859 = fieldWeight in 89, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.0546875 = fieldNorm(doc=89)
        0.16 = coord(4/25)
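
The score trees above follow the Lucene ClassicSimilarity (TF-IDF) scheme: each matching term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is multiplied by coord, the fraction of query terms that matched. The sketch below recomputes the score of the first entry (doc 3210) from the values printed in its tree. It is an illustration only, not the catalog's own code: the variable and function names are assumptions, and fieldNorm is copied from the listing rather than derived, because it is a stored, lossily encoded value.

```python
import math

# Values copied from the listing for entry 1 (doc=3210).
MAX_DOCS = 44421          # maxDocs
QUERY_NORM = 0.00940904   # queryNorm
FIELD_NORM = 0.0625       # fieldNorm(doc=3210), taken as given
QUERY_TERMS = 25          # denominator of coord(5/25)

# (term, boost, docFreq, freq) for each matching term in the tree.
matched_terms = [
    ("model",          1.0667175, 2249, 1.0),
    ("sentiments",     1.1799479,   17, 2.0),
    ("feature",        1.5808462,  329, 1.0),
    ("classification", 4.276916,  2228, 2.0),
    ("sentiment",     10.079974,    64, 8.0),
]


def idf(doc_freq, max_docs=MAX_DOCS):
    """ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq):
    """One 'weight(...)' node: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM           # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM  # tf * idf * fieldNorm
    return query_weight * field_weight


# coord = matched query terms / total query terms
coord = len(matched_terms) / QUERY_TERMS
score = coord * sum(term_score(b, df, f) for _, b, df, f in matched_terms)

# Should land close to the 0.22502394 reported for entry 1.
print(f"{score:.6f}")
```

The same function applies to the other entries once their own boost, docFreq, freq, fieldNorm, and coord values are substituted. Note that idf enters each term's score twice, once through queryWeight and once through fieldWeight, which is why rare terms such as "sentiment" dominate the totals.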