Document (#42043)

Author
Xiao, D.
Ji, Y.
Li, Y.
Zhuang, F.
Shi, C.
Title
Coupled matrix factorization and topic modeling for aspect mining
Source
Information processing and management. 54(2018) no.6, pp. 861-873
Year
2018
Abstract
Aspect mining, which aims to extract ad hoc aspects from online reviews and predict a rating or opinion on each aspect, can satisfy personalized needs for evaluating specific aspects of product quality. With the growth of related research, how to effectively integrate rating and review information has become the key issue in addressing this problem. Because matrix factorization is an effective tool for rating prediction and topic modeling is widely used for review processing, it is natural to combine the two for aspect mining (also called aspect rating prediction). However, this idea faces several challenges: choosing suitable sharing factors, handling scale mismatch, and modeling the dependency relation between rating and review information. In this paper, we propose a novel model that integrates Matrix factorization and Topic modeling for Aspect rating prediction (MaToAsp). To overcome these challenges, MaToAsp employs items as the sharing factors to combine matrix factorization and topic modeling, and introduces an interpretive preference probability to eliminate scale mismatch. In the hybrid model, we establish a dependency relation from ratings to sentiment terms in phrases. Experiments on two real datasets, Chinese Dianping and English Tripadvisor, show that MaToAsp not only obtains reasonable aspect identification but also achieves the best aspect rating prediction performance compared with recent representative baselines.
Content
Cf.: https://doi.org/10.1016/j.ipm.2018.05.002.

Similar documents (author)

  1. Xiao, Y.: Modern development of classification : research and practice in the People's Republic of China (1992) 5.54
    5.5426593 = sum of:
      5.5426593 = weight(author_txt:xiao in 1908) [ClassicSimilarity], result of:
        5.5426593 = fieldWeight in 1908, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.868255 = idf(docFreq=16, maxDocs=44421)
          0.625 = fieldNorm(doc=1908)
    
  2. Xiao, Y.: Faceted classification : a consideration of its features as a paradigm of knowledge organization (1994) 5.54
    5.5426593 = sum of:
      5.5426593 = weight(author_txt:xiao in 7546) [ClassicSimilarity], result of:
        5.5426593 = fieldWeight in 7546, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.868255 = idf(docFreq=16, maxDocs=44421)
          0.625 = fieldNorm(doc=7546)
    
  3. Xiao, G.: A knowledge classification model based on the relationship between science and human needs (2013) 5.54
    5.5426593 = sum of:
      5.5426593 = weight(author_txt:xiao in 1138) [ClassicSimilarity], result of:
        5.5426593 = fieldWeight in 1138, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.868255 = idf(docFreq=16, maxDocs=44421)
          0.625 = fieldNorm(doc=1138)
    
  4. Xiao, L.: Effects of rationale awareness in online ideation crowdsourcing tasks (2014) 5.54
    5.5426593 = sum of:
      5.5426593 = weight(author_txt:xiao in 2329) [ClassicSimilarity], result of:
        5.5426593 = fieldWeight in 2329, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.868255 = idf(docFreq=16, maxDocs=44421)
          0.625 = fieldNorm(doc=2329)
    
  5. Xiao, L.; Askin, A.: What influences online deliberation? : A wikipedia study (2014) 4.43
    4.4341273 = sum of:
      4.4341273 = weight(author_txt:xiao in 2254) [ClassicSimilarity], result of:
        4.4341273 = fieldWeight in 2254, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.868255 = idf(docFreq=16, maxDocs=44421)
          0.5 = fieldNorm(doc=2254)
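
The author scores above all follow the same pattern: fieldWeight = tf * idf * fieldNorm. The Python sketch below recomputes two of them as a sanity check. The tf and idf formulas are assumptions, chosen because they reproduce the listed values (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))); the function names are hypothetical and not part of the record.

```python
import math


def classic_tf(freq: float) -> float:
    # Assumed tf: square root of the raw term frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf: 1 + ln(maxDocs / (docFreq + 1)); matches the listed 8.868255.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight as shown in the explanation tree: tf * idf * fieldNorm.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the entries above (docFreq=16, maxDocs=44421).
idf_xiao = classic_idf(16, 44421)
score_single_author = field_weight(1.0, 16, 44421, 0.625)  # entries 1-4
score_two_authors = field_weight(1.0, 16, 44421, 0.5)      # entry 5

print(round(idf_xiao, 4), round(score_single_author, 4), round(score_two_authors, 4))
```

Only fieldNorm differs between entries 1-4 (0.625) and entry 5 (0.5), so the lower score of entry 5 comes entirely from the norm. A plausible reading is that its author field holds two names and is therefore longer.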
    

Similar documents (content)

  1. Su, L.T.; Chen, H.L.: Evaluation of Web search engines by undergraduate students (1999) 0.12
    0.12444311 = sum of:
      0.12444311 = product of:
        0.51851296 = sum of:
          0.017662458 = weight(abstract_txt:performance in 546) [ClassicSimilarity], result of:
            0.017662458 = score(doc=546,freq=2.0), product of:
              0.04943434 = queryWeight, product of:
                1.1531253 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.009279679 = queryNorm
              0.35729125 = fieldWeight in 546, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.0546875 = fieldNorm(doc=546)
          0.015399796 = weight(abstract_txt:factors in 546) [ClassicSimilarity], result of:
            0.015399796 = score(doc=546,freq=1.0), product of:
              0.056843564 = queryWeight, product of:
                1.2365246 = boost
                4.9538813 = idf(docFreq=851, maxDocs=44421)
                0.009279679 = queryNorm
              0.2709154 = fieldWeight in 546, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9538813 = idf(docFreq=851, maxDocs=44421)
                0.0546875 = fieldNorm(doc=546)
          0.019792644 = weight(abstract_txt:relation in 546) [ClassicSimilarity], result of:
            0.019792644 = score(doc=546,freq=1.0), product of:
              0.06719555 = queryWeight, product of:
                1.344412 = boost
                5.38611 = idf(docFreq=552, maxDocs=44421)
                0.009279679 = queryNorm
              0.2945529 = fieldWeight in 546, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.38611 = idf(docFreq=552, maxDocs=44421)
                0.0546875 = fieldNorm(doc=546)
          0.040875863 = weight(abstract_txt:topic in 546) [ClassicSimilarity], result of:
            0.040875863 = score(doc=546,freq=1.0), product of:
              0.14789811 = queryWeight, product of:
                3.1536496 = boost
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.009279679 = queryNorm
              0.27637854 = fieldWeight in 546, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.0546875 = fieldNorm(doc=546)
          0.28558832 = weight(abstract_txt:rating in 546) [ClassicSimilarity], result of:
            0.28558832 = score(doc=546,freq=2.0), product of:
              0.47992972 = queryWeight, product of:
                6.7217903 = boost
                7.694134 = idf(docFreq=54, maxDocs=44421)
                0.009279679 = queryNorm
              0.5950628 = fieldWeight in 546, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.694134 = idf(docFreq=54, maxDocs=44421)
                0.0546875 = fieldNorm(doc=546)
          0.13919389 = weight(abstract_txt:aspect in 546) [ClassicSimilarity], result of:
            0.13919389 = score(doc=546,freq=1.0), product of:
              0.40721357 = queryWeight, product of:
                7.020685 = boost
                6.250429 = idf(docFreq=232, maxDocs=44421)
                0.009279679 = queryNorm
              0.34182036 = fieldWeight in 546, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.250429 = idf(docFreq=232, maxDocs=44421)
                0.0546875 = fieldNorm(doc=546)
        0.24 = coord(6/25)
    
  2. Su, Z.; Li, D.; Li, H.; Luo, X.: Boosting attribute recognition with latent topics by matrix factorization (2017) 0.12
    0.11810102 = sum of:
      0.11810102 = product of:
        0.7381314 = sum of:
          0.029601146 = weight(abstract_txt:scale in 4693) [ClassicSimilarity], result of:
            0.029601146 = score(doc=4693,freq=1.0), product of:
              0.069280185 = queryWeight, product of:
                1.3651068 = boost
                5.4690194 = idf(docFreq=508, maxDocs=44421)
                0.009279679 = queryNorm
              0.42726713 = fieldWeight in 4693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4690194 = idf(docFreq=508, maxDocs=44421)
                0.078125 = fieldNorm(doc=4693)
          0.14537333 = weight(abstract_txt:matrix in 4693) [ClassicSimilarity], result of:
            0.14537333 = score(doc=4693,freq=1.0), product of:
              0.27166885 = queryWeight, product of:
                4.2741723 = boost
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.009279679 = queryNorm
              0.53511226 = fieldWeight in 4693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.078125 = fieldNorm(doc=4693)
          0.3643086 = weight(abstract_txt:factorization in 4693) [ClassicSimilarity], result of:
            0.3643086 = score(doc=4693,freq=1.0), product of:
              0.5012214 = queryWeight, product of:
                5.805598 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.009279679 = queryNorm
              0.7268416 = fieldWeight in 4693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.078125 = fieldNorm(doc=4693)
          0.1988484 = weight(abstract_txt:aspect in 4693) [ClassicSimilarity], result of:
            0.1988484 = score(doc=4693,freq=1.0), product of:
              0.40721357 = queryWeight, product of:
                7.020685 = boost
                6.250429 = idf(docFreq=232, maxDocs=44421)
                0.009279679 = queryNorm
              0.48831478 = fieldWeight in 4693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.250429 = idf(docFreq=232, maxDocs=44421)
                0.078125 = fieldNorm(doc=4693)
        0.16 = coord(4/25)
    
  3. Greenstein-Messica, A.; Rokach, L.; Shabtai, A.: Personal-discount sensitivity prediction for mobile coupon conversion optimization (2017) 0.10
    0.10209015 = sum of:
      0.10209015 = product of:
        0.63806343 = sum of:
          0.15223409 = weight(abstract_txt:prediction in 4751) [ClassicSimilarity], result of:
            0.15223409 = score(doc=4751,freq=2.0), product of:
              0.23952526 = queryWeight, product of:
                3.5896554 = boost
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.009279679 = queryNorm
              0.63556594 = fieldWeight in 4751, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.0625 = fieldNorm(doc=4751)
          0.07808381 = weight(abstract_txt:modeling in 4751) [ClassicSimilarity], result of:
            0.07808381 = score(doc=4751,freq=1.0), product of:
              0.20830388 = queryWeight, product of:
                3.7426636 = boost
                5.997685 = idf(docFreq=299, maxDocs=44421)
                0.009279679 = queryNorm
              0.3748553 = fieldWeight in 4751, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.997685 = idf(docFreq=299, maxDocs=44421)
                0.0625 = fieldNorm(doc=4751)
          0.11629867 = weight(abstract_txt:matrix in 4751) [ClassicSimilarity], result of:
            0.11629867 = score(doc=4751,freq=1.0), product of:
              0.27166885 = queryWeight, product of:
                4.2741723 = boost
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.009279679 = queryNorm
              0.42808983 = fieldWeight in 4751, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.0625 = fieldNorm(doc=4751)
          0.29144686 = weight(abstract_txt:factorization in 4751) [ClassicSimilarity], result of:
            0.29144686 = score(doc=4751,freq=1.0), product of:
              0.5012214 = queryWeight, product of:
                5.805598 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.009279679 = queryNorm
              0.5814733 = fieldWeight in 4751, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.0625 = fieldNorm(doc=4751)
        0.16 = coord(4/25)
    
  4. Greenstein-Messica, A.; Rokach, L.; Shabtai, A.: Personal-discount sensitivity prediction for mobile coupon conversion optimization (2017) 0.10
    0.10209015 = sum of:
      0.10209015 = product of:
        0.63806343 = sum of:
          0.15223409 = weight(abstract_txt:prediction in 4761) [ClassicSimilarity], result of:
            0.15223409 = score(doc=4761,freq=2.0), product of:
              0.23952526 = queryWeight, product of:
                3.5896554 = boost
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.009279679 = queryNorm
              0.63556594 = fieldWeight in 4761, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.0625 = fieldNorm(doc=4761)
          0.07808381 = weight(abstract_txt:modeling in 4761) [ClassicSimilarity], result of:
            0.07808381 = score(doc=4761,freq=1.0), product of:
              0.20830388 = queryWeight, product of:
                3.7426636 = boost
                5.997685 = idf(docFreq=299, maxDocs=44421)
                0.009279679 = queryNorm
              0.3748553 = fieldWeight in 4761, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.997685 = idf(docFreq=299, maxDocs=44421)
                0.0625 = fieldNorm(doc=4761)
          0.11629867 = weight(abstract_txt:matrix in 4761) [ClassicSimilarity], result of:
            0.11629867 = score(doc=4761,freq=1.0), product of:
              0.27166885 = queryWeight, product of:
                4.2741723 = boost
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.009279679 = queryNorm
              0.42808983 = fieldWeight in 4761, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.0625 = fieldNorm(doc=4761)
          0.29144686 = weight(abstract_txt:factorization in 4761) [ClassicSimilarity], result of:
            0.29144686 = score(doc=4761,freq=1.0), product of:
              0.5012214 = queryWeight, product of:
                5.805598 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.009279679 = queryNorm
              0.5814733 = fieldWeight in 4761, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.0625 = fieldNorm(doc=4761)
        0.16 = coord(4/25)
    
  5. Ferreira, R.S.; Graça Pimentel, M. de; Cristo, M.: A wikification prediction model based on the combination of latent, dyadic, and monadic features (2018) 0.08
    0.08474636 = sum of:
      0.08474636 = product of:
        0.52966475 = sum of:
          0.014273422 = weight(abstract_txt:performance in 119) [ClassicSimilarity], result of:
            0.014273422 = score(doc=119,freq=1.0), product of:
              0.04943434 = queryWeight, product of:
                1.1531253 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.009279679 = queryNorm
              0.28873494 = fieldWeight in 119, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.0625 = fieldNorm(doc=119)
          0.107645765 = weight(abstract_txt:prediction in 119) [ClassicSimilarity], result of:
            0.107645765 = score(doc=119,freq=1.0), product of:
              0.23952526 = queryWeight, product of:
                3.5896554 = boost
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.009279679 = queryNorm
              0.449413 = fieldWeight in 119, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.0625 = fieldNorm(doc=119)
          0.11629867 = weight(abstract_txt:matrix in 119) [ClassicSimilarity], result of:
            0.11629867 = score(doc=119,freq=1.0), product of:
              0.27166885 = queryWeight, product of:
                4.2741723 = boost
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.009279679 = queryNorm
              0.42808983 = fieldWeight in 119, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.0625 = fieldNorm(doc=119)
          0.29144686 = weight(abstract_txt:factorization in 119) [ClassicSimilarity], result of:
            0.29144686 = score(doc=119,freq=1.0), product of:
              0.5012214 = queryWeight, product of:
                5.805598 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.009279679 = queryNorm
              0.5814733 = fieldWeight in 119, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.0625 = fieldNorm(doc=119)
        0.16 = coord(4/25)
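
The content scores above combine per-term scores (queryWeight * fieldWeight) into a sum, which is then scaled by the coordination factor coord(matched/total). The Python sketch below retraces entry 1 (doc=546) under the same assumed tf and idf formulas that reproduce the listed numbers. The helper names are hypothetical, and the five remaining term scores are copied from the listing rather than recomputed.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, query_norm: float, freq: float,
               doc_freq: int, max_docs: int, field_norm: float) -> float:
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm            # query side
    field_weight = math.sqrt(freq) * idf * field_norm  # document side
    return query_weight * field_weight


# Term "rating" in doc 546, using the boost, queryNorm and fieldNorm listed above.
rating_score = term_score(boost=6.7217903, query_norm=0.009279679, freq=2.0,
                          doc_freq=54, max_docs=44421, field_norm=0.0546875)

# Remaining term scores for doc 546, copied from the listing:
# performance, factors, relation, topic, aspect.
other_scores = [0.017662458, 0.015399796, 0.019792644, 0.040875863, 0.13919389]

# 6 of the 25 query terms matched, hence coord(6/25) = 0.24.
coord = 6 / 25
total_score = (rating_score + sum(other_scores)) * coord

print(round(rating_score, 4), round(total_score, 4))
```

Because idf enters both queryWeight and fieldWeight, rare terms such as "rating" (docFreq=54) and "factorization" (docFreq=10) dominate the sums, while common terms such as "performance" contribute little.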