Document (#42918)

Author
Tay, W.
Zhang, X.
Karimi, S.
Title
Beyond mean rating : probabilistic aggregation of star ratings based on helpfulness
Source
Journal of the Association for Information Science and Technology. 71(2020) no.7, p.784-799
Year
2020
Abstract
The star-rating mechanism of customer reviews is used universally by the online population to compare and select merchants, movies, products, and services. The consensus opinion from aggregation of star ratings is used as a proxy for item quality. Online reviews are noisy and effective aggregation of star ratings to accurately reflect the "true quality" of products and services is challenging. The mean-rating aggregation model is widely used and other aggregation models are also proposed. These existing aggregation models rely on a large number of reviews to tolerate noise. However, many products rarely have reviews. We propose probabilistic aggregation models for review ratings based on the Dirichlet distribution to combat data sparsity in reviews. We further propose to exploit the "helpfulness" social information and time to filter noisy reviews and effectively aggregate ratings to compute the consensus opinion. Our experiments on an Amazon data set show that our probabilistic aggregation models based on "helpfulness" achieve better performance than the statistical and heuristic baseline approaches.
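The abstract describes Dirichlet-based aggregation only in general terms. As a rough illustration of why a Dirichlet prior helps with sparse reviews, the sketch below computes the posterior expected star rating under a symmetric Dirichlet prior, with an optional per-review weight standing in for "helpfulness". This is not the authors' model: the function name, the uniform prior of 1.0, and the idea of using weights as fractional counts are all assumptions made for the example.

```python
from typing import Optional, Sequence


def expected_rating(
    ratings: Sequence[int],
    weights: Optional[Sequence[float]] = None,
    prior: float = 1.0,
    levels: int = 5,
) -> float:
    """Posterior mean star rating under a symmetric Dirichlet prior.

    ratings: observed star values in 1..levels.
    weights: hypothetical per-review weights (e.g. a helpfulness share);
             each review counts as 1.0 when omitted.
    prior:   pseudo-count added to every star level (an assumption).
    """
    if weights is None:
        weights = [1.0] * len(ratings)
    if len(weights) != len(ratings):
        raise ValueError("ratings and weights must have the same length")

    # Posterior Dirichlet parameters: prior pseudo-count plus weighted counts.
    alpha = [prior] * levels
    for star, weight in zip(ratings, weights):
        if not 1 <= star <= levels:
            raise ValueError(f"rating {star} is outside 1..{levels}")
        alpha[star - 1] += weight

    # Expected star value under the posterior mean of the category probabilities.
    total = sum(alpha)
    return sum((k + 1) * a for k, a in enumerate(alpha)) / total


if __name__ == "__main__":
    # A single 5-star review: the plain mean is 5.0, while the smoothed
    # estimate is pulled toward the middle of the scale (20 / 6).
    print(expected_rating([5]))
    # With many reviews the prior matters less and the estimate nears the mean.
    print(expected_rating([5] * 200))
```

With one review the prior dominates, and with many reviews the estimate approaches the ordinary mean, which is the behaviour one would want for items that rarely have reviews.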
Content
https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24297
Theme
Informetrics

Similar documents (author)

  1. Zhang, M.; Zhang, Y.: Professional organizations in Twittersphere : an empirical study of U.S. library and information science professional organizations-related Tweets (2020) 4.53
    4.5277104 = sum of:
      4.5277104 = weight(author_txt:zhang in 775) [ClassicSimilarity], result of:
        4.5277104 = score(doc=775,freq=2.0), product of:
          0.99999994 = queryWeight, product of:
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.15617312 = queryNorm
          4.527711 = fieldWeight in 775, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.5 = fieldNorm(doc=775)
    
  2. Zhang, Y.; Zhang, C.: Enhancing keyphrase extraction from microblogs using human reading time (2021) 4.53
    4.5277104 = sum of:
      4.5277104 = weight(author_txt:zhang in 1238) [ClassicSimilarity], result of:
        4.5277104 = score(doc=1238,freq=2.0), product of:
          0.99999994 = queryWeight, product of:
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.15617312 = queryNorm
          4.527711 = fieldWeight in 1238, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.5 = fieldNorm(doc=1238)
    
  3. Zhang, J.: TOFIR: A tool of facilitating information retrieval : introduce a visual retrieval model (2001) 4.00
    4.0019684 = sum of:
      4.0019684 = weight(author_txt:zhang in 7710) [ClassicSimilarity], result of:
        4.0019684 = score(doc=7710,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.15617312 = queryNorm
          4.001969 = fieldWeight in 7710, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.625 = fieldNorm(doc=7710)
    
  4. Zhang, A.: Multimedia file formats on the Internet : a beginner's guide for PC users (1995) 4.00
    4.0019684 = sum of:
      4.0019684 = weight(author_txt:zhang in 3280) [ClassicSimilarity], result of:
        4.0019684 = score(doc=3280,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.15617312 = queryNorm
          4.001969 = fieldWeight in 3280, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.625 = fieldNorm(doc=3280)
    
  5. Zhang, J.: A representational analysis of relational information displays (1996) 4.00
    4.0019684 = sum of:
      4.0019684 = weight(author_txt:zhang in 6471) [ClassicSimilarity], result of:
        4.0019684 = score(doc=6471,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.15617312 = queryNorm
          4.001969 = fieldWeight in 6471, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.625 = fieldNorm(doc=6471)
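
The author-match scores above can be reproduced by hand. The sketch below is inferred from the numbers displayed in the explanations (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1)), and the score as queryWeight times fieldWeight); the function names are hypothetical and this is not Lucene's own code.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as it appears in the explanations above."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(
    freq: float,
    doc_freq: int,
    max_docs: int,
    field_norm: float,
    query_norm: float,
) -> float:
    """Score of a single term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                   # e.g. 0.99999994
    field_weight = math.sqrt(freq) * idf * field_norm  # e.g. 4.527711
    return query_weight * field_weight


if __name__ == "__main__":
    # Entry 1: "zhang" occurs twice in the author field, fieldNorm 0.5.
    print(term_score(2.0, 199, 44421, 0.5, 0.15617312))
    # Entry 3: "zhang" occurs once, fieldNorm 0.625.
    print(term_score(1.0, 199, 44421, 0.625, 0.15617312))
```

Entries 1 and 2 score higher than entries 3 to 5 because the query term matches two author names rather than one, which outweighs their smaller fieldNorm.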
    

Similar documents (content)

  1. Xiao, D.; Ji, Y.; Li, Y.; Zhuang, F.; Shi, C.: Coupled matrix factorization and topic modeling for aspect mining (2018) 0.17
    0.17499064 = sum of:
      0.17499064 = product of:
        0.62496656 = sum of:
          0.016543314 = weight(abstract_txt:quality in 42) [ClassicSimilarity], result of:
            0.016543314 = score(doc=42,freq=1.0), product of:
              0.056991488 = queryWeight, product of:
                1.1592834 = boost
                4.6444306 = idf(docFreq=1160, maxDocs=44421)
                0.010584927 = queryNorm
              0.2902769 = fieldWeight in 42, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6444306 = idf(docFreq=1160, maxDocs=44421)
                0.0625 = fieldNorm(doc=42)
          0.00937234 = weight(abstract_txt:used in 42) [ClassicSimilarity], result of:
            0.00937234 = score(doc=42,freq=1.0), product of:
              0.044667408 = queryWeight, product of:
                1.2569721 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.010584927 = queryNorm
              0.20982501 = fieldWeight in 42, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.0625 = fieldNorm(doc=42)
          0.022448324 = weight(abstract_txt:propose in 42) [ClassicSimilarity], result of:
            0.022448324 = score(doc=42,freq=1.0), product of:
              0.06985287 = queryWeight, product of:
                1.2834436 = boost
                5.1418524 = idf(docFreq=705, maxDocs=44421)
                0.010584927 = queryNorm
              0.32136577 = fieldWeight in 42, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1418524 = idf(docFreq=705, maxDocs=44421)
                0.0625 = fieldNorm(doc=42)
          0.053615853 = weight(abstract_txt:opinion in 42) [ClassicSimilarity], result of:
            0.053615853 = score(doc=42,freq=1.0), product of:
              0.12481223 = queryWeight, product of:
                1.7155888 = boost
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.010584927 = queryNorm
              0.4295721 = fieldWeight in 42, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.0625 = fieldNorm(doc=42)
          0.2985002 = weight(abstract_txt:rating in 42) [ClassicSimilarity], result of:
            0.2985002 = score(doc=42,freq=7.0), product of:
              0.23461503 = queryWeight, product of:
                2.8807673 = boost
                7.694134 = idf(docFreq=54, maxDocs=44421)
                0.010584927 = queryNorm
              1.2722979 = fieldWeight in 42, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.694134 = idf(docFreq=54, maxDocs=44421)
                0.0625 = fieldNorm(doc=42)
          0.060055196 = weight(abstract_txt:reviews in 42) [ClassicSimilarity], result of:
            0.060055196 = score(doc=42,freq=1.0), product of:
              0.19414929 = queryWeight, product of:
                3.7060661 = boost
                4.9491973 = idf(docFreq=855, maxDocs=44421)
                0.010584927 = queryNorm
              0.30932483 = fieldWeight in 42, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9491973 = idf(docFreq=855, maxDocs=44421)
                0.0625 = fieldNorm(doc=42)
          0.16443135 = weight(abstract_txt:ratings in 42) [ClassicSimilarity], result of:
            0.16443135 = score(doc=42,freq=1.0), product of:
              0.35757303 = queryWeight, product of:
                4.5913143 = boost
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.010584927 = queryNorm
              0.4598539 = fieldWeight in 42, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.0625 = fieldNorm(doc=42)
        0.28 = coord(7/25)
    
  2. Zhu, J.; Han, L.; Gou, Z.; Yuan, X.: A fuzzy clustering-based denoising model for evaluating uncertainty in collaborative filtering recommender systems (2018) 0.15
    0.15379493 = sum of:
      0.15379493 = product of:
        0.5492676 = sum of:
          0.048401255 = weight(abstract_txt:movies in 460) [ClassicSimilarity], result of:
            0.048401255 = score(doc=460,freq=1.0), product of:
              0.09253146 = queryWeight, product of:
                1.0445142 = boost
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.010584927 = queryNorm
              0.5230789 = fieldWeight in 460, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.0625 = fieldNorm(doc=460)
          0.016543314 = weight(abstract_txt:quality in 460) [ClassicSimilarity], result of:
            0.016543314 = score(doc=460,freq=1.0), product of:
              0.056991488 = queryWeight, product of:
                1.1592834 = boost
                4.6444306 = idf(docFreq=1160, maxDocs=44421)
                0.010584927 = queryNorm
              0.2902769 = fieldWeight in 460, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6444306 = idf(docFreq=1160, maxDocs=44421)
                0.0625 = fieldNorm(doc=460)
          0.011297232 = weight(abstract_txt:based in 460) [ClassicSimilarity], result of:
            0.011297232 = score(doc=460,freq=2.0), product of:
              0.04015412 = queryWeight, product of:
                1.191778 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.010584927 = queryNorm
              0.28134674 = fieldWeight in 460, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0625 = fieldNorm(doc=460)
          0.00937234 = weight(abstract_txt:used in 460) [ClassicSimilarity], result of:
            0.00937234 = score(doc=460,freq=1.0), product of:
              0.044667408 = queryWeight, product of:
                1.2569721 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.010584927 = queryNorm
              0.20982501 = fieldWeight in 460, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.0625 = fieldNorm(doc=460)
          0.09121252 = weight(abstract_txt:noisy in 460) [ClassicSimilarity], result of:
            0.09121252 = score(doc=460,freq=1.0), product of:
              0.17786805 = queryWeight, product of:
                2.0480173 = boost
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.010584927 = queryNorm
              0.51281 = fieldWeight in 460, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.0625 = fieldNorm(doc=460)
          0.043578237 = weight(abstract_txt:products in 460) [ClassicSimilarity], result of:
            0.043578237 = score(doc=460,freq=1.0), product of:
              0.124433845 = queryWeight, product of:
                2.0979712 = boost
                5.6033936 = idf(docFreq=444, maxDocs=44421)
                0.010584927 = queryNorm
              0.3502121 = fieldWeight in 460, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6033936 = idf(docFreq=444, maxDocs=44421)
                0.0625 = fieldNorm(doc=460)
          0.3288627 = weight(abstract_txt:ratings in 460) [ClassicSimilarity], result of:
            0.3288627 = score(doc=460,freq=4.0), product of:
              0.35757303 = queryWeight, product of:
                4.5913143 = boost
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.010584927 = queryNorm
              0.9197078 = fieldWeight in 460, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.0625 = fieldNorm(doc=460)
        0.28 = coord(7/25)
    
  3. Chua, A.Y.K.; Banerjee, S.: Understanding review helpfulness as a function of reviewer reputation, review rating, and review depth (2015) 0.14
    0.14104652 = sum of:
      0.14104652 = product of:
        0.88154083 = sum of:
          0.063709594 = weight(abstract_txt:amazon in 2641) [ClassicSimilarity], result of:
            0.063709594 = score(doc=2641,freq=1.0), product of:
              0.08481266 = queryWeight, product of:
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.010584927 = queryNorm
              0.7511802 = fieldWeight in 2641, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.09375 = fieldNorm(doc=2641)
          0.23933259 = weight(abstract_txt:rating in 2641) [ClassicSimilarity], result of:
            0.23933259 = score(doc=2641,freq=2.0), product of:
              0.23461503 = queryWeight, product of:
                2.8807673 = boost
                7.694134 = idf(docFreq=54, maxDocs=44421)
                0.010584927 = queryNorm
              1.0201076 = fieldWeight in 2641, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.694134 = idf(docFreq=54, maxDocs=44421)
                0.09375 = fieldNorm(doc=2641)
          0.4511024 = weight(abstract_txt:helpfulness in 2641) [ClassicSimilarity], result of:
            0.4511024 = score(doc=2641,freq=2.0), product of:
              0.35799038 = queryWeight, product of:
                3.5584917 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.010584927 = queryNorm
              1.2600964 = fieldWeight in 2641, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.09375 = fieldNorm(doc=2641)
          0.1273963 = weight(abstract_txt:reviews in 2641) [ClassicSimilarity], result of:
            0.1273963 = score(doc=2641,freq=2.0), product of:
              0.19414929 = queryWeight, product of:
                3.7060661 = boost
                4.9491973 = idf(docFreq=855, maxDocs=44421)
                0.010584927 = queryNorm
              0.65617704 = fieldWeight in 2641, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.9491973 = idf(docFreq=855, maxDocs=44421)
                0.09375 = fieldNorm(doc=2641)
        0.16 = coord(4/25)
    
  4. Malik, M.S.I.; Hussain, A.: An analysis of review content and reviewer variables that contribute to review helpfulness (2018) 0.13
    0.12752613 = sum of:
      0.12752613 = product of:
        0.63763064 = sum of:
          0.06006598 = weight(abstract_txt:amazon in 91) [ClassicSimilarity], result of:
            0.06006598 = score(doc=91,freq=2.0), product of:
              0.08481266 = queryWeight, product of:
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.010584927 = queryNorm
              0.70821947 = fieldWeight in 91, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.0625 = fieldNorm(doc=91)
          0.00937234 = weight(abstract_txt:used in 91) [ClassicSimilarity], result of:
            0.00937234 = score(doc=91,freq=1.0), product of:
              0.044667408 = queryWeight, product of:
                1.2569721 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.010584927 = queryNorm
              0.20982501 = fieldWeight in 91, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.0625 = fieldNorm(doc=91)
          0.032633394 = weight(abstract_txt:models in 91) [ClassicSimilarity], result of:
            0.032633394 = score(doc=91,freq=1.0), product of:
              0.11293967 = queryWeight, product of:
                2.3079314 = boost
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.010584927 = queryNorm
              0.28894538 = fieldWeight in 91, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.0625 = fieldNorm(doc=91)
          0.47550374 = weight(abstract_txt:helpfulness in 91) [ClassicSimilarity], result of:
            0.47550374 = score(doc=91,freq=5.0), product of:
              0.35799038 = queryWeight, product of:
                3.5584917 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.010584927 = queryNorm
              1.3282584 = fieldWeight in 91, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.0625 = fieldNorm(doc=91)
          0.060055196 = weight(abstract_txt:reviews in 91) [ClassicSimilarity], result of:
            0.060055196 = score(doc=91,freq=1.0), product of:
              0.19414929 = queryWeight, product of:
                3.7060661 = boost
                4.9491973 = idf(docFreq=855, maxDocs=44421)
                0.010584927 = queryNorm
              0.30932483 = fieldWeight in 91, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9491973 = idf(docFreq=855, maxDocs=44421)
                0.0625 = fieldNorm(doc=91)
        0.2 = coord(5/25)
    
  5. Li, H.; Bhowmick, S.S.; Sun, A.: AffRank: affinity-driven ranking of products in online social rating networks (2011) 0.12
    0.118555695 = sum of:
      0.118555695 = product of:
        0.49398208 = sum of:
          0.016543314 = weight(abstract_txt:quality in 483) [ClassicSimilarity], result of:
            0.016543314 = score(doc=483,freq=1.0), product of:
              0.056991488 = queryWeight, product of:
                1.1592834 = boost
                4.6444306 = idf(docFreq=1160, maxDocs=44421)
                0.010584927 = queryNorm
              0.2902769 = fieldWeight in 483, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6444306 = idf(docFreq=1160, maxDocs=44421)
                0.0625 = fieldNorm(doc=483)
          0.00798835 = weight(abstract_txt:based in 483) [ClassicSimilarity], result of:
            0.00798835 = score(doc=483,freq=1.0), product of:
              0.04015412 = queryWeight, product of:
                1.191778 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.010584927 = queryNorm
              0.1989422 = fieldWeight in 483, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0625 = fieldNorm(doc=483)
          0.022448324 = weight(abstract_txt:propose in 483) [ClassicSimilarity], result of:
            0.022448324 = score(doc=483,freq=1.0), product of:
              0.06985287 = queryWeight, product of:
                1.2834436 = boost
                5.1418524 = idf(docFreq=705, maxDocs=44421)
                0.010584927 = queryNorm
              0.32136577 = fieldWeight in 483, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1418524 = idf(docFreq=705, maxDocs=44421)
                0.0625 = fieldNorm(doc=483)
          0.087156475 = weight(abstract_txt:products in 483) [ClassicSimilarity], result of:
            0.087156475 = score(doc=483,freq=4.0), product of:
              0.124433845 = queryWeight, product of:
                2.0979712 = boost
                5.6033936 = idf(docFreq=444, maxDocs=44421)
                0.010584927 = queryNorm
              0.7004242 = fieldWeight in 483, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.6033936 = idf(docFreq=444, maxDocs=44421)
                0.0625 = fieldNorm(doc=483)
          0.19541425 = weight(abstract_txt:rating in 483) [ClassicSimilarity], result of:
            0.19541425 = score(doc=483,freq=3.0), product of:
              0.23461503 = queryWeight, product of:
                2.8807673 = boost
                7.694134 = idf(docFreq=54, maxDocs=44421)
                0.010584927 = queryNorm
              0.8329145 = fieldWeight in 483, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.694134 = idf(docFreq=54, maxDocs=44421)
                0.0625 = fieldNorm(doc=483)
          0.16443135 = weight(abstract_txt:ratings in 483) [ClassicSimilarity], result of:
            0.16443135 = score(doc=483,freq=1.0), product of:
              0.35757303 = queryWeight, product of:
                4.5913143 = boost
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.010584927 = queryNorm
              0.4598539 = fieldWeight in 483, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.0625 = fieldNorm(doc=483)
        0.24 = coord(6/25)
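
The content-match scores add two ingredients to the single-term case: a per-term boost inside queryWeight, and a coord factor that scales the sum by the share of query terms matched. The sketch below recomputes the score of entry 5 from the values shown in its explanation. The formulas are inferred from those numbers, and the function and variable names are hypothetical.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.010584927

# (term, boost, docFreq, termFreq) for entry 5, copied from the explanation above.
ENTRY_5_TERMS = [
    ("quality", 1.1592834, 1160, 1.0),
    ("based", 1.191778, 5005, 1.0),
    ("propose", 1.2834436, 705, 1.0),
    ("products", 2.0979712, 444, 4.0),
    ("rating", 2.8807673, 54, 3.0),
    ("ratings", 4.5913143, 76, 1.0),
]


def boosted_term_score(boost, doc_freq, freq, field_norm):
    """queryWeight (boost * idf * queryNorm) times fieldWeight (tf * idf * fieldNorm)."""
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


def coordinated_score(terms, total_clauses, field_norm):
    """Sum the matched term scores, then scale by coord = matched / total clauses."""
    raw_sum = sum(
        boosted_term_score(boost, doc_freq, freq, field_norm)
        for _term, boost, doc_freq, freq in terms
    )
    coord = len(terms) / total_clauses
    return raw_sum * coord


if __name__ == "__main__":
    # Entry 5 matched 6 of the 25 query terms and has fieldNorm 0.0625.
    print(coordinated_score(ENTRY_5_TERMS, 25, 0.0625))
```

The coord factor explains why entry 3 ranks below entries 1 and 2 despite having the largest raw sum (0.88154083): it matches only 4 of the 25 query terms, so its sum is multiplied by 0.16 rather than 0.28.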