Document (#41461)

Author
Zhu, J.
Han, L.
Gou, Z.
Yuan, X.
Title
A fuzzy clustering-based denoising model for evaluating uncertainty in collaborative filtering recommender systems
Source
Journal of the Association for Information Science and Technology. 69(2018) no.9, pp. 1109-1121
Year
2018
Abstract
Recommender systems are effective in predicting the most suitable products for users, such as movies and books. To facilitate personalized recommendations, the quality of item ratings should be guaranteed. However, a few ratings might not be accurate enough due to the uncertainty of user behavior and are referred to as natural noise. In this article, we present a novel fuzzy clustering-based method for detecting noisy ratings. The entropy of a subset of the original ratings dataset is used to indicate the data-driven uncertainty, and evaluation metrics are adopted to represent the prediction-driven uncertainty. After the repetition of resampling and the execution of a recommendation algorithm, the entropy and evaluation metrics vectors are obtained and are empirically categorized to identify the proportion of the potential noise. Then, the fuzzy C-means-based denoising (FCMD) algorithm is performed to verify the natural noise under the assumption that natural noise is primarily the result of the exceptional behavior of users. Finally, a case study is performed using two real-world datasets. The experimental results show that our proposal outperforms previous proposals and has an advantage in dealing with natural noise.
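The resampling-and-entropy step mentioned in the abstract can be sketched as follows. This is only an illustration, not the authors' implementation: the abstract does not define the entropy, so the sketch assumes Shannon entropy over the empirical distribution of rating values, and the function names `rating_entropy` and `resampled_entropies` are hypothetical.

```python
import math
import random
from collections import Counter


def rating_entropy(ratings):
    """Shannon entropy (in bits) of the empirical rating-value distribution.

    Assumption: the abstract does not give the exact entropy definition,
    so plain Shannon entropy over rating values is used here.
    """
    counts = Counter(ratings)
    n = len(ratings)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def resampled_entropies(ratings, sample_size, repeats, seed=0):
    """Draw `repeats` random subsets (without replacement) and return
    the entropy of each subset as a list (hypothetical helper)."""
    rng = random.Random(seed)
    return [
        rating_entropy(rng.sample(ratings, sample_size))
        for _ in range(repeats)
    ]


if __name__ == "__main__":
    # Toy ratings on a 1-5 scale; not taken from the paper's datasets.
    toy_ratings = [1, 2, 3, 4, 5] * 20
    print(resampled_entropies(toy_ratings, sample_size=30, repeats=5))
```

A subset whose ratings are all identical has entropy 0, while a subset spread evenly over k distinct values has entropy log2(k), so higher values indicate more spread in the sampled ratings.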
Content
Cf.: https://onlinelibrary.wiley.com/doi/10.1002/asi.24036.
Theme
Retrieval algorithms

Similar documents (author)

  1. Yuan, W.: End-user searching behavior in information retrieval : a longitudinal study (1997) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:yuan in 394) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 394, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=394)
    
  2. Yuan, W.; Meadow, C.T.: A study of the use of variables in information retrieval user studies (1999) 4.15
    4.150135 = sum of:
      4.150135 = weight(author_txt:yuan in 3943) [ClassicSimilarity], result of:
        4.150135 = fieldWeight in 3943, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.5 = fieldNorm(doc=3943)
    
  3. Jin, Z.; Yuan, C.: On the ambiguity of information retrieval for visualization (1998) 4.15
    4.150135 = sum of:
      4.150135 = weight(author_txt:yuan in 4216) [ClassicSimilarity], result of:
        4.150135 = fieldWeight in 4216, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.5 = fieldNorm(doc=4216)
    
  4. Yuan, X.; Belkin, N.J.: Investigating information retrieval support techniques for different information-seeking strategies (2010) 4.15
    4.150135 = sum of:
      4.150135 = weight(author_txt:yuan in 686) [ClassicSimilarity], result of:
        4.150135 = fieldWeight in 686, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.5 = fieldNorm(doc=686)
    
  5. Yuan, X.; Belkin, N.J.: Evaluating an integrated system supporting multiple information-seeking strategies (2010) 4.15
    4.150135 = sum of:
      4.150135 = weight(author_txt:yuan in 979) [ClassicSimilarity], result of:
        4.150135 = fieldWeight in 979, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.5 = fieldNorm(doc=979)
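
The author scores above can be reproduced by hand. The sketch below assumes the standard Lucene ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and multiplies them by the fieldNorm shown in each explanation. The function names are illustrative and are not part of any Lucene API.

```python
import math


def classic_tf(freq):
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq, doc_freq, max_docs, field_norm):
    """fieldWeight = tf * idf * fieldNorm, as in the explanations above."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


if __name__ == "__main__":
    # Entry 1: author term "yuan", fieldNorm 0.625
    print(round(field_weight(1.0, 29, 44421, 0.625), 4))
    # Entries 2-5: same term, fieldNorm 0.5
    print(round(field_weight(1.0, 29, 44421, 0.5), 4))
```

Because the term, docFreq and maxDocs are identical for all five entries, only the fieldNorm differs between them. Lucene computes in 32-bit floats, so the last digits may differ slightly from a double-precision calculation.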
    

Similar documents (content)

  1. Tay, W.; Zhang, X.; Karimi, S.: Beyond mean rating : probabilistic aggregation of star ratings based on helpfulness (2020) 0.16
    0.15577082 = sum of:
      0.15577082 = product of:
        0.77885413 = sum of:
          0.086253315 = weight(abstract_txt:noisy in 917) [ClassicSimilarity], result of:
            0.086253315 = score(doc=917,freq=2.0), product of:
              0.118933536 = queryWeight, product of:
                1.0987763 = boost
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.01319224 = queryNorm
              0.7252228 = fieldWeight in 917, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.0625 = fieldNorm(doc=917)
          0.064728126 = weight(abstract_txt:movies in 917) [ClassicSimilarity], result of:
            0.064728126 = score(doc=917,freq=1.0), product of:
              0.12374447 = queryWeight, product of:
                1.1207792 = boost
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.01319224 = queryNorm
              0.5230789 = fieldWeight in 917, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.0625 = fieldNorm(doc=917)
          0.015108047 = weight(abstract_txt:based in 917) [ClassicSimilarity], result of:
            0.015108047 = score(doc=917,freq=2.0), product of:
              0.05369903 = queryWeight, product of:
                1.2787952 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.01319224 = queryNorm
              0.28134674 = fieldWeight in 917, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0625 = fieldNorm(doc=917)
          0.35183653 = weight(abstract_txt:ratings in 917) [ClassicSimilarity], result of:
            0.35183653 = score(doc=917,freq=4.0), product of:
              0.38255253 = queryWeight, product of:
                3.9412382 = boost
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.01319224 = queryNorm
              0.9197078 = fieldWeight in 917, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.357662 = idf(docFreq=76, maxDocs=44421)
                0.0625 = fieldNorm(doc=917)
          0.26092812 = weight(abstract_txt:noise in 917) [ClassicSimilarity], result of:
            0.26092812 = score(doc=917,freq=1.0), product of:
              0.53596246 = queryWeight, product of:
                5.215661 = boost
                7.7894444 = idf(docFreq=49, maxDocs=44421)
                0.01319224 = queryNorm
              0.48684028 = fieldWeight in 917, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7894444 = idf(docFreq=49, maxDocs=44421)
                0.0625 = fieldNorm(doc=917)
        0.2 = coord(5/25)
    
  2. Cole, C.: Shannon revisited : information in terms of uncertainty (1993) 0.11
    0.1128867 = sum of:
      0.1128867 = product of:
        0.94072247 = sum of:
          0.1673067 = weight(abstract_txt:entropy in 4068) [ClassicSimilarity], result of:
            0.1673067 = score(doc=4068,freq=1.0), product of:
              0.22408968 = queryWeight, product of:
                2.1329618 = boost
                7.963798 = idf(docFreq=41, maxDocs=44421)
                0.01319224 = queryNorm
              0.74660605 = fieldWeight in 4068, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.963798 = idf(docFreq=41, maxDocs=44421)
                0.09375 = fieldNorm(doc=4068)
          0.38202357 = weight(abstract_txt:uncertainty in 4068) [ClassicSimilarity], result of:
            0.38202357 = score(doc=4068,freq=3.0), product of:
              0.33945012 = queryWeight, product of:
                3.7125742 = boost
                6.930783 = idf(docFreq=117, maxDocs=44421)
                0.01319224 = queryNorm
              1.1254189 = fieldWeight in 4068, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.930783 = idf(docFreq=117, maxDocs=44421)
                0.09375 = fieldNorm(doc=4068)
          0.39139217 = weight(abstract_txt:noise in 4068) [ClassicSimilarity], result of:
            0.39139217 = score(doc=4068,freq=1.0), product of:
              0.53596246 = queryWeight, product of:
                5.215661 = boost
                7.7894444 = idf(docFreq=49, maxDocs=44421)
                0.01319224 = queryNorm
              0.73026043 = fieldWeight in 4068, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7894444 = idf(docFreq=49, maxDocs=44421)
                0.09375 = fieldNorm(doc=4068)
        0.12 = coord(3/25)
    
  3. Agarwal, B.; Ramampiaro, H.; Langseth, H.; Ruocco, M.: A deep network model for paraphrase detection in short text messages (2018) 0.11
    0.11153093 = sum of:
      0.11153093 = product of:
        0.4647122 = sum of:
          0.050293364 = weight(abstract_txt:detecting in 43) [ClassicSimilarity], result of:
            0.050293364 = score(doc=43,freq=1.0), product of:
              0.104585364 = queryWeight, product of:
                1.0303686 = boost
                7.694134 = idf(docFreq=54, maxDocs=44421)
                0.01319224 = queryNorm
              0.4808834 = fieldWeight in 43, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.694134 = idf(docFreq=54, maxDocs=44421)
                0.0625 = fieldNorm(doc=43)
          0.060990307 = weight(abstract_txt:noisy in 43) [ClassicSimilarity], result of:
            0.060990307 = score(doc=43,freq=1.0), product of:
              0.118933536 = queryWeight, product of:
                1.0987763 = boost
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.01319224 = queryNorm
              0.51281 = fieldWeight in 43, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.0625 = fieldNorm(doc=43)
          0.019851277 = weight(abstract_txt:evaluation in 43) [ClassicSimilarity], result of:
            0.019851277 = score(doc=43,freq=1.0), product of:
              0.07090324 = queryWeight, product of:
                1.1997899 = boost
                4.479632 = idf(docFreq=1368, maxDocs=44421)
                0.01319224 = queryNorm
              0.279977 = fieldWeight in 43, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.479632 = idf(docFreq=1368, maxDocs=44421)
                0.0625 = fieldNorm(doc=43)
          0.015108047 = weight(abstract_txt:based in 43) [ClassicSimilarity], result of:
            0.015108047 = score(doc=43,freq=2.0), product of:
              0.05369903 = queryWeight, product of:
                1.2787952 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.01319224 = queryNorm
              0.28134674 = fieldWeight in 43, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0625 = fieldNorm(doc=43)
          0.057541072 = weight(abstract_txt:natural in 43) [ClassicSimilarity], result of:
            0.057541072 = score(doc=43,freq=1.0), product of:
              0.18160832 = queryWeight, product of:
                2.7155325 = boost
                5.0694656 = idf(docFreq=758, maxDocs=44421)
                0.01319224 = queryNorm
              0.3168416 = fieldWeight in 43, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0694656 = idf(docFreq=758, maxDocs=44421)
                0.0625 = fieldNorm(doc=43)
          0.26092812 = weight(abstract_txt:noise in 43) [ClassicSimilarity], result of:
            0.26092812 = score(doc=43,freq=1.0), product of:
              0.53596246 = queryWeight, product of:
                5.215661 = boost
                7.7894444 = idf(docFreq=49, maxDocs=44421)
                0.01319224 = queryNorm
              0.48684028 = fieldWeight in 43, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7894444 = idf(docFreq=49, maxDocs=44421)
                0.0625 = fieldNorm(doc=43)
        0.24 = coord(6/25)
    
  4. Zhao, L.; Wu, L.; Huang, X.: Using query expansion in graph-based approach for query-focused multi-document summarization (2009) 0.09
    0.093717895 = sum of:
      0.093717895 = product of:
        0.46858945 = sum of:
          0.024814095 = weight(abstract_txt:evaluation in 3449) [ClassicSimilarity], result of:
            0.024814095 = score(doc=3449,freq=1.0), product of:
              0.07090324 = queryWeight, product of:
                1.1997899 = boost
                4.479632 = idf(docFreq=1368, maxDocs=44421)
                0.01319224 = queryNorm
              0.34997123 = fieldWeight in 3449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.479632 = idf(docFreq=1368, maxDocs=44421)
                0.078125 = fieldNorm(doc=3449)
          0.013353754 = weight(abstract_txt:based in 3449) [ClassicSimilarity], result of:
            0.013353754 = score(doc=3449,freq=1.0), product of:
              0.05369903 = queryWeight, product of:
                1.2787952 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.01319224 = queryNorm
              0.24867775 = fieldWeight in 3449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.078125 = fieldNorm(doc=3449)
          0.051255673 = weight(abstract_txt:algorithm in 3449) [ClassicSimilarity], result of:
            0.051255673 = score(doc=3449,freq=1.0), product of:
              0.11499927 = queryWeight, product of:
                1.5279871 = boost
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.01319224 = queryNorm
              0.44570434 = fieldWeight in 3449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.078125 = fieldNorm(doc=3449)
          0.053005777 = weight(abstract_txt:performed in 3449) [ClassicSimilarity], result of:
            0.053005777 = score(doc=3449,freq=1.0), product of:
              0.117602326 = queryWeight, product of:
                1.5451837 = boost
                5.7692223 = idf(docFreq=376, maxDocs=44421)
                0.01319224 = queryNorm
              0.4507205 = fieldWeight in 3449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7692223 = idf(docFreq=376, maxDocs=44421)
                0.078125 = fieldNorm(doc=3449)
          0.32616016 = weight(abstract_txt:noise in 3449) [ClassicSimilarity], result of:
            0.32616016 = score(doc=3449,freq=1.0), product of:
              0.53596246 = queryWeight, product of:
                5.215661 = boost
                7.7894444 = idf(docFreq=49, maxDocs=44421)
                0.01319224 = queryNorm
              0.60855037 = fieldWeight in 3449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7894444 = idf(docFreq=49, maxDocs=44421)
                0.078125 = fieldNorm(doc=3449)
        0.2 = coord(5/25)
    
  5. Longshu, L.; Xia, Z.: On an approximate fuzzy information retrieval agent (1998) 0.09
    0.09176272 = sum of:
      0.09176272 = product of:
        0.7646894 = sum of:
          0.026707508 = weight(abstract_txt:based in 4294) [ClassicSimilarity], result of:
            0.026707508 = score(doc=4294,freq=1.0), product of:
              0.05369903 = queryWeight, product of:
                1.2787952 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.01319224 = queryNorm
              0.4973555 = fieldWeight in 4294, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.15625 = fieldNorm(doc=4294)
          0.14497295 = weight(abstract_txt:algorithm in 4294) [ClassicSimilarity], result of:
            0.14497295 = score(doc=4294,freq=2.0), product of:
              0.11499927 = queryWeight, product of:
                1.5279871 = boost
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.01319224 = queryNorm
              1.2606423 = fieldWeight in 4294, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.15625 = fieldNorm(doc=4294)
          0.59300894 = weight(abstract_txt:fuzzy in 4294) [ClassicSimilarity], result of:
            0.59300894 = score(doc=4294,freq=5.0), product of:
              0.24808186 = queryWeight, product of:
                2.7486236 = boost
                6.8416553 = idf(docFreq=128, maxDocs=44421)
                0.01319224 = queryNorm
              2.390376 = fieldWeight in 4294, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.8416553 = idf(docFreq=128, maxDocs=44421)
                0.15625 = fieldNorm(doc=4294)
        0.12 = coord(3/25)
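
The content scores combine several matching terms. As a rough check, the sketch below recomputes the score of entry 5 from the numbers in its explanation, assuming the usual ClassicSimilarity structure: each term contributes queryWeight * fieldWeight, with queryWeight = boost * idf * queryNorm and fieldWeight = tf * idf * fieldNorm, and the sum is scaled by the coord factor (matched terms / query terms). The helper names are illustrative only.

```python
import math

MAX_DOCS = 44421          # maxDocs from the explanations above
QUERY_NORM = 0.01319224   # queryNorm from the explanations above


def classic_idf(doc_freq, max_docs=MAX_DOCS):
    """idf = 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, freq, doc_freq, field_norm):
    """Score of one matching term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, matched, total_query_terms):
    """Sum of the term scores, scaled by coord(matched/total)."""
    coord = matched / total_query_terms
    return coord * sum(term_score(b, f, df, field_norm) for b, f, df in terms)


if __name__ == "__main__":
    # Entry 5 (doc 4294): (boost, freq, docFreq) for each matched term
    terms = [
        (1.2787952, 1.0, 5005),  # based
        (1.5279871, 2.0, 401),   # algorithm
        (2.7486236, 5.0, 128),   # fuzzy
    ]
    print(round(document_score(terms, 0.15625, 3, 25), 4))
```

The coord factor explains why a document matching only 3 of the 25 query terms ends up with a low final score even when individual terms such as "fuzzy" weigh heavily.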