Document (#36447)

Author
Seki, K.
Uehara, K.
Title
Opinionated document retrieval using subjective triggers
Source
Journal of the American Society for Information Science and Technology. 62(2011) no.5, pp. 861-876
Year
2011
Abstract
This article proposes a novel application of a statistical language model to opinionated document retrieval targeting weblogs (blogs). In particular, we explore the use of the trigger model, originally developed for incorporating distant word dependencies, to model the characteristics of personal opinions that cannot be properly modeled by standard n-grams. Our primary assumption is that a subjective opinion has two constituents. One is the subject of the opinion or the object that the opinion is about, and the other is a subjective expression; the former is regarded as a triggering word and the latter as a triggered word. We automatically identify those subjective trigger patterns to build a language model from a corpus of product customer reviews. Experimental results on the Text Retrieval Conference Blog track test collections show that, when used for reranking initial search results, our proposed model significantly improves opinionated document retrieval. In addition, we report on an experiment on dynamic adaptation of the model to a given query, which is found to be effective for most of the difficult queries categorized under politics and organizations. We also demonstrate that, without any modification to the proposed model itself, it can be effectively applied to polarized opinion retrieval.
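
The trigger idea in the abstract (a triggering word that raises the likelihood of a triggered word later in the text) can be illustrated with a small sketch. The abstract does not say how trigger patterns are selected, so the pointwise mutual information (PMI) criterion, the `trigger_pairs` function, the `min_count` threshold, and the toy review sentences below are all assumptions for illustration, not the authors' procedure.

```python
import math
from collections import Counter


def trigger_pairs(sentences, min_count=2):
    """Score ordered word pairs (a occurs before b in a sentence) by PMI.

    Assumption: PMI over sentence-level co-occurrence stands in for the
    paper's (unstated) pattern-selection criterion.
    """
    word_counts = Counter()   # number of sentences containing each word
    pair_counts = Counter()   # number of sentences containing a ... b
    for tokens in sentences:
        word_counts.update(set(tokens))
        seen = set()
        for i, a in enumerate(tokens):
            for b in tokens[i + 1:]:
                if a != b:
                    seen.add((a, b))
        pair_counts.update(seen)

    n = len(sentences)
    scores = {}
    for (a, b), c in pair_counts.items():
        if c >= min_count:
            # PMI = log( P(a, b) / (P(a) * P(b)) ) with sentence-level estimates
            scores[(a, b)] = math.log((c * n) / (word_counts[a] * word_counts[b]))
    return scores


# Hypothetical toy corpus of tokenized review sentences.
reviews = [
    ["i", "love", "this", "camera"],
    ["i", "love", "the", "lens"],
    ["the", "camera", "is", "heavy"],
    ["this", "lens", "is", "sharp"],
]
scores = trigger_pairs(reviews)
for pair, pmi in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(pmi, 3))
```

In this toy corpus only one ordered pair recurs often enough to pass the threshold. A real system would need a much larger review corpus and some way of restricting the triggered side to subjective expressions.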

Similar documents (content)

  1. Belbachir, F.; Boughanem, M.: Using language models to improve opinion detection (2018) 0.48
    0.4833934 = sum of:
      0.4833934 = product of:
        1.5106044 = sum of:
          0.04904516 = weight(abstract_txt:blog in 44) [ClassicSimilarity], result of:
            0.04904516 = score(doc=44,freq=1.0), product of:
              0.115716115 = queryWeight, product of:
                1.0358548 = boost
                7.750224 = idf(docFreq=51, maxDocs=44421)
                0.014413874 = queryNorm
              0.42384037 = fieldWeight in 44, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.750224 = idf(docFreq=51, maxDocs=44421)
                0.0546875 = fieldNorm(doc=44)
          0.034188893 = weight(abstract_txt:language in 44) [ClassicSimilarity], result of:
            0.034188893 = score(doc=44,freq=5.0), product of:
              0.0670306 = queryWeight, product of:
                1.1149452 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.014413874 = queryNorm
              0.51004905 = fieldWeight in 44, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.0546875 = fieldNorm(doc=44)
          0.009853627 = weight(abstract_txt:that in 44) [ClassicSimilarity], result of:
            0.009853627 = score(doc=44,freq=2.0), product of:
              0.05387333 = queryWeight, product of:
                1.5804249 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.014413874 = queryNorm
              0.18290362 = fieldWeight in 44, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0546875 = fieldNorm(doc=44)
          0.055961743 = weight(abstract_txt:document in 44) [ClassicSimilarity], result of:
            0.055961743 = score(doc=44,freq=5.0), product of:
              0.10657148 = queryWeight, product of:
                1.7218015 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.014413874 = queryNorm
              0.52510995 = fieldWeight in 44, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0546875 = fieldNorm(doc=44)
          0.05421579 = weight(abstract_txt:retrieval in 44) [ClassicSimilarity], result of:
            0.05421579 = score(doc=44,freq=6.0), product of:
              0.116417915 = queryWeight, product of:
                2.3232548 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.014413874 = queryNorm
              0.4656997 = fieldWeight in 44, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0546875 = fieldNorm(doc=44)
          0.1686807 = weight(abstract_txt:subjective in 44) [ClassicSimilarity], result of:
            0.1686807 = score(doc=44,freq=2.0), product of:
              0.3321875 = queryWeight, product of:
                3.5101333 = boost
                6.565669 = idf(docFreq=169, maxDocs=44421)
                0.014413874 = queryNorm
              0.5077876 = fieldWeight in 44, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.565669 = idf(docFreq=169, maxDocs=44421)
                0.0546875 = fieldNorm(doc=44)
          0.47399333 = weight(abstract_txt:opinion in 44) [ClassicSimilarity], result of:
            0.47399333 = score(doc=44,freq=12.0), product of:
              0.36403024 = queryWeight, product of:
                3.6745205 = boost
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.014413874 = queryNorm
              1.3020713 = fieldWeight in 44, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.0546875 = fieldNorm(doc=44)
          0.66466516 = weight(abstract_txt:opinionated in 44) [ClassicSimilarity], result of:
            0.66466516 = score(doc=44,freq=6.0), product of:
              0.5220615 = queryWeight, product of:
                3.8108637 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.014413874 = queryNorm
              1.2731549 = fieldWeight in 44, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.0546875 = fieldNorm(doc=44)
        0.32 = coord(8/25)
    
  2. Ku, L.-W.; Chen, H.-H.: Mining opinions from the Web : beyond relevance retrieval (2007) 0.19
    0.18849349 = sum of:
      0.18849349 = product of:
        0.78538954 = sum of:
          0.023563003 = weight(abstract_txt:proposed in 1605) [ClassicSimilarity], result of:
            0.023563003 = score(doc=1605,freq=1.0), product of:
              0.08181486 = queryWeight, product of:
                1.2317797 = boost
                4.608063 = idf(docFreq=1203, maxDocs=44421)
                0.014413874 = queryNorm
              0.28800395 = fieldWeight in 1605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.608063 = idf(docFreq=1203, maxDocs=44421)
                0.0625 = fieldNorm(doc=1605)
          0.02860212 = weight(abstract_txt:document in 1605) [ClassicSimilarity], result of:
            0.02860212 = score(doc=1605,freq=1.0), product of:
              0.10657148 = queryWeight, product of:
                1.7218015 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.014413874 = queryNorm
              0.26838437 = fieldWeight in 1605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=1605)
          0.058089927 = weight(abstract_txt:word in 1605) [ClassicSimilarity], result of:
            0.058089927 = score(doc=1605,freq=1.0), product of:
              0.17091338 = queryWeight, product of:
                2.1804726 = boost
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.014413874 = queryNorm
              0.33987933 = fieldWeight in 1605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.0625 = fieldNorm(doc=1605)
          0.02529543 = weight(abstract_txt:retrieval in 1605) [ClassicSimilarity], result of:
            0.02529543 = score(doc=1605,freq=1.0), product of:
              0.116417915 = queryWeight, product of:
                2.3232548 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.014413874 = queryNorm
              0.21728125 = fieldWeight in 1605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=1605)
          0.23610377 = weight(abstract_txt:subjective in 1605) [ClassicSimilarity], result of:
            0.23610377 = score(doc=1605,freq=3.0), product of:
              0.3321875 = queryWeight, product of:
                3.5101333 = boost
                6.565669 = idf(docFreq=169, maxDocs=44421)
                0.014413874 = queryNorm
              0.7107545 = fieldWeight in 1605, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.565669 = idf(docFreq=169, maxDocs=44421)
                0.0625 = fieldNorm(doc=1605)
          0.41373524 = weight(abstract_txt:opinion in 1605) [ClassicSimilarity], result of:
            0.41373524 = score(doc=1605,freq=7.0), product of:
              0.36403024 = queryWeight, product of:
                3.6745205 = boost
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.014413874 = queryNorm
              1.1365409 = fieldWeight in 1605, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.0625 = fieldNorm(doc=1605)
        0.24 = coord(6/25)
    
  3. Guo, L.; Wan, X.: Exploiting syntactic and semantic relationships between terms for opinion retrieval (2012) 0.18
    0.17548212 = sum of:
      0.17548212 = product of:
        0.73117554 = sum of:
          0.029453754 = weight(abstract_txt:proposed in 1492) [ClassicSimilarity], result of:
            0.029453754 = score(doc=1492,freq=1.0), product of:
              0.08181486 = queryWeight, product of:
                1.2317797 = boost
                4.608063 = idf(docFreq=1203, maxDocs=44421)
                0.014413874 = queryNorm
              0.36000493 = fieldWeight in 1492, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.608063 = idf(docFreq=1203, maxDocs=44421)
                0.078125 = fieldNorm(doc=1492)
          0.0099536665 = weight(abstract_txt:that in 1492) [ClassicSimilarity], result of:
            0.0099536665 = score(doc=1492,freq=1.0), product of:
              0.05387333 = queryWeight, product of:
                1.5804249 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.014413874 = queryNorm
              0.18476056 = fieldWeight in 1492, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.078125 = fieldNorm(doc=1492)
          0.035752647 = weight(abstract_txt:document in 1492) [ClassicSimilarity], result of:
            0.035752647 = score(doc=1492,freq=1.0), product of:
              0.10657148 = queryWeight, product of:
                1.7218015 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.014413874 = queryNorm
              0.33548045 = fieldWeight in 1492, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.078125 = fieldNorm(doc=1492)
          0.044716425 = weight(abstract_txt:retrieval in 1492) [ClassicSimilarity], result of:
            0.044716425 = score(doc=1492,freq=2.0), product of:
              0.116417915 = queryWeight, product of:
                2.3232548 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.014413874 = queryNorm
              0.3841026 = fieldWeight in 1492, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.078125 = fieldNorm(doc=1492)
          0.51716906 = weight(abstract_txt:opinion in 1492) [ClassicSimilarity], result of:
            0.51716906 = score(doc=1492,freq=7.0), product of:
              0.36403024 = queryWeight, product of:
                3.6745205 = boost
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.014413874 = queryNorm
              1.4206761 = fieldWeight in 1492, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.078125 = fieldNorm(doc=1492)
          0.094129995 = weight(abstract_txt:model in 1492) [ClassicSimilarity], result of:
            0.094129995 = score(doc=1492,freq=2.0), product of:
              0.21391265 = queryWeight, product of:
                3.726226 = boost
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.014413874 = queryNorm
              0.4400394 = fieldWeight in 1492, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.078125 = fieldNorm(doc=1492)
        0.24 = coord(6/25)
    
  4. Lhadj, L.S.; Boughanem, M.; Amrouche, K.: Enhancing information retrieval through concept-based language modeling and semantic smoothing (2016) 0.14
    0.14143743 = sum of:
      0.14143743 = product of:
        0.44199198 = sum of:
          0.057806276 = weight(abstract_txt:dependencies in 4221) [ClassicSimilarity], result of:
            0.057806276 = score(doc=4221,freq=1.0), product of:
              0.11811864 = queryWeight, product of:
                1.0465529 = boost
                7.8302665 = idf(docFreq=47, maxDocs=44421)
                0.014413874 = queryNorm
              0.48939165 = fieldWeight in 4221, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8302665 = idf(docFreq=47, maxDocs=44421)
                0.0625 = fieldNorm(doc=4221)
          0.06650814 = weight(abstract_txt:grams in 4221) [ClassicSimilarity], result of:
            0.06650814 = score(doc=4221,freq=1.0), product of:
              0.12969352 = queryWeight, product of:
                1.0966325 = boost
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.014413874 = queryNorm
              0.51281 = fieldWeight in 4221, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.0625 = fieldNorm(doc=4221)
          0.030265834 = weight(abstract_txt:language in 4221) [ClassicSimilarity], result of:
            0.030265834 = score(doc=4221,freq=3.0), product of:
              0.0670306 = queryWeight, product of:
                1.1149452 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.014413874 = queryNorm
              0.45152265 = fieldWeight in 4221, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.0625 = fieldNorm(doc=4221)
          0.015925867 = weight(abstract_txt:that in 4221) [ClassicSimilarity], result of:
            0.015925867 = score(doc=4221,freq=4.0), product of:
              0.05387333 = queryWeight, product of:
                1.5804249 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.014413874 = queryNorm
              0.2956169 = fieldWeight in 4221, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=4221)
          0.02860212 = weight(abstract_txt:document in 4221) [ClassicSimilarity], result of:
            0.02860212 = score(doc=4221,freq=1.0), product of:
              0.10657148 = queryWeight, product of:
                1.7218015 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.014413874 = queryNorm
              0.26838437 = fieldWeight in 4221, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=4221)
          0.100614704 = weight(abstract_txt:word in 4221) [ClassicSimilarity], result of:
            0.100614704 = score(doc=4221,freq=3.0), product of:
              0.17091338 = queryWeight, product of:
                2.1804726 = boost
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.014413874 = queryNorm
              0.58868825 = fieldWeight in 4221, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.0625 = fieldNorm(doc=4221)
          0.03577314 = weight(abstract_txt:retrieval in 4221) [ClassicSimilarity], result of:
            0.03577314 = score(doc=4221,freq=2.0), product of:
              0.116417915 = queryWeight, product of:
                2.3232548 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.014413874 = queryNorm
              0.3072821 = fieldWeight in 4221, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=4221)
          0.10649593 = weight(abstract_txt:model in 4221) [ClassicSimilarity], result of:
            0.10649593 = score(doc=4221,freq=4.0), product of:
              0.21391265 = queryWeight, product of:
                3.726226 = boost
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.014413874 = queryNorm
              0.49784777 = fieldWeight in 4221, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.0625 = fieldNorm(doc=4221)
        0.32 = coord(8/25)
    
  5. Ye, Z.; He, B.; Wang, L.; Luo, T.: Utilizing term proximity for blog post retrieval (2013) 0.13
    0.13309178 = sum of:
      0.13309178 = product of:
        0.66545886 = sum of:
          0.056051616 = weight(abstract_txt:blog in 2126) [ClassicSimilarity], result of:
            0.056051616 = score(doc=2126,freq=1.0), product of:
              0.115716115 = queryWeight, product of:
                1.0358548 = boost
                7.750224 = idf(docFreq=51, maxDocs=44421)
                0.014413874 = queryNorm
              0.484389 = fieldWeight in 2126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.750224 = idf(docFreq=51, maxDocs=44421)
                0.0625 = fieldNorm(doc=2126)
          0.007962934 = weight(abstract_txt:that in 2126) [ClassicSimilarity], result of:
            0.007962934 = score(doc=2126,freq=1.0), product of:
              0.05387333 = queryWeight, product of:
                1.5804249 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.014413874 = queryNorm
              0.14780845 = fieldWeight in 2126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=2126)
          0.043812968 = weight(abstract_txt:retrieval in 2126) [ClassicSimilarity], result of:
            0.043812968 = score(doc=2126,freq=3.0), product of:
              0.116417915 = queryWeight, product of:
                2.3232548 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.014413874 = queryNorm
              0.37634215 = fieldWeight in 2126, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=2126)
          0.119066074 = weight(abstract_txt:model in 2126) [ClassicSimilarity], result of:
            0.119066074 = score(doc=2126,freq=5.0), product of:
              0.21391265 = queryWeight, product of:
                3.726226 = boost
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.014413874 = queryNorm
              0.5566107 = fieldWeight in 2126, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.9827821 = idf(docFreq=2249, maxDocs=44421)
                0.0625 = fieldNorm(doc=2126)
          0.43856525 = weight(abstract_txt:opinionated in 2126) [ClassicSimilarity], result of:
            0.43856525 = score(doc=2126,freq=2.0), product of:
              0.5220615 = queryWeight, product of:
                3.8108637 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.014413874 = queryNorm
              0.8400643 = fieldWeight in 2126, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.0625 = fieldNorm(doc=2126)
        0.2 = coord(5/25)
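
The score trees above follow Lucene's ClassicSimilarity (TF-IDF) scheme. Each matching term contributes queryWeight x fieldWeight, and the sum is scaled by the coord factor. The sketch below recomputes the score of the first similar document (doc 44) from the values printed in its tree. The function names are illustrative, and the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are taken from ClassicSimilarity as documented. Lucene works in 32-bit floats, so the result should match only to within rounding.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.014413874


def idf(doc_freq, max_docs=MAX_DOCS):
    # ClassicSimilarity inverse document frequency
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm):
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM           # query-side weight
    field_weight = math.sqrt(freq) * term_idf * field_norm  # document-side weight
    return query_weight * field_weight


# (term, freq, docFreq, boost) copied from the tree for doc 44
terms_doc44 = [
    ("blog",        1.0,  51,    1.0358548),
    ("language",    5.0,  1863,  1.1149452),
    ("that",        2.0,  11344, 1.5804249),
    ("document",    5.0,  1647,  1.7218015),
    ("retrieval",   6.0,  3732,  2.3232548),
    ("subjective",  2.0,  169,   3.5101333),
    ("opinion",     12.0, 124,   3.6745205),
    ("opinionated", 6.0,  8,     3.8108637),
]
FIELD_NORM_DOC44 = 0.0546875

term_sum = sum(term_score(f, df, b, FIELD_NORM_DOC44) for _, f, df, b in terms_doc44)
coord = 8 / 25   # 8 of the 25 query terms matched
total = coord * term_sum
print(f"sum of term scores = {term_sum:.7f}, total = {total:.7f}")
```

The same calculation applies to the other four entries once their per-document fieldNorm, term frequencies, and coord factor are substituted.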