Document (#40964)

Author
Toraman, C.
Can, F.
Title
Discovering story chains : a framework based on zigzagged search and news actors
Source
Journal of the Association for Information Science and Technology. 68(2017) no.12, p.2795-2808
Year
2017
Abstract
A story chain is a set of related news articles that reveal how different events are connected. This study presents a framework for discovering story chains, given an input document, in a text collection. The framework has 3 complementary parts that i) scan the collection, ii) measure the similarity between chain-member candidates and the chain, and iii) measure similarity among news articles. For scanning, we apply a novel text-mining method that uses a zigzagged search that reinvestigates past documents based on the updated chain. We also utilize social networks of news actors to reveal connections among news articles. We conduct 2 user studies in terms of 4 effectiveness measures: relevance, coverage, coherence, and ability to disclose relations. The first user study compares several versions of the framework, by varying parameters, to set a guideline for use. The second compares the framework with 3 baselines. The results show that our method provides statistically significant improvement in effectiveness in 61% of pairwise comparisons, with medium or large effect size; in the remainder, none of the baselines significantly outperforms our method.
Content
Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23885/full.
Field
Communication sciences

Similar documents (content)

  1. Zhao, X.; Jin, P.; Yue, L.: Discovering topic time from web news (2015) 0.19
    0.19386303 = sum of:
      0.19386303 = product of:
        0.692368 = sum of:
          0.016802125 = weight(abstract_txt:text in 3673) [ClassicSimilarity], result of:
            0.016802125 = score(doc=3673,freq=1.0), product of:
              0.066528544 = queryWeight, product of:
                1.0148128 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01622355 = queryNorm
              0.25255513 = fieldWeight in 3673, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=3673)
          0.0337029 = weight(abstract_txt:effectiveness in 3673) [ClassicSimilarity], result of:
            0.0337029 = score(doc=3673,freq=1.0), product of:
              0.10581406 = queryWeight, product of:
                1.2798339 = boost
                5.0961695 = idf(docFreq=738, maxDocs=44421)
                0.01622355 = queryNorm
              0.3185106 = fieldWeight in 3673, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0961695 = idf(docFreq=738, maxDocs=44421)
                0.0625 = fieldNorm(doc=3673)
          0.040908527 = weight(abstract_txt:measure in 3673) [ClassicSimilarity], result of:
            0.040908527 = score(doc=3673,freq=1.0), product of:
              0.12040405 = queryWeight, product of:
                1.3652195 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.01622355 = queryNorm
              0.3397604 = fieldWeight in 3673, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.0625 = fieldNorm(doc=3673)
          0.011908262 = weight(abstract_txt:that in 3673) [ClassicSimilarity], result of:
            0.011908262 = score(doc=3673,freq=2.0), product of:
              0.056968413 = queryWeight, product of:
                1.4848036 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.01622355 = queryNorm
              0.20903271 = fieldWeight in 3673, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=3673)
          0.1482372 = weight(abstract_txt:discovering in 3673) [ClassicSimilarity], result of:
            0.1482372 = score(doc=3673,freq=2.0), product of:
              0.22545508 = queryWeight, product of:
                1.8681508 = boost
                7.438788 = idf(docFreq=70, maxDocs=44421)
                0.01622355 = queryNorm
              0.6575022 = fieldWeight in 3673, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.438788 = idf(docFreq=70, maxDocs=44421)
                0.0625 = fieldNorm(doc=3673)
          0.08449913 = weight(abstract_txt:framework in 3673) [ClassicSimilarity], result of:
            0.08449913 = score(doc=3673,freq=2.0), product of:
              0.21036333 = queryWeight, product of:
                2.8532312 = boost
                4.5445113 = idf(docFreq=1282, maxDocs=44421)
                0.01622355 = queryNorm
              0.40168184 = fieldWeight in 3673, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5445113 = idf(docFreq=1282, maxDocs=44421)
                0.0625 = fieldNorm(doc=3673)
          0.35630983 = weight(abstract_txt:news in 3673) [ClassicSimilarity], result of:
            0.35630983 = score(doc=3673,freq=7.0), product of:
              0.36162996 = queryWeight, product of:
                3.7409694 = boost
                5.9584646 = idf(docFreq=311, maxDocs=44421)
                0.01622355 = queryNorm
              0.98528844 = fieldWeight in 3673, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.9584646 = idf(docFreq=311, maxDocs=44421)
                0.0625 = fieldNorm(doc=3673)
        0.28 = coord(7/25)
    
  2. Bounhas, I.; Elayeb, B.; Evrard, F.; Slimani, Y.: Toward a computer study of the reliability of Arabic stories (2010) 0.15
    0.15461102 = sum of:
      0.15461102 = product of:
        0.7730551 = sum of:
          0.040908527 = weight(abstract_txt:measure in 696) [ClassicSimilarity], result of:
            0.040908527 = score(doc=696,freq=1.0), product of:
              0.12040405 = queryWeight, product of:
                1.3652195 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.01622355 = queryNorm
              0.3397604 = fieldWeight in 696, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.0625 = fieldNorm(doc=696)
          0.011908262 = weight(abstract_txt:that in 696) [ClassicSimilarity], result of:
            0.011908262 = score(doc=696,freq=2.0), product of:
              0.056968413 = queryWeight, product of:
                1.4848036 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.01622355 = queryNorm
              0.20903271 = fieldWeight in 696, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=696)
          0.26920426 = weight(abstract_txt:chains in 696) [ClassicSimilarity], result of:
            0.26920426 = score(doc=696,freq=3.0), product of:
              0.29316542 = queryWeight, product of:
                2.130288 = boost
                8.482592 = idf(docFreq=24, maxDocs=44421)
                0.01622355 = queryNorm
              0.9182675 = fieldWeight in 696, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.482592 = idf(docFreq=24, maxDocs=44421)
                0.0625 = fieldNorm(doc=696)
          0.25811195 = weight(abstract_txt:story in 696) [ClassicSimilarity], result of:
            0.25811195 = score(doc=696,freq=3.0), product of:
              0.32630768 = queryWeight, product of:
                2.752588 = boost
                7.3070183 = idf(docFreq=80, maxDocs=44421)
                0.01622355 = queryNorm
              0.79100794 = fieldWeight in 696, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.3070183 = idf(docFreq=80, maxDocs=44421)
                0.0625 = fieldNorm(doc=696)
          0.19292213 = weight(abstract_txt:chain in 696) [ClassicSimilarity], result of:
            0.19292213 = score(doc=696,freq=1.0), product of:
              0.4266089 = queryWeight, product of:
                3.6342256 = boost
                7.2355595 = idf(docFreq=86, maxDocs=44421)
                0.01622355 = queryNorm
              0.45222247 = fieldWeight in 696, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2355595 = idf(docFreq=86, maxDocs=44421)
                0.0625 = fieldNorm(doc=696)
        0.2 = coord(5/25)
    
  3. Ou, S.; Khoo, C.S.G.; Goh, D.H.: Multi-document summarization of news articles using an event-based framework (2006) 0.14
    0.14139117 = sum of:
      0.14139117 = product of:
        0.70695585 = sum of:
          0.011908262 = weight(abstract_txt:that in 782) [ClassicSimilarity], result of:
            0.011908262 = score(doc=782,freq=2.0), product of:
              0.056968413 = queryWeight, product of:
                1.4848036 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.01622355 = queryNorm
              0.20903271 = fieldWeight in 782, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=782)
          0.034779143 = weight(abstract_txt:method in 782) [ClassicSimilarity], result of:
            0.034779143 = score(doc=782,freq=1.0), product of:
              0.123691976 = queryWeight, product of:
                1.6947215 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.01622355 = queryNorm
              0.2811754 = fieldWeight in 782, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.0625 = fieldNorm(doc=782)
          0.110358976 = weight(abstract_txt:articles in 782) [ClassicSimilarity], result of:
            0.110358976 = score(doc=782,freq=7.0), product of:
              0.13962656 = queryWeight, product of:
                1.8005766 = boost
                4.7798095 = idf(docFreq=1013, maxDocs=44421)
                0.01622355 = queryNorm
              0.7903867 = fieldWeight in 782, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.7798095 = idf(docFreq=1013, maxDocs=44421)
                0.0625 = fieldNorm(doc=782)
          0.16899826 = weight(abstract_txt:framework in 782) [ClassicSimilarity], result of:
            0.16899826 = score(doc=782,freq=8.0), product of:
              0.21036333 = queryWeight, product of:
                2.8532312 = boost
                4.5445113 = idf(docFreq=1282, maxDocs=44421)
                0.01622355 = queryNorm
              0.8033637 = fieldWeight in 782, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.5445113 = idf(docFreq=1282, maxDocs=44421)
                0.0625 = fieldNorm(doc=782)
          0.38091123 = weight(abstract_txt:news in 782) [ClassicSimilarity], result of:
            0.38091123 = score(doc=782,freq=8.0), product of:
              0.36162996 = queryWeight, product of:
                3.7409694 = boost
                5.9584646 = idf(docFreq=311, maxDocs=44421)
                0.01622355 = queryNorm
              1.0533177 = fieldWeight in 782, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.9584646 = idf(docFreq=311, maxDocs=44421)
                0.0625 = fieldNorm(doc=782)
        0.2 = coord(5/25)
    
  4. Xianghao, G.; Yixin, Z.; Li, Y.: A new method of news text understanding and abstracting based on speech acts theory (1998) 0.13
    0.133272 = sum of:
      0.133272 = product of:
        0.83295006 = sum of:
          0.014735723 = weight(abstract_txt:that in 4532) [ClassicSimilarity], result of:
            0.014735723 = score(doc=4532,freq=1.0), product of:
              0.056968413 = queryWeight, product of:
                1.4848036 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.01622355 = queryNorm
              0.2586648 = fieldWeight in 4532, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.109375 = fieldNorm(doc=4532)
          0.08607398 = weight(abstract_txt:method in 4532) [ClassicSimilarity], result of:
            0.08607398 = score(doc=4532,freq=2.0), product of:
              0.123691976 = queryWeight, product of:
                1.6947215 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.01622355 = queryNorm
              0.6958736 = fieldWeight in 4532, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.109375 = fieldNorm(doc=4532)
          0.26078677 = weight(abstract_txt:story in 4532) [ClassicSimilarity], result of:
            0.26078677 = score(doc=4532,freq=1.0), product of:
              0.32630768 = queryWeight, product of:
                2.752588 = boost
                7.3070183 = idf(docFreq=80, maxDocs=44421)
                0.01622355 = queryNorm
              0.7992051 = fieldWeight in 4532, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3070183 = idf(docFreq=80, maxDocs=44421)
                0.109375 = fieldNorm(doc=4532)
          0.4713536 = weight(abstract_txt:news in 4532) [ClassicSimilarity], result of:
            0.4713536 = score(doc=4532,freq=4.0), product of:
              0.36162996 = queryWeight, product of:
                3.7409694 = boost
                5.9584646 = idf(docFreq=311, maxDocs=44421)
                0.01622355 = queryNorm
              1.3034141 = fieldWeight in 4532, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.9584646 = idf(docFreq=311, maxDocs=44421)
                0.109375 = fieldNorm(doc=4532)
        0.16 = coord(4/25)
    
  5. Lehmann, J.; Castillo, C.; Lalmas, M.; Baeza-Yates, R.: Story-focused reading in online news and its potential for user engagement (2017) 0.13
    0.13027327 = sum of:
      0.13027327 = product of:
        0.8142079 = sum of:
          0.020625716 = weight(abstract_txt:that in 4529) [ClassicSimilarity], result of:
            0.020625716 = score(doc=4529,freq=6.0), product of:
              0.056968413 = queryWeight, product of:
                1.4848036 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.01622355 = queryNorm
              0.3620553 = fieldWeight in 4529, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=4529)
          0.07224691 = weight(abstract_txt:articles in 4529) [ClassicSimilarity], result of:
            0.07224691 = score(doc=4529,freq=3.0), product of:
              0.13962656 = queryWeight, product of:
                1.8005766 = boost
                4.7798095 = idf(docFreq=1013, maxDocs=44421)
                0.01622355 = queryNorm
              0.51742953 = fieldWeight in 4529, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.7798095 = idf(docFreq=1013, maxDocs=44421)
                0.0625 = fieldNorm(doc=4529)
          0.36502543 = weight(abstract_txt:story in 4529) [ClassicSimilarity], result of:
            0.36502543 = score(doc=4529,freq=6.0), product of:
              0.32630768 = queryWeight, product of:
                2.752588 = boost
                7.3070183 = idf(docFreq=80, maxDocs=44421)
                0.01622355 = queryNorm
              1.1186541 = fieldWeight in 4529, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                7.3070183 = idf(docFreq=80, maxDocs=44421)
                0.0625 = fieldNorm(doc=4529)
          0.35630983 = weight(abstract_txt:news in 4529) [ClassicSimilarity], result of:
            0.35630983 = score(doc=4529,freq=7.0), product of:
              0.36162996 = queryWeight, product of:
                3.7409694 = boost
                5.9584646 = idf(docFreq=311, maxDocs=44421)
                0.01622355 = queryNorm
              0.98528844 = fieldWeight in 4529, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.9584646 = idf(docFreq=311, maxDocs=44421)
                0.0625 = fieldNorm(doc=4529)
        0.16 = coord(4/25)
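
The score trees above can be checked by hand. The following is a minimal sketch, assuming the TF-IDF formulas of Lucene's ClassicSimilarity: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final score equal to the sum of the matched term scores times coord. The function and constant names are hypothetical. All numbers are copied from the explain output for result 2 (doc 696).

```python
import math

# Constants copied from the explain output above.
MAX_DOCS = 44421          # maxDocs
QUERY_NORM = 0.01622355   # queryNorm
QUERY_TERMS = 25          # denominator of coord(n/25)


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as ClassicSimilarity defines it."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm):
    """Score of one matched term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, query_terms=QUERY_TERMS):
    """Sum the matched term scores, then scale by coord(matched/total)."""
    matched = [term_score(f, df, b, field_norm) for (f, df, b) in terms]
    coord = len(matched) / query_terms
    return sum(matched) * coord


# (freq, docFreq, boost) for the five terms matched in doc 696.
DOC_696_TERMS = [
    (1.0, 525, 1.3652195),    # measure
    (2.0, 11344, 1.4848036),  # that
    (3.0, 24, 2.130288),      # chains
    (3.0, 80, 2.752588),      # story
    (1.0, 86, 3.6342256),     # chain
]

score_696 = document_score(DOC_696_TERMS, field_norm=0.0625)
# The explain output reports 0.15461102 for this document.
print(f"doc 696: {score_696:.6f}")
```

Lucene computes in 32-bit floats, so the recomputed value agrees with the listed one only to about six decimal places. The sketch takes fieldNorm as given. Its larger value for doc 4532 (0.109375 versus 0.0625) is why a single occurrence of "story" there gets a slightly higher fieldWeight than three occurrences in doc 696.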