Document (#31117)

Author
Thelwall, M.
Prabowo, R.
Fairclough, R.
Title
Are raw RSS feeds suitable for broad issue scanning? : a science concern case study
Source
Journal of the American Society for Information Science and Technology. 57(2006) no.12, p.1644-1654
Year
2006
Abstract
Broad issue scanning is the task of identifying important public debates arising in a given broad issue; really simple syndication (RSS) feeds are a natural information source for investigating broad issues. RSS, as originally conceived, is a method for publishing timely and concise information on the Internet, for example, about the main stories in a news site or the latest postings in a blog. RSS feeds are potentially a nonintrusive source of high-quality data about public opinion: Monitoring a large number may allow quantitative methods to extract information relevant to a given need. In this article we describe an RSS feed-based coword frequency method to identify bursts of discussion relevant to a given broad issue. A case study of public science concerns is used to demonstrate the method and assess the suitability of raw RSS feeds for broad issue scanning (i.e., without data cleansing). An attempt to identify genuine science concern debates from the corpus through investigating the top 1,000 "burst" words found only two genuine debates, however. The low success rate was mainly caused by a few pathological feeds that dominated the results and obscured any significant debates. The results point to the need to develop effective data cleansing procedures for RSS feeds, particularly if there is not a large quantity of discussion about the broad issue, and a range of potential techniques is suggested. Finally, the analysis confirmed that the time series information generated by real-time monitoring of RSS feeds could usefully illustrate the evolution of new debates relevant to a broad issue.
Object
RSS

Similar documents (author)

  1. Thelwall, M.; Thelwall, S.: A thematic analysis of highly retweeted early COVID-19 tweets : consensus, information, dissent and lockdown life (2020) 4.89
    4.888919 = sum of:
      4.888919 = weight(author_txt:thelwall in 1179) [ClassicSimilarity], result of:
        4.888919 = fieldWeight in 1179, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.9139757 = idf(docFreq=119, maxDocs=44421)
          0.5 = fieldNorm(doc=1179)
    
  2. Thelwall, M.: Extracting macroscopic information from Web links (2001) 4.32
    4.3212347 = sum of:
      4.3212347 = weight(author_txt:thelwall in 851) [ClassicSimilarity], result of:
        4.3212347 = fieldWeight in 851, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.9139757 = idf(docFreq=119, maxDocs=44421)
          0.625 = fieldNorm(doc=851)
    
  3. Thelwall, M.: Conceptualizing documentation on the Web : an evaluation of different heuristic-based models for counting links between university Web sites (2002) 4.32
    4.3212347 = sum of:
      4.3212347 = weight(author_txt:thelwall in 1978) [ClassicSimilarity], result of:
        4.3212347 = fieldWeight in 1978, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.9139757 = idf(docFreq=119, maxDocs=44421)
          0.625 = fieldNorm(doc=1978)
    
  4. Thelwall, M.: Text characteristics of English language university Web sites (2005) 4.32
    4.3212347 = sum of:
      4.3212347 = weight(author_txt:thelwall in 4463) [ClassicSimilarity], result of:
        4.3212347 = fieldWeight in 4463, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.9139757 = idf(docFreq=119, maxDocs=44421)
          0.625 = fieldNorm(doc=4463)
    
  5. Thelwall, M.: Bibliometrics to webometrics (2009) 4.32
    4.3212347 = sum of:
      4.3212347 = weight(author_txt:thelwall in 5239) [ClassicSimilarity], result of:
        4.3212347 = fieldWeight in 5239, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.9139757 = idf(docFreq=119, maxDocs=44421)
          0.625 = fieldNorm(doc=5239)
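
The author scores above can be reproduced by hand. The following is a minimal sketch, assuming the listing uses the default formulas of Lucene's ClassicSimilarity (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative and are not part of any library API. The numeric inputs are copied from the listing.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency (assumed default)."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1)) (assumed default)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as shown in each explanation tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the listing: "thelwall" occurs in 119 of 44421 documents.
MAX_DOCS = 44421
DOC_FREQ_THELWALL = 119

# Entry 1: the name matches twice (two authors named Thelwall), fieldNorm 0.5.
score_two_authors = field_weight(2.0, DOC_FREQ_THELWALL, MAX_DOCS, 0.5)

# Entries 2 to 5: a single match, fieldNorm 0.625.
score_single_author = field_weight(1.0, DOC_FREQ_THELWALL, MAX_DOCS, 0.625)

print(f"two matches:  {score_two_authors:.6f}")
print(f"single match: {score_single_author:.6f}")
```

Under these assumptions, entry 1 ranks first because its term frequency is 2 rather than 1, which outweighs its lower fieldNorm.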
    

Similar documents (content)

  1. Freelon, D.; Pruden, M.L.; Malmer, D.; Wu, Q.; Xia, Y.; Johnson, D.; Chen, E.; Crist, A.: What's in your PIE? : understanding the contents of personalized information environments with PIEGraph (2024) 0.17
    0.17405146 = sum of:
      0.17405146 = product of:
        0.7252144 = sum of:
          0.0045061987 = weight(abstract_txt:information in 2356) [ClassicSimilarity], result of:
            0.0045061987 = score(doc=2356,freq=1.0), product of:
              0.029806605 = queryWeight, product of:
                1.0055677 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.012254155 = queryNorm
              0.15118122 = fieldWeight in 2356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0625 = fieldNorm(doc=2356)
          0.019720778 = weight(abstract_txt:data in 2356) [ClassicSimilarity], result of:
            0.019720778 = score(doc=2356,freq=5.0), product of:
              0.042372625 = queryWeight, product of:
                1.0383132 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.012254155 = queryNorm
              0.46541315 = fieldWeight in 2356, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0625 = fieldNorm(doc=2356)
          0.014337119 = weight(abstract_txt:about in 2356) [ClassicSimilarity], result of:
            0.014337119 = score(doc=2356,freq=1.0), product of:
              0.058582414 = queryWeight, product of:
                1.2208697 = boost
                3.9157467 = idf(docFreq=2405, maxDocs=44421)
                0.012254155 = queryNorm
              0.24473417 = fieldWeight in 2356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9157467 = idf(docFreq=2405, maxDocs=44421)
                0.0625 = fieldNorm(doc=2356)
          0.023699071 = weight(abstract_txt:relevant in 2356) [ClassicSimilarity], result of:
            0.023699071 = score(doc=2356,freq=1.0), product of:
              0.08189931 = queryWeight, product of:
                1.4435298 = boost
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.012254155 = queryNorm
              0.2893684 = fieldWeight in 2356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.0625 = fieldNorm(doc=2356)
          0.16768348 = weight(abstract_txt:debates in 2356) [ClassicSimilarity], result of:
            0.16768348 = score(doc=2356,freq=1.0), product of:
              0.35787866 = queryWeight, product of:
                3.895632 = boost
                7.496775 = idf(docFreq=66, maxDocs=44421)
                0.012254155 = queryNorm
              0.46854845 = fieldWeight in 2356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.496775 = idf(docFreq=66, maxDocs=44421)
                0.0625 = fieldNorm(doc=2356)
          0.49526778 = weight(abstract_txt:feeds in 2356) [ClassicSimilarity], result of:
            0.49526778 = score(doc=2356,freq=2.0), product of:
              0.6541364 = queryWeight, product of:
                6.2317243 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.012254155 = queryNorm
              0.75713223 = fieldWeight in 2356, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.0625 = fieldNorm(doc=2356)
        0.24 = coord(6/25)
    
  2. Farooq, U.; Ganoe, C.H.; Carroll, J.M.; Councill, I.G.; Giles, C.L.: Design and evaluation of awareness mechanisms in CiteSeer (2008) 0.16
    0.15803525 = sum of:
      0.15803525 = product of:
        0.7901762 = sum of:
          0.0063727275 = weight(abstract_txt:information in 3051) [ClassicSimilarity], result of:
            0.0063727275 = score(doc=3051,freq=2.0), product of:
              0.029806605 = queryWeight, product of:
                1.0055677 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.012254155 = queryNorm
              0.21380253 = fieldWeight in 3051, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0625 = fieldNorm(doc=3051)
          0.013633212 = weight(abstract_txt:science in 3051) [ClassicSimilarity], result of:
            0.013633212 = score(doc=3051,freq=1.0), product of:
              0.0566489 = queryWeight, product of:
                1.2005532 = boost
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.012254155 = queryNorm
              0.24066156 = fieldWeight in 3051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.0625 = fieldNorm(doc=3051)
          0.04605683 = weight(abstract_txt:investigating in 3051) [ClassicSimilarity], result of:
            0.04605683 = score(doc=3051,freq=1.0), product of:
              0.111418754 = queryWeight, product of:
                1.3747356 = boost
                6.613871 = idf(docFreq=161, maxDocs=44421)
                0.012254155 = queryNorm
              0.41336694 = fieldWeight in 3051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.613871 = idf(docFreq=161, maxDocs=44421)
                0.0625 = fieldNorm(doc=3051)
          0.023699071 = weight(abstract_txt:relevant in 3051) [ClassicSimilarity], result of:
            0.023699071 = score(doc=3051,freq=1.0), product of:
              0.08189931 = queryWeight, product of:
                1.4435298 = boost
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.012254155 = queryNorm
              0.2893684 = fieldWeight in 3051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.0625 = fieldNorm(doc=3051)
          0.70041436 = weight(abstract_txt:feeds in 3051) [ClassicSimilarity], result of:
            0.70041436 = score(doc=3051,freq=4.0), product of:
              0.6541364 = queryWeight, product of:
                6.2317243 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.012254155 = queryNorm
              1.0707467 = fieldWeight in 3051, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.0625 = fieldNorm(doc=3051)
        0.2 = coord(5/25)
    
  3. Thelwall, M.; Prabowo, R.: Identifying and characterizing public science-related fears from RSS feeds (2007) 0.12
    0.12305709 = sum of:
      0.12305709 = product of:
        0.61528546 = sum of:
          0.041743018 = weight(abstract_txt:science in 1137) [ClassicSimilarity], result of:
            0.041743018 = score(doc=1137,freq=6.0), product of:
              0.0566489 = queryWeight, product of:
                1.2005532 = boost
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.012254155 = queryNorm
              0.73687255 = fieldWeight in 1137, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.078125 = fieldNorm(doc=1137)
          0.050380602 = weight(abstract_txt:concern in 1137) [ClassicSimilarity], result of:
            0.050380602 = score(doc=1137,freq=1.0), product of:
              0.10193684 = queryWeight, product of:
                1.314939 = boost
                6.326189 = idf(docFreq=215, maxDocs=44421)
                0.012254155 = queryNorm
              0.49423352 = fieldWeight in 1137, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.326189 = idf(docFreq=215, maxDocs=44421)
                0.078125 = fieldNorm(doc=1137)
          0.02717816 = weight(abstract_txt:method in 1137) [ClassicSimilarity], result of:
            0.02717816 = score(doc=1137,freq=1.0), product of:
              0.07732728 = queryWeight, product of:
                1.4026588 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.012254155 = queryNorm
              0.35146925 = fieldWeight in 1137, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.078125 = fieldNorm(doc=1137)
          0.05822468 = weight(abstract_txt:public in 1137) [ClassicSimilarity], result of:
            0.05822468 = score(doc=1137,freq=4.0), product of:
              0.08095384 = queryWeight, product of:
                1.4351734 = boost
                4.603092 = idf(docFreq=1209, maxDocs=44421)
                0.012254155 = queryNorm
              0.71923316 = fieldWeight in 1137, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.603092 = idf(docFreq=1209, maxDocs=44421)
                0.078125 = fieldNorm(doc=1137)
          0.43775898 = weight(abstract_txt:feeds in 1137) [ClassicSimilarity], result of:
            0.43775898 = score(doc=1137,freq=1.0), product of:
              0.6541364 = queryWeight, product of:
                6.2317243 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.012254155 = queryNorm
              0.66921663 = fieldWeight in 1137, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.078125 = fieldNorm(doc=1137)
        0.2 = coord(5/25)
    
  4. Cornelius, I.: Theorizing information for information science (2002) 0.11
    0.10718271 = sum of:
      0.10718271 = product of:
        0.33494598 = sum of:
          0.016422141 = weight(abstract_txt:information in 5244) [ClassicSimilarity], result of:
            0.016422141 = score(doc=5244,freq=34.0), product of:
              0.029806605 = queryWeight, product of:
                1.0055677 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.012254155 = queryNorm
              0.5509565 = fieldWeight in 5244, product of:
                5.8309517 = tf(freq=34.0), with freq of:
                  34.0 = termFreq=34.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0390625 = fieldNorm(doc=5244)
          0.00954728 = weight(abstract_txt:data in 5244) [ClassicSimilarity], result of:
            0.00954728 = score(doc=5244,freq=3.0), product of:
              0.042372625 = queryWeight, product of:
                1.0383132 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.012254155 = queryNorm
              0.22531718 = fieldWeight in 5244, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0390625 = fieldNorm(doc=5244)
          0.013168565 = weight(abstract_txt:source in 5244) [ClassicSimilarity], result of:
            0.013168565 = score(doc=5244,freq=1.0), product of:
              0.06615072 = queryWeight, product of:
                1.0592716 = boost
                5.0961695 = idf(docFreq=738, maxDocs=44421)
                0.012254155 = queryNorm
              0.19906911 = fieldWeight in 5244, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0961695 = idf(docFreq=738, maxDocs=44421)
                0.0390625 = fieldNorm(doc=5244)
          0.013503571 = weight(abstract_txt:discussion in 5244) [ClassicSimilarity], result of:
            0.013503571 = score(doc=5244,freq=1.0), product of:
              0.067267925 = queryWeight, product of:
                1.068179 = boost
                5.1390233 = idf(docFreq=707, maxDocs=44421)
                0.012254155 = queryNorm
              0.2007431 = fieldWeight in 5244, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1390233 = idf(docFreq=707, maxDocs=44421)
                0.0390625 = fieldNorm(doc=5244)
          0.025562273 = weight(abstract_txt:science in 5244) [ClassicSimilarity], result of:
            0.025562273 = score(doc=5244,freq=9.0), product of:
              0.0566489 = queryWeight, product of:
                1.2005532 = boost
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.012254155 = queryNorm
              0.45124042 = fieldWeight in 5244, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.0390625 = fieldNorm(doc=5244)
          0.012672342 = weight(abstract_txt:about in 5244) [ClassicSimilarity], result of:
            0.012672342 = score(doc=5244,freq=2.0), product of:
              0.058582414 = queryWeight, product of:
                1.2208697 = boost
                3.9157467 = idf(docFreq=2405, maxDocs=44421)
                0.012254155 = queryNorm
              0.21631649 = fieldWeight in 5244, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9157467 = idf(docFreq=2405, maxDocs=44421)
                0.0390625 = fieldNorm(doc=5244)
          0.025190301 = weight(abstract_txt:concern in 5244) [ClassicSimilarity], result of:
            0.025190301 = score(doc=5244,freq=1.0), product of:
              0.10193684 = queryWeight, product of:
                1.314939 = boost
                6.326189 = idf(docFreq=215, maxDocs=44421)
                0.012254155 = queryNorm
              0.24711676 = fieldWeight in 5244, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.326189 = idf(docFreq=215, maxDocs=44421)
                0.0390625 = fieldNorm(doc=5244)
          0.21887949 = weight(abstract_txt:feeds in 5244) [ClassicSimilarity], result of:
            0.21887949 = score(doc=5244,freq=1.0), product of:
              0.6541364 = queryWeight, product of:
                6.2317243 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.012254155 = queryNorm
              0.33460832 = fieldWeight in 5244, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.0390625 = fieldNorm(doc=5244)
        0.32 = coord(8/25)
    
  5. Otterbacher, J.; Radev, D.: Exploring fact-focused relevance and novelty detection (2008) 0.09
    0.09078402 = sum of:
      0.09078402 = product of:
        0.32422864 = sum of:
          0.007804965 = weight(abstract_txt:information in 3210) [ClassicSimilarity], result of:
            0.007804965 = score(doc=3210,freq=3.0), product of:
              0.029806605 = queryWeight, product of:
                1.0055677 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.012254155 = queryNorm
              0.26185355 = fieldWeight in 3210, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0625 = fieldNorm(doc=3210)
          0.019029919 = weight(abstract_txt:identify in 3210) [ClassicSimilarity], result of:
            0.019029919 = score(doc=3210,freq=1.0), product of:
              0.06180926 = queryWeight, product of:
                1.0239218 = boost
                4.9261017 = idf(docFreq=875, maxDocs=44421)
                0.012254155 = queryNorm
              0.30788136 = fieldWeight in 3210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9261017 = idf(docFreq=875, maxDocs=44421)
                0.0625 = fieldNorm(doc=3210)
          0.040304482 = weight(abstract_txt:concern in 3210) [ClassicSimilarity], result of:
            0.040304482 = score(doc=3210,freq=1.0), product of:
              0.10193684 = queryWeight, product of:
                1.314939 = boost
                6.326189 = idf(docFreq=215, maxDocs=44421)
                0.012254155 = queryNorm
              0.39538682 = fieldWeight in 3210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.326189 = idf(docFreq=215, maxDocs=44421)
                0.0625 = fieldNorm(doc=3210)
          0.033515546 = weight(abstract_txt:relevant in 3210) [ClassicSimilarity], result of:
            0.033515546 = score(doc=3210,freq=2.0), product of:
              0.08189931 = queryWeight, product of:
                1.4435298 = boost
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.012254155 = queryNorm
              0.40922868 = fieldWeight in 3210, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.0625 = fieldNorm(doc=3210)
          0.024752364 = weight(abstract_txt:given in 3210) [ClassicSimilarity], result of:
            0.024752364 = score(doc=3210,freq=1.0), product of:
              0.084308326 = queryWeight, product of:
                1.4646063 = boost
                4.6974936 = idf(docFreq=1100, maxDocs=44421)
                0.012254155 = queryNorm
              0.29359335 = fieldWeight in 3210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6974936 = idf(docFreq=1100, maxDocs=44421)
                0.0625 = fieldNorm(doc=3210)
          0.076377146 = weight(abstract_txt:issue in 3210) [ClassicSimilarity], result of:
            0.076377146 = score(doc=3210,freq=1.0), product of:
              0.23700668 = queryWeight, product of:
                3.7510629 = boost
                5.156118 = idf(docFreq=695, maxDocs=44421)
                0.012254155 = queryNorm
              0.32225737 = fieldWeight in 3210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.156118 = idf(docFreq=695, maxDocs=44421)
                0.0625 = fieldNorm(doc=3210)
          0.12244422 = weight(abstract_txt:broad in 3210) [ClassicSimilarity], result of:
            0.12244422 = score(doc=3210,freq=1.0), product of:
              0.33942288 = queryWeight, product of:
                4.7988877 = boost
                5.7718782 = idf(docFreq=375, maxDocs=44421)
                0.012254155 = queryNorm
              0.3607424 = fieldWeight in 3210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7718782 = idf(docFreq=375, maxDocs=44421)
                0.0625 = fieldNorm(doc=3210)
        0.28 = coord(7/25)
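
The content scores combine more factors than the author scores. The sketch below recomputes the total for the first content match (doc=2356), assuming the same ClassicSimilarity formulas: each term contributes queryWeight * fieldWeight, and the sum is scaled by coord (matched query terms divided by the 25 query terms). The helper names are illustrative. The boosts, document frequencies, queryNorm, and fieldNorm are copied from the listing.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.012254155   # queryNorm shown in every term explanation
FIELD_NORM = 0.0625        # fieldNorm(doc=2356)
QUERY_TERM_COUNT = 25      # denominator of coord(6/25)

# (term, freq in doc 2356, docFreq, boost), copied from the explanation tree.
TERMS_DOC_2356 = [
    ("information", 1.0, 10748, 1.0055677),
    ("data",        5.0, 4320,  1.0383132),
    ("about",       1.0, 2405,  1.2208697),
    ("relevant",    1.0, 1177,  1.4435298),
    ("debates",     1.0, 66,    3.895632),
    ("feeds",       2.0, 22,    6.2317243),
]


def idf(doc_freq: int) -> float:
    """Assumed default idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float) -> float:
    """Per-term score = queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM
    return query_weight * field_weight


raw_sum = sum(term_score(f, d, b) for _, f, d, b in TERMS_DOC_2356)
coord = len(TERMS_DOC_2356) / QUERY_TERM_COUNT
total = raw_sum * coord

print(f"sum of term scores: {raw_sum:.7f}")
print(f"coord:              {coord:.2f}")
print(f"total score:        {total:.8f}")
```

Because idf enters both queryWeight and fieldWeight, rare terms such as "feeds" (docFreq=22) dominate the sum, which is consistent with the per-term values in the listing.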