Document (#34060)

Author
Santini, M.
Title
Zero, single, or multi? : genre of web pages through the users' perspective
Source
Information processing and management. 44(2008) no.2, pp. 702-737
Year
2008
Abstract
The goal of the study presented in this article is to investigate to what extent the classification of a web page by a single genre matches the users' perspective. The extent of agreement on a single genre label for a web page can help understand whether there is a need for a different classification scheme that overrides the single-genre labelling. My hypothesis is that a single genre label does not account for the users' perspective. In order to test this hypothesis, I submitted a restricted number of web pages (25 web pages) to a large number of web users (135 subjects) asking them to assign only a single genre label to each of the web pages. Users could choose from a list of 21 genre labels, or select one of the two 'escape' options, i.e. 'Add a label' and 'I don't know'. The rationale was to observe the level of agreement on a single genre label per web page, and draw some conclusions about the appropriateness of limiting the assignment to only a single label when doing genre classification of web pages. Results show that users largely disagree on the label to be assigned to a web page.
Theme
Social tagging

Similar documents (content)

  1. Rosso, M.A.: User-based identification of Web genres (2008) 0.19
    0.1947559 = sum of:
      0.1947559 = product of:
        0.97377944 = sum of:
          0.021344136 = weight(abstract_txt:classification in 1863) [ClassicSimilarity], result of:
            0.021344136 = score(doc=1863,freq=2.0), product of:
              0.069131635 = queryWeight, product of:
                1.6509287 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.01048938 = queryNorm
              0.3087463 = fieldWeight in 1863, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1863)
          0.037383646 = weight(abstract_txt:users in 1863) [ClassicSimilarity], result of:
            0.037383646 = score(doc=1863,freq=3.0), product of:
              0.110558406 = queryWeight, product of:
                2.9525738 = boost
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.01048938 = queryNorm
              0.33813483 = fieldWeight in 1863, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1863)
          0.09645007 = weight(abstract_txt:page in 1863) [ClassicSimilarity], result of:
            0.09645007 = score(doc=1863,freq=2.0), product of:
              0.20797239 = queryWeight, product of:
                3.306451 = boost
                5.9964437 = idf(docFreq=298, maxDocs=44218)
                0.01048938 = queryNorm
              0.4637638 = fieldWeight in 1863, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9964437 = idf(docFreq=298, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1863)
          0.15572642 = weight(abstract_txt:pages in 1863) [ClassicSimilarity], result of:
            0.15572642 = score(doc=1863,freq=5.0), product of:
              0.22717936 = queryWeight, product of:
                3.8636582 = boost
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.01048938 = queryNorm
              0.68547785 = fieldWeight in 1863, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1863)
          0.6628752 = weight(abstract_txt:genre in 1863) [ClassicSimilarity], result of:
            0.6628752 = score(doc=1863,freq=10.0), product of:
              0.57609755 = queryWeight, product of:
                8.254647 = boost
                6.653462 = idf(docFreq=154, maxDocs=44218)
                0.01048938 = queryNorm
              1.1506301 = fieldWeight in 1863, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                6.653462 = idf(docFreq=154, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1863)
        0.2 = coord(5/25)
    
  2. Tsai, R.T.-H.; Chiu, B.; Wu, C.-E.: Visual webpage block importance prediction using conditional random fields (2011) 0.14
    0.14429468 = sum of:
      0.14429468 = product of:
        0.6012278 = sum of:
          0.049028877 = weight(abstract_txt:labels in 4924) [ClassicSimilarity], result of:
            0.049028877 = score(doc=4924,freq=2.0), product of:
              0.076340914 = queryWeight, product of:
                1.0016314 = boost
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.01048938 = queryNorm
              0.64223593 = fieldWeight in 4924, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.0625 = fieldNorm(doc=4924)
          0.019410757 = weight(abstract_txt:only in 4924) [ClassicSimilarity], result of:
            0.019410757 = score(doc=4924,freq=2.0), product of:
              0.051859427 = queryWeight, product of:
                1.1675032 = boost
                4.234672 = idf(docFreq=1740, maxDocs=44218)
                0.01048938 = queryNorm
              0.37429565 = fieldWeight in 4924, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.234672 = idf(docFreq=1740, maxDocs=44218)
                0.0625 = fieldNorm(doc=4924)
          0.07794342 = weight(abstract_txt:page in 4924) [ClassicSimilarity], result of:
            0.07794342 = score(doc=4924,freq=1.0), product of:
              0.20797239 = queryWeight, product of:
                3.306451 = boost
                5.9964437 = idf(docFreq=298, maxDocs=44218)
                0.01048938 = queryNorm
              0.37477773 = fieldWeight in 4924, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9964437 = idf(docFreq=298, maxDocs=44218)
                0.0625 = fieldNorm(doc=4924)
          0.11256004 = weight(abstract_txt:pages in 4924) [ClassicSimilarity], result of:
            0.11256004 = score(doc=4924,freq=2.0), product of:
              0.22717936 = queryWeight, product of:
                3.8636582 = boost
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.01048938 = queryNorm
              0.49546772 = fieldWeight in 4924, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.0625 = fieldNorm(doc=4924)
          0.10310329 = weight(abstract_txt:single in 4924) [ClassicSimilarity], result of:
            0.10310329 = score(doc=4924,freq=1.0), product of:
              0.31575075 = queryWeight, product of:
                5.7616444 = boost
                5.2245407 = idf(docFreq=646, maxDocs=44218)
                0.01048938 = queryNorm
              0.3265338 = fieldWeight in 4924, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2245407 = idf(docFreq=646, maxDocs=44218)
                0.0625 = fieldNorm(doc=4924)
          0.23918144 = weight(abstract_txt:label in 4924) [ClassicSimilarity], result of:
            0.23918144 = score(doc=4924,freq=1.0), product of:
              0.5292372 = queryWeight, product of:
                6.977558 = boost
                7.230979 = idf(docFreq=86, maxDocs=44218)
                0.01048938 = queryNorm
              0.4519362 = fieldWeight in 4924, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.230979 = idf(docFreq=86, maxDocs=44218)
                0.0625 = fieldNorm(doc=4924)
        0.24 = coord(6/25)
    
  3. Purpura, A.; Silvello, G.; Susto, G.A.: Learning to rank from relevance judgments distributions (2022) 0.10
    0.10491884 = sum of:
      0.10491884 = product of:
        0.5245942 = sum of:
          0.03449953 = weight(abstract_txt:observe in 645) [ClassicSimilarity], result of:
            0.03449953 = score(doc=645,freq=1.0), product of:
              0.07609244 = queryWeight, product of:
                7.2542357 = idf(docFreq=84, maxDocs=44218)
                0.01048938 = queryNorm
              0.45338973 = fieldWeight in 645, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2542357 = idf(docFreq=84, maxDocs=44218)
                0.0625 = fieldNorm(doc=645)
          0.049028877 = weight(abstract_txt:labels in 645) [ClassicSimilarity], result of:
            0.049028877 = score(doc=645,freq=2.0), product of:
              0.076340914 = queryWeight, product of:
                1.0016314 = boost
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.01048938 = queryNorm
              0.64223593 = fieldWeight in 645, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.0625 = fieldNorm(doc=645)
          0.05607428 = weight(abstract_txt:hypothesis in 645) [ClassicSimilarity], result of:
            0.05607428 = score(doc=645,freq=1.0), product of:
              0.13253132 = queryWeight, product of:
                1.8663948 = boost
                6.769634 = idf(docFreq=137, maxDocs=44218)
                0.01048938 = queryNorm
              0.4231021 = fieldWeight in 645, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.769634 = idf(docFreq=137, maxDocs=44218)
                0.0625 = fieldNorm(doc=645)
          0.14581007 = weight(abstract_txt:single in 645) [ClassicSimilarity], result of:
            0.14581007 = score(doc=645,freq=2.0), product of:
              0.31575075 = queryWeight, product of:
                5.7616444 = boost
                5.2245407 = idf(docFreq=646, maxDocs=44218)
                0.01048938 = queryNorm
              0.4617885 = fieldWeight in 645, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2245407 = idf(docFreq=646, maxDocs=44218)
                0.0625 = fieldNorm(doc=645)
          0.23918144 = weight(abstract_txt:label in 645) [ClassicSimilarity], result of:
            0.23918144 = score(doc=645,freq=1.0), product of:
              0.5292372 = queryWeight, product of:
                6.977558 = boost
                7.230979 = idf(docFreq=86, maxDocs=44218)
                0.01048938 = queryNorm
              0.4519362 = fieldWeight in 645, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.230979 = idf(docFreq=86, maxDocs=44218)
                0.0625 = fieldNorm(doc=645)
        0.2 = coord(5/25)
    
  4. Billal, B.; Fonseca, A.; Sadat, F.; Lounis, H.: Semi-supervised learning and social media text analysis towards multi-labeling categorization (2017) 0.10
    0.10347053 = sum of:
      0.10347053 = product of:
        0.51735264 = sum of:
          0.03033507 = weight(abstract_txt:labels in 4095) [ClassicSimilarity], result of:
            0.03033507 = score(doc=4095,freq=1.0), product of:
              0.076340914 = queryWeight, product of:
                1.0016314 = boost
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.01048938 = queryNorm
              0.39736322 = fieldWeight in 4095, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2660704 = idf(docFreq=83, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.011162507 = weight(abstract_txt:number in 4095) [ClassicSimilarity], result of:
            0.011162507 = score(doc=4095,freq=1.0), product of:
              0.0493907 = queryWeight, product of:
                1.1393754 = boost
                4.132649 = idf(docFreq=1927, maxDocs=44218)
                0.01048938 = queryNorm
              0.22600424 = fieldWeight in 4095, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.132649 = idf(docFreq=1927, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.012009794 = weight(abstract_txt:only in 4095) [ClassicSimilarity], result of:
            0.012009794 = score(doc=4095,freq=1.0), product of:
              0.051859427 = queryWeight, product of:
                1.1675032 = boost
                4.234672 = idf(docFreq=1740, maxDocs=44218)
                0.01048938 = queryNorm
              0.23158363 = fieldWeight in 4095, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.234672 = idf(docFreq=1740, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.045277752 = weight(abstract_txt:classification in 4095) [ClassicSimilarity], result of:
            0.045277752 = score(doc=4095,freq=9.0), product of:
              0.069131635 = queryWeight, product of:
                1.6509287 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.01048938 = queryNorm
              0.65494984 = fieldWeight in 4095, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.4185675 = weight(abstract_txt:label in 4095) [ClassicSimilarity], result of:
            0.4185675 = score(doc=4095,freq=4.0), product of:
              0.5292372 = queryWeight, product of:
                6.977558 = boost
                7.230979 = idf(docFreq=86, maxDocs=44218)
                0.01048938 = queryNorm
              0.7908883 = fieldWeight in 4095, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.230979 = idf(docFreq=86, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
        0.2 = coord(5/25)
    
  5. Hajibayova, L.; Jacob, E.K.: User-generated genre tags through the lens of genre theories (2014) 0.10
    0.10131661 = sum of:
      0.10131661 = product of:
        0.8443051 = sum of:
          0.026163103 = weight(abstract_txt:users in 1450) [ClassicSimilarity], result of:
            0.026163103 = score(doc=1450,freq=2.0), product of:
              0.110558406 = queryWeight, product of:
                2.9525738 = boost
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.01048938 = queryNorm
              0.23664509 = fieldWeight in 1450, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.046875 = fieldNorm(doc=1450)
          0.07732747 = weight(abstract_txt:single in 1450) [ClassicSimilarity], result of:
            0.07732747 = score(doc=1450,freq=1.0), product of:
              0.31575075 = queryWeight, product of:
                5.7616444 = boost
                5.2245407 = idf(docFreq=646, maxDocs=44218)
                0.01048938 = queryNorm
              0.24490035 = fieldWeight in 1450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2245407 = idf(docFreq=646, maxDocs=44218)
                0.046875 = fieldNorm(doc=1450)
          0.7408145 = weight(abstract_txt:genre in 1450) [ClassicSimilarity], result of:
            0.7408145 = score(doc=1450,freq=17.0), product of:
              0.57609755 = queryWeight, product of:
                8.254647 = boost
                6.653462 = idf(docFreq=154, maxDocs=44218)
                0.01048938 = queryNorm
              1.2859185 = fieldWeight in 1450, product of:
                4.1231055 = tf(freq=17.0), with freq of:
                  17.0 = termFreq=17.0
                6.653462 = idf(docFreq=154, maxDocs=44218)
                0.046875 = fieldNorm(doc=1450)
        0.12 = coord(3/25)
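
The score trees above all follow the same arithmetic, which can be sketched in a few lines of Python. This is a minimal illustration that reproduces the numbers shown in the listing, not the search engine's actual code: the function names (`classic_idf`, `term_score`, `document_score`) are hypothetical, and the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumed from the values printed in the trees, which they match.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as it appears in the trees: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float, boost: float = 1.0) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                       # tf(freq) in the listing
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm    # "queryWeight, product of"
    field_weight = tf * idf * field_norm       # "fieldWeight in <doc>, product of"
    return query_weight * field_weight


def document_score(term_scores: list[float], matched: int, total_terms: int) -> float:
    """Sum of the term scores scaled by coord(matched/total_terms)."""
    return sum(term_scores) * (matched / total_terms)


# Term "classification" in doc 1863 (first entry), values copied from the listing.
classification = term_score(freq=2.0, doc_freq=2218, max_docs=44218,
                            query_norm=0.01048938, field_norm=0.0546875,
                            boost=1.6509287)

# Final score of doc 1863 from its five listed term scores and coord(5/25).
doc_1863 = document_score(
    [0.021344136, 0.037383646, 0.09645007, 0.15572642, 0.6628752],
    matched=5, total_terms=25)

print(classification, doc_1863)
```

Where a tree shows no boost line (the term "observe" in entry 3), the listed queryWeight equals idf times queryNorm, so the sketch treats the boost as 1.0 by default.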