Document (#34060)

Author
Santini, M.
Title
Zero, single, or multi? : genre of web pages through the users' perspective
Source
Information processing and management. 44(2008) no.2, pp.702-737
Year
2008
Abstract
The goal of the study presented in this article is to investigate to what extent the classification of a web page by a single genre matches the users' perspective. The extent of agreement on a single genre label for a web page can help understand whether there is a need for a different classification scheme that overrides the single-genre labelling. My hypothesis is that a single genre label does not account for the users' perspective. In order to test this hypothesis, I submitted a restricted number of web pages (25 web pages) to a large number of web users (135 subjects) asking them to assign only a single genre label to each of the web pages. Users could choose from a list of 21 genre labels, or select one of the two 'escape' options, i.e. 'Add a label' and 'I don't know'. The rationale was to observe the level of agreement on a single genre label per web page, and draw some conclusions about the appropriateness of limiting the assignment to only a single label when doing genre classification of web pages. Results show that users largely disagree on the label to be assigned to a web page.
Theme
Social tagging

Similar documents (content)

  1. Rosso, M.A.: User-based identification of Web genres (2008) 0.19
    0.1946782 = sum of:
      0.1946782 = product of:
        0.973391 = sum of:
          0.021359863 = weight(abstract_txt:classification in 2863) [ClassicSimilarity], result of:
            0.021359863 = score(doc=2863,freq=2.0), product of:
              0.069181114 = queryWeight, product of:
                1.6604408 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.010436534 = queryNorm
              0.3087528 = fieldWeight in 2863, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2863)
          0.037330773 = weight(abstract_txt:users in 2863) [ClassicSimilarity], result of:
            0.037330773 = score(doc=2863,freq=3.0), product of:
              0.11047893 = queryWeight, product of:
                2.9674563 = boost
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.010436534 = queryNorm
              0.33789948 = fieldWeight in 2863, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2863)
          0.09609511 = weight(abstract_txt:page in 2863) [ClassicSimilarity], result of:
            0.09609511 = score(doc=2863,freq=2.0), product of:
              0.20750839 = queryWeight, product of:
                3.3206017 = boost
                5.987735 = idf(docFreq=302, maxDocs=44421)
                0.010436534 = queryNorm
              0.46309024 = fieldWeight in 2863, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.987735 = idf(docFreq=302, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2863)
          0.1558368 = weight(abstract_txt:pages in 2863) [ClassicSimilarity], result of:
            0.1558368 = score(doc=2863,freq=5.0), product of:
              0.22733772 = queryWeight, product of:
                3.885883 = boost
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.010436534 = queryNorm
              0.6854859 = fieldWeight in 2863, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2863)
          0.6627684 = weight(abstract_txt:genre in 2863) [ClassicSimilarity], result of:
            0.6627684 = score(doc=2863,freq=10.0), product of:
              0.5761649 = queryWeight, product of:
                8.299724 = boost
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.010436534 = queryNorm
              1.1503103 = fieldWeight in 2863, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2863)
        0.2 = coord(5/25)
    
  2. Tsai, R.T.-H.; Chiu, B.; Wu, C.-E.: Visual webpage block importance prediction using conditional random fields (2011) 0.14
    0.14438848 = sum of:
      0.14438848 = product of:
        0.6016187 = sum of:
          0.048915096 = weight(abstract_txt:labels in 924) [ClassicSimilarity], result of:
            0.048915096 = score(doc=924,freq=2.0), product of:
              0.07623986 = queryWeight, product of:
                1.0063754 = boost
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.010436534 = queryNorm
              0.64159477 = fieldWeight in 924, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.0625 = fieldNorm(doc=924)
          0.019439526 = weight(abstract_txt:only in 924) [ClassicSimilarity], result of:
            0.019439526 = score(doc=924,freq=2.0), product of:
              0.051922303 = queryWeight, product of:
                1.1745214 = boost
                4.235812 = idf(docFreq=1746, maxDocs=44421)
                0.010436534 = queryNorm
              0.37439644 = fieldWeight in 924, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.235812 = idf(docFreq=1746, maxDocs=44421)
                0.0625 = fieldNorm(doc=924)
          0.077656575 = weight(abstract_txt:page in 924) [ClassicSimilarity], result of:
            0.077656575 = score(doc=924,freq=1.0), product of:
              0.20750839 = queryWeight, product of:
                3.3206017 = boost
                5.987735 = idf(docFreq=302, maxDocs=44421)
                0.010436534 = queryNorm
              0.37423342 = fieldWeight in 924, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.987735 = idf(docFreq=302, maxDocs=44421)
                0.0625 = fieldNorm(doc=924)
          0.11263982 = weight(abstract_txt:pages in 924) [ClassicSimilarity], result of:
            0.11263982 = score(doc=924,freq=2.0), product of:
              0.22733772 = queryWeight, product of:
                3.885883 = boost
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.010436534 = queryNorm
              0.49547353 = fieldWeight in 924, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.0625 = fieldNorm(doc=924)
          0.10317004 = weight(abstract_txt:single in 924) [ClassicSimilarity], result of:
            0.10317004 = score(doc=924,freq=1.0), product of:
              0.3159579 = queryWeight, product of:
                5.794668 = boost
                5.2244954 = idf(docFreq=649, maxDocs=44421)
                0.010436534 = queryNorm
              0.32653096 = fieldWeight in 924, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2244954 = idf(docFreq=649, maxDocs=44421)
                0.0625 = fieldNorm(doc=924)
          0.23979765 = weight(abstract_txt:label in 924) [ClassicSimilarity], result of:
            0.23979765 = score(doc=924,freq=1.0), product of:
              0.5302648 = queryWeight, product of:
                7.022058 = boost
                7.2355595 = idf(docFreq=86, maxDocs=44421)
                0.010436534 = queryNorm
              0.45222247 = fieldWeight in 924, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2355595 = idf(docFreq=86, maxDocs=44421)
                0.0625 = fieldNorm(doc=924)
        0.24 = coord(6/25)
    
  3. Purpura, A.; Silvello, G.; Susto, G.A.: Learning to rank from relevance judgments distributions (2022) 0.10
    0.10495565 = sum of:
      0.10495565 = product of:
        0.52477825 = sum of:
          0.033935 = weight(abstract_txt:observe in 1646) [ClassicSimilarity], result of:
            0.033935 = score(doc=1646,freq=1.0), product of:
              0.075276956 = queryWeight, product of:
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.010436534 = queryNorm
              0.45080194 = fieldWeight in 1646, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.0625 = fieldNorm(doc=1646)
          0.048915096 = weight(abstract_txt:labels in 1646) [ClassicSimilarity], result of:
            0.048915096 = score(doc=1646,freq=2.0), product of:
              0.07623986 = queryWeight, product of:
                1.0063754 = boost
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.010436534 = queryNorm
              0.64159477 = fieldWeight in 1646, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.0625 = fieldNorm(doc=1646)
          0.05622601 = weight(abstract_txt:hypothesis in 1646) [ClassicSimilarity], result of:
            0.05622601 = score(doc=1646,freq=1.0), product of:
              0.13280009 = queryWeight, product of:
                1.8783786 = boost
                6.774214 = idf(docFreq=137, maxDocs=44421)
                0.010436534 = queryNorm
              0.42338836 = fieldWeight in 1646, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.774214 = idf(docFreq=137, maxDocs=44421)
                0.0625 = fieldNorm(doc=1646)
          0.14590447 = weight(abstract_txt:single in 1646) [ClassicSimilarity], result of:
            0.14590447 = score(doc=1646,freq=2.0), product of:
              0.3159579 = queryWeight, product of:
                5.794668 = boost
                5.2244954 = idf(docFreq=649, maxDocs=44421)
                0.010436534 = queryNorm
              0.4617845 = fieldWeight in 1646, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2244954 = idf(docFreq=649, maxDocs=44421)
                0.0625 = fieldNorm(doc=1646)
          0.23979765 = weight(abstract_txt:label in 1646) [ClassicSimilarity], result of:
            0.23979765 = score(doc=1646,freq=1.0), product of:
              0.5302648 = queryWeight, product of:
                7.022058 = boost
                7.2355595 = idf(docFreq=86, maxDocs=44421)
                0.010436534 = queryNorm
              0.45222247 = fieldWeight in 1646, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2355595 = idf(docFreq=86, maxDocs=44421)
                0.0625 = fieldNorm(doc=1646)
        0.2 = coord(5/25)
    
  4. Billal, B.; Fonseca, A.; Sadat, F.; Lounis, H.: Semi-supervised learning and social media text analysis towards multi-labeling categorization (2017) 0.10
    0.10368878 = sum of:
      0.10368878 = product of:
        0.5184439 = sum of:
          0.030264672 = weight(abstract_txt:labels in 95) [ClassicSimilarity], result of:
            0.030264672 = score(doc=95,freq=1.0), product of:
              0.07623986 = queryWeight, product of:
                1.0063754 = boost
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.010436534 = queryNorm
              0.39696652 = fieldWeight in 95, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.0546875 = fieldNorm(doc=95)
          0.0111945765 = weight(abstract_txt:number in 95) [ClassicSimilarity], result of:
            0.0111945765 = score(doc=95,freq=1.0), product of:
              0.049496356 = queryWeight, product of:
                1.1467549 = boost
                4.1356745 = idf(docFreq=1930, maxDocs=44421)
                0.010436534 = queryNorm
              0.2261697 = fieldWeight in 95, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1356745 = idf(docFreq=1930, maxDocs=44421)
                0.0546875 = fieldNorm(doc=95)
          0.012027592 = weight(abstract_txt:only in 95) [ClassicSimilarity], result of:
            0.012027592 = score(doc=95,freq=1.0), product of:
              0.051922303 = queryWeight, product of:
                1.1745214 = boost
                4.235812 = idf(docFreq=1746, maxDocs=44421)
                0.010436534 = queryNorm
              0.23164597 = fieldWeight in 95, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.235812 = idf(docFreq=1746, maxDocs=44421)
                0.0546875 = fieldNorm(doc=95)
          0.045311116 = weight(abstract_txt:classification in 95) [ClassicSimilarity], result of:
            0.045311116 = score(doc=95,freq=9.0), product of:
              0.069181114 = queryWeight, product of:
                1.6604408 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.010436534 = queryNorm
              0.6549637 = fieldWeight in 95, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0546875 = fieldNorm(doc=95)
          0.4196459 = weight(abstract_txt:label in 95) [ClassicSimilarity], result of:
            0.4196459 = score(doc=95,freq=4.0), product of:
              0.5302648 = queryWeight, product of:
                7.022058 = boost
                7.2355595 = idf(docFreq=86, maxDocs=44421)
                0.010436534 = queryNorm
              0.79138935 = fieldWeight in 95, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.2355595 = idf(docFreq=86, maxDocs=44421)
                0.0546875 = fieldNorm(doc=95)
        0.2 = coord(5/25)
    
  5. Hajibayova, L.; Jacob, E.K.: User-generated genre tags through the lens of genre theories (2014) 0.10
    0.101303846 = sum of:
      0.101303846 = product of:
        0.8441987 = sum of:
          0.0261261 = weight(abstract_txt:users in 2450) [ClassicSimilarity], result of:
            0.0261261 = score(doc=2450,freq=2.0), product of:
              0.11047893 = queryWeight, product of:
                2.9674563 = boost
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.010436534 = queryNorm
              0.23648039 = fieldWeight in 2450, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.046875 = fieldNorm(doc=2450)
          0.07737753 = weight(abstract_txt:single in 2450) [ClassicSimilarity], result of:
            0.07737753 = score(doc=2450,freq=1.0), product of:
              0.3159579 = queryWeight, product of:
                5.794668 = boost
                5.2244954 = idf(docFreq=649, maxDocs=44421)
                0.010436534 = queryNorm
              0.24489823 = fieldWeight in 2450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2244954 = idf(docFreq=649, maxDocs=44421)
                0.046875 = fieldNorm(doc=2450)
          0.74069506 = weight(abstract_txt:genre in 2450) [ClassicSimilarity], result of:
            0.74069506 = score(doc=2450,freq=17.0), product of:
              0.5761649 = queryWeight, product of:
                8.299724 = boost
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.010436534 = queryNorm
              1.2855608 = fieldWeight in 2450, product of:
                4.1231055 = tf(freq=17.0), with freq of:
                  17.0 = termFreq=17.0
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.046875 = fieldNorm(doc=2450)
        0.12 = coord(3/25)
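
How to read the score trees above: each tree appears to follow the classic Lucene TF-IDF scheme (ClassicSimilarity), where tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and the final score is coord times the sum of queryWeight * fieldWeight over the matching terms. The sketch below recomputes the score of result 1 (doc 2863) from the numbers printed in its tree. The function and variable names are illustrative assumptions, not part of the catalog software, and the result may differ from the listed value in the last digits because Lucene computes in single precision.

```python
import math


def idf(doc_freq, max_docs):
    """Inverse document frequency as printed in the trees: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one matching term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                          # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm  # "queryWeight, product of"
    field_weight = tf * term_idf * field_norm     # "fieldWeight in <doc>, product of"
    return query_weight * field_weight


# Constants copied from the tree for result 1 (doc 2863).
MAX_DOCS = 44421
QUERY_NORM = 0.010436534
FIELD_NORM = 0.0546875

# (term, termFreq, docFreq, boost), also copied from the tree for result 1.
terms = [
    ("classification", 2.0, 2228, 1.6604408),
    ("users", 3.0, 3408, 2.9674563),
    ("page", 2.0, 302, 3.3206017),
    ("pages", 5.0, 443, 3.885883),
    ("genre", 10.0, 155, 8.299724),
]

total = sum(
    term_score(freq, doc_freq, MAX_DOCS, boost, QUERY_NORM, FIELD_NORM)
    for _, freq, doc_freq, boost in terms
)
coord = 5 / 25          # 5 of the 25 query terms matched this document
score = coord * total   # compare with the 0.1946782 listed for result 1

print(f"sum of term scores: {total:.6f}")
print(f"final score:        {score:.6f}")
```

The same function can be pointed at any other tree by swapping in that document's fieldNorm, coord fraction, and term rows. Terms without a printed boost line (such as "observe" in result 3) can be treated as boost 1.0, since their queryWeight is shown as the product of idf and queryNorm only.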