Document (#33864)

Rosso, M.A.
User-based identification of Web genres
Journal of the American Society for Information Science and Technology. 59(2008) no.7, p.1053-1072
This research explores the use of genre as a document descriptor in order to improve the effectiveness of Web searching. A major issue to be resolved is the identification of what document categories should be used as genres. As genre is a kind of folk typology, document categories must enjoy widespread recognition by their intended user groups in order to qualify as genres. Three user studies were conducted to develop a genre palette and show that it is recognizable to users. (Palette is a term used to denote a classification, attributable to Karlgren, Bretan, Dewe, Hallberg, and Wolkert, 1998.) To simplify the users' classification task, it was decided to focus on Web pages from the edu domain. The first study was a survey of user terminology for Web pages. Three participants separated 100 Web page printouts into stacks according to genre, assigning names and definitions to each genre. The second study aimed to refine the resulting set of 48 (often conceptually and lexically similar) genre names and definitions into a smaller palette of user-preferred terminology. Ten participants classified the same 100 Web pages. A set of five principles for creating a genre palette from individuals' sortings was developed, and the list of 48 was trimmed to 18 genres. The third study aimed to show that users would agree on the genres of Web pages when choosing from the genre palette. In an online experiment in which 257 participants categorized a new set of 55 pages using the 18 genres, on average, over 70% agreed on the genre of each page. Suggestions for improving the genre palette and future directions for the work are discussed.

Similar documents (author)

  1. Panicheva, P.; Cardiff, J.; Rosso, P.: Identifying subjective statements in news titles using a personal sense annotation framework (2013) 3.72
    3.7161405 = sum of:
      3.7161405 = weight(author_txt:rosso in 1968) [ClassicSimilarity], result of:
        3.7161405 = fieldWeight in 1968, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.375 = fieldNorm(doc=1968)
  2. Gupta, P.; Banchs, R.E.; Rosso, P.: Continuous space models for CLIR (2017) 3.72
    3.7161405 = sum of:
      3.7161405 = weight(author_txt:rosso in 4295) [ClassicSimilarity], result of:
        3.7161405 = fieldWeight in 4295, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.375 = fieldNorm(doc=4295)
  3. Giachanou, A.; Rosso, P.; Crestani, F.: The impact of emotional signals on credibility assessment (2021) 3.72
    3.7161405 = sum of:
      3.7161405 = weight(author_txt:rosso in 1329) [ClassicSimilarity], result of:
        3.7161405 = fieldWeight in 1329, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.375 = fieldNorm(doc=1329)
  4. Paramita, M.L.; Kasinidou, M.; Kleanthous, S.; Rosso, P.; Kuflik, T.; Hopfgartner, F.: Towards improving user awareness of search engine biases : a participatory design approach (2024) 2.48
    2.477427 = sum of:
      2.477427 = weight(author_txt:rosso in 2274) [ClassicSimilarity], result of:
        2.477427 = fieldWeight in 2274, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.25 = fieldNorm(doc=2274)
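
Each author-match explanation above reduces to a single product, fieldWeight = tf * idf * fieldNorm. The following is a minimal Python sketch of that arithmetic, assuming the classic Lucene TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the idf value printed in the listing. The function names are illustrative and are not Lucene API calls. fieldNorm is copied from the listing rather than computed, because it is a lossy length norm stored at index time.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explanation tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the listing: author_txt:rosso, docFreq=5, maxDocs=44421.
idf_rosso = classic_idf(5, 44421)                # listing shows 9.909708
w_entry_1 = field_weight(1.0, 5, 44421, 0.375)   # listing shows 3.7161405
w_entry_4 = field_weight(1.0, 5, 44421, 0.25)    # listing shows 2.477427

print(idf_rosso, w_entry_1, w_entry_4)
```

Since tf and idf are identical for all four hits, the only factor that separates entry 4 from entries 1 to 3 is fieldNorm (0.25 versus 0.375); a smaller norm generally corresponds to a longer author field.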

Similar documents (content)

  1. Santini, M.: Zero, single, or multi? : genre of web pages through the users' perspective (2008) 0.26
    0.25935292 = sum of:
      0.25935292 = product of:
        0.9262604 = sum of:
          0.013479877 = weight(abstract_txt:show in 3059) [ClassicSimilarity], result of:
            0.013479877 = score(doc=3059,freq=1.0), product of:
              0.049006656 = queryWeight, product of:
                1.0051342 = boost
                4.400995 = idf(docFreq=1480, maxDocs=44421)
                0.011078479 = queryNorm
              0.27506217 = fieldWeight in 3059, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.400995 = idf(docFreq=1480, maxDocs=44421)
                0.0625 = fieldNorm(doc=3059)
          0.013923046 = weight(abstract_txt:order in 3059) [ClassicSimilarity], result of:
            0.013923046 = score(doc=3059,freq=1.0), product of:
              0.05007496 = queryWeight, product of:
                1.0160308 = boost
                4.448705 = idf(docFreq=1411, maxDocs=44421)
                0.011078479 = queryNorm
              0.27804407 = fieldWeight in 3059, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.448705 = idf(docFreq=1411, maxDocs=44421)
                0.0625 = fieldNorm(doc=3059)
          0.009448705 = weight(abstract_txt:study in 3059) [ClassicSimilarity], result of:
            0.009448705 = score(doc=3059,freq=1.0), product of:
              0.044266623 = queryWeight, product of:
                1.1699852 = boost
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.011078479 = queryNorm
              0.21344988 = fieldWeight in 3059, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.0625 = fieldNorm(doc=3059)
          0.024078317 = weight(abstract_txt:users in 3059) [ClassicSimilarity], result of:
            0.024078317 = score(doc=3059,freq=5.0), product of:
              0.048297234 = queryWeight, product of:
                1.2220904 = boost
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.011078479 = queryNorm
              0.49854442 = fieldWeight in 3059, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.0625 = fieldNorm(doc=3059)
          0.067897074 = weight(abstract_txt:page in 3059) [ClassicSimilarity], result of:
            0.067897074 = score(doc=3059,freq=4.0), product of:
              0.09071487 = queryWeight, product of:
                1.3675267 = boost
                5.987735 = idf(docFreq=302, maxDocs=44421)
                0.011078479 = queryNorm
              0.74846685 = fieldWeight in 3059, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.987735 = idf(docFreq=302, maxDocs=44421)
                0.0625 = fieldNorm(doc=3059)
          0.1392771 = weight(abstract_txt:pages in 3059) [ClassicSimilarity], result of:
            0.1392771 = score(doc=3059,freq=4.0), product of:
              0.19876699 = queryWeight, product of:
                3.2006538 = boost
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.011078479 = queryNorm
              0.7007054 = fieldWeight in 3059, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.0625 = fieldNorm(doc=3059)
          0.6581563 = weight(abstract_txt:genre in 3059) [ClassicSimilarity], result of:
            0.6581563 = score(doc=3059,freq=8.0), product of:
              0.55972815 = queryWeight, product of:
                7.595741 = boost
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.011078479 = queryNorm
              1.1758499 = fieldWeight in 3059, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.0625 = fieldNorm(doc=3059)
        0.28 = coord(7/25)
  2. Hajibayova, L.; Jacob, E.K.: User-generated genre tags through the lens of genre theories (2014) 0.25
    0.25202802 = sum of:
      0.25202802 = product of:
        1.0501168 = sum of:
          0.010021865 = weight(abstract_txt:study in 2450) [ClassicSimilarity], result of:
            0.010021865 = score(doc=2450,freq=2.0), product of:
              0.044266623 = queryWeight, product of:
                1.1699852 = boost
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.011078479 = queryNorm
              0.22639778 = fieldWeight in 2450, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.046875 = fieldNorm(doc=2450)
          0.016464816 = weight(abstract_txt:categories in 2450) [ClassicSimilarity], result of:
            0.016464816 = score(doc=2450,freq=1.0), product of:
              0.0678362 = queryWeight, product of:
                1.1825713 = boost
                5.177905 = idf(docFreq=680, maxDocs=44421)
                0.011078479 = queryNorm
              0.2427143 = fieldWeight in 2450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.177905 = idf(docFreq=680, maxDocs=44421)
                0.046875 = fieldNorm(doc=2450)
          0.011421349 = weight(abstract_txt:users in 2450) [ClassicSimilarity], result of:
            0.011421349 = score(doc=2450,freq=2.0), product of:
              0.048297234 = queryWeight, product of:
                1.2220904 = boost
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.011078479 = queryNorm
              0.23648039 = fieldWeight in 2450, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.046875 = fieldNorm(doc=2450)
          0.02561216 = weight(abstract_txt:user in 2450) [ClassicSimilarity], result of:
            0.02561216 = score(doc=2450,freq=3.0), product of:
              0.0857026 = queryWeight, product of:
                2.1016653 = boost
                3.6808684 = idf(docFreq=3042, maxDocs=44421)
                0.011078479 = queryNorm
              0.29884928 = fieldWeight in 2450, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6808684 = idf(docFreq=3042, maxDocs=44421)
                0.046875 = fieldNorm(doc=2450)
          0.2670319 = weight(abstract_txt:genres in 2450) [ClassicSimilarity], result of:
            0.2670319 = score(doc=2450,freq=4.0), product of:
              0.39489907 = queryWeight, product of:
                4.9419713 = boost
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.011078479 = queryNorm
              0.6762029 = fieldWeight in 2450, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.046875 = fieldNorm(doc=2450)
          0.7195646 = weight(abstract_txt:genre in 2450) [ClassicSimilarity], result of:
            0.7195646 = score(doc=2450,freq=17.0), product of:
              0.55972815 = queryWeight, product of:
                7.595741 = boost
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.011078479 = queryNorm
              1.2855608 = fieldWeight in 2450, product of:
                4.1231055 = tf(freq=17.0), with freq of:
                  17.0 = termFreq=17.0
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.046875 = fieldNorm(doc=2450)
        0.24 = coord(6/25)
  3. Dillon, A.; Gushrowski, B.A.: Genres and the Web : is the personal home page the first uniquely digital genre? (2000) 0.24
    0.24238314 = sum of:
      0.24238314 = product of:
        1.0099298 = sum of:
          0.07537991 = weight(abstract_txt:recognizable in 5389) [ClassicSimilarity], result of:
            0.07537991 = score(doc=5389,freq=1.0), product of:
              0.10560508 = queryWeight, product of:
                1.0433354 = boost
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.011078479 = queryNorm
              0.71379054 = fieldWeight in 5389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.078125 = fieldNorm(doc=5389)
          0.042435672 = weight(abstract_txt:page in 5389) [ClassicSimilarity], result of:
            0.042435672 = score(doc=5389,freq=1.0), product of:
              0.09071487 = queryWeight, product of:
                1.3675267 = boost
                5.987735 = idf(docFreq=302, maxDocs=44421)
                0.011078479 = queryNorm
              0.4677918 = fieldWeight in 5389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.987735 = idf(docFreq=302, maxDocs=44421)
                0.078125 = fieldNorm(doc=5389)
          0.04268693 = weight(abstract_txt:user in 5389) [ClassicSimilarity], result of:
            0.04268693 = score(doc=5389,freq=3.0), product of:
              0.0857026 = queryWeight, product of:
                2.1016653 = boost
                3.6808684 = idf(docFreq=3042, maxDocs=44421)
                0.011078479 = queryNorm
              0.4980821 = fieldWeight in 5389, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6808684 = idf(docFreq=3042, maxDocs=44421)
                0.078125 = fieldNorm(doc=5389)
          0.12310473 = weight(abstract_txt:pages in 5389) [ClassicSimilarity], result of:
            0.12310473 = score(doc=5389,freq=2.0), product of:
              0.19876699 = queryWeight, product of:
                3.2006538 = boost
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.011078479 = queryNorm
              0.6193419 = fieldWeight in 5389, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.078125 = fieldNorm(doc=5389)
          0.22252658 = weight(abstract_txt:genres in 5389) [ClassicSimilarity], result of:
            0.22252658 = score(doc=5389,freq=1.0), product of:
              0.39489907 = queryWeight, product of:
                4.9419713 = boost
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.011078479 = queryNorm
              0.56350243 = fieldWeight in 5389, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.078125 = fieldNorm(doc=5389)
          0.503796 = weight(abstract_txt:genre in 5389) [ClassicSimilarity], result of:
            0.503796 = score(doc=5389,freq=3.0), product of:
              0.55972815 = queryWeight, product of:
                7.595741 = boost
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.011078479 = queryNorm
              0.9000726 = fieldWeight in 5389, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.078125 = fieldNorm(doc=5389)
        0.24 = coord(6/25)
  4. Wu, I.-C.; Niu, Y.-F.: Effects of anchoring process under preference stabilities for interactive movie recommendations (2015) 0.22
    0.21598963 = sum of:
      0.21598963 = product of:
        0.7713916 = sum of:
          0.013479877 = weight(abstract_txt:show in 3130) [ClassicSimilarity], result of:
            0.013479877 = score(doc=3130,freq=1.0), product of:
              0.049006656 = queryWeight, product of:
                1.0051342 = boost
                4.400995 = idf(docFreq=1480, maxDocs=44421)
                0.011078479 = queryNorm
              0.27506217 = fieldWeight in 3130, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.400995 = idf(docFreq=1480, maxDocs=44421)
                0.0625 = fieldNorm(doc=3130)
          0.013923046 = weight(abstract_txt:order in 3130) [ClassicSimilarity], result of:
            0.013923046 = score(doc=3130,freq=1.0), product of:
              0.05007496 = queryWeight, product of:
                1.0160308 = boost
                4.448705 = idf(docFreq=1411, maxDocs=44421)
                0.011078479 = queryNorm
              0.27804407 = fieldWeight in 3130, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.448705 = idf(docFreq=1411, maxDocs=44421)
                0.0625 = fieldNorm(doc=3130)
          0.009448705 = weight(abstract_txt:study in 3130) [ClassicSimilarity], result of:
            0.009448705 = score(doc=3130,freq=1.0), product of:
              0.044266623 = queryWeight, product of:
                1.1699852 = boost
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.011078479 = queryNorm
              0.21344988 = fieldWeight in 3130, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.0625 = fieldNorm(doc=3130)
          0.0215363 = weight(abstract_txt:users in 3130) [ClassicSimilarity], result of:
            0.0215363 = score(doc=3130,freq=4.0), product of:
              0.048297234 = queryWeight, product of:
                1.2220904 = boost
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.011078479 = queryNorm
              0.44591168 = fieldWeight in 3130, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.0625 = fieldNorm(doc=3130)
          0.027882986 = weight(abstract_txt:user in 3130) [ClassicSimilarity], result of:
            0.027882986 = score(doc=3130,freq=2.0), product of:
              0.0857026 = queryWeight, product of:
                2.1016653 = boost
                3.6808684 = idf(docFreq=3042, maxDocs=44421)
                0.011078479 = queryNorm
              0.32534587 = fieldWeight in 3130, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6808684 = idf(docFreq=3042, maxDocs=44421)
                0.0625 = fieldNorm(doc=3130)
          0.35604253 = weight(abstract_txt:genres in 3130) [ClassicSimilarity], result of:
            0.35604253 = score(doc=3130,freq=4.0), product of:
              0.39489907 = queryWeight, product of:
                4.9419713 = boost
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.011078479 = queryNorm
              0.9016039 = fieldWeight in 3130, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.0625 = fieldNorm(doc=3130)
          0.32907814 = weight(abstract_txt:genre in 3130) [ClassicSimilarity], result of:
            0.32907814 = score(doc=3130,freq=2.0), product of:
              0.55972815 = queryWeight, product of:
                7.595741 = boost
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.011078479 = queryNorm
              0.58792496 = fieldWeight in 3130, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.0625 = fieldNorm(doc=3130)
        0.28 = coord(7/25)
  5. Crowston, K.; Kwasnik, B.H.: Can document-genre metadata improve information access to large digital collections? (2004) 0.19
    0.19062746 = sum of:
      0.19062746 = product of:
        1.1914216 = sum of:
          0.06274138 = weight(abstract_txt:identification in 949) [ClassicSimilarity], result of:
            0.06274138 = score(doc=949,freq=4.0), product of:
              0.08606247 = queryWeight, product of:
                1.3319976 = boost
                5.8321705 = idf(docFreq=353, maxDocs=44421)
                0.011078479 = queryNorm
              0.7290213 = fieldWeight in 949, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.8321705 = idf(docFreq=353, maxDocs=44421)
                0.0625 = fieldNorm(doc=949)
          0.032532457 = weight(abstract_txt:document in 949) [ClassicSimilarity], result of:
            0.032532457 = score(doc=949,freq=3.0), product of:
              0.06998404 = queryWeight, product of:
                1.4710983 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.011078479 = queryNorm
              0.46485534 = fieldWeight in 949, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=949)
          0.39806762 = weight(abstract_txt:genres in 949) [ClassicSimilarity], result of:
            0.39806762 = score(doc=949,freq=5.0), product of:
              0.39489907 = queryWeight, product of:
                4.9419713 = boost
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.011078479 = queryNorm
              1.0080237 = fieldWeight in 949, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.0625 = fieldNorm(doc=949)
          0.6980802 = weight(abstract_txt:genre in 949) [ClassicSimilarity], result of:
            0.6980802 = score(doc=949,freq=9.0), product of:
              0.55972815 = queryWeight, product of:
                7.595741 = boost
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.011078479 = queryNorm
              1.2471772 = fieldWeight in 949, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.0625 = fieldNorm(doc=949)
        0.16 = coord(4/25)
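
The content-match explanations above combine three steps per document: a per-term score of queryWeight * fieldWeight, a sum over the matching terms, and a final multiplication by the coordination factor coord(matched/total). The sketch below recomputes the score of entry 1 (doc 3059) from the frequencies, document frequencies, and boosts printed in the listing. It assumes the same tf and idf definitions as the listing's values imply; the constant and function names are illustrative only, and queryNorm and fieldNorm are taken as given rather than derived.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.011078479   # copied from the listing
FIELD_NORM = 0.0625        # fieldNorm(doc=3059), copied from the listing
QUERY_TERM_COUNT = 25      # denominator of coord(7/25)

# (term, freq in doc 3059, docFreq, boost), all copied from the listing.
TERMS = [
    ("show", 1.0, 1480, 1.0051342),
    ("order", 1.0, 1411, 1.0160308),
    ("study", 1.0, 3968, 1.1699852),
    ("users", 5.0, 3408, 1.2220904),
    ("page", 4.0, 302, 1.3675267),
    ("pages", 4.0, 443, 3.2006538),
    ("genre", 8.0, 155, 7.595741),
]


def idf(doc_freq: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float) -> float:
    """Per-term score = queryWeight * fieldWeight."""
    i = idf(doc_freq)
    query_weight = boost * i * QUERY_NORM          # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * i * FIELD_NORM  # tf * idf * fieldNorm
    return query_weight * field_weight


raw_sum = sum(term_score(freq, df, boost) for _, freq, df, boost in TERMS)
coord = len(TERMS) / QUERY_TERM_COUNT   # 7 of 25 query terms matched
score = raw_sum * coord

print(raw_sum, coord, score)  # listing shows 0.9262604, 0.28, 0.25935292
```

The coordination factor explains the ordering of the list: entry 5 has the largest raw sum (1.1914216) but matches only 4 of 25 query terms, so its coord of 0.16 pulls its final score down to 0.19, below the other four entries.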