Document (#43414)

Matthews, P.
Glitre, K.
Genre analysis of movies using a topic model of plot summaries
Journal of the Association for Information Science and Technology. 72(2021) no.12, pp. 1511-1527
Genre plays an important role in the description, navigation, and discovery of movies, but it is rarely studied at large scale using quantitative methods. We apply unsupervised topic modeling to a large collection of textual movie summaries and then use the model's topic proportions to investigate key questions in genre, including recognizability, mapping, canonicity, and change over time. This allows an analysis of how genre labels are applied, how genres are composed and how these ingredients change, and how genres compare. We find that many genres can be quite easily predicted by their lexical signatures, and that these signatures define their position on the genre landscape. We find significant changes in genre composition between periods for westerns, science fiction, and road movies, reflecting changes in production and consumption values. We show that canonical examples often sit at the high end of the topic distribution profile for their genre rather than at its center, as categorization theory might predict.
Automatic indexing

Similar documents (author)

  1. Matthews, J.R.: Suggested guidelines for screen layouts and design of online catalogs (1987) 5.17
    5.167175 = sum of:
      5.167175 = weight(author_txt:matthews in 1289) [ClassicSimilarity], result of:
        5.167175 = score(doc=1289,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            8.267481 = idf(docFreq=30, maxDocs=44421)
            0.120955825 = queryNorm
          5.1671753 = fieldWeight in 1289, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            8.267481 = idf(docFreq=30, maxDocs=44421)
            0.625 = fieldNorm(doc=1289)
  2. Matthews, J.R.: Public access to online catalogs : a planning guide for managers (1982) 5.17
    5.167175 = sum of:
      5.167175 = weight(author_txt:matthews in 1973) [ClassicSimilarity], result of:
        5.167175 = score(doc=1973,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            8.267481 = idf(docFreq=30, maxDocs=44421)
            0.120955825 = queryNorm
          5.1671753 = fieldWeight in 1973, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            8.267481 = idf(docFreq=30, maxDocs=44421)
            0.625 = fieldNorm(doc=1973)
  3. Matthews, J.R.: Use of knowledge about users in software development (1991) 5.17
    5.167175 = sum of:
      5.167175 = weight(author_txt:matthews in 4832) [ClassicSimilarity], result of:
        5.167175 = score(doc=4832,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            8.267481 = idf(docFreq=30, maxDocs=44421)
            0.120955825 = queryNorm
          5.1671753 = fieldWeight in 4832, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            8.267481 = idf(docFreq=30, maxDocs=44421)
            0.625 = fieldNorm(doc=4832)
  4. Matthews, J.R.: The online catalog : time to move beyond the boundary of a 'catalog'! (1991) 5.17
    5.167175 = sum of:
      5.167175 = weight(author_txt:matthews in 7678) [ClassicSimilarity], result of:
        5.167175 = score(doc=7678,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            8.267481 = idf(docFreq=30, maxDocs=44421)
            0.120955825 = queryNorm
          5.1671753 = fieldWeight in 7678, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            8.267481 = idf(docFreq=30, maxDocs=44421)
            0.625 = fieldNorm(doc=7678)
  5. Matthews, J.R.: The distribution of information : the role for online public access catalogs (1994) 5.17
    5.167175 = sum of:
      5.167175 = weight(author_txt:matthews in 306) [ClassicSimilarity], result of:
        5.167175 = score(doc=306,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            8.267481 = idf(docFreq=30, maxDocs=44421)
            0.120955825 = queryNorm
          5.1671753 = fieldWeight in 306, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            8.267481 = idf(docFreq=30, maxDocs=44421)
            0.625 = fieldNorm(doc=306)
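
The author-match breakdowns above all follow the same arithmetic: score = queryWeight x fieldWeight, with queryWeight = idf x queryNorm and fieldWeight = tf x idf x fieldNorm. The sketch below reproduces that arithmetic in Python using the listed values. The function names are hypothetical, and the idf formula (1 + ln(maxDocs / (docFreq + 1))) and tf = sqrt(freq) are assumptions inferred from the listed numbers; they appear consistent with Lucene's classic TF-IDF similarity, but this is an illustration, not the catalog's actual code.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf formula; it reproduces the listed 8.267481 for docFreq=30.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm          # "queryWeight" line
    field_weight = math.sqrt(freq) * idf * field_norm  # "fieldWeight" line, tf = sqrt(freq)
    return query_weight * field_weight

# Values copied from the "author_txt:matthews" breakdown above.
score = classic_term_score(
    freq=1.0, doc_freq=30, max_docs=44421,
    query_norm=0.120955825, field_norm=0.625,
)
print(score)  # should agree with the listed 5.167175 up to floating-point rounding
```

Because all five author matches share the same freq, docFreq, and fieldNorm, they receive the same score.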

Similar documents (content)

  1. Crowston, K.; Kwasnik, B.H.: Can document-genre metadata improve information access to large digital collections? (2004) 0.18
    0.18138415 = sum of:
      0.18138415 = product of:
        1.1336509 = sum of:
          0.029709505 = weight(abstract_txt:large in 949) [ClassicSimilarity], result of:
            0.029709505 = score(doc=949,freq=2.0), product of:
              0.075663686 = queryWeight, product of:
                1.2484652 = boost
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.0136426315 = queryNorm
              0.3926521 = fieldWeight in 949, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.0625 = fieldNorm(doc=949)
          0.061861504 = weight(abstract_txt:topic in 949) [ClassicSimilarity], result of:
            0.061861504 = score(doc=949,freq=1.0), product of:
              0.19585028 = queryWeight, product of:
                2.840598 = boost
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.0136426315 = queryNorm
              0.3158612 = fieldWeight in 949, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.0625 = fieldNorm(doc=949)
          0.30160272 = weight(abstract_txt:genres in 949) [ClassicSimilarity], result of:
            0.30160272 = score(doc=949,freq=5.0), product of:
              0.299202 = queryWeight, product of:
                3.040609 = boost
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.0136426315 = queryNorm
              1.0080237 = fieldWeight in 949, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.0625 = fieldNorm(doc=949)
          0.7404772 = weight(abstract_txt:genre in 949) [ClassicSimilarity], result of:
            0.7404772 = score(doc=949,freq=9.0), product of:
              0.5937225 = queryWeight, product of:
                6.542722 = boost
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.0136426315 = queryNorm
              1.2471772 = fieldWeight in 949, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.0625 = fieldNorm(doc=949)
        0.16 = coord(4/25)
  2. Hajibayova, L.; Jacob, E.K.: User-generated genre tags through the lens of genre theories (2014) 0.16
    0.1594283 = sum of:
      0.1594283 = product of:
        0.9964269 = sum of:
          0.01508334 = weight(abstract_txt:analysis in 2450) [ClassicSimilarity], result of:
            0.01508334 = score(doc=2450,freq=3.0), product of:
              0.05095862 = queryWeight, product of:
                1.0245697 = boost
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.0136426315 = queryNorm
              0.29599193 = fieldWeight in 2450, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.046875 = fieldNorm(doc=2450)
          0.015755843 = weight(abstract_txt:large in 2450) [ClassicSimilarity], result of:
            0.015755843 = score(doc=2450,freq=1.0), product of:
              0.075663686 = queryWeight, product of:
                1.2484652 = boost
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.0136426315 = queryNorm
              0.20823522 = fieldWeight in 2450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.046875 = fieldNorm(doc=2450)
          0.20232126 = weight(abstract_txt:genres in 2450) [ClassicSimilarity], result of:
            0.20232126 = score(doc=2450,freq=4.0), product of:
              0.299202 = queryWeight, product of:
                3.040609 = boost
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.0136426315 = queryNorm
              0.6762029 = fieldWeight in 2450, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.046875 = fieldNorm(doc=2450)
          0.76326644 = weight(abstract_txt:genre in 2450) [ClassicSimilarity], result of:
            0.76326644 = score(doc=2450,freq=17.0), product of:
              0.5937225 = queryWeight, product of:
                6.542722 = boost
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.0136426315 = queryNorm
              1.2855608 = fieldWeight in 2450, product of:
                4.1231055 = tf(freq=17.0), with freq of:
                  17.0 = termFreq=17.0
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.046875 = fieldNorm(doc=2450)
        0.16 = coord(4/25)
  3. Wu, I.-C.; Niu, Y.-F.: Effects of anchoring process under preference stabilities for interactive movie recommendations (2015) 0.15
    0.15492117 = sum of:
      0.15492117 = product of:
        0.96825737 = sum of:
          0.13871682 = weight(abstract_txt:movie in 3130) [ClassicSimilarity], result of:
            0.13871682 = score(doc=3130,freq=4.0), product of:
              0.13315473 = queryWeight, product of:
                1.1711055 = boost
                8.334172 = idf(docFreq=28, maxDocs=44421)
                0.0136426315 = queryNorm
              1.0417715 = fieldWeight in 3130, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.334172 = idf(docFreq=28, maxDocs=44421)
                0.0625 = fieldNorm(doc=3130)
          0.26976168 = weight(abstract_txt:genres in 3130) [ClassicSimilarity], result of:
            0.26976168 = score(doc=3130,freq=4.0), product of:
              0.299202 = queryWeight, product of:
                3.040609 = boost
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.0136426315 = queryNorm
              0.9016039 = fieldWeight in 3130, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.0625 = fieldNorm(doc=3130)
          0.21071456 = weight(abstract_txt:movies in 3130) [ClassicSimilarity], result of:
            0.21071456 = score(doc=3130,freq=1.0), product of:
              0.40283513 = queryWeight, product of:
                3.528109 = boost
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.0136426315 = queryNorm
              0.5230789 = fieldWeight in 3130, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.0625 = fieldNorm(doc=3130)
          0.3490643 = weight(abstract_txt:genre in 3130) [ClassicSimilarity], result of:
            0.3490643 = score(doc=3130,freq=2.0), product of:
              0.5937225 = queryWeight, product of:
                6.542722 = boost
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.0136426315 = queryNorm
              0.58792496 = fieldWeight in 3130, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.0625 = fieldNorm(doc=3130)
        0.16 = coord(4/25)
  4. Nahotko, M.: Genre groups in knowledge organization (2016) 0.14
    0.1357705 = sum of:
      0.1357705 = product of:
        1.1314209 = sum of:
          0.01741674 = weight(abstract_txt:analysis in 139) [ClassicSimilarity], result of:
            0.01741674 = score(doc=139,freq=1.0), product of:
              0.05095862 = queryWeight, product of:
                1.0245697 = boost
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.0136426315 = queryNorm
              0.34178203 = fieldWeight in 139, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.09375 = fieldNorm(doc=139)
          0.28612545 = weight(abstract_txt:genres in 139) [ClassicSimilarity], result of:
            0.28612545 = score(doc=139,freq=2.0), product of:
              0.299202 = queryWeight, product of:
                3.040609 = boost
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.0136426315 = queryNorm
              0.9562953 = fieldWeight in 139, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.09375 = fieldNorm(doc=139)
          0.8278787 = weight(abstract_txt:genre in 139) [ClassicSimilarity], result of:
            0.8278787 = score(doc=139,freq=5.0), product of:
              0.5937225 = queryWeight, product of:
                6.542722 = boost
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.0136426315 = queryNorm
              1.3943865 = fieldWeight in 139, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.09375 = fieldNorm(doc=139)
        0.12 = coord(3/25)
  5. Finn, A.; Kushmerick, N.: Learning to classify documents according to genre (2006) 0.12
    0.122751154 = sum of:
      0.122751154 = product of:
        0.76719475 = sum of:
          0.014513951 = weight(abstract_txt:analysis in 10) [ClassicSimilarity], result of:
            0.014513951 = score(doc=10,freq=1.0), product of:
              0.05095862 = queryWeight, product of:
                1.0245697 = boost
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.0136426315 = queryNorm
              0.28481838 = fieldWeight in 10, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
          0.02625974 = weight(abstract_txt:large in 10) [ClassicSimilarity], result of:
            0.02625974 = score(doc=10,freq=1.0), product of:
              0.075663686 = queryWeight, product of:
                1.2484652 = boost
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.0136426315 = queryNorm
              0.3470587 = fieldWeight in 10, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
          0.109356724 = weight(abstract_txt:topic in 10) [ClassicSimilarity], result of:
            0.109356724 = score(doc=10,freq=2.0), product of:
              0.19585028 = queryWeight, product of:
                2.840598 = boost
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.0136426315 = queryNorm
              0.558369 = fieldWeight in 10, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
          0.61706436 = weight(abstract_txt:genre in 10) [ClassicSimilarity], result of:
            0.61706436 = score(doc=10,freq=4.0), product of:
              0.5937225 = queryWeight, product of:
                6.542722 = boost
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.0136426315 = queryNorm
              1.0393144 = fieldWeight in 10, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.651612 = idf(docFreq=155, maxDocs=44421)
                0.078125 = fieldNorm(doc=10)
        0.16 = coord(4/25)
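
For the content-based matches, the per-term scores are summed and then multiplied by a coordination factor, coord(matched/total query terms). As a rough check, the sketch below recomputes the first entry (doc 949) from the values listed in its breakdown. The helper name and data layout are hypothetical, and the idf and tf formulas are assumptions inferred from the listed numbers rather than taken from the catalog's implementation.

```python
import math

QUERY_NORM = 0.0136426315  # queryNorm from the breakdown
MAX_DOCS = 44421           # maxDocs from the breakdown
TOTAL_QUERY_TERMS = 25     # denominator of coord(4/25)

def term_score(freq, doc_freq, boost, field_norm):
    # Assumed formulas: idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq).
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

# (term, freq, docFreq, boost) copied from the doc 949 breakdown; fieldNorm is 0.0625.
matches = [
    ("large", 2.0, 1420, 1.2484652),
    ("topic", 1.0, 770, 2.840598),
    ("genres", 5.0, 88, 3.040609),
    ("genre", 9.0, 155, 6.542722),
]

raw = sum(term_score(freq, df, boost, 0.0625) for _, freq, df, boost in matches)
coord = len(matches) / TOTAL_QUERY_TERMS  # 4/25 = 0.16
total = raw * coord
print(raw, total)  # should be close to the listed 1.1336509 and 0.18138415
```

The coord factor explains why entry 4 (Nahotko), which has a raw sum close to that of entry 1, ranks lower: it matches only 3 of the 25 query terms.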