Document (#37489)

Author
Fegley, B.D.
Torvik, V.I.
Title
On the role of poetic versus nonpoetic features in "kindred" and diachronic poetry attribution
Source
Journal of the American Society for Information Science and Technology. 63(2012) no.11, p.2165-2181
Year
2012
Abstract
Author attribution studies have demonstrated remarkable success in applying orthographic and lexicographic features of text in a variety of discrimination problems. What might poetic features, such as syllabic stress and mood, contribute? We address this question in the context of two different attribution problems: (a) kindred: differentiate Langston Hughes' early poems from those of kindred poets and (b) diachronic: differentiate Hughes' early from his later poems. Using a diverse set of 535 generic text features, each categorized as poetic or nonpoetic, correlation-based greedy forward search ranked the features and a support vector machine classified the poems. A small subset of features (~10) achieved cross-validated precision and recall as high as 87%. Poetic features (rhyme patterns particularly) were nearly as effective as nonpoetic in kindred discrimination, but less effective diachronically. In other words, Hughes used both poetic and nonpoetic features in distinctive ways and his use of nonpoetic features evolved systematically while he continued to experiment with poetic features. These findings affirm qualitative studies attesting to structural elements from Black oral tradition and Black folk music (blues) and to the internal consistency of Hughes' early poetry.
Theme
Computational linguistics

Similar documents (author)

  1. Windfeld Lund, N.; Smalheiser, N.; Torvik, V.: Author name disambiguation (2009) 3.71
    3.7144227 = sum of:
      3.7144227 = weight(author_txt:torvik in 675) [ClassicSimilarity], result of:
        3.7144227 = fieldWeight in 675, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.375 = fieldNorm(doc=675)
    
  2. Swanson, D.R.; Smalheiser, N.R.; Torvik, V.I.: Ranking indirect connections in literature-based discovery : the role of Medical Subject Headings (2006) 3.71
    3.7144227 = sum of:
      3.7144227 = weight(author_txt:torvik in 6003) [ClassicSimilarity], result of:
        3.7144227 = fieldWeight in 6003, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.375 = fieldNorm(doc=6003)
    
  3. Torvik, V.I.; Weeber, M.; Swanson, D.R.; Smalheiser, N.R.: ¬A probabilistic similarity metric for Medline records : a model for author name disambiguation (2005) 3.10
    3.0953524 = sum of:
      3.0953524 = weight(author_txt:torvik in 3308) [ClassicSimilarity], result of:
        3.0953524 = fieldWeight in 3308, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.3125 = fieldNorm(doc=3308)
    
  4. Lu, C.; Bu, Y.; Wang, J.; Ding, Y.; Torvik, V.; Schnaars, M.; Zhang, C.: Examining scientific writing styles from the perspective of linguistic complexity : a cross-level moderation model (2019) 2.48
    2.476282 = sum of:
      2.476282 = weight(author_txt:torvik in 5219) [ClassicSimilarity], result of:
        2.476282 = fieldWeight in 5219, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.25 = fieldNorm(doc=5219)
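
Note on the author scores above: each is a single fieldWeight, which the explanation trees show as the product of tf, idf and fieldNorm. The sketch below reproduces that arithmetic in Python. The function names are illustrative, not part of any library. The formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumptions chosen because they match the numbers listed here; other Lucene versions may compute idf slightly differently.

```python
import math

def classic_tf(freq: float) -> float:
    # Term-frequency factor; sqrt(3.0) matches the listed 1.7320508.
    return math.sqrt(freq)

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf form; matches idf(docFreq=5, maxDocs=44218) = 9.905128.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as shown in each explanation tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Hit 1 (doc 675): freq=1.0, docFreq=5, maxDocs=44218, fieldNorm=0.375
hit1 = field_weight(1.0, 5, 44218, 0.375)
# Hit 4 (doc 5219): same term statistics, fieldNorm=0.25
hit4 = field_weight(1.0, 5, 44218, 0.25)
print(round(hit1, 4), round(hit4, 4))
```

Since tf and idf are identical for all four hits, the ranking here is driven entirely by fieldNorm, which is smaller for records with longer author fields.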
    

Similar documents (content)

  1. Teich, E.; Degaetano-Ortlieb, S.; Fankhauser, P.; Kermes, H.; Lapshinova-Koltunski, E.: ¬The linguistic construal of disciplinarity : a data-mining approach using register features (2016) 0.13
    0.12774281 = sum of:
      0.12774281 = product of:
        0.5322617 = sum of:
          0.065436184 = weight(abstract_txt:distinctive in 3015) [ClassicSimilarity], result of:
            0.065436184 = score(doc=3015,freq=1.0), product of:
              0.109180644 = queryWeight, product of:
                7.6715355 = idf(docFreq=55, maxDocs=44218)
                0.014231916 = queryNorm
              0.5993387 = fieldWeight in 3015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.6715355 = idf(docFreq=55, maxDocs=44218)
                0.078125 = fieldNorm(doc=3015)
          0.03320108 = weight(abstract_txt:text in 3015) [ClassicSimilarity], result of:
            0.03320108 = score(doc=3015,freq=3.0), product of:
              0.060674295 = queryWeight, product of:
                1.0542523 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.014231916 = queryNorm
              0.54720175 = fieldWeight in 3015, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=3015)
          0.009180178 = weight(abstract_txt:from in 3015) [ClassicSimilarity], result of:
            0.009180178 = score(doc=3015,freq=1.0), product of:
              0.04251493 = queryWeight, product of:
                1.0808328 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.014231916 = queryNorm
              0.21592833 = fieldWeight in 3015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.078125 = fieldNorm(doc=3015)
          0.075315684 = weight(abstract_txt:early in 3015) [ClassicSimilarity], result of:
            0.075315684 = score(doc=3015,freq=1.0), product of:
              0.1729409 = queryWeight, product of:
                2.1799004 = boost
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.014231916 = queryNorm
              0.43549955 = fieldWeight in 3015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.078125 = fieldNorm(doc=3015)
          0.21357997 = weight(abstract_txt:attribution in 3015) [ClassicSimilarity], result of:
            0.21357997 = score(doc=3015,freq=1.0), product of:
              0.34648234 = queryWeight, product of:
                3.0855198 = boost
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.014231916 = queryNorm
              0.61642385 = fieldWeight in 3015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.078125 = fieldNorm(doc=3015)
          0.13554865 = weight(abstract_txt:features in 3015) [ClassicSimilarity], result of:
            0.13554865 = score(doc=3015,freq=1.0), product of:
              0.38223502 = queryWeight, product of:
                5.9168754 = boost
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.014231916 = queryNorm
              0.35462123 = fieldWeight in 3015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.078125 = fieldNorm(doc=3015)
        0.24 = coord(6/25)
    
  2. Sitas, A.: Greek folk literature, poetry, folk songs and the Library of Congress PA (supplement) schedule (1999) 0.09
    0.08857821 = sum of:
      0.08857821 = product of:
        0.7381518 = sum of:
          0.2909647 = weight(abstract_txt:folk in 5343) [ClassicSimilarity], result of:
            0.2909647 = score(doc=5343,freq=6.0), product of:
              0.14387631 = queryWeight, product of:
                1.147947 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.014231916 = queryNorm
              2.0223253 = fieldWeight in 5343, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.09375 = fieldNorm(doc=5343)
          0.03934608 = weight(abstract_txt:effective in 5343) [ClassicSimilarity], result of:
            0.03934608 = score(doc=5343,freq=1.0), product of:
              0.08678083 = queryWeight, product of:
                1.2608229 = boost
                4.8362236 = idf(docFreq=953, maxDocs=44218)
                0.014231916 = queryNorm
              0.45339596 = fieldWeight in 5343, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8362236 = idf(docFreq=953, maxDocs=44218)
                0.09375 = fieldNorm(doc=5343)
          0.40784106 = weight(abstract_txt:poetry in 5343) [ClassicSimilarity], result of:
            0.40784106 = score(doc=5343,freq=2.0), product of:
              0.32744634 = queryWeight, product of:
                2.4491322 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.014231916 = queryNorm
              1.2455202 = fieldWeight in 5343, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.09375 = fieldNorm(doc=5343)
        0.12 = coord(3/25)
    
  3. Brooks, T.A.: Orthography as a fundamental impediment to online information retrieval (1998) 0.09
    0.088276125 = sum of:
      0.088276125 = product of:
        0.44138062 = sum of:
          0.039841294 = weight(abstract_txt:text in 1143) [ClassicSimilarity], result of:
            0.039841294 = score(doc=1143,freq=3.0), product of:
              0.060674295 = queryWeight, product of:
                1.0542523 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.014231916 = queryNorm
              0.6566421 = fieldWeight in 1143, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.09375 = fieldNorm(doc=1143)
          0.0155792795 = weight(abstract_txt:from in 1143) [ClassicSimilarity], result of:
            0.0155792795 = score(doc=1143,freq=2.0), product of:
              0.04251493 = queryWeight, product of:
                1.0808328 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.014231916 = queryNorm
              0.36644253 = fieldWeight in 1143, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.09375 = fieldNorm(doc=1143)
          0.027607476 = weight(abstract_txt:problems in 1143) [ClassicSimilarity], result of:
            0.027607476 = score(doc=1143,freq=1.0), product of:
              0.068523675 = queryWeight, product of:
                1.1203727 = boost
                4.297489 = idf(docFreq=1634, maxDocs=44218)
                0.014231916 = queryNorm
              0.4028896 = fieldWeight in 1143, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.297489 = idf(docFreq=1634, maxDocs=44218)
                0.09375 = fieldNorm(doc=1143)
          0.26797375 = weight(abstract_txt:orthographic in 1143) [ClassicSimilarity], result of:
            0.26797375 = score(doc=1143,freq=3.0), product of:
              0.1715934 = queryWeight, product of:
                1.2536533 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.014231916 = queryNorm
              1.5616786 = fieldWeight in 1143, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.09375 = fieldNorm(doc=1143)
          0.09037882 = weight(abstract_txt:early in 1143) [ClassicSimilarity], result of:
            0.09037882 = score(doc=1143,freq=1.0), product of:
              0.1729409 = queryWeight, product of:
                2.1799004 = boost
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.014231916 = queryNorm
              0.52259946 = fieldWeight in 1143, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.09375 = fieldNorm(doc=1143)
        0.2 = coord(5/25)
    
  4. Corbara, S.; Moreo, A.; Sebastiani, F.: Syllabic quantity patterns as rhythmic features for Latin authorship attribution (2023) 0.07
    0.07107764 = sum of:
      0.07107764 = product of:
        0.59231365 = sum of:
          0.019168653 = weight(abstract_txt:text in 846) [ClassicSimilarity], result of:
            0.019168653 = score(doc=846,freq=1.0), product of:
              0.060674295 = queryWeight, product of:
                1.0542523 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.014231916 = queryNorm
              0.3159271 = fieldWeight in 846, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=846)
          0.30204767 = weight(abstract_txt:attribution in 846) [ClassicSimilarity], result of:
            0.30204767 = score(doc=846,freq=2.0), product of:
              0.34648234 = queryWeight, product of:
                3.0855198 = boost
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.014231916 = queryNorm
              0.8717549 = fieldWeight in 846, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.078125 = fieldNorm(doc=846)
          0.2710973 = weight(abstract_txt:features in 846) [ClassicSimilarity], result of:
            0.2710973 = score(doc=846,freq=4.0), product of:
              0.38223502 = queryWeight, product of:
                5.9168754 = boost
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.014231916 = queryNorm
              0.70924246 = fieldWeight in 846, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.078125 = fieldNorm(doc=846)
        0.12 = coord(3/25)
    
  5. Díez Platas, M.L.; Muñoz, S.R.; González-Blanco, E.; Ruiz Fabo, P.; Álvarez Mellado, E.: Medieval Spanish (12th-15th centuries) named entity recognition and attribute annotation system based on contextual information (2021) 0.07
    0.06729553 = sum of:
      0.06729553 = product of:
        0.42059708 = sum of:
          0.015334922 = weight(abstract_txt:text in 93) [ClassicSimilarity], result of:
            0.015334922 = score(doc=93,freq=1.0), product of:
              0.060674295 = queryWeight, product of:
                1.0542523 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.014231916 = queryNorm
              0.25274166 = fieldWeight in 93, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=93)
          0.010386186 = weight(abstract_txt:from in 93) [ClassicSimilarity], result of:
            0.010386186 = score(doc=93,freq=2.0), product of:
              0.04251493 = queryWeight, product of:
                1.0808328 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.014231916 = queryNorm
              0.24429502 = fieldWeight in 93, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=93)
          0.10314314 = weight(abstract_txt:orthographic in 93) [ClassicSimilarity], result of:
            0.10314314 = score(doc=93,freq=1.0), product of:
              0.1715934 = queryWeight, product of:
                1.2536533 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.014231916 = queryNorm
              0.6010904 = fieldWeight in 93, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.0625 = fieldNorm(doc=93)
          0.29173285 = weight(abstract_txt:diachronic in 93) [ClassicSimilarity], result of:
            0.29173285 = score(doc=93,freq=2.0), product of:
              0.3431868 = queryWeight, product of:
                2.5073066 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.014231916 = queryNorm
              0.8500701 = fieldWeight in 93, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.0625 = fieldNorm(doc=93)
        0.16 = coord(4/25)
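
Note on the content scores above: each term contributes queryWeight * fieldWeight, and the sum is scaled by coord, the fraction of query terms that matched. The sketch below recomputes the score of hit 2 (doc 5343) from the values listed in its explanation tree. The function names are illustrative assumptions, and the idf values are taken from the listing rather than recomputed.

```python
import math

QUERY_NORM = 0.014231916  # queryNorm as listed in the explanation trees

def term_score(freq: float, idf: float, boost: float, field_norm: float,
               query_norm: float = QUERY_NORM) -> float:
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf * query_norm
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

def document_score(term_scores: list[float], total_clauses: int) -> float:
    # coord = matched clauses / total clauses, e.g. 3/25 = 0.12
    coord = len(term_scores) / total_clauses
    return coord * sum(term_scores)

# Hit 2 (doc 5343, fieldNorm=0.09375): terms "folk", "effective", "poetry"
scores = [
    term_score(freq=6.0, idf=8.806516, boost=1.147947, field_norm=0.09375),
    term_score(freq=1.0, idf=4.8362236, boost=1.2608229, field_norm=0.09375),
    term_score(freq=2.0, idf=9.394302, boost=2.4491322, field_norm=0.09375),
]
total = document_score(scores, total_clauses=25)
print(round(total, 5))
```

The coord factor explains why the content scores are so much lower than the author scores: no listed hit matches more than 6 of the 25 query terms.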