Document (#37489)

Author
Fegley, B.D.
Torvik, V.I.
Title
On the role of poetic versus nonpoetic features in "kindred" and diachronic poetry attribution
Source
Journal of the American Society for Information Science and Technology. 63(2012) no.11, p.2165-2181
Year
2012
Abstract
Author attribution studies have demonstrated remarkable success in applying orthographic and lexicographic features of text in a variety of discrimination problems. What might poetic features, such as syllabic stress and mood, contribute? We address this question in the context of two different attribution problems: (a) kindred: differentiate Langston Hughes' early poems from those of kindred poets and (b) diachronic: differentiate Hughes' early from his later poems. Using a diverse set of 535 generic text features, each categorized as poetic or nonpoetic, correlation-based greedy forward search ranked the features and a support vector machine classified the poems. A small subset of features (~10) achieved cross-validated precision and recall as high as 87%. Poetic features (rhyme patterns particularly) were nearly as effective as nonpoetic in kindred discrimination, but less effective diachronically. In other words, Hughes used both poetic and nonpoetic features in distinctive ways and his use of nonpoetic features evolved systematically while he continued to experiment with poetic features. These findings affirm qualitative studies attesting to structural elements from Black oral tradition and Black folk music (blues) and to the internal consistency of Hughes' early poetry.
Theme
Computational linguistics

Similar documents (author)

  1. Windfeld Lund, N.; Smalheiser, N.; Torvik, V.: Author name disambiguation (2009) 3.72
    3.7161405 = sum of:
      3.7161405 = weight(author_txt:torvik in 743) [ClassicSimilarity], result of:
        3.7161405 = fieldWeight in 743, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.375 = fieldNorm(doc=743)
    
  2. Swanson, D.R.; Smalheiser, N.R.; Torvik, V.I.: Ranking indirect connections in literature-based discovery : the role of Medical Subject Headings (2006) 3.72
    3.7161405 = sum of:
      3.7161405 = weight(author_txt:torvik in 3) [ClassicSimilarity], result of:
        3.7161405 = fieldWeight in 3, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.375 = fieldNorm(doc=3)
    
  3. Torvik, V.I.; Weeber, M.; Swanson, D.R.; Smalheiser, N.R.: A probabilistic similarity metric for MEDLINE records : a model for author name disambiguation (2005) 3.10
    3.0967836 = sum of:
      3.0967836 = weight(author_txt:torvik in 4308) [ClassicSimilarity], result of:
        3.0967836 = fieldWeight in 4308, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.3125 = fieldNorm(doc=4308)
    
  4. Lu, C.; Bu, Y.; Wang, J.; Ding, Y.; Torvik, V.; Schnaars, M.; Zhang, C.: Examining scientific writing styles from the perspective of linguistic complexity : a cross-level moderation model (2019) 2.48
    2.477427 = sum of:
      2.477427 = weight(author_txt:torvik in 219) [ClassicSimilarity], result of:
        2.477427 = fieldWeight in 219, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.25 = fieldNorm(doc=219)
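
The author-match scores above can be checked by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative and not part of the listing, and fieldNorm is copied from the listing as-is rather than derived, since it is a precomputed, lossy length norm.

```python
import math

def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    # ClassicSimilarity term frequency: square root of the raw count.
    return math.sqrt(freq)

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in the explain trees above.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values taken from the listing: the term "torvik" occurs in 5 of 44421 records.
idf_torvik = classic_idf(5, 44421)

# Hits 1 and 2 have fieldNorm 0.375, hit 3 has 0.3125, hit 4 has 0.25.
weights = [field_weight(1.0, 5, 44421, norm) for norm in (0.375, 0.3125, 0.25)]

print(idf_torvik)
print(weights)
```

Because tf and idf are the same for every hit, the ranking among these author matches is driven entirely by fieldNorm: records with shorter author fields score higher.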
    

Similar documents (content)

  1. Teich, E.; Degaetano-Ortlieb, S.; Fankhauser, P.; Kermes, H.; Lapshinova-Koltunski, E.: The linguistic construal of disciplinarity : a data-mining approach using register features (2016) 0.13
    0.12807114 = sum of:
      0.12807114 = product of:
        0.5336298 = sum of:
          0.06569146 = weight(abstract_txt:distinctive in 4015) [ClassicSimilarity], result of:
            0.06569146 = score(doc=4015,freq=1.0), product of:
              0.10954117 = queryWeight, product of:
                7.676116 = idf(docFreq=55, maxDocs=44421)
                0.01427039 = queryNorm
              0.5996966 = fieldWeight in 4015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.676116 = idf(docFreq=55, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.03319736 = weight(abstract_txt:text in 4015) [ClassicSimilarity], result of:
            0.03319736 = score(doc=4015,freq=3.0), product of:
              0.0607123 = queryWeight, product of:
                1.0528455 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01427039 = queryNorm
              0.5467979 = fieldWeight in 4015, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.009154805 = weight(abstract_txt:from in 4015) [ClassicSimilarity], result of:
            0.009154805 = score(doc=4015,freq=1.0), product of:
              0.042466313 = queryWeight, product of:
                1.0784355 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.01427039 = queryNorm
              0.21557805 = fieldWeight in 4015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.07521761 = weight(abstract_txt:early in 4015) [ClassicSimilarity], result of:
            0.07521761 = score(doc=4015,freq=1.0), product of:
              0.1729119 = queryWeight, product of:
                2.1761277 = boost
                5.5680695 = idf(docFreq=460, maxDocs=44421)
                0.01427039 = queryNorm
              0.43500543 = fieldWeight in 4015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5680695 = idf(docFreq=460, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.2144025 = weight(abstract_txt:attribution in 4015) [ClassicSimilarity], result of:
            0.2144025 = score(doc=4015,freq=1.0), product of:
              0.3476149 = queryWeight, product of:
                3.0854685 = boost
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.01427039 = queryNorm
              0.61678165 = fieldWeight in 4015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.13596602 = weight(abstract_txt:features in 4015) [ClassicSimilarity], result of:
            0.13596602 = score(doc=4015,freq=1.0), product of:
              0.38328782 = queryWeight, product of:
                5.91526 = boost
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.01427039 = queryNorm
              0.3547361 = fieldWeight in 4015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
        0.24 = coord(6/25)
    
  2. Sitas, A.: Greek folk literature, poetry, folk songs and the Library of Congress PA (supplement) schedule (1999) 0.09
    0.08889531 = sum of:
      0.08889531 = product of:
        0.7407943 = sum of:
          0.29203242 = weight(abstract_txt:folk in 343) [ClassicSimilarity], result of:
            0.29203242 = score(doc=343,freq=6.0), product of:
              0.14432919 = queryWeight, product of:
                1.1478586 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.01427039 = queryNorm
              2.0233774 = fieldWeight in 343, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.09375 = fieldNorm(doc=343)
          0.039464153 = weight(abstract_txt:effective in 343) [ClassicSimilarity], result of:
            0.039464153 = score(doc=343,freq=1.0), product of:
              0.08701533 = queryWeight, product of:
                1.2604458 = boost
                4.837664 = idf(docFreq=956, maxDocs=44421)
                0.01427039 = queryNorm
              0.45353103 = fieldWeight in 343, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.837664 = idf(docFreq=956, maxDocs=44421)
                0.09375 = fieldNorm(doc=343)
          0.4092977 = weight(abstract_txt:poetry in 343) [ClassicSimilarity], result of:
            0.4092977 = score(doc=343,freq=2.0), product of:
              0.3284557 = queryWeight, product of:
                2.4488642 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.01427039 = queryNorm
              1.2461276 = fieldWeight in 343, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.09375 = fieldNorm(doc=343)
        0.12 = coord(3/25)
    
  3. Brooks, T.A.: Orthography as a fundamental impediment to online information retrieval (1998) 0.09
    0.08845728 = sum of:
      0.08845728 = product of:
        0.44228637 = sum of:
          0.03983683 = weight(abstract_txt:text in 2143) [ClassicSimilarity], result of:
            0.03983683 = score(doc=2143,freq=3.0), product of:
              0.0607123 = queryWeight, product of:
                1.0528455 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01427039 = queryNorm
              0.6561575 = fieldWeight in 2143, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.09375 = fieldNorm(doc=2143)
          0.015536218 = weight(abstract_txt:from in 2143) [ClassicSimilarity], result of:
            0.015536218 = score(doc=2143,freq=2.0), product of:
              0.042466313 = queryWeight, product of:
                1.0784355 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.01427039 = queryNorm
              0.36584806 = fieldWeight in 2143, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.09375 = fieldNorm(doc=2143)
          0.027730495 = weight(abstract_txt:problems in 2143) [ClassicSimilarity], result of:
            0.027730495 = score(doc=2143,freq=1.0), product of:
              0.068775274 = queryWeight, product of:
                1.120579 = boost
                4.300847 = idf(docFreq=1636, maxDocs=44421)
                0.01427039 = queryNorm
              0.4032044 = fieldWeight in 2143, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.300847 = idf(docFreq=1636, maxDocs=44421)
                0.09375 = fieldNorm(doc=2143)
          0.2689217 = weight(abstract_txt:orthographic in 2143) [ClassicSimilarity], result of:
            0.2689217 = score(doc=2143,freq=3.0), product of:
              0.17211846 = queryWeight, product of:
                1.253502 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.01427039 = queryNorm
              1.5624223 = fieldWeight in 2143, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.09375 = fieldNorm(doc=2143)
          0.09026114 = weight(abstract_txt:early in 2143) [ClassicSimilarity], result of:
            0.09026114 = score(doc=2143,freq=1.0), product of:
              0.1729119 = queryWeight, product of:
                2.1761277 = boost
                5.5680695 = idf(docFreq=460, maxDocs=44421)
                0.01427039 = queryNorm
              0.5220065 = fieldWeight in 2143, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5680695 = idf(docFreq=460, maxDocs=44421)
                0.09375 = fieldNorm(doc=2143)
        0.2 = coord(5/25)
    
  4. Corbara, S.; Moreo, A.; Sebastiani, F.: Syllabic quantity patterns as rhythmic features for Latin authorship attribution (2023) 0.07
    0.07131713 = sum of:
      0.07131713 = product of:
        0.59430945 = sum of:
          0.019166503 = weight(abstract_txt:text in 1847) [ClassicSimilarity], result of:
            0.019166503 = score(doc=1847,freq=1.0), product of:
              0.0607123 = queryWeight, product of:
                1.0528455 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01427039 = queryNorm
              0.3156939 = fieldWeight in 1847, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=1847)
          0.3032109 = weight(abstract_txt:attribution in 1847) [ClassicSimilarity], result of:
            0.3032109 = score(doc=1847,freq=2.0), product of:
              0.3476149 = queryWeight, product of:
                3.0854685 = boost
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.01427039 = queryNorm
              0.8722609 = fieldWeight in 1847, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.078125 = fieldNorm(doc=1847)
          0.27193204 = weight(abstract_txt:features in 1847) [ClassicSimilarity], result of:
            0.27193204 = score(doc=1847,freq=4.0), product of:
              0.38328782 = queryWeight, product of:
                5.91526 = boost
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.01427039 = queryNorm
              0.7094722 = fieldWeight in 1847, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.078125 = fieldNorm(doc=1847)
        0.12 = coord(3/25)
    
  5. Díez Platas, M.L.; Muñoz, S.R.; González-Blanco, E.; Ruiz Fabo, P.; Álvarez Mellado, E.: Medieval Spanish (12th-15th centuries) named entity recognition and attribute annotation system based on contextual information (2021) 0.07
    0.065814935 = sum of:
      0.065814935 = product of:
        0.41134337 = sum of:
          0.015333203 = weight(abstract_txt:text in 1094) [ClassicSimilarity], result of:
            0.015333203 = score(doc=1094,freq=1.0), product of:
              0.0607123 = queryWeight, product of:
                1.0528455 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01427039 = queryNorm
              0.25255513 = fieldWeight in 1094, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=1094)
          0.010357479 = weight(abstract_txt:from in 1094) [ClassicSimilarity], result of:
            0.010357479 = score(doc=1094,freq=2.0), product of:
              0.042466313 = queryWeight, product of:
                1.0784355 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.01427039 = queryNorm
              0.2438987 = fieldWeight in 1094, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0625 = fieldNorm(doc=1094)
          0.10350802 = weight(abstract_txt:orthographic in 1094) [ClassicSimilarity], result of:
            0.10350802 = score(doc=1094,freq=1.0), product of:
              0.17211846 = queryWeight, product of:
                1.253502 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.01427039 = queryNorm
              0.60137665 = fieldWeight in 1094, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.0625 = fieldNorm(doc=1094)
          0.28214467 = weight(abstract_txt:diachronic in 1094) [ClassicSimilarity], result of:
            0.28214467 = score(doc=1094,freq=2.0), product of:
              0.33586082 = queryWeight, product of:
                2.4763155 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.01427039 = queryNorm
              0.8400643 = fieldWeight in 1094, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.0625 = fieldNorm(doc=1094)
        0.16 = coord(4/25)
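
The content-match scores combine per-term weights with a coordination factor. The following is a minimal sketch that recomputes the score of hit 2 (doc=343) from the numbers in the listing, assuming the standard ClassicSimilarity structure: each term contributes queryWeight * fieldWeight, and the sum is multiplied by coord (matched query terms / total query terms). The helper names are illustrative. The boosts, queryNorm, and fieldNorm are copied from the listing rather than derived.

```python
import math

QUERY_NORM = 0.01427039   # queryNorm from the listing
MAX_DOCS = 44421          # maxDocs from the listing
FIELD_NORM = 0.09375      # fieldNorm(doc=343) from the listing

def idf(doc_freq, max_docs=MAX_DOCS):
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, boost, field_norm):
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    # fieldWeight = sqrt(freq) * idf * fieldNorm
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight

# (term, freq in abstract, docFreq, boost) for the three matching terms of doc=343.
terms = [
    ("folk", 6.0, 17, 1.1478586),
    ("effective", 1.0, 956, 1.2604458),
    ("poetry", 2.0, 9, 2.4488642),
]

total = sum(term_score(freq, df, boost, FIELD_NORM) for _, freq, df, boost in terms)
coord = 3 / 25            # 3 of the 25 query terms matched
score = coord * total

print(total, score)
```

The result should agree with the listed 0.7407943 (sum) and 0.08889531 (final score) up to floating-point rounding, since Lucene computes in 32-bit floats. The low coord factor explains why even a strong match on rare terms such as "poetry" and "folk" yields a small overall score.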