Document (#31093)

Author
Koppel, M.
Akiva, N.
Dagan, I.
Title
Feature instability as a criterion for selecting potential style markers
Source
Journal of the American Society for Information Science and Technology. 57(2006) no.11, p.1519-1525
Year
2006
Abstract
We introduce a new measure on linguistic features, called stability, which captures the extent to which a language element such as a word or a syntactic construct is replaceable by semantically equivalent elements. This measure may be perceived as quantifying the degree of available "synonymy" for a language item. We show that frequent, but unstable, features are especially useful as discriminators of an author's writing style.
Footnote
Contribution to a special topic section "Computational analysis of style"
Theme
Computational linguistics

Similar documents (author)

  1. Koppel, T.P.: Public access catalogs through Internet (1990) 6.01
    6.010904 = sum of:
      6.010904 = weight(author_txt:koppel in 4070) [ClassicSimilarity], result of:
        6.010904 = fieldWeight in 4070, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.617446 = idf(docFreq=7, maxDocs=44218)
          0.625 = fieldNorm(doc=4070)
    
  2. Akiva, N.; Koppel, M.: A generic unsupervised method for decomposing multi-author documents (2013) 4.81
    4.808723 = sum of:
      4.808723 = weight(author_txt:koppel in 1098) [ClassicSimilarity], result of:
        4.808723 = fieldWeight in 1098, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.617446 = idf(docFreq=7, maxDocs=44218)
          0.5 = fieldNorm(doc=1098)
    
  3. Koppel, M.; Schweitzer, N.: Measuring direct and indirect authorial influence in historical corpora (2014) 4.81
    4.808723 = sum of:
      4.808723 = weight(author_txt:koppel in 1506) [ClassicSimilarity], result of:
        4.808723 = fieldWeight in 1506, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.617446 = idf(docFreq=7, maxDocs=44218)
          0.5 = fieldNorm(doc=1506)
    
  4. Koppel, M.; Winter, Y.: Determining if two documents are written by the same author (2014) 4.81
    4.808723 = sum of:
      4.808723 = weight(author_txt:koppel in 1602) [ClassicSimilarity], result of:
        4.808723 = fieldWeight in 1602, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.617446 = idf(docFreq=7, maxDocs=44218)
          0.5 = fieldNorm(doc=1602)
    
  5. Koppel, M.; Schler, J.; Argamon, S.: Computational methods in authorship attribution (2009) 3.61
    3.606542 = sum of:
      3.606542 = weight(author_txt:koppel in 2683) [ClassicSimilarity], result of:
        3.606542 = fieldWeight in 2683, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.617446 = idf(docFreq=7, maxDocs=44218)
          0.375 = fieldNorm(doc=2683)
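
The author-match scores above are each a single fieldWeight, the product of tf, idf and fieldNorm. The following is a minimal sketch of that arithmetic, not the search engine's own code. The function names are hypothetical, and the idf formula (1 + ln(maxDocs / (docFreq + 1))) is an assumption inferred from the values printed in this record; other Lucene versions compute idf slightly differently.

```python
import math


def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw term frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency in the form that matches the values
    # printed above (assumed, inferred from the numbers in this record).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in each score tree above.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


if __name__ == "__main__":
    # Inputs copied from entry 1 (doc 4070): freq=1, docFreq=7,
    # maxDocs=44218, fieldNorm=0.625. The record reports 6.010904.
    print(field_weight(1.0, 7, 44218, 0.625))
```

The differences between the five entries come only from fieldNorm (0.625, 0.5, 0.375), since tf and idf are identical for every match on the author term.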
    

Similar documents (content)

  1. Jana, S.: Sister Nivedita's influence on J. C. Bose's writings (2015) 0.09
    0.09103681 = sum of:
      0.09103681 = product of:
        0.5689801 = sum of:
          0.12646498 = weight(abstract_txt:writing in 1720) [ClassicSimilarity], result of:
            0.12646498 = score(doc=1720,freq=5.0), product of:
              0.14041333 = queryWeight, product of:
                1.05676 = boost
                6.444614 = idf(docFreq=190, maxDocs=44218)
                0.020617455 = queryNorm
              0.9006622 = fieldWeight in 1720, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.444614 = idf(docFreq=190, maxDocs=44218)
                0.0625 = fieldNorm(doc=1720)
          0.14167172 = weight(abstract_txt:markers in 1720) [ClassicSimilarity], result of:
            0.14167172 = score(doc=1720,freq=1.0), product of:
              0.2589844 = queryWeight, product of:
                1.4351887 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.020617455 = queryNorm
              0.547028 = fieldWeight in 1720, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.0625 = fieldNorm(doc=1720)
          0.03952288 = weight(abstract_txt:features in 1720) [ClassicSimilarity], result of:
            0.03952288 = score(doc=1720,freq=1.0), product of:
              0.13931371 = queryWeight, product of:
                1.4886209 = boost
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.020617455 = queryNorm
              0.28369698 = fieldWeight in 1720, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.0625 = fieldNorm(doc=1720)
          0.26132053 = weight(abstract_txt:style in 1720) [ClassicSimilarity], result of:
            0.26132053 = score(doc=1720,freq=5.0), product of:
              0.2870035 = queryWeight, product of:
                2.1366372 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.020617455 = queryNorm
              0.91051346 = fieldWeight in 1720, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=1720)
        0.16 = coord(4/25)
    
  2. Watson, C.: An exploratory study of secondary students' judgments of the relevance and reliability of information (2014) 0.08
    0.079419196 = sum of:
      0.079419196 = product of:
        0.39709598 = sum of:
          0.047924347 = weight(abstract_txt:perceived in 1305) [ClassicSimilarity], result of:
            0.047924347 = score(doc=1305,freq=1.0), product of:
              0.12573484 = queryWeight, product of:
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.020617455 = queryNorm
              0.3811541 = fieldWeight in 1305, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.0625 = fieldNorm(doc=1305)
          0.07828134 = weight(abstract_txt:item in 1305) [ClassicSimilarity], result of:
            0.07828134 = score(doc=1305,freq=2.0), product of:
              0.1384141 = queryWeight, product of:
                1.0492098 = boost
                6.39857 = idf(docFreq=199, maxDocs=44218)
                0.020617455 = queryNorm
              0.565559 = fieldWeight in 1305, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.39857 = idf(docFreq=199, maxDocs=44218)
                0.0625 = fieldNorm(doc=1305)
          0.056556854 = weight(abstract_txt:writing in 1305) [ClassicSimilarity], result of:
            0.056556854 = score(doc=1305,freq=1.0), product of:
              0.14041333 = queryWeight, product of:
                1.05676 = boost
                6.444614 = idf(docFreq=190, maxDocs=44218)
                0.020617455 = queryNorm
              0.40278837 = fieldWeight in 1305, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.444614 = idf(docFreq=190, maxDocs=44218)
                0.0625 = fieldNorm(doc=1305)
          0.097467326 = weight(abstract_txt:captures in 1305) [ClassicSimilarity], result of:
            0.097467326 = score(doc=1305,freq=1.0), product of:
              0.20183238 = queryWeight, product of:
                1.2669737 = boost
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.020617455 = queryNorm
              0.4829122 = fieldWeight in 1305, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.0625 = fieldNorm(doc=1305)
          0.1168661 = weight(abstract_txt:style in 1305) [ClassicSimilarity], result of:
            0.1168661 = score(doc=1305,freq=1.0), product of:
              0.2870035 = queryWeight, product of:
                2.1366372 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.020617455 = queryNorm
              0.407194 = fieldWeight in 1305, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=1305)
        0.2 = coord(5/25)
    
  3. Zheng, R.; Li, J.; Chen, H.; Huang, Z.: A framework for authorship identification of online messages : writing-style features and classification techniques (2006) 0.07
    0.06842333 = sum of:
      0.06842333 = product of:
        0.34211665 = sum of:
          0.056556854 = weight(abstract_txt:writing in 5276) [ClassicSimilarity], result of:
            0.056556854 = score(doc=5276,freq=1.0), product of:
              0.14041333 = queryWeight, product of:
                1.05676 = boost
                6.444614 = idf(docFreq=190, maxDocs=44218)
                0.020617455 = queryNorm
              0.40278837 = fieldWeight in 5276, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.444614 = idf(docFreq=190, maxDocs=44218)
                0.0625 = fieldNorm(doc=5276)
          0.05873761 = weight(abstract_txt:syntactic in 5276) [ClassicSimilarity], result of:
            0.05873761 = score(doc=5276,freq=1.0), product of:
              0.14399995 = queryWeight, product of:
                1.0701715 = boost
                6.5264034 = idf(docFreq=175, maxDocs=44218)
                0.020617455 = queryNorm
              0.4079002 = fieldWeight in 5276, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5264034 = idf(docFreq=175, maxDocs=44218)
                0.0625 = fieldNorm(doc=5276)
          0.030910343 = weight(abstract_txt:language in 5276) [ClassicSimilarity], result of:
            0.030910343 = score(doc=5276,freq=1.0), product of:
              0.118258044 = queryWeight, product of:
                1.3715212 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.020617455 = queryNorm
              0.26138046 = fieldWeight in 5276, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0625 = fieldNorm(doc=5276)
          0.07904576 = weight(abstract_txt:features in 5276) [ClassicSimilarity], result of:
            0.07904576 = score(doc=5276,freq=4.0), product of:
              0.13931371 = queryWeight, product of:
                1.4886209 = boost
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.020617455 = queryNorm
              0.56739396 = fieldWeight in 5276, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.0625 = fieldNorm(doc=5276)
          0.1168661 = weight(abstract_txt:style in 5276) [ClassicSimilarity], result of:
            0.1168661 = score(doc=5276,freq=1.0), product of:
              0.2870035 = queryWeight, product of:
                2.1366372 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.020617455 = queryNorm
              0.407194 = fieldWeight in 5276, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=5276)
        0.2 = coord(5/25)
    
  4. White, H.D.: Authors as citers over time (2001) 0.07
    0.06577914 = sum of:
      0.06577914 = product of:
        0.41111964 = sum of:
          0.041933805 = weight(abstract_txt:perceived in 5581) [ClassicSimilarity], result of:
            0.041933805 = score(doc=5581,freq=1.0), product of:
              0.12573484 = queryWeight, product of:
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.020617455 = queryNorm
              0.33350983 = fieldWeight in 5581, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5581)
          0.10487276 = weight(abstract_txt:author's in 5581) [ClassicSimilarity], result of:
            0.10487276 = score(doc=5581,freq=3.0), product of:
              0.16062538 = queryWeight, product of:
                1.1302624 = boost
                6.892866 = idf(docFreq=121, maxDocs=44218)
                0.020617455 = queryNorm
              0.6529028 = fieldWeight in 5581, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.892866 = idf(docFreq=121, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5581)
          0.08719731 = weight(abstract_txt:frequent in 5581) [ClassicSimilarity], result of:
            0.08719731 = score(doc=5581,freq=2.0), product of:
              0.16258165 = queryWeight, product of:
                1.1371243 = boost
                6.9347134 = idf(docFreq=116, maxDocs=44218)
                0.020617455 = queryNorm
              0.5363293 = fieldWeight in 5581, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.9347134 = idf(docFreq=116, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5581)
          0.17711575 = weight(abstract_txt:style in 5581) [ClassicSimilarity], result of:
            0.17711575 = score(doc=5581,freq=3.0), product of:
              0.2870035 = queryWeight, product of:
                2.1366372 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.020617455 = queryNorm
              0.61712056 = fieldWeight in 5581, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5581)
        0.16 = coord(4/25)
    
  5. Wellisch, H.H.: Book and periodical indexing (1994) 0.06
    0.064154655 = sum of:
      0.064154655 = product of:
        0.40096658 = sum of:
          0.086497605 = weight(abstract_txt:author's in 8265) [ClassicSimilarity], result of:
            0.086497605 = score(doc=8265,freq=1.0), product of:
              0.16062538 = queryWeight, product of:
                1.1302624 = boost
                6.892866 = idf(docFreq=121, maxDocs=44218)
                0.020617455 = queryNorm
              0.5385052 = fieldWeight in 8265, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.892866 = idf(docFreq=121, maxDocs=44218)
                0.078125 = fieldNorm(doc=8265)
          0.03863793 = weight(abstract_txt:language in 8265) [ClassicSimilarity], result of:
            0.03863793 = score(doc=8265,freq=1.0), product of:
              0.118258044 = queryWeight, product of:
                1.3715212 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.020617455 = queryNorm
              0.32672557 = fieldWeight in 8265, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.078125 = fieldNorm(doc=8265)
          0.0494036 = weight(abstract_txt:features in 8265) [ClassicSimilarity], result of:
            0.0494036 = score(doc=8265,freq=1.0), product of:
              0.13931371 = queryWeight, product of:
                1.4886209 = boost
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.020617455 = queryNorm
              0.35462123 = fieldWeight in 8265, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.078125 = fieldNorm(doc=8265)
          0.22642744 = weight(abstract_txt:instability in 8265) [ClassicSimilarity], result of:
            0.22642744 = score(doc=8265,freq=1.0), product of:
              0.305092 = queryWeight, product of:
                1.5577136 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.020617455 = queryNorm
              0.74216115 = fieldWeight in 8265, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.078125 = fieldNorm(doc=8265)
        0.16 = coord(4/25)
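
The content-match scores above combine several terms: each term contributes queryWeight times fieldWeight, the contributions are summed, and the sum is scaled by coord (matched terms over query terms). The sketch below reproduces that arithmetic for entry 1 (doc 1720) under stated assumptions: the function names are hypothetical, the idf formula is inferred from the printed values, and queryNorm is taken from the record rather than derived.

```python
import math

MAX_DOCS = 44218          # maxDocs as printed in the record
QUERY_NORM = 0.020617455  # queryNorm as printed in the record (not derived here)


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    # Assumed form, chosen because it matches the idf values shown above.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    # fieldWeight = sqrt(freq) * idf * fieldNorm
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm: float, query_terms: int) -> float:
    # Sum the per-term scores, then scale by coord = matched / query_terms.
    total = sum(term_score(f, df, b, field_norm) for f, df, b in terms)
    coord = len(terms) / query_terms
    return total * coord


# (freq, docFreq, boost) for the four matching terms of entry 1 (doc 1720):
# writing, markers, features, style.
ENTRY_1_TERMS = [
    (5.0, 190, 1.05676),
    (1.0, 18, 1.4351887),
    (1.0, 1283, 1.4886209),
    (5.0, 177, 2.1366372),
]

if __name__ == "__main__":
    # The record reports 0.09103681 for this entry.
    print(document_score(ENTRY_1_TERMS, field_norm=0.0625, query_terms=25))
```

Terms listed without a boost line in the trees above (for example "perceived") can be passed with a boost of 1.0 under the same assumptions.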