Document (#31093)

Author
Koppel, M.
Akiva, N.
Dagan, I.
Title
Feature instability as a criterion for selecting potential style markers
Source
Journal of the American Society for Information Science and Technology. 57(2006) no.11, pp.1519-1525
Year
2006
Abstract
We introduce a new measure on linguistic features, called stability, which captures the extent to which a language element such as a word or a syntactic construct is replaceable by semantically equivalent elements. This measure may be perceived as quantifying the degree of available "synonymy" for a language item. We show that frequent, but unstable, features are especially useful as discriminators of an author's writing style.
Footnote
Contribution to a special topic section on "Computational analysis of style"
Theme
Computational linguistics

Similar documents (author)

  1. Koppel, T.P.: Public access catalogs through Internet (1990) 6.01
    6.0137663 = sum of:
      6.0137663 = weight(author_txt:koppel in 4069) [ClassicSimilarity], result of:
        6.0137663 = fieldWeight in 4069, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.622026 = idf(docFreq=7, maxDocs=44421)
          0.625 = fieldNorm(doc=4069)
    
  2. Akiva, N.; Koppel, M.: ¬A generic unsupervised method for decomposing multi-author documents (2013) 4.81
    4.811013 = sum of:
      4.811013 = weight(author_txt:koppel in 2098) [ClassicSimilarity], result of:
        4.811013 = fieldWeight in 2098, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.622026 = idf(docFreq=7, maxDocs=44421)
          0.5 = fieldNorm(doc=2098)
    
  3. Koppel, M.; Schweitzer, N.: Measuring direct and indirect authorial influence in historical corpora (2014) 4.81
    4.811013 = sum of:
      4.811013 = weight(author_txt:koppel in 2506) [ClassicSimilarity], result of:
        4.811013 = fieldWeight in 2506, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.622026 = idf(docFreq=7, maxDocs=44421)
          0.5 = fieldNorm(doc=2506)
    
  4. Koppel, M.; Winter, Y.: Determining if two documents are written by the same author (2014) 4.81
    4.811013 = sum of:
      4.811013 = weight(author_txt:koppel in 2602) [ClassicSimilarity], result of:
        4.811013 = fieldWeight in 2602, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.622026 = idf(docFreq=7, maxDocs=44421)
          0.5 = fieldNorm(doc=2602)
    
  5. Koppel, M.; Schler, J.; Argamon, S.: Computational methods in authorship attribution (2009) 3.61
    3.60826 = sum of:
      3.60826 = weight(author_txt:koppel in 3683) [ClassicSimilarity], result of:
        3.60826 = fieldWeight in 3683, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.622026 = idf(docFreq=7, maxDocs=44421)
          0.375 = fieldNorm(doc=3683)
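
The author-match scores above each reduce to a single fieldWeight, the product of tf, idf and fieldNorm. As a minimal sketch, assuming the standard ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), the first entry can be reproduced as follows. The function names are illustrative and not part of any library, and Lucene itself works in single precision, so the last digits may differ slightly.

```python
import math

def classic_tf(freq):
    # ClassicSimilarity term-frequency factor: square root of the raw count.
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in the explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values copied from entry 1 (doc=4069): freq=1.0, docFreq=7, maxDocs=44421, fieldNorm=0.625.
idf = classic_idf(7, 44421)
score = field_weight(1.0, 7, 44421, 0.625)
print(round(idf, 4), round(score, 4))
```

Because all five entries match on the single term "koppel" with the same tf and idf, their ranking differs only through fieldNorm (0.625, 0.5, 0.375), which in this scheme is larger for shorter author fields.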
    

Similar documents (content)

  1. Jana, S.: Sister Nivedita's influence on J. C. Bose's writings (2015) 0.09
    0.09107365 = sum of:
      0.09107365 = product of:
        0.5692103 = sum of:
          0.12606874 = weight(abstract_txt:writing in 2720) [ClassicSimilarity], result of:
            0.12606874 = score(doc=2720,freq=5.0), product of:
              0.14010027 = queryWeight, product of:
                1.0575589 = boost
                6.4387774 = idf(docFreq=192, maxDocs=44421)
                0.02057458 = queryNorm
              0.8998465 = fieldWeight in 2720, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.4387774 = idf(docFreq=192, maxDocs=44421)
                0.0625 = fieldNorm(doc=2720)
          0.14183469 = weight(abstract_txt:markers in 2720) [ClassicSimilarity], result of:
            0.14183469 = score(doc=2720,freq=1.0), product of:
              0.2591467 = queryWeight, product of:
                1.4383279 = boost
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.02057458 = queryNorm
              0.5473143 = fieldWeight in 2720, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.0625 = fieldNorm(doc=2720)
          0.039544687 = weight(abstract_txt:features in 2720) [ClassicSimilarity], result of:
            0.039544687 = score(doc=2720,freq=1.0), product of:
              0.13934545 = queryWeight, product of:
                1.4915798 = boost
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.02057458 = queryNorm
              0.28378886 = fieldWeight in 2720, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.0625 = fieldNorm(doc=2720)
          0.2617621 = weight(abstract_txt:style in 2720) [ClassicSimilarity], result of:
            0.2617621 = score(doc=2720,freq=5.0), product of:
              0.2872865 = queryWeight, product of:
                2.1416953 = boost
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.02057458 = queryNorm
              0.91115355 = fieldWeight in 2720, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.0625 = fieldNorm(doc=2720)
        0.16 = coord(4/25)
    
  2. Watson, C.: ¬An exploratory study of secondary students' judgments of the relevance and reliability of information (2014) 0.08
    0.079388514 = sum of:
      0.079388514 = product of:
        0.39694256 = sum of:
          0.047666002 = weight(abstract_txt:perceived in 2305) [ClassicSimilarity], result of:
            0.047666002 = score(doc=2305,freq=1.0), product of:
              0.12526503 = queryWeight, product of:
                6.0883393 = idf(docFreq=273, maxDocs=44421)
                0.02057458 = queryNorm
              0.3805212 = fieldWeight in 2305, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0883393 = idf(docFreq=273, maxDocs=44421)
                0.0625 = fieldNorm(doc=2305)
          0.078233555 = weight(abstract_txt:item in 2305) [ClassicSimilarity], result of:
            0.078233555 = score(doc=2305,freq=2.0), product of:
              0.1383384 = queryWeight, product of:
                1.0508881 = boost
                6.398163 = idf(docFreq=200, maxDocs=44421)
                0.02057458 = queryNorm
              0.565523 = fieldWeight in 2305, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.398163 = idf(docFreq=200, maxDocs=44421)
                0.0625 = fieldNorm(doc=2305)
          0.056379654 = weight(abstract_txt:writing in 2305) [ClassicSimilarity], result of:
            0.056379654 = score(doc=2305,freq=1.0), product of:
              0.14010027 = queryWeight, product of:
                1.0575589 = boost
                6.4387774 = idf(docFreq=192, maxDocs=44421)
                0.02057458 = queryNorm
              0.4024236 = fieldWeight in 2305, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.4387774 = idf(docFreq=192, maxDocs=44421)
                0.0625 = fieldNorm(doc=2305)
          0.097599775 = weight(abstract_txt:captures in 2305) [ClassicSimilarity], result of:
            0.097599775 = score(doc=2305,freq=1.0), product of:
              0.20198692 = queryWeight, product of:
                1.2698333 = boost
                7.731176 = idf(docFreq=52, maxDocs=44421)
                0.02057458 = queryNorm
              0.4831985 = fieldWeight in 2305, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.731176 = idf(docFreq=52, maxDocs=44421)
                0.0625 = fieldNorm(doc=2305)
          0.11706357 = weight(abstract_txt:style in 2305) [ClassicSimilarity], result of:
            0.11706357 = score(doc=2305,freq=1.0), product of:
              0.2872865 = queryWeight, product of:
                2.1416953 = boost
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.02057458 = queryNorm
              0.40748024 = fieldWeight in 2305, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.0625 = fieldNorm(doc=2305)
        0.2 = coord(5/25)
    
  3. Zheng, R.; Li, J.; Chen, H.; Huang, Z.: ¬A framework for authorship identification of online messages : writing-style features and classification techniques (2006) 0.07
    0.06840425 = sum of:
      0.06840425 = product of:
        0.34202123 = sum of:
          0.056379654 = weight(abstract_txt:writing in 276) [ClassicSimilarity], result of:
            0.056379654 = score(doc=276,freq=1.0), product of:
              0.14010027 = queryWeight, product of:
                1.0575589 = boost
                6.4387774 = idf(docFreq=192, maxDocs=44421)
                0.02057458 = queryNorm
              0.4024236 = fieldWeight in 276, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.4387774 = idf(docFreq=192, maxDocs=44421)
                0.0625 = fieldNorm(doc=276)
          0.058836643 = weight(abstract_txt:syntactic in 276) [ClassicSimilarity], result of:
            0.058836643 = score(doc=276,freq=1.0), product of:
              0.14414158 = queryWeight, product of:
                1.0727036 = boost
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.02057458 = queryNorm
              0.40818647 = fieldWeight in 276, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.0625 = fieldNorm(doc=276)
          0.030652002 = weight(abstract_txt:language in 276) [ClassicSimilarity], result of:
            0.030652002 = score(doc=276,freq=1.0), product of:
              0.11758175 = queryWeight, product of:
                1.370156 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.02057458 = queryNorm
              0.26068673 = fieldWeight in 276, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.0625 = fieldNorm(doc=276)
          0.07908937 = weight(abstract_txt:features in 276) [ClassicSimilarity], result of:
            0.07908937 = score(doc=276,freq=4.0), product of:
              0.13934545 = queryWeight, product of:
                1.4915798 = boost
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.02057458 = queryNorm
              0.5675777 = fieldWeight in 276, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.0625 = fieldNorm(doc=276)
          0.11706357 = weight(abstract_txt:style in 276) [ClassicSimilarity], result of:
            0.11706357 = score(doc=276,freq=1.0), product of:
              0.2872865 = queryWeight, product of:
                2.1416953 = boost
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.02057458 = queryNorm
              0.40748024 = fieldWeight in 276, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.0625 = fieldNorm(doc=276)
        0.2 = coord(5/25)
    
  4. White, H.D.: Authors as citers over time (2001) 0.07
    0.06583907 = sum of:
      0.06583907 = product of:
        0.4114942 = sum of:
          0.04170775 = weight(abstract_txt:perceived in 6581) [ClassicSimilarity], result of:
            0.04170775 = score(doc=6581,freq=1.0), product of:
              0.12526503 = queryWeight, product of:
                6.0883393 = idf(docFreq=273, maxDocs=44421)
                0.02057458 = queryNorm
              0.33295605 = fieldWeight in 6581, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0883393 = idf(docFreq=273, maxDocs=44421)
                0.0546875 = fieldNorm(doc=6581)
          0.10503787 = weight(abstract_txt:author's in 6581) [ClassicSimilarity], result of:
            0.10503787 = score(doc=6581,freq=3.0), product of:
              0.16077143 = queryWeight, product of:
                1.1328946 = boost
                6.8974466 = idf(docFreq=121, maxDocs=44421)
                0.02057458 = queryNorm
              0.65333664 = fieldWeight in 6581, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8974466 = idf(docFreq=121, maxDocs=44421)
                0.0546875 = fieldNorm(doc=6581)
          0.08733354 = weight(abstract_txt:frequent in 6581) [ClassicSimilarity], result of:
            0.08733354 = score(doc=6581,freq=2.0), product of:
              0.16272816 = queryWeight, product of:
                1.1397679 = boost
                6.939294 = idf(docFreq=116, maxDocs=44421)
                0.02057458 = queryNorm
              0.5366836 = fieldWeight in 6581, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.939294 = idf(docFreq=116, maxDocs=44421)
                0.0546875 = fieldNorm(doc=6581)
          0.17741504 = weight(abstract_txt:style in 6581) [ClassicSimilarity], result of:
            0.17741504 = score(doc=6581,freq=3.0), product of:
              0.2872865 = queryWeight, product of:
                2.1416953 = boost
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.02057458 = queryNorm
              0.6175544 = fieldWeight in 6581, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.0546875 = fieldNorm(doc=6581)
        0.16 = coord(4/25)
    
  5. Wellisch, H.H.: Book and periodical indexing (1994) 0.06
    0.06416632 = sum of:
      0.06416632 = product of:
        0.40103954 = sum of:
          0.08663377 = weight(abstract_txt:author's in 8264) [ClassicSimilarity], result of:
            0.08663377 = score(doc=8264,freq=1.0), product of:
              0.16077143 = queryWeight, product of:
                1.1328946 = boost
                6.8974466 = idf(docFreq=121, maxDocs=44421)
                0.02057458 = queryNorm
              0.538863 = fieldWeight in 8264, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8974466 = idf(docFreq=121, maxDocs=44421)
                0.078125 = fieldNorm(doc=8264)
          0.038315002 = weight(abstract_txt:language in 8264) [ClassicSimilarity], result of:
            0.038315002 = score(doc=8264,freq=1.0), product of:
              0.11758175 = queryWeight, product of:
                1.370156 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.02057458 = queryNorm
              0.3258584 = fieldWeight in 8264, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.078125 = fieldNorm(doc=8264)
          0.049430862 = weight(abstract_txt:features in 8264) [ClassicSimilarity], result of:
            0.049430862 = score(doc=8264,freq=1.0), product of:
              0.13934545 = queryWeight, product of:
                1.4915798 = boost
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.02057458 = queryNorm
              0.3547361 = fieldWeight in 8264, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.078125 = fieldNorm(doc=8264)
          0.22665992 = weight(abstract_txt:instability in 8264) [ClassicSimilarity], result of:
            0.22665992 = score(doc=8264,freq=1.0), product of:
              0.3052581 = queryWeight, product of:
                1.5610567 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.02057458 = queryNorm
              0.74251896 = fieldWeight in 8264, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.078125 = fieldNorm(doc=8264)
        0.16 = coord(4/25)
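
The content-match scores above combine several terms: each term contributes queryWeight (boost * idf * queryNorm) times fieldWeight (tf * idf * fieldNorm), and the sum is scaled by coord, the fraction of query terms that matched. The following is a minimal sketch of that arithmetic for entry 1 (doc=2720), assuming the standard ClassicSimilarity formulas. The helper names are hypothetical, the boosts and queryNorm are copied from the tree rather than derived, and small float differences from Lucene's single-precision values are expected.

```python
import math

def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_score(boost, freq, doc_freq, max_docs, query_norm, field_norm):
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm            # query-side factor
    field_weight = math.sqrt(freq) * idf * field_norm  # document-side factor
    return query_weight * field_weight

def explain_score(terms, max_docs, query_norm, field_norm, matched, total):
    # Sum the per-term scores, then apply coord = matched / total query terms.
    raw = sum(term_score(b, f, df, max_docs, query_norm, field_norm)
              for b, f, df in terms)
    return raw * (matched / total)

# (boost, termFreq, docFreq) copied from the tree for doc=2720:
# writing, markers, features, style.
terms_2720 = [
    (1.0575589, 5.0, 192),
    (1.4383279, 1.0, 18),
    (1.4915798, 1.0, 1287),
    (2.1416953, 5.0, 177),
]
total_score = explain_score(terms_2720, max_docs=44421, query_norm=0.02057458,
                            field_norm=0.0625, matched=4, total=25)
print(round(total_score, 4))
```

The coord factor explains why all content scores are small: only 4 or 5 of the 25 query terms match any one abstract, so the summed term scores are multiplied by 0.16 or 0.2.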