Document (#43171)

Author
Soni, S.
Lerman, K.
Eisenstein, J.
Title
Follow the leader : documents on the leading edge of semantic change get more citations
Source
Journal of the Association for Information Science and Technology. 72(2021) no.4, p.478-492
Year
2021
Abstract
Diachronic word embeddings (vector representations of words over time) offer remarkable insights into the evolution of language and provide a tool for quantifying sociocultural change from text documents. Prior work has used such embeddings to identify shifts in the meaning of individual words. However, simply knowing that a word has changed in meaning is insufficient to identify the instances of word usage that convey the historical meaning or the newer meaning. In this study, we link diachronic word embeddings to documents, by situating those documents as leaders or laggards with respect to ongoing semantic changes. Specifically, we propose a novel method to quantify the degree of semantic progressiveness in each word usage, and then show how these usages can be aggregated to obtain scores for each document. We analyze two large collections of documents, representing legal opinions and scientific articles. Documents that are scored as semantically progressive receive a larger number of citations, indicating that they are especially influential. Our work thus provides a new technique for identifying lexical semantic leaders and demonstrates a new link between progressive use of language and influence in a citation network.
Content
Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24421.
Theme
Computational linguistics

Similar documents (content)

  1. Landauer, T.K.; Foltz, P.W.; Laham, D.: An introduction to Latent Semantic Analysis (1998) 0.19
    0.1897268 = sum of:
      0.1897268 = product of:
        0.67759573 = sum of:
          0.021423265 = weight(abstract_txt:each in 1162) [ClassicSimilarity], result of:
            0.021423265 = score(doc=1162,freq=1.0), product of:
              0.066578045 = queryWeight, product of:
                1.0152977 = boost
                4.118742 = idf(docFreq=1954, maxDocs=44218)
                0.015921101 = queryNorm
              0.32177672 = fieldWeight in 1162, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.118742 = idf(docFreq=1954, maxDocs=44218)
                0.078125 = fieldNorm(doc=1162)
          0.011537013 = weight(abstract_txt:that in 1162) [ClassicSimilarity], result of:
            0.011537013 = score(doc=1162,freq=2.0), product of:
              0.044069305 = queryWeight, product of:
                1.1681832 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.015921101 = queryNorm
              0.26179248 = fieldWeight in 1162, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=1162)
          0.08146029 = weight(abstract_txt:words in 1162) [ClassicSimilarity], result of:
            0.08146029 = score(doc=1162,freq=3.0), product of:
              0.112459846 = queryWeight, product of:
                1.3195523 = boost
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.015921101 = queryNorm
              0.72435 = fieldWeight in 1162, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.078125 = fieldNorm(doc=1162)
          0.055080093 = weight(abstract_txt:usage in 1162) [ClassicSimilarity], result of:
            0.055080093 = score(doc=1162,freq=1.0), product of:
              0.12495023 = queryWeight, product of:
                1.3909016 = boost
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.015921101 = queryNorm
              0.44081625 = fieldWeight in 1162, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.078125 = fieldNorm(doc=1162)
          0.054929223 = weight(abstract_txt:semantic in 1162) [ClassicSimilarity], result of:
            0.054929223 = score(doc=1162,freq=1.0), product of:
              0.15713982 = queryWeight, product of:
                2.205901 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.015921101 = queryNorm
              0.34955636 = fieldWeight in 1162, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.078125 = fieldNorm(doc=1162)
          0.15165626 = weight(abstract_txt:meaning in 1162) [ClassicSimilarity], result of:
            0.15165626 = score(doc=1162,freq=2.0), product of:
              0.24546006 = queryWeight, product of:
                2.7569778 = boost
                5.592094 = idf(docFreq=447, maxDocs=44218)
                0.015921101 = queryNorm
              0.61784494 = fieldWeight in 1162, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.592094 = idf(docFreq=447, maxDocs=44218)
                0.078125 = fieldNorm(doc=1162)
          0.3015096 = weight(abstract_txt:word in 1162) [ClassicSimilarity], result of:
            0.3015096 = score(doc=1162,freq=6.0), product of:
              0.28987068 = queryWeight, product of:
                3.3496544 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.015921101 = queryNorm
              1.0401521 = fieldWeight in 1162, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.078125 = fieldNorm(doc=1162)
        0.28 = coord(7/25)
    
  2. Fernández-Reyes, F.C.; Hermosillo-Valadez, J.; Montes-y-Gómez, M.: A prospect-guided global query expansion strategy using word embeddings (2018) 0.16
    0.16383588 = sum of:
      0.16383588 = product of:
        0.8191794 = sum of:
          0.017138612 = weight(abstract_txt:each in 5090) [ClassicSimilarity], result of:
            0.017138612 = score(doc=5090,freq=1.0), product of:
              0.066578045 = queryWeight, product of:
                1.0152977 = boost
                4.118742 = idf(docFreq=1954, maxDocs=44218)
                0.015921101 = queryNorm
              0.25742137 = fieldWeight in 5090, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.118742 = idf(docFreq=1954, maxDocs=44218)
                0.0625 = fieldNorm(doc=5090)
          0.014593296 = weight(abstract_txt:that in 5090) [ClassicSimilarity], result of:
            0.014593296 = score(doc=5090,freq=5.0), product of:
              0.044069305 = queryWeight, product of:
                1.1681832 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.015921101 = queryNorm
              0.3311442 = fieldWeight in 5090, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=5090)
          0.062145323 = weight(abstract_txt:semantic in 5090) [ClassicSimilarity], result of:
            0.062145323 = score(doc=5090,freq=2.0), product of:
              0.15713982 = queryWeight, product of:
                2.205901 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.015921101 = queryNorm
              0.39547786 = fieldWeight in 5090, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.0625 = fieldNorm(doc=5090)
          0.19694524 = weight(abstract_txt:word in 5090) [ClassicSimilarity], result of:
            0.19694524 = score(doc=5090,freq=4.0), product of:
              0.28987068 = queryWeight, product of:
                3.3496544 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.015921101 = queryNorm
              0.67942446 = fieldWeight in 5090, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=5090)
          0.52835697 = weight(abstract_txt:embeddings in 5090) [ClassicSimilarity], result of:
            0.52835697 = score(doc=5090,freq=3.0), product of:
              0.51954395 = queryWeight, product of:
                3.4736385 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.015921101 = queryNorm
              1.016963 = fieldWeight in 5090, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.0625 = fieldNorm(doc=5090)
        0.2 = coord(5/25)
    
  3. Leydesdorff, L.; Zhou, P.: Co-word analysis using the Chinese character set (2008) 0.15
    0.14960226 = sum of:
      0.14960226 = product of:
        0.62334275 = sum of:
          0.00978948 = weight(abstract_txt:that in 1970) [ClassicSimilarity], result of:
            0.00978948 = score(doc=1970,freq=1.0), product of:
              0.044069305 = queryWeight, product of:
                1.1681832 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.015921101 = queryNorm
              0.22213829 = fieldWeight in 1970, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.09375 = fieldNorm(doc=1970)
          0.055996895 = weight(abstract_txt:citations in 1970) [ClassicSimilarity], result of:
            0.055996895 = score(doc=1970,freq=1.0), product of:
              0.11187398 = queryWeight, product of:
                1.3161106 = boost
                5.339045 = idf(docFreq=576, maxDocs=44218)
                0.015921101 = queryNorm
              0.5005355 = fieldWeight in 1970, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.339045 = idf(docFreq=576, maxDocs=44218)
                0.09375 = fieldNorm(doc=1970)
          0.07981446 = weight(abstract_txt:words in 1970) [ClassicSimilarity], result of:
            0.07981446 = score(doc=1970,freq=2.0), product of:
              0.112459846 = queryWeight, product of:
                1.3195523 = boost
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.015921101 = queryNorm
              0.7097151 = fieldWeight in 1970, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.09375 = fieldNorm(doc=1970)
          0.09321798 = weight(abstract_txt:semantic in 1970) [ClassicSimilarity], result of:
            0.09321798 = score(doc=1970,freq=2.0), product of:
              0.15713982 = queryWeight, product of:
                2.205901 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.015921101 = queryNorm
              0.5932168 = fieldWeight in 1970, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.09375 = fieldNorm(doc=1970)
          0.1286846 = weight(abstract_txt:meaning in 1970) [ClassicSimilarity], result of:
            0.1286846 = score(doc=1970,freq=1.0), product of:
              0.24546006 = queryWeight, product of:
                2.7569778 = boost
                5.592094 = idf(docFreq=447, maxDocs=44218)
                0.015921101 = queryNorm
              0.5242588 = fieldWeight in 1970, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.592094 = idf(docFreq=447, maxDocs=44218)
                0.09375 = fieldNorm(doc=1970)
          0.25583935 = weight(abstract_txt:word in 1970) [ClassicSimilarity], result of:
            0.25583935 = score(doc=1970,freq=3.0), product of:
              0.28987068 = queryWeight, product of:
                3.3496544 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.015921101 = queryNorm
              0.8825982 = fieldWeight in 1970, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.09375 = fieldNorm(doc=1970)
        0.24 = coord(6/25)
    
  4. Lee, G.E.; Sun, A.: Understanding the stability of medical concept embeddings (2021) 0.15
    0.14598423 = sum of:
      0.14598423 = product of:
        0.9124015 = sum of:
          0.0065263202 = weight(abstract_txt:that in 159) [ClassicSimilarity], result of:
            0.0065263202 = score(doc=159,freq=1.0), product of:
              0.044069305 = queryWeight, product of:
                1.1681832 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.015921101 = queryNorm
              0.1480922 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=159)
          0.053209636 = weight(abstract_txt:words in 159) [ClassicSimilarity], result of:
            0.053209636 = score(doc=159,freq=2.0), product of:
              0.112459846 = queryWeight, product of:
                1.3195523 = boost
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.015921101 = queryNorm
              0.47314343 = fieldWeight in 159, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.0625 = fieldNorm(doc=159)
          0.17055957 = weight(abstract_txt:word in 159) [ClassicSimilarity], result of:
            0.17055957 = score(doc=159,freq=3.0), product of:
              0.28987068 = queryWeight, product of:
                3.3496544 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.015921101 = queryNorm
              0.5883988 = fieldWeight in 159, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=159)
          0.68210596 = weight(abstract_txt:embeddings in 159) [ClassicSimilarity], result of:
            0.68210596 = score(doc=159,freq=5.0), product of:
              0.51954395 = queryWeight, product of:
                3.4736385 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.015921101 = queryNorm
              1.3128936 = fieldWeight in 159, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.0625 = fieldNorm(doc=159)
        0.16 = coord(4/25)
    
  5. Han, B.; Chen, L.; Tian, X.: Knowledge based collection selection for distributed information retrieval (2018) 0.14
    0.14102097 = sum of:
      0.14102097 = product of:
        0.44069052 = sum of:
          0.017138612 = weight(abstract_txt:each in 3289) [ClassicSimilarity], result of:
            0.017138612 = score(doc=3289,freq=1.0), product of:
              0.066578045 = queryWeight, product of:
                1.0152977 = boost
                4.118742 = idf(docFreq=1954, maxDocs=44218)
                0.015921101 = queryNorm
              0.25742137 = fieldWeight in 3289, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.118742 = idf(docFreq=1954, maxDocs=44218)
                0.0625 = fieldNorm(doc=3289)
          0.08223194 = weight(abstract_txt:scored in 3289) [ClassicSimilarity], result of:
            0.08223194 = score(doc=3289,freq=1.0), product of:
              0.15032491 = queryWeight, product of:
                1.0787687 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.015921101 = queryNorm
              0.547028 = fieldWeight in 3289, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.0625 = fieldNorm(doc=3289)
          0.0113039175 = weight(abstract_txt:that in 3289) [ClassicSimilarity], result of:
            0.0113039175 = score(doc=3289,freq=3.0), product of:
              0.044069305 = queryWeight, product of:
                1.1681832 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.015921101 = queryNorm
              0.2565032 = fieldWeight in 3289, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=3289)
          0.06516823 = weight(abstract_txt:words in 3289) [ClassicSimilarity], result of:
            0.06516823 = score(doc=3289,freq=3.0), product of:
              0.112459846 = queryWeight, product of:
                1.3195523 = boost
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.015921101 = queryNorm
              0.57948 = fieldWeight in 3289, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.0625 = fieldNorm(doc=3289)
          0.044064075 = weight(abstract_txt:usage in 3289) [ClassicSimilarity], result of:
            0.044064075 = score(doc=3289,freq=1.0), product of:
              0.12495023 = queryWeight, product of:
                1.3909016 = boost
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.015921101 = queryNorm
              0.352653 = fieldWeight in 3289, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.0625 = fieldNorm(doc=3289)
          0.062145323 = weight(abstract_txt:semantic in 3289) [ClassicSimilarity], result of:
            0.062145323 = score(doc=3289,freq=2.0), product of:
              0.15713982 = queryWeight, product of:
                2.205901 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.015921101 = queryNorm
              0.39547786 = fieldWeight in 3289, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.0625 = fieldNorm(doc=3289)
          0.08578973 = weight(abstract_txt:meaning in 3289) [ClassicSimilarity], result of:
            0.08578973 = score(doc=3289,freq=1.0), product of:
              0.24546006 = queryWeight, product of:
                2.7569778 = boost
                5.592094 = idf(docFreq=447, maxDocs=44218)
                0.015921101 = queryNorm
              0.34950587 = fieldWeight in 3289, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.592094 = idf(docFreq=447, maxDocs=44218)
                0.0625 = fieldNorm(doc=3289)
          0.07284868 = weight(abstract_txt:documents in 3289) [ClassicSimilarity], result of:
            0.07284868 = score(doc=3289,freq=2.0), product of:
              0.19998258 = queryWeight, product of:
                3.0477867 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.015921101 = queryNorm
              0.36427513 = fieldWeight in 3289, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=3289)
        0.32 = coord(8/25)
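
The score trees above follow the arithmetic of Lucene's ClassicSimilarity: each matching term contributes queryWeight (boost × idf × queryNorm) times fieldWeight (tf × idf × fieldNorm), with tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and the summed contributions are scaled by the coord factor (matched terms / query terms). The sketch below recomputes the score of entry 1 (doc 1162) from the values printed in the listing. The function and variable names are illustrative, not part of Lucene, and small differences in the last digits are expected because Lucene works in single precision.

```python
import math

# Constants copied from the listing above.
MAX_DOCS = 44218
QUERY_NORM = 0.015921101


def idf(doc_freq, max_docs=MAX_DOCS):
    """ClassicSimilarity inverse document frequency."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm):
    """Contribution of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, matched, total):
    """Sum of term contributions, scaled by coord = matched / total."""
    summed = sum(term_score(f, df, b, field_norm) for _, f, df, b in terms)
    return summed * matched / total


# (term, freq, docFreq, boost) for entry 1, doc 1162, fieldNorm 0.078125.
TERMS_1162 = [
    ("each", 1.0, 1954, 1.0152977),
    ("that", 2.0, 11241, 1.1681832),
    ("words", 3.0, 568, 1.3195523),
    ("usage", 1.0, 425, 1.3909016),
    ("semantic", 1.0, 1369, 2.205901),
    ("meaning", 2.0, 447, 2.7569778),
    ("word", 6.0, 523, 3.3496544),
]

score_1162 = document_score(TERMS_1162, 0.078125, matched=7, total=25)
print(round(score_1162, 4))  # the listing reports 0.1897268
```

The same function applies to the other entries by swapping in their term rows, fieldNorm, and coord fraction, for example coord(5/25) and fieldNorm 0.0625 for entry 2.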