Document (#41547)

Author
Ferrer-i-Cancho, R.
Vitevitch, M.S.
Title
¬The origins of Zipf's meaning-frequency law
Source
Journal of the Association for Information Science and Technology. 69(2018) no.11, p.1369-1379
Year
2018
Abstract
In his pioneering research, G.K. Zipf observed that more frequent words tend to have more meanings, and showed that the number of meanings of a word grows as the square root of its frequency. He derived this relationship from two assumptions: that words follow Zipf's law for word frequencies (a power law dependency between frequency and rank) and Zipf's law of meaning distribution (a power law dependency between number of meanings and rank). Here we show that a single assumption on the joint probability of a word and a meaning suffices to infer Zipf's meaning-frequency law or relaxed versions. Interestingly, this assumption can be justified as the outcome of a biased random walk in the process of mental exploration.
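The two-assumption derivation that the abstract attributes to Zipf can be sketched as a short elimination of the rank variable. The symbols used here (f for frequency, \mu for number of meanings, i for rank, and the exponents \alpha, \gamma, \delta) are notation assumed for illustration; the abstract itself states only the two power laws and the square-root result.

```latex
f \propto i^{-\alpha}, \qquad \mu \propto i^{-\gamma}
\;\Longrightarrow\; i \propto f^{-1/\alpha}
\;\Longrightarrow\; \mu \propto f^{\gamma/\alpha} = f^{\delta},
\qquad \delta = \frac{\gamma}{\alpha}.
```

Under this sketch the square-root law corresponds to \delta = 1/2, which is obtained, for instance, with \alpha = 1 and \gamma = 1/2.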
Content
Cf.: https://onlinelibrary.wiley.com/doi/10.1002/asi.24057.
Theme
Informetrics
Object
Zipf's law

Similar documents (author)

  1. Sapena, A. Ferrer- => Ferrer-Sapena, A.: 4.93
    4.9339643 = sum of:
      4.9339643 = weight(author_txt:ferrer in 771) [ClassicSimilarity], result of:
        4.9339643 = fieldWeight in 771, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          9.303573 = idf(docFreq=10, maxDocs=44421)
          0.375 = fieldNorm(doc=771)
    
  2. Ferrer, N. Ferran- => Ferran-Ferrer, N.: 4.93
    4.9339643 = sum of:
      4.9339643 = weight(author_txt:ferrer in 2285) [ClassicSimilarity], result of:
        4.9339643 = fieldWeight in 2285, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          9.303573 = idf(docFreq=10, maxDocs=44421)
          0.375 = fieldNorm(doc=2285)
    
  3. Centelles, M.; Ferran-Ferrer, N.: Taxonomies and ontologies in Wikipedia and Wikidata : an in-depth examination of knowledge organization systems (This article examines Wikipedia's knowledge organization system (KOS) and the broader KOS of Wikidata. We study the structure, functions, and relationship of Wikipedia's KOS to concepts like taxonomies and folksonomies, highlighting its unique characteristics compared to social media. A significant aspect of our examination is the gender-related content classification in the Catalan edition of Wikipedia (Viquipèdia), which notably excludes female categories and non-binary gender classifications. We explore the potential implications of these restrictions on gender bias within the platform. Furthermore, we broaden our investigative methodology to assess the KOS of Wikidata. Wikidata is a dataset built on ontological principles, designed to enhance and enrich Wikipedia's digital, collaborative encyclopedia. The findings shed light on the presence or absence of gender bias and contribute to the ongoing discourse on promoting inclusivity and diversity in online knowledge sharing.) 4.07
    4.070313 = sum of:
      4.070313 = weight(author_txt:ferrer in 2296) [ClassicSimilarity], result of:
        4.070313 = fieldWeight in 2296, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.303573 = idf(docFreq=10, maxDocs=44421)
          0.4375 = fieldNorm(doc=2296)
    
  4. Miro, A.B.; Sahun, X.B.; Ferrer, M.E.: ¬La Library of Congress Classification à la Biblioteca de la Universitat Pompeu Fabra (1993) 3.49
    3.4888396 = sum of:
      3.4888396 = weight(author_txt:ferrer in 7089) [ClassicSimilarity], result of:
        3.4888396 = fieldWeight in 7089, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.303573 = idf(docFreq=10, maxDocs=44421)
          0.375 = fieldNorm(doc=7089)
    
  5. Ferrer Morillo, L.M.; Portillo de Hernández, R.: Tesauros transdisciplinarios : del reduccionismo científico a la unidad del conocimiento (2007) 3.49
    3.4888396 = sum of:
      3.4888396 = weight(author_txt:ferrer in 2107) [ClassicSimilarity], result of:
        3.4888396 = fieldWeight in 2107, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.303573 = idf(docFreq=10, maxDocs=44421)
          0.375 = fieldNorm(doc=2107)
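
The author-similarity scores above can be recomputed from the three factors shown in each tree. The following is a minimal sketch, assuming the default Lucene ClassicSimilarity formulas tf = sqrt(termFreq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are hypothetical, and fieldNorm is taken as given from the listing rather than derived.

```python
import math

def classic_tf(term_freq: float) -> float:
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(term_freq)

def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def field_weight(term_freq: float, doc_freq: int, max_docs: int,
                 field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as shown in each tree."""
    return classic_tf(term_freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values copied from entries 1 and 3 of the author list.
score_entry_1 = field_weight(2.0, 10, 44421, 0.375)   # listed as 4.9339643
score_entry_3 = field_weight(1.0, 10, 44421, 0.4375)  # listed as 4.070313
```

Small deviations in the last digits are to be expected, because the listing reflects single-precision arithmetic while Python uses double precision.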
    

Similar documents (content)

  1. Sun, Q.; Shaw, D.; Davis, C.H.: ¬A model for estimating the occurrence of same-frequency words and the boundary between high- and low-frequency words in texts (1999) 0.20
    0.19527425 = sum of:
      0.19527425 = product of:
        0.81364274 = sum of:
          0.08722218 = weight(abstract_txt:root in 4063) [ClassicSimilarity], result of:
            0.08722218 = score(doc=4063,freq=1.0), product of:
              0.11682491 = queryWeight, product of:
                1.1476381 = boost
                7.963798 = idf(docFreq=41, maxDocs=44421)
                0.012782337 = queryNorm
              0.74660605 = fieldWeight in 4063, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.963798 = idf(docFreq=41, maxDocs=44421)
                0.09375 = fieldNorm(doc=4063)
          0.024430666 = weight(abstract_txt:number in 4063) [ClassicSimilarity], result of:
            0.024430666 = score(doc=4063,freq=1.0), product of:
              0.06301119 = queryWeight, product of:
                1.1919583 = boost
                4.1356745 = idf(docFreq=1930, maxDocs=44421)
                0.012782337 = queryNorm
              0.38771948 = fieldWeight in 4063, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1356745 = idf(docFreq=1930, maxDocs=44421)
                0.09375 = fieldNorm(doc=4063)
          0.10394746 = weight(abstract_txt:square in 4063) [ClassicSimilarity], result of:
            0.10394746 = score(doc=4063,freq=1.0), product of:
              0.13131875 = queryWeight, product of:
                1.2167479 = boost
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.012782337 = queryNorm
              0.791566 = fieldWeight in 4063, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.09375 = fieldNorm(doc=4063)
          0.12997283 = weight(abstract_txt:words in 4063) [ClassicSimilarity], result of:
            0.12997283 = score(doc=4063,freq=6.0), product of:
              0.105676584 = queryWeight, product of:
                1.5436243 = boost
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.012782337 = queryNorm
              1.2299113 = fieldWeight in 4063, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.09375 = fieldNorm(doc=4063)
          0.08331473 = weight(abstract_txt:word in 4063) [ClassicSimilarity], result of:
            0.08331473 = score(doc=4063,freq=1.0), product of:
              0.1634202 = queryWeight, product of:
                2.3509898 = boost
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.012782337 = queryNorm
              0.50981903 = fieldWeight in 4063, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.09375 = fieldNorm(doc=4063)
          0.38475487 = weight(abstract_txt:frequency in 4063) [ClassicSimilarity], result of:
            0.38475487 = score(doc=4063,freq=7.0), product of:
              0.26075193 = queryWeight, product of:
                3.4291067 = boost
                5.948895 = idf(docFreq=314, maxDocs=44421)
                0.012782337 = queryNorm
              1.475559 = fieldWeight in 4063, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.948895 = idf(docFreq=314, maxDocs=44421)
                0.09375 = fieldNorm(doc=4063)
        0.24 = coord(6/25)
    
  2. Arsenault, C.: Aggregation consistency and frequency of Chinese words and characters (2006) 0.19
    0.1876648 = sum of:
      0.1876648 = product of:
        0.7819367 = sum of:
          0.07349347 = weight(abstract_txt:zipf in 734) [ClassicSimilarity], result of:
            0.07349347 = score(doc=734,freq=1.0), product of:
              0.13656649 = queryWeight, product of:
                1.2408215 = boost
                8.610425 = idf(docFreq=21, maxDocs=44421)
                0.012782337 = queryNorm
              0.53815156 = fieldWeight in 734, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.610425 = idf(docFreq=21, maxDocs=44421)
                0.0625 = fieldNorm(doc=734)
          0.01362002 = weight(abstract_txt:that in 734) [ClassicSimilarity], result of:
            0.01362002 = score(doc=734,freq=5.0), product of:
              0.04120913 = queryWeight, product of:
                1.3632138 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.012782337 = queryNorm
              0.33050975 = fieldWeight in 734, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=734)
          0.050026562 = weight(abstract_txt:words in 734) [ClassicSimilarity], result of:
            0.050026562 = score(doc=734,freq=2.0), product of:
              0.105676584 = queryWeight, product of:
                1.5436243 = boost
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.012782337 = queryNorm
              0.47339305 = fieldWeight in 734, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.0625 = fieldNorm(doc=734)
          0.055543147 = weight(abstract_txt:word in 734) [ClassicSimilarity], result of:
            0.055543147 = score(doc=734,freq=1.0), product of:
              0.1634202 = queryWeight, product of:
                2.3509898 = boost
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.012782337 = queryNorm
              0.33987933 = fieldWeight in 734, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.0625 = fieldNorm(doc=734)
          0.19389823 = weight(abstract_txt:frequency in 734) [ClassicSimilarity], result of:
            0.19389823 = score(doc=734,freq=4.0), product of:
              0.26075193 = queryWeight, product of:
                3.4291067 = boost
                5.948895 = idf(docFreq=314, maxDocs=44421)
                0.012782337 = queryNorm
              0.7436119 = fieldWeight in 734, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.948895 = idf(docFreq=314, maxDocs=44421)
                0.0625 = fieldNorm(doc=734)
          0.39535528 = weight(abstract_txt:zipf's in 734) [ClassicSimilarity], result of:
            0.39535528 = score(doc=734,freq=1.0), product of:
              0.6655643 = queryWeight, product of:
                5.478507 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.012782337 = queryNorm
              0.5940152 = fieldWeight in 734, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.0625 = fieldNorm(doc=734)
        0.24 = coord(6/25)
    
  3. Egghe, L.: Zipfian and Lotkaian continuous concentration theory (2005) 0.16
    0.15642734 = sum of:
      0.15642734 = product of:
        0.9776709 = sum of:
          0.15911803 = weight(abstract_txt:zipf in 4678) [ClassicSimilarity], result of:
            0.15911803 = score(doc=4678,freq=3.0), product of:
              0.13656649 = queryWeight, product of:
                1.2408215 = boost
                8.610425 = idf(docFreq=21, maxDocs=44421)
                0.012782337 = queryNorm
              1.1651323 = fieldWeight in 4678, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.610425 = idf(docFreq=21, maxDocs=44421)
                0.078125 = fieldNorm(doc=4678)
          0.010767572 = weight(abstract_txt:that in 4678) [ClassicSimilarity], result of:
            0.010767572 = score(doc=4678,freq=2.0), product of:
              0.04120913 = queryWeight, product of:
                1.3632138 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.012782337 = queryNorm
              0.2612909 = fieldWeight in 4678, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.078125 = fieldNorm(doc=4678)
          0.1088894 = weight(abstract_txt:power in 4678) [ClassicSimilarity], result of:
            0.1088894 = score(doc=4678,freq=4.0), product of:
              0.121400006 = queryWeight, product of:
                1.6544802 = boost
                5.7404623 = idf(docFreq=387, maxDocs=44421)
                0.012782337 = queryNorm
              0.89694726 = fieldWeight in 4678, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.7404623 = idf(docFreq=387, maxDocs=44421)
                0.078125 = fieldNorm(doc=4678)
          0.69889593 = weight(abstract_txt:zipf's in 4678) [ClassicSimilarity], result of:
            0.69889593 = score(doc=4678,freq=2.0), product of:
              0.6655643 = queryWeight, product of:
                5.478507 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.012782337 = queryNorm
              1.0500803 = fieldWeight in 4678, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.078125 = fieldNorm(doc=4678)
        0.16 = coord(4/25)
    
  4. Egghe, L.: ¬A new short proof of Naranan's theorem, explaining Lotka's law and Zipf's law (2010) 0.12
    0.123174235 = sum of:
      0.123174235 = product of:
        0.769839 = sum of:
          0.12933478 = weight(abstract_txt:grows in 419) [ClassicSimilarity], result of:
            0.12933478 = score(doc=419,freq=2.0), product of:
              0.12057326 = queryWeight, product of:
                1.1659038 = boost
                8.090549 = idf(docFreq=36, maxDocs=44421)
                0.012782337 = queryNorm
              1.0726655 = fieldWeight in 419, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.090549 = idf(docFreq=36, maxDocs=44421)
                0.09375 = fieldNorm(doc=419)
          0.03455018 = weight(abstract_txt:number in 419) [ClassicSimilarity], result of:
            0.03455018 = score(doc=419,freq=2.0), product of:
              0.06301119 = queryWeight, product of:
                1.1919583 = boost
                4.1356745 = idf(docFreq=1930, maxDocs=44421)
                0.012782337 = queryNorm
              0.54831815 = fieldWeight in 419, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1356745 = idf(docFreq=1930, maxDocs=44421)
                0.09375 = fieldNorm(doc=419)
          0.012921085 = weight(abstract_txt:that in 419) [ClassicSimilarity], result of:
            0.012921085 = score(doc=419,freq=2.0), product of:
              0.04120913 = queryWeight, product of:
                1.3632138 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.012782337 = queryNorm
              0.31354907 = fieldWeight in 419, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.09375 = fieldNorm(doc=419)
          0.59303296 = weight(abstract_txt:zipf's in 419) [ClassicSimilarity], result of:
            0.59303296 = score(doc=419,freq=1.0), product of:
              0.6655643 = queryWeight, product of:
                5.478507 = boost
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.012782337 = queryNorm
              0.8910228 = fieldWeight in 419, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.504243 = idf(docFreq=8, maxDocs=44421)
                0.09375 = fieldNorm(doc=419)
        0.16 = coord(4/25)
    
  5. Symonds, M.; Bruza, P.; Zuccon, G.; Koopman, B.; Sitbon, L.; Turner, I.: Automatic query expansion : a structural linguistic perspective (2014) 0.12
    0.12245833 = sum of:
      0.12245833 = product of:
        0.51024306 = sum of:
          0.056178093 = weight(abstract_txt:infer in 2338) [ClassicSimilarity], result of:
            0.056178093 = score(doc=2338,freq=1.0), product of:
              0.11417113 = queryWeight, product of:
                1.1345284 = boost
                7.872826 = idf(docFreq=45, maxDocs=44421)
                0.012782337 = queryNorm
              0.49205163 = fieldWeight in 2338, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.872826 = idf(docFreq=45, maxDocs=44421)
                0.0625 = fieldNorm(doc=2338)
          0.01362002 = weight(abstract_txt:that in 2338) [ClassicSimilarity], result of:
            0.01362002 = score(doc=2338,freq=5.0), product of:
              0.04120913 = queryWeight, product of:
                1.3632138 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.012782337 = queryNorm
              0.33050975 = fieldWeight in 2338, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=2338)
          0.124467865 = weight(abstract_txt:dependency in 2338) [ClassicSimilarity], result of:
            0.124467865 = score(doc=2338,freq=1.0), product of:
              0.2444705 = queryWeight, product of:
                2.3478236 = boost
                8.146119 = idf(docFreq=34, maxDocs=44421)
                0.012782337 = queryNorm
              0.50913244 = fieldWeight in 2338, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.146119 = idf(docFreq=34, maxDocs=44421)
                0.0625 = fieldNorm(doc=2338)
          0.09620355 = weight(abstract_txt:word in 2338) [ClassicSimilarity], result of:
            0.09620355 = score(doc=2338,freq=3.0), product of:
              0.1634202 = queryWeight, product of:
                2.3509898 = boost
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.012782337 = queryNorm
              0.58868825 = fieldWeight in 2338, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.0625 = fieldNorm(doc=2338)
          0.10601465 = weight(abstract_txt:meanings in 2338) [ClassicSimilarity], result of:
            0.10601465 = score(doc=2338,freq=1.0), product of:
              0.25145638 = queryWeight, product of:
                2.9162798 = boost
                6.7456408 = idf(docFreq=141, maxDocs=44421)
                0.012782337 = queryNorm
              0.42160255 = fieldWeight in 2338, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7456408 = idf(docFreq=141, maxDocs=44421)
                0.0625 = fieldNorm(doc=2338)
          0.113758914 = weight(abstract_txt:meaning in 2338) [ClassicSimilarity], result of:
            0.113758914 = score(doc=2338,freq=2.0), product of:
              0.23023885 = queryWeight, product of:
                3.22223 = boost
                5.59 = idf(docFreq=450, maxDocs=44421)
                0.012782337 = queryNorm
              0.49409086 = fieldWeight in 2338, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.59 = idf(docFreq=450, maxDocs=44421)
                0.0625 = fieldNorm(doc=2338)
        0.24 = coord(6/25)
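
The content-similarity trees combine a query-side weight, a field-side weight, and a coordination factor. The following is a minimal sketch of that combination for entry 1 (doc 4063), assuming the same ClassicSimilarity tf and idf formulas as above; boost, queryNorm, fieldNorm, and coord are copied from the listing rather than derived, and all names are hypothetical.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.012782337   # copied from the listing
FIELD_NORM = 0.09375       # fieldNorm(doc=4063), copied from the listing
COORD = 6 / 25             # 6 of 25 query terms matched

# (term, boost, docFreq, termFreq) copied from the tree for doc 4063
TERMS = [
    ("root", 1.1476381, 41, 1.0),
    ("number", 1.1919583, 1930, 1.0),
    ("square", 1.2167479, 25, 1.0),
    ("words", 1.5436243, 569, 6.0),
    ("word", 2.3509898, 524, 1.0),
    ("frequency", 3.4291067, 314, 7.0),
]

def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_score(boost: float, doc_freq: int, term_freq: float,
               field_norm: float) -> float:
    """score = queryWeight * fieldWeight for a single matching term."""
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(term_freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight

def document_score(terms, field_norm: float, coord: float) -> float:
    """Sum the per-term scores, then scale by the coordination factor."""
    return coord * sum(term_score(b, df, tf, field_norm)
                       for _, b, df, tf in terms)

total = document_score(TERMS, FIELD_NORM, COORD)  # listed as 0.19527425
```

The coordination factor explains why entries with a large raw sum can still rank low: entry 3 has the highest sum (0.9776709) but matches only 4 of 25 query terms.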