Document (#36114)

Author
Milojevic, S.
Title
Power law distributions in information science : making the case for logarithmic binning
Source
Journal of the American Society for Information Science and Technology. 61(2010) no.12, pp. 2417-2425
Year
2010
Abstract
We suggest partial logarithmic binning as the method of choice for uncovering the nature of many distributions encountered in information science (IS). Logarithmic binning retrieves information and trends "not visible" in noisy power law tails. We also argue that obtaining the exponent from logarithmically binned data using a simple least squares method is in some cases warranted in addition to methods such as maximum likelihood. We also show why often-used cumulative distributions can make it difficult to distinguish noise from genuine features and to obtain an accurate power law exponent of the underlying distribution. The treatment is nontechnical, aimed at IS researchers with little or no background in mathematics.
Theme
Informetrics
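
The abstract describes estimating a power law exponent by a least squares fit on logarithmically binned data. The record does not spell out the procedure, so the following is only a minimal sketch of ordinary logarithmic binning, not necessarily the "partial" variant the paper proposes. The function names, the `bins_per_decade` parameter and the choice of geometric bin centers are assumptions made for illustration.

```python
import math

def log_binned_density(values, bins_per_decade=5):
    """Histogram positive values in bins of equal width in log10 space.

    Returns (centers, density): geometric bin centers and counts divided by
    bin width and sample size. Empty bins are dropped because their
    logarithm is undefined.
    """
    data = [float(v) for v in values if v > 0]
    lo, hi = math.log10(min(data)), math.log10(max(data))
    if hi == lo:
        raise ValueError("need at least two distinct positive values")
    n_bins = max(1, math.ceil((hi - lo) * bins_per_decade))
    step = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in data:
        # Clamp the index so the maximum value lands in the last bin.
        i = min(int((math.log10(v) - lo) / step), n_bins - 1)
        counts[i] += 1
    centers, density = [], []
    for i, c in enumerate(counts):
        if c == 0:
            continue
        left = 10 ** (lo + i * step)
        right = 10 ** (lo + (i + 1) * step)
        centers.append(math.sqrt(left * right))
        density.append(c / ((right - left) * len(data)))
    return centers, density

def loglog_slope(centers, density):
    """Ordinary least squares slope of log10(density) against log10(center)."""
    xs = [math.log10(c) for c in centers]
    ys = [math.log10(d) for d in density]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

# Illustrative input: evenly spaced quantiles of a density proportional to x**-2.5.
n = 50000
sample = [((k + 0.5) / n) ** (-1 / 1.5) for k in range(n)]
centers, density = log_binned_density(sample)
print("estimated slope:", loglog_slope(centers, density))
```

Dividing each count by its bin width is what turns the binned counts into a density estimate. Without that step the fitted slope would differ from the exponent of the underlying distribution by one.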

Similar documents (author)

  1. Milojevic, S.: Modes of collaboration in modern science : beyond power laws and preferential attachment (2010) 6.19
    6.1935673 = sum of:
      6.1935673 = weight(author_txt:milojevic in 579) [ClassicSimilarity], result of:
        6.1935673 = fieldWeight in 579, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.625 = fieldNorm(doc=579)
    
  2. Zhang, G.; Ding, Y.; Milojevic, S.: Citation content analysis (CCA) : a framework for syntactic and semantic analysis of citation content (2013) 3.72
    3.7161405 = sum of:
      3.7161405 = weight(author_txt:milojevic in 1975) [ClassicSimilarity], result of:
        3.7161405 = fieldWeight in 1975, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.375 = fieldNorm(doc=1975)
    
  3. Milojevic, S.; Sugimoto, C.R.; Yan, E.; Ding, Y.: The cognitive structure of Library and Information Science : analysis of article title words (2011) 3.10
    3.0967836 = sum of:
      3.0967836 = weight(author_txt:milojevic in 608) [ClassicSimilarity], result of:
        3.0967836 = fieldWeight in 608, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.3125 = fieldNorm(doc=608)
    
  4. Hu, B.; Dong, X.; Zhang, C.; Bowman, T.D.; Ding, Y.; Milojevic, S.; Ni, C.; Yan, E.; Larivière, V.: A lead-lag analysis of the topic evolution patterns for preprints and publications (2015) 2.17
    2.1677487 = sum of:
      2.1677487 = weight(author_txt:milojevic in 3337) [ClassicSimilarity], result of:
        2.1677487 = fieldWeight in 3337, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.21875 = fieldNorm(doc=3337)
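
The author scores above are each a single fieldWeight, the product of tf, idf and fieldNorm. As a rough check, the sketch below recomputes them using the ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The helper names are hypothetical, and the fieldNorm values are copied from the listing rather than derived.

```python
import math

def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    # ClassicSimilarity term frequency factor.
    return math.sqrt(freq)

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in the explain listing.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# (internal doc id, fieldNorm) for the four author matches listed above.
author_matches = [(579, 0.625), (1975, 0.375), (608, 0.3125), (3337, 0.21875)]
for doc_id, norm in author_matches:
    print(doc_id, field_weight(1.0, 5, 44421, norm))
```

The printed values should agree with the listed scores up to floating point rounding. All four entries share the same tf and idf, so the ranking among them is determined entirely by fieldNorm.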
    

Similar documents (content)

  1. Leydesdorff, L.; Bensman, S.: Classification and Powerlaws : the logarithmic transformation (2006) 0.24
    0.23772159 = sum of:
      0.23772159 = product of:
        1.1886079 = sum of:
          0.0073304004 = weight(abstract_txt:information in 7) [ClassicSimilarity], result of:
            0.0073304004 = score(doc=7,freq=1.0), product of:
              0.038790006 = queryWeight, product of:
                1.0687346 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.015004867 = queryNorm
              0.18897653 = fieldWeight in 7, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.078125 = fieldNorm(doc=7)
          0.019713474 = weight(abstract_txt:science in 7) [ClassicSimilarity], result of:
            0.019713474 = score(doc=7,freq=1.0), product of:
              0.06553094 = queryWeight, product of:
                1.1341945 = boost
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.015004867 = queryNorm
              0.30082697 = fieldWeight in 7, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.078125 = fieldNorm(doc=7)
          0.15379867 = weight(abstract_txt:tails in 7) [ClassicSimilarity], result of:
            0.15379867 = score(doc=7,freq=1.0), product of:
              0.20459548 = queryWeight, product of:
                1.4170897 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.015004867 = queryNorm
              0.7517208 = fieldWeight in 7, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.078125 = fieldNorm(doc=7)
          0.28535485 = weight(abstract_txt:distributions in 7) [ClassicSimilarity], result of:
            0.28535485 = score(doc=7,freq=3.0), product of:
              0.30892363 = queryWeight, product of:
                3.016029 = boost
                6.82627 = idf(docFreq=130, maxDocs=44421)
                0.015004867 = queryNorm
              0.92370677 = fieldWeight in 7, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.82627 = idf(docFreq=130, maxDocs=44421)
                0.078125 = fieldNorm(doc=7)
          0.72241056 = weight(abstract_txt:logarithmic in 7) [ClassicSimilarity], result of:
            0.72241056 = score(doc=7,freq=3.0), product of:
              0.5738306 = queryWeight, product of:
                4.1105676 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.015004867 = queryNorm
              1.2589265 = fieldWeight in 7, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.078125 = fieldNorm(doc=7)
        0.2 = coord(5/25)
    
  2. Bodoff, D.; Wu, B.; Wong, K.Y.M.: Relevance data for language models using maximum likelihood (2003) 0.13
    0.12588517 = sum of:
      0.12588517 = product of:
        0.5245216 = sum of:
          0.016213689 = weight(abstract_txt:also in 2822) [ClassicSimilarity], result of:
            0.016213689 = score(doc=2822,freq=1.0), product of:
              0.050941456 = queryWeight, product of:
                3.3949955 = idf(docFreq=4049, maxDocs=44421)
                0.015004867 = queryNorm
              0.31828082 = fieldWeight in 2822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3949955 = idf(docFreq=4049, maxDocs=44421)
                0.09375 = fieldNorm(doc=2822)
          0.11151818 = weight(abstract_txt:maximum in 2822) [ClassicSimilarity], result of:
            0.11151818 = score(doc=2822,freq=2.0), product of:
              0.116063036 = queryWeight, product of:
                1.067324 = boost
                7.2471204 = idf(docFreq=85, maxDocs=44421)
                0.015004867 = queryNorm
              0.9608415 = fieldWeight in 2822, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2471204 = idf(docFreq=85, maxDocs=44421)
                0.09375 = fieldNorm(doc=2822)
          0.0087964805 = weight(abstract_txt:information in 2822) [ClassicSimilarity], result of:
            0.0087964805 = score(doc=2822,freq=1.0), product of:
              0.038790006 = queryWeight, product of:
                1.0687346 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.015004867 = queryNorm
              0.22677183 = fieldWeight in 2822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.09375 = fieldNorm(doc=2822)
          0.12494788 = weight(abstract_txt:likelihood in 2822) [ClassicSimilarity], result of:
            0.12494788 = score(doc=2822,freq=2.0), product of:
              0.12520339 = queryWeight, product of:
                1.1085553 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.015004867 = queryNorm
              0.9979593 = fieldWeight in 2822, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.09375 = fieldNorm(doc=2822)
          0.06534567 = weight(abstract_txt:method in 2822) [ClassicSimilarity], result of:
            0.06534567 = score(doc=2822,freq=3.0), product of:
              0.08945149 = queryWeight, product of:
                1.3251289 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.015004867 = queryNorm
              0.7305151 = fieldWeight in 2822, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.09375 = fieldNorm(doc=2822)
          0.19769964 = weight(abstract_txt:distributions in 2822) [ClassicSimilarity], result of:
            0.19769964 = score(doc=2822,freq=1.0), product of:
              0.30892363 = queryWeight, product of:
                3.016029 = boost
                6.82627 = idf(docFreq=130, maxDocs=44421)
                0.015004867 = queryNorm
              0.6399628 = fieldWeight in 2822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.82627 = idf(docFreq=130, maxDocs=44421)
                0.09375 = fieldNorm(doc=2822)
        0.24 = coord(6/25)
    
  3. Payne, N.; Thelwall, M.: Mathematical models for academic webs : linear relationship or non-linear power law? (2005) 0.09
    0.089023806 = sum of:
      0.089023806 = product of:
        0.74186504 = sum of:
          0.03772734 = weight(abstract_txt:method in 2066) [ClassicSimilarity], result of:
            0.03772734 = score(doc=2066,freq=1.0), product of:
              0.08945149 = queryWeight, product of:
                1.3251289 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.015004867 = queryNorm
              0.42176312 = fieldWeight in 2066, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.09375 = fieldNorm(doc=2066)
          0.20363699 = weight(abstract_txt:power in 2066) [ClassicSimilarity], result of:
            0.20363699 = score(doc=2066,freq=3.0), product of:
              0.218463 = queryWeight, product of:
                2.53629 = boost
                5.7404623 = idf(docFreq=387, maxDocs=44421)
                0.015004867 = queryNorm
              0.93213487 = fieldWeight in 2066, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7404623 = idf(docFreq=387, maxDocs=44421)
                0.09375 = fieldNorm(doc=2066)
          0.50050074 = weight(abstract_txt:logarithmic in 2066) [ClassicSimilarity], result of:
            0.50050074 = score(doc=2066,freq=1.0), product of:
              0.5738306 = queryWeight, product of:
                4.1105676 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.015004867 = queryNorm
              0.8722099 = fieldWeight in 2066, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.09375 = fieldNorm(doc=2066)
        0.12 = coord(3/25)
    
  4. Waltman, L.; Eck, N.J. van; Raan, A.F.J. van: Universality of citation distributions revisited (2012) 0.09
    0.08801622 = sum of:
      0.08801622 = product of:
        0.5501014 = sum of:
          0.016213689 = weight(abstract_txt:also in 963) [ClassicSimilarity], result of:
            0.016213689 = score(doc=963,freq=1.0), product of:
              0.050941456 = queryWeight, product of:
                3.3949955 = idf(docFreq=4049, maxDocs=44421)
                0.015004867 = queryNorm
              0.31828082 = fieldWeight in 963, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3949955 = idf(docFreq=4049, maxDocs=44421)
                0.09375 = fieldNorm(doc=963)
          0.033454873 = weight(abstract_txt:science in 963) [ClassicSimilarity], result of:
            0.033454873 = score(doc=963,freq=2.0), product of:
              0.06553094 = queryWeight, product of:
                1.1341945 = boost
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.015004867 = queryNorm
              0.5105203 = fieldWeight in 963, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.09375 = fieldNorm(doc=963)
          0.15800704 = weight(abstract_txt:warranted in 963) [ClassicSimilarity], result of:
            0.15800704 = score(doc=963,freq=1.0), product of:
              0.18446945 = queryWeight, product of:
                1.3455863 = boost
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.015004867 = queryNorm
              0.8565486 = fieldWeight in 963, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.09375 = fieldNorm(doc=963)
          0.3424258 = weight(abstract_txt:distributions in 963) [ClassicSimilarity], result of:
            0.3424258 = score(doc=963,freq=3.0), product of:
              0.30892363 = queryWeight, product of:
                3.016029 = boost
                6.82627 = idf(docFreq=130, maxDocs=44421)
                0.015004867 = queryNorm
              1.108448 = fieldWeight in 963, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.82627 = idf(docFreq=130, maxDocs=44421)
                0.09375 = fieldNorm(doc=963)
        0.16 = coord(4/25)
    
  5. Ronda-Pupo, G.A.; Katz, J.S.: The scaling relationship between citation-based performance and coauthorship patterns in natural sciences (2017) 0.08
    0.07953777 = sum of:
      0.07953777 = product of:
        0.6628148 = sum of:
          0.07600012 = weight(abstract_txt:cumulative in 4603) [ClassicSimilarity], result of:
            0.07600012 = score(doc=4603,freq=1.0), product of:
              0.12788035 = queryWeight, product of:
                1.1203436 = boost
                7.607123 = idf(docFreq=59, maxDocs=44421)
                0.015004867 = queryNorm
              0.59430647 = fieldWeight in 4603, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.607123 = idf(docFreq=59, maxDocs=44421)
                0.078125 = fieldNorm(doc=4603)
          0.16969748 = weight(abstract_txt:power in 4603) [ClassicSimilarity], result of:
            0.16969748 = score(doc=4603,freq=3.0), product of:
              0.218463 = queryWeight, product of:
                2.53629 = boost
                5.7404623 = idf(docFreq=387, maxDocs=44421)
                0.015004867 = queryNorm
              0.77677906 = fieldWeight in 4603, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7404623 = idf(docFreq=387, maxDocs=44421)
                0.078125 = fieldNorm(doc=4603)
          0.4171172 = weight(abstract_txt:exponent in 4603) [ClassicSimilarity], result of:
            0.4171172 = score(doc=4603,freq=3.0), product of:
              0.3475916 = queryWeight, product of:
                2.6121552 = boost
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.015004867 = queryNorm
              1.2000209 = fieldWeight in 4603, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.078125 = fieldNorm(doc=4603)
        0.12 = coord(3/25)
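
The content scores above combine, for each matched term, a queryWeight (boost * idf * queryNorm) with a fieldWeight (tf * idf * fieldNorm). The sum over matched terms is then scaled by coord, the fraction of the 25 query terms that matched. The sketch below recomputes entry 3 (doc 2066) from the values in its listing. The function and variable names are hypothetical, and the boosts, queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.015004867
QUERY_TERM_COUNT = 25  # denominator of coord(3/25) in the listing

def classic_idf(doc_freq, max_docs=MAX_DOCS):
    # ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, boost, field_norm):
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

# term: (freq, docFreq, boost), copied from the explain listing for doc 2066.
matched_terms = {
    "method": (1.0, 1342, 1.3251289),
    "power": (3.0, 387, 2.53629),
    "logarithmic": (1.0, 10, 4.1105676),
}
FIELD_NORM = 0.09375

term_sum = sum(
    term_score(freq, doc_freq, boost, FIELD_NORM)
    for freq, doc_freq, boost in matched_terms.values()
)
coord = len(matched_terms) / QUERY_TERM_COUNT
score = term_sum * coord
print("sum of term scores:", term_sum)
print("final score:", score)
```

Because idf enters both queryWeight and fieldWeight, rare terms such as "logarithmic" (docFreq=10) dominate the sum, while the coord factor penalizes documents that match only a few of the query terms.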