Document (#32006)

Kleinberg, J.M.
Authoritative sources in a hyperlinked environment
Journal of the Association for Computing Machinery. 46(1999) no.5, pp.604-632
The network structure of a hyperlinked environment can be a rich source of information about the content of the environment, provided we have effective means for understanding it. We develop a set of algorithmic tools for extracting information from the link structures of such environments, and report on experiments that demonstrate their effectiveness in a variety of contexts on the World Wide Web. The central issue we address within our framework is the distillation of broad search topics, through the discovery of "authoritative" information sources on such topics. We propose and test an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of "hub pages" that join them together in the link structure. Our formulation has connections to the eigenvectors of certain matrices associated with the link graph; these connections in turn motivate additional heuristics for link-based analysis.
Earlier versions also in: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, 1998, and as IBM Research Report RJ 10076, May 1997.
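
The abstract describes authority as a mutually reinforcing relationship between authoritative pages and the hub pages that link to them, with a connection to eigenvectors of matrices derived from the link graph. The following is a minimal sketch of that idea as it is commonly presented (the hubs-and-authorities iteration, often called HITS), not the paper's exact procedure. The function names, the toy edge list, and the fixed iteration count are assumptions made for illustration.

```python
import math


def normalize(scores):
    """Scale a score dict to unit Euclidean length (left unchanged if all zero)."""
    length = math.sqrt(sum(value * value for value in scores.values()))
    if length == 0.0:
        return dict(scores)
    return {node: value / length for node, value in scores.items()}


def hubs_and_authorities(edges, iterations=50):
    """Return (authority, hub) score dicts for a list of (source, target) links.

    Each round sets a page's authority score to the sum of the hub scores of
    the pages linking to it, then sets a page's hub score to the sum of the
    authority scores of the pages it links to, and renormalizes both.
    """
    nodes = sorted({node for edge in edges for node in edge})
    authority = {node: 1.0 for node in nodes}
    hub = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        new_authority = {node: 0.0 for node in nodes}
        for source, target in edges:
            new_authority[target] += hub[source]
        new_hub = {node: 0.0 for node in nodes}
        for source, target in edges:
            new_hub[source] += new_authority[target]
        authority = normalize(new_authority)
        hub = normalize(new_hub)
    return authority, hub


# Hypothetical toy graph: three hub-like pages pointing at two targets.
EXAMPLE_EDGES = [
    ("h1", "a1"), ("h1", "a2"),
    ("h2", "a1"), ("h2", "a2"),
    ("h3", "a2"),
]

if __name__ == "__main__":
    authority, hub = hubs_and_authorities(EXAMPLE_EDGES)
    for node in sorted(authority):
        print(node, round(authority[node], 4), round(hub[node], 4))
```

In this toy graph the page with more incoming links from good hubs (a2) ends up with the higher authority score, and the pages that point to both targets (h1, h2) end up as the stronger hubs. Repeating the two updates is a power iteration, which is where the eigenvector connection mentioned in the abstract comes from.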

Similar documents (author)

  1. Kleinberg, I.: Making the case for professional indexers : where is the proof? (1993) 6.19
    6.190705 = sum of:
      6.190705 = weight(author_txt:kleinberg in 7766) [ClassicSimilarity], result of:
        6.190705 = fieldWeight in 7766, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.625 = fieldNorm(doc=7766)
  2. Kleinberg, I.: For want of an alphabetical index : some notes toward a history of the back-of-the-book index in nineteenth century America (1997) 6.19
    6.190705 = sum of:
      6.190705 = weight(author_txt:kleinberg in 3734) [ClassicSimilarity], result of:
        6.190705 = fieldWeight in 3734, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.625 = fieldNorm(doc=3734)
  3. Liben-Nowell, D.; Kleinberg, J.: The link-prediction problem for social networks (2007) 4.33
    4.333493 = sum of:
      4.333493 = weight(author_txt:kleinberg in 330) [ClassicSimilarity], result of:
        4.333493 = fieldWeight in 330, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.4375 = fieldNorm(doc=330)
  4. Chakrabarti, S.; Dom, B.; Kumar, S.R.; Raghavan, P.; Rajagopalan, S.; Tomkins, A.; Kleinberg, J.M.; Gibson, D.: Neue Pfade durch den Internet-Dschungel : Die zweite Generation von Web-Suchmaschinen (1999) 2.48
    2.476282 = sum of:
      2.476282 = weight(author_txt:kleinberg in 3) [ClassicSimilarity], result of:
        2.476282 = fieldWeight in 3, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.25 = fieldNorm(doc=3)
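
The author-match trees above can be checked by hand. The printed values are consistent with the ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), with fieldWeight as the product of tf, idf, and fieldNorm. The sketch below recomputes them in Python; the function names are illustrative and are not Lucene API calls, and because Lucene works in 32-bit floats the comparison should allow a small tolerance.

```python
import math


def classic_tf(freq):
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def field_weight(freq, doc_freq, max_docs, field_norm):
    """fieldWeight = tf * idf * fieldNorm, as in the explain trees."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


if __name__ == "__main__":
    # author_txt:kleinberg, docFreq=5, maxDocs=44218 (values from the listing)
    print(classic_idf(5, 44218))                # listing shows 9.905128
    print(field_weight(1.0, 5, 44218, 0.625))   # doc 7766, listing shows 6.190705
    print(field_weight(1.0, 5, 44218, 0.4375))  # doc 330, listing shows 4.333493
    print(field_weight(1.0, 5, 44218, 0.25))    # doc 3, listing shows 2.476282
```

Since tf and idf are identical for all four hits, the ranking here is decided by fieldNorm alone, which in ClassicSimilarity is smaller for longer fields; that would explain why the record with eight authors scores lowest.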

Similar documents (content)

  1. Lempel, R.; Moran, S.: SALSA: the stochastic approach for link-structure analysis (2001) 0.25
    0.2525643 = sum of:
      0.2525643 = product of:
        0.70156753 = sum of:
          0.03142474 = weight(abstract_txt:broad in 10) [ClassicSimilarity], result of:
            0.03142474 = score(doc=10,freq=1.0), product of:
              0.08701911 = queryWeight, product of:
                5.777993 = idf(docFreq=371, maxDocs=44218)
                0.015060438 = queryNorm
              0.36112458 = fieldWeight in 10, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.777993 = idf(docFreq=371, maxDocs=44218)
                0.0625 = fieldNorm(doc=10)
          0.035850193 = weight(abstract_txt:notion in 10) [ClassicSimilarity], result of:
            0.035850193 = score(doc=10,freq=1.0), product of:
              0.09500822 = queryWeight, product of:
                1.0448965 = boost
                6.037405 = idf(docFreq=286, maxDocs=44218)
                0.015060438 = queryNorm
              0.3773378 = fieldWeight in 10, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.037405 = idf(docFreq=286, maxDocs=44218)
                0.0625 = fieldNorm(doc=10)
          0.010555917 = weight(abstract_txt:based in 10) [ClassicSimilarity], result of:
            0.010555917 = score(doc=10,freq=1.0), product of:
              0.052979458 = queryWeight, product of:
                1.1034724 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.015060438 = queryNorm
              0.19924548 = fieldWeight in 10, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=10)
          0.018522238 = weight(abstract_txt:such in 10) [ClassicSimilarity], result of:
            0.018522238 = score(doc=10,freq=2.0), product of:
              0.061173383 = queryWeight, product of:
                1.1857386 = boost
                3.4255946 = idf(docFreq=3909, maxDocs=44218)
                0.015060438 = queryNorm
              0.30278262 = fieldWeight in 10, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4255946 = idf(docFreq=3909, maxDocs=44218)
                0.0625 = fieldNorm(doc=10)
          0.012010967 = weight(abstract_txt:information in 10) [ClassicSimilarity], result of:
            0.012010967 = score(doc=10,freq=3.0), product of:
              0.045830246 = queryWeight, product of:
                1.256983 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.015060438 = queryNorm
              0.26207513 = fieldWeight in 10, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=10)
          0.04670815 = weight(abstract_txt:structure in 10) [ClassicSimilarity], result of:
            0.04670815 = score(doc=10,freq=3.0), product of:
              0.09900677 = queryWeight, product of:
                1.508482 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.015060438 = queryNorm
              0.47176725 = fieldWeight in 10, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.0625 = fieldNorm(doc=10)
          0.114778794 = weight(abstract_txt:pages in 10) [ClassicSimilarity], result of:
            0.114778794 = score(doc=10,freq=4.0), product of:
              0.16380656 = queryWeight, product of:
                1.9403199 = boost
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.015060438 = queryNorm
              0.7006972 = fieldWeight in 10, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.0625 = fieldNorm(doc=10)
          0.18935396 = weight(abstract_txt:authoritative in 10) [ClassicSimilarity], result of:
            0.18935396 = score(doc=10,freq=1.0), product of:
              0.41558212 = queryWeight, product of:
                3.7851381 = boost
                7.290168 = idf(docFreq=81, maxDocs=44218)
                0.015060438 = queryNorm
              0.4556355 = fieldWeight in 10, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.290168 = idf(docFreq=81, maxDocs=44218)
                0.0625 = fieldNorm(doc=10)
          0.24236256 = weight(abstract_txt:link in 10) [ClassicSimilarity], result of:
            0.24236256 = score(doc=10,freq=4.0), product of:
              0.33968565 = queryWeight, product of:
                3.9514935 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.015060438 = queryNorm
              0.7134907 = fieldWeight in 10, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.0625 = fieldNorm(doc=10)
        0.36 = coord(9/25)
  2. Menczer, F.: Lexical and semantic clustering by Web links (2004) 0.19
    0.19439915 = sum of:
      0.19439915 = product of:
        0.69428265 = sum of:
          0.05782642 = weight(abstract_txt:graph in 3090) [ClassicSimilarity], result of:
            0.05782642 = score(doc=3090,freq=1.0), product of:
              0.11261019 = queryWeight, product of:
                1.137579 = boost
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.015060438 = queryNorm
              0.51350963 = fieldWeight in 3090, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.078125 = fieldNorm(doc=3090)
          0.023152797 = weight(abstract_txt:such in 3090) [ClassicSimilarity], result of:
            0.023152797 = score(doc=3090,freq=2.0), product of:
              0.061173383 = queryWeight, product of:
                1.1857386 = boost
                3.4255946 = idf(docFreq=3909, maxDocs=44218)
                0.015060438 = queryNorm
              0.3784783 = fieldWeight in 3090, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4255946 = idf(docFreq=3909, maxDocs=44218)
                0.078125 = fieldNorm(doc=3090)
          0.008668169 = weight(abstract_txt:information in 3090) [ClassicSimilarity], result of:
            0.008668169 = score(doc=3090,freq=1.0), product of:
              0.045830246 = queryWeight, product of:
                1.256983 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.015060438 = queryNorm
              0.18913643 = fieldWeight in 3090, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.078125 = fieldNorm(doc=3090)
          0.033708706 = weight(abstract_txt:structure in 3090) [ClassicSimilarity], result of:
            0.033708706 = score(doc=3090,freq=1.0), product of:
              0.09900677 = queryWeight, product of:
                1.508482 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.015060438 = queryNorm
              0.3404687 = fieldWeight in 3090, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.078125 = fieldNorm(doc=3090)
          0.124251686 = weight(abstract_txt:pages in 3090) [ClassicSimilarity], result of:
            0.124251686 = score(doc=3090,freq=3.0), product of:
              0.16380656 = queryWeight, product of:
                1.9403199 = boost
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.015060438 = queryNorm
              0.7585269 = fieldWeight in 3090, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.078125 = fieldNorm(doc=3090)
          0.10796288 = weight(abstract_txt:connections in 3090) [ClassicSimilarity], result of:
            0.10796288 = score(doc=3090,freq=1.0), product of:
              0.21512282 = queryWeight, product of:
                2.2235706 = boost
                6.4238877 = idf(docFreq=194, maxDocs=44218)
                0.015060438 = queryNorm
              0.5018662 = fieldWeight in 3090, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.4238877 = idf(docFreq=194, maxDocs=44218)
                0.078125 = fieldNorm(doc=3090)
          0.33871198 = weight(abstract_txt:link in 3090) [ClassicSimilarity], result of:
            0.33871198 = score(doc=3090,freq=5.0), product of:
              0.33968565 = queryWeight, product of:
                3.9514935 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.015060438 = queryNorm
              0.9971336 = fieldWeight in 3090, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.078125 = fieldNorm(doc=3090)
        0.28 = coord(7/25)
  3. Rauter, J.: Die Bündelung von Kleinbergs authorities und hubs in van Rijsbergens Effektivitätsmaß (2006) 0.16
    0.16142173 = sum of:
      0.16142173 = product of:
        1.0088859 = sum of:
          0.069663234 = weight(abstract_txt:sources in 75) [ClassicSimilarity], result of:
            0.069663234 = score(doc=75,freq=1.0), product of:
              0.117424645 = queryWeight, product of:
                1.6428099 = boost
                4.7460723 = idf(docFreq=1043, maxDocs=44218)
                0.015060438 = queryNorm
              0.59325904 = fieldWeight in 75, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7460723 = idf(docFreq=1043, maxDocs=44218)
                0.125 = fieldNorm(doc=75)
          0.09736356 = weight(abstract_txt:environment in 75) [ClassicSimilarity], result of:
            0.09736356 = score(doc=75,freq=1.0), product of:
              0.16802925 = queryWeight, product of:
                2.406832 = boost
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.015060438 = queryNorm
              0.5794441 = fieldWeight in 75, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.125 = fieldNorm(doc=75)
          0.46315122 = weight(abstract_txt:hyperlinked in 75) [ClassicSimilarity], result of:
            0.46315122 = score(doc=75,freq=1.0), product of:
              0.4151822 = queryWeight, product of:
                3.089065 = boost
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.015060438 = queryNorm
              1.1155373 = fieldWeight in 75, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.125 = fieldNorm(doc=75)
          0.37870792 = weight(abstract_txt:authoritative in 75) [ClassicSimilarity], result of:
            0.37870792 = score(doc=75,freq=1.0), product of:
              0.41558212 = queryWeight, product of:
                3.7851381 = boost
                7.290168 = idf(docFreq=81, maxDocs=44218)
                0.015060438 = queryNorm
              0.911271 = fieldWeight in 75, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.290168 = idf(docFreq=81, maxDocs=44218)
                0.125 = fieldNorm(doc=75)
        0.16 = coord(4/25)
  4. Krause, J.: Shell Model, Semantic Web and Web Information Retrieval (2006) 0.12
    0.11528475 = sum of:
      0.11528475 = product of:
        0.41173124 = sum of:
          0.010555917 = weight(abstract_txt:based in 6061) [ClassicSimilarity], result of:
            0.010555917 = score(doc=6061,freq=1.0), product of:
              0.052979458 = queryWeight, product of:
                1.1034724 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.015060438 = queryNorm
              0.19924548 = fieldWeight in 6061, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=6061)
          0.013097201 = weight(abstract_txt:such in 6061) [ClassicSimilarity], result of:
            0.013097201 = score(doc=6061,freq=1.0), product of:
              0.061173383 = queryWeight, product of:
                1.1857386 = boost
                3.4255946 = idf(docFreq=3909, maxDocs=44218)
                0.015060438 = queryNorm
              0.21409966 = fieldWeight in 6061, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4255946 = idf(docFreq=3909, maxDocs=44218)
                0.0625 = fieldNorm(doc=6061)
          0.016986074 = weight(abstract_txt:information in 6061) [ClassicSimilarity], result of:
            0.016986074 = score(doc=6061,freq=6.0), product of:
              0.045830246 = queryWeight, product of:
                1.256983 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.015060438 = queryNorm
              0.3706302 = fieldWeight in 6061, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=6061)
          0.026966965 = weight(abstract_txt:structure in 6061) [ClassicSimilarity], result of:
            0.026966965 = score(doc=6061,freq=1.0), product of:
              0.09900677 = queryWeight, product of:
                1.508482 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.015060438 = queryNorm
              0.27237496 = fieldWeight in 6061, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.0625 = fieldNorm(doc=6061)
          0.034831617 = weight(abstract_txt:sources in 6061) [ClassicSimilarity], result of:
            0.034831617 = score(doc=6061,freq=1.0), product of:
              0.117424645 = queryWeight, product of:
                1.6428099 = boost
                4.7460723 = idf(docFreq=1043, maxDocs=44218)
                0.015060438 = queryNorm
              0.29662952 = fieldWeight in 6061, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7460723 = idf(docFreq=1043, maxDocs=44218)
                0.0625 = fieldNorm(doc=6061)
          0.09940135 = weight(abstract_txt:pages in 6061) [ClassicSimilarity], result of:
            0.09940135 = score(doc=6061,freq=3.0), product of:
              0.16380656 = queryWeight, product of:
                1.9403199 = boost
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.015060438 = queryNorm
              0.60682154 = fieldWeight in 6061, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.0625 = fieldNorm(doc=6061)
          0.20989214 = weight(abstract_txt:link in 6061) [ClassicSimilarity], result of:
            0.20989214 = score(doc=6061,freq=3.0), product of:
              0.33968565 = queryWeight, product of:
                3.9514935 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.015060438 = queryNorm
              0.6179011 = fieldWeight in 6061, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.0625 = fieldNorm(doc=6061)
        0.28 = coord(7/25)
  5. Yang, P.; Gao, W.; Tan, Q.; Wong, K.-F.: A link-bridged topic model for cross-domain document classification (2013) 0.11
    0.10807189 = sum of:
      0.10807189 = product of:
        0.45029956 = sum of:
          0.010555917 = weight(abstract_txt:based in 2706) [ClassicSimilarity], result of:
            0.010555917 = score(doc=2706,freq=1.0), product of:
              0.052979458 = queryWeight, product of:
                1.1034724 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.015060438 = queryNorm
              0.19924548 = fieldWeight in 2706, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=2706)
          0.04626113 = weight(abstract_txt:graph in 2706) [ClassicSimilarity], result of:
            0.04626113 = score(doc=2706,freq=1.0), product of:
              0.11261019 = queryWeight, product of:
                1.137579 = boost
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.015060438 = queryNorm
              0.4108077 = fieldWeight in 2706, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.0625 = fieldNorm(doc=2706)
          0.009806914 = weight(abstract_txt:information in 2706) [ClassicSimilarity], result of:
            0.009806914 = score(doc=2706,freq=2.0), product of:
              0.045830246 = queryWeight, product of:
                1.256983 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.015060438 = queryNorm
              0.21398345 = fieldWeight in 2706, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=2706)
          0.026966965 = weight(abstract_txt:structure in 2706) [ClassicSimilarity], result of:
            0.026966965 = score(doc=2706,freq=1.0), product of:
              0.09900677 = queryWeight, product of:
                1.508482 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.015060438 = queryNorm
              0.27237496 = fieldWeight in 2706, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.0625 = fieldNorm(doc=2706)
          0.08573905 = weight(abstract_txt:topics in 2706) [ClassicSimilarity], result of:
            0.08573905 = score(doc=2706,freq=4.0), product of:
              0.13485776 = queryWeight, product of:
                1.760539 = boost
                5.086191 = idf(docFreq=742, maxDocs=44218)
                0.015060438 = queryNorm
              0.6357739 = fieldWeight in 2706, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.086191 = idf(docFreq=742, maxDocs=44218)
                0.0625 = fieldNorm(doc=2706)
          0.2709696 = weight(abstract_txt:link in 2706) [ClassicSimilarity], result of:
            0.2709696 = score(doc=2706,freq=5.0), product of:
              0.33968565 = queryWeight, product of:
                3.9514935 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.015060438 = queryNorm
              0.7977069 = fieldWeight in 2706, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.0625 = fieldNorm(doc=2706)
        0.24 = coord(6/25)
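
The content-match trees above combine three pieces per term: queryWeight (boost * idf * queryNorm), fieldWeight (tf * idf * fieldNorm), and a final coord factor (matched terms / query terms). As a rough check, the sketch below recomputes the total for the first hit (doc 10) from the frequencies, document frequencies, and boosts printed in the listing. The helper names are illustrative, not Lucene API calls, and "broad" is given a boost of 1.0 on the assumption that the missing boost line means no boost was applied.

```python
import math

QUERY_NORM = 0.015060438  # queryNorm as printed in the listing
MAX_DOCS = 44218          # maxDocs as printed in the listing


def idf(doc_freq, max_docs=MAX_DOCS):
    """ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_score(freq, doc_freq, boost, field_norm, query_norm=QUERY_NORM):
    """Score of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, total_query_terms):
    """Sum the per-term scores, then scale by coord = matched / total terms."""
    raw = sum(term_score(freq, doc_freq, boost, field_norm)
              for freq, doc_freq, boost in terms.values())
    coord = len(terms) / total_query_terms
    return raw * coord


# (freq, docFreq, boost) for the nine abstract_txt terms matched in doc 10.
DOC_10_TERMS = {
    "broad": (1.0, 371, 1.0),
    "notion": (1.0, 286, 1.0448965),
    "based": (1.0, 4958, 1.1034724),
    "such": (2.0, 3909, 1.1857386),
    "information": (3.0, 10677, 1.256983),
    "structure": (3.0, 1538, 1.508482),
    "pages": (4.0, 441, 1.9403199),
    "authoritative": (1.0, 81, 3.7851381),
    "link": (4.0, 398, 3.9514935),
}

if __name__ == "__main__":
    # The listing shows 0.24236256 for "link" and 0.2525643 for the document.
    print(term_score(*DOC_10_TERMS["link"], field_norm=0.0625))
    print(document_score(DOC_10_TERMS, field_norm=0.0625, total_query_terms=25))
```

The coord factor explains why the third hit ranks below the first two despite having the largest raw sum (1.0088859): only 4 of the 25 query terms match, so the sum is multiplied by 0.16.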