Document (#33345)

Author
Li, Q.
Wu, Y.-f.B.
Title
People search : searching people sharing similar interests from the Web
Source
Journal of the American Society for Information Science and Technology. 59(2008) no.1, pp.111-125
Year
2008
Abstract
On the Web, there are limited ways of finding people who share interests with a given person, and the current methods are either ineffective or time-consuming. In this paper, we present a new approach for searching the Web for people with similar interests. Finding people similar to a given person raises two major research issues: person representation and person matching. We propose a person representation method that uses a person's website to represent that person. Our matching process is designed around this representation, so that the same representation can be used when composing the query. Under this method, the proposed algorithm integrates the textual content and hyperlink information of all pages belonging to a personal website to represent a person and to match persons. Other algorithms are also explored and compared with the proposed algorithm. Experimental results are presented.

Similar documents (content)

  1. Dumitrescu, A.; Santini, S.: Full coverage of a reader's interests in context-based information filtering (2021) 0.22
    0.22235344 = sum of:
      0.22235344 = product of:
        0.9264727 = sum of:
          0.0782964 = weight(abstract_txt:person's in 1328) [ClassicSimilarity], result of:
            0.0782964 = score(doc=1328,freq=1.0), product of:
              0.14768392 = queryWeight, product of:
                1.2084607 = boost
                8.482592 = idf(docFreq=24, maxDocs=44421)
                0.014406952 = queryNorm
              0.530162 = fieldWeight in 1328, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.482592 = idf(docFreq=24, maxDocs=44421)
                0.0625 = fieldNorm(doc=1328)
          0.02659406 = weight(abstract_txt:given in 1328) [ClassicSimilarity], result of:
            0.02659406 = score(doc=1328,freq=1.0), product of:
              0.090581276 = queryWeight, product of:
                1.3384438 = boost
                4.6974936 = idf(docFreq=1100, maxDocs=44421)
                0.014406952 = queryNorm
              0.29359335 = fieldWeight in 1328, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6974936 = idf(docFreq=1100, maxDocs=44421)
                0.0625 = fieldNorm(doc=1328)
          0.047638334 = weight(abstract_txt:algorithm in 1328) [ClassicSimilarity], result of:
            0.047638334 = score(doc=1328,freq=1.0), product of:
              0.13360408 = queryWeight, product of:
                1.6255143 = boost
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.014406952 = queryNorm
              0.35656348 = fieldWeight in 1328, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.0625 = fieldNorm(doc=1328)
          0.1741778 = weight(abstract_txt:interests in 1328) [ClassicSimilarity], result of:
            0.1741778 = score(doc=1328,freq=3.0), product of:
              0.2516714 = queryWeight, product of:
                2.7323952 = boost
                6.3932 = idf(docFreq=201, maxDocs=44421)
                0.014406952 = queryNorm
              0.6920842 = fieldWeight in 1328, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.3932 = idf(docFreq=201, maxDocs=44421)
                0.0625 = fieldNorm(doc=1328)
          0.07667189 = weight(abstract_txt:representation in 1328) [ClassicSimilarity], result of:
            0.07667189 = score(doc=1328,freq=1.0), product of:
              0.24903065 = queryWeight, product of:
                3.5089512 = boost
                4.9261017 = idf(docFreq=875, maxDocs=44421)
                0.014406952 = queryNorm
              0.30788136 = fieldWeight in 1328, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9261017 = idf(docFreq=875, maxDocs=44421)
                0.0625 = fieldNorm(doc=1328)
          0.52309424 = weight(abstract_txt:person in 1328) [ClassicSimilarity], result of:
            0.52309424 = score(doc=1328,freq=4.0), product of:
              0.6600375 = queryWeight, product of:
                7.2259545 = boost
                6.3401756 = idf(docFreq=212, maxDocs=44421)
                0.014406952 = queryNorm
              0.79252195 = fieldWeight in 1328, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.3401756 = idf(docFreq=212, maxDocs=44421)
                0.0625 = fieldNorm(doc=1328)
        0.24 = coord(6/25)
    
  2. Brown, S.A.; Dennis, A.R.; Burley, D.; Arling, P.: Knowledge sharing and knowledge management system avoidance : the role of knowledge type and the social network in bypassing an organizational knowledge management system (2013) 0.17
    0.1717778 = sum of:
      0.1717778 = product of:
        1.0736113 = sum of:
          0.022007314 = weight(abstract_txt:there in 2099) [ClassicSimilarity], result of:
            0.022007314 = score(doc=2099,freq=1.0), product of:
              0.06880501 = queryWeight, product of:
                1.1665167 = boost
                4.094086 = idf(docFreq=2012, maxDocs=44421)
                0.014406952 = queryNorm
              0.31985047 = fieldWeight in 2099, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.094086 = idf(docFreq=2012, maxDocs=44421)
                0.078125 = fieldNorm(doc=2099)
          0.012637321 = weight(abstract_txt:this in 2099) [ClassicSimilarity], result of:
            0.012637321 = score(doc=2099,freq=2.0), product of:
              0.04753484 = queryWeight, product of:
                1.3712035 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.014406952 = queryNorm
              0.26585388 = fieldWeight in 2099, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.078125 = fieldNorm(doc=2099)
          0.114257984 = weight(abstract_txt:sharing in 2099) [ClassicSimilarity], result of:
            0.114257984 = score(doc=2099,freq=2.0), product of:
              0.18743621 = queryWeight, product of:
                2.3580515 = boost
                5.5173187 = idf(docFreq=484, maxDocs=44421)
                0.014406952 = queryNorm
              0.6095833 = fieldWeight in 2099, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5173187 = idf(docFreq=484, maxDocs=44421)
                0.078125 = fieldNorm(doc=2099)
          0.92470866 = weight(abstract_txt:person in 2099) [ClassicSimilarity], result of:
            0.92470866 = score(doc=2099,freq=8.0), product of:
              0.6600375 = queryWeight, product of:
                7.2259545 = boost
                6.3401756 = idf(docFreq=212, maxDocs=44421)
                0.014406952 = queryNorm
              1.4009941 = fieldWeight in 2099, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                6.3401756 = idf(docFreq=212, maxDocs=44421)
                0.078125 = fieldNorm(doc=2099)
        0.16 = coord(4/25)
    
  3. Zhou, Q.; Lee, C.S.; Sin, S.-C.J.; Lin, S.; Hu, H.; Ismail, M.F.F. Bin: Understanding the use of YouTube as a learning resource : a social cognitive perspective (2020) 0.14
    0.14476696 = sum of:
      0.14476696 = product of:
        0.60319567 = sum of:
          0.011434991 = weight(abstract_txt:from in 1175) [ClassicSimilarity], result of:
            0.011434991 = score(doc=1175,freq=2.0), product of:
              0.04688418 = queryWeight, product of:
                1.1793418 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.014406952 = queryNorm
              0.2438987 = fieldWeight in 1175, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0625 = fieldNorm(doc=1175)
          0.023360277 = weight(abstract_txt:method in 1175) [ClassicSimilarity], result of:
            0.023360277 = score(doc=1175,freq=1.0), product of:
              0.0830808 = queryWeight, product of:
                1.2818325 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.014406952 = queryNorm
              0.2811754 = fieldWeight in 1175, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.0625 = fieldNorm(doc=1175)
          0.025103908 = weight(abstract_txt:proposed in 1175) [ClassicSimilarity], result of:
            0.025103908 = score(doc=1175,freq=1.0), product of:
              0.087165155 = queryWeight, product of:
                1.3129627 = boost
                4.608063 = idf(docFreq=1203, maxDocs=44421)
                0.014406952 = queryNorm
              0.28800395 = fieldWeight in 1175, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.608063 = idf(docFreq=1203, maxDocs=44421)
                0.0625 = fieldNorm(doc=1175)
          0.014297497 = weight(abstract_txt:this in 1175) [ClassicSimilarity], result of:
            0.014297497 = score(doc=1175,freq=4.0), product of:
              0.04753484 = queryWeight, product of:
                1.3712035 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.014406952 = queryNorm
              0.30077934 = fieldWeight in 1175, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0625 = fieldNorm(doc=1175)
          0.07598609 = weight(abstract_txt:people in 1175) [ClassicSimilarity], result of:
            0.07598609 = score(doc=1175,freq=1.0), product of:
              0.24754342 = queryWeight, product of:
                3.4984577 = boost
                4.9113703 = idf(docFreq=888, maxDocs=44421)
                0.014406952 = queryNorm
              0.30696064 = fieldWeight in 1175, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9113703 = idf(docFreq=888, maxDocs=44421)
                0.0625 = fieldNorm(doc=1175)
          0.45301288 = weight(abstract_txt:person in 1175) [ClassicSimilarity], result of:
            0.45301288 = score(doc=1175,freq=3.0), product of:
              0.6600375 = queryWeight, product of:
                7.2259545 = boost
                6.3401756 = idf(docFreq=212, maxDocs=44421)
                0.014406952 = queryNorm
              0.68634415 = fieldWeight in 1175, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.3401756 = idf(docFreq=212, maxDocs=44421)
                0.0625 = fieldNorm(doc=1175)
        0.24 = coord(6/25)
    
  4. Lihui, C.; Lian, C.W.: Using Web structure and summarisation techniques for Web content mining (2005) 0.14
    0.13509531 = sum of:
      0.13509531 = product of:
        0.42217284 = sum of:
          0.049062934 = weight(abstract_txt:consuming in 2046) [ClassicSimilarity], result of:
            0.049062934 = score(doc=2046,freq=1.0), product of:
              0.10814531 = queryWeight, product of:
                1.0341172 = boost
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.014406952 = queryNorm
              0.45367602 = fieldWeight in 2046, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.0625 = fieldNorm(doc=2046)
          0.043481246 = weight(abstract_txt:proposed in 2046) [ClassicSimilarity], result of:
            0.043481246 = score(doc=2046,freq=3.0), product of:
              0.087165155 = queryWeight, product of:
                1.3129627 = boost
                4.608063 = idf(docFreq=1203, maxDocs=44421)
                0.014406952 = queryNorm
              0.49883747 = fieldWeight in 2046, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.608063 = idf(docFreq=1203, maxDocs=44421)
                0.0625 = fieldNorm(doc=2046)
          0.02659406 = weight(abstract_txt:given in 2046) [ClassicSimilarity], result of:
            0.02659406 = score(doc=2046,freq=1.0), product of:
              0.090581276 = queryWeight, product of:
                1.3384438 = boost
                4.6974936 = idf(docFreq=1100, maxDocs=44421)
                0.014406952 = queryNorm
              0.29359335 = fieldWeight in 2046, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6974936 = idf(docFreq=1100, maxDocs=44421)
                0.0625 = fieldNorm(doc=2046)
          0.0071487487 = weight(abstract_txt:this in 2046) [ClassicSimilarity], result of:
            0.0071487487 = score(doc=2046,freq=1.0), product of:
              0.04753484 = queryWeight, product of:
                1.3712035 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.014406952 = queryNorm
              0.15038967 = fieldWeight in 2046, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0625 = fieldNorm(doc=2046)
          0.043041136 = weight(abstract_txt:represent in 2046) [ClassicSimilarity], result of:
            0.043041136 = score(doc=2046,freq=1.0), product of:
              0.12486417 = queryWeight, product of:
                1.5714474 = boost
                5.515259 = idf(docFreq=485, maxDocs=44421)
                0.014406952 = queryNorm
              0.34470367 = fieldWeight in 2046, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.515259 = idf(docFreq=485, maxDocs=44421)
                0.0625 = fieldNorm(doc=2046)
          0.047638334 = weight(abstract_txt:algorithm in 2046) [ClassicSimilarity], result of:
            0.047638334 = score(doc=2046,freq=1.0), product of:
              0.13360408 = queryWeight, product of:
                1.6255143 = boost
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.014406952 = queryNorm
              0.35656348 = fieldWeight in 2046, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.0625 = fieldNorm(doc=2046)
          0.07240676 = weight(abstract_txt:similar in 2046) [ClassicSimilarity], result of:
            0.07240676 = score(doc=2046,freq=1.0), product of:
              0.22252463 = queryWeight, product of:
                2.9667773 = boost
                5.206202 = idf(docFreq=661, maxDocs=44421)
                0.014406952 = queryNorm
              0.32538763 = fieldWeight in 2046, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.206202 = idf(docFreq=661, maxDocs=44421)
                0.0625 = fieldNorm(doc=2046)
          0.13279961 = weight(abstract_txt:representation in 2046) [ClassicSimilarity], result of:
            0.13279961 = score(doc=2046,freq=3.0), product of:
              0.24903065 = queryWeight, product of:
                3.5089512 = boost
                4.9261017 = idf(docFreq=875, maxDocs=44421)
                0.014406952 = queryNorm
              0.5332661 = fieldWeight in 2046, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.9261017 = idf(docFreq=875, maxDocs=44421)
                0.0625 = fieldNorm(doc=2046)
        0.32 = coord(8/25)
    
  5. Elsweiler, D.; Harvey, M.: Engaging and maintaining a sense of being informed : understanding the tasks motivating twitter search (2015) 0.13
    0.12714252 = sum of:
      0.12714252 = product of:
        0.5297605 = sum of:
          0.014004946 = weight(abstract_txt:from in 2635) [ClassicSimilarity], result of:
            0.014004946 = score(doc=2635,freq=3.0), product of:
              0.04688418 = queryWeight, product of:
                1.1793418 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.014406952 = queryNorm
              0.29871368 = fieldWeight in 2635, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0625 = fieldNorm(doc=2635)
          0.0071487487 = weight(abstract_txt:this in 2635) [ClassicSimilarity], result of:
            0.0071487487 = score(doc=2635,freq=1.0), product of:
              0.04753484 = queryWeight, product of:
                1.3712035 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.014406952 = queryNorm
              0.15038967 = fieldWeight in 2635, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0625 = fieldNorm(doc=2635)
          0.043041136 = weight(abstract_txt:represent in 2635) [ClassicSimilarity], result of:
            0.043041136 = score(doc=2635,freq=1.0), product of:
              0.12486417 = queryWeight, product of:
                1.5714474 = boost
                5.515259 = idf(docFreq=485, maxDocs=44421)
                0.014406952 = queryNorm
              0.34470367 = fieldWeight in 2635, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.515259 = idf(docFreq=485, maxDocs=44421)
                0.0625 = fieldNorm(doc=2635)
          0.07240676 = weight(abstract_txt:similar in 2635) [ClassicSimilarity], result of:
            0.07240676 = score(doc=2635,freq=1.0), product of:
              0.22252463 = queryWeight, product of:
                2.9667773 = boost
                5.206202 = idf(docFreq=661, maxDocs=44421)
                0.014406952 = queryNorm
              0.32538763 = fieldWeight in 2635, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.206202 = idf(docFreq=661, maxDocs=44421)
                0.0625 = fieldNorm(doc=2635)
          0.13161176 = weight(abstract_txt:people in 2635) [ClassicSimilarity], result of:
            0.13161176 = score(doc=2635,freq=3.0), product of:
              0.24754342 = queryWeight, product of:
                3.4984577 = boost
                4.9113703 = idf(docFreq=888, maxDocs=44421)
                0.014406952 = queryNorm
              0.5316714 = fieldWeight in 2635, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.9113703 = idf(docFreq=888, maxDocs=44421)
                0.0625 = fieldNorm(doc=2635)
          0.26154712 = weight(abstract_txt:person in 2635) [ClassicSimilarity], result of:
            0.26154712 = score(doc=2635,freq=1.0), product of:
              0.6600375 = queryWeight, product of:
                7.2259545 = boost
                6.3401756 = idf(docFreq=212, maxDocs=44421)
                0.014406952 = queryNorm
              0.39626098 = fieldWeight in 2635, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3401756 = idf(docFreq=212, maxDocs=44421)
                0.0625 = fieldNorm(doc=2635)
        0.24 = coord(6/25)
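
The score trees above all follow the same arithmetic: each matching term contributes queryWeight (boost × idf × queryNorm) times fieldWeight (tf × idf × fieldNorm), and the sum over terms is scaled by the coord factor. The following is a minimal sketch that recomputes the total for result 2 (doc 2099) from the values in its listing. The function and variable names are illustrative assumptions, not part of any Lucene API; the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are inferred from the listing and agree with the numbers printed there.

```python
import math

# Constants copied from the listing above.
MAX_DOCS = 44421
QUERY_NORM = 0.014406952


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as it appears in the listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq, field_norm):
    """Score of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, matched, total):
    """Sum of term scores scaled by coord = matched / total query terms."""
    coord = matched / total
    return coord * sum(
        term_score(boost, doc_freq, freq, field_norm)
        for boost, doc_freq, freq in terms
    )


# (boost, docFreq, termFreq) for the four terms matched in doc 2099:
# "there", "this", "sharing", "person".
RESULT_2_TERMS = [
    (1.1665167, 2012, 1.0),
    (1.3712035, 10885, 2.0),
    (2.3580515, 484, 2.0),
    (7.2259545, 212, 8.0),
]

score = document_score(RESULT_2_TERMS, field_norm=0.078125, matched=4, total=25)
print(round(score, 4))  # about 0.1718; the listing reports 0.1717778
```

The sketch also makes the ranking easier to read: the term "person" dominates every result because it carries the largest boost (7.2259545), and coord penalises documents that match only a few of the 25 query terms.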