Document (#34454)

Author
Kang, I.-S.
Na, S.-H.
Lee, S.
Jung, H.
Kim, P.
Sung, W.-K.
Lee, J.-H.
Title
On co-authorship for author disambiguation
Source
Information processing and management. 45(2009) no.1, pp.84-97
Year
2009
Abstract
Author name disambiguation deals with clustering the same-name authors into different individuals. To attack the problem, many studies have employed a variety of disambiguation features such as coauthors, titles of papers/publications, topics of articles, emails/affiliations, etc. Among these, co-authorship is the most easily accessible and influential, since inter-person acquaintances represented by co-authorship could discriminate the identities of authors more clearly than other features. This study attempts to explore the net effects of co-authorship on author clustering in bibliographic data. First, to handle the shortage of explicit coauthors listed in known citations, a web-assisted technique of acquiring implicit coauthors of the target author to be disambiguated is proposed. Then, a coauthor disambiguation hypothesis that the identity of an author can be determined by his/her coauthors is examined and confirmed through a variety of author disambiguation experiments.
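The coauthor disambiguation hypothesis in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes that each citation of one ambiguous name comes with a set of coauthor names (the target author excluded), and it merges citations that share at least one coauthor, using single-link merging over a union-find structure. The function name `cluster_by_coauthors` and the toy data are hypothetical.

```python
from itertools import combinations


def cluster_by_coauthors(citations):
    """Group citation ids of one ambiguous name by shared coauthors.

    citations: dict mapping a citation id to the set of its coauthor names
    (assumption: the ambiguous target author is not in the set).
    Returns a sorted list of sorted id lists, one list per inferred person.
    """
    parent = {cid: cid for cid in citations}

    def find(x):
        # Follow parent links to the cluster root, halving paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge any two citations that have at least one coauthor in common.
    for a, b in combinations(citations, 2):
        if citations[a] & citations[b]:
            parent[find(a)] = find(b)

    clusters = {}
    for cid in citations:
        clusters.setdefault(find(cid), []).append(cid)
    return sorted(sorted(ids) for ids in clusters.values())


# Hypothetical toy data: four citations that all carry the same author name.
toy = {
    "paper1": {"coauthor A", "coauthor B"},
    "paper2": {"coauthor B", "coauthor C"},
    "paper3": {"coauthor D"},
    "paper4": {"coauthor C"},
}
print(cluster_by_coauthors(toy))
# [['paper1', 'paper2', 'paper4'], ['paper3']]
```

In this sketch paper3 stays alone because none of its coauthors appear elsewhere, which is the "shortage of explicit coauthors" situation the abstract describes; the web-assisted acquisition of implicit coauthors is not modelled here.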

Similar documents (author)

  1. Jung, R.: ¬Die Reform der alphabetischen Katalogisierung in Deutschland 1908-1976 : eine annotierte Auswahlbibliographie (1976) 2.11
    2.1118848 = sum of:
      2.1118848 = product of:
        4.2237697 = sum of:
          4.2237697 = weight(author_txt:jung in 5322) [ClassicSimilarity], result of:
            4.2237697 = score(doc=5322,freq=1.0), product of:
              0.7457212 = queryWeight, product of:
                1.0579545 = boost
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.07777961 = queryNorm
              5.664006 = fieldWeight in 5322, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.625 = fieldNorm(doc=5322)
        0.5 = coord(1/2)
    
  2. Jung, V.: Wissen, das produktiv wird : Mit Wissensmanagement zum lernenden Unternehmen (2000) 2.11
    2.1118848 = sum of:
      2.1118848 = product of:
        4.2237697 = sum of:
          4.2237697 = weight(author_txt:jung in 6057) [ClassicSimilarity], result of:
            4.2237697 = score(doc=6057,freq=1.0), product of:
              0.7457212 = queryWeight, product of:
                1.0579545 = boost
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.07777961 = queryNorm
              5.664006 = fieldWeight in 6057, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.625 = fieldNorm(doc=6057)
        0.5 = coord(1/2)
    
  3. Jung, R.: Bibliographie der Festschriften und Festschriftenbeiträge zum Buch und Bibliothekswesen : Deutschland, Österreich, Schweiz 1976-2000 (2002) 2.11
    2.1118848 = sum of:
      2.1118848 = product of:
        4.2237697 = sum of:
          4.2237697 = weight(author_txt:jung in 2089) [ClassicSimilarity], result of:
            4.2237697 = score(doc=2089,freq=1.0), product of:
              0.7457212 = queryWeight, product of:
                1.0579545 = boost
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.07777961 = queryNorm
              5.664006 = fieldWeight in 2089, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.625 = fieldNorm(doc=2089)
        0.5 = coord(1/2)
    
  4. Jung, R.: Methodik und Didaktik einer Einführung in die RAK nach vorausgegangenem Unterricht der Titelaufnahme nach den "Preußischen Instruktionen" (1976) 2.11
    2.1118848 = sum of:
      2.1118848 = product of:
        4.2237697 = sum of:
          4.2237697 = weight(author_txt:jung in 2803) [ClassicSimilarity], result of:
            4.2237697 = score(doc=2803,freq=1.0), product of:
              0.7457212 = queryWeight, product of:
                1.0579545 = boost
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.07777961 = queryNorm
              5.664006 = fieldWeight in 2803, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.625 = fieldNorm(doc=2803)
        0.5 = coord(1/2)
    
  5. Jung, J.J.: Contextualized query sampling to discover semantic resource descriptions on the web (2009) 2.11
    2.1118848 = sum of:
      2.1118848 = product of:
        4.2237697 = sum of:
          4.2237697 = weight(author_txt:jung in 216) [ClassicSimilarity], result of:
            4.2237697 = score(doc=216,freq=1.0), product of:
              0.7457212 = queryWeight, product of:
                1.0579545 = boost
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.07777961 = queryNorm
              5.664006 = fieldWeight in 216, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.625 = fieldNorm(doc=216)
        0.5 = coord(1/2)
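
The score trees above all follow the same pattern, which can be recomputed with a short sketch. The formulas are inferred from the listed numbers and agree with classic TF-IDF scoring: idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm. The function names are hypothetical, and boost, queryNorm and fieldNorm are simply read off the listing rather than derived.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Inverse document frequency as it appears in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one query term in one document (before coord is applied)."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm          # query side
    field_weight = math.sqrt(freq) * idf * field_norm  # document side
    return query_weight * field_weight


# Values copied from the entries above (author_txt:jung, freq=1.0).
raw = term_score(
    freq=1.0,
    doc_freq=13,
    max_docs=44421,
    boost=1.0579545,
    query_norm=0.07777961,
    field_norm=0.625,
)
final = raw * (1 / 2)  # coord(1/2): one of two query clauses matched
print(raw, final)
```

Up to floating-point rounding, `raw` reproduces the listed 4.2237697 and `final` the listed 2.1118848. All five entries share these inputs, which is why they receive the same score.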
    

Similar documents (content)

  1. Kim, J.; Diesner, J.: Distortive effects of initial-based name disambiguation on measurements of large-scale coauthorship networks (2016) 0.33
    0.33233142 = sum of:
      0.33233142 = product of:
        1.0385357 = sum of:
          0.059367184 = weight(abstract_txt:identities in 3936) [ClassicSimilarity], result of:
            0.059367184 = score(doc=3936,freq=1.0), product of:
              0.11740549 = queryWeight, product of:
                1.2067783 = boost
                8.090549 = idf(docFreq=36, maxDocs=44421)
                0.01202494 = queryNorm
              0.50565934 = fieldWeight in 3936, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.090549 = idf(docFreq=36, maxDocs=44421)
                0.0625 = fieldNorm(doc=3936)
          0.081543006 = weight(abstract_txt:coauthor in 3936) [ClassicSimilarity], result of:
            0.081543006 = score(doc=3936,freq=1.0), product of:
              0.14507145 = queryWeight, product of:
                1.3414493 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.01202494 = queryNorm
              0.5620886 = fieldWeight in 3936, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.0625 = fieldNorm(doc=3936)
          0.11531922 = weight(abstract_txt:disambiguated in 3936) [ClassicSimilarity], result of:
            0.11531922 = score(doc=3936,freq=2.0), product of:
              0.14507145 = queryWeight, product of:
                1.3414493 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.01202494 = queryNorm
              0.7949133 = fieldWeight in 3936, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.0625 = fieldNorm(doc=3936)
          0.03178312 = weight(abstract_txt:authors in 3936) [ClassicSimilarity], result of:
            0.03178312 = score(doc=3936,freq=2.0), product of:
              0.07740847 = queryWeight, product of:
                1.3857744 = boost
                4.6452923 = idf(docFreq=1159, maxDocs=44421)
                0.01202494 = queryNorm
              0.4105897 = fieldWeight in 3936, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6452923 = idf(docFreq=1159, maxDocs=44421)
                0.0625 = fieldNorm(doc=3936)
          0.08505217 = weight(abstract_txt:name in 3936) [ClassicSimilarity], result of:
            0.08505217 = score(doc=3936,freq=4.0), product of:
              0.11842345 = queryWeight, product of:
                1.7140249 = boost
                5.7456303 = idf(docFreq=385, maxDocs=44421)
                0.01202494 = queryNorm
              0.7182038 = fieldWeight in 3936, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.7456303 = idf(docFreq=385, maxDocs=44421)
                0.0625 = fieldNorm(doc=3936)
          0.05397431 = weight(abstract_txt:clustering in 3936) [ClassicSimilarity], result of:
            0.05397431 = score(doc=3936,freq=1.0), product of:
              0.13882218 = queryWeight, product of:
                1.8557851 = boost
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.01202494 = queryNorm
              0.38880178 = fieldWeight in 3936, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.0625 = fieldNorm(doc=3936)
          0.11748371 = weight(abstract_txt:author in 3936) [ClassicSimilarity], result of:
            0.11748371 = score(doc=3936,freq=2.0), product of:
              0.26690066 = queryWeight, product of:
                4.4569087 = boost
                4.980042 = idf(docFreq=829, maxDocs=44421)
                0.01202494 = queryNorm
              0.44017768 = fieldWeight in 3936, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.980042 = idf(docFreq=829, maxDocs=44421)
                0.0625 = fieldNorm(doc=3936)
          0.49401298 = weight(abstract_txt:disambiguation in 3936) [ClassicSimilarity], result of:
            0.49401298 = score(doc=3936,freq=5.0), product of:
              0.48211393 = queryWeight, product of:
                5.468184 = boost
                7.33202 = idf(docFreq=78, maxDocs=44421)
                0.01202494 = queryNorm
              1.024681 = fieldWeight in 3936, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.33202 = idf(docFreq=78, maxDocs=44421)
                0.0625 = fieldNorm(doc=3936)
        0.32 = coord(8/25)
    
  2. Pooja, K.M.; Mondal, S.; Chandra, J.: ¬A graph combination with edge pruning-based approach for author name disambiguation (2020) 0.19
    0.1925094 = sum of:
      0.1925094 = product of:
        0.80212253 = sum of:
          0.04881589 = weight(abstract_txt:handle in 1060) [ClassicSimilarity], result of:
            0.04881589 = score(doc=1060,freq=2.0), product of:
              0.08178775 = queryWeight, product of:
                1.0072271 = boost
                6.7527075 = idf(docFreq=140, maxDocs=44421)
                0.01202494 = queryNorm
              0.59686065 = fieldWeight in 1060, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.7527075 = idf(docFreq=140, maxDocs=44421)
                0.0625 = fieldNorm(doc=1060)
          0.038926214 = weight(abstract_txt:authors in 1060) [ClassicSimilarity], result of:
            0.038926214 = score(doc=1060,freq=3.0), product of:
              0.07740847 = queryWeight, product of:
                1.3857744 = boost
                4.6452923 = idf(docFreq=1159, maxDocs=44421)
                0.01202494 = queryNorm
              0.50286764 = fieldWeight in 1060, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.6452923 = idf(docFreq=1159, maxDocs=44421)
                0.0625 = fieldNorm(doc=1060)
          0.08505217 = weight(abstract_txt:name in 1060) [ClassicSimilarity], result of:
            0.08505217 = score(doc=1060,freq=4.0), product of:
              0.11842345 = queryWeight, product of:
                1.7140249 = boost
                5.7456303 = idf(docFreq=385, maxDocs=44421)
                0.01202494 = queryNorm
              0.7182038 = fieldWeight in 1060, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.7456303 = idf(docFreq=385, maxDocs=44421)
                0.0625 = fieldNorm(doc=1060)
          0.11748371 = weight(abstract_txt:author in 1060) [ClassicSimilarity], result of:
            0.11748371 = score(doc=1060,freq=2.0), product of:
              0.26690066 = queryWeight, product of:
                4.4569087 = boost
                4.980042 = idf(docFreq=829, maxDocs=44421)
                0.01202494 = queryNorm
              0.44017768 = fieldWeight in 1060, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.980042 = idf(docFreq=829, maxDocs=44421)
                0.0625 = fieldNorm(doc=1060)
          0.2909152 = weight(abstract_txt:coauthors in 1060) [ClassicSimilarity], result of:
            0.2909152 = score(doc=1060,freq=1.0), product of:
              0.5376773 = queryWeight, product of:
                5.1650453 = boost
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.01202494 = queryNorm
              0.5410591 = fieldWeight in 1060, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.0625 = fieldNorm(doc=1060)
          0.22092931 = weight(abstract_txt:disambiguation in 1060) [ClassicSimilarity], result of:
            0.22092931 = score(doc=1060,freq=1.0), product of:
              0.48211393 = queryWeight, product of:
                5.468184 = boost
                7.33202 = idf(docFreq=78, maxDocs=44421)
                0.01202494 = queryNorm
              0.45825124 = fieldWeight in 1060, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.33202 = idf(docFreq=78, maxDocs=44421)
                0.0625 = fieldNorm(doc=1060)
        0.24 = coord(6/25)
    
  3. Kim, J.(im); Kim, J.(enna): Effect of forename string on author name disambiguation (2020) 0.19
    0.18959129 = sum of:
      0.18959129 = product of:
        0.94795644 = sum of:
          0.081543006 = weight(abstract_txt:disambiguated in 930) [ClassicSimilarity], result of:
            0.081543006 = score(doc=930,freq=1.0), product of:
              0.14507145 = queryWeight, product of:
                1.3414493 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.01202494 = queryNorm
              0.5620886 = fieldWeight in 930, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.0625 = fieldNorm(doc=930)
          0.02247406 = weight(abstract_txt:authors in 930) [ClassicSimilarity], result of:
            0.02247406 = score(doc=930,freq=1.0), product of:
              0.07740847 = queryWeight, product of:
                1.3857744 = boost
                4.6452923 = idf(docFreq=1159, maxDocs=44421)
                0.01202494 = queryNorm
              0.29033077 = fieldWeight in 930, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6452923 = idf(docFreq=1159, maxDocs=44421)
                0.0625 = fieldNorm(doc=930)
          0.073657334 = weight(abstract_txt:name in 930) [ClassicSimilarity], result of:
            0.073657334 = score(doc=930,freq=3.0), product of:
              0.11842345 = queryWeight, product of:
                1.7140249 = boost
                5.7456303 = idf(docFreq=385, maxDocs=44421)
                0.01202494 = queryNorm
              0.6219827 = fieldWeight in 930, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7456303 = idf(docFreq=385, maxDocs=44421)
                0.0625 = fieldNorm(doc=930)
          0.18575807 = weight(abstract_txt:author in 930) [ClassicSimilarity], result of:
            0.18575807 = score(doc=930,freq=5.0), product of:
              0.26690066 = queryWeight, product of:
                4.4569087 = boost
                4.980042 = idf(docFreq=829, maxDocs=44421)
                0.01202494 = queryNorm
              0.69598204 = fieldWeight in 930, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.980042 = idf(docFreq=829, maxDocs=44421)
                0.0625 = fieldNorm(doc=930)
          0.584524 = weight(abstract_txt:disambiguation in 930) [ClassicSimilarity], result of:
            0.584524 = score(doc=930,freq=7.0), product of:
              0.48211393 = queryWeight, product of:
                5.468184 = boost
                7.33202 = idf(docFreq=78, maxDocs=44421)
                0.01202494 = queryNorm
              1.2124188 = fieldWeight in 930, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.33202 = idf(docFreq=78, maxDocs=44421)
                0.0625 = fieldNorm(doc=930)
        0.2 = coord(5/25)
    
  4. Ferreira, A.A.; Veloso, A.; Gonçalves, M.A.; Laender, A.H.F.: Self-training author name disambiguation for information scarce scenarios (2014) 0.19
    0.18838981 = sum of:
      0.18838981 = product of:
        0.941949 = sum of:
          0.02247406 = weight(abstract_txt:authors in 2292) [ClassicSimilarity], result of:
            0.02247406 = score(doc=2292,freq=1.0), product of:
              0.07740847 = queryWeight, product of:
                1.3857744 = boost
                4.6452923 = idf(docFreq=1159, maxDocs=44421)
                0.01202494 = queryNorm
              0.29033077 = fieldWeight in 2292, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6452923 = idf(docFreq=1159, maxDocs=44421)
                0.0625 = fieldNorm(doc=2292)
          0.060140964 = weight(abstract_txt:name in 2292) [ClassicSimilarity], result of:
            0.060140964 = score(doc=2292,freq=2.0), product of:
              0.11842345 = queryWeight, product of:
                1.7140249 = boost
                5.7456303 = idf(docFreq=385, maxDocs=44421)
                0.01202494 = queryNorm
              0.5078468 = fieldWeight in 2292, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7456303 = idf(docFreq=385, maxDocs=44421)
                0.0625 = fieldNorm(doc=2292)
          0.18575807 = weight(abstract_txt:author in 2292) [ClassicSimilarity], result of:
            0.18575807 = score(doc=2292,freq=5.0), product of:
              0.26690066 = queryWeight, product of:
                4.4569087 = boost
                4.980042 = idf(docFreq=829, maxDocs=44421)
                0.01202494 = queryNorm
              0.69598204 = fieldWeight in 2292, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.980042 = idf(docFreq=829, maxDocs=44421)
                0.0625 = fieldNorm(doc=2292)
          0.2909152 = weight(abstract_txt:coauthors in 2292) [ClassicSimilarity], result of:
            0.2909152 = score(doc=2292,freq=1.0), product of:
              0.5376773 = queryWeight, product of:
                5.1650453 = boost
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.01202494 = queryNorm
              0.5410591 = fieldWeight in 2292, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.0625 = fieldNorm(doc=2292)
          0.38266078 = weight(abstract_txt:disambiguation in 2292) [ClassicSimilarity], result of:
            0.38266078 = score(doc=2292,freq=3.0), product of:
              0.48211393 = queryWeight, product of:
                5.468184 = boost
                7.33202 = idf(docFreq=78, maxDocs=44421)
                0.01202494 = queryNorm
              0.7937144 = fieldWeight in 2292, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.33202 = idf(docFreq=78, maxDocs=44421)
                0.0625 = fieldNorm(doc=2292)
        0.2 = coord(5/25)
    
  5. Zhang, C.; Bu, Y.; Ding, Y.; Xu, J.: Understanding scientific collaboration : homophily, transitivity, and preferential attachment (2018) 0.17
    0.16580372 = sum of:
      0.16580372 = product of:
        0.82901853 = sum of:
          0.101928756 = weight(abstract_txt:coauthor in 11) [ClassicSimilarity], result of:
            0.101928756 = score(doc=11,freq=1.0), product of:
              0.14507145 = queryWeight, product of:
                1.3414493 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.01202494 = queryNorm
              0.70261073 = fieldWeight in 11, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.078125 = fieldNorm(doc=11)
          0.026236048 = weight(abstract_txt:features in 11) [ClassicSimilarity], result of:
            0.026236048 = score(doc=11,freq=1.0), product of:
              0.07395934 = queryWeight, product of:
                1.3545492 = boost
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.01202494 = queryNorm
              0.3547361 = fieldWeight in 11, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.078125 = fieldNorm(doc=11)
          0.0397289 = weight(abstract_txt:authors in 11) [ClassicSimilarity], result of:
            0.0397289 = score(doc=11,freq=2.0), product of:
              0.07740847 = queryWeight, product of:
                1.3857744 = boost
                4.6452923 = idf(docFreq=1159, maxDocs=44421)
                0.01202494 = queryNorm
              0.5132371 = fieldWeight in 11, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6452923 = idf(docFreq=1159, maxDocs=44421)
                0.078125 = fieldNorm(doc=11)
          0.14685464 = weight(abstract_txt:author in 11) [ClassicSimilarity], result of:
            0.14685464 = score(doc=11,freq=2.0), product of:
              0.26690066 = queryWeight, product of:
                4.4569087 = boost
                4.980042 = idf(docFreq=829, maxDocs=44421)
                0.01202494 = queryNorm
              0.5502221 = fieldWeight in 11, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.980042 = idf(docFreq=829, maxDocs=44421)
                0.078125 = fieldNorm(doc=11)
          0.5142702 = weight(abstract_txt:coauthors in 11) [ClassicSimilarity], result of:
            0.5142702 = score(doc=11,freq=2.0), product of:
              0.5376773 = queryWeight, product of:
                5.1650453 = boost
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.01202494 = queryNorm
              0.9564663 = fieldWeight in 11, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.078125 = fieldNorm(doc=11)
        0.2 = coord(5/25)
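
For the content-based matches, the final score is the sum of the per-term weights multiplied by the coordination factor coord(matched/total). The sketch below takes the per-term weights as listed above instead of recomputing them; the function name `combined_score` is hypothetical.

```python
def combined_score(term_weights, matched, total):
    """Sum the per-term weights and scale by coord = matched / total."""
    return sum(term_weights) * (matched / total)


# Per-term weights copied from the listing above.
doc_3936 = [0.059367184, 0.081543006, 0.11531922, 0.03178312,
            0.08505217, 0.05397431, 0.11748371, 0.49401298]
doc_1060 = [0.04881589, 0.038926214, 0.08505217,
            0.11748371, 0.2909152, 0.22092931]
doc_930 = [0.081543006, 0.02247406, 0.073657334, 0.18575807, 0.584524]

score_3936 = combined_score(doc_3936, matched=8, total=25)
score_1060 = combined_score(doc_1060, matched=6, total=25)
score_930 = combined_score(doc_930, matched=5, total=25)
print(score_3936, score_1060, score_930)
```

The coordination factor explains the ordering of entries 2 and 3: document 930 has the larger raw sum (about 0.948 against 0.802), but it matches only 5 of the 25 query terms where document 1060 matches 6, so it ends up slightly lower after scaling.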