Document (#38891)

Thomas, B.
Name disambiguation : learning from more user-friendly models
Cataloging and classification quarterly. 49(2011) no.3, p.223-232
Library catalogs do not provide catalog users with the assistance they need to easily and confidently select the person they are interested in. Examples are provided of Web services that do a better job of helping information seekers differentiate the person they are seeking from those with similar names. Some of the reasons for this failure in library catalogs are examined. This article then looks at how much information is necessary to help users disambiguate names, how that information could be captured and shared, and some ways the information could be displayed in library catalogs.

Similar documents (author)

  1. Thomas, D.: Book indexing principles and standards (1989) 4.66
    4.655245 = sum of:
      4.655245 = weight(author_txt:thomas in 865) [ClassicSimilarity], result of:
        4.655245 = fieldWeight in 865, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.448392 = idf(docFreq=69, maxDocs=44218)
          0.625 = fieldNorm(doc=865)
  2. Thomas, A.R.: Options in the arrangement of library materials and the new edition of the Bliss Bibliographic Classification (1992) 4.66
    4.655245 = sum of:
      4.655245 = weight(author_txt:thomas in 3934) [ClassicSimilarity], result of:
        4.655245 = fieldWeight in 3934, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.448392 = idf(docFreq=69, maxDocs=44218)
          0.625 = fieldNorm(doc=3934)
  3. Thomas, A.: Bliss regained : the second edition of the Bliss Bibliographic Classification (1993) 4.66
    4.655245 = sum of:
      4.655245 = weight(author_txt:thomas in 5077) [ClassicSimilarity], result of:
        4.655245 = fieldWeight in 5077, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.448392 = idf(docFreq=69, maxDocs=44218)
          0.625 = fieldNorm(doc=5077)
  4. Thomas, S.E.: CatTutor: a prototypical hypertext tutorial for catalogers (1992) 4.66
    4.655245 = sum of:
      4.655245 = weight(author_txt:thomas in 1384) [ClassicSimilarity], result of:
        4.655245 = fieldWeight in 1384, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.448392 = idf(docFreq=69, maxDocs=44218)
          0.625 = fieldNorm(doc=1384)
  5. Thomas, A.R.: CAPS (Counseling and Personnel Services Clearinghouse) : the work of ERIC Clearinghouse. (1989) 4.66
    4.655245 = sum of:
      4.655245 = weight(author_txt:thomas in 1537) [ClassicSimilarity], result of:
        4.655245 = fieldWeight in 1537, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.448392 = idf(docFreq=69, maxDocs=44218)
          0.625 = fieldNorm(doc=1537)
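
The five author matches above all carry the same score because each tree multiplies the same three factors. The following is a minimal sketch of that arithmetic, assuming the classic TF-IDF definitions that the listed numbers are consistent with (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative and are not part of any library.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed idf form; it reproduces idf(docFreq=69, maxDocs=44218) from the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    return math.sqrt(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values taken from the author_txt:thomas trees (freq=1, docFreq=69, fieldNorm=0.625)
author_idf = classic_idf(69, 44218)
author_score = field_weight(1.0, 69, 44218, 0.625)
print(round(author_idf, 4), round(author_score, 4))
```

Both values agree with the listing (7.448392 and 4.655245) to within rounding, since the search engine computes them in single precision.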

Similar documents (content)

  1. Vu, Q.M.; Takasu, A.; Adachi, J.: Improving the performance of personal name disambiguation using web directories (2008) 0.24
    0.24215506 = sum of:
      0.24215506 = product of:
        0.8648395 = sum of:
          0.0981367 = weight(abstract_txt:name in 2108) [ClassicSimilarity], result of:
            0.0981367 = score(doc=2108,freq=3.0), product of:
              0.12621084 = queryWeight, product of:
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.021964055 = queryNorm
              0.7775616 = fieldWeight in 2108, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.078125 = fieldNorm(doc=2108)
          0.038422626 = weight(abstract_txt:users in 2108) [ClassicSimilarity], result of:
            0.038422626 = score(doc=2108,freq=2.0), product of:
              0.09741836 = queryWeight, product of:
                1.2424734 = boost
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.021964055 = queryNorm
              0.39440846 = fieldWeight in 2108, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.078125 = fieldNorm(doc=2108)
          0.16701415 = weight(abstract_txt:disambiguation in 2108) [ClassicSimilarity], result of:
            0.16701415 = score(doc=2108,freq=2.0), product of:
              0.20594043 = queryWeight, product of:
                1.277387 = boost
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.021964055 = queryNorm
              0.8109828 = fieldWeight in 2108, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.078125 = fieldNorm(doc=2108)
          0.2735507 = weight(abstract_txt:disambiguate in 2108) [ClassicSimilarity], result of:
            0.2735507 = score(doc=2108,freq=2.0), product of:
              0.2861528 = queryWeight, product of:
                1.5057424 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.021964055 = queryNorm
              0.9559602 = fieldWeight in 2108, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.078125 = fieldNorm(doc=2108)
          0.016948601 = weight(abstract_txt:information in 2108) [ClassicSimilarity], result of:
            0.016948601 = score(doc=2108,freq=1.0), product of:
              0.08961046 = queryWeight, product of:
                1.6852372 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.021964055 = queryNorm
              0.18913643 = fieldWeight in 2108, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.078125 = fieldNorm(doc=2108)
          0.11854454 = weight(abstract_txt:names in 2108) [ClassicSimilarity], result of:
            0.11854454 = score(doc=2108,freq=1.0), product of:
              0.26012403 = queryWeight, product of:
                2.0302846 = boost
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.021964055 = queryNorm
              0.45572314 = fieldWeight in 2108, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.078125 = fieldNorm(doc=2108)
          0.15222222 = weight(abstract_txt:person in 2108) [ClassicSimilarity], result of:
            0.15222222 = score(doc=2108,freq=1.0), product of:
              0.30731103 = queryWeight, product of:
                2.2067633 = boost
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.021964055 = queryNorm
              0.49533603 = fieldWeight in 2108, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.078125 = fieldNorm(doc=2108)
        0.28 = coord(7/25)
  2. Delgado, A.D.; Martínez, R.; Montalvo, S.; Fresno, V.: Person name disambiguation in the Web using adaptive threshold clustering (2017) 0.17
    0.16530944 = sum of:
      0.16530944 = product of:
        0.68878937 = sum of:
          0.056659248 = weight(abstract_txt:name in 3694) [ClassicSimilarity], result of:
            0.056659248 = score(doc=3694,freq=1.0), product of:
              0.12621084 = queryWeight, product of:
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.021964055 = queryNorm
              0.44892538 = fieldWeight in 3694, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.078125 = fieldNorm(doc=3694)
          0.118096836 = weight(abstract_txt:disambiguation in 3694) [ClassicSimilarity], result of:
            0.118096836 = score(doc=3694,freq=1.0), product of:
              0.20594043 = queryWeight, product of:
                1.277387 = boost
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.021964055 = queryNorm
              0.57345146 = fieldWeight in 3694, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.078125 = fieldNorm(doc=3694)
          0.06491197 = weight(abstract_txt:could in 3694) [ClassicSimilarity], result of:
            0.06491197 = score(doc=3694,freq=1.0), product of:
              0.17410423 = queryWeight, product of:
                1.6610065 = boost
                4.772275 = idf(docFreq=1016, maxDocs=44218)
                0.021964055 = queryNorm
              0.37283397 = fieldWeight in 3694, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.772275 = idf(docFreq=1016, maxDocs=44218)
                0.078125 = fieldNorm(doc=3694)
          0.066920206 = weight(abstract_txt:they in 3694) [ClassicSimilarity], result of:
            0.066920206 = score(doc=3694,freq=2.0), product of:
              0.16143017 = queryWeight, product of:
                1.9588656 = boost
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.021964055 = queryNorm
              0.41454583 = fieldWeight in 3694, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.078125 = fieldNorm(doc=3694)
          0.11854454 = weight(abstract_txt:names in 3694) [ClassicSimilarity], result of:
            0.11854454 = score(doc=3694,freq=1.0), product of:
              0.26012403 = queryWeight, product of:
                2.0302846 = boost
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.021964055 = queryNorm
              0.45572314 = fieldWeight in 3694, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.078125 = fieldNorm(doc=3694)
          0.26365662 = weight(abstract_txt:person in 3694) [ClassicSimilarity], result of:
            0.26365662 = score(doc=3694,freq=3.0), product of:
              0.30731103 = queryWeight, product of:
                2.2067633 = boost
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.021964055 = queryNorm
              0.8579472 = fieldWeight in 3694, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.078125 = fieldNorm(doc=3694)
        0.24 = coord(6/25)
  3. Crane, G.; Jones, A.: Text, information, knowledge and the evolving record of humanity (2006) 0.16
    0.16277134 = sum of:
      0.16277134 = product of:
        0.40692836 = sum of:
          0.04434288 = weight(abstract_txt:name in 1182) [ClassicSimilarity], result of:
            0.04434288 = score(doc=1182,freq=5.0), product of:
              0.12621084 = queryWeight, product of:
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.021964055 = queryNorm
              0.3513397 = fieldWeight in 1182, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.02734375 = fieldNorm(doc=1182)
          0.01344792 = weight(abstract_txt:users in 1182) [ClassicSimilarity], result of:
            0.01344792 = score(doc=1182,freq=2.0), product of:
              0.09741836 = queryWeight, product of:
                1.2424734 = boost
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.021964055 = queryNorm
              0.13804297 = fieldWeight in 1182, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.02734375 = fieldNorm(doc=1182)
          0.014707632 = weight(abstract_txt:some in 1182) [ClassicSimilarity], result of:
            0.014707632 = score(doc=1182,freq=2.0), product of:
              0.1034108 = queryWeight, product of:
                1.2801169 = boost
                3.6779325 = idf(docFreq=3037, maxDocs=44218)
                0.021964055 = queryNorm
              0.1422253 = fieldWeight in 1182, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6779325 = idf(docFreq=3037, maxDocs=44218)
                0.02734375 = fieldNorm(doc=1182)
          0.04200002 = weight(abstract_txt:captured in 1182) [ClassicSimilarity], result of:
            0.04200002 = score(doc=1182,freq=1.0), product of:
              0.20814712 = queryWeight, product of:
                1.2842125 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.021964055 = queryNorm
              0.20178045 = fieldWeight in 1182, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.02734375 = fieldNorm(doc=1182)
          0.032129787 = weight(abstract_txt:could in 1182) [ClassicSimilarity], result of:
            0.032129787 = score(doc=1182,freq=2.0), product of:
              0.17410423 = queryWeight, product of:
                1.6610065 = boost
                4.772275 = idf(docFreq=1016, maxDocs=44218)
                0.021964055 = queryNorm
              0.1845434 = fieldWeight in 1182, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.772275 = idf(docFreq=1016, maxDocs=44218)
                0.02734375 = fieldNorm(doc=1182)
          0.020294059 = weight(abstract_txt:library in 1182) [ClassicSimilarity], result of:
            0.020294059 = score(doc=1182,freq=4.0), product of:
              0.116449356 = queryWeight, product of:
                1.6637223 = boost
                3.1867187 = idf(docFreq=4964, maxDocs=44218)
                0.021964055 = queryNorm
              0.17427368 = fieldWeight in 1182, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.1867187 = idf(docFreq=4964, maxDocs=44218)
                0.02734375 = fieldNorm(doc=1182)
          0.0145304 = weight(abstract_txt:information in 1182) [ClassicSimilarity], result of:
            0.0145304 = score(doc=1182,freq=6.0), product of:
              0.08961046 = queryWeight, product of:
                1.6852372 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.021964055 = queryNorm
              0.16215071 = fieldWeight in 1182, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.02734375 = fieldNorm(doc=1182)
          0.023422072 = weight(abstract_txt:they in 1182) [ClassicSimilarity], result of:
            0.023422072 = score(doc=1182,freq=2.0), product of:
              0.16143017 = queryWeight, product of:
                1.9588656 = boost
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.021964055 = queryNorm
              0.14509104 = fieldWeight in 1182, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.02734375 = fieldNorm(doc=1182)
          0.10977378 = weight(abstract_txt:names in 1182) [ClassicSimilarity], result of:
            0.10977378 = score(doc=1182,freq=7.0), product of:
              0.26012403 = queryWeight, product of:
                2.0302846 = boost
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.021964055 = queryNorm
              0.42200553 = fieldWeight in 1182, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.02734375 = fieldNorm(doc=1182)
          0.09227982 = weight(abstract_txt:person in 1182) [ClassicSimilarity], result of:
            0.09227982 = score(doc=1182,freq=3.0), product of:
              0.30731103 = queryWeight, product of:
                2.2067633 = boost
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.021964055 = queryNorm
              0.30028152 = fieldWeight in 1182, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.02734375 = fieldNorm(doc=1182)
        0.4 = coord(10/25)
  4. Sardo, L.: Multiple names (2004) 0.15
    0.1545413 = sum of:
      0.1545413 = product of:
        0.7727065 = sum of:
          0.0906548 = weight(abstract_txt:name in 6048) [ClassicSimilarity], result of:
            0.0906548 = score(doc=6048,freq=1.0), product of:
              0.12621084 = queryWeight, product of:
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.021964055 = queryNorm
              0.7182806 = fieldWeight in 6048, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.125 = fieldNorm(doc=6048)
          0.047542244 = weight(abstract_txt:some in 6048) [ClassicSimilarity], result of:
            0.047542244 = score(doc=6048,freq=1.0), product of:
              0.1034108 = queryWeight, product of:
                1.2801169 = boost
                3.6779325 = idf(docFreq=3037, maxDocs=44218)
                0.021964055 = queryNorm
              0.45974156 = fieldWeight in 6048, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6779325 = idf(docFreq=3037, maxDocs=44218)
                0.125 = fieldNorm(doc=6048)
          0.046386417 = weight(abstract_txt:library in 6048) [ClassicSimilarity], result of:
            0.046386417 = score(doc=6048,freq=1.0), product of:
              0.116449356 = queryWeight, product of:
                1.6637223 = boost
                3.1867187 = idf(docFreq=4964, maxDocs=44218)
                0.021964055 = queryNorm
              0.39833984 = fieldWeight in 6048, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1867187 = idf(docFreq=4964, maxDocs=44218)
                0.125 = fieldNorm(doc=6048)
          0.26823565 = weight(abstract_txt:names in 6048) [ClassicSimilarity], result of:
            0.26823565 = score(doc=6048,freq=2.0), product of:
              0.26012403 = queryWeight, product of:
                2.0302846 = boost
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.021964055 = queryNorm
              1.0311837 = fieldWeight in 6048, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.125 = fieldNorm(doc=6048)
          0.31988737 = weight(abstract_txt:catalogs in 6048) [ClassicSimilarity], result of:
            0.31988737 = score(doc=6048,freq=1.0), product of:
              0.42189845 = queryWeight, product of:
                3.166768 = boost
                6.0656753 = idf(docFreq=278, maxDocs=44218)
                0.021964055 = queryNorm
              0.7582094 = fieldWeight in 6048, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0656753 = idf(docFreq=278, maxDocs=44218)
                0.125 = fieldNorm(doc=6048)
        0.2 = coord(5/25)
  5. Kim, J.; Kim, J.; Owen-Smith, J.: Ethnicity-based name partitioning for author name disambiguation using supervised machine learning (2021) 0.14
    0.14472651 = sum of:
      0.14472651 = product of:
        0.7236326 = sum of:
          0.16959961 = weight(abstract_txt:name in 311) [ClassicSimilarity], result of:
            0.16959961 = score(doc=311,freq=14.0), product of:
              0.12621084 = queryWeight, product of:
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.021964055 = queryNorm
              1.34378 = fieldWeight in 311, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.0625 = fieldNorm(doc=311)
          0.21125804 = weight(abstract_txt:disambiguation in 311) [ClassicSimilarity], result of:
            0.21125804 = score(doc=311,freq=5.0), product of:
              0.20594043 = queryWeight, product of:
                1.277387 = boost
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.021964055 = queryNorm
              1.0258211 = fieldWeight in 311, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.0625 = fieldNorm(doc=311)
          0.023771122 = weight(abstract_txt:some in 311) [ClassicSimilarity], result of:
            0.023771122 = score(doc=311,freq=1.0), product of:
              0.1034108 = queryWeight, product of:
                1.2801169 = boost
                3.6779325 = idf(docFreq=3037, maxDocs=44218)
                0.021964055 = queryNorm
              0.22987078 = fieldWeight in 311, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6779325 = idf(docFreq=3037, maxDocs=44218)
                0.0625 = fieldNorm(doc=311)
          0.15474366 = weight(abstract_txt:disambiguate in 311) [ClassicSimilarity], result of:
            0.15474366 = score(doc=311,freq=1.0), product of:
              0.2861528 = queryWeight, product of:
                1.5057424 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.021964055 = queryNorm
              0.5407728 = fieldWeight in 311, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.0625 = fieldNorm(doc=311)
          0.16426012 = weight(abstract_txt:names in 311) [ClassicSimilarity], result of:
            0.16426012 = score(doc=311,freq=3.0), product of:
              0.26012403 = queryWeight, product of:
                2.0302846 = boost
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.021964055 = queryNorm
              0.6314685 = fieldWeight in 311, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.0625 = fieldNorm(doc=311)
        0.2 = coord(5/25)
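
The content scores above combine a query-side weight and a field-side weight for each matched term, then scale the sum by the coordination factor (matched terms out of 25 query terms). As a hedged sketch, the first result (doc 2108) can be recomputed from the parameters shown in its tree. The idf formula is the same assumption as for the author scores, and the helper names are hypothetical.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.021964055   # queryNorm, as listed in every tree
FIELD_NORM = 0.078125      # fieldNorm(doc=2108)

def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed classic idf form, consistent with the listed idf values
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, boost=1.0):
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM             # queryWeight
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM   # fieldWeight
    return query_weight * field_weight

# (freq, docFreq, boost) for the seven terms matched in doc 2108
matched_terms = [
    (3.0, 383, 1.0),          # name (no boost listed)
    (2.0, 3384, 1.2424734),   # users
    (2.0, 77, 1.277387),      # disambiguation
    (2.0, 20, 1.5057424),     # disambiguate
    (1.0, 10677, 1.6852372),  # information
    (1.0, 351, 2.0302846),    # names
    (1.0, 211, 2.2067633),    # person
]

coord = 7 / 25  # coord(7/25): 7 of the 25 query terms matched
total = coord * sum(term_score(*t) for t in matched_terms)
print(round(total, 4))
```

The result is approximately 0.2422, in line with the 0.24215506 shown for the first hit. The coordination factor also explains why the third result ranks below the first two despite matching more terms: its much smaller fieldNorm (0.02734375) shrinks every term score before coord(10/25) is applied.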