Document (#40267)

Author
Zhao, G.
Wu, J.
Wang, D.
Li, T.
Title
Entity disambiguation to Wikipedia using collective ranking
Source
Information processing and management. 52(2016) no.6, pp.1247-1257
Year
2016
Abstract
Entity disambiguation is a fundamental task of semantic Web annotation. Entity Linking (EL) is an essential procedure in entity disambiguation, which aims to link a mention appearing in plain text to a structured or semi-structured knowledge base, such as Wikipedia. Existing research on EL usually annotates the mentions in a text one by one and treats entities as independent of each other. However, this assumption may not hold in many application scenarios. For example, if two mentions appear in one text, they are likely to have certain intrinsic relationships. In this paper, we first propose a novel query expansion method for candidate generation that utilizes co-occurrence information of mentions. We further propose a re-ranking model which can be iteratively adjusted based on the prediction in the previous round. Experiments on real-world data demonstrate the effectiveness of our proposed methods for entity disambiguation.
Content
Cf.: http://www.sciencedirect.com/science/article/pii/S030645731630098X.

Similar documents (author)

  1. Wang, X.; High, A.; Wang, X.; Zhao, K.: Predicting users' continued engagement in online health communities from the quantity and quality of received support (2021) 3.61
    3.6074152 = sum of:
      3.6074152 = sum of:
        1.7613496 = weight(author_txt:wang in 1243) [ClassicSimilarity], result of:
          1.7613496 = score(doc=1243,freq=2.0), product of:
            0.60970986 = queryWeight, product of:
              6.5366817 = idf(docFreq=174, maxDocs=44421)
              0.09327513 = queryNorm
            2.8888323 = fieldWeight in 1243, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.5366817 = idf(docFreq=174, maxDocs=44421)
              0.3125 = fieldNorm(doc=1243)
        1.8460655 = weight(author_txt:zhao in 1243) [ClassicSimilarity], result of:
          1.8460655 = score(doc=1243,freq=1.0), product of:
            0.79262465 = queryWeight, product of:
              1.1401768 = boost
              7.4529724 = idf(docFreq=69, maxDocs=44421)
              0.09327513 = queryNorm
            2.3290539 = fieldWeight in 1243, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.4529724 = idf(docFreq=69, maxDocs=44421)
              0.3125 = fieldNorm(doc=1243)
    
  2. Wang, C.; Zhao, S.; Kalra, A.; Borcea, C.; Chen, Y.: Predictive models and analysis for webpage depth-level dwell time (2018) 3.09
    3.0915277 = sum of:
      3.0915277 = sum of:
        1.2454622 = weight(author_txt:wang in 370) [ClassicSimilarity], result of:
          1.2454622 = score(doc=370,freq=1.0), product of:
            0.60970986 = queryWeight, product of:
              6.5366817 = idf(docFreq=174, maxDocs=44421)
              0.09327513 = queryNorm
            2.042713 = fieldWeight in 370, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.5366817 = idf(docFreq=174, maxDocs=44421)
              0.3125 = fieldNorm(doc=370)
        1.8460655 = weight(author_txt:zhao in 370) [ClassicSimilarity], result of:
          1.8460655 = score(doc=370,freq=1.0), product of:
            0.79262465 = queryWeight, product of:
              1.1401768 = boost
              7.4529724 = idf(docFreq=69, maxDocs=44421)
              0.09327513 = queryNorm
            2.3290539 = fieldWeight in 370, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.4529724 = idf(docFreq=69, maxDocs=44421)
              0.3125 = fieldNorm(doc=370)
    
  3. Wang, X.; Zhang, M.; Fan, W.; Zhao, K.: Understanding the spread of COVID-19 misinformation on social media : the effects of topics and a political leader's nudge (2022) 3.09
    3.0915277 = sum of:
      3.0915277 = sum of:
        1.2454622 = weight(author_txt:wang in 1550) [ClassicSimilarity], result of:
          1.2454622 = score(doc=1550,freq=1.0), product of:
            0.60970986 = queryWeight, product of:
              6.5366817 = idf(docFreq=174, maxDocs=44421)
              0.09327513 = queryNorm
            2.042713 = fieldWeight in 1550, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.5366817 = idf(docFreq=174, maxDocs=44421)
              0.3125 = fieldNorm(doc=1550)
        1.8460655 = weight(author_txt:zhao in 1550) [ClassicSimilarity], result of:
          1.8460655 = score(doc=1550,freq=1.0), product of:
            0.79262465 = queryWeight, product of:
              1.1401768 = boost
              7.4529724 = idf(docFreq=69, maxDocs=44421)
              0.09327513 = queryNorm
            2.3290539 = fieldWeight in 1550, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.4529724 = idf(docFreq=69, maxDocs=44421)
              0.3125 = fieldNorm(doc=1550)
    
  4. Chen, K.; Zhao, Y.; Song, N.; Han, Y.; Peng, J.; Wang, J.: You are not alone : characterizing users' relationship-layer identities in online health communities (2024) 2.47
    2.4732223 = sum of:
      2.4732223 = sum of:
        0.99636984 = weight(author_txt:wang in 2300) [ClassicSimilarity], result of:
          0.99636984 = score(doc=2300,freq=1.0), product of:
            0.60970986 = queryWeight, product of:
              6.5366817 = idf(docFreq=174, maxDocs=44421)
              0.09327513 = queryNorm
            1.6341704 = fieldWeight in 2300, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              6.5366817 = idf(docFreq=174, maxDocs=44421)
              0.25 = fieldNorm(doc=2300)
        1.4768524 = weight(author_txt:zhao in 2300) [ClassicSimilarity], result of:
          1.4768524 = score(doc=2300,freq=1.0), product of:
            0.79262465 = queryWeight, product of:
              1.1401768 = boost
              7.4529724 = idf(docFreq=69, maxDocs=44421)
              0.09327513 = queryNorm
            1.8632431 = fieldWeight in 2300, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.4529724 = idf(docFreq=69, maxDocs=44421)
              0.25 = fieldNorm(doc=2300)
    
  5. Zhao, L.: Save space for "newcomers" : analyzing problems in book number assignment under the LCC system (2004) 1.85
    1.8460655 = sum of:
      1.8460655 = product of:
        3.692131 = sum of:
          3.692131 = weight(author_txt:zhao in 4081) [ClassicSimilarity], result of:
            3.692131 = score(doc=4081,freq=1.0), product of:
              0.79262465 = queryWeight, product of:
                1.1401768 = boost
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.09327513 = queryNorm
              4.6581078 = fieldWeight in 4081, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.625 = fieldNorm(doc=4081)
        0.5 = coord(1/2)
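
The per-term numbers in the score trees above can be reproduced by hand. The following is a minimal sketch, assuming the standard ClassicSimilarity formulas of Lucene (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and are not part of any library API. It recomputes the two term weights of result 1 (doc=1243) from the values printed in its tree. Lucene works in 32-bit floats, so the last digits may differ slightly.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency as used by ClassicSimilarity.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    # Term frequency factor: square root of the raw frequency.
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    # score = queryWeight * fieldWeight, mirroring the explain tree.
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm
    field_weight = classic_tf(freq) * idf * field_norm
    return query_weight * field_weight

# Values copied from the tree of result 1 (doc=1243).
wang = term_score(freq=2.0, doc_freq=174, max_docs=44421,
                  query_norm=0.09327513, field_norm=0.3125)
zhao = term_score(freq=1.0, doc_freq=69, max_docs=44421,
                  query_norm=0.09327513, field_norm=0.3125,
                  boost=1.1401768)
total = wang + zhao

print(round(wang, 4), round(zhao, 4), round(total, 4))
```

The two term scores add up to the document score because both query terms matched; result 5 matched only one of two terms, so its sum is additionally scaled by coord(1/2).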
    

Similar documents (content)

  1. Phan, M.C.; Sun, A.: Collective named entity recognition in user comments via parameterized label propagation (2020) 0.23
    0.2327526 = sum of:
      0.2327526 = product of:
        0.8312593 = sum of:
          0.045805812 = weight(abstract_txt:collective in 815) [ClassicSimilarity], result of:
            0.045805812 = score(doc=815,freq=1.0), product of:
              0.10650679 = queryWeight, product of:
                1.0384668 = boost
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.014904636 = queryNorm
              0.43007413 = fieldWeight in 815, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.0625 = fieldNorm(doc=815)
          0.047686055 = weight(abstract_txt:utilizing in 815) [ClassicSimilarity], result of:
            0.047686055 = score(doc=815,freq=1.0), product of:
              0.1094018 = queryWeight, product of:
                1.0524857 = boost
                6.9740796 = idf(docFreq=112, maxDocs=44421)
                0.014904636 = queryNorm
              0.43587998 = fieldWeight in 815, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9740796 = idf(docFreq=112, maxDocs=44421)
                0.0625 = fieldNorm(doc=815)
          0.09120706 = weight(abstract_txt:mention in 815) [ClassicSimilarity], result of:
            0.09120706 = score(doc=815,freq=2.0), product of:
              0.13379478 = queryWeight, product of:
                1.1639211 = boost
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.014904636 = queryNorm
              0.6816937 = fieldWeight in 815, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.0625 = fieldNorm(doc=815)
          0.038222536 = weight(abstract_txt:propose in 815) [ClassicSimilarity], result of:
            0.038222536 = score(doc=815,freq=1.0), product of:
              0.1189378 = queryWeight, product of:
                1.5519543 = boost
                5.1418524 = idf(docFreq=705, maxDocs=44421)
                0.014904636 = queryNorm
              0.32136577 = fieldWeight in 815, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1418524 = idf(docFreq=705, maxDocs=44421)
                0.0625 = fieldNorm(doc=815)
          0.039354596 = weight(abstract_txt:text in 815) [ClassicSimilarity], result of:
            0.039354596 = score(doc=815,freq=2.0), product of:
              0.11018546 = queryWeight, product of:
                1.8294761 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.014904636 = queryNorm
              0.3571669 = fieldWeight in 815, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=815)
          0.3237049 = weight(abstract_txt:mentions in 815) [ClassicSimilarity], result of:
            0.3237049 = score(doc=815,freq=3.0), product of:
              0.39222 = queryWeight, product of:
                3.4516716 = boost
                7.62393 = idf(docFreq=58, maxDocs=44421)
                0.014904636 = queryNorm
              0.82531464 = fieldWeight in 815, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.62393 = idf(docFreq=58, maxDocs=44421)
                0.0625 = fieldNorm(doc=815)
          0.24527834 = weight(abstract_txt:entity in 815) [ClassicSimilarity], result of:
            0.24527834 = score(doc=815,freq=2.0), product of:
              0.4424352 = queryWeight, product of:
                4.732753 = boost
                6.272122 = idf(docFreq=227, maxDocs=44421)
                0.014904636 = queryNorm
              0.5543825 = fieldWeight in 815, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.272122 = idf(docFreq=227, maxDocs=44421)
                0.0625 = fieldNorm(doc=815)
        0.28 = coord(7/25)
    
  2. Li, C.; Sun, A.; Datta, A.: TSDW: Two-stage word sense disambiguation using Wikipedia (2013) 0.15
    0.1498903 = sum of:
      0.1498903 = product of:
        0.9368144 = sum of:
          0.039354596 = weight(abstract_txt:text in 1956) [ClassicSimilarity], result of:
            0.039354596 = score(doc=1956,freq=2.0), product of:
              0.11018546 = queryWeight, product of:
                1.8294761 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.014904636 = queryNorm
              0.3571669 = fieldWeight in 1956, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=1956)
          0.13759938 = weight(abstract_txt:wikipedia in 1956) [ClassicSimilarity], result of:
            0.13759938 = score(doc=1956,freq=4.0), product of:
              0.17599401 = queryWeight, product of:
                1.8878518 = boost
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.014904636 = queryNorm
              0.7818413 = fieldWeight in 1956, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.0625 = fieldNorm(doc=1956)
          0.5864225 = weight(abstract_txt:disambiguation in 1956) [ClassicSimilarity], result of:
            0.5864225 = score(doc=1956,freq=7.0), product of:
              0.48367983 = queryWeight, product of:
                4.426016 = boost
                7.33202 = idf(docFreq=78, maxDocs=44421)
                0.014904636 = queryNorm
              1.2124188 = fieldWeight in 1956, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.33202 = idf(docFreq=78, maxDocs=44421)
                0.0625 = fieldNorm(doc=1956)
          0.17343797 = weight(abstract_txt:entity in 1956) [ClassicSimilarity], result of:
            0.17343797 = score(doc=1956,freq=1.0), product of:
              0.4424352 = queryWeight, product of:
                4.732753 = boost
                6.272122 = idf(docFreq=227, maxDocs=44421)
                0.014904636 = queryNorm
              0.39200762 = fieldWeight in 1956, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.272122 = idf(docFreq=227, maxDocs=44421)
                0.0625 = fieldNorm(doc=1956)
        0.16 = coord(4/25)
    
  3. Gao, N.; Dredze, M.; Oard, D.W.: Person entity linking in email with NIL detection (2017) 0.12
    0.11993801 = sum of:
      0.11993801 = product of:
        0.74961257 = sum of:
          0.06449313 = weight(abstract_txt:mention in 4830) [ClassicSimilarity], result of:
            0.06449313 = score(doc=4830,freq=1.0), product of:
              0.13379478 = queryWeight, product of:
                1.1639211 = boost
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.014904636 = queryNorm
              0.4820302 = fieldWeight in 4830, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.0625 = fieldNorm(doc=4830)
          0.039354596 = weight(abstract_txt:text in 4830) [ClassicSimilarity], result of:
            0.039354596 = score(doc=4830,freq=2.0), product of:
              0.11018546 = queryWeight, product of:
                1.8294761 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.014904636 = queryNorm
              0.3571669 = fieldWeight in 4830, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=4830)
          0.18689111 = weight(abstract_txt:mentions in 4830) [ClassicSimilarity], result of:
            0.18689111 = score(doc=4830,freq=1.0), product of:
              0.39222 = queryWeight, product of:
                3.4516716 = boost
                7.62393 = idf(docFreq=58, maxDocs=44421)
                0.014904636 = queryNorm
              0.47649562 = fieldWeight in 4830, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.62393 = idf(docFreq=58, maxDocs=44421)
                0.0625 = fieldNorm(doc=4830)
          0.45887375 = weight(abstract_txt:entity in 4830) [ClassicSimilarity], result of:
            0.45887375 = score(doc=4830,freq=7.0), product of:
              0.4424352 = queryWeight, product of:
                4.732753 = boost
                6.272122 = idf(docFreq=227, maxDocs=44421)
                0.014904636 = queryNorm
              1.0371547 = fieldWeight in 4830, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.272122 = idf(docFreq=227, maxDocs=44421)
                0.0625 = fieldNorm(doc=4830)
        0.16 = coord(4/25)
    
  4. Xiong, C.: Knowledge based text representations for information retrieval (2016) 0.10
    0.10276228 = sum of:
      0.10276228 = product of:
        0.5138114 = sum of:
          0.028666902 = weight(abstract_txt:propose in 820) [ClassicSimilarity], result of:
            0.028666902 = score(doc=820,freq=1.0), product of:
              0.1189378 = queryWeight, product of:
                1.5519543 = boost
                5.1418524 = idf(docFreq=705, maxDocs=44421)
                0.014904636 = queryNorm
              0.24102433 = fieldWeight in 820, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1418524 = idf(docFreq=705, maxDocs=44421)
                0.046875 = fieldNorm(doc=820)
          0.047708772 = weight(abstract_txt:structured in 820) [ClassicSimilarity], result of:
            0.047708772 = score(doc=820,freq=2.0), product of:
              0.13257293 = queryWeight, product of:
                1.6384999 = boost
                5.428591 = idf(docFreq=529, maxDocs=44421)
                0.014904636 = queryNorm
              0.3598681 = fieldWeight in 820, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.428591 = idf(docFreq=529, maxDocs=44421)
                0.046875 = fieldNorm(doc=820)
          0.08266032 = weight(abstract_txt:ranking in 820) [ClassicSimilarity], result of:
            0.08266032 = score(doc=820,freq=5.0), product of:
              0.14090966 = queryWeight, product of:
                1.6892322 = boost
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.014904636 = queryNorm
              0.58661926 = fieldWeight in 820, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.046875 = fieldNorm(doc=820)
          0.036149506 = weight(abstract_txt:text in 820) [ClassicSimilarity], result of:
            0.036149506 = score(doc=820,freq=3.0), product of:
              0.11018546 = queryWeight, product of:
                1.8294761 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.014904636 = queryNorm
              0.32807875 = fieldWeight in 820, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.046875 = fieldNorm(doc=820)
          0.3186259 = weight(abstract_txt:entity in 820) [ClassicSimilarity], result of:
            0.3186259 = score(doc=820,freq=6.0), product of:
              0.4424352 = queryWeight, product of:
                4.732753 = boost
                6.272122 = idf(docFreq=227, maxDocs=44421)
                0.014904636 = queryNorm
              0.720164 = fieldWeight in 820, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.272122 = idf(docFreq=227, maxDocs=44421)
                0.046875 = fieldNorm(doc=820)
        0.2 = coord(5/25)
    
  5. Vechtomova, O.; Robertson, S.E.: A domain-independent approach to finding related entities (2012) 0.09
    0.09110254 = sum of:
      0.09110254 = product of:
        0.7591879 = sum of:
          0.13781753 = weight(abstract_txt:candidate in 3733) [ClassicSimilarity], result of:
            0.13781753 = score(doc=3733,freq=4.0), product of:
              0.1205054 = queryWeight, product of:
                1.1046056 = boost
                7.319441 = idf(docFreq=79, maxDocs=44421)
                0.014904636 = queryNorm
              1.1436627 = fieldWeight in 3733, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.319441 = idf(docFreq=79, maxDocs=44421)
                0.078125 = fieldNorm(doc=3733)
          0.047778174 = weight(abstract_txt:propose in 3733) [ClassicSimilarity], result of:
            0.047778174 = score(doc=3733,freq=1.0), product of:
              0.1189378 = queryWeight, product of:
                1.5519543 = boost
                5.1418524 = idf(docFreq=705, maxDocs=44421)
                0.014904636 = queryNorm
              0.40170723 = fieldWeight in 3733, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1418524 = idf(docFreq=705, maxDocs=44421)
                0.078125 = fieldNorm(doc=3733)
          0.5735922 = weight(abstract_txt:entity in 3733) [ClassicSimilarity], result of:
            0.5735922 = score(doc=3733,freq=7.0), product of:
              0.4424352 = queryWeight, product of:
                4.732753 = boost
                6.272122 = idf(docFreq=227, maxDocs=44421)
                0.014904636 = queryNorm
              1.2964433 = fieldWeight in 3733, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.272122 = idf(docFreq=227, maxDocs=44421)
                0.078125 = fieldNorm(doc=3733)
        0.12 = coord(3/25)
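
The content-based scores above combine the matched term weights with a coordination factor. As a rough sketch, assuming the usual ClassicSimilarity behaviour (the sum of matched term scores is multiplied by matched terms / total query terms), the final score of result 1 (doc=815) can be recomputed from the seven term scores printed in its tree. The helper name `coord_score` is hypothetical.

```python
def coord_score(term_scores, total_query_terms):
    # coord = number of matched query terms / number of query terms.
    coord = len(term_scores) / total_query_terms
    return sum(term_scores) * coord

# Term scores copied from the tree of result 1 (doc=815), in order:
# collective, utilizing, mention, propose, text, mentions, entity.
phan_sun_terms = [
    0.045805812,
    0.047686055,
    0.09120706,
    0.038222536,
    0.039354596,
    0.3237049,
    0.24527834,
]

# The query built from the abstract has 25 terms; 7 of them matched.
phan_sun_score = coord_score(phan_sun_terms, 25)

print(round(phan_sun_score, 4))
```

The factor explains why a document with a few strongly matching terms, such as result 5 with coord(3/25), can still rank below one with many weaker matches.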