Document (#34368)

Author
Wu, D.-S.
Liang, T.
Title
Chinese pronominal anaphora resolution using lexical knowledge and entropy-based weight
Source
Journal of the American Society for Information Science and Technology. 59(2008) no.13, p.2138-2145
Year
2008
Abstract
Pronominal anaphors are common in written texts. This article addresses Chinese pronominal anaphora resolution through lexical knowledge acquisition and salience measurement. Lexical knowledge acquisition aims to extract additional semantic features, such as gender, number, and collocate compatibility, by employing multiple resources. The salience measurement uses entropy-based weighting to select antecedent candidates. The resolution is evaluated on a real corpus and compared with a rule-based model. Experimental results from five-fold cross-validation show that the approach yields an 82.5% success rate on 1343 anaphoric instances. Compared with a general rule-based approach, performance improves by 7%.
Theme
Computational linguistics

Similar documents (author)

  1. Liang, D.F.: Mathematical journals : an annotated guide (1992) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:liang in 247) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 247, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=247)
    
  2. Liang, L.: R-Sequences : relative indicators for the rhythm of science (2005) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:liang in 4877) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 4877, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=4877)
    
  3. Liang, T.-Y.: The basic entity model : a fundamental theoretical model of information and information processing (1994) 4.15
    4.150135 = sum of:
      4.150135 = weight(author_txt:liang in 82) [ClassicSimilarity], result of:
        4.150135 = fieldWeight in 82, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.5 = fieldNorm(doc=82)
    
  4. Liang, T.-Y.: The basic entity model : a theoretical model of information processing, decision making and information systems (1996) 4.15
    4.150135 = sum of:
      4.150135 = weight(author_txt:liang in 5476) [ClassicSimilarity], result of:
        4.150135 = fieldWeight in 5476, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.5 = fieldNorm(doc=5476)
    
  5. Liang, T.-Y.: The basic entity model (1997) 4.15
    4.150135 = sum of:
      4.150135 = weight(author_txt:liang in 826) [ClassicSimilarity], result of:
        4.150135 = fieldWeight in 826, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.5 = fieldNorm(doc=826)
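
The author scores above follow the ClassicSimilarity pattern fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The following is a minimal Python sketch that recomputes the first entry from the values printed in its explain tree. The function names are hypothetical and chosen for illustration only. Lucene computes in single precision, so the last digits may differ slightly.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as defined by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain output."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from entry 1 (doc 247): freq=1.0, docFreq=29, maxDocs=44421, fieldNorm=0.625
print(field_weight(1.0, 29, 44421, 0.625))
```

The same idf applies to all five entries, so the difference between the scores 5.19 and 4.15 comes only from fieldNorm (0.625 versus 0.5).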
    

Similar documents (content)

  1. Steinberger, J.; Poesio, M.; Kabadjov, M.A.; Jezek, K.: Two uses of anaphora resolution in summarization (2007) 0.32
    0.31612426 = sum of:
      0.31612426 = product of:
        1.9757767 = sum of:
          0.017929558 = weight(abstract_txt:approach in 1949) [ClassicSimilarity], result of:
            0.017929558 = score(doc=1949,freq=1.0), product of:
              0.061344434 = queryWeight, product of:
                1.0795733 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.015188631 = queryNorm
              0.29227686 = fieldWeight in 1949, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.078125 = fieldNorm(doc=1949)
          0.027608134 = weight(abstract_txt:based in 1949) [ClassicSimilarity], result of:
            0.027608134 = score(doc=1949,freq=1.0), product of:
              0.11101972 = queryWeight, product of:
                2.2963316 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.015188631 = queryNorm
              0.24867775 = fieldWeight in 1949, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.078125 = fieldNorm(doc=1949)
          1.5994849 = weight(title_txt:anaphora in 1949) [ClassicSimilarity], result of:
            1.5994849 = score(doc=1949,freq=1.0), product of:
              0.43041563 = queryWeight, product of:
                2.8596215 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.015188631 = queryNorm
              3.7161405 = fieldWeight in 1949, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.375 = fieldNorm(doc=1949)
          0.33075398 = weight(abstract_txt:resolution in 1949) [ClassicSimilarity], result of:
            0.33075398 = score(doc=1949,freq=3.0), product of:
              0.33992946 = queryWeight, product of:
                3.1124656 = boost
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.015188631 = queryNorm
              0.9730077 = fieldWeight in 1949, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.078125 = fieldNorm(doc=1949)
        0.16 = coord(4/25)
    
  2. Alzahrani, S.; Palade, V.; Salim, N.; Abraham, A.: Using structural information and citation evidence to detect significant plagiarism cases in scientific publications (2012) 0.14
    0.13956039 = sum of:
      0.13956039 = product of:
        0.4361262 = sum of:
          0.05749103 = weight(abstract_txt:weighting in 982) [ClassicSimilarity], result of:
            0.05749103 = score(doc=982,freq=2.0), product of:
              0.106588446 = queryWeight, product of:
                1.006247 = boost
                6.9740796 = idf(docFreq=112, maxDocs=44421)
                0.015188631 = queryNorm
              0.53937393 = fieldWeight in 982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.9740796 = idf(docFreq=112, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
          0.06389931 = weight(abstract_txt:validation in 982) [ClassicSimilarity], result of:
            0.06389931 = score(doc=982,freq=2.0), product of:
              0.11436879 = queryWeight, product of:
                1.0423254 = boost
                7.2241306 = idf(docFreq=87, maxDocs=44421)
                0.015188631 = queryNorm
              0.55871284 = fieldWeight in 982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2241306 = idf(docFreq=87, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
          0.09702689 = weight(abstract_txt:weight in 982) [ClassicSimilarity], result of:
            0.09702689 = score(doc=982,freq=4.0), product of:
              0.11992089 = queryWeight, product of:
                1.0673257 = boost
                7.3974023 = idf(docFreq=73, maxDocs=44421)
                0.015188631 = queryNorm
              0.80909085 = fieldWeight in 982, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.3974023 = idf(docFreq=73, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
          0.012550691 = weight(abstract_txt:approach in 982) [ClassicSimilarity], result of:
            0.012550691 = score(doc=982,freq=1.0), product of:
              0.061344434 = queryWeight, product of:
                1.0795733 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.015188631 = queryNorm
              0.20459381 = fieldWeight in 982, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
          0.0111836195 = weight(abstract_txt:with in 982) [ClassicSimilarity], result of:
            0.0111836195 = score(doc=982,freq=4.0), product of:
              0.040963344 = queryWeight, product of:
                1.0804579 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.015188631 = queryNorm
              0.2730153 = fieldWeight in 982, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
          0.05848121 = weight(abstract_txt:candidates in 982) [ClassicSimilarity], result of:
            0.05848121 = score(doc=982,freq=1.0), product of:
              0.1358306 = queryWeight, product of:
                1.1359216 = boost
                7.872826 = idf(docFreq=45, maxDocs=44421)
                0.015188631 = queryNorm
              0.43054518 = fieldWeight in 982, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.872826 = idf(docFreq=45, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
          0.10202039 = weight(abstract_txt:fold in 982) [ClassicSimilarity], result of:
            0.10202039 = score(doc=982,freq=2.0), product of:
              0.1562313 = queryWeight, product of:
                1.218242 = boost
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.015188631 = queryNorm
              0.65300864 = fieldWeight in 982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
          0.033473086 = weight(abstract_txt:based in 982) [ClassicSimilarity], result of:
            0.033473086 = score(doc=982,freq=3.0), product of:
              0.11101972 = queryWeight, product of:
                2.2963316 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.015188631 = queryNorm
              0.30150574 = fieldWeight in 982, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0546875 = fieldNorm(doc=982)
        0.32 = coord(8/25)
    
  3. Peng, F.; Huang, X.: Machine learning for Asian language text classification (2007) 0.10
    0.09898219 = sum of:
      0.09898219 = product of:
        0.41242582 = sum of:
          0.028687295 = weight(abstract_txt:approach in 1831) [ClassicSimilarity], result of:
            0.028687295 = score(doc=1831,freq=4.0), product of:
              0.061344434 = queryWeight, product of:
                1.0795733 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.015188631 = queryNorm
              0.467643 = fieldWeight in 1831, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.011068913 = weight(abstract_txt:with in 1831) [ClassicSimilarity], result of:
            0.011068913 = score(doc=1831,freq=3.0), product of:
              0.040963344 = queryWeight, product of:
                1.0804579 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.015188631 = queryNorm
              0.27021506 = fieldWeight in 1831, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.05914178 = weight(abstract_txt:yields in 1831) [ClassicSimilarity], result of:
            0.05914178 = score(doc=1831,freq=1.0), product of:
              0.1251954 = queryWeight, product of:
                1.0905453 = boost
                7.558333 = idf(docFreq=62, maxDocs=44421)
                0.015188631 = queryNorm
              0.4723958 = fieldWeight in 1831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.558333 = idf(docFreq=62, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.13691399 = weight(abstract_txt:chinese in 1831) [ClassicSimilarity], result of:
            0.13691399 = score(doc=1831,freq=4.0), product of:
              0.17389242 = queryWeight, product of:
                1.8176274 = boost
                6.2987905 = idf(docFreq=221, maxDocs=44421)
                0.015188631 = queryNorm
              0.7873488 = fieldWeight in 1831, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2987905 = idf(docFreq=221, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.038254954 = weight(abstract_txt:based in 1831) [ClassicSimilarity], result of:
            0.038254954 = score(doc=1831,freq=3.0), product of:
              0.11101972 = queryWeight, product of:
                2.2963316 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.015188631 = queryNorm
              0.344578 = fieldWeight in 1831, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.13835889 = weight(abstract_txt:entropy in 1831) [ClassicSimilarity], result of:
            0.13835889 = score(doc=1831,freq=1.0), product of:
              0.27797568 = queryWeight, product of:
                2.2980947 = boost
                7.963798 = idf(docFreq=41, maxDocs=44421)
                0.015188631 = queryNorm
              0.49773738 = fieldWeight in 1831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.963798 = idf(docFreq=41, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
        0.24 = coord(6/25)
    
  4. Brychcín, T.; Konopík, M.: HPS: High precision stemmer (2015) 0.10
    0.09537988 = sum of:
      0.09537988 = product of:
        0.39741617 = sum of:
          0.024843926 = weight(abstract_txt:approach in 3686) [ClassicSimilarity], result of:
            0.024843926 = score(doc=3686,freq=3.0), product of:
              0.061344434 = queryWeight, product of:
                1.0795733 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.015188631 = queryNorm
              0.4049907 = fieldWeight in 3686, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.0625 = fieldNorm(doc=3686)
          0.009037728 = weight(abstract_txt:with in 3686) [ClassicSimilarity], result of:
            0.009037728 = score(doc=3686,freq=2.0), product of:
              0.040963344 = queryWeight, product of:
                1.0804579 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.015188631 = queryNorm
              0.22062966 = fieldWeight in 3686, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.0625 = fieldNorm(doc=3686)
          0.0794754 = weight(abstract_txt:rule in 3686) [ClassicSimilarity], result of:
            0.0794754 = score(doc=3686,freq=1.0), product of:
              0.19208373 = queryWeight, product of:
                1.9103364 = boost
                6.6200633 = idf(docFreq=160, maxDocs=44421)
                0.015188631 = queryNorm
              0.41375396 = fieldWeight in 3686, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6200633 = idf(docFreq=160, maxDocs=44421)
                0.0625 = fieldNorm(doc=3686)
          0.031235037 = weight(abstract_txt:based in 3686) [ClassicSimilarity], result of:
            0.031235037 = score(doc=3686,freq=2.0), product of:
              0.11101972 = queryWeight, product of:
                2.2963316 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.015188631 = queryNorm
              0.28134674 = fieldWeight in 3686, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0625 = fieldNorm(doc=3686)
          0.13835889 = weight(abstract_txt:entropy in 3686) [ClassicSimilarity], result of:
            0.13835889 = score(doc=3686,freq=1.0), product of:
              0.27797568 = queryWeight, product of:
                2.2980947 = boost
                7.963798 = idf(docFreq=41, maxDocs=44421)
                0.015188631 = queryNorm
              0.49773738 = fieldWeight in 3686, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.963798 = idf(docFreq=41, maxDocs=44421)
                0.0625 = fieldNorm(doc=3686)
          0.11446517 = weight(abstract_txt:lexical in 3686) [ClassicSimilarity], result of:
            0.11446517 = score(doc=3686,freq=1.0), product of:
              0.28042373 = queryWeight, product of:
                2.8269463 = boost
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.015188631 = queryNorm
              0.40818647 = fieldWeight in 3686, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.0625 = fieldNorm(doc=3686)
        0.24 = coord(6/25)
    
  5. Drexel, G.: Knowledge engineering for intelligent information retrieval (2001) 0.09
    0.08982298 = sum of:
      0.08982298 = product of:
        0.37426242 = sum of:
          0.059448384 = weight(abstract_txt:instances in 43) [ClassicSimilarity], result of:
            0.059448384 = score(doc=43,freq=1.0), product of:
              0.10826268 = queryWeight, product of:
                1.014119 = boost
                7.028639 = idf(docFreq=106, maxDocs=44421)
                0.015188631 = queryNorm
              0.54911244 = fieldWeight in 43, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.028639 = idf(docFreq=106, maxDocs=44421)
                0.078125 = fieldNorm(doc=43)
          0.025356224 = weight(abstract_txt:approach in 43) [ClassicSimilarity], result of:
            0.025356224 = score(doc=43,freq=2.0), product of:
              0.061344434 = queryWeight, product of:
                1.0795733 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.015188631 = queryNorm
              0.41334188 = fieldWeight in 43, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.078125 = fieldNorm(doc=43)
          0.007988299 = weight(abstract_txt:with in 43) [ClassicSimilarity], result of:
            0.007988299 = score(doc=43,freq=1.0), product of:
              0.040963344 = queryWeight, product of:
                1.0804579 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.015188631 = queryNorm
              0.19501092 = fieldWeight in 43, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.078125 = fieldNorm(doc=43)
          0.09934425 = weight(abstract_txt:rule in 43) [ClassicSimilarity], result of:
            0.09934425 = score(doc=43,freq=1.0), product of:
              0.19208373 = queryWeight, product of:
                1.9103364 = boost
                6.6200633 = idf(docFreq=160, maxDocs=44421)
                0.015188631 = queryNorm
              0.5171924 = fieldWeight in 43, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6200633 = idf(docFreq=160, maxDocs=44421)
                0.078125 = fieldNorm(doc=43)
          0.0390438 = weight(abstract_txt:based in 43) [ClassicSimilarity], result of:
            0.0390438 = score(doc=43,freq=2.0), product of:
              0.11101972 = queryWeight, product of:
                2.2963316 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.015188631 = queryNorm
              0.35168344 = fieldWeight in 43, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.078125 = fieldNorm(doc=43)
          0.14308147 = weight(abstract_txt:lexical in 43) [ClassicSimilarity], result of:
            0.14308147 = score(doc=43,freq=1.0), product of:
              0.28042373 = queryWeight, product of:
                2.8269463 = boost
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.015188631 = queryNorm
              0.5102331 = fieldWeight in 43, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.078125 = fieldNorm(doc=43)
        0.24 = coord(6/25)
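
The content scores above combine a per-term score (queryWeight * fieldWeight) with a coordination factor, coord(matched/total), that scales the sum by the share of query terms found in the document. The following is a minimal Python sketch that recomputes entry 1 (doc 1949) from the numbers printed in its explain tree. The function names are hypothetical, and boost and queryNorm are copied from the output rather than derived.

```python
import math


def term_score(boost: float, idf: float, query_norm: float,
               freq: float, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    query_weight = boost * idf * query_norm          # query-side weight
    field_weight = math.sqrt(freq) * idf * field_norm  # document-side weight
    return query_weight * field_weight


def document_score(term_scores: list[float], matched: int, total: int) -> float:
    """Sum of the term scores scaled by coord(matched/total)."""
    return sum(term_scores) * (matched / total)


QUERY_NORM = 0.015188631  # copied from the explain output

# (boost, idf, freq, fieldNorm) for the four terms matched in doc 1949
terms = [
    (1.0795733, 3.741144, 1.0, 0.078125),   # abstract_txt:approach
    (2.2963316, 3.1830752, 1.0, 0.078125),  # abstract_txt:based
    (2.8596215, 9.909708, 1.0, 0.375),      # title_txt:anaphora
    (3.1124656, 7.190608, 3.0, 0.078125),   # abstract_txt:resolution
]
scores = [term_score(b, idf, QUERY_NORM, f, n) for b, idf, f, n in terms]
print(document_score(scores, matched=4, total=25))
```

In this entry the title match on "anaphora" contributes most of the total, because the term is rare (docFreq=5) and the title field has a much larger fieldNorm than the abstract.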