Document (#32950)

Author
Steinberger, J.
Poesio, M.
Kabadjov, M.A.
Jezek, K.
Title
Two uses of anaphora resolution in summarization
Source
Information processing and management. 43(2007) no.6, pp.1663-1680
Year
2007
Abstract
We propose a new method for using anaphoric information in Latent Semantic Analysis (LSA), and discuss its application to develop an LSA-based summarizer which achieves a significantly better performance than a system not using anaphoric information, and a better performance by the ROUGE measure than all but one of the single-document summarizers participating in DUC-2002. Anaphoric information is automatically extracted using a new release of our own anaphora resolution system, GUITAR, which incorporates proper noun resolution. Our summarizer also includes a new approach for automatically identifying the dimensionality reduction of a document on the basis of the desired summarization percentage. Anaphoric information is also used to check the coherence of the summary produced by our summarizer, by a reference checker module which identifies anaphoric resolution errors caused by sentence extraction.
Theme
Automatic abstracting

Similar documents (content)

  1. Wu, D.-S.; Liang, T.: Chinese pronominal anaphora resolution using lexical knowledge and entropy-based weight (2008) 0.25
    0.24681623 = sum of:
      0.24681623 = product of:
        1.5426015 = sum of:
          0.039948784 = weight(abstract_txt:performance in 3367) [ClassicSimilarity], result of:
            0.039948784 = score(doc=3367,freq=1.0), product of:
              0.09223865 = queryWeight, product of:
                1.3001782 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.015356447 = queryNorm
              0.43310243 = fieldWeight in 3367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.09375 = fieldNorm(doc=3367)
          0.0251065 = weight(abstract_txt:using in 3367) [ClassicSimilarity], result of:
            0.0251065 = score(doc=3367,freq=1.0), product of:
              0.077469684 = queryWeight, product of:
                1.4593449 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.015356447 = queryNorm
              0.32408163 = fieldWeight in 3367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.09375 = fieldNorm(doc=3367)
          1.0514699 = weight(title_txt:anaphora in 3367) [ClassicSimilarity], result of:
            1.0514699 = score(doc=3367,freq=1.0), product of:
              0.42442015 = queryWeight, product of:
                2.7889736 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.015356447 = queryNorm
              2.477427 = fieldWeight in 3367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.25 = fieldNorm(doc=3367)
          0.4260763 = weight(abstract_txt:resolution in 3367) [ClassicSimilarity], result of:
            0.4260763 = score(doc=3367,freq=2.0), product of:
              0.44692588 = queryWeight, product of:
                4.047428 = boost
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.015356447 = queryNorm
              0.9533489 = fieldWeight in 3367, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.09375 = fieldNorm(doc=3367)
        0.16 = coord(4/25)
    
  2. Sankarasubramaniam, Y.; Ramanathan, K.; Ghosh, S.: Text summarization using Wikipedia (2014) 0.25
    0.246767 = sum of:
      0.246767 = product of:
        0.7711469 = sum of:
          0.021388775 = weight(abstract_txt:document in 3693) [ClassicSimilarity], result of:
            0.021388775 = score(doc=3693,freq=1.0), product of:
              0.07969456 = queryWeight, product of:
                1.2085392 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.015356447 = queryNorm
              0.26838437 = fieldWeight in 3693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=3693)
          0.010023513 = weight(abstract_txt:which in 3693) [ClassicSimilarity], result of:
            0.010023513 = score(doc=3693,freq=1.0), product of:
              0.055040427 = queryWeight, product of:
                1.2300788 = boost
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.015356447 = queryNorm
              0.18211183 = fieldWeight in 3693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.0625 = fieldNorm(doc=3693)
          0.09824232 = weight(abstract_txt:rouge in 3693) [ClassicSimilarity], result of:
            0.09824232 = score(doc=3693,freq=1.0), product of:
              0.17478085 = queryWeight, product of:
                1.2655472 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.015356447 = queryNorm
              0.5620886 = fieldWeight in 3693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.0625 = fieldNorm(doc=3693)
          0.026632521 = weight(abstract_txt:performance in 3693) [ClassicSimilarity], result of:
            0.026632521 = score(doc=3693,freq=1.0), product of:
              0.09223865 = queryWeight, product of:
                1.3001782 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.015356447 = queryNorm
              0.28873494 = fieldWeight in 3693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.0625 = fieldNorm(doc=3693)
          0.029155197 = weight(abstract_txt:better in 3693) [ClassicSimilarity], result of:
            0.029155197 = score(doc=3693,freq=1.0), product of:
              0.09797503 = queryWeight, product of:
                1.3399979 = boost
                4.7612453 = idf(docFreq=1032, maxDocs=44421)
                0.015356447 = queryNorm
              0.29757783 = fieldWeight in 3693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7612453 = idf(docFreq=1032, maxDocs=44421)
                0.0625 = fieldNorm(doc=3693)
          0.028990492 = weight(abstract_txt:using in 3693) [ClassicSimilarity], result of:
            0.028990492 = score(doc=3693,freq=3.0), product of:
              0.077469684 = queryWeight, product of:
                1.4593449 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.015356447 = queryNorm
              0.37421724 = fieldWeight in 3693, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.0625 = fieldNorm(doc=3693)
          0.23950014 = weight(abstract_txt:summarization in 3693) [ClassicSimilarity], result of:
            0.23950014 = score(doc=3693,freq=6.0), product of:
              0.21951194 = queryWeight, product of:
                2.005744 = boost
                7.1267567 = idf(docFreq=96, maxDocs=44421)
                0.015356447 = queryNorm
              1.0910574 = fieldWeight in 3693, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                7.1267567 = idf(docFreq=96, maxDocs=44421)
                0.0625 = fieldNorm(doc=3693)
          0.31721398 = weight(abstract_txt:summarizer in 3693) [ClassicSimilarity], result of:
            0.31721398 = score(doc=3693,freq=1.0), product of:
              0.55068517 = queryWeight, product of:
                3.8908434 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.015356447 = queryNorm
              0.5760351 = fieldWeight in 3693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.0625 = fieldNorm(doc=3693)
        0.32 = coord(8/25)
    
  3. Haag, M.: Automatic text summarization : Evaluation des Copernic Summarizer und mögliche Einsatzfelder in der Fachinformation der DaimlerChrysler AG (2002) 0.21
    0.21035977 = sum of:
      0.21035977 = product of:
        1.0517988 = sum of:
          0.012529392 = weight(abstract_txt:which in 1649) [ClassicSimilarity], result of:
            0.012529392 = score(doc=1649,freq=1.0), product of:
              0.055040427 = queryWeight, product of:
                1.2300788 = boost
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.015356447 = queryNorm
              0.2276398 = fieldWeight in 1649, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.078125 = fieldNorm(doc=1649)
          0.016554208 = weight(abstract_txt:information in 1649) [ClassicSimilarity], result of:
            0.016554208 = score(doc=1649,freq=3.0), product of:
              0.050575472 = queryWeight, product of:
                1.361543 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.015356447 = queryNorm
              0.32731694 = fieldWeight in 1649, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.078125 = fieldNorm(doc=1649)
          0.05683606 = weight(abstract_txt:automatically in 1649) [ClassicSimilarity], result of:
            0.05683606 = score(doc=1649,freq=1.0), product of:
              0.13175914 = queryWeight, product of:
                1.553949 = boost
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.015356447 = queryNorm
              0.43136334 = fieldWeight in 1649, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.078125 = fieldNorm(doc=1649)
          0.1728443 = weight(abstract_txt:summarization in 1649) [ClassicSimilarity], result of:
            0.1728443 = score(doc=1649,freq=2.0), product of:
              0.21951194 = queryWeight, product of:
                2.005744 = boost
                7.1267567 = idf(docFreq=96, maxDocs=44421)
                0.015356447 = queryNorm
              0.78740275 = fieldWeight in 1649, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.1267567 = idf(docFreq=96, maxDocs=44421)
                0.078125 = fieldNorm(doc=1649)
          0.7930349 = weight(abstract_txt:summarizer in 1649) [ClassicSimilarity], result of:
            0.7930349 = score(doc=1649,freq=4.0), product of:
              0.55068517 = queryWeight, product of:
                3.8908434 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.015356447 = queryNorm
              1.4400877 = fieldWeight in 1649, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.078125 = fieldNorm(doc=1649)
        0.2 = coord(5/25)
    
  4. Dunlavy, D.M.; O'Leary, D.P.; Conroy, J.M.; Schlesinger, J.D.: QCS: A system for querying, clustering and summarizing documents (2007) 0.18
    0.17677994 = sum of:
      0.17677994 = product of:
        0.44194984 = sum of:
          0.04920778 = weight(abstract_txt:achieves in 1947) [ClassicSimilarity], result of:
            0.04920778 = score(doc=1947,freq=1.0), product of:
              0.120497644 = queryWeight, product of:
                1.0508015 = boost
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.015356447 = queryNorm
              0.4083713 = fieldWeight in 1947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
          0.013942094 = weight(abstract_txt:than in 1947) [ClassicSimilarity], result of:
            0.013942094 = score(doc=1947,freq=1.0), product of:
              0.06549147 = queryWeight, product of:
                1.0955665 = boost
                3.8927383 = idf(docFreq=2461, maxDocs=44421)
                0.015356447 = queryNorm
              0.21288413 = fieldWeight in 1947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8927383 = idf(docFreq=2461, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
          0.026467258 = weight(abstract_txt:document in 1947) [ClassicSimilarity], result of:
            0.026467258 = score(doc=1947,freq=2.0), product of:
              0.07969456 = queryWeight, product of:
                1.2085392 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.015356447 = queryNorm
              0.3321087 = fieldWeight in 1947, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
          0.008770574 = weight(abstract_txt:which in 1947) [ClassicSimilarity], result of:
            0.008770574 = score(doc=1947,freq=1.0), product of:
              0.055040427 = queryWeight, product of:
                1.2300788 = boost
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.015356447 = queryNorm
              0.15934785 = fieldWeight in 1947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
          0.085962035 = weight(abstract_txt:rouge in 1947) [ClassicSimilarity], result of:
            0.085962035 = score(doc=1947,freq=1.0), product of:
              0.17478085 = queryWeight, product of:
                1.2655472 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.015356447 = queryNorm
              0.49182755 = fieldWeight in 1947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
          0.032956064 = weight(abstract_txt:performance in 1947) [ClassicSimilarity], result of:
            0.032956064 = score(doc=1947,freq=2.0), product of:
              0.09223865 = queryWeight, product of:
                1.3001782 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.015356447 = queryNorm
              0.35729125 = fieldWeight in 1947, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
          0.025510797 = weight(abstract_txt:better in 1947) [ClassicSimilarity], result of:
            0.025510797 = score(doc=1947,freq=1.0), product of:
              0.09797503 = queryWeight, product of:
                1.3399979 = boost
                4.7612453 = idf(docFreq=1032, maxDocs=44421)
                0.015356447 = queryNorm
              0.2603806 = fieldWeight in 1947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7612453 = idf(docFreq=1032, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
          0.013380608 = weight(abstract_txt:information in 1947) [ClassicSimilarity], result of:
            0.013380608 = score(doc=1947,freq=4.0), product of:
              0.050575472 = queryWeight, product of:
                1.361543 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.015356447 = queryNorm
              0.26456714 = fieldWeight in 1947, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
          0.014645459 = weight(abstract_txt:using in 1947) [ClassicSimilarity], result of:
            0.014645459 = score(doc=1947,freq=1.0), product of:
              0.077469684 = queryWeight, product of:
                1.4593449 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.015356447 = queryNorm
              0.18904762 = fieldWeight in 1947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
          0.17110716 = weight(abstract_txt:summarization in 1947) [ClassicSimilarity], result of:
            0.17110716 = score(doc=1947,freq=4.0), product of:
              0.21951194 = queryWeight, product of:
                2.005744 = boost
                7.1267567 = idf(docFreq=96, maxDocs=44421)
                0.015356447 = queryNorm
              0.77948904 = fieldWeight in 1947, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.1267567 = idf(docFreq=96, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1947)
        0.4 = coord(10/25)
    
  5. Aker, A.; Gaizauskas, R.: Generating descriptive multi-document summaries of geo-located entities using entity type models (2015) 0.17
    0.17298472 = sum of:
      0.17298472 = product of:
        0.8649236 = sum of:
          0.037046444 = weight(abstract_txt:document in 2726) [ClassicSimilarity], result of:
            0.037046444 = score(doc=2726,freq=3.0), product of:
              0.07969456 = queryWeight, product of:
                1.2085392 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.015356447 = queryNorm
              0.46485534 = fieldWeight in 2726, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=2726)
          0.037426565 = weight(abstract_txt:using in 2726) [ClassicSimilarity], result of:
            0.037426565 = score(doc=2726,freq=5.0), product of:
              0.077469684 = queryWeight, product of:
                1.4593449 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.015356447 = queryNorm
              0.4831124 = fieldWeight in 2726, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.0625 = fieldNorm(doc=2726)
          0.045468852 = weight(abstract_txt:automatically in 2726) [ClassicSimilarity], result of:
            0.045468852 = score(doc=2726,freq=1.0), product of:
              0.13175914 = queryWeight, product of:
                1.553949 = boost
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.015356447 = queryNorm
              0.3450907 = fieldWeight in 2726, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.0625 = fieldNorm(doc=2726)
          0.19555102 = weight(abstract_txt:summarization in 2726) [ClassicSimilarity], result of:
            0.19555102 = score(doc=2726,freq=4.0), product of:
              0.21951194 = queryWeight, product of:
                2.005744 = boost
                7.1267567 = idf(docFreq=96, maxDocs=44421)
                0.015356447 = queryNorm
              0.8908446 = fieldWeight in 2726, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.1267567 = idf(docFreq=96, maxDocs=44421)
                0.0625 = fieldNorm(doc=2726)
          0.5494307 = weight(abstract_txt:summarizer in 2726) [ClassicSimilarity], result of:
            0.5494307 = score(doc=2726,freq=3.0), product of:
              0.55068517 = queryWeight, product of:
                3.8908434 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.015356447 = queryNorm
              0.997722 = fieldWeight in 2726, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.0625 = fieldNorm(doc=2726)
        0.2 = coord(5/25)
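
How the scores are assembled

The explanation trees above all follow the same arithmetic: each matching term contributes queryWeight times fieldWeight, the contributions are summed, and the sum is scaled by the coord factor (matched terms over the 25 query terms). The following is a minimal Python sketch of that arithmetic, using the figures printed for the first similar document (Wu and Liang, doc 3367). The names `Term`, `term_score` and `document_score` are hypothetical helpers introduced for illustration only; they are not part of any search library, and the sketch only mirrors the numbers shown in the tree rather than the engine's internal implementation.

```python
import math
from typing import NamedTuple


class Term(NamedTuple):
    """One matching query term, with values copied from the explanation tree."""
    boost: float       # per-term query boost
    idf: float         # inverse document frequency of the term
    freq: float        # term frequency in the matched field
    field_norm: float  # length normalization of the matched field


# queryNorm is printed identically for every term in the trees above.
QUERY_NORM = 0.015356447


def term_score(term: Term, query_norm: float = QUERY_NORM) -> float:
    """Score of one term: queryWeight * fieldWeight."""
    query_weight = term.boost * term.idf * query_norm
    # The trees show tf as the square root of the raw frequency.
    field_weight = math.sqrt(term.freq) * term.idf * term.field_norm
    return query_weight * field_weight


def document_score(terms: list[Term], query_term_count: int) -> float:
    """Sum of term scores, scaled by coord = matched terms / query terms."""
    coord = len(terms) / query_term_count
    return coord * sum(term_score(t) for t in terms)


# Matching terms for doc 3367 (similar document 1), in the order printed above.
wu_liang = [
    Term(boost=1.3001782, idf=4.619759, freq=1.0, field_norm=0.09375),   # abstract_txt:performance
    Term(boost=1.4593449, idf=3.4568708, freq=1.0, field_norm=0.09375),  # abstract_txt:using
    Term(boost=2.7889736, idf=9.909708, freq=1.0, field_norm=0.25),      # title_txt:anaphora
    Term(boost=4.047428, idf=7.190608, freq=2.0, field_norm=0.09375),    # abstract_txt:resolution
]

score = document_score(wu_liang, query_term_count=25)
print(f"{score:.6f}")
```

The result agrees with the 0.24681623 printed for that entry up to floating-point rounding, and rounds to the 0.25 shown next to the title. The title match on "anaphora" dominates because its idf is high (the term occurs in only 5 of 44421 documents) and the title field has a larger fieldNorm than the abstract.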