Document (#34451)

Author
Otterbacher, J.
Erkan, G.
Radev, D.R.
Title
Biased LexRank : passage retrieval using random walks with question-based priors
Source
Information processing and management. 45(2009) no.1, p.42-54
Year
2009
Abstract
We present Biased LexRank, a method for semi-supervised passage retrieval in the context of question answering. We represent a text as a graph of passages linked based on their pairwise lexical similarity. We use traditional passage retrieval techniques to identify passages that are likely to be relevant to a user's natural language question. We then perform a random walk on the lexical similarity graph in order to recursively retrieve additional passages that are similar to other relevant passages. We present results on several benchmarks that show the applicability of our work to question answering and topic-focused text summarization.
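The abstract's pipeline (build a similarity graph over passages, seed it with question relevance, then run a random walk) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a personalized-PageRank-style update, plain bag-of-words cosine similarity for both the graph edges and the question prior, and a hypothetical `bias` parameter weighting the prior. The paper's exact similarity measure and parameter settings may differ.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def biased_lexrank(passages, question, bias=0.5, iters=100):
    """Score passages by a random walk biased toward question-relevant ones.

    `bias` (an assumed name and default) is the weight given to the
    question-based prior; 1 - bias goes to the similarity-graph walk.
    """
    bags = [Counter(p.lower().split()) for p in passages]
    q = Counter(question.lower().split())
    n = len(bags)

    # Question-based prior: relevance of each passage, normalized to sum to 1.
    rel = [cosine(b, q) for b in bags]
    total = sum(rel)
    prior = [r / total for r in rel] if total else [1.0 / n] * n

    # Lexical similarity graph (no self-loops); rows are normalized on use.
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    row_sums = [sum(row) for row in sim]

    scores = [1.0 / n] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            # Mass flowing into passage i from passages similar to it.
            walk = sum(scores[j] * sim[j][i] / row_sums[j]
                       for j in range(n) if row_sums[j] > 0)
            new.append(bias * prior[i] + (1.0 - bias) * walk)
        # Renormalize in case isolated passages leaked probability mass.
        s = sum(new)
        scores = [x / s for x in new]
    return scores


passages = [
    "the storm hit the coast on monday",
    "the storm caused flooding along the coast",
    "stock markets rose on monday",
]
scores = biased_lexrank(passages, "where did the storm hit")
for p, s in sorted(zip(passages, scores), key=lambda x: -x[1]):
    print(f"{s:.3f}  {p}")
```

In this toy input the third passage shares no words with the question, so its prior is zero; it still receives a small score through its similarity to the first passage, which is the "recursive retrieval" effect the abstract describes.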
Theme
Retrieval algorithms

Similar documents (author)

  1. Otterbacher, J.; Radev, D.: Exploring fact-focused relevance and novelty detection (2008) 4.57
    4.5682592 = sum of:
      4.5682592 = weight(author_txt:radev in 3210) [ClassicSimilarity], result of:
        4.5682592 = fieldWeight in 3210, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.1365185 = idf(docFreq=12, maxDocs=44421)
          0.5 = fieldNorm(doc=3210)
    
  2. Finegan-Dollak, C.; Radev, D.R.: Sentence simplification, compression, and disaggregation for summarization of sophisticated documents (2016) 4.00
    3.9972267 = sum of:
      3.9972267 = weight(author_txt:radev in 4122) [ClassicSimilarity], result of:
        3.9972267 = fieldWeight in 4122, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.1365185 = idf(docFreq=12, maxDocs=44421)
          0.4375 = fieldNorm(doc=4122)
    
  3. Radev, D.R.; Libner, K.; Fan, W.: Getting answers to natural language questions on the Web (2002) 3.43
    3.4261944 = sum of:
      3.4261944 = weight(author_txt:radev in 204) [ClassicSimilarity], result of:
        3.4261944 = fieldWeight in 204, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.1365185 = idf(docFreq=12, maxDocs=44421)
          0.375 = fieldNorm(doc=204)
    
  4. Otterbacher, J.; Radev, D.; Kareem, O.: Hierarchical summarization for delivering information to mobile devices (2008) 3.43
    3.4261944 = sum of:
      3.4261944 = weight(author_txt:radev in 3071) [ClassicSimilarity], result of:
        3.4261944 = fieldWeight in 3071, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.1365185 = idf(docFreq=12, maxDocs=44421)
          0.375 = fieldNorm(doc=3071)
    
  5. Lam, W.; Chan, K.; Radev, D.; Saggion, H.; Teufel, S.: Context-based generic cross-lingual retrieval of documents and automated summaries (2005) 2.86
    2.8551621 = sum of:
      2.8551621 = weight(author_txt:radev in 2965) [ClassicSimilarity], result of:
        2.8551621 = fieldWeight in 2965, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.1365185 = idf(docFreq=12, maxDocs=44421)
          0.3125 = fieldNorm(doc=2965)
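
The author-match scores above can be reproduced from the three factors each tree lists. The sketch below assumes the classic TF-IDF definitions that the printed numbers are consistent with: tf is the square root of the term frequency and idf is 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative, and fieldNorm is taken directly from the output rather than recomputed.

```python
import math


def classic_tf(freq):
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq, doc_freq, max_docs, field_norm):
    """fieldWeight = tf * idf * fieldNorm, as in the explain trees."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Entry 1 (doc 3210): termFreq=1.0, docFreq=12, maxDocs=44421, fieldNorm=0.5
print(field_weight(1.0, 12, 44421, 0.5))     # compare with 4.5682592 above
# Entry 2 (doc 4122): same term statistics, fieldNorm=0.4375
print(field_weight(1.0, 12, 44421, 0.4375))  # compare with 3.9972267 above
```

Since tf and idf are identical for all five entries, the ranking in this list is determined entirely by fieldNorm.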
    

Similar documents (content)

  1. Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.67
    0.6702599 = sum of:
      0.6702599 = product of:
        1.861833 = sum of:
          0.007424922 = weight(abstract_txt:based in 3765) [ClassicSimilarity], result of:
            0.007424922 = score(doc=3765,freq=1.0), product of:
              0.04265372 = queryWeight, product of:
                1.0392835 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.012893653 = queryNorm
              0.17407443 = fieldWeight in 3765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
          0.009135454 = weight(abstract_txt:that in 3765) [ClassicSimilarity], result of:
            0.009135454 = score(doc=3765,freq=4.0), product of:
              0.035317734 = queryWeight, product of:
                1.1582376 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.012893653 = queryNorm
              0.2586648 = fieldWeight in 3765, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
          0.030381536 = weight(abstract_txt:text in 3765) [ClassicSimilarity], result of:
            0.030381536 = score(doc=3765,freq=4.0), product of:
              0.06874094 = queryWeight, product of:
                1.31936 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.012893653 = queryNorm
              0.44197148 = fieldWeight in 3765, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
          0.018895935 = weight(abstract_txt:present in 3765) [ClassicSimilarity], result of:
            0.018895935 = score(doc=3765,freq=1.0), product of:
              0.07950747 = queryWeight, product of:
                1.4189254 = boost
                4.3458266 = idf(docFreq=1564, maxDocs=44421)
                0.012893653 = queryNorm
              0.23766239 = fieldWeight in 3765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3458266 = idf(docFreq=1564, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
          0.022848854 = weight(abstract_txt:relevant in 3765) [ClassicSimilarity], result of:
            0.022848854 = score(doc=3765,freq=1.0), product of:
              0.09024128 = queryWeight, product of:
                1.5116743 = boost
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.012893653 = queryNorm
              0.25319734 = fieldWeight in 3765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
          0.03244548 = weight(abstract_txt:retrieval in 3765) [ClassicSimilarity], result of:
            0.03244548 = score(doc=3765,freq=5.0), product of:
              0.07632009 = queryWeight, product of:
                1.7026315 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.012893653 = queryNorm
              0.42512372 = fieldWeight in 3765, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
          0.045342438 = weight(abstract_txt:similarity in 3765) [ClassicSimilarity], result of:
            0.045342438 = score(doc=3765,freq=1.0), product of:
              0.1425057 = queryWeight, product of:
                1.8996418 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.012893653 = queryNorm
              0.31817982 = fieldWeight in 3765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
          0.72179365 = weight(abstract_txt:passage in 3765) [ClassicSimilarity], result of:
            0.72179365 = score(doc=3765,freq=14.0), product of:
              0.42831054 = queryWeight, product of:
                4.0334864 = boost
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.012893653 = queryNorm
              1.6852111 = fieldWeight in 3765, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
          0.97356474 = weight(abstract_txt:passages in 3765) [ClassicSimilarity], result of:
            0.97356474 = score(doc=3765,freq=14.0), product of:
              0.5754923 = queryWeight, product of:
                5.3987145 = boost
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.012893653 = queryNorm
              1.6917076 = fieldWeight in 3765, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
        0.36 = coord(9/25)
    
  2. Kaszkiel, M.; Zobel, J.: Effective ranking with arbitrary passages (2001) 0.25
    0.2503208 = sum of:
      0.2503208 = product of:
        1.0430033 = sum of:
          0.007382562 = weight(abstract_txt:that in 6764) [ClassicSimilarity], result of:
            0.007382562 = score(doc=6764,freq=2.0), product of:
              0.035317734 = queryWeight, product of:
                1.1582376 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.012893653 = queryNorm
              0.20903271 = fieldWeight in 6764, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=6764)
          0.030069921 = weight(abstract_txt:text in 6764) [ClassicSimilarity], result of:
            0.030069921 = score(doc=6764,freq=3.0), product of:
              0.06874094 = queryWeight, product of:
                1.31936 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.012893653 = queryNorm
              0.4374383 = fieldWeight in 6764, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=6764)
          0.026112976 = weight(abstract_txt:relevant in 6764) [ClassicSimilarity], result of:
            0.026112976 = score(doc=6764,freq=1.0), product of:
              0.09024128 = queryWeight, product of:
                1.5116743 = boost
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.012893653 = queryNorm
              0.2893684 = fieldWeight in 6764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.0625 = fieldNorm(doc=6764)
          0.023451796 = weight(abstract_txt:retrieval in 6764) [ClassicSimilarity], result of:
            0.023451796 = score(doc=6764,freq=2.0), product of:
              0.07632009 = queryWeight, product of:
                1.7026315 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.012893653 = queryNorm
              0.3072821 = fieldWeight in 6764, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=6764)
          0.44093135 = weight(abstract_txt:passage in 6764) [ClassicSimilarity], result of:
            0.44093135 = score(doc=6764,freq=4.0), product of:
              0.42831054 = queryWeight, product of:
                4.0334864 = boost
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.012893653 = queryNorm
              1.0294665 = fieldWeight in 6764, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.0625 = fieldNorm(doc=6764)
          0.5150547 = weight(abstract_txt:passages in 6764) [ClassicSimilarity], result of:
            0.5150547 = score(doc=6764,freq=3.0), product of:
              0.5754923 = queryWeight, product of:
                5.3987145 = boost
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.012893653 = queryNorm
              0.894981 = fieldWeight in 6764, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.0625 = fieldNorm(doc=6764)
        0.24 = coord(6/25)
    
  3. Melucci, M.: Passage retrieval : a probabilistic technique (1998) 0.24
    0.2421336 = sum of:
      0.2421336 = product of:
        1.00889 = sum of:
          0.015000607 = weight(abstract_txt:based in 2150) [ClassicSimilarity], result of:
            0.015000607 = score(doc=2150,freq=2.0), product of:
              0.04265372 = queryWeight, product of:
                1.0392835 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.012893653 = queryNorm
              0.35168344 = fieldWeight in 2150, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.078125 = fieldNorm(doc=2150)
          0.0092282025 = weight(abstract_txt:that in 2150) [ClassicSimilarity], result of:
            0.0092282025 = score(doc=2150,freq=2.0), product of:
              0.035317734 = queryWeight, product of:
                1.1582376 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.012893653 = queryNorm
              0.2612909 = fieldWeight in 2150, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.078125 = fieldNorm(doc=2150)
          0.04852513 = weight(abstract_txt:text in 2150) [ClassicSimilarity], result of:
            0.04852513 = score(doc=2150,freq=5.0), product of:
              0.06874094 = queryWeight, product of:
                1.31936 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.012893653 = queryNorm
              0.70591307 = fieldWeight in 2150, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=2150)
          0.020728655 = weight(abstract_txt:retrieval in 2150) [ClassicSimilarity], result of:
            0.020728655 = score(doc=2150,freq=1.0), product of:
              0.07632009 = queryWeight, product of:
                1.7026315 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.012893653 = queryNorm
              0.27160156 = fieldWeight in 2150, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.078125 = fieldNorm(doc=2150)
          0.38973194 = weight(abstract_txt:passage in 2150) [ClassicSimilarity], result of:
            0.38973194 = score(doc=2150,freq=2.0), product of:
              0.42831054 = queryWeight, product of:
                4.0334864 = boost
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.012893653 = queryNorm
              0.90992844 = fieldWeight in 2150, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.078125 = fieldNorm(doc=2150)
          0.52567554 = weight(abstract_txt:passages in 2150) [ClassicSimilarity], result of:
            0.52567554 = score(doc=2150,freq=2.0), product of:
              0.5754923 = queryWeight, product of:
                5.3987145 = boost
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.012893653 = queryNorm
              0.9134362 = fieldWeight in 2150, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.078125 = fieldNorm(doc=2150)
        0.24 = coord(6/25)
    
  4. Landauer, T.K.; Foltz, P.W.; Laham, D.: An introduction to Latent Semantic Analysis (1998) 0.23
    0.2277035 = sum of:
      0.2277035 = product of:
        0.94876456 = sum of:
          0.0092282025 = weight(abstract_txt:that in 2162) [ClassicSimilarity], result of:
            0.0092282025 = score(doc=2162,freq=2.0), product of:
              0.035317734 = queryWeight, product of:
                1.1582376 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.012893653 = queryNorm
              0.2612909 = fieldWeight in 2162, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.078125 = fieldNorm(doc=2162)
          0.021701097 = weight(abstract_txt:text in 2162) [ClassicSimilarity], result of:
            0.021701097 = score(doc=2162,freq=1.0), product of:
              0.06874094 = queryWeight, product of:
                1.31936 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.012893653 = queryNorm
              0.3156939 = fieldWeight in 2162, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=2162)
          0.064774916 = weight(abstract_txt:similarity in 2162) [ClassicSimilarity], result of:
            0.064774916 = score(doc=2162,freq=1.0), product of:
              0.1425057 = queryWeight, product of:
                1.8996418 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.012893653 = queryNorm
              0.4545426 = fieldWeight in 2162, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.078125 = fieldNorm(doc=2162)
          0.09161971 = weight(abstract_txt:lexical in 2162) [ClassicSimilarity], result of:
            0.09161971 = score(doc=2162,freq=1.0), product of:
              0.17956442 = queryWeight, product of:
                2.1323855 = boost
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.012893653 = queryNorm
              0.5102331 = fieldWeight in 2162, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.078125 = fieldNorm(doc=2162)
          0.38973194 = weight(abstract_txt:passage in 2162) [ClassicSimilarity], result of:
            0.38973194 = score(doc=2162,freq=2.0), product of:
              0.42831054 = queryWeight, product of:
                4.0334864 = boost
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.012893653 = queryNorm
              0.90992844 = fieldWeight in 2162, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.078125 = fieldNorm(doc=2162)
          0.37170872 = weight(abstract_txt:passages in 2162) [ClassicSimilarity], result of:
            0.37170872 = score(doc=2162,freq=1.0), product of:
              0.5754923 = queryWeight, product of:
                5.3987145 = boost
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.012893653 = queryNorm
              0.6458969 = fieldWeight in 2162, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.078125 = fieldNorm(doc=2162)
        0.24 = coord(6/25)
    
  5. Salton, G.: Automatic text structuring and summarization (1997) 0.22
    0.21621451 = sum of:
      0.21621451 = product of:
        0.9008938 = sum of:
          0.045355808 = weight(abstract_txt:perform in 1145) [ClassicSimilarity], result of:
            0.045355808 = score(doc=1145,freq=1.0), product of:
              0.078980304 = queryWeight, product of:
                6.1255183 = idf(docFreq=263, maxDocs=44421)
                0.012893653 = queryNorm
              0.5742673 = fieldWeight in 1145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1255183 = idf(docFreq=263, maxDocs=44421)
                0.09375 = fieldNorm(doc=1145)
          0.012728438 = weight(abstract_txt:based in 1145) [ClassicSimilarity], result of:
            0.012728438 = score(doc=1145,freq=1.0), product of:
              0.04265372 = queryWeight, product of:
                1.0392835 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.012893653 = queryNorm
              0.2984133 = fieldWeight in 1145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.09375 = fieldNorm(doc=1145)
          0.00783039 = weight(abstract_txt:that in 1145) [ClassicSimilarity], result of:
            0.00783039 = score(doc=1145,freq=1.0), product of:
              0.035317734 = queryWeight, product of:
                1.1582376 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.012893653 = queryNorm
              0.22171268 = fieldWeight in 1145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.09375 = fieldNorm(doc=1145)
          0.058230158 = weight(abstract_txt:text in 1145) [ClassicSimilarity], result of:
            0.058230158 = score(doc=1145,freq=5.0), product of:
              0.06874094 = queryWeight, product of:
                1.31936 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.012893653 = queryNorm
              0.8470957 = fieldWeight in 1145, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.09375 = fieldNorm(doc=1145)
          0.33069852 = weight(abstract_txt:passage in 1145) [ClassicSimilarity], result of:
            0.33069852 = score(doc=1145,freq=1.0), product of:
              0.42831054 = queryWeight, product of:
                4.0334864 = boost
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.012893653 = queryNorm
              0.77209985 = fieldWeight in 1145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.09375 = fieldNorm(doc=1145)
          0.44605047 = weight(abstract_txt:passages in 1145) [ClassicSimilarity], result of:
            0.44605047 = score(doc=1145,freq=1.0), product of:
              0.5754923 = queryWeight, product of:
                5.3987145 = boost
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.012893653 = queryNorm
              0.7750763 = fieldWeight in 1145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.09375 = fieldNorm(doc=1145)
        0.24 = coord(6/25)
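
The content-based scores combine more factors than the author matches: each matching term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is scaled by the coord factor (matched terms / query terms). The sketch below recomputes entry 2 (doc 6764) from the statistics printed in its tree, assuming the same tf and idf definitions the numbers are consistent with (square-root tf, idf = 1 + ln(maxDocs / (docFreq + 1))). Function and variable names are illustrative; boost, queryNorm and fieldNorm are copied from the output, not derived.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.012893653  # copied from the explain output


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm):
    """One term's contribution: queryWeight * fieldWeight."""
    i = idf(doc_freq)
    query_weight = boost * i * QUERY_NORM
    field_weight = math.sqrt(freq) * i * field_norm
    return query_weight * field_weight


# (term, termFreq, docFreq, boost) for doc 6764, fieldNorm = 0.0625
FIELD_NORM = 0.0625
matches = [
    ("that",      2.0, 11344, 1.1582376),
    ("text",      3.0,  2122, 1.31936),
    ("relevant",  1.0,  1177, 1.5116743),
    ("retrieval", 2.0,  3732, 1.7026315),
    ("passage",   4.0,    31, 4.0334864),
    ("passages",  3.0,    30, 5.3987145),
]

total = sum(term_score(f, df, b, FIELD_NORM) for _, f, df, b in matches)
coord = len(matches) / 25  # 6 of the 25 query terms matched
print(total, total * coord)  # compare with 1.0430033 and 0.2503208 above
```

The coord factor explains why entry 1 scores so much higher than the rest: it matches 9 of 25 query terms (coord 0.36) instead of 6 (coord 0.24), on top of much higher frequencies for "passage" and "passages".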