Document (#15668)

Author
Goh, A.
Hui, S.C.
Title
TES: a text extraction system
Source
Microcomputers for information management. 13(1996) no.1, pp. 41-55
Year
1996
Abstract
With the onset of the information explosion arising from digital libraries and access to a wealth of information through the Internet, the need to efficiently determine the relevance of a document becomes even more urgent. Describes a text extraction system (TES), which retrieves a set of sentences from a document to form an indicative abstract. Such an automated process enables information to be filtered more quickly. Discusses the combination of various text extraction techniques. Compares results with manually produced abstracts.
Theme
Automatic abstracting
Object
TES

Similar documents (content)

  1. Goh, A.; Hui, S.C.; Chan, S.K.: A text extraction system for news reports (1996) 0.26
    0.25972795 = sum of:
      0.25972795 = product of:
        0.9275998 = sum of:
          0.09877386 = weight(abstract_txt:abstracts in 6669) [ClassicSimilarity], result of:
            0.09877386 = score(doc=6669,freq=4.0), product of:
              0.13254511 = queryWeight, product of:
                1.0643665 = boost
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.020888356 = queryNorm
              0.74520934 = fieldWeight in 6669, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.0625 = fieldNorm(doc=6669)
          0.09672765 = weight(abstract_txt:manually in 6669) [ClassicSimilarity], result of:
            0.09672765 = score(doc=6669,freq=2.0), product of:
              0.164682 = queryWeight, product of:
                1.1864034 = boost
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.020888356 = queryNorm
              0.58736014 = fieldWeight in 6669, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.0625 = fieldNorm(doc=6669)
          0.025293814 = weight(abstract_txt:system in 6669) [ClassicSimilarity], result of:
            0.025293814 = score(doc=6669,freq=2.0), product of:
              0.08484611 = queryWeight, product of:
                1.2043155 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.020888356 = queryNorm
              0.298114 = fieldWeight in 6669, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.0625 = fieldNorm(doc=6669)
          0.11310974 = weight(abstract_txt:sentences in 6669) [ClassicSimilarity], result of:
            0.11310974 = score(doc=6669,freq=2.0), product of:
              0.18278718 = queryWeight, product of:
                1.2499199 = boost
                7.000987 = idf(docFreq=109, maxDocs=44421)
                0.020888356 = queryNorm
              0.61880565 = fieldWeight in 6669, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.000987 = idf(docFreq=109, maxDocs=44421)
                0.0625 = fieldNorm(doc=6669)
          0.17634416 = weight(abstract_txt:indicative in 6669) [ClassicSimilarity], result of:
            0.17634416 = score(doc=6669,freq=2.0), product of:
              0.24576485 = queryWeight, product of:
                1.4493364 = boost
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.020888356 = queryNorm
              0.71753204 = fieldWeight in 6669, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.0625 = fieldNorm(doc=6669)
          0.04613781 = weight(abstract_txt:text in 6669) [ClassicSimilarity], result of:
            0.04613781 = score(doc=6669,freq=1.0), product of:
              0.18268411 = queryWeight, product of:
                2.1643143 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.020888356 = queryNorm
              0.25255513 = fieldWeight in 6669, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=6669)
          0.37121278 = weight(abstract_txt:extraction in 6669) [ClassicSimilarity], result of:
            0.37121278 = score(doc=6669,freq=5.0), product of:
              0.42896456 = queryWeight, product of:
                3.316505 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.020888356 = queryNorm
              0.8653694 = fieldWeight in 6669, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.0625 = fieldNorm(doc=6669)
        0.28 = coord(7/25)
    
  2. Barrio, P.; Gravano, L.: Sampling strategies for information extraction over the deep web (2017) 0.14
    0.13727076 = sum of:
      0.13727076 = product of:
        0.6863538 = sum of:
          0.04705007 = weight(abstract_txt:enables in 4412) [ClassicSimilarity], result of:
            0.04705007 = score(doc=4412,freq=1.0), product of:
              0.14027831 = queryWeight, product of:
                1.094976 = boost
                6.133123 = idf(docFreq=261, maxDocs=44421)
                0.020888356 = queryNorm
              0.33540517 = fieldWeight in 4412, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.133123 = idf(docFreq=261, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4412)
          0.024492605 = weight(abstract_txt:information in 4412) [ClassicSimilarity], result of:
            0.024492605 = score(doc=4412,freq=8.0), product of:
              0.06546122 = queryWeight, product of:
                1.2955734 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.020888356 = queryNorm
              0.37415442 = fieldWeight in 4412, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4412)
          0.07222072 = weight(abstract_txt:document in 4412) [ClassicSimilarity], result of:
            0.07222072 = score(doc=4412,freq=5.0), product of:
              0.13753447 = queryWeight, product of:
                1.5333104 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.020888356 = queryNorm
              0.52510995 = fieldWeight in 4412, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4412)
          0.10681052 = weight(abstract_txt:text in 4412) [ClassicSimilarity], result of:
            0.10681052 = score(doc=4412,freq=7.0), product of:
              0.18268411 = queryWeight, product of:
                2.1643143 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.020888356 = queryNorm
              0.5846733 = fieldWeight in 4412, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4412)
          0.43577993 = weight(abstract_txt:extraction in 4412) [ClassicSimilarity], result of:
            0.43577993 = score(doc=4412,freq=9.0), product of:
              0.42896456 = queryWeight, product of:
                3.316505 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.020888356 = queryNorm
              1.015888 = fieldWeight in 4412, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4412)
        0.2 = coord(5/25)
    
  3. Reeve, L.H.; Han, H.; Brooks, A.D.: The use of domain-specific concepts in biomedical text summarization (2007) 0.12
    0.12220202 = sum of:
      0.12220202 = product of:
        0.43643576 = sum of:
          0.053771507 = weight(abstract_txt:enables in 1955) [ClassicSimilarity], result of:
            0.053771507 = score(doc=1955,freq=1.0), product of:
              0.14027831 = queryWeight, product of:
                1.094976 = boost
                6.133123 = idf(docFreq=261, maxDocs=44421)
                0.020888356 = queryNorm
              0.38332018 = fieldWeight in 1955, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.133123 = idf(docFreq=261, maxDocs=44421)
                0.0625 = fieldNorm(doc=1955)
          0.055555403 = weight(abstract_txt:abstract in 1955) [ClassicSimilarity], result of:
            0.055555403 = score(doc=1955,freq=1.0), product of:
              0.14336394 = queryWeight, product of:
                1.1069533 = boost
                6.2002096 = idf(docFreq=244, maxDocs=44421)
                0.020888356 = queryNorm
              0.3875131 = fieldWeight in 1955, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2002096 = idf(docFreq=244, maxDocs=44421)
                0.0625 = fieldNorm(doc=1955)
          0.06839678 = weight(abstract_txt:manually in 1955) [ClassicSimilarity], result of:
            0.06839678 = score(doc=1955,freq=1.0), product of:
              0.164682 = queryWeight, product of:
                1.1864034 = boost
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.020888356 = queryNorm
              0.41532636 = fieldWeight in 1955, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.0625 = fieldNorm(doc=1955)
          0.025293814 = weight(abstract_txt:system in 1955) [ClassicSimilarity], result of:
            0.025293814 = score(doc=1955,freq=2.0), product of:
              0.08484611 = queryWeight, product of:
                1.2043155 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.020888356 = queryNorm
              0.298114 = fieldWeight in 1955, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.0625 = fieldNorm(doc=1955)
          0.11310974 = weight(abstract_txt:sentences in 1955) [ClassicSimilarity], result of:
            0.11310974 = score(doc=1955,freq=2.0), product of:
              0.18278718 = queryWeight, product of:
                1.2499199 = boost
                7.000987 = idf(docFreq=109, maxDocs=44421)
                0.020888356 = queryNorm
              0.61880565 = fieldWeight in 1955, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.000987 = idf(docFreq=109, maxDocs=44421)
                0.0625 = fieldNorm(doc=1955)
          0.017141253 = weight(abstract_txt:information in 1955) [ClassicSimilarity], result of:
            0.017141253 = score(doc=1955,freq=3.0), product of:
              0.06546122 = queryWeight, product of:
                1.2955734 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.020888356 = queryNorm
              0.26185355 = fieldWeight in 1955, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0625 = fieldNorm(doc=1955)
          0.10316728 = weight(abstract_txt:text in 1955) [ClassicSimilarity], result of:
            0.10316728 = score(doc=1955,freq=5.0), product of:
              0.18268411 = queryWeight, product of:
                2.1643143 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.020888356 = queryNorm
              0.56473047 = fieldWeight in 1955, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=1955)
        0.28 = coord(7/25)
    
  4. Wang, P.; Hao, T.; Yan, J.; Jin, L.: Large-scale extraction of drug-disease pairs from the medical literature (2017) 0.12
    0.12068649 = sum of:
      0.12068649 = product of:
        0.50286037 = sum of:
          0.04938693 = weight(abstract_txt:abstracts in 4927) [ClassicSimilarity], result of:
            0.04938693 = score(doc=4927,freq=1.0), product of:
              0.13254511 = queryWeight, product of:
                1.0643665 = boost
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.020888356 = queryNorm
              0.37260467 = fieldWeight in 4927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.0625 = fieldNorm(doc=4927)
          0.06839678 = weight(abstract_txt:manually in 4927) [ClassicSimilarity], result of:
            0.06839678 = score(doc=4927,freq=1.0), product of:
              0.164682 = queryWeight, product of:
                1.1864034 = boost
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.020888356 = queryNorm
              0.41532636 = fieldWeight in 4927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.0625 = fieldNorm(doc=4927)
          0.07515588 = weight(abstract_txt:efficiently in 4927) [ClassicSimilarity], result of:
            0.07515588 = score(doc=4927,freq=1.0), product of:
              0.17536019 = queryWeight, product of:
                1.2242633 = boost
                6.8572807 = idf(docFreq=126, maxDocs=44421)
                0.020888356 = queryNorm
              0.42858005 = fieldWeight in 4927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8572807 = idf(docFreq=126, maxDocs=44421)
                0.0625 = fieldNorm(doc=4927)
          0.009896507 = weight(abstract_txt:information in 4927) [ClassicSimilarity], result of:
            0.009896507 = score(doc=4927,freq=1.0), product of:
              0.06546122 = queryWeight, product of:
                1.2955734 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.020888356 = queryNorm
              0.15118122 = fieldWeight in 4927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0625 = fieldNorm(doc=4927)
          0.06524871 = weight(abstract_txt:text in 4927) [ClassicSimilarity], result of:
            0.06524871 = score(doc=4927,freq=2.0), product of:
              0.18268411 = queryWeight, product of:
                2.1643143 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.020888356 = queryNorm
              0.3571669 = fieldWeight in 4927, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=4927)
          0.23477557 = weight(abstract_txt:extraction in 4927) [ClassicSimilarity], result of:
            0.23477557 = score(doc=4927,freq=2.0), product of:
              0.42896456 = queryWeight, product of:
                3.316505 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.020888356 = queryNorm
              0.5473076 = fieldWeight in 4927, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.0625 = fieldNorm(doc=4927)
        0.24 = coord(6/25)
    
  5. Zhou, G.D.; Zhang, M.: Extracting relation information from text documents by exploring various types of knowledge (2007) 0.12
    0.11773387 = sum of:
      0.11773387 = product of:
        0.58866936 = sum of:
          0.053771507 = weight(abstract_txt:enables in 1927) [ClassicSimilarity], result of:
            0.053771507 = score(doc=1927,freq=1.0), product of:
              0.14027831 = queryWeight, product of:
                1.094976 = boost
                6.133123 = idf(docFreq=261, maxDocs=44421)
                0.020888356 = queryNorm
              0.38332018 = fieldWeight in 1927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.133123 = idf(docFreq=261, maxDocs=44421)
                0.0625 = fieldNorm(doc=1927)
          0.025293814 = weight(abstract_txt:system in 1927) [ClassicSimilarity], result of:
            0.025293814 = score(doc=1927,freq=2.0), product of:
              0.08484611 = queryWeight, product of:
                1.2043155 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.020888356 = queryNorm
              0.298114 = fieldWeight in 1927, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.0625 = fieldNorm(doc=1927)
          0.024241393 = weight(abstract_txt:information in 1927) [ClassicSimilarity], result of:
            0.024241393 = score(doc=1927,freq=6.0), product of:
              0.06546122 = queryWeight, product of:
                1.2955734 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.020888356 = queryNorm
              0.37031686 = fieldWeight in 1927, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0625 = fieldNorm(doc=1927)
          0.04613781 = weight(abstract_txt:text in 1927) [ClassicSimilarity], result of:
            0.04613781 = score(doc=1927,freq=1.0), product of:
              0.18268411 = queryWeight, product of:
                2.1643143 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.020888356 = queryNorm
              0.25255513 = fieldWeight in 1927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=1927)
          0.43922484 = weight(abstract_txt:extraction in 1927) [ClassicSimilarity], result of:
            0.43922484 = score(doc=1927,freq=7.0), product of:
              0.42896456 = queryWeight, product of:
                3.316505 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.020888356 = queryNorm
              1.0239187 = fieldWeight in 1927, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.0625 = fieldNorm(doc=1927)
        0.2 = coord(5/25)
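
How the scores are composed

The score trees above follow the TF-IDF formulas of Lucene's ClassicSimilarity: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and the summed term scores are multiplied by coord (matched query terms / total query terms). The sketch below recomputes the first hit (doc 6669) from the values printed in its tree. The function and variable names are illustrative and not part of Lucene, and the arithmetic uses Python doubles rather than Lucene's 32-bit floats, so the last digits may differ slightly.

```python
import math

def idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # One "weight(...)" node in the tree: queryWeight * fieldWeight.
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight

# Constants copied from the explanation of doc 6669.
MAX_DOCS = 44421
QUERY_NORM = 0.020888356
FIELD_NORM = 0.0625

# (term, freq, docFreq, boost) for each matching term in doc 6669.
TERMS = [
    ("abstracts", 4.0, 310, 1.0643665),
    ("manually", 2.0, 156, 1.1864034),
    ("system", 2.0, 4140, 1.2043155),
    ("sentences", 2.0, 109, 1.2499199),
    ("indicative", 2.0, 35, 1.4493364),
    ("text", 1.0, 2122, 2.1643143),
    ("extraction", 5.0, 246, 3.316505),
]

term_sum = sum(
    term_score(freq, doc_freq, MAX_DOCS, boost, QUERY_NORM, FIELD_NORM)
    for _term, freq, doc_freq, boost in TERMS
)
coord = 7 / 25  # 7 of the 25 query terms matched this document
final_score = term_sum * coord

print(f"sum of term scores: {term_sum:.7f}")
print(f"final score:        {final_score:.7f}")
```

Under these assumptions the two printed values should land close to the 0.9275998 and 0.25972795 reported for the first hit. The same function applies to the other four hits once their own freq, fieldNorm, and coord values are substituted.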