Document (#22060)

Author
Liu, J.
Wu, Y.
Zhou, L.
Title
¬A hybrid method for abstracting newspaper articles
Source
Journal of the American Society for Information Science. 50(1999) no.13, p.1234-1245
Year
1999
Abstract
This paper introduces a hybrid method for abstracting Chinese text. It integrates the statistical approach with language understanding. Some linguistic heuristics and segmentation are also incorporated into the abstracting process. The prototype system is multipurpose, catering to users with different requirements. Initial responses show that the proposed method contributes much to the flexibility and accuracy of the automatic Chinese abstracting system. In practice, the present work provides a path to developing an intelligent Chinese system for automating the information
Theme
Automatic abstracting
Form
Newspapers

Similar documents (author)

  1. Zhou, L.: Characteristics of material organization and classification in the Kinsey Institute Library (2003) 4.86
    4.8560257 = sum of:
      4.8560257 = weight(author_txt:zhou in 639) [ClassicSimilarity], result of:
        4.8560257 = score(doc=639,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            7.769642 = idf(docFreq=50, maxDocs=44421)
            0.12870605 = queryNorm
          4.856026 = fieldWeight in 639, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            7.769642 = idf(docFreq=50, maxDocs=44421)
            0.625 = fieldNorm(doc=639)
    
  2. Zhou, J.-z.: ¬A new subclass for Library of Congress Classification, QF : Computer science (1998) 3.88
    3.8848207 = sum of:
      3.8848207 = weight(author_txt:zhou in 3846) [ClassicSimilarity], result of:
        3.8848207 = score(doc=3846,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            7.769642 = idf(docFreq=50, maxDocs=44421)
            0.12870605 = queryNorm
          3.884821 = fieldWeight in 3846, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            7.769642 = idf(docFreq=50, maxDocs=44421)
            0.5 = fieldNorm(doc=3846)
    
  3. Zhou, L.; Zhang, D.: NLPIR: a theoretical framework for applying Natural Language Processing to information retrieval (2003) 3.88
    3.8848207 = sum of:
      3.8848207 = weight(author_txt:zhou in 148) [ClassicSimilarity], result of:
        3.8848207 = score(doc=148,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            7.769642 = idf(docFreq=50, maxDocs=44421)
            0.12870605 = queryNorm
          3.884821 = fieldWeight in 148, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            7.769642 = idf(docFreq=50, maxDocs=44421)
            0.5 = fieldNorm(doc=148)
    
  4. Zhou, P.; Leydesdorff, L.: ¬A comparison between the China Scientific and Technical Papers and Citations Database and the Science Citation Index in terms of journal hierarchies and interjournal citation relations (2007) 3.88
    3.8848207 = sum of:
      3.8848207 = weight(author_txt:zhou in 1070) [ClassicSimilarity], result of:
        3.8848207 = score(doc=1070,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            7.769642 = idf(docFreq=50, maxDocs=44421)
            0.12870605 = queryNorm
          3.884821 = fieldWeight in 1070, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            7.769642 = idf(docFreq=50, maxDocs=44421)
            0.5 = fieldNorm(doc=1070)
    
  5. Zhou, G.D.; Zhang, M.: Extracting relation information from text documents by exploring various types of knowledge (2007) 3.88
    3.8848207 = sum of:
      3.8848207 = weight(author_txt:zhou in 1927) [ClassicSimilarity], result of:
        3.8848207 = score(doc=1927,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            7.769642 = idf(docFreq=50, maxDocs=44421)
            0.12870605 = queryNorm
          3.884821 = fieldWeight in 1927, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            7.769642 = idf(docFreq=50, maxDocs=44421)
            0.5 = fieldNorm(doc=1927)
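
The arithmetic in the author listing above can be re-checked with a short sketch. The formulas here (idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq), score = queryWeight * fieldWeight) are assumptions inferred from the printed numbers and the ClassicSimilarity label; they are not taken from the catalogue's own code. The function names `idf`, `tf` and `term_score` are hypothetical, and fieldNorm and queryNorm are simply copied from the listing rather than derived.

```python
import math

# Assumed formulas, inferred from the numbers in the listing
# (ClassicSimilarity-style TF-IDF); all function names are illustrative.

def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq: float) -> float:
    """Term frequency factor: square root of the raw term count."""
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight

# Inputs copied from entry 1 (doc=639) and entry 2 (doc=3846).
score_639 = term_score(1.0, 50, 44421, field_norm=0.625, query_norm=0.12870605)
score_3846 = term_score(1.0, 50, 44421, field_norm=0.5, query_norm=0.12870605)
print(f"doc 639: {score_639:.4f}, doc 3846: {score_3846:.4f}")
```

Under these assumptions the results should land close to the 4.856 and 3.885 shown in the listing; small differences in the last digits are expected because the listing appears to use single-precision floats. The only input that differs between the five author hits is fieldNorm, which is why entry 1 ranks above the others.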
    

Similar documents (content)

  1. Yang, C.C.; Li, K.W.: ¬A heuristic method based on a statistical approach for chinese text segmentation (2005) 0.24
    0.23772812 = sum of:
      0.23772812 = product of:
        0.9905338 = sum of:
          0.010660655 = weight(abstract_txt:with in 5580) [ClassicSimilarity], result of:
            0.010660655 = score(doc=5580,freq=3.0), product of:
              0.039452482 = queryWeight, product of:
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.015805397 = queryNorm
              0.27021506 = fieldWeight in 5580, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.0625 = fieldNorm(doc=5580)
          0.027753297 = weight(abstract_txt:automatic in 5580) [ClassicSimilarity], result of:
            0.027753297 = score(doc=5580,freq=1.0), product of:
              0.08546571 = queryWeight, product of:
                1.0407437 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.015805397 = queryNorm
              0.32473022 = fieldWeight in 5580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.0625 = fieldNorm(doc=5580)
          0.047751196 = weight(abstract_txt:statistical in 5580) [ClassicSimilarity], result of:
            0.047751196 = score(doc=5580,freq=2.0), product of:
              0.097400606 = queryWeight, product of:
                1.1110374 = boost
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.015805397 = queryNorm
              0.49025562 = fieldWeight in 5580, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.0625 = fieldNorm(doc=5580)
          0.29717565 = weight(abstract_txt:segmentation in 5580) [ClassicSimilarity], result of:
            0.29717565 = score(doc=5580,freq=9.0), product of:
              0.19960748 = queryWeight, product of:
                1.5905094 = boost
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.015805397 = queryNorm
              1.4888002 = fieldWeight in 5580, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.0625 = fieldNorm(doc=5580)
          0.1621515 = weight(abstract_txt:method in 5580) [ClassicSimilarity], result of:
            0.1621515 = score(doc=5580,freq=9.0), product of:
              0.19223054 = queryWeight, product of:
                2.7034583 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.015805397 = queryNorm
              0.84352624 = fieldWeight in 5580, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.0625 = fieldNorm(doc=5580)
          0.44504154 = weight(abstract_txt:chinese in 5580) [ClassicSimilarity], result of:
            0.44504154 = score(doc=5580,freq=9.0), product of:
              0.3768271 = queryWeight, product of:
                3.785119 = boost
                6.2987905 = idf(docFreq=221, maxDocs=44421)
                0.015805397 = queryNorm
              1.1810232 = fieldWeight in 5580, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.2987905 = idf(docFreq=221, maxDocs=44421)
                0.0625 = fieldNorm(doc=5580)
        0.24 = coord(6/25)
    
  2. Khoo, C.S.G.; Dai, D.; Loh, T.E.: Using statistical and contextual information to identify two- and three-character words in Chinese text (2002) 0.15
    0.15432413 = sum of:
      0.15432413 = product of:
        0.64301723 = sum of:
          0.008704388 = weight(abstract_txt:with in 206) [ClassicSimilarity], result of:
            0.008704388 = score(doc=206,freq=2.0), product of:
              0.039452482 = queryWeight, product of:
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.015805397 = queryNorm
              0.22062966 = fieldWeight in 206, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.0625 = fieldNorm(doc=206)
          0.03924909 = weight(abstract_txt:automatic in 206) [ClassicSimilarity], result of:
            0.03924909 = score(doc=206,freq=2.0), product of:
              0.08546571 = queryWeight, product of:
                1.0407437 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.015805397 = queryNorm
              0.45923787 = fieldWeight in 206, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.0625 = fieldNorm(doc=206)
          0.033765193 = weight(abstract_txt:statistical in 206) [ClassicSimilarity], result of:
            0.033765193 = score(doc=206,freq=1.0), product of:
              0.097400606 = queryWeight, product of:
                1.1110374 = boost
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.015805397 = queryNorm
              0.3466631 = fieldWeight in 206, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.0625 = fieldNorm(doc=206)
          0.06171081 = weight(abstract_txt:incorporated in 206) [ClassicSimilarity], result of:
            0.06171081 = score(doc=206,freq=1.0), product of:
              0.1455983 = queryWeight, product of:
                1.3583947 = boost
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.015805397 = queryNorm
              0.4238429 = fieldWeight in 206, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.0625 = fieldNorm(doc=206)
          0.2426429 = weight(abstract_txt:segmentation in 206) [ClassicSimilarity], result of:
            0.2426429 = score(doc=206,freq=6.0), product of:
              0.19960748 = queryWeight, product of:
                1.5905094 = boost
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.015805397 = queryNorm
              1.2156003 = fieldWeight in 206, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.0625 = fieldNorm(doc=206)
          0.25694486 = weight(abstract_txt:chinese in 206) [ClassicSimilarity], result of:
            0.25694486 = score(doc=206,freq=3.0), product of:
              0.3768271 = queryWeight, product of:
                3.785119 = boost
                6.2987905 = idf(docFreq=221, maxDocs=44421)
                0.015805397 = queryNorm
              0.6818641 = fieldWeight in 206, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.2987905 = idf(docFreq=221, maxDocs=44421)
                0.0625 = fieldNorm(doc=206)
        0.24 = coord(6/25)
    
  3. Peng, F.; Huang, X.: Machine learning for Asian language text classification (2007) 0.14
    0.13511796 = sum of:
      0.13511796 = product of:
        0.6755898 = sum of:
          0.010660655 = weight(abstract_txt:with in 1831) [ClassicSimilarity], result of:
            0.010660655 = score(doc=1831,freq=3.0), product of:
              0.039452482 = queryWeight, product of:
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.015805397 = queryNorm
              0.27021506 = fieldWeight in 1831, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.033765193 = weight(abstract_txt:statistical in 1831) [ClassicSimilarity], result of:
            0.033765193 = score(doc=1831,freq=1.0), product of:
              0.097400606 = queryWeight, product of:
                1.1110374 = boost
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.015805397 = queryNorm
              0.3466631 = fieldWeight in 1831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.07238529 = weight(abstract_txt:accuracy in 1831) [ClassicSimilarity], result of:
            0.07238529 = score(doc=1831,freq=3.0), product of:
              0.11228161 = queryWeight, product of:
                1.1928948 = boost
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.015805397 = queryNorm
              0.64467627 = fieldWeight in 1831, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.26208428 = weight(abstract_txt:segmentation in 1831) [ClassicSimilarity], result of:
            0.26208428 = score(doc=1831,freq=7.0), product of:
              0.19960748 = queryWeight, product of:
                1.5905094 = boost
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.015805397 = queryNorm
              1.3129983 = fieldWeight in 1831, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
          0.29669437 = weight(abstract_txt:chinese in 1831) [ClassicSimilarity], result of:
            0.29669437 = score(doc=1831,freq=4.0), product of:
              0.3768271 = queryWeight, product of:
                3.785119 = boost
                6.2987905 = idf(docFreq=221, maxDocs=44421)
                0.015805397 = queryNorm
              0.7873488 = fieldWeight in 1831, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2987905 = idf(docFreq=221, maxDocs=44421)
                0.0625 = fieldNorm(doc=1831)
        0.2 = coord(5/25)
    
  4. Fletcher, G.P.; Hinde, C.J.: Using a neural network as a tool for constructing rule based systems (1995) 0.13
    0.1335225 = sum of:
      0.1335225 = product of:
        0.66761243 = sum of:
          0.010771131 = weight(abstract_txt:with in 3282) [ClassicSimilarity], result of:
            0.010771131 = score(doc=3282,freq=1.0), product of:
              0.039452482 = queryWeight, product of:
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.015805397 = queryNorm
              0.2730153 = fieldWeight in 3282, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.109375 = fieldNorm(doc=3282)
          0.08473256 = weight(abstract_txt:intelligent in 3282) [ClassicSimilarity], result of:
            0.08473256 = score(doc=3282,freq=1.0), product of:
              0.12385789 = queryWeight, product of:
                1.2528806 = boost
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.015805397 = queryNorm
              0.6841111 = fieldWeight in 3282, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.109375 = fieldNorm(doc=3282)
          0.03985731 = weight(abstract_txt:system in 3282) [ClassicSimilarity], result of:
            0.03985731 = score(doc=3282,freq=1.0), product of:
              0.108044475 = queryWeight, product of:
                2.0267947 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.015805397 = queryNorm
              0.36889726 = fieldWeight in 3282, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.109375 = fieldNorm(doc=3282)
          0.09458838 = weight(abstract_txt:method in 3282) [ClassicSimilarity], result of:
            0.09458838 = score(doc=3282,freq=1.0), product of:
              0.19223054 = queryWeight, product of:
                2.7034583 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.015805397 = queryNorm
              0.49205697 = fieldWeight in 3282, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.109375 = fieldNorm(doc=3282)
          0.43766308 = weight(abstract_txt:abstracting in 3282) [ClassicSimilarity], result of:
            0.43766308 = score(doc=3282,freq=1.0), product of:
              0.58749396 = queryWeight, product of:
                5.4573216 = boost
                6.8111186 = idf(docFreq=132, maxDocs=44421)
                0.015805397 = queryNorm
              0.7449661 = fieldWeight in 3282, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8111186 = idf(docFreq=132, maxDocs=44421)
                0.109375 = fieldNorm(doc=3282)
        0.2 = coord(5/25)
    
  5. Wan, T.-L.; Evens, M.; Wan, Y.-W.; Pao, Y.-Y.: Experiments with automatic indexing and a relational thesaurus in a Chinese information retrieval system (1997) 0.12
    0.11607993 = sum of:
      0.11607993 = product of:
        0.58039963 = sum of:
          0.013056581 = weight(abstract_txt:with in 956) [ClassicSimilarity], result of:
            0.013056581 = score(doc=956,freq=2.0), product of:
              0.039452482 = queryWeight, product of:
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.015805397 = queryNorm
              0.33094448 = fieldWeight in 956, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.09375 = fieldNorm(doc=956)
          0.07210518 = weight(abstract_txt:automatic in 956) [ClassicSimilarity], result of:
            0.07210518 = score(doc=956,freq=3.0), product of:
              0.08546571 = queryWeight, product of:
                1.0407437 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.015805397 = queryNorm
              0.8436738 = fieldWeight in 956, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.09375 = fieldNorm(doc=956)
          0.05064779 = weight(abstract_txt:statistical in 956) [ClassicSimilarity], result of:
            0.05064779 = score(doc=956,freq=1.0), product of:
              0.097400606 = queryWeight, product of:
                1.1110374 = boost
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.015805397 = queryNorm
              0.5199946 = fieldWeight in 956, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5466094 = idf(docFreq=470, maxDocs=44421)
                0.09375 = fieldNorm(doc=956)
          0.059172764 = weight(abstract_txt:system in 956) [ClassicSimilarity], result of:
            0.059172764 = score(doc=956,freq=3.0), product of:
              0.108044475 = queryWeight, product of:
                2.0267947 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.015805397 = queryNorm
              0.5476704 = fieldWeight in 956, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.09375 = fieldNorm(doc=956)
          0.3854173 = weight(abstract_txt:chinese in 956) [ClassicSimilarity], result of:
            0.3854173 = score(doc=956,freq=3.0), product of:
              0.3768271 = queryWeight, product of:
                3.785119 = boost
                6.2987905 = idf(docFreq=221, maxDocs=44421)
                0.015805397 = queryNorm
              1.0227962 = fieldWeight in 956, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.2987905 = idf(docFreq=221, maxDocs=44421)
                0.09375 = fieldNorm(doc=956)
        0.2 = coord(5/25)
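
For the content-based hits, each total appears to be the sum of the per-term scores multiplied by a coordination factor, coord(matched / total query terms). The sketch below re-computes two of the totals from the per-term values printed above; the helper names `coord` and `document_score` are hypothetical, and the reading of coord as "matched terms over 25 query terms" is an assumption based on the `coord(6/25)` and `coord(5/25)` lines.

```python
def coord(matched_terms: int, query_terms: int) -> float:
    """Fraction of the query terms that occur in the document."""
    return matched_terms / query_terms

def document_score(term_scores, query_terms):
    """Sum of per-term scores, scaled by the coordination factor."""
    return coord(len(term_scores), query_terms) * sum(term_scores)

# Per-term scores copied from entry 1 (doc=5580): with, automatic,
# statistical, segmentation, method, chinese.
yang_li = [0.010660655, 0.027753297, 0.047751196,
           0.29717565, 0.1621515, 0.44504154]

# Per-term scores copied from entry 3 (doc=1831): with, statistical,
# accuracy, segmentation, chinese.
peng_huang = [0.010660655, 0.033765193, 0.07238529,
              0.26208428, 0.29669437]

score_5580 = document_score(yang_li, 25)
score_1831 = document_score(peng_huang, 25)
print(f"doc 5580: {score_5580:.4f}, doc 1831: {score_1831:.4f}")
```

The two values should agree with the 0.2377 and 0.1351 totals in the listing up to rounding. The coordination factor explains why a document matching six of the 25 query terms can outrank one with similar per-term scores that matches only five.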