Document (#19831)

Author
Tseng, Y.-H.
Title
Keyword extraction techniques and relevance feedback
Source
Bulletin of the Library Association of China. 1997, no.59, Dec., pp.59-64
Year
1997
Abstract
Automatic keyword extraction is an important and fundamental technology in advanced information retrieval systems. Briefly compares several major keyword extraction methods, lists their advantages and disadvantages, and reports recent research progress in Taiwan. Also describes the application of a keyword extraction algorithm to relevance feedback in an information retrieval system. Preliminary analysis shows that the error rate of extracting relevant keywords is 18%, and that the precision rate is over 50%. The main disadvantage of this approach is that the extraction results depend on the retrieval results, which in turn depend on the data held by the database. Apart from collecting more data, this problem can be alleviated by applying a thesaurus constructed by the same keyword extraction algorithm.
Footnote
[In Chinese]
Theme
Indexing studies

Similar documents (author)

  1. Tseng, Y.-H.: Automatic cataloguing and searching for retrospective data by use of OCR text (2001) 4.57
    4.5682592 = sum of:
      4.5682592 = weight(author_txt:tseng in 5420) [ClassicSimilarity], result of:
        4.5682592 = fieldWeight in 5420, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.1365185 = idf(docFreq=12, maxDocs=44421)
          0.5 = fieldNorm(doc=5420)
    
  2. Tseng, Y.-H.: Solving vocabulary problems with interactive query expansion (1998) 4.57
    4.5682592 = sum of:
      4.5682592 = weight(author_txt:tseng in 6159) [ClassicSimilarity], result of:
        4.5682592 = fieldWeight in 6159, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.1365185 = idf(docFreq=12, maxDocs=44421)
          0.5 = fieldNorm(doc=6159)
    
  3. Tseng, Y.H.; Lin, Y.I.: Evaluation of fuzzy search, term suggestion, and term relevance feedback in an OPAC system (1998) 4.57
    4.5682592 = sum of:
      4.5682592 = weight(author_txt:tseng in 430) [ClassicSimilarity], result of:
        4.5682592 = fieldWeight in 430, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.1365185 = idf(docFreq=12, maxDocs=44421)
          0.5 = fieldNorm(doc=430)
    
  4. Tseng, Y.-H.: Automatic thesaurus generation for Chinese documents (2002) 4.57
    4.5682592 = sum of:
      4.5682592 = weight(author_txt:tseng in 226) [ClassicSimilarity], result of:
        4.5682592 = fieldWeight in 226, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.1365185 = idf(docFreq=12, maxDocs=44421)
          0.5 = fieldNorm(doc=226)
    
  5. Drenth, H.; Morris, A.; Tseng, G.: Expert systems as information intermediaries (1991) 3.43
    3.4261944 = sum of:
      3.4261944 = weight(author_txt:tseng in 3694) [ClassicSimilarity], result of:
        3.4261944 = fieldWeight in 3694, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.1365185 = idf(docFreq=12, maxDocs=44421)
          0.375 = fieldNorm(doc=3694)
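
The author scores above are each a single fieldWeight, the product of tf, idf and fieldNorm. The following is a minimal sketch that recomputes them, assuming the textbook ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); these formulas are an assumption that happens to reproduce the printed values, not something the record itself states. The function names are illustrative only.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed idf form: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    # Assumed tf form: square root of the raw term frequency
    return math.sqrt(freq)

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in the explanation trees
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Entries 1-4: author_txt:tseng, docFreq=12, maxDocs=44421, fieldNorm=0.5
top = field_weight(1.0, 12, 44421, 0.5)
# Entry 5: same term, but fieldNorm=0.375
fifth = field_weight(1.0, 12, 44421, 0.375)

print(round(top, 2), round(fifth, 2))  # prints: 4.57 3.43
```

Under these assumptions the only thing separating entry 5 from entries 1-4 is the smaller fieldNorm (0.375 against 0.5); tf and idf are identical across all five entries.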
    

Similar documents (content)

  1. Goh, A.; Hui, S.C.; Chan, S.K.: A text extraction system for news reports (1996) 0.24
    0.23935233 = sum of:
      0.23935233 = product of:
        0.8548297 = sum of:
          0.03404664 = weight(abstract_txt:keywords in 6669) [ClassicSimilarity], result of:
            0.03404664 = score(doc=6669,freq=1.0), product of:
              0.09072489 = queryWeight, product of:
                6.004374 = idf(docFreq=297, maxDocs=44421)
                0.015109801 = queryNorm
              0.37527338 = fieldWeight in 6669, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.004374 = idf(docFreq=297, maxDocs=44421)
                0.0625 = fieldNorm(doc=6669)
          0.052362204 = weight(abstract_txt:extracting in 6669) [ClassicSimilarity], result of:
            0.052362204 = score(doc=6669,freq=1.0), product of:
              0.12088032 = queryWeight, product of:
                1.154289 = boost
                6.930783 = idf(docFreq=117, maxDocs=44421)
                0.015109801 = queryNorm
              0.43317392 = fieldWeight in 6669, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.930783 = idf(docFreq=117, maxDocs=44421)
                0.0625 = fieldNorm(doc=6669)
          0.026482657 = weight(abstract_txt:results in 6669) [ClassicSimilarity], result of:
            0.026482657 = score(doc=6669,freq=4.0), product of:
              0.060903374 = queryWeight, product of:
                1.1587038 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.015109801 = queryNorm
              0.4348307 = fieldWeight in 6669, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0625 = fieldNorm(doc=6669)
          0.030128853 = weight(abstract_txt:application in 6669) [ClassicSimilarity], result of:
            0.030128853 = score(doc=6669,freq=1.0), product of:
              0.10535991 = queryWeight, product of:
                1.5240158 = boost
                4.5753803 = idf(docFreq=1243, maxDocs=44421)
                0.015109801 = queryNorm
              0.28596127 = fieldWeight in 6669, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5753803 = idf(docFreq=1243, maxDocs=44421)
                0.0625 = fieldNorm(doc=6669)
          0.037680626 = weight(abstract_txt:relevance in 6669) [ClassicSimilarity], result of:
            0.037680626 = score(doc=6669,freq=1.0), product of:
              0.12230167 = queryWeight, product of:
                1.6419803 = boost
                4.929532 = idf(docFreq=872, maxDocs=44421)
                0.015109801 = queryNorm
              0.30809575 = fieldWeight in 6669, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.929532 = idf(docFreq=872, maxDocs=44421)
                0.0625 = fieldNorm(doc=6669)
          0.1731529 = weight(abstract_txt:keyword in 6669) [ClassicSimilarity], result of:
            0.1731529 = score(doc=6669,freq=1.0), product of:
              0.45879656 = queryWeight, product of:
                5.0284233 = boost
                6.038507 = idf(docFreq=287, maxDocs=44421)
                0.015109801 = queryNorm
              0.3774067 = fieldWeight in 6669, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.038507 = idf(docFreq=287, maxDocs=44421)
                0.0625 = fieldNorm(doc=6669)
          0.50097585 = weight(abstract_txt:extraction in 6669) [ClassicSimilarity], result of:
            0.50097585 = score(doc=6669,freq=5.0), product of:
              0.5789156 = queryWeight, product of:
                6.187568 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.015109801 = queryNorm
              0.8653694 = fieldWeight in 6669, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.0625 = fieldNorm(doc=6669)
        0.28 = coord(7/25)
    
  2. Ercan, G.; Cicekli, I.: Using lexical chains for keyword extraction (2007) 0.21
    0.2109438 = sum of:
      0.2109438 = product of:
        1.054719 = sum of:
          0.07222383 = weight(abstract_txt:keywords in 1951) [ClassicSimilarity], result of:
            0.07222383 = score(doc=1951,freq=2.0), product of:
              0.09072489 = queryWeight, product of:
                6.004374 = idf(docFreq=297, maxDocs=44421)
                0.015109801 = queryNorm
              0.7960751 = fieldWeight in 1951, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.004374 = idf(docFreq=297, maxDocs=44421)
                0.09375 = fieldNorm(doc=1951)
          0.019861992 = weight(abstract_txt:results in 1951) [ClassicSimilarity], result of:
            0.019861992 = score(doc=1951,freq=1.0), product of:
              0.060903374 = queryWeight, product of:
                1.1587038 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.015109801 = queryNorm
              0.32612303 = fieldWeight in 1951, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.09375 = fieldNorm(doc=1951)
          0.013239033 = weight(abstract_txt:that in 1951) [ClassicSimilarity], result of:
            0.013239033 = score(doc=1951,freq=2.0), product of:
              0.04222316 = queryWeight, product of:
                1.1816062 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.015109801 = queryNorm
              0.31354907 = fieldWeight in 1951, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.09375 = fieldNorm(doc=1951)
          0.36731276 = weight(abstract_txt:keyword in 1951) [ClassicSimilarity], result of:
            0.36731276 = score(doc=1951,freq=2.0), product of:
              0.45879656 = queryWeight, product of:
                5.0284233 = boost
                6.038507 = idf(docFreq=287, maxDocs=44421)
                0.015109801 = queryNorm
              0.8006005 = fieldWeight in 1951, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.038507 = idf(docFreq=287, maxDocs=44421)
                0.09375 = fieldNorm(doc=1951)
          0.5820813 = weight(abstract_txt:extraction in 1951) [ClassicSimilarity], result of:
            0.5820813 = score(doc=1951,freq=3.0), product of:
              0.5789156 = queryWeight, product of:
                6.187568 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.015109801 = queryNorm
              1.0054684 = fieldWeight in 1951, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.09375 = fieldNorm(doc=1951)
        0.2 = coord(5/25)
    
  3. Ning, X.; Jin, H.; Jia, W.; Yuan, P.: Practical and effective IR-style keyword search over semantic web (2009) 0.17
    0.16843493 = sum of:
      0.16843493 = product of:
        0.6015533 = sum of:
          0.051069956 = weight(abstract_txt:keywords in 213) [ClassicSimilarity], result of:
            0.051069956 = score(doc=213,freq=1.0), product of:
              0.09072489 = queryWeight, product of:
                6.004374 = idf(docFreq=297, maxDocs=44421)
                0.015109801 = queryNorm
              0.5629101 = fieldWeight in 213, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.004374 = idf(docFreq=297, maxDocs=44421)
                0.09375 = fieldNorm(doc=213)
          0.017426621 = weight(abstract_txt:data in 213) [ClassicSimilarity], result of:
            0.017426621 = score(doc=213,freq=1.0), product of:
              0.055817228 = queryWeight, product of:
                1.1092665 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.015109801 = queryNorm
              0.31220865 = fieldWeight in 213, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.09375 = fieldNorm(doc=213)
          0.019861992 = weight(abstract_txt:results in 213) [ClassicSimilarity], result of:
            0.019861992 = score(doc=213,freq=1.0), product of:
              0.060903374 = queryWeight, product of:
                1.1587038 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.015109801 = queryNorm
              0.32612303 = fieldWeight in 213, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.09375 = fieldNorm(doc=213)
          0.016214436 = weight(abstract_txt:that in 213) [ClassicSimilarity], result of:
            0.016214436 = score(doc=213,freq=3.0), product of:
              0.04222316 = queryWeight, product of:
                1.1816062 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.015109801 = queryNorm
              0.3840176 = fieldWeight in 213, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.09375 = fieldNorm(doc=213)
          0.042055737 = weight(abstract_txt:retrieval in 213) [ClassicSimilarity], result of:
            0.042055737 = score(doc=213,freq=2.0), product of:
              0.09124241 = queryWeight, product of:
                1.7369838 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.015109801 = queryNorm
              0.46092314 = fieldWeight in 213, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.09375 = fieldNorm(doc=213)
          0.08761184 = weight(abstract_txt:algorithm in 213) [ClassicSimilarity], result of:
            0.08761184 = score(doc=213,freq=1.0), product of:
              0.16380784 = queryWeight, product of:
                1.9002866 = boost
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.015109801 = queryNorm
              0.53484523 = fieldWeight in 213, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.09375 = fieldNorm(doc=213)
          0.36731276 = weight(abstract_txt:keyword in 213) [ClassicSimilarity], result of:
            0.36731276 = score(doc=213,freq=2.0), product of:
              0.45879656 = queryWeight, product of:
                5.0284233 = boost
                6.038507 = idf(docFreq=287, maxDocs=44421)
                0.015109801 = queryNorm
              0.8006005 = fieldWeight in 213, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.038507 = idf(docFreq=287, maxDocs=44421)
                0.09375 = fieldNorm(doc=213)
        0.28 = coord(7/25)
    
  4. Semantic keyword-based search on structured data sources : COST Action IC1302. Second International KEYSTONE Conference, IKC 2016, Cluj-Napoca, Romania, September 8-9, 2016, Revised Selected Papers (2017) 0.16
    0.15781015 = sum of:
      0.15781015 = product of:
        0.98631346 = sum of:
          0.020331059 = weight(abstract_txt:data in 4479) [ClassicSimilarity], result of:
            0.020331059 = score(doc=4479,freq=1.0), product of:
              0.055817228 = queryWeight, product of:
                1.1092665 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.015109801 = queryNorm
              0.36424342 = fieldWeight in 4479, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.109375 = fieldNorm(doc=4479)
          0.04906503 = weight(abstract_txt:retrieval in 4479) [ClassicSimilarity], result of:
            0.04906503 = score(doc=4479,freq=2.0), product of:
              0.09124241 = queryWeight, product of:
                1.7369838 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.015109801 = queryNorm
              0.5377437 = fieldWeight in 4479, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.109375 = fieldNorm(doc=4479)
          0.5248418 = weight(abstract_txt:keyword in 4479) [ClassicSimilarity], result of:
            0.5248418 = score(doc=4479,freq=3.0), product of:
              0.45879656 = queryWeight, product of:
                5.0284233 = boost
                6.038507 = idf(docFreq=287, maxDocs=44421)
                0.015109801 = queryNorm
              1.1439532 = fieldWeight in 4479, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.038507 = idf(docFreq=287, maxDocs=44421)
                0.109375 = fieldNorm(doc=4479)
          0.3920756 = weight(abstract_txt:extraction in 4479) [ClassicSimilarity], result of:
            0.3920756 = score(doc=4479,freq=1.0), product of:
              0.5789156 = queryWeight, product of:
                6.187568 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.015109801 = queryNorm
              0.6772587 = fieldWeight in 4479, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.109375 = fieldNorm(doc=4479)
        0.16 = coord(4/25)
    
  5. Colace, F.; Santo, M. de; Greco, L.; Napoletano, P.: Improving relevance feedback-based query expansion by the use of a weighted word pairs approach (2015) 0.15
    0.1485755 = sum of:
      0.1485755 = product of:
        0.61906457 = sum of:
          0.01655166 = weight(abstract_txt:results in 3263) [ClassicSimilarity], result of:
            0.01655166 = score(doc=3263,freq=1.0), product of:
              0.060903374 = queryWeight, product of:
                1.1587038 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.015109801 = queryNorm
              0.2717692 = fieldWeight in 3263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.078125 = fieldNorm(doc=3263)
          0.007801174 = weight(abstract_txt:that in 3263) [ClassicSimilarity], result of:
            0.007801174 = score(doc=3263,freq=1.0), product of:
              0.04222316 = queryWeight, product of:
                1.1816062 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.015109801 = queryNorm
              0.18476056 = fieldWeight in 3263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.078125 = fieldNorm(doc=3263)
          0.04710078 = weight(abstract_txt:relevance in 3263) [ClassicSimilarity], result of:
            0.04710078 = score(doc=3263,freq=1.0), product of:
              0.12230167 = queryWeight, product of:
                1.6419803 = boost
                4.929532 = idf(docFreq=872, maxDocs=44421)
                0.015109801 = queryNorm
              0.38511968 = fieldWeight in 3263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.929532 = idf(docFreq=872, maxDocs=44421)
                0.078125 = fieldNorm(doc=3263)
          0.035046447 = weight(abstract_txt:retrieval in 3263) [ClassicSimilarity], result of:
            0.035046447 = score(doc=3263,freq=2.0), product of:
              0.09124241 = queryWeight, product of:
                1.7369838 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.015109801 = queryNorm
              0.3841026 = fieldWeight in 3263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.078125 = fieldNorm(doc=3263)
          0.11650843 = weight(abstract_txt:feedback in 3263) [ClassicSimilarity], result of:
            0.11650843 = score(doc=3263,freq=2.0), product of:
              0.17754504 = queryWeight, product of:
                1.9783633 = boost
                5.9394164 = idf(docFreq=317, maxDocs=44421)
                0.015109801 = queryNorm
              0.656219 = fieldWeight in 3263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9394164 = idf(docFreq=317, maxDocs=44421)
                0.078125 = fieldNorm(doc=3263)
          0.39605612 = weight(abstract_txt:extraction in 3263) [ClassicSimilarity], result of:
            0.39605612 = score(doc=3263,freq=2.0), product of:
              0.5789156 = queryWeight, product of:
                6.187568 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.015109801 = queryNorm
              0.6841345 = fieldWeight in 3263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.078125 = fieldNorm(doc=3263)
        0.24 = coord(6/25)
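
The content scores above combine, per matching term, a queryWeight (boost * idf * queryNorm) with a fieldWeight (tf * idf * fieldNorm), sum the products, and scale the sum by coord. As a rough check, the sketch below recomputes entry 4 (doc 4479) from the numbers printed in its explanation tree. The boost, idf, queryNorm and fieldNorm values are copied from the output, since the record does not show how they were derived; tf = sqrt(freq) is an assumption consistent with the printed tf values. The names `TERMS` and `term_score` are illustrative only.

```python
import math

QUERY_NORM = 0.015109801   # queryNorm as printed
FIELD_NORM = 0.109375      # fieldNorm(doc=4479) as printed
COORD = 4 / 25             # 4 of 25 query terms matched

# (term, boost, idf, termFreq) copied from the explanation for doc 4479
TERMS = [
    ("data", 1.1092665, 3.3302255, 1.0),
    ("retrieval", 1.7369838, 3.4765, 2.0),
    ("keyword", 5.0284233, 6.038507, 3.0),
    ("extraction", 6.187568, 6.192079, 1.0),
]

def term_score(boost, idf, freq):
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf * QUERY_NORM
    # fieldWeight = tf * idf * fieldNorm, with tf assumed to be sqrt(freq)
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight

total = sum(term_score(boost, idf, freq) for _, boost, idf, freq in TERMS)
score = COORD * total

print(round(score, 2))  # prints: 0.16
```

The coord factor explains why entry 4 ranks below entry 1 even though its term sum is larger (0.986 against 0.855): it matches only 4 of the 25 query terms, whereas entry 1 matches 7.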