Document (#44015)

Author
Jiang, Y.
Meng, R.
Huang, Y.
Lu, W.
Liu, J.
Title
Generating keyphrases for readers : a controllable keyphrase generation framework
Source
Journal of the Association for Information Science and Technology. 74(2023) no.7, pp. 759-774
Year
2023
Abstract
With the wide application of keyphrases in many Information Retrieval (IR) and Natural Language Processing (NLP) tasks, automatic keyphrase prediction has been emerging. However, these statistically important phrases are contributing increasingly less to the related tasks because the end-to-end learning mechanism enables models to learn the important semantic information of the text directly. Similarly, keyphrases are of little help for readers to quickly grasp the paper's main idea because the relationship between the keyphrase and the paper is not explicit to readers. Therefore, we propose to generate keyphrases with specific functions for readers to bridge the semantic gap between them and the information producers, and verify the effectiveness of the keyphrase function for assisting users' comprehension with a user experiment. A controllable keyphrase generation framework (the CKPG) that uses the keyphrase function as a control code to generate categorized keyphrases is proposed and implemented based on Transformer, BART, and T5, respectively. For the Computer Science domain, the Macro-avgs of the three evaluation metrics on the Paper with Code dataset are up to 0.680, 0.535, and 0.558, respectively. Our experimental results indicate the effectiveness of the CKPG models.
Content
Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24749.
Theme
Automatic abstracting
Object
BART
T5

Similar documents (author)

  1. Meng, M.: ¬A conceptual framework for online education programs (1993) 1.36
    1.3557533 = sum of:
      1.3557533 = product of:
        4.06726 = sum of:
          4.06726 = weight(author_txt:meng in 7821) [ClassicSimilarity], result of:
            4.06726 = score(doc=7821,freq=1.0), product of:
              0.7060785 = queryWeight, product of:
                1.2837011 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.059678815 = queryNorm
              5.7603507 = fieldWeight in 7821, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.625 = fieldNorm(doc=7821)
        0.33333334 = coord(1/3)
    
  2. Meng, L.: ¬The creation of [the] Chinese Science Citation Database : status quo and future development (1997) 1.36
    1.3557533 = sum of:
      1.3557533 = product of:
        4.06726 = sum of:
          4.06726 = weight(author_txt:meng in 1954) [ClassicSimilarity], result of:
            4.06726 = score(doc=1954,freq=1.0), product of:
              0.7060785 = queryWeight, product of:
                1.2837011 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.059678815 = queryNorm
              5.7603507 = fieldWeight in 1954, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.625 = fieldNorm(doc=1954)
        0.33333334 = coord(1/3)
    
  3. Jiang, D.: ¬A feasibility study of the outsourcing of cataloging in the academic libraries (1998) 0.97
    0.9673432 = sum of:
      0.9673432 = product of:
        2.9020295 = sum of:
          2.9020295 = weight(author_txt:jiang in 5622) [ClassicSimilarity], result of:
            2.9020295 = score(doc=5622,freq=1.0), product of:
              0.5637929 = queryWeight, product of:
                1.1470892 = boost
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.059678815 = queryNorm
              5.1473327 = fieldWeight in 5622, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.625 = fieldNorm(doc=5622)
        0.33333334 = coord(1/3)
    
  4. Jiang, S.Y.: Lost in translation : the treatment of Chinese classics in the Library of Congress Classification (2007) 0.97
    0.9673432 = sum of:
      0.9673432 = product of:
        2.9020295 = sum of:
          2.9020295 = weight(author_txt:jiang in 1773) [ClassicSimilarity], result of:
            2.9020295 = score(doc=1773,freq=1.0), product of:
              0.5637929 = queryWeight, product of:
                1.1470892 = boost
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.059678815 = queryNorm
              5.1473327 = fieldWeight in 1773, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.625 = fieldNorm(doc=1773)
        0.33333334 = coord(1/3)
    
  5. Jiang, T.: Architektur und Anwendungen des kollaborativen Lernsystems K3 (2008) 0.97
    0.9673432 = sum of:
      0.9673432 = product of:
        2.9020295 = sum of:
          2.9020295 = weight(author_txt:jiang in 2391) [ClassicSimilarity], result of:
            2.9020295 = score(doc=2391,freq=1.0), product of:
              0.5637929 = queryWeight, product of:
                1.1470892 = boost
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.059678815 = queryNorm
              5.1473327 = fieldWeight in 2391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.625 = fieldNorm(doc=2391)
        0.33333334 = coord(1/3)
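
The score trees in the author list can be reproduced by hand. The sketch below recomputes the first entry (doc 7821, term "meng"), assuming the classic TF-IDF formulas that the printed numbers are consistent with: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. The function names are hypothetical and not part of any Lucene API; boost, queryNorm, and fieldNorm are copied from the listing rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming idf = 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq: float, doc_freq: int, max_docs: int,
                       boost: float, query_norm: float, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                      # tf(freq=1.0) = 1.0
    idf = classic_idf(doc_freq, max_docs)     # about 9.216561 for docFreq=11
    query_weight = boost * idf * query_norm   # about 0.7060785 in the listing
    field_weight = tf * idf * field_norm      # about 5.7603507 in the listing
    return query_weight * field_weight


# Inputs copied from the explanation for doc 7821 (values assumed exact as printed).
term_score = classic_term_score(
    freq=1.0, doc_freq=11, max_docs=44421,
    boost=1.2837011, query_norm=0.059678815, field_norm=0.625,
)

# coord(1/3): one of the three author query terms matched.
final_score = term_score * (1 / 3)

# Compare with 4.06726 and 1.3557533 in the listing; small differences are
# expected because this sketch uses 64-bit floats.
print(f"term score:  {term_score:.5f}")
print(f"final score: {final_score:.5f}")
```

The same function applied with docFreq=31 and boost=1.1470892 gives the term score shown for the "jiang" entries.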
    

Similar documents (content)

  1. Wu, Y.-f.B.; Li, Q.; Bot, R.S.; Chen, X.: Finding nuggets in documents : a machine learning approach (2006) 0.40
    0.39749056 = sum of:
      0.39749056 = product of:
        1.6562107 = sum of:
          0.014657404 = weight(abstract_txt:semantic in 290) [ClassicSimilarity], result of:
            0.014657404 = score(doc=290,freq=1.0), product of:
              0.052411847 = queryWeight, product of:
                1.2075366 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.00970022 = queryNorm
              0.27965823 = fieldWeight in 290, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.0625 = fieldNorm(doc=290)
          0.018754741 = weight(abstract_txt:because in 290) [ClassicSimilarity], result of:
            0.018754741 = score(doc=290,freq=1.0), product of:
              0.061773013 = queryWeight, product of:
                1.3109465 = boost
                4.8577175 = idf(docFreq=937, maxDocs=44421)
                0.00970022 = queryNorm
              0.30360734 = fieldWeight in 290, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8577175 = idf(docFreq=937, maxDocs=44421)
                0.0625 = fieldNorm(doc=290)
          0.040036876 = weight(abstract_txt:function in 290) [ClassicSimilarity], result of:
            0.040036876 = score(doc=290,freq=2.0), product of:
              0.081287086 = queryWeight, product of:
                1.5038216 = boost
                5.5724173 = idf(docFreq=458, maxDocs=44421)
                0.00970022 = queryNorm
              0.49253675 = fieldWeight in 290, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5724173 = idf(docFreq=458, maxDocs=44421)
                0.0625 = fieldNorm(doc=290)
          0.035839774 = weight(abstract_txt:generate in 290) [ClassicSimilarity], result of:
            0.035839774 = score(doc=290,freq=1.0), product of:
              0.09512652 = queryWeight, product of:
                1.6268082 = boost
                6.0281444 = idf(docFreq=290, maxDocs=44421)
                0.00970022 = queryNorm
              0.37675902 = fieldWeight in 290, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0281444 = idf(docFreq=290, maxDocs=44421)
                0.0625 = fieldNorm(doc=290)
          0.8985288 = weight(abstract_txt:keyphrases in 290) [ClassicSimilarity], result of:
            0.8985288 = score(doc=290,freq=7.0), product of:
              0.5781317 = queryWeight, product of:
                6.3411636 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.00970022 = queryNorm
              1.5541941 = fieldWeight in 290, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.0625 = fieldNorm(doc=290)
          0.6483931 = weight(abstract_txt:keyphrase in 290) [ClassicSimilarity], result of:
            0.6483931 = score(doc=290,freq=3.0), product of:
              0.6555669 = queryWeight, product of:
                7.396984 = boost
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.00970022 = queryNorm
              0.9890571 = fieldWeight in 290, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.0625 = fieldNorm(doc=290)
        0.24 = coord(6/25)
    
  2. Pirkola, A.: Constructing topic-specific search keyphrase suggestion tools for Web information retrieval (2010) 0.25
    0.25209245 = sum of:
      0.25209245 = product of:
        1.5755779 = sum of:
          0.023443425 = weight(abstract_txt:because in 665) [ClassicSimilarity], result of:
            0.023443425 = score(doc=665,freq=1.0), product of:
              0.061773013 = queryWeight, product of:
                1.3109465 = boost
                4.8577175 = idf(docFreq=937, maxDocs=44421)
                0.00970022 = queryNorm
              0.37950918 = fieldWeight in 665, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8577175 = idf(docFreq=937, maxDocs=44421)
                0.078125 = fieldNorm(doc=665)
          0.0063615213 = weight(abstract_txt:with in 665) [ClassicSimilarity], result of:
            0.0063615213 = score(doc=665,freq=1.0), product of:
              0.03262136 = queryWeight, product of:
                1.3472605 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.00970022 = queryNorm
              0.19501092 = fieldWeight in 665, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.078125 = fieldNorm(doc=665)
          0.7352815 = weight(abstract_txt:keyphrases in 665) [ClassicSimilarity], result of:
            0.7352815 = score(doc=665,freq=3.0), product of:
              0.5781317 = queryWeight, product of:
                6.3411636 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.00970022 = queryNorm
              1.2718236 = fieldWeight in 665, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.078125 = fieldNorm(doc=665)
          0.8104914 = weight(abstract_txt:keyphrase in 665) [ClassicSimilarity], result of:
            0.8104914 = score(doc=665,freq=3.0), product of:
              0.6555669 = queryWeight, product of:
                7.396984 = boost
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.00970022 = queryNorm
              1.2363214 = fieldWeight in 665, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.078125 = fieldNorm(doc=665)
        0.16 = coord(4/25)
    
  3. Zhang, Y.; Zhang, C.: Enhancing keyphrase extraction from microblogs using human reading time (2021) 0.23
    0.23104945 = sum of:
      0.23104945 = product of:
        1.1552472 = sum of:
          0.012211758 = weight(abstract_txt:important in 1238) [ClassicSimilarity], result of:
            0.012211758 = score(doc=1238,freq=1.0), product of:
              0.04640629 = queryWeight, product of:
                1.1362503 = boost
                4.21038 = idf(docFreq=1791, maxDocs=44421)
                0.00970022 = queryNorm
              0.26314875 = fieldWeight in 1238, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.21038 = idf(docFreq=1791, maxDocs=44421)
                0.0625 = fieldNorm(doc=1238)
          0.032333408 = weight(abstract_txt:models in 1238) [ClassicSimilarity], result of:
            0.032333408 = score(doc=1238,freq=4.0), product of:
              0.055950727 = queryWeight, product of:
                1.2476375 = boost
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.00970022 = queryNorm
              0.57789075 = fieldWeight in 1238, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.0625 = fieldNorm(doc=1238)
          0.022390176 = weight(abstract_txt:tasks in 1238) [ClassicSimilarity], result of:
            0.022390176 = score(doc=1238,freq=1.0), product of:
              0.069517866 = queryWeight, product of:
                1.3907009 = boost
                5.1532483 = idf(docFreq=697, maxDocs=44421)
                0.00970022 = queryNorm
              0.32207802 = fieldWeight in 1238, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1532483 = idf(docFreq=697, maxDocs=44421)
                0.0625 = fieldNorm(doc=1238)
          0.339612 = weight(abstract_txt:keyphrases in 1238) [ClassicSimilarity], result of:
            0.339612 = score(doc=1238,freq=1.0), product of:
              0.5781317 = queryWeight, product of:
                6.3411636 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.00970022 = queryNorm
              0.5874302 = fieldWeight in 1238, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.0625 = fieldNorm(doc=1238)
          0.74869984 = weight(abstract_txt:keyphrase in 1238) [ClassicSimilarity], result of:
            0.74869984 = score(doc=1238,freq=4.0), product of:
              0.6555669 = queryWeight, product of:
                7.396984 = boost
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.00970022 = queryNorm
              1.1420648 = fieldWeight in 1238, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.0625 = fieldNorm(doc=1238)
        0.2 = coord(5/25)
    
  4. Daudaravicius, V.: ¬A framework for keyphrase extraction from scientific journals (2016) 0.21
    0.21154548 = sum of:
      0.21154548 = product of:
        1.3221593 = sum of:
          0.032574825 = weight(abstract_txt:framework in 3930) [ClassicSimilarity], result of:
            0.032574825 = score(doc=3930,freq=2.0), product of:
              0.054064058 = queryWeight, product of:
                1.2264218 = boost
                4.5445113 = idf(docFreq=1282, maxDocs=44421)
                0.00970022 = queryNorm
              0.60252273 = fieldWeight in 3930, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5445113 = idf(docFreq=1282, maxDocs=44421)
                0.09375 = fieldNorm(doc=3930)
          0.007633826 = weight(abstract_txt:with in 3930) [ClassicSimilarity], result of:
            0.007633826 = score(doc=3930,freq=1.0), product of:
              0.03262136 = queryWeight, product of:
                1.3472605 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.00970022 = queryNorm
              0.23401311 = fieldWeight in 3930, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.09375 = fieldNorm(doc=3930)
          0.72042584 = weight(abstract_txt:keyphrases in 3930) [ClassicSimilarity], result of:
            0.72042584 = score(doc=3930,freq=2.0), product of:
              0.5781317 = queryWeight, product of:
                6.3411636 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.00970022 = queryNorm
              1.2461276 = fieldWeight in 3930, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.09375 = fieldNorm(doc=3930)
          0.56152487 = weight(abstract_txt:keyphrase in 3930) [ClassicSimilarity], result of:
            0.56152487 = score(doc=3930,freq=1.0), product of:
              0.6555669 = queryWeight, product of:
                7.396984 = boost
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.00970022 = queryNorm
              0.8565486 = fieldWeight in 3930, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.09375 = fieldNorm(doc=3930)
        0.16 = coord(4/25)
    
  5. Medelyan, O.; Witten, I.H.: Domain-independent automatic keyphrase indexing with small training sets (2008) 0.19
    0.18904722 = sum of:
      0.18904722 = product of:
        0.9452361 = sum of:
          0.018321754 = weight(abstract_txt:semantic in 2871) [ClassicSimilarity], result of:
            0.018321754 = score(doc=2871,freq=1.0), product of:
              0.052411847 = queryWeight, product of:
                1.2075366 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.00970022 = queryNorm
              0.34957278 = fieldWeight in 2871, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.078125 = fieldNorm(doc=2871)
          0.023443425 = weight(abstract_txt:because in 2871) [ClassicSimilarity], result of:
            0.023443425 = score(doc=2871,freq=1.0), product of:
              0.061773013 = queryWeight, product of:
                1.3109465 = boost
                4.8577175 = idf(docFreq=937, maxDocs=44421)
                0.00970022 = queryNorm
              0.37950918 = fieldWeight in 2871, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8577175 = idf(docFreq=937, maxDocs=44421)
                0.078125 = fieldNorm(doc=2871)
          0.011018479 = weight(abstract_txt:with in 2871) [ClassicSimilarity], result of:
            0.011018479 = score(doc=2871,freq=3.0), product of:
              0.03262136 = queryWeight, product of:
                1.3472605 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.00970022 = queryNorm
              0.33776882 = fieldWeight in 2871, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.078125 = fieldNorm(doc=2871)
          0.424515 = weight(abstract_txt:keyphrases in 2871) [ClassicSimilarity], result of:
            0.424515 = score(doc=2871,freq=1.0), product of:
              0.5781317 = queryWeight, product of:
                6.3411636 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.00970022 = queryNorm
              0.73428774 = fieldWeight in 2871, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.078125 = fieldNorm(doc=2871)
          0.46793744 = weight(abstract_txt:keyphrase in 2871) [ClassicSimilarity], result of:
            0.46793744 = score(doc=2871,freq=1.0), product of:
              0.6555669 = queryWeight, product of:
                7.396984 = boost
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.00970022 = queryNorm
              0.71379054 = fieldWeight in 2871, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.078125 = fieldNorm(doc=2871)
        0.2 = coord(5/25)
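
For the content-based matches, several abstract terms contribute to one score. As a minimal sketch, the code below recombines the six term weights listed for the first content match (doc 290): each term contributes sqrt(freq) * idf^2 * boost * queryNorm * fieldNorm, the contributions are summed, and the sum is scaled by coord(6/25). The helper name and the tuple layout are hypothetical; all numbers are copied from the listing, and the formula is an assumption checked only against those printed values.

```python
import math

QUERY_NORM = 0.00970022   # queryNorm shared by all content-query terms
FIELD_NORM = 0.0625       # fieldNorm(doc=290)

# (term, freq, boost, idf) as printed in the explanation for doc 290.
TERMS_DOC_290 = [
    ("semantic",   1.0, 1.2075366, 4.4745317),
    ("because",    1.0, 1.3109465, 4.8577175),
    ("function",   2.0, 1.5038216, 5.5724173),
    ("generate",   1.0, 1.6268082, 6.0281444),
    ("keyphrases", 7.0, 6.3411636, 9.398883),
    ("keyphrase",  3.0, 7.396984,  9.1365185),
]


def term_weight(freq: float, boost: float, idf: float) -> float:
    """queryWeight * fieldWeight for one term, with tf = sqrt(freq)."""
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight


weights = {term: term_weight(freq, boost, idf)
           for term, freq, boost, idf in TERMS_DOC_290}

raw_sum = sum(weights.values())   # listed as 1.6562107
coord = len(weights) / 25         # 6 of the 25 query terms matched
final_score = raw_sum * coord     # listed as 0.39749056

for term, weight in weights.items():
    print(f"{term:<11} {weight:.6f}")
print(f"sum = {raw_sum:.6f}, coord = {coord:.2f}, score = {final_score:.6f}")
```

In this entry the two rare terms "keyphrases" and "keyphrase" account for nearly all of the sum, which is why every content match in the list is a keyphrase paper.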