Document (#39804)

Author
Zadeh, B.Q.
Handschuh, S.
Title
¬The ACL RD-TEC : a dataset for benchmarking terminology extraction and classification in computational linguistics
Source
Proceedings of the 4th International Workshop on Computational Terminology, Dublin, Ireland, August 23, 2014. COLING 2014. Eds.: Patrick Drouin et al., Dublin, Ireland, 2014-08-23 [https://www.deri.ie/sites/default/files/publications/the-acl-rd-tec.pdf]
Year
2014
Pages
pp. 52-63
Abstract
This paper introduces ACL RD-TEC: a dataset for evaluating the extraction and classification of terms from literature in the domain of computational linguistics. The dataset is derived from the Association for Computational Linguistics anthology reference corpus (ACL ARC). In its first release, the ACL RD-TEC consists of automatically segmented, part-of-speech-tagged ACL ARC documents, three lists of candidate terms, and more than 82,000 manually annotated terms. The annotated terms are marked as either valid or invalid, and valid terms are further classified as technology and non-technology terms. Technology terms signify methods, algorithms, and solutions in computational linguistics. The paper describes the dataset and reports the relevant statistics. We hope the step described in this paper encourages a collaborative effort towards building a full-fledged annotated corpus from the computational linguistics literature.
Content
For the corpus, see: http://acl-arc.comp.nus.edu.sg/.
Theme
Computational linguistics
Object
ACL Anthology Reference Corpus

Similar documents (content)

  1. Teich, E.; Degaetano-Ortlieb, S.; Fankhauser, P.; Kermes, H.; Lapshinova-Koltunski, E.: ¬The linguistic construal of disciplinarity : a data-mining approach using register features (2016) 0.25
    0.24761897 = sum of:
      0.24761897 = product of:
        0.7738093 = sum of:
          0.05191755 = weight(abstract_txt:speech in 4015) [ClassicSimilarity], result of:
            0.05191755 = score(doc=4015,freq=1.0), product of:
              0.096687004 = queryWeight, product of:
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.0140673425 = queryNorm
              0.53696513 = fieldWeight in 4015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.020346966 = weight(abstract_txt:classification in 4015) [ClassicSimilarity], result of:
            0.020346966 = score(doc=4015,freq=1.0), product of:
              0.06523817 = queryWeight, product of:
                1.1616675 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0140673425 = queryNorm
              0.31188744 = fieldWeight in 4015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.010078845 = weight(abstract_txt:from in 4015) [ClassicSimilarity], result of:
            0.010078845 = score(doc=4015,freq=1.0), product of:
              0.046752647 = queryWeight, product of:
                1.2044247 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0140673425 = queryNorm
              0.21557805 = fieldWeight in 4015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.10225152 = weight(abstract_txt:corpus in 4015) [ClassicSimilarity], result of:
            0.10225152 = score(doc=4015,freq=2.0), product of:
              0.15191658 = queryWeight, product of:
                1.772693 = boost
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.0140673425 = queryNorm
              0.6730768 = fieldWeight in 4015, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.07592513 = weight(abstract_txt:extraction in 4015) [ClassicSimilarity], result of:
            0.07592513 = score(doc=4015,freq=1.0), product of:
              0.15694916 = queryWeight, product of:
                1.801816 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.0140673425 = queryNorm
              0.48375618 = fieldWeight in 4015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.07400904 = weight(abstract_txt:terms in 4015) [ClassicSimilarity], result of:
            0.07400904 = score(doc=4015,freq=1.0), product of:
              0.23426881 = queryWeight, product of:
                4.11834 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.0140673425 = queryNorm
              0.31591502 = fieldWeight in 4015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.20467137 = weight(abstract_txt:computational in 4015) [ClassicSimilarity], result of:
            0.20467137 = score(doc=4015,freq=1.0), product of:
              0.41259128 = queryWeight, product of:
                4.6191382 = boost
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.0140673425 = queryNorm
              0.49606323 = fieldWeight in 4015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.23460887 = weight(abstract_txt:linguistics in 4015) [ClassicSimilarity], result of:
            0.23460887 = score(doc=4015,freq=1.0), product of:
              0.4519027 = queryWeight, product of:
                4.8341866 = boost
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.0140673425 = queryNorm
              0.51915795 = fieldWeight in 4015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
        0.32 = coord(8/25)
    
  2. Radev, D.R.; Joseph, M.T.; Gibson, B.; Muthukrishnan, P.: ¬A bibliometric and network analysis of the field of computational linguistics (2016) 0.19
    0.1905099 = sum of:
      0.1905099 = product of:
        0.7937913 = sum of:
          0.012094612 = weight(abstract_txt:from in 3764) [ClassicSimilarity], result of:
            0.012094612 = score(doc=3764,freq=1.0), product of:
              0.046752647 = queryWeight, product of:
                1.2044247 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0140673425 = queryNorm
              0.25869364 = fieldWeight in 3764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.09375 = fieldNorm(doc=3764)
          0.13957305 = weight(abstract_txt:anthology in 3764) [ClassicSimilarity], result of:
            0.13957305 = score(doc=3764,freq=1.0), product of:
              0.165541 = queryWeight, product of:
                1.3084849 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.0140673425 = queryNorm
              0.8431329 = fieldWeight in 3764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.09375 = fieldNorm(doc=3764)
          0.02387718 = weight(abstract_txt:paper in 3764) [ClassicSimilarity], result of:
            0.02387718 = score(doc=3764,freq=1.0), product of:
              0.073575564 = queryWeight, product of:
                1.5109266 = boost
                3.4616103 = idf(docFreq=3788, maxDocs=44421)
                0.0140673425 = queryNorm
              0.32452595 = fieldWeight in 3764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4616103 = idf(docFreq=3788, maxDocs=44421)
                0.09375 = fieldNorm(doc=3764)
          0.09111015 = weight(abstract_txt:extraction in 3764) [ClassicSimilarity], result of:
            0.09111015 = score(doc=3764,freq=1.0), product of:
              0.15694916 = queryWeight, product of:
                1.801816 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.0140673425 = queryNorm
              0.5805074 = fieldWeight in 3764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.09375 = fieldNorm(doc=3764)
          0.24560563 = weight(abstract_txt:computational in 3764) [ClassicSimilarity], result of:
            0.24560563 = score(doc=3764,freq=1.0), product of:
              0.41259128 = queryWeight, product of:
                4.6191382 = boost
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.0140673425 = queryNorm
              0.5952759 = fieldWeight in 3764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.09375 = fieldNorm(doc=3764)
          0.28153065 = weight(abstract_txt:linguistics in 3764) [ClassicSimilarity], result of:
            0.28153065 = score(doc=3764,freq=1.0), product of:
              0.4519027 = queryWeight, product of:
                4.8341866 = boost
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.0140673425 = queryNorm
              0.62298954 = fieldWeight in 3764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.09375 = fieldNorm(doc=3764)
        0.24 = coord(6/25)
    
  3. Bird, S.; Dale, R.; Dorr, B.; Gibson, B.; Joseph, M.; Kan, M.-Y.; Lee, D.; Powley, B.; Radev, D.; Tan, Y.F.: ¬The ACL Anthology Reference Corpus : a reference dataset for bibliographic research in computational linguistics (2008) 0.16
    0.15908419 = sum of:
      0.15908419 = product of:
        0.79542094 = sum of:
          0.010078845 = weight(abstract_txt:from in 3804) [ClassicSimilarity], result of:
            0.010078845 = score(doc=3804,freq=1.0), product of:
              0.046752647 = queryWeight, product of:
                1.2044247 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0140673425 = queryNorm
              0.21557805 = fieldWeight in 3804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.078125 = fieldNorm(doc=3804)
          0.20145634 = weight(abstract_txt:anthology in 3804) [ClassicSimilarity], result of:
            0.20145634 = score(doc=3804,freq=3.0), product of:
              0.165541 = queryWeight, product of:
                1.3084849 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.0140673425 = queryNorm
              1.2169574 = fieldWeight in 3804, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.078125 = fieldNorm(doc=3804)
          0.14460549 = weight(abstract_txt:corpus in 3804) [ClassicSimilarity], result of:
            0.14460549 = score(doc=3804,freq=4.0), product of:
              0.15191658 = queryWeight, product of:
                1.772693 = boost
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.0140673425 = queryNorm
              0.9518743 = fieldWeight in 3804, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.078125 = fieldNorm(doc=3804)
          0.20467137 = weight(abstract_txt:computational in 3804) [ClassicSimilarity], result of:
            0.20467137 = score(doc=3804,freq=1.0), product of:
              0.41259128 = queryWeight, product of:
                4.6191382 = boost
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.0140673425 = queryNorm
              0.49606323 = fieldWeight in 3804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.078125 = fieldNorm(doc=3804)
          0.23460887 = weight(abstract_txt:linguistics in 3804) [ClassicSimilarity], result of:
            0.23460887 = score(doc=3804,freq=1.0), product of:
              0.4519027 = queryWeight, product of:
                4.8341866 = boost
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.0140673425 = queryNorm
              0.51915795 = fieldWeight in 3804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.078125 = fieldNorm(doc=3804)
        0.2 = coord(5/25)
    
  4. Engerer, V.: Exploring interdisciplinary relationships between linguistics and information retrieval from the 1960s to today (2017) 0.11
    0.11216495 = sum of:
      0.11216495 = product of:
        0.70103097 = sum of:
          0.01140291 = weight(abstract_txt:from in 4434) [ClassicSimilarity], result of:
            0.01140291 = score(doc=4434,freq=2.0), product of:
              0.046752647 = queryWeight, product of:
                1.2044247 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0140673425 = queryNorm
              0.2438987 = fieldWeight in 4434, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0625 = fieldNorm(doc=4434)
          0.10620984 = weight(abstract_txt:fledged in 4434) [ClassicSimilarity], result of:
            0.10620984 = score(doc=4434,freq=1.0), product of:
              0.1808042 = queryWeight, product of:
                1.3674775 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.0140673425 = queryNorm
              0.5874302 = fieldWeight in 4434, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.0625 = fieldNorm(doc=4434)
          0.16373709 = weight(abstract_txt:computational in 4434) [ClassicSimilarity], result of:
            0.16373709 = score(doc=4434,freq=1.0), product of:
              0.41259128 = queryWeight, product of:
                4.6191382 = boost
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.0140673425 = queryNorm
              0.3968506 = fieldWeight in 4434, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.0625 = fieldNorm(doc=4434)
          0.41968113 = weight(abstract_txt:linguistics in 4434) [ClassicSimilarity], result of:
            0.41968113 = score(doc=4434,freq=5.0), product of:
              0.4519027 = queryWeight, product of:
                4.8341866 = boost
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.0140673425 = queryNorm
              0.928698 = fieldWeight in 4434, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.0625 = fieldNorm(doc=4434)
        0.16 = coord(4/25)
    
  5. Ramisch, C.; Villavicencio, A.; Kordoni, V.: Introduction to the special issue on multiword expressions : from theory to practice and use (2013) 0.11
    0.10931718 = sum of:
      0.10931718 = product of:
        0.6832324 = sum of:
          0.05191755 = weight(abstract_txt:speech in 2124) [ClassicSimilarity], result of:
            0.05191755 = score(doc=2124,freq=1.0), product of:
              0.096687004 = queryWeight, product of:
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.0140673425 = queryNorm
              0.53696513 = fieldWeight in 2124, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.078125 = fieldNorm(doc=2124)
          0.010078845 = weight(abstract_txt:from in 2124) [ClassicSimilarity], result of:
            0.010078845 = score(doc=2124,freq=1.0), product of:
              0.046752647 = queryWeight, product of:
                1.2044247 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0140673425 = queryNorm
              0.21557805 = fieldWeight in 2124, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.078125 = fieldNorm(doc=2124)
          0.28944904 = weight(abstract_txt:computational in 2124) [ClassicSimilarity], result of:
            0.28944904 = score(doc=2124,freq=2.0), product of:
              0.41259128 = queryWeight, product of:
                4.6191382 = boost
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.0140673425 = queryNorm
              0.7015394 = fieldWeight in 2124, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.078125 = fieldNorm(doc=2124)
          0.33178702 = weight(abstract_txt:linguistics in 2124) [ClassicSimilarity], result of:
            0.33178702 = score(doc=2124,freq=2.0), product of:
              0.4519027 = queryWeight, product of:
                4.8341866 = boost
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.0140673425 = queryNorm
              0.7342002 = fieldWeight in 2124, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.078125 = fieldNorm(doc=2124)
        0.16 = coord(4/25)
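
The score breakdowns above follow a TF-IDF pattern: each matching term contributes queryWeight times fieldWeight, the contributions are summed, and the sum is scaled by the coord factor (matched query terms over total query terms). The Python sketch below recomputes the score of the first entry (doc 4015) from the listed parameters. It is an illustration only, not the catalog's own code: the function names are hypothetical, and the idf formula 1 + ln(maxDocs / (docFreq + 1)) is an assumption chosen because it reproduces the idf values shown in the listing.

```python
import math


def idf(doc_freq, max_docs):
    # Assumed idf formula; it reproduces the listed values, e.g. 6.8731537 for docFreq=124.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    # tf is the square root of the term frequency, as in the listing (freq=2.0 -> 1.4142135).
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm  # "queryWeight" in the listing
    field_weight = tf * term_idf * field_norm     # "fieldWeight" in the listing
    return query_weight * field_weight


def document_score(term_scores, matched, total):
    # Sum of the term scores, scaled by coord(matched/total).
    return sum(term_scores) * (matched / total)


# Parameters copied from the first entry above (doc 4015).
MAX_DOCS = 44421
QUERY_NORM = 0.0140673425
FIELD_NORM = 0.078125

# (term, freq, docFreq, boost); "speech" has no boost line in the listing, so 1.0 is used.
TERMS_4015 = [
    ("speech", 1, 124, 1.0),
    ("classification", 1, 2228, 1.1616675),
    ("from", 1, 7646, 1.2044247),
    ("corpus", 2, 272, 1.772693),
    ("extraction", 1, 246, 1.801816),
    ("terms", 1, 2116, 4.11834),
    ("computational", 1, 210, 4.6191382),
    ("linguistics", 1, 156, 4.8341866),
]

scores = [
    term_score(freq, doc_freq, MAX_DOCS, QUERY_NORM, FIELD_NORM, boost)
    for _, freq, doc_freq, boost in TERMS_4015
]
total = document_score(scores, matched=len(scores), total=25)
print(round(total, 4))  # 0.2476, in line with the listed 0.24761897
```

Small differences in the last digits are expected, since the listing appears to use single-precision values while Python computes in double precision.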