Document (#39919)

Author
Rayson, P.
Piao, S.
Sharoff, S.
Evert, S.
Moirón, B.V.
Title
Multiword expressions : hard going or plain sailing?
Source
Language resources and evaluation. 44(2010) no.1, pp. 1-5
Year
2010
Abstract
Over the past two decades or so, Multi-Word Expressions (MWEs; also called Multi-word Units) have been an increasingly important concern for Computational Linguistics and Natural Language Processing (NLP). The term MWE has been used to refer to various types of linguistic units and expressions, including idioms, noun compounds, phrasal verbs, light verbs and other habitual collocations. Although there is no universally agreed definition of MWE as yet, most researchers use the term to refer to frequently occurring phrasal units that are subject to a certain level of semantic opaqueness, or non-compositionality. Non-compositional MWEs pose tough challenges for automatic analysis because their interpretation cannot be achieved by directly combining the semantics of their constituents, thereby causing the "pain in the neck of NLP".
Theme
Computational linguistics

Similar documents (content)

  1. Cruys, T. van de; Moirón, B.V.: Semantics-based multiword expression extraction (2007) 0.23
    0.23438668 = sum of:
      0.23438668 = product of:
        1.1719334 = sum of:
          0.1528324 = weight(abstract_txt:noun in 3919) [ClassicSimilarity], result of:
            0.1528324 = score(doc=3919,freq=3.0), product of:
              0.12113859 = queryWeight, product of:
                1.042489 = boost
                7.769642 = idf(docFreq=50, maxDocs=44421)
                0.014955812 = queryNorm
              1.2616327 = fieldWeight in 3919, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.769642 = idf(docFreq=50, maxDocs=44421)
                0.09375 = fieldNorm(doc=3919)
          0.12205225 = weight(abstract_txt:multiword in 3919) [ClassicSimilarity], result of:
            0.12205225 = score(doc=3919,freq=1.0), product of:
              0.15038684 = queryWeight, product of:
                1.1615427 = boost
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.014955812 = queryNorm
              0.81158864 = fieldWeight in 3919, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.09375 = fieldNorm(doc=3919)
          0.24701482 = weight(abstract_txt:compositionality in 3919) [ClassicSimilarity], result of:
            0.24701482 = score(doc=3919,freq=2.0), product of:
              0.19097857 = queryWeight, product of:
                1.3089485 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.014955812 = queryNorm
              1.2934164 = fieldWeight in 3919, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.09375 = fieldNorm(doc=3919)
          0.4740197 = weight(abstract_txt:mwes in 3919) [ClassicSimilarity], result of:
            0.4740197 = score(doc=3919,freq=2.0), product of:
              0.37157252 = queryWeight, product of:
                2.5820642 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.014955812 = queryNorm
              1.2757125 = fieldWeight in 3919, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.09375 = fieldNorm(doc=3919)
          0.17601421 = weight(abstract_txt:expressions in 3919) [ClassicSimilarity], result of:
            0.17601421 = score(doc=3919,freq=1.0), product of:
              0.2768545 = queryWeight, product of:
                2.7297108 = boost
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.014955812 = queryNorm
              0.63576436 = fieldWeight in 3919, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.09375 = fieldNorm(doc=3919)
        0.2 = coord(5/25)
    
  2. Snajder, J.; Almic, P.: Modeling semantic compositionality of Croatian multiword expressions (2015) 0.20
    0.1964722 = sum of:
      0.1964722 = product of:
        1.2279513 = sum of:
          0.12205225 = weight(abstract_txt:multiword in 3920) [ClassicSimilarity], result of:
            0.12205225 = score(doc=3920,freq=1.0), product of:
              0.15038684 = queryWeight, product of:
                1.1615427 = boost
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.014955812 = queryNorm
              0.81158864 = fieldWeight in 3920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.09375 = fieldNorm(doc=3920)
          0.34933168 = weight(abstract_txt:compositionality in 3920) [ClassicSimilarity], result of:
            0.34933168 = score(doc=3920,freq=4.0), product of:
              0.19097857 = queryWeight, product of:
                1.3089485 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.014955812 = queryNorm
              1.8291669 = fieldWeight in 3920, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.09375 = fieldNorm(doc=3920)
          0.5805532 = weight(abstract_txt:mwes in 3920) [ClassicSimilarity], result of:
            0.5805532 = score(doc=3920,freq=3.0), product of:
              0.37157252 = queryWeight, product of:
                2.5820642 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.014955812 = queryNorm
              1.5624223 = fieldWeight in 3920, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.09375 = fieldNorm(doc=3920)
          0.17601421 = weight(abstract_txt:expressions in 3920) [ClassicSimilarity], result of:
            0.17601421 = score(doc=3920,freq=1.0), product of:
              0.2768545 = queryWeight, product of:
                2.7297108 = boost
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.014955812 = queryNorm
              0.63576436 = fieldWeight in 3920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.09375 = fieldNorm(doc=3920)
        0.16 = coord(4/25)
    
  3. Nagy T., I.: Detecting multiword expressions and named entities in natural language texts (2014) 0.19
    0.19093242 = sum of:
      0.19093242 = product of:
        0.7955518 = sum of:
          0.036765758 = weight(abstract_txt:noun in 2536) [ClassicSimilarity], result of:
            0.036765758 = score(doc=2536,freq=1.0), product of:
              0.12113859 = queryWeight, product of:
                1.042489 = boost
                7.769642 = idf(docFreq=50, maxDocs=44421)
                0.014955812 = queryNorm
              0.30350164 = fieldWeight in 2536, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.769642 = idf(docFreq=50, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
          0.11557497 = weight(abstract_txt:compounds in 2536) [ClassicSimilarity], result of:
            0.11557497 = score(doc=2536,freq=6.0), product of:
              0.14305803 = queryWeight, product of:
                1.1328864 = boost
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.014955812 = queryNorm
              0.80788875 = fieldWeight in 2536, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
          0.1686673 = weight(abstract_txt:multiword in 2536) [ClassicSimilarity], result of:
            0.1686673 = score(doc=2536,freq=11.0), product of:
              0.15038684 = queryWeight, product of:
                1.1615427 = boost
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.014955812 = queryNorm
              1.1215563 = fieldWeight in 2536, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
          0.025211852 = weight(abstract_txt:word in 2536) [ClassicSimilarity], result of:
            0.025211852 = score(doc=2536,freq=1.0), product of:
              0.11868613 = queryWeight, product of:
                1.4593022 = boost
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.014955812 = queryNorm
              0.21242458 = fieldWeight in 2536, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
          0.24189718 = weight(abstract_txt:mwes in 2536) [ClassicSimilarity], result of:
            0.24189718 = score(doc=2536,freq=3.0), product of:
              0.37157252 = queryWeight, product of:
                2.5820642 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.014955812 = queryNorm
              0.6510093 = fieldWeight in 2536, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
          0.20743474 = weight(abstract_txt:expressions in 2536) [ClassicSimilarity], result of:
            0.20743474 = score(doc=2536,freq=8.0), product of:
              0.2768545 = queryWeight, product of:
                2.7297108 = boost
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.014955812 = queryNorm
              0.7492555 = fieldWeight in 2536, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2536)
        0.24 = coord(6/25)
    
  4. Dias, G.: Multiword unit hybrid extraction (n.d.) 0.17
    0.17460468 = sum of:
      0.17460468 = product of:
        0.8730234 = sum of:
          0.17616725 = weight(abstract_txt:multiword in 1643) [ClassicSimilarity], result of:
            0.17616725 = score(doc=1643,freq=3.0), product of:
              0.15038684 = queryWeight, product of:
                1.1615427 = boost
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.014955812 = queryNorm
              1.1714272 = fieldWeight in 1643, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.656945 = idf(docFreq=20, maxDocs=44421)
                0.078125 = fieldNorm(doc=1643)
          0.050423704 = weight(abstract_txt:word in 1643) [ClassicSimilarity], result of:
            0.050423704 = score(doc=1643,freq=1.0), product of:
              0.11868613 = queryWeight, product of:
                1.4593022 = boost
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.014955812 = queryNorm
              0.42484915 = fieldWeight in 1643, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.078125 = fieldNorm(doc=1643)
          0.1751488 = weight(abstract_txt:verbs in 1643) [ClassicSimilarity], result of:
            0.1751488 = score(doc=1643,freq=1.0), product of:
              0.2722168 = queryWeight, product of:
                2.210053 = boost
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.014955812 = queryNorm
              0.6434166 = fieldWeight in 1643, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.078125 = fieldNorm(doc=1643)
          0.18017386 = weight(abstract_txt:units in 1643) [ClassicSimilarity], result of:
            0.18017386 = score(doc=1643,freq=2.0), product of:
              0.2520336 = queryWeight, product of:
                2.6044743 = boost
                6.470359 = idf(docFreq=186, maxDocs=44421)
                0.014955812 = queryNorm
              0.71488035 = fieldWeight in 1643, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.470359 = idf(docFreq=186, maxDocs=44421)
                0.078125 = fieldNorm(doc=1643)
          0.29110974 = weight(abstract_txt:phrasal in 1643) [ClassicSimilarity], result of:
            0.29110974 = score(doc=1643,freq=1.0), product of:
              0.38195714 = queryWeight, product of:
                2.617897 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.014955812 = queryNorm
              0.7621529 = fieldWeight in 1643, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.078125 = fieldNorm(doc=1643)
        0.2 = coord(5/25)
    
  5. Kiela, D.; Clark, S.: Detecting compositionality of multi-word expressions using nearest neighbours in vector space models (2013) 0.12
    0.11533885 = sum of:
      0.11533885 = product of:
        0.7208678 = sum of:
          0.3529518 = weight(abstract_txt:compositionality in 2161) [ClassicSimilarity], result of:
            0.3529518 = score(doc=2161,freq=3.0), product of:
              0.19097857 = queryWeight, product of:
                1.3089485 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.014955812 = queryNorm
              1.8481225 = fieldWeight in 2161, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.109375 = fieldNorm(doc=2161)
          0.070593186 = weight(abstract_txt:word in 2161) [ClassicSimilarity], result of:
            0.070593186 = score(doc=2161,freq=1.0), product of:
              0.11868613 = queryWeight, product of:
                1.4593022 = boost
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.014955812 = queryNorm
              0.59478885 = fieldWeight in 2161, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.109375 = fieldNorm(doc=2161)
          0.09197291 = weight(abstract_txt:multi in 2161) [ClassicSimilarity], result of:
            0.09197291 = score(doc=2161,freq=1.0), product of:
              0.14157875 = queryWeight, product of:
                1.5938383 = boost
                5.9394164 = idf(docFreq=317, maxDocs=44421)
                0.014955812 = queryNorm
              0.6496237 = fieldWeight in 2161, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9394164 = idf(docFreq=317, maxDocs=44421)
                0.109375 = fieldNorm(doc=2161)
          0.20534992 = weight(abstract_txt:expressions in 2161) [ClassicSimilarity], result of:
            0.20534992 = score(doc=2161,freq=1.0), product of:
              0.2768545 = queryWeight, product of:
                2.7297108 = boost
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.014955812 = queryNorm
              0.7417251 = fieldWeight in 2161, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.109375 = fieldNorm(doc=2161)
        0.16 = coord(4/25)
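
The score breakdowns above all follow the same pattern: each matching term contributes queryWeight × fieldWeight, the contributions are summed, and the sum is scaled by the coord factor. As a minimal sketch, the arithmetic for the first similar document (doc=3919) can be reproduced from the values printed in its breakdown. The names `idf`, `term_score` and `TERMS_DOC_3919` are illustrative and not part of the catalog or of Lucene. The idf formula used, 1 + ln(maxDocs / (docFreq + 1)), is assumed from Lucene's classic TF-IDF similarity and agrees with the idf values listed. The listing appears to come from single-precision arithmetic, so the last digits may differ slightly.

```python
import math

# Constants copied from the explain output above.
MAX_DOCS = 44421
QUERY_NORM = 0.014955812


def idf(doc_freq, max_docs=MAX_DOCS):
    """Classic TF-IDF inverse document frequency (assumed formula)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq, field_norm, query_norm=QUERY_NORM):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    # tf is the square root of the raw term frequency.
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (boost, docFreq, termFreq, fieldNorm) for each matching term in doc 3919,
# copied from the first breakdown in the list above.
TERMS_DOC_3919 = {
    "noun": (1.042489, 50, 3.0, 0.09375),
    "multiword": (1.1615427, 20, 1.0, 0.09375),
    "compositionality": (1.3089485, 6, 2.0, 0.09375),
    "mwes": (2.5820642, 7, 2.0, 0.09375),
    "expressions": (2.7297108, 136, 1.0, 0.09375),
}

# coord(5/25): 5 of the 25 query terms matched this document.
coord = len(TERMS_DOC_3919) / 25

total = coord * sum(term_score(*args) for args in TERMS_DOC_3919.values())
print(f"{total:.6f}")
```

The printed value should land close to the 0.23438668 shown for the first entry. Substituting the term tuples and coord factor of any other entry reproduces its score in the same way.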