When the whole is greater than the sum of its parts: Multiword expressions and idiomaticity

WS 2019  ·  Aline Villavicencio ·

Multiword expressions (MWEs) feature prominently in the mental lexicon of native speakers (Jackendoff, 1997) in all languages and domains, from informal to technical contexts (Biber et al., 1999) with about four MWEs being produced per minute of discourse (Glucksberg, 1989). MWEs come in all shapes and forms, including idioms like rock the boat (as cause problems or disturb a situation) and compound nouns like monkey business (as dishonest behaviour). Their accurate detection and understanding may often require more than knowledge about individual words and how they can be combined (Fillmore, 1979), as they may display various degrees of idiosyncrasy, including lexical, syntactic, semantic and statistical (Sag et al., 2002; Baldwin and Kim, 2010), which provide new challenges and opportunities for language processing (Constant et al., 2017). For instance, while for some combinations the meaning can be inferred from their parts like olive oil (oil made of olives) this is not always the case, as in dark horse (meaning an unknown candidate who unexpectedly succeeds), and when processing a sentence some of the challenges are to identify which words form an expression (Ramisch, 2015), and whether the expression is idiomatic (Cordeiro et al., 2019). In this talk I will give an overview of advances on the identification and treatment of multiword expressions, in particular concentrating on techniques for identifying their degree of idiomaticity.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here