no code implementations • 18 Mar 2021 • Aili Shen, Meladel Mistica, Bahar Salehi, Hang Li, Timothy Baldwin, Jianzhong Qi
While pretrained language models ("LM") have driven impressive gains over morpho-syntactic and semantic tasks, their ability to model discourse and pragmatic phenomena is less clear.
no code implementations • WS 2019 • Aili Shen, Daniel Beck, Bahar Salehi, Jianzhong Qi, Timothy Baldwin
In the context of document quality assessment, previous work has mainly focused on predicting the quality of a document relative to a putative gold standard, without paying attention to the subjectivity of this task.
no code implementations • WS 2019 • Navnita Nandakumar, Timothy Baldwin, Bahar Salehi
In this paper, we apply various embedding methods on multiword expressions to study how well they capture the nuances of non-compositional data.
no code implementations • 4 Jan 2019 • Aili Shen, Bahar Salehi, Timothy Baldwin, Jianzhong Qi
The quality of a document is affected by various factors, including grammaticality, readability, stylistics, and expertise depth, making the task of document quality assessment a complex one.
no code implementations • ALTA 2018 • Navnita Nandakumar, Bahar Salehi, Timothy Baldwin
In this paper, we perform a comparative evaluation of off-the-shelf embedding models over the task of compositionality prediction of multiword expressions ("MWEs").
no code implementations • WS 2017 • Bahar Salehi, Dirk Hovy, Eduard Hovy, Anders Søgaard
Geolocation is the task of identifying a social media user's primary location, and in natural language processing, there is a growing literature on to what extent automated analysis of social media posts can help.
no code implementations • WS 2017 • Bahar Salehi, Anders Søgaard
Recent work in geolocation has made several hypotheses about what linguistic markers are relevant to detect where people write from.
no code implementations • COLING 2016 • Bahar Salehi, Paul Cook, Timothy Baldwin
Much previous research on multiword expressions (MWEs) has focused on the token- and type-level tasks of MWE identification and extraction, respectively.