no code implementations • 19 Oct 2023 • Songbo Hu, Han Zhou, Moy Yuan, Milan Gritta, Guchun Zhang, Ignacio Iacobacci, Anna Korhonen, Ivan Vulić
Achieving robust language technologies that can perform well across the world's many languages is a central goal of multilingual NLP.
1 code implementation • 26 Jul 2023 • Songbo Hu, Han Zhou, Mete Hergul, Milan Gritta, Guchun Zhang, Ignacio Iacobacci, Ivan Vulić, Anna Korhonen
Creating high-quality annotated data for task-oriented dialog (ToD) is notoriously difficult, and the challenges are amplified when the goal is to create equitable, culturally adapted, and large-scale ToD datasets for multiple languages.
1 code implementation • 22 Jul 2022 • Fenia Christopoulou, Gerasimos Lampouras, Milan Gritta, Guchun Zhang, Yinpeng Guo, Zhongqi Li, Qi Zhang, Meng Xiao, Bo Shen, Lin Li, Hao Yu, Li Yan, Pingyi Zhou, Xin Wang, Yuchi Ma, Ignacio Iacobacci, Yasheng Wang, Guangtai Liang, Jiansheng Wei, Xin Jiang, Qianxiang Wang, Qun Liu
We present PanGu-Coder, a pretrained decoder-only language model adopting the PanGu-Alpha architecture for text-to-code generation, i.e. the synthesis of programming language solutions given a natural language problem description.
1 code implementation • Findings (ACL) 2022 • Milan Gritta, Ruoyu Hu, Ignacio Iacobacci
Task-oriented personal assistants enable people to interact with a host of devices and services using natural language.
Natural Language Understanding • Zero-Shot Cross-Lingual Transfer
1 code implementation • Findings (ACL) 2021 • Benjamin Minixhofer, Milan Gritta, Ignacio Iacobacci
For small Natural Language Inference (NLI) datasets, language modelling is typically followed by pretraining on a large (labelled) NLI dataset before fine-tuning with each NLI subtask.
2 code implementations • Findings (ACL) 2021 • Milan Gritta, Ignacio Iacobacci
The introduction of pretrained cross-lingual language models brought decisive improvements to multilingual NLP tasks.
2 code implementations • 29 Oct 2020 • Milan Gritta, Gerasimos Lampouras, Ignacio Iacobacci
We propose the Conversation Graph (ConvGraph), a graph-based representation of dialogues that can be exploited for data augmentation, multi-reference training and evaluation of non-deterministic agents.
no code implementations • 12 May 2019 • Milan Gritta
We undertake the task of comparing lexicon-based sentiment classification of film reviews with machine learning approaches.
1 code implementation • 29 Oct 2018 • Milan Gritta, Mohammad Taher Pilehvar, Nigel Collier
Empirical methods in geoparsing have thus far lacked a standard evaluation framework describing the task, metrics and data used to compare state-of-the-art systems.
no code implementations • ACL 2018 • Milan Gritta, Mohammad Taher Pilehvar, Nigel Collier
The purpose of text geolocation is to associate geographic information contained in a document with a set (or sets) of coordinates, either implicitly by using linguistic features and/or explicitly by using geographic metadata combined with heuristics.
1 code implementation • ACL 2017 • Milan Gritta, Mohammad Taher Pilehvar, Nut Limsopatham, Nigel Collier
Named entities are frequently used in a metonymic manner.