no code implementations • 5 Feb 2024 • Ana Ezquerro, Carlos Gómez-Rodríguez, David Vilares
We study incremental constituent parsers to assess their capacity to output trees based on prefix representations alone.
no code implementations • 22 Oct 2023 • Carlos Gómez-Rodríguez, Diego Roca, David Vilares
We introduce an encoding for parsing as sequence labeling that can represent any projective dependency tree as a sequence of 4-bit labels, one per word.
1 code implementation • 28 Sep 2023 • Ana Ezquerro, Carlos Gómez-Rodríguez, David Vilares
Since the popularization of BiLSTMs and Transformer-based bidirectional encoders, state-of-the-art syntactic parsers have lacked incrementality, requiring access to the whole sentence and deviating from human language processing.
1 code implementation • 20 Sep 2023 • Alberto Muñoz-Ortiz, David Vilares, Carlos Gómez-Rodríguez
We present an approach for assessing how multilingual large language models (LLMs) learn syntax in terms of multi-formalism syntactic structures.
no code implementations • 17 Aug 2023 • Alberto Muñoz-Ortiz, Carlos Gómez-Rodríguez, David Vilares
We conduct a quantitative analysis contrasting human-written English news text with comparable large language model (LLM) output from 4 LLMs from the LLaMa family.
1 code implementation • 24 May 2023 • Alberto Muñoz-Ortiz, David Vilares
The usefulness of part-of-speech tags for parsing has been heavily questioned due to the success of word-contextualized parsers.
no code implementations • 27 Oct 2022 • Alberto Muñoz-Ortiz, Mark Anderson, David Vilares, Carlos Gómez-Rodríguez
PoS tags, once taken for granted as a useful resource for syntactic parsing, have become more situational with the popularization of deep learning.
1 code implementation • COLING 2022 • Iago Alonso-Alonso, David Vilares, Carlos Gómez-Rodríguez
Treebank selection for parsing evaluation and the spurious effects that might arise from a biased choice have not been explored in detail.
no code implementations • insights (ACL) 2022 • Alberto Muñoz-Ortiz, Carlos Gómez-Rodríguez, David Vilares
We propose a morphology-based method for low-resource (LR) dependency parsing.
no code implementations • SemEval (NAACL) 2022 • Iago Alonso-Alonso, David Vilares, Carlos Gómez-Rodríguez
This paper addresses the problem of structured sentiment analysis using a bi-affine semantic dependency parser, large pre-trained language models, and publicly available translation models.
no code implementations • RANLP 2021 • Alberto Muñoz-Ortiz, Michalina Strzyz, David Vilares
Different linearizations have been proposed to cast dependency parsing as sequence labeling and solve the task as: (i) a head selection problem, (ii) finding a representation of the token arcs as bracket strings, or (iii) associating partial transition sequences of a transition-based parser to words.
1 code implementation • NAACL (CALCS) 2021 • Marvin M. Agüero-Torales, David Vilares, Antonio G. López-Herrera
This paper addresses the problem of sentiment analysis for Jopara, a code-switching language between Guarani and Spanish.
no code implementations • 25 Mar 2021 • David Vilares, Marcos Garcia, Carlos Gómez-Rodríguez
The experiments show that our models, especially the 12-layer one, outperform the results of mBERT in most tasks.
1 code implementation • COLING 2020 • Carlos Gómez-Rodríguez, Michalina Strzyz, David Vilares
We define a mapping from transition-based parsing algorithms that read sentences from left to right to sequence labeling encodings of syntactic trees.
1 code implementation • COLING 2020 • Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez
We present a bracketing-based encoding that can be used to represent any 2-planar dependency tree over a sentence of length n as a sequence of n labels, hence providing almost total coverage of crossing arcs in sequence labeling parsing.
1 code implementation • EMNLP 2020 • David Vilares, Carlos Gómez-Rodríguez
Second, it fills this gap and proposes to encode tree discontinuities as nearly ordered permutations of the input sequence.
3 code implementations • 5 Feb 2020 • David Vilares, Michalina Strzyz, Anders Søgaard, Carlos Gómez-Rodríguez
We first cast constituent and dependency parsing as sequence tagging.
1 code implementation • IJCNLP 2019 • Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez
We explore whether it is possible to leverage eye-tracking data in an RNN dependency parser (for English) when such information is only available during training, i.e., no aggregated or token-level gaze features are used at inference time.
no code implementations • WS 2019 • Mark Anderson, David Vilares, Carlos Gómez-Rodríguez
We introduce a language-agnostic evolutionary technique for automatically extracting chunks from dependency treebanks.
1 code implementation • ACL 2019 • Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez
We use parsing as sequence labeling as a common framework to learn across constituency and dependency syntactic abstractions.
no code implementations • ACL 2019 • David Vilares, Carlos Gómez-Rodríguez
We present HEAD-QA, a multi-choice question answering testbed to encourage research on complex reasoning.
1 code implementation • NAACL 2019 • David Vilares, Carlos Gómez-Rodríguez
We explore the challenge of action prediction from textual descriptions of scenes, a testbed to approximate whether text inference can be used to predict upcoming actions.
2 code implementations • NAACL 2019 • David Vilares, Mostafa Abdou, Anders Søgaard
Combining these techniques, we clearly surpass the performance of sequence tagging constituent parsers on the English and Chinese Penn Treebanks, and reduce their parsing time even further.
1 code implementation • NAACL 2019 • Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez
We recast dependency parsing as a sequence labeling problem, exploring several encodings of dependency trees as labels.
no code implementations • WS 2018 • David Vilares, Carlos Gómez-Rodríguez
They also show how the size of the embeddings can be notably reduced.
1 code implementation • EMNLP 2018 • Carlos Gómez-Rodríguez, David Vilares
For each word w_t, it generates a label that encodes: (1) the number of ancestors in the tree that the words w_t and w_{t+1} have in common, and (2) the nonterminal symbol at the lowest common ancestor.
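As a rough illustration of the labeling scheme described above, the following sketch computes, for each word except the last, the number of ancestors it shares with the next word and the nonterminal at their lowest common ancestor. The nested-tuple tree format and the names `leaf_paths` and `encode` are assumptions made for this example, not taken from the paper or its released code, and details such as how the final word is labeled are left out.

```python
def leaf_paths(tree, index=0, path=()):
    """Yield (word, ancestors) for each leaf, left to right.

    A tree is a hypothetical nested tuple: (label, child, child, ...),
    where a child is either another tuple or a word (str).
    Each ancestor is stored as (position among siblings, label), so two
    leaves share an ancestor exactly when their paths agree on it.
    """
    label, *children = tree
    here = path + ((index, label),)
    for i, child in enumerate(children):
        if isinstance(child, str):
            yield child, here
        else:
            yield from leaf_paths(child, i, here)


def encode(tree):
    """Return one (word, n_common_ancestors, lowest_common_label) per word pair."""
    leaves = list(leaf_paths(tree))
    labels = []
    # Pair each word w_t with its successor w_{t+1}; the last word has no
    # successor and is skipped in this sketch.
    for (word, p), (_, q) in zip(leaves, leaves[1:]):
        common = 0
        for a, b in zip(p, q):
            if a != b:
                break
            common += 1
        # The root is always shared, so common >= 1 here.
        labels.append((word, common, p[common - 1][1]))
    return labels


example = ("S", ("NP", "She"), ("VP", "reads", ("NP", "books")))
print(encode(example))  # [('She', 1, 'S'), ('reads', 2, 'VP')]
```

In this toy tree, "She" and "reads" share only the root S, while "reads" and "books" share S and VP, so the lowest common ancestor of the second pair is VP.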
1 code implementation • WS 2018 • David Vilares, Carlos Gómez-Rodríguez
The usage of part-of-day nouns, such as 'night', and their time-specific greetings ('good night'), varies across languages and cultures.
1 code implementation • NAACL 2018 • David Vilares, Carlos Gómez-Rodríguez
Non-projective parsing can be useful to handle cycles and reentrancy in AMR graphs.
1 code implementation • EMNLP 2017 • David Vilares, Yulan He
We explore how to detect people's perspectives that occupy a certain proposition.
1 code implementation • WS 2017 • David Vilares, Marcos Garcia, Miguel A. Alonso, Carlos Gómez-Rodríguez
Lexicon-based methods using syntactic rules for polarity classification rely on parsers that are dependent on the language and on treebank guidelines.
1 code implementation • CONLL 2017 • David Vilares, Carlos Gómez-Rodríguez
In the all treebanks category (LAS and UAS) we ranked 16th and 12th.
no code implementations • 7 Jun 2017 • Carlos Gómez-Rodríguez, Iago Alonso-Alonso, David Vilares
Syntactic parsing, the process of obtaining the internal structure of sentences in natural languages, is a crucial task for artificial intelligence applications that need to extract meaning from natural language text or speech.
no code implementations • 17 Jun 2016 • David Vilares, Carlos Gómez-Rodríguez, Miguel A. Alonso
We present a novel unsupervised approach for multilingual sentiment analysis driven by compositional syntax-based rules.
no code implementations • LREC 2016 • David Vilares, Miguel A. Alonso, Carlos Gómez-Rodríguez
Code-switching texts are those that contain terms in two or more different languages, and they appear increasingly often in social media.
no code implementations • ACL 2016 • David Vilares, Carlos Gómez-Rodríguez, Miguel A. Alonso
We introduce an approach to train lexicalized parsers using bilingual corpora obtained by merging harmonized treebanks of different languages, producing parsers that can analyze sentences in either of the learned languages, or even sentences that mix both.