1 code implementation • 28 Sep 2023 • Ana Ezquerro, Carlos Gómez-Rodríguez, David Vilares
Since the popularization of BiLSTMs and Transformer-based bidirectional encoders, state-of-the-art syntactic parsers have lacked incrementality, requiring access to the whole sentence and deviating from human language processing.
no code implementations • 23 Sep 2023 • Olga Zamaraeva, Carlos Gómez-Rodríguez
We also present the grammar's coverage and overgeneration on a small portion of a learner corpus, an entirely new research line with respect to the SRG.
no code implementations • 20 Sep 2023 • Alberto Muñoz-Ortiz, David Vilares, Carlos Gómez-Rodríguez
We present an approach for assessing how multilingual large language models (LLMs) learn syntax in terms of multi-formalism syntactic structures.
no code implementations • 14 Sep 2023 • Olga Zamaraeva, Carlos Gómez-Rodríguez
We present new supertaggers trained on HPSG-based treebanks.
no code implementations • 11 Sep 2023 • Olga Kellert, Mahmud Uz Zaman, Nicholas Hill Matlis, Carlos Gómez-Rodríguez
This paper summarizes the results of experimenting with Universal Dependencies (UD) adaptation of an Unsupervised, Compositional and Recursive (UCR) rule-based approach for Sentiment Analysis (SA) submitted to the Shared Task at Rest-Mex 2023 (Team Olga/LyS-SALSA) (within the IberLEF 2023 conference).
no code implementations • 17 Aug 2023 • Alberto Muñoz-Ortiz, Carlos Gómez-Rodríguez, David Vilares
We conduct a quantitative analysis contrasting human-written English news text with comparable large language model (LLM) output from 4 LLMs from the LLaMa family.
no code implementations • 27 Oct 2022 • Alberto Muñoz-Ortiz, Mark Anderson, David Vilares, Carlos Gómez-Rodríguez
PoS tags, once taken for granted as a useful resource for syntactic parsing, have become more situational with the popularization of deep learning.
1 code implementation • CL (ACL) 2022 • Mark Anderson, Carlos Gómez-Rodríguez
We contribute to the discussion on parsing performance in NLP by introducing a measurement that evaluates the differences between the distributions of edge displacement (the directed distance of edges) seen in training and test data.
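As a rough illustration of the quantity this abstract describes, the sketch below computes the edge displacements of a sentence and their relative frequencies over a set of sentences. The `heads` list format (1-indexed head positions, with 0 for the root), the function names, and the sign convention (dependent position minus head position) are assumptions made for this example, not details taken from the paper. The paper's actual comparison between training and test distributions is not reproduced here.

```python
from collections import Counter


def edge_displacements(heads):
    """Return the signed displacement of every non-root edge.

    heads[i] is the 1-indexed position of the head of word i+1; 0 marks the root.
    Assumed convention: displacement = dependent position - head position.
    """
    return [dep - head for dep, head in enumerate(heads, start=1) if head != 0]


def displacement_distribution(treebank):
    """Relative frequency of each displacement over a list of sentences."""
    counts = Counter(d for heads in treebank for d in edge_displacements(heads))
    total = sum(counts.values())
    return {d: c / total for d, c in sorted(counts.items())}


# Hypothetical toy sentence "She reads long books", with "reads" as the root.
toy_heads = [2, 0, 4, 2]
print(edge_displacements(toy_heads))
print(displacement_distribution([toy_heads]))
```

Computing one such distribution for the training split and one for the test split gives the two objects that a distribution-level comparison would then operate on.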
1 code implementation • COLING 2022 • Iago Alonso-Alonso, David Vilares, Carlos Gómez-Rodríguez
Treebank selection for parsing evaluation and the spurious effects that might arise from a biased choice have not been explored in detail.
no code implementations • insights (ACL) 2022 • Alberto Muñoz-Ortiz, Carlos Gómez-Rodríguez, David Vilares
We propose a morphology-based method for low-resource (LR) dependency parsing.
1 code implementation • 19 May 2022 • Ulugbek Salaev, Elmurod Kuriyozov, Carlos Gómez-Rodríguez
Machine transliteration, as defined in this paper, is a process of automatically transforming written script of words from a source alphabet into words of another target alphabet within the same language, while preserving their meaning, as well as pronunciation.
1 code implementation • SIGUL (LREC) 2022 • Ulugbek Salaev, Elmurod Kuriyozov, Carlos Gómez-Rodríguez
Semantic relatedness between words is one of the core concepts in natural language processing, thus making semantic evaluation an important task.
no code implementations • SemEval (NAACL) 2022 • Iago Alonso-Alonso, David Vilares, Carlos Gómez-Rodríguez
This paper addressed the problem of structured sentiment analysis using a bi-affine semantic dependency parser, large pre-trained language models, and publicly available translation models.
1 code implementation • 20 Oct 2021 • Daniel Fernández-González, Carlos Gómez-Rodríguez
In order to achieve deep natural language understanding, syntactic constituent parsing is a vital step, highly demanded by many artificial intelligence systems to process both text and speech.
no code implementations • Quasy (SyntaxFest) 2021 • Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez
Dependency distance minimization (DDm) is a well-established principle of word order.
no code implementations • ACL (IWPT) 2021 • Mark Anderson, Carlos Gómez-Rodríguez
We present the system submission from the FASTPARSE team for the EUD Shared Task at IWPT 2021.
1 code implementation • 20 May 2021 • Daniel Fernández-González, Carlos Gómez-Rodríguez
Dependency parsing is a crucial step towards deep language understanding and, therefore, widely demanded by numerous Natural Language Processing applications.
1 code implementation • EMNLP 2021 • Daniel Fernández-González, Carlos Gómez-Rodríguez
Discontinuous constituent parsers have always lagged behind continuous approaches in terms of accuracy and speed, as the presence of constituents with discontinuous yield introduces extra complexity to the task.
no code implementations • NoDaLiDa 2021 • Mark Anderson, Carlos Gómez-Rodríguez
We present an error analysis of neural UPOS taggers to evaluate why using gold standard tags has such a large positive contribution to parsing performance while using predicted UPOS tags either harms performance or offers a negligible improvement.
no code implementations • 25 Mar 2021 • David Vilares, Marcos Garcia, Carlos Gómez-Rodríguez
The experiments show that our models, especially the 12-layer one, outperform the results of mBERT in most tasks.
1 code implementation • COLING 2020 • Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez
We present a bracketing-based encoding that can be used to represent any 2-planar dependency tree over a sentence of length n as a sequence of n labels, hence providing almost total coverage of crossing arcs in sequence labeling parsing.
1 code implementation • COLING 2020 • Carlos Gómez-Rodríguez, Michalina Strzyz, David Vilares
We define a mapping from transition-based parsing algorithms that read sentences from left to right to sequence labeling encodings of syntactic trees.
no code implementations • CONLL 2020 • Mark Anderson, Carlos Gómez-Rodríguez
We present an analysis on the effect UPOS accuracy has on parsing performance.
1 code implementation • EMNLP 2020 • David Vilares, Carlos Gómez-Rodríguez
Second, it fills this gap and proposes to encode tree discontinuities as nearly ordered permutations of the input sequence.
1 code implementation • 21 Sep 2020 • Daniel Fernández-González, Carlos Gómez-Rodríguez
We propose a transition-based approach that, by training a single model, can efficiently parse any input sentence with both constituent and dependency trees, supporting both continuous/projective and discontinuous/non-projective syntactic structures.
2 code implementations • 30 Jul 2020 • Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez, Juan Luis Esteban, Lluís Alemany-Puig
Here we recast the problem of the optimality of the word order of a sentence as an optimization problem on a spatial network where the vertices are words, arcs indicate syntactic dependencies and the space is defined by the linear order of the words in the sentence.
no code implementations • 24 Jun 2020 • Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez, Juan Luis Esteban
A fundamental problem in network science is the normalization of the topological or physical distance between vertices, that requires understanding the range of variation of the unnormalized distances.
no code implementations • WS 2020 • Mathieu Dehouck, Mark Anderson, Carlos Gómez-Rodríguez
We present the system submission from the FASTPARSE team for the EUD Shared Task at IWPT 2020.
no code implementations • WS 2020 • Mark Anderson, Carlos Gómez-Rodríguez
The carbon footprint of natural language processing research has been increasing in recent years due to its reliance on large and inefficient neural network implementations.
1 code implementation • 27 May 2020 • Daniel Fernández-González, Carlos Gómez-Rodríguez
Sequence-to-sequence constituent parsing requires a linearization to represent trees as sequences.
1 code implementation • 27 May 2020 • Daniel Fernández-González, Carlos Gómez-Rodríguez
Transition-based parsers implemented with Pointer Networks have become the new state of the art in dependency parsing, excelling in producing labelled syntactic trees and outperforming graph-based models in this task.
1 code implementation • LREC 2020 • Elmurod Kuriyozov, Yerai Doval, Carlos Gómez-Rodríguez
Our experiments confirm that the obtained bilingual dictionaries outperform previously-available ones, and that word embeddings from a low-resource language can benefit from resource-rich closely-related languages when they are aligned together.
no code implementations • LREC 2020 • Mark Anderson, Carlos Gómez-Rodríguez
Empirical studies have shown that performance varies across different treebanks in such a way that one algorithm outperforms another on one treebank and the reverse is true for a different treebank.
1 code implementation • 5 Feb 2020 • Daniel Fernández-González, Carlos Gómez-Rodríguez
Discontinuous constituent trees are among the most complex syntactic representations used in computational linguistics and NLP, and are crucial for representing all grammatical phenomena of languages such as German.
3 code implementations • 5 Feb 2020 • David Vilares, Michalina Strzyz, Anders Søgaard, Carlos Gómez-Rodríguez
We first cast constituent and dependency parsing as sequence tagging.
1 code implementation • 25 Nov 2019 • Yerai Doval, Jesús Vilares, Carlos Gómez-Rodríguez
Research on word embeddings has mainly focused on improving their performance on standard corpora, disregarding the difficulties posed by noisy texts in the form of tweets and other types of non-standard writing from social media.
1 code implementation • IJCNLP 2019 • Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez
We explore whether it is possible to leverage eye-tracking data in an RNN dependency parser (for English) when such information is only available during training, i.e., no aggregated or token-level gaze features are used at inference time.
no code implementations • 19 Aug 2019 • Carlos Gómez-Rodríguez, Morten H. Christiansen, Ramon Ferrer-i-Cancho
The ability to produce and understand an unlimited number of different sentences is a hallmark of human language.
no code implementations • WS 2019 • Mark Anderson, David Vilares, Carlos Gómez-Rodríguez
We introduce a language-agnostic evolutionary technique for automatically extracting chunks from dependency treebanks.
1 code implementation • ACL 2019 • Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez
We use parsing as sequence labeling as a common framework to learn across constituency and dependency syntactic abstractions.
no code implementations • 13 Jun 2019 • Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez
Dependency distance minimization (DDm) is a word order principle favouring the placement of syntactically related words close to each other in sentences.
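To make the principle concrete, the following minimal sketch sums the dependency distances of a sentence, that is, the absolute differences between the positions of each word and its head. The `heads` representation (1-indexed head positions, 0 for the root), the function names, and both toy word orders are hypothetical and serve only to illustrate the idea.

```python
def dependency_distances(heads):
    """Absolute linear distance between each non-root word and its head.

    heads[i] is the 1-indexed position of the head of word i+1; 0 marks the root.
    """
    return [abs(dep - head) for dep, head in enumerate(heads, start=1) if head != 0]


def total_dependency_distance(heads):
    """Sum of dependency distances over the whole sentence."""
    return sum(dependency_distances(heads))


# Hypothetical example: the same tree under two word orders.
# "She reads long books": She->reads, long->books, books->reads.
attested = [2, 0, 4, 2]
# Scrambled order "long She reads books" with the same dependencies.
scrambled = [4, 3, 0, 3]

print(total_dependency_distance(attested))
print(total_dependency_distance(scrambled))
```

In this toy case the ordering that keeps related words adjacent has the smaller sum, which is the kind of preference DDm describes.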
no code implementations • ACL 2019 • David Vilares, Carlos Gómez-Rodríguez
We present HEAD-QA, a multi-choice question answering testbed to encourage research on complex reasoning.
2 code implementations • NAACL 2019 • David Vilares, Carlos Gómez-Rodríguez
We explore the challenge of action prediction from textual descriptions of scenes, a testbed to approximate whether text inference can be used to predict upcoming actions.
no code implementations • 6 Apr 2019 • Michalina Strzyz, Carlos Gómez-Rodríguez
This paper proposes a novel technique that applies case-based reasoning in order to generate templates for reusable parse tree fragments, based on PoS tags of bigrams and trigrams that demonstrate low variability in their syntactic analyses from prior data.
2 code implementations • 20 Mar 2019 • Daniel Fernández-González, Carlos Gómez-Rodríguez
We propose a novel transition-based algorithm that straightforwardly parses sentences from left to right by building $n$ attachments, with $n$ being the length of the input sentence.
Ranked #11 on Dependency Parsing on Penn Treebank
1 code implementation • NAACL 2019 • Michalina Strzyz, David Vilares, Carlos Gómez-Rodríguez
We recast dependency parsing as a sequence labeling problem, exploring several encodings of dependency trees as labels.
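The abstract does not list the encodings, so the sketch below shows just one simple possibility as an assumption: each word's label is the signed offset to its head, which turns a dependency tree over n words into n labels and back. The `heads` format (1-indexed head positions, 0 for the root), the `"ROOT"` marker, and the function names are hypothetical choices for this example.

```python
def encode_relative(heads):
    """Encode each word's head as a signed offset label (one label per word).

    heads[i] is the 1-indexed position of the head of word i+1; 0 marks the root.
    """
    return [
        "ROOT" if head == 0 else f"{head - dep:+d}"
        for dep, head in enumerate(heads, start=1)
    ]


def decode_relative(labels):
    """Invert encode_relative: recover the head positions from the labels."""
    return [
        0 if label == "ROOT" else dep + int(label)
        for dep, label in enumerate(labels, start=1)
    ]


# Hypothetical toy sentence "She reads long books", with "reads" as the root.
heads = [2, 0, 4, 2]
labels = encode_relative(heads)
print(labels)
print(decode_relative(labels) == heads)
```

With an encoding of this kind, any off-the-shelf sequence tagger can in principle be trained to predict the labels, and the tree is recovered by decoding. A real system would also need to handle dependency relation labels and ill-formed predictions, which this sketch omits.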
no code implementations • 3 Dec 2018 • Yerai Doval, Carlos Gómez-Rodríguez
The resulting system analyzes the text input with no word boundaries one token at a time, which can be a character or a byte, and uses the information gathered by the language model to determine if a boundary must be placed in the current position or not.
1 code implementation • 25 Oct 2018 • Daniel Fernández-González, Carlos Gómez-Rodríguez
In addition, by improving the performance of the state-of-the-art in-order shift-reduce parser, we achieve the best accuracy to date (92.0 F1) obtained by a fully-supervised single-model greedy shift-reduce constituent parser on the WSJ benchmark.
no code implementations • WS 2018 • David Vilares, Carlos Gómez-Rodríguez
They also show how the size of the embeddings can be notably reduced.
1 code implementation • EMNLP 2018 • Carlos Gómez-Rodríguez, David Vilares
For each word w_t, it generates a label that encodes: (1) the number of ancestors in the tree that the words w_t and w_{t+1} have in common, and (2) the nonterminal symbol at the lowest common ancestor.
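The labeling rule in this snippet can be sketched as follows. The tree representation (a `(label, children)` pair whose children are either words or nested pairs), the function names, and the toy sentence are assumptions made for illustration. The sketch emits labels only for words that have a successor and ignores details such as part-of-speech preterminals and unary chains, so it should not be read as the paper's full encoding.

```python
def leaf_paths(tree, ancestors=(), position=()):
    """Yield (word, ancestors) pairs in sentence order.

    Each ancestor is stored as (position, label), where position is the
    child-index path from the root; this keeps distinct nodes with the
    same label apart.
    """
    label, children = tree
    ancestors = ancestors + ((position, label),)
    for i, child in enumerate(children):
        if isinstance(child, str):
            yield child, ancestors
        else:
            yield from leaf_paths(child, ancestors, position + (i,))


def encode(tree):
    """For each w_t with a successor, return (w_t, n_common, lca_label)."""
    leaves = list(leaf_paths(tree))
    labels = []
    for (word, path), (_, next_path) in zip(leaves, leaves[1:]):
        common = 0
        for a, b in zip(path, next_path):
            if a != b:
                break
            common += 1
        # The root is always shared, so common >= 1 here.
        labels.append((word, common, path[common - 1][1]))
    return labels


# Hypothetical toy tree: (S (NP She) (VP reads (NP long books)))
tree = ("S", [("NP", ["She"]), ("VP", ["reads", ("NP", ["long", "books"])])])
print(encode(tree))
```

For the toy tree, "She" and "reads" share only the root S, "reads" and "long" share S and VP, and "long" and "books" share S, VP and NP, which is what the printed labels reflect.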
1 code implementation • ACL 2018 • Carlos Gómez-Rodríguez, Tianze Shi, Lillian Lee
Shi, Huang, and Lee (2017) obtained state-of-the-art results for English and Chinese dependency parsing by combining dynamic-programming implementations of transition-based dependency parsers with a minimal set of bidirectional LSTM features.
1 code implementation • WS 2018 • David Vilares, Carlos Gómez-Rodríguez
The usage of part-of-day nouns, such as 'night', and their time-specific greetings ('good night'), varies across languages and cultures.
1 code implementation • NAACL 2018 • David Vilares, Carlos Gómez-Rodríguez
Non-projective parsing can be useful to handle cycles and reentrancy in AMR graphs.
no code implementations • 14 May 2018 • Daniel Fernández-González, Carlos Gómez-Rodríguez
We propose an efficient dynamic oracle for training the 2-Planar transition-based parser, a linear-time parser with over 99% coverage on non-projective syntactic corpora.
1 code implementation • NAACL 2018 • Tianze Shi, Carlos Gómez-Rodríguez, Lillian Lee
We generalize Cohen, Gómez-Rodríguez, and Satta's (2011) parser to a family of non-projective transition-based dependency parsers allowing polynomial-time exact inference.
no code implementations • 21 Apr 2018 • Daniel Fernández-González, Carlos Gómez-Rodríguez
An increasingly wide range of artificial intelligence applications rely on syntactic information to process and extract meaning from natural language text or speech, with constituent trees being one of the most widely used syntactic formalisms.
1 code implementation • 25 Oct 2017 • Daniel Fernández-González, Carlos Gómez-Rodríguez
We present a novel transition system, based on the Covington non-projective parser, introducing non-local transitions that can directly create arcs involving nodes to the left of the current focus positions.
no code implementations • 24 Aug 2017 • Xinying Chen, Carlos Gómez-Rodríguez, Ramon Ferrer-i-Cancho
A comment on "Neurophysiological dynamics of phrase-structure building during sentence processing" by Nelson et al. (2017), Proceedings of the National Academy of Sciences USA 114(18), E3669-E3678.
1 code implementation • WS 2017 • David Vilares, Marcos Garcia, Miguel A. Alonso, Carlos Gómez-Rodríguez
Lexicon-based methods using syntactic rules for polarity classification rely on parsers that are dependent on the language and on treebank guidelines.
1 code implementation • CONLL 2017 • David Vilares, Carlos Gómez-Rodríguez
In the all treebanks category (LAS and UAS) we ranked 16th and 12th.
1 code implementation • 11 Jun 2017 • Anssi Yli-Jyrä, Carlos Gómez-Rodríguez
We present a simple encoding for unlabeled noncrossing graphs and show how its latent counterpart helps us to represent several families of directed and undirected graphs used in syntactic and semantic parsing of natural language as context-free languages.
no code implementations • 11 Jun 2017 • Daniel Fernández-González, Carlos Gómez-Rodríguez
Restricted non-monotonicity has been shown to be beneficial for the projective arc-eager dependency parser in previous research, as later decisions can repair mistakes made in earlier states due to the lack of information.
no code implementations • 7 Jun 2017 • Carlos Gómez-Rodríguez, Iago Alonso-Alonso, David Vilares
Syntactic parsing, the process of obtaining the internal structure of sentences in natural languages, is a crucial task for artificial intelligence applications that need to extract meaning from natural language text or speech.
no code implementations • 27 May 2017 • Carlos Gómez-Rodríguez
Liu et al. (2017) provide a comprehensive account of research on dependency distance in human languages.
no code implementations • 17 Jun 2016 • David Vilares, Carlos Gómez-Rodríguez, Miguel A. Alonso
We present a novel unsupervised approach for multilingual sentiment analysis driven by compositional syntax-based rules.
no code implementations • 13 Jan 2016 • Carlos Gómez-Rodríguez, Ramon Ferrer-i-Cancho
The structure of a sentence can be represented as a network where vertices are words and edges indicate syntactic dependencies.
no code implementations • 9 Sep 2015 • Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez
A commentary on the article "Large-scale evidence of dependency length minimization in 37 languages" by Futrell, Mahowald & Gibson (PNAS 2015 112 (33) 10336-10341).
no code implementations • 26 Aug 2015 • Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez
The syntactic structure of sentences exhibits a striking regularity: dependencies tend to not cross when drawn above the sentence.
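The notion of crossing used here can be checked mechanically: two dependencies cross when exactly one endpoint of one lies strictly between the endpoints of the other. The sketch below counts such pairs. The `heads` format (1-indexed head positions, 0 for the root), the decision to ignore the root attachment, and the function names are assumptions made for this example.

```python
from itertools import combinations


def arcs(heads):
    """Return each non-root dependency as a (left, right) position pair."""
    return [
        tuple(sorted((dep, head)))
        for dep, head in enumerate(heads, start=1)
        if head != 0
    ]


def count_crossings(heads):
    """Count pairs of dependencies that cross when drawn above the sentence."""
    total = 0
    for (a, b), (c, d) in combinations(arcs(heads), 2):
        # Arcs cross if their spans overlap without nesting or sharing an endpoint.
        if a < c < b < d or c < a < d < b:
            total += 1
    return total


# Hypothetical toy structures.
print(count_crossings([2, 0, 4, 2]))  # arcs (1,2), (3,4), (2,4)
print(count_crossings([3, 4, 0, 3]))  # arcs (1,3), (2,4), (3,4)
```

The pairwise check is quadratic in sentence length, which is adequate for a sketch; faster methods exist but are not needed to illustrate the definition.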
no code implementations • ACL 2016 • David Vilares, Carlos Gómez-Rodríguez, Miguel A. Alonso
We introduce an approach to train lexicalized parsers using bilingual corpora obtained by merging harmonized treebanks of different languages, producing parsers that can analyze sentences in either of the learned languages, or even sentences that mix both.