Search Results for author: Tiago Pimentel

Found 23 papers, 12 papers with code

A Bayesian Framework for Information-Theoretic Probing

1 code implementation • 8 Sep 2021 • Tiago Pimentel, Ryan Cotterell

Pimentel et al. (2020) recently analysed probing from an information-theoretic perspective.

Modeling the Unigram Distribution

1 code implementation • 4 Jun 2021 • Irene Nikkarinen, Tiago Pimentel, Damián E. Blasi, Ryan Cotterell

The unigram distribution is the non-contextual probability of finding a specific word form in a corpus.
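For concreteness, the naive maximum-likelihood estimate of this distribution is just normalised token counts; a minimal sketch is below. Treat it only as the baseline definition: the paper's own modelling approach is more involved than simple counting.

```python
from collections import Counter

def unigram_mle(corpus_tokens):
    """Naive maximum-likelihood estimate of the unigram distribution:
    p(w) = count(w) / total number of tokens in the corpus."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}

# Toy usage with a hypothetical tokenised corpus.
tokens = "the cat sat on the mat the end".split()
p = unigram_mle(tokens)
print(p["the"])  # 3/8 = 0.375
```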

A Non-Linear Structural Probe

no code implementations • NAACL 2021 • Jennifer C. White, Tiago Pimentel, Naomi Saphra, Ryan Cotterell

Probes are models devised to investigate the encoding of knowledge (e.g., syntactic structure) in contextual representations.

How (Non-)Optimal is the Lexicon?

no code implementations • NAACL 2021 • Tiago Pimentel, Irene Nikkarinen, Kyle Mahowald, Ryan Cotterell, Damián Blasi

Examining corpora from 7 typologically diverse languages, we use those upper bounds to quantify the lexicon's optimality and to explore the relative costs of major constraints on natural codes.

Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models

no code implementations • 15 Apr 2021 • Karolina Stańczak, Sagnik Ray Choudhury, Tiago Pimentel, Ryan Cotterell, Isabelle Augenstein

While the prevalence of large pre-trained language models has led to significant improvements in the performance of NLP systems, recent research has demonstrated that these models inherit societal biases extant in natural language.

Language Modelling

Finding Concept-specific Biases in Form-Meaning Associations

2 code implementations • NAACL 2021 • Tiago Pimentel, Brian Roark, Søren Wichmann, Ryan Cotterell, Damián Blasi

It is not a new idea that there are small, cross-linguistic associations between the forms and meanings of words.

Disambiguatory Signals are Stronger in Word-initial Positions

1 code implementation • EACL 2021 • Tiago Pimentel, Ryan Cotterell, Brian Roark

Psycholinguistic studies of human word processing and lexical access provide ample evidence of the preferred nature of word-initial versus word-final segments, e.g., in terms of attention paid by listeners (greater) or the likelihood of reduction by speakers (lower).

Speakers Fill Lexical Semantic Gaps with Context

1 code implementation • EMNLP 2020 • Tiago Pimentel, Rowan Hall Maudslay, Damián Blasi, Ryan Cotterell

For a language to be clear and efficiently encoded, we posit that the lexical ambiguity of a word type should correlate with how much information context provides about it, on average.

Pareto Probing: Trading Off Accuracy for Complexity

1 code implementation • EMNLP 2020 • Tiago Pimentel, Naomi Saphra, Adina Williams, Ryan Cotterell

In our contribution to this discussion, we argue for a probe metric that reflects the fundamental trade-off between probe complexity and performance: the Pareto hypervolume.

Dependency Parsing
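In two dimensions (probe complexity versus accuracy), the Pareto hypervolume reduces to the area dominated by the achievable (complexity, accuracy) points relative to a worst-case reference point. The sketch below illustrates that simplified 2D case with hypothetical probe results; the paper's actual complexity measures and computation may differ.

```python
def pareto_front(points):
    """Non-dominated (complexity, accuracy) points: lower complexity and
    higher accuracy are both better."""
    front = []
    for c, a in sorted(points, key=lambda p: (p[0], -p[1])):
        if not front or a > front[-1][1]:
            front.append((c, a))
    return front

def pareto_hypervolume(points, ref_complexity, ref_accuracy):
    """Area dominated by the front, measured from a worst-case reference
    point: one way to realise the hypervolume metric in two dimensions."""
    area, prev_a = 0.0, ref_accuracy
    for c, a in pareto_front(points):
        area += (ref_complexity - c) * (a - prev_a)
        prev_a = a
    return area

# Hypothetical probe runs: (complexity, accuracy); (3.0, 0.78) is dominated.
probes = [(1.0, 0.70), (2.0, 0.80), (3.0, 0.78), (5.0, 0.85)]
print(pareto_hypervolume(probes, ref_complexity=6.0, ref_accuracy=0.0))  # 3.95
```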

Metaphor Detection using Context and Concreteness

no code implementations • WS 2020 • Rowan Hall Maudslay, Tiago Pimentel, Ryan Cotterell, Simone Teufel

We report the results of our system on the Metaphor Detection Shared Task at the Second Workshop on Figurative Language Processing 2020.

A Corpus for Large-Scale Phonetic Typology

no code implementations • ACL 2020 • Elizabeth Salesky, Eleanor Chodroff, Tiago Pimentel, Matthew Wiesner, Ryan Cotterell, Alan W. Black, Jason Eisner

A major hurdle in data-driven research on typology is having sufficient data in many languages to draw meaningful conclusions.

Phonotactic Complexity and its Trade-offs

1 code implementation • TACL 2020 • Tiago Pimentel, Brian Roark, Ryan Cotterell

We present methods for calculating a measure of phonotactic complexity (bits per phoneme) that permits a straightforward cross-linguistic comparison.
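Operationally, bits per phoneme is the cross-entropy of a phone-level language model: the average number of bits the model needs to encode each phoneme. The sketch below substitutes an add-one-smoothed bigram model for the paper's language models and scores the training data itself, purely for illustration; the toy phone strings are hypothetical.

```python
import math
from collections import Counter, defaultdict

def bits_per_phoneme(corpus, model):
    """Cross-entropy in bits per phoneme: average -log2 probability the
    phone-level model assigns to each phoneme (word-end marker included)."""
    total_bits, n = 0.0, 0
    for word in corpus:                 # each word is a list of phones
        history = "<s>"
        for phone in word + ["</s>"]:
            total_bits -= math.log2(model(phone, history))
            n += 1
            history = phone
    return total_bits / n

def train_bigram(corpus):
    """Add-one-smoothed bigram phone model: a stand-in for the paper's
    models, used here only to make the measure concrete."""
    counts, vocab = defaultdict(Counter), {"</s>"}
    for word in corpus:
        history = "<s>"
        for phone in word + ["</s>"]:
            counts[history][phone] += 1
            vocab.add(phone)
            history = phone
    V = len(vocab)
    def model(phone, history):
        c = counts[history]
        return (c[phone] + 1) / (sum(c.values()) + V)
    return model

toy = [list("kat"), list("tak"), list("akt")]   # hypothetical phone strings
model = train_bigram(toy)
print(bits_per_phoneme(toy, model))             # lower = simpler phonotactics
```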

A Tale of a Probe and a Parser

1 code implementation • ACL 2020 • Rowan Hall Maudslay, Josef Valvoda, Tiago Pimentel, Adina Williams, Ryan Cotterell

One such probe is the structural probe (Hewitt and Manning, 2019), designed to quantify the extent to which syntactic information is encoded in contextualised word representations.
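Following the published description of that probe, one learns a linear map under which squared L2 distances between contextual word vectors approximate parse-tree distances. A minimal PyTorch sketch is below; the dimensions, random embeddings, and chain-shaped gold tree are illustrative stand-ins, not the paper's experimental setup.

```python
import torch

class StructuralProbe(torch.nn.Module):
    """Distance probe after Hewitt and Manning (2019): a learned linear map
    B such that squared L2 distances in the projected space approximate
    parse-tree distances between words."""
    def __init__(self, rep_dim, probe_rank):
        super().__init__()
        self.B = torch.nn.Parameter(torch.randn(rep_dim, probe_rank) * 0.01)

    def forward(self, H):                     # H: (seq_len, rep_dim)
        T = H @ self.B                        # project into probe space
        diff = T.unsqueeze(1) - T.unsqueeze(0)
        return (diff ** 2).sum(-1)            # pairwise squared distances

# Illustrative training step on stand-in data.
probe = StructuralProbe(rep_dim=768, probe_rank=64)
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
H = torch.randn(12, 768)                      # fake contextual embeddings
idx = torch.arange(12)
tree_dist = (idx[None, :] - idx[:, None]).abs().float()  # chain-shaped tree
opt.zero_grad()
loss = (probe(H) - tree_dist).abs().mean()    # L1 between predicted and gold
loss.backward()
opt.step()
print(loss.item())
```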

Predicting Declension Class from Form and Meaning

1 code implementation • ACL 2020 • Adina Williams, Tiago Pimentel, Arya D. McCarthy, Hagen Blix, Eleanor Chodroff, Ryan Cotterell

We find for two Indo-European languages (Czech and German) that form and meaning respectively share significant amounts of information with class (and contribute additional information above and beyond gender).

Assessing the Reliability of Visual Explanations of Deep Models with Adversarial Perturbations

no code implementations • 22 Apr 2020 • Dan Valle, Tiago Pimentel, Adriano Veloso

In this work, we propose an objective measure to evaluate the reliability of explanations of deep models.

Feature Importance

Information-Theoretic Probing for Linguistic Structure

1 code implementation • ACL 2020 • Tiago Pimentel, Josef Valvoda, Rowan Hall Maudslay, Ran Zmigrod, Adina Williams, Ryan Cotterell

The success of neural networks on a diverse set of NLP tasks has led researchers to question how much these networks actually "know" about natural language.

Word Embeddings

Rethinking Phonotactic Complexity

no code implementations • WS 2019 • Tiago Pimentel, Brian Roark, Ryan Cotterell

In this work, we propose the use of phone-level language models to estimate phonotactic complexity, measured in bits per phoneme, which makes cross-linguistic comparison straightforward.

Meaning to Form: Measuring Systematicity as Information

1 code implementation • ACL 2019 • Tiago Pimentel, Arya D. McCarthy, Damián E. Blasi, Brian Roark, Ryan Cotterell

A longstanding debate in semiotics centers on the relationship between linguistic signs and their corresponding semantics: is there an arbitrary relationship between a word form and its meaning, or does some systematic phenomenon pervade?

UaiNets: From Unsupervised to Active Deep Anomaly Detection

no code implementations • ICLR 2019 • Tiago Pimentel, Marianne Monteiro, Juliano Viana, Adriano Veloso, Nivio Ziviani

This work presents a method for active anomaly detection which can be built upon existing deep learning solutions for unsupervised anomaly detection.

Unsupervised Anomaly Detection
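As a rough illustration of the general recipe (not the specific UaiNets architecture): score examples with an unsupervised detector, query an expert on the most anomalous ones, and fold the labels back in. The detector, querying strategy, and data below are hypothetical stand-ins.

```python
import numpy as np

def active_anomaly_loop(X, unsup_score, query_expert, budget=10):
    """Generic active anomaly detection loop: rank by unsupervised score,
    ask an expert to label the top candidates, and keep the confirmed
    anomalies. Real systems would also retrain the scorer on the labels."""
    labels = {}                                   # index -> True/False
    for _ in range(budget):
        scores = unsup_score(X)
        scores[list(labels)] = -np.inf            # don't re-query
        i = int(np.argmax(scores))                # most anomalous unlabelled
        labels[i] = query_expert(X[i])            # expert verdict
    return [i for i, is_anom in labels.items() if is_anom]

# Toy usage: distance from the mean as a stand-in unsupervised scorer.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:3] += 6.0                                      # plant three anomalies
score = lambda X: np.linalg.norm(X - X.mean(0), axis=1)
oracle = lambda x: bool(np.linalg.norm(x) > 6.0)  # hypothetical expert
print(active_anomaly_loop(X, score, oracle))
```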

Deep Active Learning for Anomaly Detection

no code implementations • 23 May 2018 • Tiago Pimentel, Marianne Monteiro, Adriano Veloso, Nivio Ziviani

Anomalies are intuitively easy for human experts to understand, but they are hard to define mathematically.

Active Learning • Unsupervised Anomaly Detection

Fast Node Embeddings: Learning Ego-Centric Representations

no code implementations • ICLR 2018 • Tiago Pimentel, Adriano Veloso, Nivio Ziviani

Representation learning is one of the foundations of Deep Learning and has enabled important improvements on several Machine Learning tasks, such as Neural Machine Translation, Question Answering and Speech Recognition.

Link Prediction • Machine Translation • +4
