Search Results for author: Yulia Tsvetkov

Found 79 papers, 40 papers with code

Speaker Information Can Guide Models to Better Inductive Biases: A Case Study On Predicting Code-Switching

1 code implementation ACL 2022 Alissa Ostapenko, Shuly Wintner, Melinda Fricke, Yulia Tsvetkov

Natural language processing (NLP) models trained on people-generated data can be unreliable because, without any constraints, they can learn from spurious correlations that are not relevant to the task.

Unsupervised Keyphrase Extraction via Interpretable Neural Networks

no code implementations 15 Mar 2022 Rishabh Joshi, Vidhisha Balachandran, Emily Saldanha, Maria Glenski, Svitlana Volkova, Yulia Tsvetkov

Keyphrase extraction aims at automatically extracting a list of "important" phrases which represent the key concepts in a document.

Keyphrase Extraction · Topic Classification

Influence Tuning: Demoting Spurious Correlations via Instance Attribution and Instance-Driven Updates

1 code implementation Findings (EMNLP) 2021 Xiaochuang Han, Yulia Tsvetkov

Among the most critical limitations of deep learning NLP models are their lack of interpretability and their reliance on spurious correlations.

Improving Span Representation for Domain-adapted Coreference Resolution

1 code implementation CRAC (ACL) 2021 Nupoor Gandhi, Anjalie Field, Yulia Tsvetkov

Recent work has shown that fine-tuning neural coreference models can produce strong performance when adapting to different domains.

Coreference Resolution · Domain Adaptation

SimVLM: Simple Visual Language Model Pretraining with Weak Supervision

no code implementations ICLR 2022 Zirui Wang, Jiahui Yu, Adams Wei Yu, Zihang Dai, Yulia Tsvetkov, Yuan Cao

With recent progress in joint modeling of visual and textual representations, Vision-Language Pretraining (VLP) has achieved impressive performance on many multimodal downstream tasks.

Image Captioning · Language Modelling +3

Controlled Text Generation as Continuous Optimization with Multiple Constraints

no code implementations NeurIPS 2021 Sachin Kumar, Eric Malmi, Aliaksei Severyn, Yulia Tsvetkov

As large-scale language model pretraining pushes the state-of-the-art in text generation, recent work has turned to controlling attributes of the text such models generate.

Language Modelling · Machine Translation +3
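
The title's framing can be sketched compactly. The formulation below is written in our own notation and elides the paper's specific relaxation details: decoding is cast as minimizing the language model's negative log-likelihood subject to differentiable attribute constraints, relaxed with Lagrange multipliers and solved by gradient updates on a continuous representation of the output.

\[
\min_{y}\; -\log P_{\mathrm{LM}}(y \mid x) \quad \text{s.t.} \quad f_i(y) \le \epsilon_i,\; i = 1, \dots, C
\]
\[
\mathcal{L}(y, \lambda) \;=\; -\log P_{\mathrm{LM}}(y \mid x) \;+\; \sum_{i=1}^{C} \lambda_i \bigl( f_i(y) - \epsilon_i \bigr), \qquad \lambda_i \ge 0,
\]

where the \(f_i\) are attribute scorers (e.g., classifiers) and the thresholds \(\epsilon_i\) are illustrative hyperparameters.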

A Survey of Race, Racism, and Anti-Racism in NLP

no code implementations ACL 2021 Anjalie Field, Su Lin Blodgett, Zeerak Waseem, Yulia Tsvetkov

Despite inextricable ties between race and language, little work has considered race in NLP research and development.

Machine Translation into Low-resource Language Varieties

no code implementations ACL 2021 Sachin Kumar, Antonios Anastasopoulos, Shuly Wintner, Yulia Tsvetkov

State-of-the-art machine translation (MT) systems are typically trained to generate the "standard" target language; however, many languages have multiple varieties (regional varieties, dialects, sociolects, non-native varieties) that are different from the standard language.

Machine Translation · Translation

Synthesizing Adversarial Negative Responses for Robust Response Ranking and Evaluation

1 code implementation Findings (ACL) 2021 Prakhar Gupta, Yulia Tsvetkov, Jeffrey P. Bigham

Experiments on classification, ranking and evaluation tasks across multiple datasets demonstrate that our approaches outperform strong baselines in providing informative negative examples for training dialogue systems.

Dialogue Evaluation

DialoGraph: Incorporating Interpretable Strategy-Graph Networks into Negotiation Dialogues

1 code implementation ICLR 2021 Rishabh Joshi, Vidhisha Balachandran, Shikhar Vashishth, Alan Black, Yulia Tsvetkov

To successfully negotiate a deal, it is not enough to communicate fluently: pragmatic planning of persuasive negotiation strategies is essential.

Response Generation

Simple and Efficient ways to Improve REALM

no code implementations EMNLP (MRQA) 2021 Vidhisha Balachandran, Ashish Vaswani, Yulia Tsvetkov, Niki Parmar

Dense retrieval has been shown to be effective for retrieving relevant documents for Open Domain QA, surpassing popular sparse retrieval methods like BM25.

An Exploration of Data Augmentation Techniques for Improving English to Tigrinya Translation

no code implementations 31 Mar 2021 Lidia Kidane, Sachin Kumar, Yulia Tsvetkov

It has been shown that the performance of neural machine translation (NMT) drops starkly in low-resource conditions, often requiring large amounts of auxiliary data to achieve competitive results.

Data Augmentation · Machine Translation +1

SelfExplain: A Self-Explaining Architecture for Neural Text Classifiers

2 code implementations EMNLP 2021 Dheeraj Rajagopal, Vidhisha Balachandran, Eduard Hovy, Yulia Tsvetkov

We introduce SelfExplain, a novel self-explaining model that explains a text classifier's predictions using phrase-based concepts.

Text Classification

Controlled Analyses of Social Biases in Wikipedia Bios

1 code implementation 31 Dec 2020 Anjalie Field, Chan Young Park, Kevin Z. Lin, Yulia Tsvetkov

In this work, we present a methodology for analyzing Wikipedia pages about people that isolates dimensions of interest (e.g., gender) from other attributes (e.g., occupation).

Multilingual Contextual Affective Analysis of LGBT People Portrayals in Wikipedia

no code implementations 21 Oct 2020 Chan Young Park, Xinru Yan, Anjalie Field, Yulia Tsvetkov

Specific lexical choices in narrative text both reflect the writer's attitudes towards people in the narrative and influence the audience's reactions.

End-to-End Differentiable GANs for Text Generation

no code implementations NeurIPS Workshop ICBINB 2020 Sachin Kumar, Yulia Tsvetkov

We posit that this gap is due to the autoregressive nature and architectural requirements of text generation, as well as a fundamental difference between the definition of Wasserstein distance in the image and text domains.

Text Generation

Fortifying Toxic Speech Detectors Against Veiled Toxicity

1 code implementation EMNLP 2020 Xiaochuang Han, Yulia Tsvetkov

Modern toxic speech detectors are incompetent in recognizing disguised offensive language, such as adversarial attacks that deliberately avoid known toxic lexicons, or manifestations of implicit bias.

On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment

1 code implementation EMNLP 2020 Zirui Wang, Zachary C. Lipton, Yulia Tsvetkov

Modern multilingual models are trained on concatenated text from multiple languages in hopes of conferring benefits to each (positive transfer), with the most pronounced benefits accruing to low-resource languages.

Meta-Learning

Automatic Extraction of Rules Governing Morphological Agreement

1 code implementation EMNLP 2020 Aditi Chaudhary, Antonios Anastasopoulos, Adithya Pratapa, David R. Mortensen, Zaid Sheikh, Yulia Tsvetkov, Graham Neubig

Using cross-lingual transfer, even with no expert annotations in the language of interest, our framework extracts a grammatical specification which is nearly equivalent to those created with large amounts of gold-standard annotated data.

Cross-Lingual Transfer

Controlling Dialogue Generation with Semantic Exemplars

1 code implementation NAACL 2021 Prakhar Gupta, Jeffrey P. Bigham, Yulia Tsvetkov, Amy Pavel

Dialogue systems pretrained with large language models generate locally coherent responses, but lack the fine-grained control over responses necessary to achieve specific goals.

Dialogue Generation · Response Generation

A Deep Reinforced Model for Zero-Shot Cross-Lingual Summarization with Bilingual Semantic Similarity Rewards

1 code implementation WS 2020 Zi-Yi Dou, Sachin Kumar, Yulia Tsvetkov

The model uses reinforcement learning to directly optimize a bilingual semantic similarity metric between the summaries generated in a target language and gold summaries in a source language.

Machine Translation · reinforcement-learning +4

Cross-Cultural Similarity Features for Cross-Lingual Transfer Learning of Pragmatically Motivated Tasks

1 code implementation EACL 2021 Jimin Sun, Hwijeen Ahn, Chan Young Park, Yulia Tsvetkov, David R. Mortensen

Much work in cross-lingual transfer learning has explored how to select better transfer languages for multilingual tasks, primarily focusing on typological and genealogical similarities between languages.

Cross-Lingual Transfer · Dependency Parsing +2

Demoting Racial Bias in Hate Speech Detection

no code implementations WS 2020 Mengzhou Xia, Anjalie Field, Yulia Tsvetkov

In current hate speech datasets, there exists a high correlation between annotators' perceptions of toxicity and signals of African American English (AAE).

Hate Speech Detection

A Computational Analysis of Polarization on Indian and Pakistani Social Media

1 code implementation 20 May 2020 Aman Tyagi, Anjalie Field, Priyank Lathwal, Yulia Tsvetkov, Kathleen M. Carley

Between February 14, 2019 and March 4, 2019, a terrorist attack in Pulwama, Kashmir followed by retaliatory airstrikes led to rising tensions between India and Pakistan, two nuclear-armed countries.

Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions

1 code implementation ACL 2020 Xiaochuang Han, Byron C. Wallace, Yulia Tsvetkov

In this work, we investigate the use of influence functions for NLP, providing an alternative approach to interpreting neural text classifiers.

Natural Language Inference
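
The influence-function machinery referenced above has a standard closed form; the version below follows common usage and is written in our notation (the paper's exact approximations differ). It estimates the effect of upweighting a training example \(z\) on the loss at a test example \(z_{\mathrm{test}}\):

\[
\mathcal{I}(z, z_{\mathrm{test}}) \;=\; -\,\nabla_{\theta} L(z_{\mathrm{test}}, \hat{\theta})^{\top}\, H_{\hat{\theta}}^{-1}\, \nabla_{\theta} L(z, \hat{\theta}),
\]

where \(\hat{\theta}\) are the learned parameters and \(H_{\hat{\theta}}\) is the Hessian of the training loss; training examples with the largest (most negative) scores are the ones most responsible for a given prediction.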

Unsupervised Discovery of Implicit Gender Bias

1 code implementation EMNLP 2020 Anjalie Field, Yulia Tsvetkov

Despite their prevalence in society, social biases are difficult to identify, primarily because human judgements in this domain can be unreliable.

Balancing Training for Multilingual Neural Machine Translation

2 code implementations ACL 2020 Xinyi Wang, Yulia Tsvetkov, Graham Neubig

When training multilingual machine translation (MT) models that can translate to/from multiple languages, we are faced with imbalanced training sets: some languages have much more training data than others.

Machine Translation · Translation
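
The entry above describes the imbalanced-data problem this paper addresses by learning per-language sampling weights. As a point of reference only (this is not the paper's method), the snippet below sketches the standard temperature-based sampling heuristic such work improves on; the language pairs and corpus sizes are hypothetical.

# Temperature-based sampling over language pairs for imbalanced multilingual MT.
# temperature=1.0 samples proportionally to corpus size; larger temperatures
# flatten the distribution, up-sampling low-resource languages.
def sampling_probs(sizes, temperature=5.0):
    scaled = {lang: n ** (1.0 / temperature) for lang, n in sizes.items()}
    total = sum(scaled.values())
    return {lang: s / total for lang, s in scaled.items()}

# Hypothetical corpus sizes (sentence pairs), for illustration only.
sizes = {"aze-eng": 6_000, "tur-eng": 180_000, "rus-eng": 210_000}
print(sampling_probs(sizes, temperature=5.0))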

A Framework for the Computational Linguistic Analysis of Dehumanization

no code implementations 6 Mar 2020 Julia Mendelsohn, Yulia Tsvetkov, Dan Jurafsky

Dehumanization is a pernicious psychological process that often leads to extreme intergroup bias, hate speech, and violence aimed at targeted social groups.

Abusive Language

StructSum: Summarization via Structured Representations

1 code implementation EACL 2021 Vidhisha Balachandran, Artidoro Pagnoni, Jay Yoon Lee, Dheeraj Rajagopal, Jaime Carbonell, Yulia Tsvetkov

To this end, we propose incorporating latent and explicit dependencies across sentences in the source document into end-to-end single-document summarization models.

Abstractive Text Summarization · Document Summarization

Where New Words Are Born: Distributional Semantic Analysis of Neologisms and Their Semantic Neighborhoods

1 code implementation SCiL 2020 Maria Ryskina, Ella Rabinovich, Taylor Berg-Kirkpatrick, David R. Mortensen, Yulia Tsvetkov

Besides presenting a new linguistic application of distributional semantics, this study tackles the linguistic question of the role of language-internal factors (in our case, sparsity) in language change motivated by language-external factors (reflected in frequency growth).

Learning to Generate Word- and Phrase-Embeddings for Efficient Phrase-Based Neural Machine Translation

no code implementations WS 2019 Chan Young Park, Yulia Tsvetkov

In this paper, we introduce a phrase-based NMT model built upon continuous-output NMT, in which the decoder generates embeddings of words or phrases.

Machine Translation · Translation

A Margin-based Loss with Synthetic Negative Samples for Continuous-output Machine Translation

no code implementations WS 2019 Gayatri Bhat, Sachin Kumar, Yulia Tsvetkov

Neural models that eliminate the softmax bottleneck by generating word embeddings (rather than multinomial distributions over a vocabulary) attain faster training with fewer learnable parameters.

Machine Translation · Translation +1

A Dynamic Strategy Coach for Effective Negotiation

no code implementations WS 2019 Yiheng Zhou, He He, Alan W. Black, Yulia Tsvetkov

We consider a bargaining scenario where a seller and a buyer negotiate the price of an item for sale through a text-based dialog.

Decision Making · Text Generation

Topics to Avoid: Demoting Latent Confounds in Text Classification

1 code implementation IJCNLP 2019 Sachin Kumar, Shuly Wintner, Noah A. Smith, Yulia Tsvetkov

Despite impressive performance on many text classification tasks, deep neural networks tend to learn frequent superficial patterns that are specific to the training data and do not always generalize well.

Classification · General Classification +2

Measuring Bias in Contextualized Word Representations

1 code implementation WS 2019 Keita Kurita, Nidhi Vyas, Ayush Pareek, Alan W. Black, Yulia Tsvetkov

Contextual word embeddings such as BERT have achieved state-of-the-art performance in numerous NLP tasks.

Word Embeddings

Entity-Centric Contextual Affective Analysis

no code implementations ACL 2019 Anjalie Field, Yulia Tsvetkov

While contextualized word representations have improved state-of-the-art benchmarks in many NLP tasks, their potential usefulness for social-oriented tasks remains largely unexplored.

Word Embeddings

Contextual Affective Analysis: A Case Study of People Portrayals in Online #MeToo Stories

2 code implementations 8 Apr 2019 Anjalie Field, Gayatri Bhat, Yulia Tsvetkov

We show that while these articles are sympathetic towards women who have experienced sexual harassment, they consistently present men as most powerful, even after sexual assault allegations.

Social and Information Networks

Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings

1 code implementation NAACL 2019 Thomas Manzini, Yao Chong Lim, Yulia Tsvetkov, Alan W. Black

Online texts -- across genres, registers, domains, and styles -- are riddled with human stereotypes, expressed in overt or subtle ways.

Word Embeddings

Von Mises-Fisher Loss for Training Sequence to Sequence Models with Continuous Outputs

1 code implementation ICLR 2019 Sachin Kumar, Yulia Tsvetkov

The Softmax function is used in the final layer of nearly all existing sequence-to-sequence models for language generation.

Machine Translation · Text Generation +2
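
As a rough sketch of the continuous-output alternative the title refers to (notation ours, regularized variants omitted): instead of a softmax over the vocabulary, the decoder emits a vector \(\hat{e}\) and is trained with a von Mises-Fisher negative log-likelihood against the pretrained embedding \(e(w)\) of the target word:

\[
\mathrm{NLL_{vMF}}\bigl(\hat{e};\, e(w)\bigr) \;=\; -\log C_m\!\bigl(\lVert \hat{e} \rVert\bigr) \;-\; \hat{e}^{\top} e(w),
\]

where \(C_m(\kappa)\) is the vMF normalizing constant in \(m\) dimensions; at decoding time the word whose embedding is nearest to \(\hat{e}\) is emitted.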

Style Transfer Through Multilingual and Feedback-Based Back-Translation

no code implementations 17 Sep 2018 Shrimai Prabhumoye, Yulia Tsvetkov, Alan W. Black, Ruslan Salakhutdinov

Style transfer is the task of transferring an attribute of a sentence (e.g., formality) while maintaining its semantic content.

Style Transfer · Translation

Socially Responsible NLP

no code implementations NAACL 2018 Yulia Tsvetkov, Vinodkumar Prabhakaran, Rob Voigt

As language technologies have become increasingly prevalent, there is a growing awareness that decisions we make about our data, methods, and tools are often tied up with their impact on people and societies.

Decision Making

Native Language Cognate Effects on Second Language Lexical Choice

1 code implementation TACL 2018 Ella Rabinovich, Yulia Tsvetkov, Shuly Wintner

We present a computational analysis of cognate effects on the spontaneous linguistic productions of advanced non-native speakers.

Style Transfer Through Back-Translation

3 code implementations ACL 2018 Shrimai Prabhumoye, Yulia Tsvetkov, Ruslan Salakhutdinov, Alan W. Black

We first learn a latent representation of the input sentence which is grounded in a language translation model in order to better preserve the meaning of the sentence while reducing stylistic properties.

Style Transfer · Text Style Transfer +1

Correlation-based Intrinsic Evaluation of Word Vector Representations

no code implementations WS 2016 Yulia Tsvetkov, Manaal Faruqui, Chris Dyer

We introduce QVEC-CCA, an intrinsic evaluation metric for word vector representations based on correlations of learned vectors with features extracted from linguistic resources.

Word Similarity

Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning

no code implementations ACL 2016 Yulia Tsvetkov, Manaal Faruqui, Wang Ling, Brian MacWhinney, Chris Dyer

We use Bayesian optimization to learn curricula for word representation learning, optimizing performance on downstream tasks that depend on the learned representations as features.

Representation Learning

Polyglot Neural Language Models: A Case Study in Cross-Lingual Phonetic Representation Learning

no code implementations NAACL 2016 Yulia Tsvetkov, Sunayana Sitaram, Manaal Faruqui, Guillaume Lample, Patrick Littell, David Mortensen, Alan W. Black, Lori Levin, Chris Dyer

We introduce polyglot language models, recurrent neural network models trained to predict symbol sequences in many different languages using shared representations of symbols and conditioning on typological information about the language to be predicted.

Representation Learning

Problems With Evaluation of Word Embeddings Using Word Similarity Tasks

1 code implementation WS 2016 Manaal Faruqui, Yulia Tsvetkov, Pushpendre Rastogi, Chris Dyer

Our study suggests that the use of word similarity tasks for evaluation of word vectors is not sustainable and calls for further research on evaluation methods.

Semantic Similarity · Semantic Textual Similarity +2

Massively Multilingual Word Embeddings

1 code implementation 5 Feb 2016 Waleed Ammar, George Mulcaire, Yulia Tsvetkov, Guillaume Lample, Chris Dyer, Noah A. Smith

We introduce new methods for estimating and evaluating embeddings of words in more than fifty languages in a single shared embedding space.

Multilingual Word Embeddings · Text Categorization

Morphological Inflection Generation Using Character Sequence to Sequence Learning

1 code implementation NAACL 2016 Manaal Faruqui, Yulia Tsvetkov, Graham Neubig, Chris Dyer

Morphological inflection generation is the task of generating the inflected form of a given lemma corresponding to a particular linguistic transformation.

Morphological Inflection

Sparse Overcomplete Word Vector Representations

2 code implementations IJCNLP 2015 Manaal Faruqui, Yulia Tsvetkov, Dani Yogatama, Chris Dyer, Noah Smith

Current distributed representations of words show little resemblance to theories of lexical semantics.
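
One way to unpack the title (our notation; the regularization weights are illustrative): dense word vectors \(x_i\) are re-expressed as sparse codes \(a_i\) over an overcomplete dictionary \(D\), via an objective of roughly the following form.

\[
\min_{D, A} \; \sum_{i=1}^{V} \Bigl( \lVert x_i - D a_i \rVert_2^2 \;+\; \lambda \lVert a_i \rVert_1 \Bigr) \;+\; \tau \lVert D \rVert_2^2,
\]

where \(D \in \mathbb{R}^{L \times K}\) with \(K > L\) (overcomplete) and the \(\ell_1\) penalty drives most entries of each code \(a_i\) to zero, yielding dimensions that are easier to interpret.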

Augmenting English Adjective Senses with Supersenses

1 code implementation LREC 2014 Yulia Tsvetkov, Nathan Schneider, Dirk Hovy, Archna Bhatia, Manaal Faruqui, Chris Dyer

We develop a supersense taxonomy for adjectives, based on that of GermaNet, and apply it to English adjectives in WordNet using human annotation and supervised classification.

Classification · General Classification

A Unified Annotation Scheme for the Semantic/Pragmatic Components of Definiteness

no code implementations LREC 2014 Archna Bhatia, Mandy Simons, Lori Levin, Yulia Tsvetkov, Chris Dyer, Jordan Bender

We present a definiteness annotation scheme that captures the semantic, pragmatic, and discourse information, which we call communicative functions, associated with linguistic descriptions such as "a story about my speech", "the story", "every time I give it", "this slideshow".

Machine Translation
