1 code implementation • COLING 2022 • Artidoro Pagnoni, Martin Graciarena, Yulia Tsvetkov
In this work, we discuss different threat scenarios from neural fake news generated by state-of-the-art language models.
no code implementations • 21 Oct 2024 • Sachin Kumar, Chan Young Park, Yulia Tsvetkov, Noah A. Smith, Hannaneh Hajishirzi
Conventional algorithms for training language models (LMs) with human feedback rely on preferences that are assumed to account for an "average" user, disregarding subjectivity and finer-grained variations.
no code implementations • 15 Oct 2024 • Shangbin Feng, Zifeng Wang, Yike Wang, Sayna Ebrahimi, Hamid Palangi, Lesly Miculicich, Achin Kulshrestha, Nathalie Rauschmayr, Yejin Choi, Yulia Tsvetkov, Chen-Yu Lee, Tomas Pfister
Extensive experiments demonstrate that Model Swarms could flexibly adapt LLM experts to a single task, multi-task domains, reward models, as well as diverse human interests, improving over 12 model composition baselines by up to 21.0% across tasks and contexts.
1 code implementation • 14 Oct 2024 • Jihan Yao, Wenxuan Ding, Shangbin Feng, Lucy Lu Wang, Yulia Tsvetkov
In the absence of abundant reliable annotations for challenging tasks and contexts, how can we expand the frontier of LLM capabilities with potentially wrong answers?
no code implementations • 8 Oct 2024 • Jillian Fisher, Shangbin Feng, Robert Aron, Thomas Richardson, Yejin Choi, Daniel W. Fisher, Jennifer Pan, Yulia Tsvetkov, Katharina Reinecke
As modern AI models become integral to everyday tasks, concerns about their inherent biases and their potential impact on human decision-making have emerged.
1 code implementation • 5 Oct 2024 • Farhan Samir, Chan Young Park, Anjalie Field, Vered Shwartz, Yulia Tsvetkov
We introduce the InfoGap method -- an efficient and reliable approach to locating information gaps and inconsistencies in articles at the fact level, across languages.
no code implementations • 3 Oct 2024 • Yu Ying Chiu, Liwei Jiang, Bill Yuchen Lin, Chan Young Park, Shuyue Stella Li, Sahithya Ravi, Mehar Bhatia, Maria Antoniak, Yulia Tsvetkov, Vered Shwartz, Yejin Choi
Nonetheless, all models consistently underperform on questions related to South America and the Middle East.
no code implementations • 15 Aug 2024 • Xiaochuang Han, Marjan Ghazvininejad, Pang Wei Koh, Yulia Tsvetkov
Evaluation of image generation shows that this simple and straightforward approach is more effective than pixel-based modeling and sophisticated vector quantization baselines (on which our method yields a 31% reduction in FID).
no code implementations • 25 Jul 2024 • Bingbing Wen, Jihan Yao, Shangbin Feng, Chenjun Xu, Yulia Tsvetkov, Bill Howe, Lucy Lu Wang
Abstention, the refusal of large language models (LLMs) to provide an answer, is increasingly recognized for its potential to mitigate hallucinations and enhance safety in LLM systems.
no code implementations • 11 Jul 2024 • Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Valentin Hofmann, Tomasz Limisiewicz, Yulia Tsvetkov, Noah A. Smith
In multilingual settings, non-Latin scripts and low-resource languages are usually disadvantaged in terms of language models' utility, efficiency, and cost.
1 code implementation • 2 Jul 2024 • Chan Young Park, Shuyue Stella Li, Hayoung Jung, Svitlana Volkova, Tanushree Mitra, David Jurgens, Yulia Tsvetkov
The framework thus highlights the pivotal role of social norms in shaping online interactions, presenting a substantial advance in both the theory and application of social norm studies in digital spaces.
1 code implementation • 2 Jul 2024 • Faeze Brahman, Sachin Kumar, Vidhisha Balachandran, Pradeep Dasigi, Valentina Pyatkin, Abhilasha Ravichander, Sarah Wiegreffe, Nouha Dziri, Khyathi Chandu, Jack Hessel, Yulia Tsvetkov, Noah A. Smith, Yejin Choi, Hannaneh Hajishirzi
Chat-based language models are designed to be helpful, yet they should not comply with every user request.
1 code implementation • 27 Jun 2024 • Orevaoghene Ahia, Anuoluwapo Aremu, Diana Abagyan, Hila Gonen, David Ifeoluwa Adelani, Daud Abolade, Noah A. Smith, Yulia Tsvetkov
Recent efforts to develop NLP technologies for African languages have focused on their standard dialects, resulting in disparities for dialects and varieties for which there are little to no resources or tools.
1 code implementation • 23 Jun 2024 • Yizhuo Zhang, Heng Wang, Shangbin Feng, Zhaoxuan Tan, Xiaochuang Han, Tianxing He, Yulia Tsvetkov
To this end, we propose the NLGift benchmark, an evaluation suite for LLM graph reasoning generalization: whether LLMs can go beyond the semantic, numeric, structural, and reasoning patterns in the synthetic training data and improve utility on real-world graph-based tasks.
1 code implementation • 22 Jun 2024 • Shangbin Feng, Weijia Shi, Yike Wang, Wenxuan Ding, Orevaoghene Ahia, Shuyue Stella Li, Vidhisha Balachandran, Sunayana Sitaram, Yulia Tsvetkov
Multilingual LLMs often have knowledge disparities across languages, with larger gaps in under-resourced languages.
1 code implementation • 22 Jun 2024 • Shangbin Feng, Taylor Sorensen, YuHan Liu, Jillian Fisher, Chan Young Park, Yejin Choi, Yulia Tsvetkov
Modular Pluralism is uniquely compatible with black-box LLMs and offers the modular control of adding new community LMs for previously underrepresented communities.
2 code implementations • 3 Jun 2024 • Shuyue Stella Li, Vidhisha Balachandran, Shangbin Feng, Jonathan S. Ilgen, Emma Pierson, Pang Wei Koh, Yulia Tsvetkov
In this paper, we propose to change the static paradigm to an interactive one, develop systems that proactively ask questions to gather more information and respond reliably, and introduce a benchmark -- MediQ -- to evaluate question-asking ability in LLMs.
1 code implementation • 25 Apr 2024 • Kabir Ahuja, Vidhisha Balachandran, Madhur Panwar, Tianxing He, Noah A. Smith, Navin Goyal, Yulia Tsvetkov
Transformers trained on natural language data have been shown to learn its hierarchical structure and generalize to sentences with unseen syntactic structures without explicitly encoding any structural bias.
no code implementations • 10 Apr 2024 • Yu Ying Chiu, Liwei Jiang, Maria Antoniak, Chan Young Park, Shuyue Stella Li, Mehar Bhatia, Sahithya Ravi, Yulia Tsvetkov, Vered Shwartz, Yejin Choi
Our study reveals that CulturalTeaming's various modes of AI assistance support annotators in creating, in a gamified manner, cultural questions that modern LLMs fail at.
1 code implementation • 16 Mar 2024 • Fahim Faisal, Orevaoghene Ahia, Aarohi Srivastava, Kabir Ahuja, David Chiang, Yulia Tsvetkov, Antonios Anastasopoulos
This allows for a comprehensive evaluation of NLP system performance on different language varieties.
1 code implementation • 5 Mar 2024 • Aly M. Kassem, Omar Mahmoud, Niloofar Mireshghallah, Hyunwoo Kim, Yulia Tsvetkov, Yejin Choi, Sherif Saad, Santu Rana
In this paper, we introduce a black-box prompt optimization method that uses an attacker LLM agent to uncover higher levels of memorization in a victim model than is revealed by prompting the target model directly with its training data, the dominant approach to quantifying memorization in LLMs.
1 code implementation • 27 Feb 2024 • Roy Xie, Orevaoghene Ahia, Yulia Tsvetkov, Antonios Anastasopoulos
Identifying linguistic differences between dialects of a language often requires expert knowledge and meticulous human analysis.
1 code implementation • 18 Feb 2024 • Yichen Wang, Shangbin Feng, Abe Bohan Hou, Xiao Pu, Chao Shen, Xiaoming Liu, Yulia Tsvetkov, Tianxing He
Our experiments reveal that almost none of the existing detectors remain robust under all the attacks, and all detectors exhibit different loopholes.
1 code implementation • 16 Feb 2024 • Herun Wan, Shangbin Feng, Zhaoxuan Tan, Heng Wang, Yulia Tsvetkov, Minnan Luo
Challenges in factuality and hallucination prevent large language models from being directly employed off-the-shelf to judge the veracity of news articles, where factual accuracy is paramount.
1 code implementation • 12 Feb 2024 • Michael Duan, Anshuman Suri, Niloofar Mireshghallah, Sewon Min, Weijia Shi, Luke Zettlemoyer, Yulia Tsvetkov, Yejin Choi, David Evans, Hannaneh Hajishirzi
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data.
1 code implementation • 1 Feb 2024 • Shangbin Feng, Weijia Shi, Yike Wang, Wenxuan Ding, Vidhisha Balachandran, Yulia Tsvetkov
Despite efforts to expand the knowledge of large language models (LLMs), knowledge gaps -- missing or outdated information in LLMs -- might always persist given the evolving nature of knowledge.
1 code implementation • 1 Feb 2024 • Shangbin Feng, Herun Wan, Ningnan Wang, Zhaoxuan Tan, Minnan Luo, Yulia Tsvetkov
Social media bot detection has always been an arms race between advancements in machine learning bot detectors and adversarial bot strategies to evade detection.
2 code implementations • 16 Jan 2024 • Alisa Liu, Xiaochuang Han, Yizhong Wang, Yulia Tsvetkov, Yejin Choi, Noah A. Smith
Despite the general capabilities of large pretrained language models, they consistently benefit from further adaptation to better achieve desired behaviors.
no code implementations • 12 Jan 2024 • Abhika Mishra, Akari Asai, Vidhisha Balachandran, Yizhong Wang, Graham Neubig, Yulia Tsvetkov, Hannaneh Hajishirzi
On our benchmark, our automatic and human evaluations show that FAVA significantly outperforms ChatGPT and GPT-4 on fine-grained hallucination detection, and edits suggested by FAVA improve the factuality of LM-generated text.
1 code implementation • 16 Nov 2023 • YuHan Liu, Shangbin Feng, Xiaochuang Han, Vidhisha Balachandran, Chan Young Park, Sachin Kumar, Yulia Tsvetkov
In this work, we take a first step towards designing summarization systems that are faithful to the author's intent, not only the semantic content of the article.
no code implementations • 13 Nov 2023 • Sachin Kumar, Chan Young Park, Yulia Tsvetkov
GEN-Z is generative, as it measures the LM likelihood of input text, conditioned on natural language descriptions of labels.
1 code implementation • 27 Oct 2023 • Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou, Yulia Tsvetkov, Maarten Sap, Reza Shokri, Yejin Choi
The interactive use of large language models (LLMs) in AI assistants (at work, home, etc.)
1 code implementation • 17 Oct 2023 • Melanie Sclar, Yejin Choi, Yulia Tsvetkov, Alane Suhr
In this work, we focus on LLM sensitivity to a quintessential class of meaning-preserving design choices: prompt formatting.
1 code implementation • 15 Oct 2023 • Yuyang Bai, Shangbin Feng, Vidhisha Balachandran, Zhaoxuan Tan, Shiqi Lou, Tianxing He, Yulia Tsvetkov
To gain a better understanding of LLMs' knowledge abilities and their generalization, we evaluate 10 open-source and black-box LLMs on the KGQuiz benchmark across the five knowledge-intensive tasks and knowledge domains.
2 code implementations • 11 Oct 2023 • Devvrit, Sneha Kudugunta, Aditya Kusupati, Tim Dettmers, Kaifeng Chen, Inderjit Dhillon, Yulia Tsvetkov, Hannaneh Hajishirzi, Sham Kakade, Ali Farhadi, Prateek Jain
Foundation models are applied in a broad spectrum of settings with different inference constraints, from massive multi-accelerator clusters to resource-constrained standalone mobile devices.
no code implementations • 8 Oct 2023 • Xiao Pu, Jingyu Zhang, Xiaochuang Han, Yulia Tsvetkov, Tianxing He
The rampant proliferation of large language models, fluent enough to generate text indistinguishable from human-written language, gives unprecedented importance to the detection of machine-generated text.
1 code implementation • 6 Oct 2023 • Abe Bohan Hou, Jingyu Zhang, Tianxing He, Yichen Wang, Yung-Sung Chuang, Hongwei Wang, Lingfeng Shen, Benjamin Van Durme, Daniel Khashabi, Yulia Tsvetkov
Existing watermarking algorithms are vulnerable to paraphrase attacks because of their token-level design.
1 code implementation • 2 Oct 2023 • Wenxuan Ding, Shangbin Feng, YuHan Liu, Zhaoxuan Tan, Vidhisha Balachandran, Tianxing He, Yulia Tsvetkov
The novel setting of geometric knowledge reasoning necessitates new LM abilities beyond existing atomic/linear multi-hop QA, such as backtracking, verifying facts and constraints, reasoning with uncertainty, and more.
1 code implementation • 2 Oct 2023 • Yike Wang, Shangbin Feng, Heng Wang, Weijia Shi, Vidhisha Balachandran, Tianxing He, Yulia Tsvetkov
To this end, we introduce an evaluation framework for simulating contextual knowledge conflicts and quantitatively evaluating to what extent LLMs achieve these goals.
1 code implementation • 29 Sep 2023 • Mengke Zhang, Tianxing He, Tianle Wang, Lu Mi, Fatemehsadat Mireshghallah, Binyi Chen, Hao Wang, Yulia Tsvetkov
In the current user-server interaction paradigm of prompted generation with large language models (LLMs) in the cloud, the server fully controls the generation process, leaving zero options for users who want to keep the generated text to themselves.
no code implementations • 26 Jun 2023 • Xiaochuang Han, Daniel Simig, Todor Mihaylov, Yulia Tsvetkov, Asli Celikyilmaz, Tianlu Wang
We observe that a continued pretraining on this small subset significantly improves the model's ICL ability, by up to 18%.
no code implementations • 1 Jun 2023 • Melanie Sclar, Sachin Kumar, Peter West, Alane Suhr, Yejin Choi, Yulia Tsvetkov
We present SymbolicToM, a plug-and-play approach to reason about the belief states of multiple characters in reading comprehension tasks via explicit symbolic representation.
no code implementations • 30 May 2023 • Anjalie Field, Amanda Coston, Nupoor Gandhi, Alexandra Chouldechova, Emily Putnam-Hornstein, David Steier, Yulia Tsvetkov
Given well-established racial bias in this setting, we investigate possible ways deployed NLP is liable to increase racial disparities.
no code implementations • 24 May 2023 • Xiaochuang Han, Sachin Kumar, Yulia Tsvetkov, Marjan Ghazvininejad
Diffusion-based language models are emerging as a promising alternative to autoregressive LMs: they approach the competence of autoregressive LMs while offering nuanced controllability at inference time.
no code implementations • 24 May 2023 • Akari Asai, Sneha Kudugunta, Xinyan Velocity Yu, Terra Blevins, Hila Gonen, Machel Reid, Yulia Tsvetkov, Sebastian Ruder, Hannaneh Hajishirzi
Despite remarkable advancements in few-shot generalization in natural language processing, most models are developed and evaluated primarily in English.
3 code implementations • 24 May 2023 • Weijia Shi, Xiaochuang Han, Mike Lewis, Yulia Tsvetkov, Luke Zettlemoyer, Scott Wen-tau Yih
Language models (LMs) often struggle to pay enough attention to the input context, and generate texts that are unfaithful or contain hallucinations.
no code implementations • 24 May 2023 • Yueqi Song, Catherine Cui, Simran Khanuja, Pengfei Liu, Fahim Faisal, Alissa Ostapenko, Genta Indra Winata, Alham Fikri Aji, Samuel Cahyawijaya, Yulia Tsvetkov, Antonios Anastasopoulos, Graham Neubig
Despite the major advances in NLP, significant disparities in NLP system performance across languages still exist.
no code implementations • 23 May 2023 • Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Jungo Kasai, David R. Mortensen, Noah A. Smith, Yulia Tsvetkov
Language models have graduated from being research prototypes to commercialized products offered as web APIs, and recent works have highlighted the multilingual capabilities of these products.
no code implementations • 23 May 2023 • Lucille Njoo, Chan Young Park, Octavia Stappart, Marvin Thielk, Yi Chu, Yulia Tsvetkov
Empowering language is important in many real-world contexts, from education to workplace dynamics to healthcare.
2 code implementations • NeurIPS 2023 • Heng Wang, Shangbin Feng, Tianxing He, Zhaoxuan Tan, Xiaochuang Han, Yulia Tsvetkov
We then propose Build-a-Graph Prompting and Algorithmic Prompting, two instruction-based approaches to enhance LLMs in solving natural language graph problems.
2 code implementations • 17 May 2023 • Shangbin Feng, Weijia Shi, Yuyang Bai, Vidhisha Balachandran, Tianxing He, Yulia Tsvetkov
Ultimately, Knowledge Card framework enables dynamic synthesis and updates of knowledge from diverse domains.
2 code implementations • 15 May 2023 • Shangbin Feng, Chan Young Park, YuHan Liu, Yulia Tsvetkov
We focus on hate speech and misinformation detection, aiming to empirically quantify the effects of political (social, economic) biases in pretraining data on the fairness of high-stakes social-oriented tasks.
1 code implementation • 14 May 2023 • Shangbin Feng, Vidhisha Balachandran, Yuyang Bai, Yulia Tsvetkov
We propose FactKB, a simple new approach to factuality evaluation that is generalizable across domains, in particular with respect to entities and relations.
2 code implementations • 31 Mar 2023 • Leon Derczynski, Hannah Rose Kirk, Vidhisha Balachandran, Sachin Kumar, Yulia Tsvetkov, M. R. Leiser, Saif Mohammad
However, there is no risk-centric framework for documenting the complexity of a landscape in which some risks are shared across models and contexts, while others are specific, and where certain conditions may be required for risks to manifest as harms.
no code implementations • 20 Dec 2022 • Weijia Shi, Xiaochuang Han, Hila Gonen, Ari Holtzman, Yulia Tsvetkov, Luke Zettlemoyer
Large language models can perform new tasks in a zero-shot fashion, given natural language prompts that specify the desired behavior.
1 code implementation • 20 Dec 2022 • Tianxing He, Jingyu Zhang, Tianle Wang, Sachin Kumar, Kyunghyun Cho, James Glass, Yulia Tsvetkov
In this work, we explore a useful but often neglected methodology for robustness analysis of text generation evaluation metrics: stress tests with synthetic data.
2 code implementations • 31 Oct 2022 • Xiaochuang Han, Sachin Kumar, Yulia Tsvetkov
Despite the growing success of diffusion models in continuous-valued domains (e.g., images), similar efforts for discrete domains such as text have yet to match the performance of autoregressive language models.
no code implementations • 27 Oct 2022 • Inna Wanyin Lin, Lucille Njoo, Anjalie Field, Ashish Sharma, Katharina Reinecke, Tim Althoff, Yulia Tsvetkov
Mental health stigma prevents many individuals from receiving the appropriate care, and social psychology studies have shown that mental health tends to be overlooked in men.
no code implementations • 25 Oct 2022 • Melanie Sclar, Peter West, Sachin Kumar, Yulia Tsvetkov, Yejin Choi
Moreover, we uniquely propose iterative distillation of knowledge, where student models from the previous iteration of distillation serve as teacher models in the next iteration.
1 code implementation • 22 Oct 2022 • Vidhisha Balachandran, Hannaneh Hajishirzi, William W. Cohen, Yulia Tsvetkov
Abstractive summarization models often generate inconsistent summaries containing factual errors or hallucinated content.
no code implementations • 14 Oct 2022 • Sachin Kumar, Vidhisha Balachandran, Lucille Njoo, Antonios Anastasopoulos, Yulia Tsvetkov
Recent advances in the capacity of large language models to generate human-like text have resulted in their increased adoption in user-facing settings.
1 code implementation • 8 Oct 2022 • Shangbin Feng, Zhaoxuan Tan, Wenqian Zhang, Zhenyu Lei, Yulia Tsvetkov
With the advent of pretrained language models (LMs), increasing research efforts have been focusing on infusing commonsense and domain-specific knowledge to prepare LMs for downstream tasks.
1 code implementation • 25 May 2022 • Sachin Kumar, Biswajit Paria, Yulia Tsvetkov
Large pretrained language models generate fluent text but are notoriously hard to controllably sample from.
no code implementations • 25 May 2022 • Xiaochuang Han, Yulia Tsvetkov
Large pretrained language models have been performing increasingly well in a variety of downstream tasks via prompting.
no code implementations • 24 May 2022 • Chan Young Park, Julia Mendelsohn, Anjalie Field, Yulia Tsvetkov
NLP research on public opinion manipulation campaigns has primarily focused on detecting overt strategies such as fake news and disinformation.
1 code implementation • ACL 2022 • Alissa Ostapenko, Shuly Wintner, Melinda Fricke, Yulia Tsvetkov
Natural language processing (NLP) models trained on people-generated data can be unreliable because, without any constraints, they can learn from spurious correlations that are not relevant to the task.
1 code implementation • 15 Mar 2022 • Rishabh Joshi, Vidhisha Balachandran, Emily Saldanha, Maria Glenski, Svitlana Volkova, Yulia Tsvetkov
Keyphrase extraction aims at automatically extracting a list of "important" phrases representing the key concepts in a document.
1 code implementation • EMNLP (MRL) 2021 • Monisha Jegadeesan, Sachin Kumar, John Wieting, Yulia Tsvetkov
We present a novel technique for zero-shot paraphrase generation.
1 code implementation • 14 Oct 2021 • Liwei Jiang, Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jenny Liang, Jesse Dodge, Keisuke Sakaguchi, Maxwell Forbes, Jon Borchardt, Saadia Gabriel, Yulia Tsvetkov, Oren Etzioni, Maarten Sap, Regina Rini, Yejin Choi
As AI systems become increasingly powerful and pervasive, there are growing concerns about machines' morality or a lack thereof.
1 code implementation • Findings (EMNLP) 2021 • Chan Young Park, Julia Mendelsohn, Karthik Radhakrishnan, Kinjal Jain, Tushar Kanakagiri, David Jurgens, Yulia Tsvetkov
Online platforms and communities establish their own norms that govern what behavior is acceptable within the community.
1 code implementation • Findings (EMNLP) 2021 • Xiaochuang Han, Yulia Tsvetkov
Among the most critical limitations of deep learning NLP models are their lack of interpretability, and their reliance on spurious correlations.
1 code implementation • CRAC (ACL) 2021 • Nupoor Gandhi, Anjalie Field, Yulia Tsvetkov
Recent work has shown fine-tuning neural coreference models can produce strong performance when adapting to different domains.
1 code implementation • Findings (EMNLP) 2021 • Xinyi Wang, Yulia Tsvetkov, Sebastian Ruder, Graham Neubig
Adapters are light-weight modules that allow parameter-efficient fine-tuning of pretrained models.
2 code implementations • ICLR 2022 • Zirui Wang, Jiahui Yu, Adams Wei Yu, Zihang Dai, Yulia Tsvetkov, Yuan Cao
With recent progress in joint modeling of visual and textual representations, Vision-Language Pretraining (VLP) has achieved impressive performance on many multimodal downstream tasks.
1 code implementation • NeurIPS 2021 • Sachin Kumar, Eric Malmi, Aliaksei Severyn, Yulia Tsvetkov
As large-scale language model pretraining pushes the state-of-the-art in text generation, recent work has turned to controlling attributes of the text such models generate.
no code implementations • ACL 2021 • Anjalie Field, Su Lin Blodgett, Zeerak Waseem, Yulia Tsvetkov
Despite inextricable ties between race and language, little work has considered race in NLP research and development.
no code implementations • ACL 2021 • Sachin Kumar, Antonios Anastasopoulos, Shuly Wintner, Yulia Tsvetkov
State-of-the-art machine translation (MT) systems are typically trained to generate the "standard" target language; however, many languages have multiple varieties (regional varieties, dialects, sociolects, non-native varieties) that are different from the standard language.
1 code implementation • Findings (ACL) 2021 • Prakhar Gupta, Yulia Tsvetkov, Jeffrey P. Bigham
Experiments on classification, ranking and evaluation tasks across multiple datasets demonstrate that our approaches outperform strong baselines in providing informative negative examples for training dialogue systems.
2 code implementations • ICLR 2021 • Rishabh Joshi, Vidhisha Balachandran, Shikhar Vashishth, Alan Black, Yulia Tsvetkov
To successfully negotiate a deal, it is not enough to communicate fluently: pragmatic planning of persuasive negotiation strategies is essential.
2 code implementations • NAACL 2021 • Artidoro Pagnoni, Vidhisha Balachandran, Yulia Tsvetkov
Modern summarization models generate highly fluent but often factually unreliable outputs.
no code implementations • EMNLP (MRQA) 2021 • Vidhisha Balachandran, Ashish Vaswani, Yulia Tsvetkov, Niki Parmar
Dense retrieval has been shown to be effective for retrieving relevant documents for Open Domain QA, surpassing popular sparse retrieval methods like BM25.
no code implementations • 31 Mar 2021 • Lidia Kidane, Sachin Kumar, Yulia Tsvetkov
It has been shown that the performance of neural machine translation (NMT) drops starkly in low-resource conditions, often requiring large amounts of auxiliary data to achieve competitive results.
1 code implementation • EMNLP 2021 • Adithya Pratapa, Antonios Anastasopoulos, Shruti Rijhwani, Aditi Chaudhary, David R. Mortensen, Graham Neubig, Yulia Tsvetkov
Text generation systems are ubiquitous in natural language processing applications.
2 code implementations • EMNLP 2021 • Dheeraj Rajagopal, Vidhisha Balachandran, Eduard Hovy, Yulia Tsvetkov
We introduce SelfExplain, a novel self-explaining model that explains a text classifier's predictions using phrase-based concepts.
1 code implementation • 31 Dec 2020 • Anjalie Field, Chan Young Park, Kevin Z. Lin, Yulia Tsvetkov
In this work, we present a methodology for analyzing Wikipedia pages about people that isolates dimensions of interest (e.g., gender) from other attributes (e.g., occupation).
1 code implementation • CONLL 2020 • Tanmay Parekh, Emily Ahn, Yulia Tsvetkov, Alan W Black
Code-switching is a ubiquitous phenomenon in multilingual communities.
no code implementations • 21 Oct 2020 • Chan Young Park, Xinru Yan, Anjalie Field, Yulia Tsvetkov
Specific lexical choices in narrative text reflect both the writer's attitudes towards people in the narrative and influence the audience's reactions.
no code implementations • NeurIPS Workshop ICBINB 2020 • Sachin Kumar, Yulia Tsvetkov
We posit that this gap is due to the autoregressive nature and architectural requirements of text generation, as well as a fundamental difference in the definition of Wasserstein distance between the image and text domains.
1 code implementation • ICLR 2021 • Zirui Wang, Yulia Tsvetkov, Orhan Firat, Yuan Cao
Massively multilingual models subsuming tens or even hundreds of languages pose great challenges to multi-task optimization.
1 code implementation • EMNLP 2020 • Xiaochuang Han, Yulia Tsvetkov
Modern toxic speech detectors are incompetent in recognizing disguised offensive language, such as adversarial attacks that deliberately avoid known toxic lexicons, or manifestations of implicit bias.
1 code implementation • EMNLP 2020 • Zirui Wang, Zachary C. Lipton, Yulia Tsvetkov
Modern multilingual models are trained on concatenated text from multiple languages in hopes of conferring benefits to each (positive transfer), with the most pronounced benefits accruing to low-resource languages.
1 code implementation • EMNLP 2020 • Aditi Chaudhary, Antonios Anastasopoulos, Adithya Pratapa, David R. Mortensen, Zaid Sheikh, Yulia Tsvetkov, Graham Neubig
Using cross-lingual transfer, even with no expert annotations in the language of interest, our framework extracts a grammatical specification which is nearly equivalent to those created with large amounts of gold-standard annotated data.
1 code implementation • NAACL 2021 • Prakhar Gupta, Jeffrey P. Bigham, Yulia Tsvetkov, Amy Pavel
Dialogue systems pretrained with large language models generate locally coherent responses, but lack the fine-grained control over responses necessary to achieve specific goals.
no code implementations • SEMEVAL 2020 • Sopan Khosla, Rishabh Joshi, Ritam Dutt, Alan W. Black, Yulia Tsvetkov
In this paper we describe our submission for the task of Propaganda Span Identification in news articles.
1 code implementation • WS 2020 • Zi-Yi Dou, Sachin Kumar, Yulia Tsvetkov
The model uses reinforcement learning to directly optimize a bilingual semantic similarity metric between the summaries generated in a target language and gold summaries in a source language.
2 code implementations • EACL 2021 • Jimin Sun, Hwijeen Ahn, Chan Young Park, Yulia Tsvetkov, David R. Mortensen
Much work in cross-lingual transfer learning explored how to select better transfer languages for multilingual tasks, primarily focusing on typological and genealogical similarities between languages.
no code implementations • WS 2020 • Mengzhou Xia, Anjalie Field, Yulia Tsvetkov
In current hate speech datasets, there exists a high correlation between annotators' perceptions of toxicity and signals of African American English (AAE).
1 code implementation • 20 May 2020 • Aman Tyagi, Anjalie Field, Priyank Lathwal, Yulia Tsvetkov, Kathleen M. Carley
Between February 14, 2019 and March 4, 2019, a terrorist attack in Pulwama, Kashmir followed by retaliatory airstrikes led to rising tensions between India and Pakistan, two nuclear-armed countries.
1 code implementation • ACL 2020 • Xiaochuang Han, Byron C. Wallace, Yulia Tsvetkov
In this work, we investigate the use of influence functions for NLP, providing an alternative approach to interpreting neural text classifiers.
1 code implementation • EMNLP 2020 • Anjalie Field, Yulia Tsvetkov
Despite their prevalence in society, social biases are difficult to identify, primarily because human judgements in this domain can be unreliable.
2 code implementations • ACL 2020 • Xinyi Wang, Yulia Tsvetkov, Graham Neubig
When training multilingual machine translation (MT) models that can translate to/from multiple languages, we are faced with imbalanced training sets: some languages have much more training data than others.
no code implementations • 6 Mar 2020 • Julia Mendelsohn, Yulia Tsvetkov, Dan Jurafsky
Dehumanization is a pernicious psychological process that often leads to extreme intergroup bias, hate speech, and violence aimed at targeted social groups.
1 code implementation • EACL 2021 • Vidhisha Balachandran, Artidoro Pagnoni, Jay Yoon Lee, Dheeraj Rajagopal, Jaime Carbonell, Yulia Tsvetkov
To this end, we propose incorporating latent and explicit dependencies across sentences in the source document into end-to-end single-document summarization models.
1 code implementation • SCiL 2020 • Maria Ryskina, Ella Rabinovich, Taylor Berg-Kirkpatrick, David R. Mortensen, Yulia Tsvetkov
Besides presenting a new linguistic application of distributional semantics, this study tackles the linguistic question of the role of language-internal factors (in our case, sparsity) in language change motivated by language-external factors (reflected in frequency growth).
no code implementations • WS 2019 • Gayatri Bhat, Sachin Kumar, Yulia Tsvetkov
Neural models that eliminate the softmax bottleneck by generating word embeddings (rather than multinomial distributions over a vocabulary) attain faster training with fewer learnable parameters.
no code implementations • IJCNLP 2019 • Luke Breitfeller, Emily Ahn, David Jurgens, Yulia Tsvetkov
Microaggressions are subtle, often veiled, manifestations of human biases.
no code implementations • WS 2019 • Chan Young Park, Yulia Tsvetkov
In this paper, we introduce a phrase-based NMT model built upon continuous-output NMT, in which the decoder generates embeddings of words or phrases.
no code implementations • ICLR 2020 • Yiheng Zhou, Yulia Tsvetkov, Alan W. Black, Zhou Yu
We train FSTs on a set of strategies and tactics used in negotiation dialogs.
no code implementations • WS 2019 • Yiheng Zhou, He He, Alan W. Black, Yulia Tsvetkov
We consider a bargaining scenario where a seller and a buyer negotiate the price of an item for sale through a text-based dialog.
1 code implementation • IJCNLP 2019 • Sachin Kumar, Shuly Wintner, Noah A. Smith, Yulia Tsvetkov
Despite impressive performance on many text classification tasks, deep neural networks tend to learn frequent superficial patterns that are specific to the training data and do not always generalize well.
1 code implementation • WS 2019 • Aditi Chaudhary, Elizabeth Salesky, Gayatri Bhat, David R. Mortensen, Jaime G. Carbonell, Yulia Tsvetkov
This paper presents the submission by the CMU-01 team to the SIGMORPHON 2019 task 2 of Morphological Analysis and Lemmatization in Context.
1 code implementation • WS 2019 • Keita Kurita, Nidhi Vyas, Ayush Pareek, Alan W. Black, Yulia Tsvetkov
Contextual word embeddings such as BERT have achieved state-of-the-art performance in numerous NLP tasks.
no code implementations • ACL 2019 • Anjalie Field, Yulia Tsvetkov
While contextualized word representations have improved state-of-the-art benchmarks in many NLP tasks, their potential usefulness for social-oriented tasks remains largely unexplored.
2 code implementations • 8 Apr 2019 • Anjalie Field, Gayatri Bhat, Yulia Tsvetkov
We show that while these articles are sympathetic towards women who have experienced sexual harassment, they consistently present men as most powerful, even after sexual assault allegations.
1 code implementation • NAACL 2019 • Thomas Manzini, Yao Chong Lim, Yulia Tsvetkov, Alan W. Black
Online texts -- across genres, registers, domains, and styles -- are riddled with human stereotypes, expressed in overt or subtle ways.
1 code implementation • ICLR 2019 • Sachin Kumar, Yulia Tsvetkov
The Softmax function is used in the final layer of nearly all existing sequence-to-sequence models for language generation.
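The alternative this line of work explores is a decoder that predicts a word *embedding* directly and emits the nearest vocabulary item, rather than a softmax distribution over the vocabulary. The sketch below is a simplification under stated assumptions: the embeddings are toy values, and plain cosine loss stands in for the von Mises-Fisher-based objective.

```python
import math

# Toy "pretrained" word embeddings (hypothetical values; in practice these
# would be fixed word2vec/fastText vectors for the whole vocabulary).
EMB = {
    "cat": (1.0, 0.0),
    "dog": (0.0, 1.0),
    "car": (0.7071, 0.7071),
}

def _cos(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den

def cosine_loss(pred, target_word):
    """1 - cos(pred, e_target): a simplified stand-in for the vMF loss."""
    return 1.0 - _cos(pred, EMB[target_word])

def decode(pred):
    """Emit a word by nearest-neighbour search instead of a softmax argmax."""
    return max(EMB, key=lambda w: _cos(pred, EMB[w]))
```

Because the output layer's size is the embedding dimension rather than the vocabulary size, this removes the most expensive matrix multiplication in generation.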
no code implementations • 17 Sep 2018 • Shrimai Prabhumoye, Yulia Tsvetkov, Alan W. Black, Ruslan Salakhutdinov
Style transfer is the task of transferring an attribute of a sentence (e.g., formality) while maintaining its semantic content.
no code implementations • EMNLP 2018 • Anjalie Field, Doron Kliger, Shuly Wintner, Jennifer Pan, Dan Jurafsky, Yulia Tsvetkov
Amidst growing concern over media manipulation, NLP attention has focused on overt strategies like censorship and "fake news".
no code implementations • NAACL 2018 • Yulia Tsvetkov, Vinodkumar Prabhakaran, Rob Voigt
As language technologies have become increasingly prevalent, there is a growing awareness that decisions we make about our data, methods, and tools are often tied up with their impact on people and societies.
1 code implementation • TACL 2018 • Ella Rabinovich, Yulia Tsvetkov, Shuly Wintner
We present a computational analysis of cognate effects on the spontaneous linguistic productions of advanced non-native speakers.
3 code implementations • ACL 2018 • Shrimai Prabhumoye, Yulia Tsvetkov, Ruslan Salakhutdinov, Alan W. Black
We first learn a latent representation of the input sentence which is grounded in a language translation model in order to better preserve the meaning of the sentence while reducing stylistic properties.
no code implementations • ACL 2017 • David Jurgens, Yulia Tsvetkov, Dan Jurafsky
Language identification (LID) is a critical first step for processing multilingual text.
no code implementations • WS 2016 • Yulia Tsvetkov, Manaal Faruqui, Chris Dyer
We introduce QVEC-CCA, an intrinsic evaluation metric for word vector representations based on correlations of learned vectors with features extracted from linguistic resources.
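The core computation behind a CCA-based intrinsic metric can be sketched as follows: align a matrix of learned word vectors with a matrix of linguistic features for the same words and report the top canonical correlation. This is a rough illustration of the idea, not the reference QVEC-CCA implementation (which uses all canonical components and specific supersense features).

```python
import numpy as np

def qvec_cca(X, S):
    """Top canonical correlation between word vectors X (n_words x d) and
    linguistic feature vectors S (n_words x k) for the same n_words words.

    Sketch: the largest singular value of the whitened cross-covariance
    matrix equals the top canonical correlation.
    """
    Xc = X - X.mean(axis=0)
    Sc = S - S.mean(axis=0)
    n = len(X)
    Cxx, Css, Cxs = Xc.T @ Xc / n, Sc.T @ Sc / n, Xc.T @ Sc / n

    def inv_sqrt(C):
        # Symmetric inverse square root via eigendecomposition.
        w, V = np.linalg.eigh(C)
        w = np.clip(w, 1e-12, None)  # guard against rank deficiency
        return V @ np.diag(w ** -0.5) @ V.T

    M = inv_sqrt(Cxx) @ Cxs @ inv_sqrt(Css)
    return float(np.linalg.svd(M, compute_uv=False)[0])
```

If the feature matrix is an exact linear transform of the vectors, the score is 1; unrelated matrices score near 0, which is what makes the metric usable for comparing embedding spaces.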
no code implementations • NAACL 2016 • Yulia Tsvetkov, Sunayana Sitaram, Manaal Faruqui, Guillaume Lample, Patrick Littell, David Mortensen, Alan W. Black, Lori Levin, Chris Dyer
We introduce polyglot language models, recurrent neural network models trained to predict symbol sequences in many different languages using shared representations of symbols and conditioning on typological information about the language to be predicted.
no code implementations • ACL 2016 • Yulia Tsvetkov, Manaal Faruqui, Wang Ling, Brian MacWhinney, Chris Dyer
We use Bayesian optimization to learn curricula for word representation learning, optimizing performance on downstream tasks that depend on the learned representations as features.
1 code implementation • WS 2016 • Manaal Faruqui, Yulia Tsvetkov, Pushpendre Rastogi, Chris Dyer
Our study suggests that the use of word similarity tasks for evaluation of word vectors is not sustainable and calls for further research on evaluation methods.
1 code implementation • 5 Feb 2016 • Waleed Ammar, George Mulcaire, Yulia Tsvetkov, Guillaume Lample, Chris Dyer, Noah A. Smith
We introduce new methods for estimating and evaluating embeddings of words in more than fifty languages in a single shared embedding space.
1 code implementation • NAACL 2016 • Manaal Faruqui, Yulia Tsvetkov, Graham Neubig, Chris Dyer
Morphological inflection generation is the task of generating the inflected form of a given lemma corresponding to a particular linguistic transformation.
3 code implementations • IJCNLP 2015 • Manaal Faruqui, Yulia Tsvetkov, Dani Yogatama, Chris Dyer, Noah Smith
Current distributed representations of words show little resemblance to theories of lexical semantics.
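The retrofitting idea from this line of work can be sketched in a few lines: iteratively nudge each word vector toward its neighbours in a semantic lexicon (e.g., synonyms) while keeping it close to its original value. The toy vectors, lexicon, and the alpha=1 / beta=1/degree weighting below follow the commonly described setup but are illustrative assumptions, not the paper's exact configuration.

```python
def retrofit(vectors, lexicon, iters=10):
    """Pull lexicon neighbours' vectors together, anchored to the originals.

    vectors: {word: [floats]}  original distributional vectors (kept fixed)
    lexicon: {word: [neighbour words]}  e.g. synonym links from WordNet
    """
    new = {w: list(v) for w, v in vectors.items()}
    for _ in range(iters):
        for w, nbrs in lexicon.items():
            if w not in new:
                continue
            nbrs = [n for n in nbrs if n in new]
            if not nbrs:
                continue
            beta = 1.0 / len(nbrs)           # neighbour weight: 1/degree
            denom = 1.0 + beta * len(nbrs)   # alpha=1 anchor + neighbour mass
            for d in range(len(new[w])):
                s = vectors[w][d] + beta * sum(new[n][d] for n in nbrs)
                new[w][d] = s / denom
    return new
```

Words absent from the lexicon keep their original vectors; linked words converge toward each other, injecting lexical-semantic structure into the space.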
no code implementations • LREC 2014 • Archna Bhatia, Mandy Simons, Lori Levin, Yulia Tsvetkov, Chris Dyer, Jordan Bender
We present a definiteness annotation scheme that captures the semantic, pragmatic, and discourse information, which we call communicative functions, associated with linguistic descriptions such as "a story about my speech", "the story", "every time I give it", "this slideshow".
1 code implementation • LREC 2014 • Yulia Tsvetkov, Nathan Schneider, Dirk Hovy, Archna Bhatia, Manaal Faruqui, Chris Dyer
We develop a supersense taxonomy for adjectives, based on that of GermaNet, and apply it to English adjectives in WordNet using human annotation and supervised classification.