Search Results for author: Christopher D. Manning

Found 151 papers, 73 papers with code

Multilingual self-supervised speech representations improve the speech recognition of low-resource African languages with codeswitching

no code implementations 25 Nov 2023 Tolúlopé Ògúnrèmí, Christopher D. Manning, Dan Jurafsky

While many speakers of low-resource languages regularly code-switch between their languages and other regional languages or English, datasets of codeswitched speech are too small to train bespoke acoustic models from scratch or do language model rescoring.

Language Modelling speech-recognition +1

Fine-tuning Language Models for Factuality

no code implementations 14 Nov 2023 Katherine Tian, Eric Mitchell, Huaxiu Yao, Christopher D. Manning, Chelsea Finn

The fluency and creativity of large pre-trained language models (LLMs) have led to their widespread use, sometimes even as a replacement for traditional search engines.

Misconceptions Misinformation +1

Pushdown Layers: Encoding Recursive Structure in Transformer Language Models

1 code implementation 29 Oct 2023 Shikhar Murty, Pratyusha Sharma, Jacob Andreas, Christopher D. Manning

Recursion is a prominent feature of human language, and fundamentally challenging for self-attention due to the lack of an explicit recursive-state tracking mechanism.

text-classification Text Classification

An Emulator for Fine-Tuning Large Language Models using Small Language Models

no code implementations 19 Oct 2023 Eric Mitchell, Rafael Rafailov, Archit Sharma, Chelsea Finn, Christopher D. Manning

To aid in doing so, we introduce a novel technique for decoupling the knowledge and skills gained in these two stages, enabling a direct answer to the question, "What would happen if we combined the knowledge learned by a large model during pre-training with the knowledge learned by a small model during fine-tuning (or vice versa)?"

Instruction Following

Grokking of Hierarchical Structure in Vanilla Transformers

1 code implementation 30 May 2023 Shikhar Murty, Pratyusha Sharma, Jacob Andreas, Christopher D. Manning

When analyzing the relationship between model-internal properties and grokking, we find that optimal depth for grokking can be identified using the tree-structuredness metric of Murty et al. (2023).

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

2 code implementations NeurIPS 2023 Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn

However, RLHF is a complex and often unstable procedure, first fitting a reward model that reflects the human preferences, and then fine-tuning the large unsupervised LM using reinforcement learning to maximize this estimated reward without drifting too far from the original model.

Language Modelling reinforcement-learning +1

Backpack Language Models

1 code implementation 26 May 2023 John Hewitt, John Thickstun, Christopher D. Manning, Percy Liang

We can interpret a sense vector by inspecting its (non-contextual, linear) projection onto the output space, and intervene on these interpretable hooks to change the model's behavior in predictable ways.

Language Modelling Text Generation +1

Meta-Learning Online Adaptation of Language Models

no code implementations 24 May 2023 Nathan Hu, Eric Mitchell, Christopher D. Manning, Chelsea Finn

We meta-train a small, autoregressive model to reweight the language modeling loss for each token during online fine-tuning, with the objective of maximizing the out-of-date base question-answering model's ability to answer questions about a document after a single weighted gradient step.

Language Modelling Meta-Learning +2

MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions

1 code implementation 24 May 2023 Zexuan Zhong, Zhengxuan Wu, Christopher D. Manning, Christopher Potts, Danqi Chen

The information stored in large language models (LLMs) falls out of date quickly, and retraining from scratch is often not an option.

Language Modelling Multi-hop Question Answering +1

Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback

no code implementations 24 May 2023 Katherine Tian, Eric Mitchell, Allan Zhou, Archit Sharma, Rafael Rafailov, Huaxiu Yao, Chelsea Finn, Christopher D. Manning

A trustworthy real-world prediction system should produce well-calibrated confidence scores; that is, its confidence in an answer should be indicative of the likelihood that the answer is correct, enabling deferral to an expert in cases of low-confidence predictions.

TriviaQA Unsupervised Pre-training

DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature

2 code implementations 26 Jan 2023 Eric Mitchell, Yoonho Lee, Alexander Khazatsky, Christopher D. Manning, Chelsea Finn

In this paper, we identify a property of the structure of an LLM's probability function that is useful for such detection.

Language Modelling Text Detection

Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models

1 code implementation 27 Nov 2022 Peter Henderson, Eric Mitchell, Christopher D. Manning, Dan Jurafsky, Chelsea Finn

A growing ecosystem of large, open-source foundation models has reduced the labeled data and technical expertise necessary to apply machine learning to many new problems.

Blocking Meta-Learning

On Measuring the Intrinsic Few-Shot Hardness of Datasets

1 code implementation 16 Nov 2022 Xinran Zhao, Shikhar Murty, Christopher D. Manning

While advances in pre-training have led to dramatic improvements in few-shot learning of NLP tasks, there is limited understanding of what drives successful few-shot adaptation in datasets.

Few-Shot Learning

Fixing Model Bugs with Natural Language Patches

1 code implementation 7 Nov 2022 Shikhar Murty, Christopher D. Manning, Scott Lundberg, Marco Tulio Ribeiro

Current approaches for fixing systematic problems in NLP models (e.g., regex patches, finetuning on more data) are either brittle, or labor-intensive and liable to shortcuts.

Relation Extraction Sentiment Analysis

Characterizing Intrinsic Compositionality in Transformers with Tree Projections

no code implementations 2 Nov 2022 Shikhar Murty, Pratyusha Sharma, Jacob Andreas, Christopher D. Manning

To evaluate this possibility, we describe an unsupervised and parameter-free method to functionally project the behavior of any transformer into the space of tree-structured networks.

Truncation Sampling as Language Model Desmoothing

1 code implementation 27 Oct 2022 John Hewitt, Christopher D. Manning, Percy Liang

In this light, truncation algorithms aim to perform desmoothing, estimating a subset of the support of the true distribution.

Language Modelling

When can I Speak? Predicting initiation points for spoken dialogue agents

1 code implementation SIGDIAL (ACL) 2022 Siyan Li, Ashwin Paranjape, Christopher D. Manning

Current spoken dialogue systems initiate their turns after a long period of silence (700-1000ms), which leads to little real-time feedback, sluggish responses, and an overall stilted conversational flow.

Language Modelling Spoken Dialogue Systems

Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset

1 code implementation 1 Jul 2022 Peter Henderson, Mark S. Krass, Lucia Zheng, Neel Guha, Christopher D. Manning, Dan Jurafsky, Daniel E. Ho

One concern with the rise of large language models lies with their potential for significant harm, particularly from pretraining on biased, obscene, copyrighted, and private information.

Memory-Based Model Editing at Scale

no code implementations 13 Jun 2022 Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn

We find that only SERAC achieves high performance on all three problems, consistently outperforming existing approaches to model editing by a significant margin.

counterfactual Dialogue Generation +5

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

1 code implementation 9 Jun 2022 Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Bryan Orinion, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Dylan Schrader, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Batchelder, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua B. Tenenbaum, Joshua S. Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mitch Walker, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan A. Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu

BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.

Common Sense Reasoning Math +1

Detecting Label Errors by using Pre-Trained Language Models

1 code implementation 25 May 2022 Derek Chong, Jenny Hong, Christopher D. Manning

We show that large pre-trained language models are inherently highly capable of identifying label errors in natural language datasets: simply examining out-of-sample data points in descending order of fine-tuned task loss significantly outperforms more complex error-detection mechanisms proposed in previous work.
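
The ranking step described in this abstract can be illustrated with a minimal sketch. It assumes per-example class probabilities from some fine-tuned classifier are already available for held-out data; the function name `rank_by_loss` and the toy numbers are hypothetical and are not taken from the paper's code.

```python
import math

def rank_by_loss(probs, labels):
    """Return example indices sorted by descending cross-entropy loss.

    probs  : list of per-class probability lists, one per example
             (assumed to come from a fine-tuned model on out-of-sample data)
    labels : list of given (possibly noisy) integer class labels
    """
    # Cross-entropy of the given label under the model; clamp to avoid log(0).
    losses = [-math.log(max(p[y], 1e-12)) for p, y in zip(probs, labels)]
    # Highest-loss examples come first: these are the candidates to inspect.
    return sorted(range(len(labels)), key=lambda i: losses[i], reverse=True)

# Toy data (hypothetical): example 1 is labelled 0 although the model favours 1.
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
labels = [0, 0, 1]
order = rank_by_loss(probs, labels)
```

Examples at the front of `order` are the ones a reviewer would check first for label errors.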

You Only Need One Model for Open-domain Question Answering

no code implementations 14 Dec 2021 Haejun Lee, Akhil Kedia, Jongwon Lee, Ashwin Paranjape, Christopher D. Manning, Kyoung-Gu Woo

Recent approaches to Open-domain Question Answering refer to an external knowledge base using a retriever model, optionally rerank passages with a separate reranker model and generate an answer using another reader model.

Hard Attention Natural Questions +2

Fast Model Editing at Scale

2 code implementations ICLR 2022 Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning

To enable easy post-hoc editing at scale, we propose Model Editor Networks using Gradient Decomposition (MEND), a collection of small auxiliary editing networks that use a single desired input-output pair to make fast, local edits to a pre-trained model's behavior.

Language Modelling Model Editing

Hindsight: Posterior-guided training of retrievers for improved open-ended generation

no code implementations ICLR 2022 Ashwin Paranjape, Omar Khattab, Christopher Potts, Matei Zaharia, Christopher D. Manning

Many text generation systems benefit from using a retriever to retrieve passages from a textual knowledge corpus (e.g., Wikipedia) which are then provided as additional context to the generator.

Text Generation

ContractNLI: A Dataset for Document-level Natural Language Inference for Contracts

1 code implementation Findings (EMNLP) 2021 Yuta Koreeda, Christopher D. Manning

Reviewing contracts is a time-consuming procedure that incurs large expenses to companies and social inequality to those who cannot afford it.

Multi-Label Classification Natural Language Inference

Conditional probing: measuring usable information beyond a baseline

1 code implementation EMNLP 2021 John Hewitt, Kawin Ethayarajh, Percy Liang, Christopher D. Manning

Probing experiments investigate the extent to which neural representations make properties -- like part-of-speech -- predictable.

Word Embeddings

On the Opportunities and Risks of Foundation Models

3 code implementations 16 Aug 2021 Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

Neural Abstructions: Abstractions that Support Construction for Grounded Language Learning

no code implementations 20 Jul 2021 Kaylee Burns, Christopher D. Manning, Li Fei-Fei

Although virtual agents are increasingly situated in environments where natural language is the most effective mode of interaction with humans, these exchanges are rarely used as an opportunity for learning.

Grounded language learning

Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering

1 code implementation ACL 2021 Siddharth Karamcheti, Ranjay Krishna, Li Fei-Fei, Christopher D. Manning

Active learning promises to alleviate the massive data needs of supervised machine learning: it has successfully improved sample efficiency by an order of magnitude on traditional tasks like topic classification and object recognition.

Active Learning Object Recognition +3

Capturing Logical Structure of Visually Structured Documents with Multimodal Transition Parser

1 code implementation EMNLP (NLLP) 2021 Yuta Koreeda, Christopher D. Manning

While many NLP pipelines assume raw, clean texts, many texts we encounter in the wild, including a vast majority of legal documents, are not so clean, with many of them being visually structured documents (VSDs) such as PDFs.

Boundary Detection

Human-like informative conversations: Better acknowledgements using conditional mutual information

1 code implementation NAACL 2021 Ashwin Paranjape, Christopher D. Manning

This is because models trained with two contexts - new factual content and conversational history - generate responses that are non-specific w.r.t.


SLM: Learning a Discourse Language Representation with Sentence Unshuffling

no code implementations EMNLP 2020 Haejun Lee, Drew A. Hudson, Kangwook Lee, Christopher D. Manning

We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation in a fully self-supervised manner.

Language Modelling

The EOS Decision and Length Extrapolation

1 code implementation EMNLP (BlackboxNLP) 2020 Benjamin Newman, John Hewitt, Percy Liang, Christopher D. Manning

Extrapolation to unseen sequence lengths is a challenge for neural generative models of language.

Contrastive Learning of Medical Visual Representations from Paired Images and Text

7 code implementations 2 Oct 2020 Yuhao Zhang, Hang Jiang, Yasuhide Miura, Christopher D. Manning, Curtis P. Langlotz

Existing work commonly relies on fine-tuning weights transferred from ImageNet pretraining, which is suboptimal due to drastically different image characteristics, or rule-based label extraction from the textual report data paired with medical images, which is inaccurate and hard to generalize.

Contrastive Learning Descriptive +3

Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-Initiative Conversations

no code implementations 27 Aug 2020 Ashwin Paranjape, Abigail See, Kathleen Kenealy, Haojun Li, Amelia Hardy, Peng Qi, Kaushik Ram Sadagopan, Nguyet Minh Phu, Dilara Soylu, Christopher D. Manning

At the end of the competition, Chirpy Cardinal progressed to the finals with an average rating of 3.6/5.0, a median conversation duration of 2 minutes 16 seconds, and a 90th percentile duration of over 12 minutes.

World Knowledge

Finding Universal Grammatical Relations in Multilingual BERT

1 code implementation ACL 2020 Ethan A. Chi, John Hewitt, Christopher D. Manning

Recent work has found evidence that Multilingual BERT (mBERT), a transformer-based multilingual masked language model, is capable of zero-shot cross-lingual transfer, suggesting that some aspects of its representations are shared cross-lingually.

Language Modelling Zero-Shot Cross-Lingual Transfer

Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection

no code implementations LREC 2020 Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Jan Hajič, Christopher D. Manning, Sampo Pyysalo, Sebastian Schuster, Francis Tyers, Daniel Zeman

Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework.

Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation

no code implementations ACL 2020 Kaustubh D. Dhole, Christopher D. Manning

Question Generation (QG) is fundamentally a simple syntactic transformation; however, many aspects of semantics influence what questions are good to form.

Descriptive Question Generation +3

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

17 code implementations ICLR 2020 Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning

Then, instead of training a model that predicts the original identities of the corrupted tokens, we train a discriminative model that predicts whether each token in the corrupted input was replaced by a generator sample or not.

Language Modelling Masked Language Modeling +3
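
The discriminator targets described in the ELECTRA abstract above can be sketched as follows. This is only an illustration of how per-token "replaced or not" labels could be derived from an original and a corrupted sequence; the function name and the token ids are hypothetical, and the generator that produces the corrupted input is assumed to exist elsewhere.

```python
def replaced_token_labels(original_ids, corrupted_ids):
    """Label each position 1 if the token differs from the original, else 0.

    A position where the generator happens to sample the original token
    is labelled 0, because the token there is unchanged.
    """
    if len(original_ids) != len(corrupted_ids):
        raise ValueError("sequences must have the same length")
    return [int(o != c) for o, c in zip(original_ids, corrupted_ids)]

# Toy token ids (hypothetical): only position 1 was replaced.
original = [5, 8, 2, 9]
corrupted = [5, 3, 2, 9]
targets = replaced_token_labels(original, corrupted)
```

A discriminative model would then be trained to predict `targets` from the corrupted sequence at every position.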

Answering Complex Open-domain Questions Through Iterative Query Generation

1 code implementation IJCNLP 2019 Peng Qi, Xiaowen Lin, Leo Mehr, Zijian Wang, Christopher D. Manning

It is challenging for current one-step retrieve-and-read question answering (QA) systems to answer questions like "Which novel by the author of 'Armada' will be adapted as a feature film by Steven Spielberg?"

Information Retrieval Question Answering +1

Do Massively Pretrained Language Models Make Better Storytellers?

1 code implementation CONLL 2019 Abigail See, Aneesh Pappu, Rohun Saxena, Akhila Yerukola, Christopher D. Manning

Large neural language models trained on massive amounts of text have emerged as a formidable strategy for Natural Language Understanding tasks.

Natural Language Understanding Story Generation

Learning by Abstraction: The Neural State Machine

4 code implementations NeurIPS 2019 Drew A. Hudson, Christopher D. Manning

We introduce the Neural State Machine, seeking to bridge the gap between the neural and symbolic views of AI and integrate their complementary strengths for the task of visual reasoning.

Visual Question Answering (VQA) Visual Reasoning

What Does BERT Look At? An Analysis of BERT's Attention

2 code implementations WS 2019 Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D. Manning

Large pre-trained neural networks such as BERT have had great recent success in NLP, motivating a growing body of research investigating what aspects of language they are able to learn from unlabeled data.

Language Modelling

A Structural Probe for Finding Syntax in Word Representations

1 code implementation NAACL 2019 John Hewitt, Christopher D. Manning

Recent work has improved our ability to detect linguistic knowledge in word representations.


GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering

5 code implementations CVPR 2019 Drew A. Hudson, Christopher D. Manning

We introduce GQA, a new dataset for real-world visual reasoning and compositional question answering, seeking to address key shortcomings of previous VQA datasets.

Question Answering Visual Question Answering (VQA) +1

Semi-Supervised Sequence Modeling with Cross-View Training

2 code implementations EMNLP 2018 Kevin Clark, Minh-Thang Luong, Christopher D. Manning, Quoc V. Le

We therefore propose Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data.

CCG Supertagging Dependency Parsing +6

Learning to Summarize Radiology Findings

2 code implementations WS 2018 Yuhao Zhang, Daisy Yi Ding, Tianpei Qian, Christopher D. Manning, Curtis P. Langlotz

The Impression section of a radiology report summarizes crucial radiology findings in natural language and plays a central role in communicating these findings to physicians.

Textual Analogy Parsing: What's Shared and What's Compared among Analogous Facts

2 code implementations EMNLP 2018 Matthew Lamm, Arun Tejasvi Chaganty, Christopher D. Manning, Dan Jurafsky, Percy Liang

To understand a sentence like "whereas only 10% of White Americans live at or below the poverty line, 28% of African Americans do" it is important not only to identify individual facts, e.g., poverty rates of distinct demographic groups, but also the higher-order relations between them, e.g., the disparity between them.

Textual Analogy Parsing

Simpler but More Accurate Semantic Dependency Parsing

3 code implementations ACL 2018 Timothy Dozat, Christopher D. Manning

While syntactic dependency annotations concentrate on the surface or functional structure of a sentence, semantic dependency annotations aim to capture between-word relationships that are more closely related to the meaning of a sentence, using graph-structured representations.

Dependency Parsing Semantic Dependency Parsing

Sentences with Gapping: Parsing and Reconstructing Elided Predicates

2 code implementations NAACL 2018 Sebastian Schuster, Joakim Nivre, Christopher D. Manning

Sentences with gapping, such as Paul likes coffee and Mary tea, lack an overt predicate to indicate the relation between two or more arguments.

Natural Language Understanding Relation Extraction

Importance sampling for unbiased on-demand evaluation of knowledge base population

no code implementations EMNLP 2017 Arun Chaganty, Ashwin Paranjape, Percy Liang, Christopher D. Manning

Our first contribution is a new importance-sampling based evaluation which corrects for this bias by annotating a new system's predictions on-demand via crowdsourcing.

Information Retrieval Knowledge Base Population +1

Stanford's Graph-based Neural Dependency Parser at the CoNLL 2017 Shared Task

no code implementations CONLL 2017 Timothy Dozat, Peng Qi, Christopher D. Manning

This paper describes the neural dependency parser submitted by Stanford to the CoNLL 2017 Shared Task on parsing Universal Dependencies.

Dependency Parsing

Arc-swift: A Novel Transition System for Dependency Parsing

1 code implementation ACL 2017 Peng Qi, Christopher D. Manning

Transition-based dependency parsers often need sequences of local shift and reduce operations to produce certain attachments.

Dependency Parsing

Get To The Point: Summarization with Pointer-Generator Networks

39 code implementations ACL 2017 Abigail See, Peter J. Liu, Christopher D. Manning

Neural sequence-to-sequence models have provided a viable new approach for abstractive text summarization (meaning they are not restricted to simply selecting and rearranging passages from the original text).

Abstractive Text Summarization Document Summarization +1

SceneSeer: 3D Scene Design with Natural Language

no code implementations 28 Feb 2017 Angel X. Chang, Mihail Eric, Manolis Savva, Christopher D. Manning

We present SceneSeer: an interactive text to 3D scene generation system that allows a user to design 3D scenes using natural language.

Scene Generation Text to 3D

A Copy-Augmented Sequence-to-Sequence Architecture Gives Good Performance on Task-Oriented Dialogue

no code implementations EACL 2017 Mihail Eric, Christopher D. Manning

Task-oriented dialogue focuses on conversational agents that participate in user-initiated dialogues on domain-specific topics.

Response Generation

Deep Biaffine Attention for Neural Dependency Parsing

25 code implementations 6 Nov 2016 Timothy Dozat, Christopher D. Manning

This paper builds off recent work from Kiperwasser & Goldberg (2016) using neural attention in a simple graph-based dependency parser.

Dependency Parsing

Compression of Neural Machine Translation Models via Pruning

1 code implementation CONLL 2016 Abigail See, Minh-Thang Luong, Christopher D. Manning

Neural Machine Translation (NMT), like many other deep learning domains, typically suffers from over-parameterization, resulting in large storage sizes.

Machine Translation NMT +1

A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task

3 code implementations ACL 2016 Danqi Chen, Jason Bolton, Christopher D. Manning

Enabling a computer to understand a document so that it can answer comprehension questions is a central, yet unsolved goal of NLP.

Reading Comprehension

Learning Language Games through Interaction

3 code implementations ACL 2016 Sida I. Wang, Percy Liang, Christopher D. Manning

We introduce a new language learning setting relevant to building adaptive natural language interfaces.

Semantic Parsing

Improving Coreference Resolution by Learning Entity-Level Distributed Representations

1 code implementation ACL 2016 Kevin Clark, Christopher D. Manning

A long-standing challenge in coreference resolution has been the incorporation of entity-level information - features defined over clusters of mentions instead of mention pairs.


A comparison of Named-Entity Disambiguation and Word Sense Disambiguation

no code implementations LREC 2016 Angel Chang, Valentin I. Spitkovsky, Christopher D. Manning, Eneko Agirre

Named Entity Disambiguation (NED) is the task of linking a named-entity mention to an instance in a knowledge-base, typically Wikipedia-derived resources like DBpedia.

Entity Disambiguation Word Sense Disambiguation

Evaluating the word-expert approach for Named-Entity Disambiguation

no code implementations 15 Mar 2016 Angel X. Chang, Valentin I. Spitkovsky, Christopher D. Manning, Eneko Agirre

Named Entity Disambiguation (NED) is the task of linking a named-entity mention to an instance in a knowledge-base, typically Wikipedia.

Entity Disambiguation Word Sense Disambiguation

A large annotated corpus for learning natural language inference

3 code implementations EMNLP 2015 Samuel R. Bowman, Gabor Angeli, Christopher Potts, Christopher D. Manning

Understanding entailment and contradiction is fundamental to understanding natural language, and inference about entailment and contradiction is a valuable testing ground for the development of semantic representations.

Image Captioning Natural Language Inference

Effective Approaches to Attention-based Neural Machine Translation

47 code implementations EMNLP 2015 Minh-Thang Luong, Hieu Pham, Christopher D. Manning

Our ensemble model using different attention architectures has established a new state-of-the-art result in the WMT'15 English to German translation task with 25.9 BLEU points, an improvement of 1.0 BLEU points over the existing best system backed by NMT and an n-gram reranker.

Ranked #1 on Machine Translation on 20NEWS (Accuracy metric)

Image-guided Story Ending Generation Machine Translation +2

Tree-structured composition in neural networks without tree-structured architectures

1 code implementation 16 Jun 2015 Samuel R. Bowman, Christopher D. Manning, Christopher Potts

We hypothesize that neural sequence models like LSTMs are in fact able to discover and implicitly use recursive compositional structure, at least for tasks with clear cues to that structure in the data.

Text to 3D Scene Generation with Rich Lexical Grounding

no code implementations IJCNLP 2015 Angel Chang, Will Monroe, Manolis Savva, Christopher Potts, Christopher D. Manning

The ability to map descriptions of scenes to 3D geometric representations has many applications in areas such as art, education, and robotics.

Scene Generation Text to 3D

Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks

16 code implementations IJCNLP 2015 Kai Sheng Tai, Richard Socher, Christopher D. Manning

Because of their superior ability to preserve sequence information over time, Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with a more complex computational unit, have obtained strong results on a variety of sequence modeling tasks.

General Classification Semantic Similarity +2

Learning Distributed Representations for Structured Output Prediction

no code implementations NeurIPS 2014 Vivek Srikumar, Christopher D. Manning

In recent years, distributed representations of inputs have led to performance gains in many applications by allowing statistical information to be shared across inputs.

Document Classification Part-Of-Speech Tagging +1

Global Belief Recursive Neural Networks

no code implementations NeurIPS 2014 Romain Paulus, Richard Socher, Christopher D. Manning

Recursive Neural Networks have recently obtained state of the art performance on several natural language processing tasks.

Sentiment Analysis Sentiment Classification

Simple MAP Inference via Low-Rank Relaxations

1 code implementation NeurIPS 2014 Roy Frostig, Sida Wang, Percy S. Liang, Christopher D. Manning

We focus on the problem of maximum a posteriori (MAP) inference in Markov random fields with binary variables and pairwise interactions.

Learning Distributed Word Representations for Natural Logic Reasoning

no code implementations 15 Oct 2014 Samuel R. Bowman, Christopher Potts, Christopher D. Manning

Natural logic offers a powerful relational conception of meaning that is a natural counterpart to distributed semantic representations, which have proven valuable in a wide range of sophisticated language tasks.

Logical Reasoning Open-Ended Question Answering +1

Recursive Neural Networks Can Learn Logical Semantics

no code implementations WS 2015 Samuel R. Bowman, Christopher Potts, Christopher D. Manning

Tree-structured recursive neural networks (TreeRNNs) for sentence meaning have been successful for many applications, but it remains an open question whether the fixed-length representations that they learn can support tasks as demanding as logical deduction.

Open-Ended Question Answering Relational Reasoning +1

Universal Stanford dependencies: A cross-linguistic typology

no code implementations LREC 2014 Marie-Catherine de Marneffe, Timothy Dozat, Natalia Silveira, Katri Haverinen, Filip Ginter, Joakim Nivre, Christopher D. Manning

Revisiting the now de facto standard Stanford dependency representation, we propose an improved taxonomy to capture grammatical relations across languages, including morphologically rich ones.

Grounded Compositional Semantics for Finding and Describing Images with Sentences

no code implementations TACL 2014 Richard Socher, Andrej Karpathy, Quoc V. Le, Christopher D. Manning, Andrew Y. Ng

Previous work on Recursive Neural Networks (RNNs) shows that these models can produce compositional feature vectors for accurately representing and classifying sentences or images.

Cross-lingual Projected Expectation Regularization for Weakly Supervised Learning

no code implementations TACL 2014 Mengqiu Wang, Christopher D. Manning

We consider a multilingual weakly supervised learning scenario where knowledge from annotated corpora in a resource-rich language is transferred via bitext to guide the learning in other languages.

NER Weakly-supervised Learning

Relaxations for inference in restricted Boltzmann machines

no code implementations 21 Dec 2013 Sida I. Wang, Roy Frostig, Percy Liang, Christopher D. Manning

We propose a relaxation-based approximate inference algorithm that samples near-MAP configurations of a binary pairwise Markov random field.

Reasoning With Neural Tensor Networks for Knowledge Base Completion

no code implementations NeurIPS 2013 Richard Socher, Danqi Chen, Christopher D. Manning, Andrew Ng

We assess the model by considering the problem of predicting additional true relations between entities given a partial knowledge base.

Knowledge Base Completion Tensor Networks

Cross-lingual Pseudo-Projected Expectation Regularization for Weakly Supervised Learning

no code implementations 6 Oct 2013 Mengqiu Wang, Christopher D. Manning

We consider a multilingual weakly supervised learning scenario where knowledge from annotated corpora in a resource-rich language is transferred via bitext to guide the learning in other languages.

NER Weakly-supervised Learning

Robust Logistic Regression using Shift Parameters (Long Version)

no code implementations 21 May 2013 Julie Tibshirani, Christopher D. Manning

Annotation errors can significantly hurt classifier performance, yet datasets are only growing noisier with the increased use of Amazon Mechanical Turk and techniques like distant supervision that automatically generate labels.

named-entity-recognition Named Entity Recognition +2