Search Results for author: Michihiro Yasunaga

Found 21 papers, 16 papers with code

Extending the WILDS Benchmark for Unsupervised Adaptation

no code implementations9 Dec 2021 Shiori Sagawa, Pang Wei Koh, Tony Lee, Irena Gao, Sang Michael Xie, Kendrick Shen, Ananya Kumar, Weihua Hu, Michihiro Yasunaga, Henrik Marklund, Sara Beery, Etienne David, Ian Stavness, Wei Guo, Jure Leskovec, Kate Saenko, Tatsunori Hashimoto, Sergey Levine, Chelsea Finn, Percy Liang

In this work, we present the WILDS 2. 0 update, which extends 8 of the 10 datasets in the WILDS benchmark of distribution shifts to include curated unlabeled data that would be realistically obtainable in deployment.

LM-Critic: Language Models for Unsupervised Grammatical Error Correction

2 code implementations EMNLP 2021 Michihiro Yasunaga, Jure Leskovec, Percy Liang

Training a model for grammatical error correction (GEC) requires a set of labeled ungrammatical / grammatical sentence pairs, but manually annotating such pairs can be expensive.

Grammatical Error Correction Language Modelling

On the Opportunities and Risks of Foundation Models

1 code implementation16 Aug 2021 Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Kohd, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, aditi raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e. g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

Break-It-Fix-It: Unsupervised Learning for Program Repair

1 code implementation11 Jun 2021 Michihiro Yasunaga, Percy Liang

To bridge this gap, we propose a new training approach, Break-It-Fix-It (BIFI), which has two key ideas: (i) we use the critic to check a fixer's output on real bad inputs and add good (fixed) outputs to the training data, and (ii) we train a breaker to generate realistic bad code from good code.

Code Repair Data Augmentation +3

QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering

3 code implementations NAACL 2021 Michihiro Yasunaga, Hongyu Ren, Antoine Bosselut, Percy Liang, Jure Leskovec

The problem of answering questions using knowledge from pre-trained language models (LMs) and knowledge graphs (KGs) presents two challenges: given a QA context (question and answer choice), methods need to (i) identify relevant knowledge from large KGs, and (ii) perform joint reasoning over the QA context and KG.

Common Sense Reasoning Graph Representation Learning +4

Graph-based, Self-Supervised Program Repair from Diagnostic Feedback

2 code implementations ICML 2020 Michihiro Yasunaga, Percy Liang

Second, we present a self-supervised learning paradigm for program repair that leverages unlabeled programs available online to create a large amount of extra program repair examples, which we use to pre-train our models.

Code Generation Graph Learning +2

A Neural Topic-Attention Model for Medical Term Abbreviation Disambiguation

1 code implementation30 Oct 2019 Irene Li, Michihiro Yasunaga, Muhammed Yavuz Nuzumlali, Cesar Caraballo, Shiwani Mahajan, Harlan Krumholz, Dragomir Radev

Specifically, a neural topic-attention model is applied to learn improved contextualized sentence representations for medical term abbreviation disambiguation.

Few-Shot Learning

ScisummNet: A Large Annotated Corpus and Content-Impact Models for Scientific Paper Summarization with Citation Networks

1 code implementation4 Sep 2019 Michihiro Yasunaga, Jungo Kasai, Rui Zhang, Alexander R. Fabbri, Irene Li, Dan Friedman, Dragomir R. Radev

Scientific article summarization is challenging: large, annotated corpora are not available, and the summary should ideally include the article's impacts on research community.

Scientific Document Summarization

The CL-SciSumm Shared Task 2018: Results and Key Insights

1 code implementation2 Sep 2019 Kokil Jaidka, Michihiro Yasunaga, Muthu Kumar Chandrasekaran, Dragomir Radev, Min-Yen Kan

This overview describes the official results of the CL-SciSumm Shared Task 2018 -- the first medium-scale shared task on scientific document summarization in the computational linguistics (CL) domain.

Document Summarization Information Retrieval +1

SParC: Cross-Domain Semantic Parsing in Context

5 code implementations ACL 2019 Tao Yu, Rui Zhang, Michihiro Yasunaga, Yi Chern Tan, Xi Victoria Lin, Suyi Li, Heyang Er, Irene Li, Bo Pang, Tao Chen, Emily Ji, Shreya Dixit, David Proctor, Sungrok Shim, Jonathan Kraft, Vincent Zhang, Caiming Xiong, Richard Socher, Dragomir Radev

The best model obtains an exact match accuracy of 20. 2% over all questions and less than10% over all interaction sequences, indicating that the cross-domain setting and the con-textual phenomena of the dataset present significant challenges for future research.

Semantic Parsing Text-To-Sql

TopicEq: A Joint Topic and Mathematical Equation Model for Scientific Texts

no code implementations16 Feb 2019 Michihiro Yasunaga, John Lafferty

Scientific documents rely on both mathematics and text to communicate ideas.

 Ranked #1 on Topic Models on arXiv (Topic Coherence@50 metric)

Language Modelling Topic Models

SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-DomainText-to-SQL Task

2 code implementations11 Oct 2018 Tao Yu, Michihiro Yasunaga, Kai Yang, Rui Zhang, Dongxu Wang, Zifan Li, Dragomir Radev

In this paper we propose SyntaxSQLNet, a syntax tree network to address the complex and cross-domain text-to-SQL generation task.

Semantic Parsing Text-To-Sql

Robust Multilingual Part-of-Speech Tagging via Adversarial Training

1 code implementation NAACL 2018 Michihiro Yasunaga, Jungo Kasai, Dragomir Radev

Adversarial training (AT) is a powerful regularization method for neural networks, aiming to achieve robustness to input perturbations.

Chunking Dependency Parsing +3

Cannot find the paper you are looking for? You can Submit a new open access paper.