Search Results for author: Yizhong Wang

Found 32 papers, 24 papers with code

Third-Party Language Model Performance Prediction from Instruction

1 code implementation • 19 Mar 2024 • Rahul Nadkarni, Yizhong Wang, Noah A. Smith

Language model-based instruction-following systems have recently shown improved performance on many benchmark tasks, demonstrating their ability to adapt to a broad variety of instructions.

Instruction Following • Language Modelling

Tur[k]ingBench: A Challenge Benchmark for Web Agents

no code implementations • 18 Mar 2024 • Kevin Xu, Yeganeh Kordi, Kate Sanders, Yizhong Wang, Adam Byerly, Jack Zhang, Benjamin Van Durme, Daniel Khashabi

We evaluate the performance of state-of-the-art models, including language-only, vision-only, and layout-only models, and their combinations, on this benchmark.

Set the Clock: Temporal Alignment of Pretrained Language Models

1 code implementation • 26 Feb 2024 • Bowen Zhao, Zander Brumbaugh, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith

We then develop several methods, from prompting to finetuning, to align LMs to use their most recent knowledge when answering questions, and investigate various factors in this alignment.

Can Language Models Act as Knowledge Bases at Scale?

1 code implementation • 22 Feb 2024 • Qiyuan He, Yizhong Wang, Wenya Wang

Large language models (LLMs) have demonstrated remarkable proficiency in understanding and generating responses to complex queries through large-scale pre-training.

Natural Language Queries • World Knowledge

Tuning Language Models by Proxy

1 code implementation • 16 Jan 2024 • Alisa Liu, Xiaochuang Han, Yizhong Wang, Yulia Tsvetkov, Yejin Choi, Noah A. Smith

Despite the general capabilities of large pretrained language models, they consistently benefit from further adaptation to better achieve desired behaviors.

Domain Adaptation • Math +1
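
Concretely, proxy tuning steers a large base model at decoding time by adding the logit offset between a small tuned "expert" and its untuned counterpart, so the base model's weights never change. A minimal sketch; the model names are placeholders and greedy decoding is an illustrative choice:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model names: proxy tuning only assumes the small tuned/untuned
# pair shares a vocabulary (and tokenizer) with the large base model.
base   = AutoModelForCausalLM.from_pretrained("base-model-large")
expert = AutoModelForCausalLM.from_pretrained("small-model-tuned")
anti   = AutoModelForCausalLM.from_pretrained("small-model-untuned")
tok    = AutoTokenizer.from_pretrained("base-model-large")

@torch.no_grad()
def proxy_next_token(input_ids):
    """One greedy decoding step with proxy-tuned logits."""
    logits = base(input_ids).logits[:, -1, :]
    # Shift the base model's logits by the small models' tuning offset.
    offset = expert(input_ids).logits[:, -1, :] - anti(input_ids).logits[:, -1, :]
    return torch.argmax(logits + offset, dim=-1)
```

Because the combination only touches next-token logits, the same offset can be dropped into any sampling loop.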

Fine-grained Hallucination Detection and Editing for Language Models

no code implementations • 12 Jan 2024 • Abhika Mishra, Akari Asai, Vidhisha Balachandran, Yizhong Wang, Graham Neubig, Yulia Tsvetkov, Hannaneh Hajishirzi

On our benchmark, automatic and human evaluations show that FAVA significantly outperforms ChatGPT and GPT-4 on fine-grained hallucination detection, and that edits suggested by FAVA improve the factuality of LM-generated text.

Hallucination • Retrieval

Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2

2 code implementations • 17 Nov 2023 • Hamish Ivison, Yizhong Wang, Valentina Pyatkin, Nathan Lambert, Matthew Peters, Pradeep Dasigi, Joel Jang, David Wadden, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi

Since the release of T\"ULU [Wang et al., 2023b], open resources for instruction tuning have developed quickly, from better base models to new finetuning techniques.

Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

2 code implementations • 17 Oct 2023 • Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, Hannaneh Hajishirzi

Our framework trains a single arbitrary LM that adaptively retrieves passages on-demand, and generates and reflects on retrieved passages and its own generations using special tokens, called reflection tokens.

Fact Verification • Response Generation +1
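
Schematically, the decoding loop the abstract describes could look like the sketch below; the reflection-token names ([Retrieve], [IsRelevant], [IsSupported]) and the lm/retriever interfaces are hypothetical stand-ins for the paper's actual special tokens and models:

```python
def self_rag_answer(lm, retriever, query, k=5):
    """Hypothetical sketch of Self-RAG-style decoding with reflection tokens."""
    # The LM first emits a special token deciding whether retrieval is needed.
    if lm.predict_token(query) == "[Retrieve]":
        scored = []
        for passage in retriever.search(query, k=k):
            draft = lm.generate(query, context=passage)
            # Reflection tokens grade each draft's relevance and support.
            score = (lm.score_token(draft, "[IsRelevant]")
                     + lm.score_token(draft, "[IsSupported]"))
            scored.append((score, draft))
        return max(scored)[1]  # keep the best-critiqued generation
    # Otherwise answer directly from parametric knowledge.
    return lm.generate(query)
```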

Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging

1 code implementation • 17 Oct 2023 • Joel Jang, Seungone Kim, Bill Yuchen Lin, Yizhong Wang, Jack Hessel, Luke Zettlemoyer, Hannaneh Hajishirzi, Yejin Choi, Prithviraj Ammanabrolu

In this work, we study the Reinforcement Learning from Personalized Human Feedback (RLPHF) problem, wherein LLMs are aligned to multiple (sometimes conflicting) preferences by modeling alignment as a Multi-Objective Reinforcement Learning (MORL) problem.

Language Modelling • Large Language Model +2
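
The post-hoc parameter merging in the title can be sketched as a weighted average of per-preference model weights, combined at inference time according to a user's stated preference mix. A minimal sketch assuming PyTorch-style state dicts; the preference models in the usage comment are hypothetical:

```python
def merge_soup(state_dicts, weights):
    """Weighted average of model weights, one state dict per preference."""
    assert abs(sum(weights) - 1.0) < 1e-6, "mixing weights should sum to 1"
    return {name: sum(w * sd[name] for w, sd in zip(weights, state_dicts))
            for name in state_dicts[0]}

# e.g. a user who values conciseness twice as much as friendliness
# (hypothetical preference-specific models):
# merged = merge_soup([concise_sd, friendly_sd], weights=[2/3, 1/3])
```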

BTR: Binary Token Representations for Efficient Retrieval Augmented Language Models

1 code implementation • 2 Oct 2023 • Qingqing Cao, Sewon Min, Yizhong Wang, Hannaneh Hajishirzi

Retrieval augmentation addresses many critical problems in large language models such as hallucination, staleness, and privacy leaks.

Hallucination • Retrieval
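
One plausible reading of the title's binary token representations is sketched below: precompute passage token vectors offline, binarize them to one bit per dimension, and compare them with cheap elementwise ops. The sign-based binarization here is an assumption for illustration, not the paper's exact calibration recipe:

```python
import numpy as np

def binarize(token_reps: np.ndarray) -> np.ndarray:
    """Map real-valued token representations to +/-1 (one bit per dimension)."""
    return np.where(token_reps >= 0, 1, -1).astype(np.int8)

def bit_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Fraction of matching bits; a cheap stand-in for dot-product scoring."""
    return float((a == b).mean())
```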

How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources

1 code implementation • NeurIPS 2023 • Yizhong Wang, Hamish Ivison, Pradeep Dasigi, Jack Hessel, Tushar Khot, Khyathi Raghavi Chandu, David Wadden, Kelsey MacMillan, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi

Our evaluations show that the best model in any given evaluation reaches on average 87% of ChatGPT performance, and 73% of GPT-4 performance, suggesting that further investment in building better base models and instruction-tuning data is required to close the gap.

Instruction Following

TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering

1 code implementation • ICCV 2023 • Yushi Hu, Benlin Liu, Jungo Kasai, Yizhong Wang, Mari Ostendorf, Ranjay Krishna, Noah A Smith

We introduce TIFA (Text-to-Image Faithfulness evaluation with question Answering), an automatic evaluation metric that measures the faithfulness of a generated image to its text input via visual question answering (VQA).

4k • Language Modelling +4
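
The metric the abstract describes reduces to a small loop: generate question-answer pairs from the text prompt, pose the questions to a VQA model over the generated image, and report accuracy. A sketch with hypothetical qg_model and vqa_model interfaces:

```python
def tifa_score(qg_model, vqa_model, prompt, image):
    """Faithfulness = VQA accuracy on questions derived from the text prompt."""
    qa_pairs = qg_model.questions_from(prompt)  # [(question, expected_answer), ...]
    correct = sum(vqa_model.answer(image, q) == a for q, a in qa_pairs)
    return correct / len(qa_pairs)
```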

Self-Instruct: Aligning Language Models with Self-Generated Instructions

16 code implementations • 20 Dec 2022 • Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi

Applying our method to the vanilla GPT3, we demonstrate a 33% absolute improvement over the original model on Super-NaturalInstructions, on par with the performance of InstructGPT-001, which was trained with private user data and human annotations.

Instruction Following • Language Modelling
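
The pipeline behind these numbers bootstraps training data from the model itself: sample existing tasks as in-context examples, ask the model for a new instruction, and keep it only if it is sufficiently novel (the paper filters with ROUGE-L similarity). A schematic sketch; lm.generate and the token-overlap proxy below are simplified placeholders:

```python
import random

def overlap(a: str, b: str) -> float:
    """Crude token-overlap proxy; the paper filters with ROUGE-L similarity."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / max(1, len(ta | tb))

def self_instruct(lm, seed_tasks, rounds=1000, icl_k=8):
    """Schematic Self-Instruct loop: the model generates its own training tasks."""
    pool = list(seed_tasks)
    for _ in range(rounds):
        examples = random.sample(pool, k=min(icl_k, len(pool)))
        new_task = lm.generate("Come up with a new task:\n" + "\n".join(examples))
        if all(overlap(new_task, t) < 0.7 for t in pool):  # keep only novel tasks
            pool.append(new_task)
    return pool  # the grown pool is then used to finetune the original model
```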

HINT: Hypernetwork Instruction Tuning for Efficient Zero- & Few-Shot Generalisation

no code implementations • 20 Dec 2022 • Hamish Ivison, Akshita Bhagia, Yizhong Wang, Hannaneh Hajishirzi, Matthew Peters

By converting instructions into modules, HINT models make compute usage effectively independent of the length of instructions and few-shot example inputs.

In-Context Learning
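
Compute becomes independent of instruction length because a hypernetwork encodes the instruction once into small parameter modules that are reused for every input. A rough PyTorch sketch; the dimensions and the low-rank adapter form are illustrative assumptions:

```python
import torch.nn as nn

class InstructionHypernet(nn.Module):
    """Illustrative hypernetwork: map a pooled instruction encoding to the
    weights of a small adapter, computed once and reused for every input."""
    def __init__(self, enc_dim=768, model_dim=768, rank=64):
        super().__init__()
        self.model_dim, self.rank = model_dim, rank
        self.gen_down = nn.Linear(enc_dim, model_dim * rank)
        self.gen_up = nn.Linear(enc_dim, rank * model_dim)

    def forward(self, instr_enc):  # instr_enc: (enc_dim,), pooled once per task
        w_down = self.gen_down(instr_enc).view(self.model_dim, self.rank)
        w_up = self.gen_up(instr_enc).view(self.rank, self.model_dim)
        return w_down, w_up  # plugged into the LM as adapter weights
```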

One Embedder, Any Task: Instruction-Finetuned Text Embeddings

3 code implementations • 19 Dec 2022 • Hongjin Su, Weijia Shi, Jungo Kasai, Yizhong Wang, Yushi Hu, Mari Ostendorf, Wen-tau Yih, Noah A. Smith, Luke Zettlemoyer, Tao Yu

Our analysis suggests that INSTRUCTOR is robust to changes in instructions, and that instruction finetuning mitigates the challenge of training a single model on diverse datasets.

Information Retrieval • Learning Word Embeddings +3

Automated Lay Language Summarization of Biomedical Scientific Reviews

1 code implementation • 23 Dec 2020 • Yue Guo, Wei Qiu, Yizhong Wang, Trevor Cohen

Health literacy has emerged as a crucial factor in making appropriate health decisions and ensuring treatment outcomes.

Data Augmentation

LiveQA: A Question Answering Dataset over Sports Live

2 code implementations • CCL 2020 • Qianying Liu, Sicong Jiang, Yizhong Wang, Sujian Li

In this paper, we introduce LiveQA, a new question answering dataset constructed from play-by-play live broadcasts.

Multiple-choice • Question Answering

Do NLP Models Know Numbers? Probing Numeracy in Embeddings

1 code implementation • IJCNLP 2019 • Eric Wallace, Yizhong Wang, Sujian Li, Sameer Singh, Matt Gardner

The ability to understand and work with numbers (numeracy) is critical for many complex reasoning tasks.

Question Answering
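
A probing setup in the spirit of the paper: train a simple regressor to recover a number's value from its pretrained embedding, and treat the quality of the fit as evidence of numeracy. In the sketch below, embed() is a hypothetical stand-in for a lookup into pretrained word embeddings:

```python
import numpy as np
from sklearn.linear_model import Ridge

# embed() is a hypothetical stand-in for a pretrained word-embedding lookup.
numbers = np.arange(1, 100)
X = np.stack([embed(str(n)) for n in numbers])
probe = Ridge().fit(X, numbers)
print("value-decoding R^2:", probe.score(X, numbers))  # high R^2 => numeracy signal
```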

Machine Reading Comprehension: a Literature Review

no code implementations • 30 Jun 2019 • Xin Zhang, An Yang, Sujian Li, Yizhong Wang

Machine reading comprehension aims to teach machines to understand text as humans do, and is a challenging new direction in Artificial Intelligence.

Machine Reading Comprehension

Toward Fast and Accurate Neural Discourse Segmentation

1 code implementation • EMNLP 2018 • Yizhong Wang, Sujian Li, Jingfeng Yang

Discourse segmentation, which segments texts into Elementary Discourse Units, is a fundamental step in discourse analysis.

Discourse Segmentation

Bag-of-Words as Target for Neural Machine Translation

1 code implementation • ACL 2018 • Shuming Ma, Xu Sun, Yizhong Wang, Junyang Lin

However, most existing neural machine translation models use only one of the correct translations as the target, so the other correct translations are penalized as if they were incorrect during training.

Machine Translation • Sentence +1
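
The bag-of-words target can be rendered as an auxiliary loss: compare the model's averaged predicted word distribution against the order-free distribution of target words, so alternative correct orderings are not penalized. A simplified sketch, not the paper's exact objective:

```python
import torch
import torch.nn.functional as F

def bow_loss(step_logits, target_ids, vocab_size):
    """Illustrative bag-of-words objective: reward generating the target
    sentence's words regardless of their order (simplified vs. the paper)."""
    # Average predicted next-word distributions over all decoding steps.
    probs = F.softmax(step_logits, dim=-1).mean(dim=0)          # (vocab,)
    # Count each target word to form the bag-of-words target distribution.
    bag = torch.zeros(vocab_size).scatter_add_(
        0, target_ids, torch.ones_like(target_ids, dtype=torch.float))
    bag = bag / bag.sum()
    return -(bag * torch.log(probs + 1e-9)).sum()               # cross-entropy
```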

Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification

no code implementations • ACL 2018 • Yizhong Wang, Kai Liu, Jing Liu, Wei He, Yajuan Lyu, Hua Wu, Sujian Li, Haifeng Wang

Machine reading comprehension (MRC) on real web data usually requires the machine to answer a question by analyzing multiple passages retrieved by a search engine.

Machine Reading Comprehension • Question Answering
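
The verification step in the title can be sketched as answer re-scoring: a candidate extracted from one passage gains support when candidates from other passages agree with it. In the sketch below, `sim` is an assumed string-similarity function (e.g., token overlap):

```python
def cross_passage_verify(candidates, sim):
    """Illustrative cross-passage answer verification: boost candidates that
    other passages' candidate answers agree with.
    candidates: list of (answer_text, reader_score) pairs, one per passage."""
    best_score, best_answer = float("-inf"), None
    for i, (answer, reader_score) in enumerate(candidates):
        support = sum(sim(answer, other)
                      for j, (other, _) in enumerate(candidates) if j != i)
        if reader_score + support > best_score:
            best_score, best_answer = reader_score + support, answer
    return best_answer
```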

DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications

3 code implementations • WS 2018 • Wei He, Kai Liu, Jing Liu, Yajuan Lyu, Shiqi Zhao, Xinyan Xiao, Yu-An Liu, Yizhong Wang, Hua Wu, Qiaoqiao She, Xuan Liu, Tian Wu, Haifeng Wang

Experiments show that human performance is well above current state-of-the-art baseline systems, leaving plenty of room for the community to make improvements.

Machine Reading Comprehension

A Two-Stage Parsing Method for Text-Level Discourse Analysis

1 code implementation • ACL 2017 • Yizhong Wang, Sujian Li, Houfeng Wang

Previous work introduced transition-based algorithms to form a unified architecture of parsing rhetorical structures (including span, nuclearity and relation), but did not achieve satisfactory performance.

Dependency Parsing • Document Summarization +4
