Search Results for author: Huan Sun

Found 56 papers, 38 papers with code

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

no code implementations 27 Nov 2023 Xiang Yue, Yuansheng Ni, Kai Zhang, Tianyu Zheng, Ruoqi Liu, Ge Zhang, Samuel Stevens, Dongfu Jiang, Weiming Ren, Yuxuan Sun, Cong Wei, Botao Yu, Ruibin Yuan, Renliang Sun, Ming Yin, Boyuan Zheng, Zhenzhu Yang, Yibo Liu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen

We introduce MMMU: a new benchmark designed to evaluate multimodal models on massive multi-discipline tasks demanding college-level subject knowledge and deliberate reasoning.

How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities

1 code implementation 15 Nov 2023 Lingbo Mo, Boshi Wang, Muhao Chen, Huan Sun

The rapid progress in open-source Large Language Models (LLMs) is significantly driving AI development forward.

Ethics Fairness +1

TableLlama: Towards Open Large Generalist Models for Tables

no code implementations 15 Nov 2023 Tianshu Zhang, Xiang Yue, Yifei Li, Huan Sun

Towards that end, we construct TableInstruct, a new dataset with a variety of realistic tables and tasks, for instruction tuning and evaluating LLMs.

MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning

1 code implementation 11 Sep 2023 Xiang Yue, Xingwei Qu, Ge Zhang, Yao Fu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen

The MAmmoTH models are trained on MathInstruct, our meticulously curated instruction tuning dataset.

Mathematical Reasoning

AgentBench: Evaluating LLMs as Agents

1 code implementation 7 Aug 2023 Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, Shudan Zhang, Xiang Deng, Aohan Zeng, Zhengxiao Du, Chenhui Zhang, Sheng Shen, Tianjun Zhang, Yu Su, Huan Sun, Minlie Huang, Yuxiao Dong, Jie Tang

We present AgentBench, a multi-dimensional evolving benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities in a multi-turn open-ended generation setting.

Decision Making Instruction Following

Biomedical Language Models are Robust to Sub-optimal Tokenization

1 code implementation 30 Jun 2023 Bernal Jiménez Gutiérrez, Huan Sun, Yu Su

As opposed to general English, many concepts in biomedical terminology have been designed in recent history by biomedical professionals with the goal of being precise and concise.

Entity Linking Language Modelling +4

MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing

1 code implementation 16 Jun 2023 Kai Zhang, Lingbo Mo, Wenhu Chen, Huan Sun, Yu Su

To address this issue, we introduce MagicBrush (https://osu-nlp-group.github.io/MagicBrush/), the first large-scale, manually annotated dataset for instruction-guided real image editing that covers diverse scenarios: single-turn, multi-turn, mask-provided, and mask-free editing.


Mind2Web: Towards a Generalist Agent for the Web

1 code implementation 9 Jun 2023 Xiang Deng, Yu Gu, Boyuan Zheng, Shijie Chen, Samuel Stevens, Boshi Wang, Huan Sun, Yu Su

We introduce Mind2Web, the first dataset for developing and evaluating generalist agents for the web that can follow language instructions to complete complex tasks on any website.

Federated Learning for Semantic Parsing: Task Formulation, Evaluation Setup, New Algorithms

1 code implementation 26 May 2023 Tianshu Zhang, Changchang Liu, Wei-Han Lee, Yu Su, Huan Sun

By leveraging data from multiple clients, the FL paradigm can be especially beneficial for clients that have little training data to develop a data-hungry neural semantic parser on their own.

Federated Learning Semantic Parsing +1

Error Detection for Text-to-SQL Semantic Parsing

no code implementations 23 May 2023 Shijie Chen, Ziru Chen, Huan Sun, Yu Su

Despite remarkable progress in text-to-SQL semantic parsing in recent years, the performance of existing parsers is still far from perfect.

Language Modelling Semantic Parsing +1

Exploring Chain-of-Thought Style Prompting for Text-to-SQL

no code implementations 23 May 2023 Chang-You Tai, Ziru Chen, Tianshu Zhang, Xiang Deng, Huan Sun

Thus, we systematically study how to enhance LLMs' reasoning ability through chain of thought (CoT) style prompting, including the original chain-of-thought prompting (Wei et al., 2022b) and least-to-most prompting (Zhou et al., 2023).

SQL Parsing Text-To-SQL

Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate

no code implementations 22 May 2023 Boshi Wang, Xiang Yue, Huan Sun

Large language models (LLMs) such as ChatGPT and GPT-4 have shown impressive performance in complex reasoning tasks.

Benchmarking Memorization

Text-to-SQL Error Correction with Language Models of Code

1 code implementation 22 May 2023 Ziru Chen, Shijie Chen, Michael White, Raymond Mooney, Ali Payani, Jayanth Srinivasa, Yu Su, Huan Sun

Thus, we propose a novel representation for SQL queries and their edits that adheres more closely to the pre-training corpora of language models of code.

SQL Parsing Text-To-SQL

Automatic Evaluation of Attribution by Large Language Models

1 code implementation 10 May 2023 Xiang Yue, Boshi Wang, Ziru Chen, Kai Zhang, Yu Su, Huan Sun

We manually curate a set of test examples covering 12 domains from a generative search engine, New Bing.

Fact Checking Language Modelling +3

Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning

no code implementations 6 Mar 2023 Zhen Wang, Rameswar Panda, Leonid Karlinsky, Rogerio Feris, Huan Sun, Yoon Kim

Prompt tuning, in which a base pretrained model is adapted to each task via conditioning on learned prompt vectors, has emerged as a promising approach for efficiently adapting large language models to multiple downstream tasks.

Transfer Learning

Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters

1 code implementation 20 Dec 2022 Boshi Wang, Sewon Min, Xiang Deng, Jiaming Shen, You Wu, Luke Zettlemoyer, Huan Sun

Chain-of-Thought (CoT) prompting can dramatically improve the multi-step reasoning abilities of large language models (LLMs).

Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe

1 code implementation 25 Oct 2022 Xiang Yue, Huseyin A. Inan, Xuechen Li, Girish Kumar, Julia McAnallen, Hoda Shajari, Huan Sun, David Levitan, Robert Sim

Privacy concerns have attracted increasing attention in data-driven products due to the tendency of machine learning models to memorize sensitive training data.

Language Modelling Text Generation

Bootstrapping a User-Centered Task-Oriented Dialogue System

no code implementations 11 Jul 2022 Shijie Chen, Ziru Chen, Xiang Deng, Ashley Lewis, Lingbo Mo, Samuel Stevens, Zhen Wang, Xiang Yue, Tianshu Zhang, Yu Su, Huan Sun

We present TacoBot, a task-oriented dialogue system built for the inaugural Alexa Prize TaskBot Challenge, which assists users in completing multi-step cooking and home improvement tasks.

Data Augmentation Dialogue Management +2

$\mathsf{G^2Retro}$ as a Two-Step Graph Generative Models for Retrosynthesis Prediction

1 code implementation 10 Jun 2022 Ziqi Chen, Oluwatosin R. Ayinde, James R. Fuchs, Huan Sun, Xia Ning

It first predicts the reaction centers in the target molecules (products), identifies the synthons needed to assemble the products, and transforms these synthons into reactants.

Retrosynthesis Vocal Bursts Valence Prediction

Thinking about GPT-3 In-Context Learning for Biomedical IE? Think Again

1 code implementation 16 Mar 2022 Bernal Jiménez Gutiérrez, Nikolas McNeal, Clay Washington, You Chen, Lang Li, Huan Sun, Yu Su

In this paper, we present the first systematic and comprehensive study to compare the few-shot performance of GPT-3 in-context learning with fine-tuning smaller (i.e., BERT-sized) PLMs on two highly representative biomedical information extraction tasks, named entity recognition and relation extraction.

Model Selection named-entity-recognition +4

Iteratively Prompt Pre-trained Language Models for Chain of Thought

1 code implementation 16 Mar 2022 Boshi Wang, Xiang Deng, Huan Sun

While Pre-trained Language Models (PLMs) internalize a great amount of world knowledge, they have been shown incapable of recalling this knowledge to solve tasks requiring complex & multi-step reasoning.

Synthetic Question Value Estimation for Domain Adaptation of Question Answering

1 code implementation ACL 2022 Xiang Yue, Ziyu Yao, Huan Sun

Synthesizing QA pairs with a question generator (QG) on the target domain has become a popular approach for domain adaptation of question answering (QA) models.

Domain Adaptation Question Answering

DOM-LM: Learning Generalizable Representations for HTML Documents

1 code implementation 25 Jan 2022 Xiang Deng, Prashant Shiralkar, Colin Lockard, Binxuan Huang, Huan Sun

We argue that the text and HTML structure together convey important semantics of the content and therefore warrant a special treatment for their representation learning.

Attribute Extraction Open Information Extraction +2

TopNet: Learning from Neural Topic Model to Generate Long Stories

no code implementations 14 Dec 2021 Yazheng Yang, Boyuan Pan, Deng Cai, Huan Sun

In particular, instead of directly generating a story, we first learn to map the short text input to a low-dimensional topic distribution (which is pre-assigned by a topic model).

Story Generation

Towards Transparent Interactive Semantic Parsing via Step-by-Step Correction

1 code implementation Findings (ACL) 2022 Lingbo Mo, Ashley Lewis, Huan Sun, Michael White

In this work, we investigate an interactive semantic parsing framework that explains the predicted logical form step by step in natural language and enables the user to make corrections through natural-language feedback for individual steps.

Question Answering Semantic Parsing

ReasonBERT: Pre-trained to Reason with Distant Supervision

1 code implementation EMNLP 2021 Xiang Deng, Yu Su, Alyssa Lees, You Wu, Cong Yu, Huan Sun

We present ReasonBert, a pre-training method that augments language models with the ability to reason over long-range relations and multiple, possibly hybrid contexts.

 Ranked #1 on Question Answering on HotpotQA (Joint F1 metric)

Extractive Question-Answering Question Answering +1

Differential Privacy for Text Analytics via Natural Text Sanitization

1 code implementation Findings (ACL) 2021 Xiang Yue, Minxin Du, Tianhao Wang, Yaliang Li, Huan Sun, Sherman S. M. Chow

The sanitized texts also contribute to our sanitization-aware pretraining and fine-tuning, enabling privacy-preserving natural language processing over the BERT language model with promising utility.

Language Modelling Privacy Preserving

Learning Structural Edits via Incremental Tree Transformations

1 code implementation ICLR 2021 Ziyu Yao, Frank F. Xu, Pengcheng Yin, Huan Sun, Graham Neubig

To show the unique benefits of modeling tree edits directly, we further propose a novel edit encoder for learning to represent edits, as well as an imitation learning method that allows the editor to be more robust.

Imitation Learning

CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering

2 code implementations 30 Oct 2020 Xiang Yue, Xinliang Frederick Zhang, Ziyu Yao, Simon Lin, Huan Sun

Clinical question answering (QA) aims to automatically answer questions from medical professionals based on clinical texts.

Domain Adaptation Question Answering +2

Structure-Grounded Pretraining for Text-to-SQL

no code implementations NAACL 2021 Xiang Deng, Ahmed Hassan Awadallah, Christopher Meek, Oleksandr Polozov, Huan Sun, Matthew Richardson

Additionally, to evaluate different methods under more realistic text-table alignment settings, we create a new evaluation set Spider-Realistic based on Spider dev set with explicit mentions of column names removed, and adopt eight existing text-to-SQL datasets for cross-database evaluation.


COUGH: A Challenge Dataset and Models for COVID-19 FAQ Retrieval

1 code implementation EMNLP 2021 Xinliang Frederick Zhang, Heming Sun, Xiang Yue, Simon Lin, Huan Sun

For evaluation, we introduce Query Bank and Relevance Set, where the former contains 1,236 human-paraphrased queries while the latter contains ~32 human-annotated FAQ items for each query.


Learning a Cost-Effective Annotation Policy for Question Answering

1 code implementation EMNLP 2020 Bernhard Kratzwald, Stefan Feuerriegel, Huan Sun

State-of-the-art question answering (QA) relies upon large amounts of training data for which labeling is time consuming and thus expensive.

Question Answering

Energy Efficiency Optimization in IRS-Enhanced mmWave Systems with Lens Antenna Array

no code implementations 2 Jul 2020 Yazheng Wang, Hancheng Lu, Dan Zhao, Huan Sun

To address this problem, we propose an intelligent reflect surface (IRS) enhanced multi-user mmWave communication system with lens antenna array.


Joint Passive Beamforming and User Association Optimization for IRS-assisted mmWave Systems

no code implementations 2 Jul 2020 Dan Zhao, Hancheng Lu, Yazheng Wang, Huan Sun

Considering the impact of IRS on user association, we formulate a sum rate maximization problem by jointly optimizing the passive beamforming at IRS and user association, which is an intractable non-convex problem.

TURL: Table Understanding through Representation Learning

1 code implementation 26 Jun 2020 Xiang Deng, Huan Sun, Alyssa Lees, You Wu, Cong Yu

In this paper, we present TURL, a novel framework that introduces the pre-training/fine-tuning paradigm to relational Web tables.

Cell Entity Annotation Columns Property Annotation +3

An Imitation Game for Learning Semantic Parsers from User Interaction

1 code implementation EMNLP 2020 Ziyu Yao, Yiqi Tang, Wen-tau Yih, Huan Sun, Yu Su

Despite the widely successful applications, bootstrapping and fine-tuning semantic parsers are still a tedious process with challenges such as costly data annotation and privacy risks.

Imitation Learning Text-To-SQL

Rationalizing Medical Relation Prediction from Corpus-level Statistics

1 code implementation ACL 2020 Zhen Wang, Jennifer Lee, Simon Lin, Huan Sun

Nowadays, the interpretability of machine learning models is becoming increasingly important, especially in the medical domain.

Decision Making

Practical Annotation Strategies for Question Answering Datasets

no code implementations 6 Mar 2020 Bernhard Kratzwald, Xiang Yue, Huan Sun, Stefan Feuerriegel

Here, remarkably, annotating a stratified subset with only 1.2% of the original training set achieves 97.7% of the performance as if the complete dataset was annotated.

Question Answering Test

Easy-to-Hard: Leveraging Simple Questions for Complex Question Generation

no code implementations 5 Dec 2019 Jie Zhao, Xiang Deng, Huan Sun

This paper makes one of the first efforts toward automatically generating complex questions from knowledge graphs.

Data Augmentation Knowledge Graphs +2

An End-to-End Framework for Cold Question Routing in Community Question Answering Services

no code implementations 22 Nov 2019 Jiankai Sun, Jie Zhao, Huan Sun, Srinivasan Parthasarathy

Routing newly posted questions (a.k.a. cold questions) to potential answerers with the suitable expertise in Community Question Answering sites (CQAs) is an important and challenging task.

Community Question Answering Graph Embedding

Model-based Interactive Semantic Parsing: A Unified Framework and A Text-to-SQL Case Study

2 code implementations IJCNLP 2019 Ziyu Yao, Yu Su, Huan Sun, Wen-tau Yih

As a promising paradigm, interactive semantic parsing has been shown to improve both semantic parsing accuracy and user confidence in the results.

Semantic Parsing Text-To-SQL

Automatic Table completion using Knowledge Base

no code implementations 20 Sep 2019 Bortik Bandyopadhyay, Xiang Deng, Goonmeet Bajaj, Huan Sun, Srinivasan Parthasarathy

In this work, we propose to resolve a new type of heterogeneous query viz: tabular query, which contains a natural language query description, column names of the desired table, and an example row.

Decision Making

Leveraging 2-hop Distant Supervision from Table Entity Pairs for Relation Extraction

1 code implementation IJCNLP 2019 Xiang Deng, Huan Sun

Given two entities, distant supervision exploits sentences that directly mention them for predicting their semantic relation.

Relation Extraction

Reinforced Dynamic Reasoning for Conversational Question Generation

1 code implementation ACL 2019 Boyuan Pan, Hao Li, Ziyu Yao, Deng Cai, Huan Sun

This paper investigates a new task named Conversational Question Generation (CQG) which is to generate a question based on a passage and a conversation history (i.e., previous turns of question-answer pairs).

Question Answering Question Generation +2

SurfCon: Synonym Discovery on Privacy-Aware Clinical Data

1 code implementation 21 Jun 2019 Zhen Wang, Xiang Yue, Soheil Moosavinasab, Yungui Huang, Simon Lin, Huan Sun

To solve the problem, we propose a new framework SurfCon that leverages two important types of information in the privacy-aware clinical data, i.e., the surface form information, and the global context information for synonym discovery.

Graph Embedding on Biomedical Networks: Methods, Applications, and Evaluations

4 code implementations 12 Jun 2019 Xiang Yue, Zhen Wang, Jingong Huang, Srinivasan Parthasarathy, Soheil Moosavinasab, Yungui Huang, Simon M. Lin, Wen Zhang, Ping Zhang, Huan Sun

Our experimental results demonstrate that the recent graph embedding methods achieve promising results and deserve more attention in the future biomedical graph analysis.

Graph Embedding Link Prediction +2

CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning

1 code implementation 13 Mar 2019 Ziyu Yao, Jayavardhan Reddy Peddamail, Huan Sun

In this work, we investigate a novel perspective of Code annotation for Code retrieval (hence called 'CoaCor'), where a code annotation model is trained to generate a natural language annotation that can represent the semantic meaning of a given code snippet and can be leveraged by a code retrieval model to better distinguish relevant code snippets from others.

reinforcement-learning Reinforcement Learning (RL) +1

StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow

1 code implementation 26 Mar 2018 Ziyu Yao, Daniel S. Weld, Wei-Peng Chen, Huan Sun

In this paper, we investigate a new problem of systematically mining question-code pairs from Stack Overflow (in contrast to heuristically collecting them).


An End-to-End Deep Framework for Answer Triggering with a Novel Group-Level Objective

no code implementations EMNLP 2017 Jie Zhao, Yu Su, Ziyu Guan, Huan Sun

Given a question and a set of answer candidates, answer triggering determines whether the candidate set contains any correct answers.

Multiple Instance Learning Question Answering
