Search Results for author: Huan Sun

Found 65 papers, 47 papers with code

Knowledge Transfer between Structured and Unstructured Sources for Complex Question Answering

no code implementations • NAACL (SUKI) 2022 • Lingbo Mo, Zhen Wang, Jie Zhao, Huan Sun

More fine-grained analyses on transfer behaviors reveal the types of transferred knowledge and transfer patterns.

Multi-hop Question Answering Question Answering +1

AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs

1 code implementation • 11 Apr 2024 • Zeyi Liao, Huan Sun

Moreover, we utilize those successful suffixes as training data to learn a generative model, named AmpleGCG, which captures the distribution of adversarial suffixes given a harmful query and enables the rapid generation of hundreds of suffixes for any harmful queries in seconds.

Cleared for Takeoff? Compositional & Conditional Reasoning may be the Achilles Heel to (Flight-Booking) Language Agents

no code implementations • 5 Apr 2024 • Harsh Kohli, Huan Sun

The rapid progress of large language models (LLMs) has seen them excel and frequently surpass human performance on standard benchmarks.

Multiple-choice Navigate

AttributionBench: How Hard is Automatic Attribution Evaluation?

1 code implementation • 23 Feb 2024 • Yifei Li, Xiang Yue, Zeyi Liao, Huan Sun

Modern generative search engines enhance the reliability of large language model (LLM) responses by providing cited evidence.

Binary Classification Language Modelling +1

A Multi-Aspect Framework for Counter Narrative Evaluation using Large Language Models

1 code implementation • 18 Feb 2024 • Jaylen Jones, Lingbo Mo, Eric Fosler-Lussier, Huan Sun

Counter narratives - informed responses to hate speech contexts designed to refute hateful claims and de-escalate encounters - have emerged as an effective hate speech intervention strategy.

When is Tree Search Useful for LLM Planning? It Depends on the Discriminator

1 code implementation • 16 Feb 2024 • Ziru Chen, Michael White, Raymond Mooney, Ali Payani, Yu Su, Huan Sun

In this paper, we examine how large language models (LLMs) solve multi-step problems under a language agent framework with three components: a generator, a discriminator, and a planning method.

Mathematical Reasoning Re-Ranking +2

A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents

1 code implementation • 15 Feb 2024 • Lingbo Mo, Zeyi Liao, Boyuan Zheng, Yu Su, Chaowei Xiao, Huan Sun

There is a surprisingly large gap between the speed and scale of their development and deployment and our understanding of their safety risks.

LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset

1 code implementation • 14 Feb 2024 • Botao Yu, Frazier N. Baker, Ziqi Chen, Xia Ning, Huan Sun

Using SMolInstruct, we fine-tune a set of open-source LLMs, among which, we find that Mistral serves as the best base model for chemistry tasks.

Drug Discovery

eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data

no code implementations • 13 Feb 2024 • Bo Peng, Xinyi Ling, Ziru Chen, Huan Sun, Xia Ning

Both the ECInstruct dataset and the eCeLLM models show great potential in empowering versatile and effective LLMs for e-commerce.

Domain Generalization

GPT-4V(ision) is a Generalist Web Agent, if Grounded

1 code implementation • 3 Jan 2024 • Boyuan Zheng, Boyu Gou, Jihyung Kil, Huan Sun, Yu Su

The recent development on large multimodal models (LMMs), especially GPT-4V(ision) and Gemini, has been quickly expanding the capability boundaries of multimodal models beyond traditional tasks like image captioning and visual question answering.

Image Captioning Question Answering +1

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

2 code implementations • 27 Nov 2023 • Xiang Yue, Yuansheng Ni, Kai Zhang, Tianyu Zheng, Ruoqi Liu, Ge Zhang, Samuel Stevens, Dongfu Jiang, Weiming Ren, Yuxuan Sun, Cong Wei, Botao Yu, Ruibin Yuan, Renliang Sun, Ming Yin, Boyuan Zheng, Zhenzhu Yang, Yibo Liu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen

We introduce MMMU: a new benchmark designed to evaluate multimodal models on massive multi-discipline tasks demanding college-level subject knowledge and deliberate reasoning.

Complex Query Answering Logical Reasoning +1

TableLlama: Towards Open Large Generalist Models for Tables

no code implementations • 15 Nov 2023 • Tianshu Zhang, Xiang Yue, Yifei Li, Huan Sun

Towards that end, we construct TableInstruct, a new dataset with a variety of realistic tables and tasks, for instruction tuning and evaluating LLMs.

How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities

1 code implementation • 15 Nov 2023 • Lingbo Mo, Boshi Wang, Muhao Chen, Huan Sun

The rapid progress in open-source Large Language Models (LLMs) is significantly driving AI development forward.

Ethics Fairness +2

MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning

1 code implementation • 11 Sep 2023 • Xiang Yue, Xingwei Qu, Ge Zhang, Yao Fu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen

The MAmmoTH models are trained on MathInstruct, our meticulously curated instruction tuning dataset.

Math Mathematical Reasoning

AgentBench: Evaluating LLMs as Agents

1 code implementation • 7 Aug 2023 • Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, Shudan Zhang, Xiang Deng, Aohan Zeng, Zhengxiao Du, Chenhui Zhang, Sheng Shen, Tianjun Zhang, Yu Su, Huan Sun, Minlie Huang, Yuxiao Dong, Jie Tang

We present AgentBench, a multi-dimensional evolving benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities in a multi-turn open-ended generation setting.

Decision Making Instruction Following

Roll Up Your Sleeves: Working with a Collaborative and Engaging Task-Oriented Dialogue System

1 code implementation • 29 Jul 2023 • Lingbo Mo, Shijie Chen, Ziru Chen, Xiang Deng, Ashley Lewis, Sunit Singh, Samuel Stevens, Chang-You Tai, Zhen Wang, Xiang Yue, Tianshu Zhang, Yu Su, Huan Sun

We introduce TacoBot, a user-centered task-oriented digital assistant designed to guide users through complex real-world tasks with multiple steps.

Data Augmentation Dialogue Management +3

Biomedical Language Models are Robust to Sub-optimal Tokenization

1 code implementation • 30 Jun 2023 • Bernal Jiménez Gutiérrez, Huan Sun, Yu Su

As opposed to general English, many concepts in biomedical terminology have been designed in recent history by biomedical professionals with the goal of being precise and concise.

Entity Linking Language Modelling +4

MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing

1 code implementation • NeurIPS 2023 • Kai Zhang, Lingbo Mo, Wenhu Chen, Huan Sun, Yu Su

To address this issue, we introduce MagicBrush (https://osu-nlp-group.github.io/MagicBrush/), the first large-scale, manually annotated dataset for instruction-guided real image editing that covers diverse scenarios: single-turn, multi-turn, mask-provided, and mask-free editing.

text-guided-image-editing

Mind2Web: Towards a Generalist Agent for the Web

1 code implementation • NeurIPS 2023 • Xiang Deng, Yu Gu, Boyuan Zheng, Shijie Chen, Samuel Stevens, Boshi Wang, Huan Sun, Yu Su

We introduce Mind2Web, the first dataset for developing and evaluating generalist agents for the web that can follow language instructions to complete complex tasks on any website.

Federated Learning for Semantic Parsing: Task Formulation, Evaluation Setup, New Algorithms

1 code implementation • 26 May 2023 • Tianshu Zhang, Changchang Liu, Wei-Han Lee, Yu Su, Huan Sun

By leveraging data from multiple clients, the FL paradigm can be especially beneficial for clients that have little training data to develop a data-hungry neural semantic parser on their own.

Federated Learning Semantic Parsing +1

Exploring Chain-of-Thought Style Prompting for Text-to-SQL

no code implementations • 23 May 2023 • Chang-You Tai, Ziru Chen, Tianshu Zhang, Xiang Deng, Huan Sun

Thus, we systematically study how to enhance LLMs' reasoning ability through chain of thought (CoT) style prompting, including the original chain-of-thought prompting (Wei et al., 2022b) and least-to-most prompting (Zhou et al., 2023).

In-Context Learning SQL Parsing +1

Error Detection for Text-to-SQL Semantic Parsing

1 code implementation • 23 May 2023 • Shijie Chen, Ziru Chen, Huan Sun, Yu Su

Despite remarkable progress in text-to-SQL semantic parsing in recent years, the performance of existing parsers is still far from perfect.

Language Modelling Semantic Parsing +1

Text-to-SQL Error Correction with Language Models of Code

1 code implementation • 22 May 2023 • Ziru Chen, Shijie Chen, Michael White, Raymond Mooney, Ali Payani, Jayanth Srinivasa, Yu Su, Huan Sun

Thus, we propose a novel representation for SQL queries and their edits that adheres more closely to the pre-training corpora of language models of code.

SQL Parsing Text-To-SQL

Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate

no code implementations • 22 May 2023 • Boshi Wang, Xiang Yue, Huan Sun

Large language models (LLMs) such as ChatGPT and GPT-4 have shown impressive performance in complex reasoning tasks.

Benchmarking Math +1

Automatic Evaluation of Attribution by Large Language Models

1 code implementation • 10 May 2023 • Xiang Yue, Boshi Wang, Ziru Chen, Kai Zhang, Yu Su, Huan Sun

We manually curate a set of test examples covering 12 domains from a generative search engine, New Bing.

Fact Checking Language Modelling +3

Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning

no code implementations • 6 Mar 2023 • Zhen Wang, Rameswar Panda, Leonid Karlinsky, Rogerio Feris, Huan Sun, Yoon Kim

Prompt tuning, in which a base pretrained model is adapted to each task via conditioning on learned prompt vectors, has emerged as a promising approach for efficiently adapting large language models to multiple downstream tasks.

Transfer Learning

Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters

2 code implementations • 20 Dec 2022 • Boshi Wang, Sewon Min, Xiang Deng, Jiaming Shen, You Wu, Luke Zettlemoyer, Huan Sun

Chain-of-Thought (CoT) prompting can dramatically improve the multi-step reasoning abilities of large language models (LLMs).

Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe

1 code implementation • 25 Oct 2022 • Xiang Yue, Huseyin A. Inan, Xuechen Li, Girish Kumar, Julia McAnallen, Hoda Shajari, Huan Sun, David Levitan, Robert Sim

Privacy concerns have attracted increasing attention in data-driven products due to the tendency of machine learning models to memorize sensitive training data.

Language Modelling Text Generation

Bootstrapping a User-Centered Task-Oriented Dialogue System

no code implementations • 11 Jul 2022 • Shijie Chen, Ziru Chen, Xiang Deng, Ashley Lewis, Lingbo Mo, Samuel Stevens, Zhen Wang, Xiang Yue, Tianshu Zhang, Yu Su, Huan Sun

We present TacoBot, a task-oriented dialogue system built for the inaugural Alexa Prize TaskBot Challenge, which assists users in completing multi-step cooking and home improvement tasks.

Data Augmentation Dialogue Management +2

$\mathsf{G^2Retro}$ as a Two-Step Graph Generative Models for Retrosynthesis Prediction

1 code implementation • 10 Jun 2022 • Ziqi Chen, Oluwatosin R. Ayinde, James R. Fuchs, Huan Sun, Xia Ning

It first predicts the reaction centers in the target molecules (products), identifies the synthons needed to assemble the products, and transforms these synthons into reactants.

Retrosynthesis Vocal Bursts Valence Prediction

Synthetic Question Value Estimation for Domain Adaptation of Question Answering

1 code implementation • ACL 2022 • Xiang Yue, Ziyu Yao, Huan Sun

Synthesizing QA pairs with a question generator (QG) on the target domain has become a popular approach for domain adaptation of question answering (QA) models.

Domain Adaptation Question Answering

Thinking about GPT-3 In-Context Learning for Biomedical IE? Think Again

1 code implementation • 16 Mar 2022 • Bernal Jiménez Gutiérrez, Nikolas McNeal, Clay Washington, You Chen, Lang Li, Huan Sun, Yu Su

In this paper, we present the first systematic and comprehensive study to compare the few-shot performance of GPT-3 in-context learning with fine-tuning smaller (i.e., BERT-sized) PLMs on two highly representative biomedical information extraction tasks, named entity recognition and relation extraction.

In-Context Learning Model Selection +5

Iteratively Prompt Pre-trained Language Models for Chain of Thought

1 code implementation • 16 Mar 2022 • Boshi Wang, Xiang Deng, Huan Sun

While Pre-trained Language Models (PLMs) internalize a great amount of world knowledge, they have been shown incapable of recalling this knowledge to solve tasks requiring complex & multi-step reasoning.

World Knowledge

DOM-LM: Learning Generalizable Representations for HTML Documents

1 code implementation • 25 Jan 2022 • Xiang Deng, Prashant Shiralkar, Colin Lockard, Binxuan Huang, Huan Sun

We argue that the text and HTML structure together convey important semantics of the content and therefore warrant a special treatment for their representation learning.

Ranked #2 on Attribute Extraction on SWDE

Attribute Attribute Extraction +3

TopNet: Learning from Neural Topic Model to Generate Long Stories

no code implementations • 14 Dec 2021 • Yazheng Yang, Boyuan Pan, Deng Cai, Huan Sun

In particular, instead of directly generating a story, we first learn to map the short text input to a low-dimensional topic distribution (which is pre-assigned by a topic model).

Story Generation

Towards Transparent Interactive Semantic Parsing via Step-by-Step Correction

1 code implementation • Findings (ACL) 2022 • Lingbo Mo, Ashley Lewis, Huan Sun, Michael White

In this work, we investigate an interactive semantic parsing framework that explains the predicted logical form step by step in natural language and enables the user to make corrections through natural-language feedback for individual steps.

Question Answering Semantic Parsing

ReasonBERT: Pre-trained to Reason with Distant Supervision

1 code implementation • EMNLP 2021 • Xiang Deng, Yu Su, Alyssa Lees, You Wu, Cong Yu, Huan Sun

We present ReasonBert, a pre-training method that augments language models with the ability to reason over long-range relations and multiple, possibly hybrid contexts.

Ranked #1 on Semantic Parsing on GraphQuestions

Extractive Question-Answering Question Answering +1

Differential Privacy for Text Analytics via Natural Text Sanitization

1 code implementation • Findings (ACL) 2021 • Xiang Yue, Minxin Du, Tianhao Wang, Yaliang Li, Huan Sun, Sherman S. M. Chow

The sanitized texts also contribute to our sanitization-aware pretraining and fine-tuning, enabling privacy-preserving natural language processing over the BERT language model with promising utility.

Language Modelling Privacy Preserving

Learning Structural Edits via Incremental Tree Transformations

1 code implementation • ICLR 2021 • Ziyu Yao, Frank F. Xu, Pengcheng Yin, Huan Sun, Graham Neubig

To show the unique benefits of modeling tree edits directly, we further propose a novel edit encoder for learning to represent edits, as well as an imitation learning method that allows the editor to be more robust.

Imitation Learning

CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering

2 code implementations • 30 Oct 2020 • Xiang Yue, Xinliang Frederick Zhang, Ziyu Yao, Simon Lin, Huan Sun

Clinical question answering (QA) aims to automatically answer questions from medical professionals based on clinical texts.

Domain Adaptation Question Answering +2

Structure-Grounded Pretraining for Text-to-SQL

no code implementations • NAACL 2021 • Xiang Deng, Ahmed Hassan Awadallah, Christopher Meek, Oleksandr Polozov, Huan Sun, Matthew Richardson

Additionally, to evaluate different methods under more realistic text-table alignment settings, we create a new evaluation set Spider-Realistic based on Spider dev set with explicit mentions of column names removed, and adopt eight existing text-to-SQL datasets for cross-database evaluation.

Text-To-SQL

COUGH: A Challenge Dataset and Models for COVID-19 FAQ Retrieval

1 code implementation • EMNLP 2021 • Xinliang Frederick Zhang, Heming Sun, Xiang Yue, Simon Lin, Huan Sun

For evaluation, we introduce Query Bank and Relevance Set, where the former contains 1,236 human-paraphrased queries while the latter contains ~32 human-annotated FAQ items for each query.

16k Retrieval

Adversarial Training for Code Retrieval with Question-Description Relevance Regularization

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Jie Zhao, Huan Sun

Code retrieval is a key task aiming to match natural and programming languages.

Multi-Task Learning Retrieval

Learning a Cost-Effective Annotation Policy for Question Answering

1 code implementation • EMNLP 2020 • Bernhard Kratzwald, Stefan Feuerriegel, Huan Sun

State-of-the-art question answering (QA) relies upon large amounts of training data for which labeling is time consuming and thus expensive.

Question Answering

Energy Efficiency Optimization in IRS-Enhanced mmWave Systems with Lens Antenna Array

no code implementations • 2 Jul 2020 • Yazheng Wang, Hancheng Lu, Dan Zhao, Huan Sun

To address this problem, we propose an intelligent reflect surface (IRS) enhanced multi-user mmWave communication system with lens antenna array.

Blocking

Joint Passive Beamforming and User Association Optimization for IRS-assisted mmWave Systems

no code implementations • 2 Jul 2020 • Dan Zhao, Hancheng Lu, Yazheng Wang, Huan Sun

Considering the impact of IRS on user association, we formulate a sum rate maximization problem by jointly optimizing the passive beamforming at IRS and user association, which is an intractable non-convex problem.

TURL: Table Understanding through Representation Learning

1 code implementation • 26 Jun 2020 • Xiang Deng, Huan Sun, Alyssa Lees, You Wu, Cong Yu

In this paper, we present TURL, a novel framework that introduces the pre-training/fine-tuning paradigm to relational Web tables.

Ranked #1 on Column Type Annotation on WikipediaGS-CTA

Cell Entity Annotation Columns Property Annotation +3

An Imitation Game for Learning Semantic Parsers from User Interaction

1 code implementation • EMNLP 2020 • Ziyu Yao, Yiqi Tang, Wen-tau Yih, Huan Sun, Yu Su

Despite the widely successful applications, bootstrapping and fine-tuning semantic parsers are still a tedious process with challenges such as costly data annotation and privacy risks.

Imitation Learning Text-To-SQL

Rationalizing Medical Relation Prediction from Corpus-level Statistics

1 code implementation • ACL 2020 • Zhen Wang, Jennifer Lee, Simon Lin, Huan Sun

Nowadays, the interpretability of machine learning models is becoming increasingly important, especially in the medical domain.

Decision Making Relation

Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset

1 code implementation • ACL 2020 • Xiang Yue, Bernal Jimenez Gutierrez, Huan Sun

In this paper, we provide an in-depth analysis of this dataset and the clinical reading comprehension (CliniRC) task.

Machine Reading Comprehension Question Answering

Practical Annotation Strategies for Question Answering Datasets

no code implementations • 6 Mar 2020 • Bernhard Kratzwald, Xiang Yue, Huan Sun, Stefan Feuerriegel

Here, remarkably, annotating a stratified subset with only 1.2% of the original training set achieves 97.7% of the performance as if the complete dataset was annotated.

Question Answering

Easy-to-Hard: Leveraging Simple Questions for Complex Question Generation

no code implementations • 5 Dec 2019 • Jie Zhao, Xiang Deng, Huan Sun

This paper makes one of the first efforts toward automatically generating complex questions from knowledge graphs.

Data Augmentation Knowledge Graphs +2

An End-to-End Framework for Cold Question Routing in Community Question Answering Services

no code implementations • 22 Nov 2019 • Jiankai Sun, Jie Zhao, Huan Sun, Srinivasan Parthasarathy

Routing newly posted questions (a.k.a. cold questions) to potential answerers with suitable expertise in Community Question Answering sites (CQAs) is an important and challenging task.

Community Question Answering Graph Embedding

Model-based Interactive Semantic Parsing: A Unified Framework and A Text-to-SQL Case Study

2 code implementations • IJCNLP 2019 • Ziyu Yao, Yu Su, Huan Sun, Wen-tau Yih

As a promising paradigm, interactive semantic parsing has been shown to improve both semantic parsing accuracy and user confidence in the results.

Semantic Parsing Text-To-SQL

Automatic Table completion using Knowledge Base

no code implementations • 20 Sep 2019 • Bortik Bandyopadhyay, Xiang Deng, Goonmeet Bajaj, Huan Sun, Srinivasan Parthasarathy

In this work, we propose to resolve a new type of heterogeneous query viz: tabular query, which contains a natural language query description, column names of the desired table, and an example row.

Decision Making

Leveraging 2-hop Distant Supervision from Table Entity Pairs for Relation Extraction

1 code implementation • IJCNLP 2019 • Xiang Deng, Huan Sun

Given two entities, distant supervision exploits sentences that directly mention them for predicting their semantic relation.

Relation Relation Extraction

Reinforced Dynamic Reasoning for Conversational Question Generation

1 code implementation • ACL 2019 • Boyuan Pan, Hao Li, Ziyu Yao, Deng Cai, Huan Sun

This paper investigates a new task named Conversational Question Generation (CQG) which is to generate a question based on a passage and a conversation history (i.e., previous turns of question-answer pairs).

Question Answering Question Generation +1

SurfCon: Synonym Discovery on Privacy-Aware Clinical Data

1 code implementation • 21 Jun 2019 • Zhen Wang, Xiang Yue, Soheil Moosavinasab, Yungui Huang, Simon Lin, Huan Sun

To solve the problem, we propose a new framework SurfCon that leverages two important types of information in the privacy-aware clinical data, i.e., the surface form information, and the global context information for synonym discovery.

Graph Embedding on Biomedical Networks: Methods, Applications, and Evaluations

4 code implementations • 12 Jun 2019 • Xiang Yue, Zhen Wang, Jingong Huang, Srinivasan Parthasarathy, Soheil Moosavinasab, Yungui Huang, Simon M. Lin, Wen Zhang, Ping Zhang, Huan Sun

Our experimental results demonstrate that the recent graph embedding methods achieve promising results and deserve more attention in the future biomedical graph analysis.

Graph Embedding Link Prediction +2

CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning

1 code implementation • 13 Mar 2019 • Ziyu Yao, Jayavardhan Reddy Peddamail, Huan Sun

In this work, we investigate a novel perspective of Code annotation for Code retrieval (hence called 'CoaCor'), where a code annotation model is trained to generate a natural language annotation that can represent the semantic meaning of a given code snippet and can be leveraged by a code retrieval model to better distinguish relevant code snippets from others.

reinforcement-learning Reinforcement Learning (RL) +1

Interactive Semantic Parsing for If-Then Recipes via Hierarchical Reinforcement Learning

1 code implementation • 21 Aug 2018 • Ziyu Yao, Xiujun Li, Jianfeng Gao, Brian Sadler, Huan Sun

Given a text description, most existing semantic parsers synthesize a program in one shot.

Hierarchical Reinforcement Learning reinforcement-learning +2

StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow

1 code implementation • 26 Mar 2018 • Ziyu Yao, Daniel S. Weld, Wei-Peng Chen, Huan Sun

In this paper, we investigate a new problem of systematically mining question-code pairs from Stack Overflow (in contrast to heuristically collecting them).

Retrieval
