1 code implementation • EMNLP 2021 • Kunrui Zhu, Yan Gao, Jiaqi Guo, Jian-Guang Lou
Experiments on our dataset demonstrate that CAST significantly outperforms state-of-the-art neural machine translation models.
1 code implementation • EMNLP 2020 • Jiaqi Guo, Qian Liu, Jian-Guang Lou, Zhenwen Li, Xueqing Liu, Tao Xie, Ting Liu
Thus, the impact of meaning representation on semantic parsing is less understood.
no code implementations • Findings (EMNLP) 2021 • Tongliang Li, Lei Fang, Jian-Guang Lou, Zhoujun Li
In this paper, we propose to generate text conditioned on structured data (a table) and a prefix (the written text) by leveraging pre-trained models.
no code implementations • EMNLP 2020 • Yuntao Li, Bei Chen, Qian Liu, Yan Gao, Jian-Guang Lou, Yan Zhang, Dongmei Zhang
In Natural Language Interfaces to Databases systems, the text-to-SQL technique allows users to query databases by using natural language questions.
1 code implementation • ACL 2022 • Libo Qin, Qiguang Chen, Tianbao Xie, Qixin Li, Jian-Guang Lou, Wanxiang Che, Min-Yen Kan
Specifically, we employ contrastive learning, leveraging bilingual dictionaries to construct multilingual views of the same utterance, then encourage their representations to be more similar than those of negative example pairs, thereby explicitly aligning representations of similar sentences across languages.
1 code implementation • Findings (EMNLP) 2021 • Jiaqi Guo, Jian-Guang Lou, Ting Liu, Dongmei Zhang
Using only 10% of utterance-denotation pairs, the parser achieves 84.2 denotation accuracy on WikiSQL, which is competitive with the previous state-of-the-art approaches using 100% labeled data.
no code implementations • 11 Oct 2024 • Yurong Wu, Yan Gao, Bin Benjamin Zhu, Zineng Zhou, Xiaodi Sun, Sheng Yang, Jian-Guang Lou, Zhiming Ding, Linjun Yang
Prompt engineering is pivotal for harnessing the capabilities of large language models (LLMs) across diverse applications.
no code implementations • 11 Oct 2024 • Sheng Yang, Yurong Wu, Yan Gao, Zineng Zhou, Bin Benjamin Zhu, Xiaodi Sun, Jian-Guang Lou, Zhiming Ding, Anbang Hu, Yuan Fang, Yunsong Li, Junyan Chen, Linjun Yang
Prompt engineering is very important to enhance the performance of large language models (LLMs).
no code implementations • 2 Jun 2024 • Weihao Zeng, Can Xu, Yingxiu Zhao, Jian-Guang Lou, Weizhu Chen
Fine-tuning large pre-trained language models with Evol-Instruct has achieved encouraging results across a wide range of tasks.
2 code implementations • 25 Apr 2024 • Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou
While many contemporary large language models (LLMs) can process lengthy input, they still struggle to fully utilize information within the long context, known as the lost-in-the-middle challenge.
1 code implementation • 8 Mar 2024 • Jinyang Li, Nan Huo, Yan Gao, Jiayi Shi, Yingxiu Zhao, Ge Qu, Yurong Wu, Chenhao Ma, Jian-Guang Lou, Reynold Cheng
The challenges and costs of collecting realistic interactive logs for data analysis hinder the quantitative evaluation of Large Language Model (LLM) agents in this task.
no code implementations • 21 Dec 2023 • Zhenwen Li, Jian-Guang Lou, Tao Xie
To address this issue, in this paper, we report our insight that there exists a high similarity between the task of NL2ERM and the increasingly popular task of text-to-SQL, and propose a data transformation algorithm that transforms the existing data of text-to-SQL into the data of NL2ERM.
no code implementations • 15 Nov 2023 • Yucheng Zhou, Xiubo Geng, Tao Shen, Chongyang Tao, Guodong Long, Jian-Guang Lou, Jianbing Shen
Large Language Models (LLMs) have ushered in a transformative era in the field of natural language processing, excelling in tasks related to text comprehension and generation.
1 code implementation • NeurIPS 2023 • Jiawei Lin, Jiaqi Guo, Shizhao Sun, Zijiang James Yang, Jian-Guang Lou, Dongmei Zhang
In this work, we propose LayoutPrompter, which leverages large language models (LLMs) to address the above problems through in-context learning.
1 code implementation • 8 Nov 2023 • Junren Li, Lei Fang, Jian-Guang Lou
Computer-assisted methods have emerged as valuable tools for retrosynthesis analysis.
1 code implementation • 31 Oct 2023 • Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen
To further improve their reasoning capabilities, this work explores whether LLMs can LEarn from MistAkes (LEMA), akin to the human learning process.
2 code implementations • 12 Sep 2023 • Xiaohan Xu, Chongyang Tao, Tao Shen, Can Xu, Hongbo Xu, Guodong Long, Jian-Guang Lou, Shuai Ma
To enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs), we introduce a simple, yet general and effective prompting method, Re2, i.e., Re-Reading the question as input.
no code implementations • ICCV 2023 • Jiawei Lin, Jiaqi Guo, Shizhao Sun, Weijiang Xu, Ting Liu, Jian-Guang Lou, Dongmei Zhang
To model combined and incomplete constraints, we use a Transformer-based layout generation model and carefully design a way to represent constraints and layouts as sequences.
1 code implementation • 25 May 2023 • Yan Liu, Yan Gao, Zhe Su, Xiaokang Chen, Elliott Ash, Jian-Guang Lou
In this work, we aim to uncover and categorize social biases in Text-to-SQL models.
no code implementations • 24 May 2023 • Jian Wu, Yicheng Xu, Yan Gao, Jian-Guang Lou, Börje F. Karlsson, Manabu Okumura
A common challenge in HQA and other passage-table QA datasets is that it is generally unrealistic to iterate over all table rows, columns, and linked passages to retrieve evidence.
no code implementations • 23 May 2023 • Shengnan An, Bo Zhou, Zeqi Lin, Qiang Fu, Bei Chen, Nanning Zheng, Weizhu Chen, Jian-Guang Lou
Few-shot selection -- selecting appropriate examples for each test instance separately -- is important for in-context learning.
1 code implementation • 23 May 2023 • Xinyu Zhu, Cheng Yang, Bei Chen, Siheng Li, Jian-Guang Lou, Yujiu Yang
Question answering plays a pivotal role in human daily life because it involves our acquisition of knowledge about the world.
no code implementations • 8 May 2023 • Shengnan An, Zeqi Lin, Qiang Fu, Bei Chen, Nanning Zheng, Jian-Guang Lou, Dongmei Zhang
Compositional generalization--understanding unseen combinations of seen primitives--is an essential reasoning capability in human intelligence.
1 code implementation • 22 Mar 2023 • Fengji Zhang, Bei Chen, Yue Zhang, Jacky Keung, Jin Liu, Daoguang Zan, Yi Mao, Jian-Guang Lou, Weizhu Chen
The task of repository-level code completion is to continue writing the unfinished code based on a broader context of the repository.
Ranked #2 on Code Completion on Rambo Benchmark
1 code implementation • ICCV 2023 • Junyi Zhang, Jiaqi Guo, Shizhao Sun, Jian-Guang Lou, Dongmei Zhang
To tackle the challenge, we summarize three critical factors for achieving a mild forward process for the layout, i.e., legality, coordinate proximity and type disruption.
1 code implementation • 23 Feb 2023 • Shengnan An, Zeqi Lin, Bei Chen, Qiang Fu, Nanning Zheng, Jian-Guang Lou
Abstraction is a desirable capability for deep learning models, which means to induce abstract concepts from concrete instances and flexibly apply them beyond the learning context.
1 code implementation • 3 Jan 2023 • Longxu Dou, Yan Gao, Xuqi Liu, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Min-Yen Kan, Jian-Guang Lou
In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables.
1 code implementation • 27 Dec 2022 • Longxu Dou, Yan Gao, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Jian-Guang Lou
Text-to-SQL semantic parsing is an important NLP task that greatly facilitates the interaction between users and databases and has become a key component in many human-computer interaction systems.
1 code implementation • ACL 2022 • Xinyu Pi, Bing Wang, Yan Gao, Jiaqi Guo, Zhoujun Li, Jian-Guang Lou
The robustness of Text-to-SQL parsers against adversarial perturbations plays a crucial role in delivering highly reliable applications.
no code implementations • 19 Dec 2022 • Daoguang Zan, Bei Chen, Fengji Zhang, Dianjie Lu, Bingchao Wu, Bei Guan, Yongji Wang, Jian-Guang Lou
The task of generating code from a natural language description, or NL2Code, is considered a pressing and significant challenge in code intelligence.
1 code implementation • 17 Dec 2022 • Bing Wang, Yan Gao, Zhoujun Li, Jian-Guang Lou
Following this study, we propose a simple yet effective counterfactual example generation approach that automatically produces ambiguous and unanswerable text-to-SQL examples.
1 code implementation • 31 Oct 2022 • Daoguang Zan, Bei Chen, Zeqi Lin, Bei Guan, Yongji Wang, Jian-Guang Lou
In this paper, we investigate how to equip pre-trained language models with the ability of code generation for private libraries.
no code implementations • CVPR 2023 • Zhaoyun Jiang, Jiaqi Guo, Shizhao Sun, Huayu Deng, Zhongkai Wu, Vuksan Mijovic, Zijiang James Yang, Jian-Guang Lou, Dongmei Zhang
First, to flexibly handle diverse constraints, we propose a constraint serialization scheme, which represents different user constraints as sequences of tokens with a predefined format.
1 code implementation • 21 Jul 2022 • Bei Chen, Fengji Zhang, Anh Nguyen, Daoguang Zan, Zeqi Lin, Jian-Guang Lou, Weizhu Chen
A natural way to evaluate the quality and correctness of a code solution is to run it against a set of test cases, but the manual creation of such test cases is often costly and time-consuming.
Ranked #1 on Code Generation on APPS
1 code implementation • 14 Jun 2022 • Daoguang Zan, Bei Chen, Dejian Yang, Zeqi Lin, Minsu Kim, Bei Guan, Yongji Wang, Weizhu Chen, Jian-Guang Lou
Usually, expensive text-code paired data is essential for training a code generation model.
Ranked #128 on Code Generation on HumanEval
no code implementations • 6 Jun 2022 • Yifei Li, Zeqi Lin, Shizhuo Zhang, Qiang Fu, Bei Chen, Jian-Guang Lou, Weizhu Chen
Few-shot learning is a challenging task that requires language models to generalize from limited examples.
Ranked #51 on Arithmetic Reasoning on GSM8K
no code implementations • 25 May 2022 • Zhi Chen, Jijia Bao, Lu Chen, Yuncong Liu, Da Ma, Bei Chen, Mengyue Wu, Su Zhu, Xin Dong, Fujiang Ge, Qingliang Miao, Jian-Guang Lou, Kai Yu
In this work, we aim to build a unified dialogue foundation model (DFM) which can be used to solve massive diverse dialogue tasks.
1 code implementation • 18 May 2022 • Xinyu Pi, Wanjun Zhong, Yan Gao, Nan Duan, Jian-Guang Lou
We present LogiGAN, an unsupervised adversarial pre-training framework for improving logical reasoning abilities of language models.
1 code implementation • 18 Apr 2022 • Libo Qin, Qiguang Chen, Tianbao Xie, Qixin Li, Jian-Guang Lou, Wanxiang Che, Min-Yen Kan
We present Global-Local Contrastive Learning Framework (GL-CLeF) to address this shortcoming.
1 code implementation • 12 Apr 2022 • Lei Fang, Junren Li, Ming Zhao, Li Tan, Jian-Guang Lou
In this paper, we propose a substructure-level decoding model, where the substructures are reaction-aware and can be automatically extracted with a fully data-driven approach.
no code implementations • SIGDIAL (ACL) 2022 • Zhi Chen, Lu Chen, Bei Chen, Libo Qin, Yuncong Liu, Su Zhu, Jian-Guang Lou, Kai Yu
With the development of pre-trained language models, remarkable success has been witnessed in dialogue understanding (DU).
1 code implementation • 15 Mar 2022 • Longxu Dou, Yan Gao, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Jian-Guang Lou
Existing text-to-SQL semantic parsers are typically designed for particular settings, such as handling queries that span multiple tables, domains or turns, which makes them ineffective when applied to different settings.
no code implementations • 7 Mar 2022 • Shengnan An, Yifei Li, Zeqi Lin, Qian Liu, Bei Chen, Qiang Fu, Weizhu Chen, Nanning Zheng, Jian-Guang Lou
This motivates us to propose input-tuning, which fine-tunes both the continuous prompts and the input representations, leading to a more effective way to adapt unfamiliar inputs to frozen PLMs.
1 code implementation • 27 Jan 2022 • Xinyu Pi, Qian Liu, Bei Chen, Morteza Ziyadi, Zeqi Lin, Qiang Fu, Yan Gao, Jian-Guang Lou, Weizhu Chen
Reasoning over natural language is a long-standing goal for the research community.
Ranked #2 on Question Answering on DROP Test (using extra training data)
2 code implementations • 20 Jan 2022 • Qi Shi, Qian Liu, Bei Chen, Yu Zhang, Ting Liu, Jian-Guang Lou
In this work, we propose LEMON, a general framework for language-based environment manipulation tasks.
no code implementations • 26 Oct 2021 • Lei Fang, Jian-Guang Lou
", our goal is to obtain a deep understanding of the percentage numbers ("30 percent" and "20%") by extracting their quantitative facts: part ("like watching football" and "prefer to watch NBA") and whole ("Americans").
1 code implementation • Findings (ACL) 2021 • Qian Liu, Dejian Yang, Jiahui Zhang, Jiaqi Guo, Bin Zhou, Jian-Guang Lou
In recent years, pretrained language models (PLMs) have achieved success on several downstream tasks, showing their power in modeling language.
1 code implementation • ACL 2022 • Zhoujun Cheng, Haoyu Dong, Zhiruo Wang, Ran Jia, Jiaqi Guo, Yan Gao, Shi Han, Jian-Guang Lou, Dongmei Zhang
HiTab provides 10,686 QA pairs and descriptive sentences with well-annotated quantity and entity alignment on 3,597 tables with broad coverage of table hierarchies and numerical reasoning types.
1 code implementation • ACL 2021 • Shuang Chen, Qian Liu, Zhiwei Yu, Chin-Yew Lin, Jian-Guang Lou, Feng Jiang
We present Retriever-Transducer-Checker (ReTraCk), a neural semantic parsing framework for large scale knowledge base question answering (KBQA).
Ranked #1 on Knowledge Base Question Answering on GrailQA
no code implementations • ACL 2021 • Jiaqi Guo, Ziliang Si, Yu Wang, Qian Liu, Ming Fan, Jian-Guang Lou, Zijiang Yang, Ting Liu
However, we identify two biases in existing datasets for XDTS: (1) a high proportion of context-independent questions and (2) a high proportion of easy SQL queries.
4 code implementations • ICLR 2022 • Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, Jian-Guang Lou
TAPEX addresses the data scarcity challenge by guiding the language model to mimic a SQL executor on a diverse, large-scale and high-quality synthetic corpus.
Ranked #1 on Semantic Parsing on WikiSQL (Denotation accuracy (test) metric)
2 code implementations • Findings (ACL) 2021 • Chenyao Liu, Shengnan An, Zeqi Lin, Qian Liu, Bei Chen, Jian-Guang Lou, Lijie Wen, Nanning Zheng, Dongmei Zhang
In this paper, we propose LeAR, an end-to-end neural model to learn algebraic recombination for compositional generalization.
Ranked #2 on Semantic Parsing on CFQ
no code implementations • 13 Dec 2020 • Yinuo Guo, Zeqi Lin, Jian-Guang Lou, Dongmei Zhang
Experiments on Geo, ComplexWebQuestions, and Formulas show that our framework can consistently improve the performance of neural semantic parsers in different domains.
no code implementations • 8 Dec 2020 • Yinuo Guo, Hualei Zhu, Zeqi Lin, Bei Chen, Jian-Guang Lou, Dongmei Zhang
Human intelligence exhibits compositional generalization (i.e., the capacity to understand and produce unseen combinations of seen components), but current neural seq2seq models lack this ability.
1 code implementation • 9 Nov 2020 • Yuntao Li, Bei Chen, Qian Liu, Yan Gao, Jian-Guang Lou, Yan Zhang, Dongmei Zhang
In Natural Language Interfaces to Databases systems, the text-to-SQL technique allows users to query databases by using natural language questions.
no code implementations • NeurIPS 2020 • Yinuo Guo, Zeqi Lin, Jian-Guang Lou, Dongmei Zhang
We formalize human language understanding as a structured prediction task where the output is a partially ordered set (poset).
Ranked #4 on Semantic Parsing on CFQ
1 code implementation • EMNLP 2020 • Qian Liu, Bei Chen, Jian-Guang Lou, Bin Zhou, Dongmei Zhang
In recent years, the task of incomplete utterance rewriting has attracted considerable attention.
Ranked #1 on Dialogue Rewriting on Rewrite
1 code implementation • 15 Jul 2020 • Qianhui Wu, Zijia Lin, Börje F. Karlsson, Biqing Huang, Jian-Guang Lou
Prior work in cross-lingual named entity recognition (NER) with little or no labeled data falls into two primary categories: model-transfer-based and data-transfer-based methods.
Ranked #1 on Cross-Lingual NER on NoDaLiDa Norwegian Bokmål
1 code implementation • NeurIPS 2020 • Qian Liu, Shengnan An, Jian-Guang Lou, Bei Chen, Zeqi Lin, Yan Gao, Bin Zhou, Nanning Zheng, Dongmei Zhang
Compositional generalization is a basic and essential intellective capability of human beings, which allows us to recombine known parts readily.
1 code implementation • ACL 2020 • Qianhui Wu, Zijia Lin, Börje F. Karlsson, Jian-Guang Lou, Biqing Huang
However, such methods either are not applicable if the labeled data in the source languages is unavailable, or do not leverage information contained in unlabeled data in the target language.
Ranked #1 on Cross-Lingual NER on CoNLL German
1 code implementation • ACL 2020 • Qian Liu, Yihong Chen, Bei Chen, Jian-Guang Lou, Zixuan Chen, Bin Zhou, Dongmei Zhang
Despite the continuing efforts to improve the engagingness and consistency of chit-chat dialogue systems, the majority of current work simply focuses on mimicking human-like responses, leaving understudied the aspects of modeling understanding between interlocutors.
Ranked #2 on Dialogue Generation on Persona-Chat (using extra training data)
1 code implementation • 3 Feb 2020 • Qian Liu, Bei Chen, Jiaqi Guo, Jian-Guang Lou, Bin Zhou, Dongmei Zhang
Recently, semantic parsing in context has received considerable attention; the task is challenging because of complex contextual phenomena.
no code implementations • IJCNLP 2019 • Haoyan Liu, Lei Fang, Qian Liu, Bei Chen, Jian-Guang Lou, Zhoujun Li
One key component in text-to-SQL is to predict the comparison relations between columns and their values.
no code implementations • IJCNLP 2019 • Zhen Dong, Shizhao Sun, Hongzhi Liu, Jian-Guang Lou, Dongmei Zhang
In text-to-SQL generation, the input utterance usually contains many tokens that are related to column names or cells in the table, called table-related tokens.
no code implementations • 23 Oct 2019 • Yan Gao, Jian-Guang Lou, Dongmei Zhang
This paper presents a novel approach to translating natural language questions to SQL queries for given tables, which meets three requirements as a real-world data analysis application: cross-domain, multilingualism and enabling quick-start.
1 code implementation • IJCNLP 2019 • Qian Liu, Bei Chen, Haoyan Liu, Lei Fang, Jian-Guang Lou, Bin Zhou, Dongmei Zhang
To leverage the advances in context-independent semantic parsing, we propose to perform follow-up query analysis, aiming to restate context-dependent natural language queries with contextual information.
1 code implementation • 28 May 2019 • Yihong Chen, Bei Chen, Xiangnan He, Chen Gao, Yong Li, Jian-Guang Lou, Yue Wang
We show how to employ LambdaOpt on matrix factorization, a classical model that is representative of a large family of recommender models.
5 code implementations • ACL 2019 • Jiaqi Guo, Zecheng Zhan, Yan Gao, Yan Xiao, Jian-Guang Lou, Ting Liu, Dongmei Zhang
We present a neural approach called IRNet for complex and cross-domain Text-to-SQL.
1 code implementation • 24 Jan 2019 • Qian Liu, Bei Chen, Jian-Guang Lou, Ge Jin, Dongmei Zhang
Natural language interfaces to databases (NLIDB) allow users to search databases using natural language instead of SQL-like query languages.
no code implementations • EMNLP 2018 • Zexuan Zhong, Jiaqi Guo, Wei Yang, Jian Peng, Tao Xie, Jian-Guang Lou, Ting Liu, Dongmei Zhang
Recent research proposes syntax-based approaches to address the problem of generating programs from natural language specifications.
no code implementations • 22 Jun 2018 • Yihong Chen, Bei Chen, Xuguang Duan, Jian-Guang Lou, Yue Wang, Wenwu Zhu, Yong Cao
Almost all knowledge-empowered applications rely upon accurate knowledge, which has to be either collected manually at high cost or extracted automatically with non-negligible errors.