Search Results for author: Jian Xie

Found 13 papers, 8 papers with code

How Easily do Irrelevant Inputs Skew the Responses of Large Language Models?

1 code implementation • 4 Apr 2024 • Siye Wu, Jian Xie, Jiangjie Chen, Tinghui Zhu, Kai Zhang, Yanghua Xiao

By leveraging the retrieval of information from external knowledge databases, Large Language Models (LLMs) exhibit enhanced capabilities for accomplishing many knowledge-intensive tasks.

Retrieval

Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models

1 code implementation • 28 Mar 2024 • Ang Lv, Kaiyi Zhang, Yuhan Chen, Yulong Wang, Lifeng Liu, Ji-Rong Wen, Jian Xie, Rui Yan

In this paper, we deeply explore the mechanisms employed by Transformer-based language models in factual recall tasks.

TravelPlanner: A Benchmark for Real-World Planning with Language Agents

1 code implementation • 2 Feb 2024 • Jian Xie, Kai Zhang, Jiangjie Chen, Tinghui Zhu, Renze Lou, Yuandong Tian, Yanghua Xiao, Yu Su

Are these language agents capable of planning in more complex settings that are out of the reach of prior AI agents?

Deductive Beam Search: Decoding Deducible Rationale for Chain-of-Thought Reasoning

1 code implementation • 31 Jan 2024 • Tinghui Zhu, Kai Zhang, Jian Xie, Yu Su

Recent advancements have significantly augmented the reasoning capabilities of Large Language Models (LLMs) through various methodologies, especially chain-of-thought (CoT) reasoning.

MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following

no code implementations • 5 Dec 2023 • Renze Lou, Kai Zhang, Jian Xie, Yuxuan Sun, Janice Ahn, Hanzi Xu, Yu Su, Wenpeng Yin

In the realm of large language models (LLMs), enhancing instruction-following capability often involves curating expansive training data.

Instruction Following

Baichuan 2: Open Large-scale Language Models

1 code implementation • 19 Sep 2023 • Aiyuan Yang, Bin Xiao, Bingning Wang, Borong Zhang, Ce Bian, Chao Yin, Chenxu Lv, Da Pan, Dian Wang, Dong Yan, Fan Yang, Fei Deng, Feng Wang, Feng Liu, Guangwei Ai, Guosheng Dong, Haizhou Zhao, Hang Xu, Haoze Sun, Hongda Zhang, Hui Liu, Jiaming Ji, Jian Xie, Juntao Dai, Kun Fang, Lei Su, Liang Song, Lifeng Liu, Liyun Ru, Luyao Ma, Mang Wang, Mickel Liu, MingAn Lin, Nuolan Nie, Peidong Guo, Ruiyang Sun, Tao Zhang, Tianpeng Li, Tianyu Li, Wei Cheng, WeiPeng Chen, Xiangrong Zeng, Xiaochuan Wang, Xiaoxi Chen, Xin Men, Xin Yu, Xuehai Pan, Yanjun Shen, Yiding Wang, Yiyu Li, Youxin Jiang, Yuchen Gao, Yupeng Zhang, Zenan Zhou, Zhiying Wu

Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering.

Feature Engineering • GSM8K

QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search

1 code implementation • 11 Jun 2023 • Jian Xie, Yidan Liang, Jingping Liu, Yanghua Xiao, Baohua Wu, Shenghua Ni

In this paper, we propose QUERT, a Continual Pre-trained Language Model for QUERy Understanding in Travel Domain Search.

Domain Adaptation • Language Modelling

Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts

1 code implementation • 22 May 2023 • Jian Xie, Kai Zhang, Jiangjie Chen, Renze Lou, Yu Su

By providing external information to large language models (LLMs), tool augmentation (including retrieval augmentation) has emerged as a promising solution for addressing the limitations of LLMs' static parametric memory.

Retrieval

Dialogue State Distillation Network with Inter-slot Contrastive Learning for Dialogue State Tracking

no code implementations • 16 Feb 2023 • Jing Xu, Dandan Song, Chong Liu, Siu Cheung Hui, Fei Li, Qiang Ju, Xiaonan He, Jian Xie

In this paper, we propose a Dialogue State Distillation Network (DSDN) to utilize relevant information of previous dialogue states and mitigate the gap of utilization between training and testing.

Contrastive Learning • Dialogue State Tracking (+1 more)

A Transformer-Based User Satisfaction Prediction for Proactive Interaction Mechanism in DuerOS

no code implementations • 5 Dec 2022 • Wei Shen, Xiaonan He, Chuheng Zhang, Xuyun Zhang, Jian Xie

Moreover, they are trained and evaluated on the benchmark datasets with adequate labels, which are expensive to obtain in a commercial dialogue system.

Spoken Dialogue Systems

TOD-DA: Towards Boosting the Robustness of Task-oriented Dialogue Modeling on Spoken Conversations

no code implementations • 23 Dec 2021 • Xin Tian, Xinxian Huang, Dongfeng He, Yingzhan Lin, Siqi Bao, Huang He, Liankai Huang, Qiang Ju, Xiyuan Zhang, Jian Xie, Shuqi Sun, Fan Wang, Hua Wu, Haifeng Wang

Task-oriented dialogue systems have been plagued by the difficulties of obtaining large-scale and high-quality annotated conversations.

Data Augmentation • speech-recognition (+3 more)

SlotRefine: A Fast Non-Autoregressive Model for Joint Intent Detection and Slot Filling

1 code implementation • EMNLP 2020 • Di Wu, Liang Ding, Fan Lu, Jian Xie

Slot filling and intent detection are two main tasks in spoken language understanding (SLU) systems.

Intent Detection • slot-filling (+2 more)

Shift Convolution Network for Stereo Matching

no code implementations • 20 Nov 2019 • Jian Xie

In this paper, we present Shift Convolution Network (ShiftConvNet) to provide matching capability between two feature maps for stereo estimation.

Stereo Matching
