Search Results for author: Hongshen Xu

Found 11 papers, 7 papers with code

Multilingual Brain Surgeon: Large Language Models Can be Compressed Leaving No Language Behind

2 code implementations • 6 Apr 2024 • Hongchuan Zeng, Hongshen Xu, Lu Chen, Kai Yu

MBS overcomes the English-centric limitations of existing methods by sampling calibration data from various languages in proportion to the language distribution of the model's training datasets.

Model Compression

Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback

no code implementations • 27 Mar 2024 • Hongshen Xu, Zichen Zhu, Situo Zhang, Da Ma, Shuai Fan, Lu Chen, Kai Yu

Large Language Models (LLMs) often generate erroneous outputs, known as hallucinations, due to their limitations in discerning questions beyond their knowledge scope.

Hallucination

Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding

1 code implementation • 28 Feb 2024 • Hongshen Xu, Lu Chen, Zihan Zhao, Da Ma, Ruisheng Cao, Zichen Zhu, Kai Yu

Additionally, we propose several pre-training tasks to model the interaction among text, structure, and image modalities effectively.

Document Understanding, Information Retrieval

A BiRGAT Model for Multi-intent Spoken Language Understanding with Hierarchical Semantic Frames

1 code implementation • 28 Feb 2024 • Hongshen Xu, Ruisheng Cao, Su Zhu, Sheng Jiang, Hanchong Zhang, Lu Chen, Kai Yu

Previous work on spoken language understanding (SLU) mainly focuses on single-intent settings, where each input utterance contains only one user intent.

Graph Attention, Spoken Language Understanding

ChemDFM: Dialogue Foundation Model for Chemistry

no code implementations • 26 Jan 2024 • Zihan Zhao, Da Ma, Lu Chen, Liangtai Sun, Zihao Li, Hongshen Xu, Zichen Zhu, Su Zhu, Shuai Fan, Guodong Shen, Xin Chen, Kai Yu

To this end, we develop ChemDFM, the first LLM towards CGI.

ASTormer: An AST Structure-aware Transformer Decoder for Text-to-SQL

no code implementations • 28 Oct 2023 • Ruisheng Cao, Hanchong Zhang, Hongshen Xu, Jieyu Li, Da Ma, Lu Chen, Kai Yu

Text-to-SQL aims to generate an executable SQL program given the user utterance and the corresponding database schema.

Text-To-SQL

ACT-SQL: In-Context Learning for Text-to-SQL with Automatically-Generated Chain-of-Thought

1 code implementation • 26 Oct 2023 • Hanchong Zhang, Ruisheng Cao, Lu Chen, Hongshen Xu, Kai Yu

Recently, Large Language Models (LLMs) have been shown to have strong abilities in various domains and tasks.

In-Context Learning, Text-To-SQL

Large Language Models Are Semi-Parametric Reinforcement Learning Agents

1 code implementation • NeurIPS 2023 • Danyang Zhang, Lu Chen, Situo Zhang, Hongshen Xu, Zihan Zhao, Kai Yu

By equipping the LLM with a long-term experience memory, REMEMBERER can exploit experiences from past episodes even for different task goals, which gives it an advantage over an LLM-based agent that relies on fixed exemplars or a transient working memory.

Language Modelling, Large Language Model

Mobile-Env: An Evaluation Platform and Benchmark for LLM-GUI Interaction

1 code implementation • 14 May 2023 • Danyang Zhang, Hongshen Xu, Zihan Zhao, Lu Chen, Ruisheng Cao, Kai Yu

A GUI task set based on the WikiHow app is collected on Mobile-Env to form a benchmark covering a range of GUI interaction capabilities.

Language Modelling

On the Structural Generalization in Text-to-SQL

no code implementations • 12 Jan 2023 • Jieyu Li, Lu Chen, Ruisheng Cao, Su Zhu, Hongshen Xu, Zhi Chen, Hanchong Zhang, Kai Yu

Exploring the generalization of a text-to-SQL parser is essential for a system to adapt automatically to real-world databases.

Text-To-SQL

TIE: Topological Information Enhanced Structural Reading Comprehension on Web Pages

1 code implementation • NAACL 2022 • Zihan Zhao, Lu Chen, Ruisheng Cao, Hongshen Xu, Xingyu Chen, Kai Yu

Recently, the structural reading comprehension (SRC) task on web pages has attracted increasing research interest.

Graph Attention, Language Modelling
