Search Results for author: Hongshen Xu

Found 11 papers, 7 papers with code

Multilingual Brain Surgeon: Large Language Models Can be Compressed Leaving No Language Behind

2 code implementations • 6 Apr 2024 • Hongchuan Zeng, Hongshen Xu, Lu Chen, Kai Yu

MBS overcomes the English-centric limitations of existing methods by sampling calibration data from various languages proportionally to the language distribution of the model training datasets.

Model Compression

Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback

no code implementations • 27 Mar 2024 • Hongshen Xu, Zichen Zhu, Situo Zhang, Da Ma, Shuai Fan, Lu Chen, Kai Yu

Large Language Models (LLMs) often generate erroneous outputs, known as hallucinations, due to their limitations in discerning questions beyond their knowledge scope.

Hallucination

Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding

1 code implementation • 28 Feb 2024 • Hongshen Xu, Lu Chen, Zihan Zhao, Da Ma, Ruisheng Cao, Zichen Zhu, Kai Yu

Additionally, we propose several pre-training tasks to model the interaction among text, structure, and image modalities effectively.

Document Understanding, Information Retrieval

A BiRGAT Model for Multi-intent Spoken Language Understanding with Hierarchical Semantic Frames

1 code implementation • 28 Feb 2024 • Hongshen Xu, Ruisheng Cao, Su Zhu, Sheng Jiang, Hanchong Zhang, Lu Chen, Kai Yu

Previous work on spoken language understanding (SLU) mainly focuses on single-intent settings, where each input utterance contains only one user intent.

Graph Attention, Spoken Language Understanding

ASTormer: An AST Structure-aware Transformer Decoder for Text-to-SQL

no code implementations • 28 Oct 2023 • Ruisheng Cao, Hanchong Zhang, Hongshen Xu, Jieyu Li, Da Ma, Lu Chen, Kai Yu

Text-to-SQL aims to generate an executable SQL program given the user utterance and the corresponding database schema.

Text-To-SQL

ACT-SQL: In-Context Learning for Text-to-SQL with Automatically-Generated Chain-of-Thought

1 code implementation • 26 Oct 2023 • Hanchong Zhang, Ruisheng Cao, Lu Chen, Hongshen Xu, Kai Yu

Recently, Large Language Models (LLMs) have been shown to have strong abilities in various domains and tasks.

In-Context Learning, Text-To-SQL

Large Language Models Are Semi-Parametric Reinforcement Learning Agents

1 code implementation • NeurIPS 2023 • Danyang Zhang, Lu Chen, Situo Zhang, Hongshen Xu, Zihan Zhao, Kai Yu

By equipping the LLM with a long-term experience memory, REMEMBERER can exploit experiences from past episodes even for different task goals, which gives it an advantage over an LLM-based agent with fixed exemplars or one equipped with a transient working memory.

Language Modelling, Large Language Model

Mobile-Env: An Evaluation Platform and Benchmark for LLM-GUI Interaction

1 code implementation • 14 May 2023 • Danyang Zhang, Hongshen Xu, Zihan Zhao, Lu Chen, Ruisheng Cao, Kai Yu

A GUI task set based on the WikiHow app is collected on Mobile-Env to form a benchmark covering a range of GUI interaction capabilities.

Language Modelling

On the Structural Generalization in Text-to-SQL

no code implementations • 12 Jan 2023 • Jieyu Li, Lu Chen, Ruisheng Cao, Su Zhu, Hongshen Xu, Zhi Chen, Hanchong Zhang, Kai Yu

Exploring the generalization of a text-to-SQL parser is essential for a system to automatically adapt to real-world databases.

Text-To-SQL
