Search Results for author: Xinting Huang

Found 29 papers, 13 papers with code

LoGU: Long-form Generation with Uncertainty Expressions

1 code implementation • 18 Oct 2024 • Ruihan Yang, Caiqi Zhang, Zhisong Zhang, Xinting Huang, Sen Yang, Nigel Collier, Dong Yu, Deqing Yang

To tackle these challenges, we propose a refinement-based data collection framework and a two-stage training pipeline.

Instruction Following

Atomic Calibration of LLMs in Long-Form Generations

no code implementations • 17 Oct 2024 • Caiqi Zhang, Ruihan Yang, Zhisong Zhang, Xinting Huang, Sen Yang, Dong Yu, Nigel Collier

Existing research on LLM calibration has primarily focused on short-form tasks, providing a single confidence score at the response level (macro calibration).
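The contrast between response-level (macro) and claim-level (atomic) calibration can be illustrated with a small sketch. This is a generic example, not the paper's implementation; the claims, confidence scores, and correctness labels below are hypothetical.

```python
# Each entry: (claim text, model confidence, whether the claim is correct).
# All values are hypothetical, for illustration only.
claims = [
    ("Paris is the capital of France.", 0.95, True),
    ("The Eiffel Tower was completed in 1887.", 0.80, False),
    ("France borders Spain.", 0.90, True),
]

# Macro calibration: a single confidence score for the whole response.
macro_confidence = sum(conf for _, conf, _ in claims) / len(claims)

# Atomic calibration: compare each claim's confidence with its correctness.
atomic_errors = [abs(conf - float(correct)) for _, conf, correct in claims]
mean_atomic_error = sum(atomic_errors) / len(atomic_errors)

print(f"macro confidence: {macro_confidence:.2f}")
print(f"mean per-claim calibration error: {mean_atomic_error:.2f}")
```

The single macro score here looks well calibrated even though one atomic claim is confidently wrong, which is exactly the gap claim-level calibration is meant to expose.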

Selection-p: Self-Supervised Task-Agnostic Prompt Compression for Faithfulness and Transferability

no code implementations • 15 Oct 2024 • Tsz Ting Chung, Leyang Cui, Lemao Liu, Xinting Huang, Shuming Shi, Dit-Yan Yeung

Large Language Models (LLMs) have demonstrated impressive capabilities in a wide range of natural language processing tasks when leveraging in-context learning.

In-Context Learning

See or Guess: Counterfactually Regularized Image Captioning

1 code implementation • 29 Aug 2024 • Qian Cao, Xu Chen, Ruihua Song, Xiting Wang, Xinting Huang, Yuchen Ren

Image captioning, which generates natural language descriptions of the visual information in an image, is a crucial task in vision-language research.

Causal Inference · Counterfactual +2

Not All Preference Pairs Are Created Equal: A Recipe for Annotation-Efficient Iterative Preference Learning

no code implementations • 25 Jun 2024 • Sen Yang, Leyang Cui, Deng Cai, Xinting Huang, Shuming Shi, Wai Lam

Iterative preference learning, though yielding superior performances, requires online annotated preference labels.

On the Transformations across Reward Model, Parameter Update, and In-Context Prompt

no code implementations • 24 Jun 2024 • Deng Cai, Huayang Li, Tingchen Fu, Siheng Li, Weiwen Xu, Shuaiyi Li, Bowen Cao, Zhisong Zhang, Xinting Huang, Leyang Cui, Yan Wang, Lemao Liu, Taro Watanabe, Shuming Shi

Despite the general capabilities of pre-trained large language models (LLMs), they still need further adaptation to better serve practical applications.

InversionView: A General-Purpose Method for Reading Information from Neural Activations

1 code implementation • 27 May 2024 • Xinting Huang, Madhur Panwar, Navin Goyal, Michael Hahn

The inner workings of neural networks can be better understood if we can fully decipher the information encoded in neural activations.

Decoder

Knowledge Fusion of Chat LLMs: A Preliminary Technical Report

2 code implementations • 25 Feb 2024 • Fanqi Wan, Ziyi Yang, Longguang Zhong, Xiaojun Quan, Xinting Huang, Wei Bi

Recently, FuseLLM introduced the concept of knowledge fusion to transfer the collective knowledge of multiple structurally varied LLMs into a target LLM through lightweight continual training.

Knowledge Verification to Nip Hallucination in the Bud

1 code implementation • 19 Jan 2024 • Fanqi Wan, Xinting Huang, Leyang Cui, Xiaojun Quan, Wei Bi, Shuming Shi

While large language models (LLMs) have demonstrated exceptional performance across various tasks following human alignment, they may still generate responses that sound plausible but contradict factual knowledge, a phenomenon known as hallucination.

Hallucination · World Knowledge

Knowledge Fusion of Large Language Models

3 code implementations • 19 Jan 2024 • Fanqi Wan, Xinting Huang, Deng Cai, Xiaojun Quan, Wei Bi, Shuming Shi

In this paper, we introduce the notion of knowledge fusion for LLMs, aimed at combining the capabilities of existing LLMs and transferring them into a single LLM.

Code Generation · Common Sense Reasoning +6

Inferflow: an Efficient and Highly Configurable Inference Engine for Large Language Models

1 code implementation • 16 Jan 2024 • Shuming Shi, Enbo Zhao, Deng Cai, Leyang Cui, Xinting Huang, Huayang Li

We present Inferflow, an efficient and highly configurable inference engine for large language models (LLMs).

Quantization
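The quantization tagged on the Inferflow entry can be illustrated with a generic sketch of symmetric 8-bit weight quantization, the kind of technique LLM inference engines commonly support. This is an assumption-laden toy example, not Inferflow's actual implementation; the function names and weight values are invented for illustration.

```python
def quantize_int8(weights):
    """Map float weights to int8 values using one per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0  # largest weight maps to +/-127
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize_int8(quantized, scale):
    """Approximately recover the original float weights."""
    return [q * scale for q in quantized]

# Hypothetical weights; the largest magnitude (1.27) sets the scale.
weights = [0.5, -1.27, 0.03, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize_int8(quantized, scale)
```

Storing one scale per tensor (or per row, in finer-grained schemes) trades a small reconstruction error for a 4x reduction in weight memory versus float32.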

Longer Fixations, More Computation: Gaze-Guided Recurrent Neural Networks

no code implementations • 31 Oct 2023 • Xinting Huang, Jiajing Wan, Ioannis Kritikos, Nora Hollenstein

Humans read text at a varying pace, whereas machine learning models devote the same amount of computation to every token.

Language Modelling · Sentiment Analysis

SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving

no code implementations • 19 Oct 2023 • Xueliang Zhao, Xinting Huang, Wei Bi, Lingpeng Kong

Large Language Models (LLMs) have driven substantial progress in artificial intelligence in recent years, exhibiting impressive capabilities across a wide range of tasks, including mathematical problem-solving.

GSM8K · Math +1

Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration

3 code implementations • 13 Oct 2023 • Fanqi Wan, Xinting Huang, Tao Yang, Xiaojun Quan, Wei Bi, Shuming Shi

Instruction-tuning can be substantially optimized through enhanced diversity, resulting in models capable of handling a broader spectrum of tasks.

Diversity

DoG-Instruct: Towards Premium Instruction-Tuning Data via Text-Grounded Instruction Wrapping

1 code implementation • 11 Sep 2023 • Yongrui Chen, Haiyun Jiang, Xinting Huang, Shuming Shi, Guilin Qi

In particular, compared to the best-performing baseline, the LLM trained using our generated dataset exhibits a 10% relative improvement in performance on AlpacaEval, despite utilizing only 1/5 of its training data.

Hallucination · Instruction Following

Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models

1 code implementation • 3 Sep 2023 • Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, Shuming Shi

While large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks, a significant concern revolves around their propensity to exhibit hallucinations: LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.

Hallucination · World Knowledge

Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration

1 code implementation • 15 Jun 2023 • Chenyang Lyu, Minghao Wu, Longyue Wang, Xinting Huang, Bingshuai Liu, Zefeng Du, Shuming Shi, Zhaopeng Tu

Although instruction-tuned large language models (LLMs) have exhibited remarkable capabilities across various NLP tasks, their effectiveness on other data modalities beyond text has not been fully studied.

Language Modelling

Pre-training Multi-party Dialogue Models with Latent Discourse Inference

1 code implementation • 24 May 2023 • Yiyang Li, Xinting Huang, Wei Bi, Hai Zhao

Multi-party dialogues are more difficult for models to understand than one-to-one two-party dialogues, since they involve multiple interlocutors, resulting in interweaving reply-to relations and information flows.

Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance

no code implementations • 22 May 2023 • Yue Zhang, Leyang Cui, Deng Cai, Xinting Huang, Tao Fang, Wei Bi

Proprietary Large Language Models (LLMs), such as ChatGPT, have garnered significant attention due to their exceptional capabilities in handling a diverse range of tasks.

Instruction Following

Effidit: Your AI Writing Assistant

no code implementations • 3 Aug 2022 • Shuming Shi, Enbo Zhao, Duyu Tang, Yan Wang, Piji Li, Wei Bi, Haiyun Jiang, Guoping Huang, Leyang Cui, Xinting Huang, Cong Zhou, Yong Dai, Dongyang Ma

In Effidit, we significantly expand the capacities of a writing assistant by providing functions in five categories: text completion, error checking, text polishing, keywords to sentences (K2S), and cloud input methods (cloud IME).

Keywords to Sentences · Retrieval +3

Robust Task-Oriented Dialogue Generation with Contrastive Pre-training and Adversarial Filtering

no code implementations • 20 May 2022 • Shiquan Yang, Xinting Huang, Jey Han Lau, Sarah Erfani

Data artifacts incentivize machine learning models to learn non-transferable generalizations by exploiting shortcuts in the data, and there is growing evidence that such artifacts contribute to the strong results deep learning models achieve on recent natural language processing benchmarks.

Contrastive Learning · Dialogue Generation

KaLM at SemEval-2020 Task 4: Knowledge-aware Language Models for Comprehension And Generation

1 code implementation • SemEval 2020 • Jiajing Wan, Xinting Huang

This paper presents our strategies in SemEval 2020 Task 4: Commonsense Validation and Explanation.

Semi-Supervised Dialogue Policy Learning via Stochastic Reward Estimation

no code implementations • ACL 2020 • Xinting Huang, Jianzhong Qi, Yu Sun, Rui Zhang

This approach requires complete state-action annotations of human-to-human dialogues (i.e., expert demonstrations), which is labor-intensive.

Task-Oriented Dialogue Systems

MALA: Cross-Domain Dialogue Generation with Action Learning

no code implementations • 18 Dec 2019 • Xinting Huang, Jianzhong Qi, Yu Sun, Rui Zhang

These two components, however, have a discrepancy in their objectives, i.e., task completion and language quality.

Dialogue Generation · Response Generation +2

CARL: Aggregated Search with Context-Aware Module Embedding Learning

no code implementations • 3 Aug 2019 • Xinting Huang, Jianzhong Qi, Yu Sun, Rui Zhang, Hai-Tao Zheng

To model and utilize the context information for aggregated search, we propose a model with context attention and representation learning (CARL).

Representation Learning
