Search Results for author: Lifeng Shang

An empirical study shows that the proposed model can effectively deal with variations in questions and answers, and generate correct and natural answers by referring to the facts in the knowledge base.

Decoder Generative Question Answering +1

Neural Machine Translation with Reconstruction

1 code implementation • 7 Nov 2016 • Zhaopeng Tu, Yang Liu, Lifeng Shang, Xiaohua Liu, Hang Li

Although end-to-end Neural Machine Translation (NMT) has achieved remarkable progress in the past two years, it suffers from a major drawback: translations generated by NMT systems often lack adequacy.

Decoder Machine Translation +3

Paraphrase Generation with Deep Reinforcement Learning

no code implementations • EMNLP 2018 • Zichao Li, Xin Jiang, Lifeng Shang, Hang Li

The generator, built as a sequence-to-sequence learning model, can produce paraphrases given a sentence.

Paraphrase Generation Question Answering +3

An Investigation of Few-Shot Learning in Spoken Term Classification

1 code implementation • 26 Dec 2018 • Yangbin Chen, Tom Ko, Lifeng Shang, Xiao Chen, Xin Jiang, Qing Li

In this paper, we investigate the feasibility of applying few-shot learning algorithms to a speech task.

Few-Shot Learning General Classification +1

Decomposable Neural Paraphrase Generation

no code implementations • ACL 2019 • Zichao Li, Xin Jiang, Lifeng Shang, Qun Liu

Paraphrasing exists at different granularity levels, such as lexical level, phrasal level and sentential level.

Paraphrase Generation Sentence +1

Dialog State Tracking with Reinforced Data Augmentation

no code implementations • 21 Aug 2019 • Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu

Neural dialog state trackers are generally limited due to the lack of quantity and diversity of annotated training data.

Data Augmentation dialog state tracking +1

TinyBERT: Distilling BERT for Natural Language Understanding

7 code implementations • Findings of the Association for Computational Linguistics 2020 • Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu

To accelerate inference and reduce model size while maintaining accuracy, we first propose a novel Transformer distillation method that is specially designed for knowledge distillation (KD) of the Transformer-based models.

Ranked #1 on Natural Language Inference on MultiNLI Dev

Knowledge Distillation Language Modelling +6

Neural Subgraph Isomorphism Counting

1 code implementation • 25 Dec 2019 • Xin Liu, Haojie Pan, Mutian He, Yangqiu Song, Xin Jiang, Lifeng Shang

In this paper, we study a new graph learning problem: learning to count subgraph isomorphisms.

Domain Adaptation Graph Learning +4

DynaBERT: Dynamic BERT with Adaptive Width and Depth

3 code implementations • NeurIPS 2020 • Lu Hou, Zhiqi Huang, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu

Pre-trained language models like BERT, though powerful in many natural language processing tasks, are expensive in both computation and memory.

Language Modelling

Enriching Large-Scale Eventuality Knowledge Graph with Entailment Relations

1 code implementation • AKBC 2020 • Changlong Yu, Hongming Zhang, Yangqiu Song, Wilfred Ng, Lifeng Shang

Computational and cognitive studies suggest that the abstraction of eventualities (activities, states, and events) is crucial for humans to understand daily eventualities.

EEG

TernaryBERT: Distillation-aware Ultra-low Bit BERT

5 code implementations • EMNLP 2020 • Wei Zhang, Lu Hou, Yichun Yin, Lifeng Shang, Xiao Chen, Xin Jiang, Qun Liu

Transformer-based pre-training models like BERT have achieved remarkable performance in many natural language processing tasks. However, these models are expensive in both computation and memory, hindering their deployment to resource-constrained devices.

Knowledge Distillation Quantization

SparTerm: Learning Term-based Sparse Representation for Fast Text Retrieval

no code implementations • 2 Oct 2020 • Yang Bai, Xiaoguang Li, Gang Wang, Chaoliang Zhang, Lifeng Shang, Jun Xu, Zhaowei Wang, Fangshan Wang, Qun Liu

Term-based sparse representations dominate first-stage text retrieval in industrial applications, due to their advantages in efficiency, interpretability, and exact term matching.

Language Modelling Retrieval +1

Improving Task-Agnostic BERT Distillation with Layer Mapping Search

no code implementations • 11 Dec 2020 • Xiaoqi Jiao, Huating Chang, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu

Comprehensive experiments on the evaluation benchmarks demonstrate that 1) layer mapping strategy has a significant effect on task-agnostic BERT distillation and different layer mappings can result in quite different performances; 2) the optimal layer mapping strategy from the proposed search process consistently outperforms the other heuristic ones; 3) with the optimal layer mapping, our student model achieves state-of-the-art performance on the GLUE tasks.

Knowledge Distillation

BinaryBERT: Pushing the Limit of BERT Quantization

1 code implementation • ACL 2021 • Haoli Bai, Wei Zhang, Lu Hou, Lifeng Shang, Jing Jin, Xin Jiang, Qun Liu, Michael Lyu, Irwin King

In this paper, we propose BinaryBERT, which pushes BERT quantization to the limit by weight binarization.

Binarization Model Compression +1

HopRetriever: Retrieve Hops over Wikipedia to Answer Complex Questions

no code implementations • 31 Dec 2020 • Shaobo Li, Xiaoguang Li, Lifeng Shang, Xin Jiang, Qun Liu, Chengjie Sun, Zhenzhou Ji, Bingquan Liu

In this paper, we propose a new retrieval target, hop, to collect the hidden reasoning evidence from Wikipedia for complex question answering.

Ranked #6 on Question Answering on HotpotQA

Document Embedding Open-Domain Question Answering +1

On Position Embeddings in BERT

no code implementations • ICLR 2021 • Benyou Wang, Lifeng Shang, Christina Lioma, Xin Jiang, Hao Yang, Qun Liu, Jakob Grue Simonsen

Various Position Embeddings (PEs) have been proposed in Transformer-based architectures (e.g., BERT) to model word order.

General Classification Position +1

Non-invasive Self-attention for Side Information Fusion in Sequential Recommendation

no code implementations • 5 Mar 2021 • Chang Liu, Xiaoguang Li, Guohao Cai, Zhenhua Dong, Hong Zhu, Lifeng Shang

How to leverage various types of information under the BERT framework is still an open question.

Open-Ended Question Answering Sequential Recommendation +1

LightMBERT: A Simple Yet Effective Method for Multilingual BERT Distillation

no code implementations • 11 Mar 2021 • Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu

The multilingual pre-trained language models (e.g., mBERT, XLM and XLM-R) have shown impressive performance on cross-lingual natural language understanding tasks.

Natural Language Understanding XLM-R

Reweighting Augmented Samples by Minimizing the Maximal Expected Loss

1 code implementation • ICLR 2021 • Mingyang Yi, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma

Inspired by adversarial training, we minimize this maximal expected loss (MMEL) and obtain a simple and interpretable closed-form solution: more attention should be paid to augmented samples with large loss values (i.e., harder examples).

Image Augmentation Image Classification +1

Extract then Distill: Efficient and Effective Task-Agnostic BERT Distillation

no code implementations • 24 Apr 2021 • Cheng Chen, Yichun Yin, Lifeng Shang, Zhi Wang, Xin Jiang, Xiao Chen, Qun Liu

Task-agnostic knowledge distillation, a teacher-student framework, has been proved effective for BERT compression.

Knowledge Distillation

Improved OOD Generalization via Adversarial Training and Pre-training

no code implementations • 24 May 2021 • Mingyang Yi, Lu Hou, Jiacheng Sun, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma

In this paper, after defining OOD generalization via Wasserstein distance, we theoretically show that a model robust to input perturbation generalizes well on OOD data.

Image Classification Natural Language Understanding

A Mutual Information Maximization Approach for the Spurious Solution Problem in Weakly Supervised Question Answering

1 code implementation • ACL 2021 • Zhihong Shao, Lifeng Shang, Qun Liu, Minlie Huang

This setting gives rise to the spurious solution problem: there may exist many spurious solutions that coincidentally derive the correct answer, but training on such solutions can hurt model performance (e.g., producing wrong solutions or answers).

Question Answering

AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models

1 code implementation • ACL 2021 • Yichun Yin, Cheng Chen, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu

Specifically, we carefully design the techniques of one-shot learning and the search space to provide an adaptive and efficient way to develop tiny PLMs for various latency constraints.

Neural Architecture Search One-Shot Learning

GhostBERT: Generate More Features with Cheap Operations for BERT

no code implementations • ACL 2021 • Zhiqi Huang, Lu Hou, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu

Transformer-based pre-trained language models like BERT, though powerful in many tasks, are expensive in both memory and computation, due to their large number of parameters.

Integrating Regular Expressions with Neural Networks via DFA

no code implementations • 7 Sep 2021 • Shaobo Li, Qun Liu, Xin Jiang, Yichun Yin, Chengjie Sun, Bingquan Liu, Zhenzhou Ji, Lifeng Shang

Human-designed rules are widely used to build industry applications.

intent-classification Intent Classification +1

Generate & Rank: A Multi-task Framework for Math Word Problems

no code implementations • Findings (EMNLP) 2021 • Jianhao Shen, Yichun Yin, Lin Li, Lifeng Shang, Xin Jiang, Ming Zhang, Qun Liu

Math word problem (MWP) is a challenging and critical task in natural language processing.

Ranked #2 on Math Word Problem Solving on Math23K

Language Modelling Math +1

Improving Unsupervised Question Answering via Summarization-Informed Question Generation

no code implementations • EMNLP 2021 • Chenyang Lyu, Lifeng Shang, Yvette Graham, Jennifer Foster, Xin Jiang, Qun Liu

Template-based QG uses linguistically-informed heuristics to transform declarative sentences into interrogatives, whereas supervised QG uses existing Question Answering (QA) datasets to train a system to generate a question given a passage and an answer.

Dependency Parsing named-entity-recognition +8

DyLex: Incorporating Dynamic Lexicons into BERT for Sequence Labeling

1 code implementation • EMNLP 2021 • Baojun Wang, Zhao Zhang, Kun Xu, Guang-Yuan Hao, Yuyang Zhang, Lifeng Shang, Linlin Li, Xiao Chen, Xin Jiang, Qun Liu

Incorporating lexical knowledge into deep learning models has been proved to be very effective for sequence labeling tasks.

Denoising TAG

Towards Efficient Post-training Quantization of Pre-trained Language Models

no code implementations • 30 Sep 2021 • Haoli Bai, Lu Hou, Lifeng Shang, Xin Jiang, Irwin King, Michael R. Lyu

Experiments on GLUE and SQuAD benchmarks show that our proposed PTQ solution not only performs close to QAT, but also enjoys significant reductions in training time, memory overhead, and data consumption.

Quantization

bert2BERT: Towards Reusable Pretrained Language Models

no code implementations • ACL 2022 • Cheng Chen, Yichun Yin, Lifeng Shang, Xin Jiang, Yujia Qin, Fengyu Wang, Zhi Wang, Xiao Chen, Zhiyuan Liu, Qun Liu

However, large language model pre-training requires intensive computational resources, and most models are trained from scratch without reusing existing pre-trained models, which is wasteful.

Language Modelling Large Language Model

Read before Generate! Faithful Long Form Question Answering with Machine Reading

no code implementations • Findings (ACL) 2022 • Dan Su, Xiaoguang Li, Jindi Zhang, Lifeng Shang, Xin Jiang, Qun Liu, Pascale Fung

Long-form question answering (LFQA) aims to generate a paragraph-length answer for a given question.

Ranked #1 on Question Answering on KILT: ELI5

Answer Generation Long Form Question Answering +1

Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation

no code implementations • Findings (ACL) 2022 • Wenliang Dai, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Pascale Fung

Furthermore, the original textual language understanding and generation ability of the PLM is maintained after VLKD, which makes our model versatile for both multimodal and unimodal tasks.

Image Captioning Knowledge Distillation +4

Hyperlink-induced Pre-training for Passage Retrieval in Open-domain Question Answering

1 code implementation • ACL 2022 • Jiawei Zhou, Xiaoguang Li, Lifeng Shang, Lan Luo, Ke Zhan, Enrui Hu, Xinyu Zhang, Hao Jiang, Zhao Cao, Fan Yu, Xin Jiang, Qun Liu, Lei Chen

To alleviate the data scarcity problem in training question answering systems, recent works propose additional intermediate pre-training for dense passage retrieval (DPR).

Open-Domain Question Answering Passage Retrieval +1

Compression of Generative Pre-trained Language Models via Quantization

no code implementations • ACL 2022 • Chaofan Tao, Lu Hou, Wei Zhang, Lifeng Shang, Xin Jiang, Qun Liu, Ping Luo, Ngai Wong

We find that previous quantization methods fail on generative tasks due to the homogeneous word embeddings caused by reduced capacity, and the varied distribution of weights.

Model Compression Quantization +1

How Pre-trained Language Models Capture Factual Knowledge? A Causal-Inspired Analysis

no code implementations • Findings (ACL) 2022 • Shaobo Li, Xiaoguang Li, Lifeng Shang, Zhenhua Dong, Chengjie Sun, Bingquan Liu, Zhenzhou Ji, Xin Jiang, Qun Liu

We check the words that have three typical associations with the missing words: knowledge-dependent, positionally close, and highly co-occurred.

PanGu-Bot: Efficient Generative Dialogue Pre-training from Pre-trained Language Model

2 code implementations • 31 Mar 2022 • Fei Mi, Yitong Li, Yulong Zeng, Jingyan Zhou, Yasheng Wang, Chuanfei Xu, Lifeng Shang, Xin Jiang, Shiqi Zhao, Qun Liu

We investigate different aspects of responses generated by PanGu-Bot, including response quality, knowledge, and safety.

Dialogue Generation Language Modelling

Exploring Extreme Parameter Compression for Pre-trained Language Models

1 code implementation • ICLR 2022 • Yuxin Ren, Benyou Wang, Lifeng Shang, Xin Jiang, Qun Liu

A tiny version achieves $96.7\%$ of the performance of BERT-base with $1/48$ of the encoder parameters (i.e., fewer than 2M parameters excluding the embedding layer) and is $2.7\times$ faster at inference.

Knowledge Distillation Tensor Decomposition

Pre-training Language Models with Deterministic Factual Knowledge

no code implementations • 20 Oct 2022 • Shaobo Li, Xiaoguang Li, Lifeng Shang, Chengjie Sun, Bingquan Liu, Zhenzhou Ji, Xin Jiang, Qun Liu

Further experiments on question-answering datasets show that trying to learn a deterministic relationship with the proposed methods can also help other knowledge-intensive tasks.

Knowledge Probing Question Answering

LiteVL: Efficient Video-Language Learning with Enhanced Spatial-Temporal Modeling

no code implementations • 21 Oct 2022 • Dongsheng Chen, Chaofan Tao, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu

Recent large-scale video-language pre-trained models have shown appealing performance on various downstream tasks.

Language Modelling Question Answering +3

Constructing Highly Inductive Contexts for Dialogue Safety through Controllable Reverse Generation

1 code implementation • 4 Dec 2022 • Zhexin Zhang, Jiale Cheng, Hao Sun, Jiawen Deng, Fei Mi, Yasheng Wang, Lifeng Shang, Minlie Huang

In order to detect such toxic generations, existing methods rely on templates, real-world data extraction, crowdsourcing workers, or automatic generation to construct adversarial contexts that are likely to induce toxic generations.

Response Generation

G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks

1 code implementation • 7 Dec 2022 • Zhongwei Wan, Yichun Yin, Wei Zhang, Jiaxin Shi, Lifeng Shang, Guangyong Chen, Xin Jiang, Qun Liu

Recently, domain-specific PLMs have been proposed to boost the task performance of specific domains (e.g., biomedical and computer science) by continuing to pre-train general PLMs with domain-specific corpora.

General Knowledge Language Modelling +3

Retrieval-based Disentangled Representation Learning with Natural Language Supervision

no code implementations • 15 Dec 2022 • Jiawei Zhou, Xiaoguang Li, Lifeng Shang, Xin Jiang, Qun Liu, Lei Chen

In light of this, we present Vocabulary Disentangled Retrieval (VDR), a retrieval-based framework that harnesses natural language as proxies of the underlying data variation to drive disentangled representation learning.

Cross-Modal Retrieval Disentanglement +2

Enhancing Coherence of Extractive Summarization with Multitask Learning

no code implementations • 22 May 2023 • Renlong Jie, Xiaojun Meng, Lifeng Shang, Xin Jiang, Qun Liu

This study proposes a multitask learning architecture for extractive summarization with coherence boosting.

Extractive Summarization Sentence

mCLIP: Multilingual CLIP via Cross-lingual Transfer

1 code implementation • ACL 2023 • Guanhua Chen, Lu Hou, Yun Chen, Wenliang Dai, Lifeng Shang, Xin Jiang, Qun Liu, Jia Pan, Wenping Wang

Furthermore, to enhance the token- and sentence-level multilingual representation of the MTE, we propose to train it with machine translation and contrastive learning jointly before the TriKD to provide a better initialization.

Contrastive Learning Cross-Lingual Transfer +7

Aligning Large Language Models with Human: A Survey

1 code implementation • 24 Jul 2023 • YuFei Wang, Wanjun Zhong, Liangyou Li, Fei Mi, Xingshan Zeng, Wenyong Huang, Lifeng Shang, Xin Jiang, Qun Liu

(2) Training methodologies: a detailed review of the prevailing training methods employed for LLM alignment.

AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models

no code implementations • 12 Aug 2023 • Siheng Li, Cheng Yang, Yichun Yin, Xinyu Zhu, Zesen Cheng, Lifeng Shang, Xin Jiang, Qun Liu, Yujiu Yang

Information-seeking conversation, which aims to help users gather information through conversation, has achieved great progress in recent years.

Few-Shot Learning Language Modelling

NewsDialogues: Towards Proactive News Grounded Conversation

1 code implementation • 12 Aug 2023 • Siheng Li, Yichun Yin, Cheng Yang, Wangjie Jiang, Yiwei Li, Zesen Cheng, Lifeng Shang, Xin Jiang, Qun Liu, Yujiu Yang

In this paper, we propose a novel task, Proactive News Grounded Conversation, in which a dialogue system can proactively lead the conversation based on some key topics of the news.

Response Generation

Prompt-Based Length Controlled Generation with Reinforcement Learning

no code implementations • 23 Aug 2023 • Renlong Jie, Xiaojun Meng, Lifeng Shang, Xin Jiang, Qun Liu

Large language models (LLMs) like ChatGPT and GPT-4 have attracted great attention given their surprising performance on a wide range of NLP tasks.

reinforcement-learning

SELF: Self-Evolution with Language Feedback

no code implementations • 1 Oct 2023 • Jianqiao Lu, Wanjun Zhong, Wenyong Huang, YuFei Wang, Qi Zhu, Fei Mi, Baojun Wang, Weichao Wang, Xingshan Zeng, Lifeng Shang, Xin Jiang, Qun Liu

SELF initiates with a meta-skill learning process that equips the LLMs with capabilities for self-feedback and self-refinement.

Language Modelling Large Language Model

Exploring the Usage of Chinese Pinyin in Pretraining

no code implementations • 8 Oct 2023 • Baojun Wang, Kun Xu, Lifeng Shang

Through delicate pretraining tasks, the characters and pinyin representation are fused, which can enhance the error tolerance for SSP errors.

Language Modelling

Improving Factual Consistency for Knowledge-Grounded Dialogue Systems via Knowledge Enhancement and Alignment

1 code implementation • 12 Oct 2023 • Boyang Xue, Weichao Wang, Hongru Wang, Fei Mi, Rui Wang, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong

Inspired by previous work which identified that feed-forward networks (FFNs) within Transformers are responsible for factual knowledge expressions, we investigate two methods to efficiently improve the factual expression capability of FFNs by knowledge enhancement and alignment respectively.

Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis

no code implementations • 16 Oct 2023 • Kai Chen, Chunwei Wang, Kuo Yang, Jianhua Han, Lanqing Hong, Fei Mi, Hang Xu, Zhengying Liu, Wenyong Huang, Zhenguo Li, Dit-Yan Yeung, Lifeng Shang, Xin Jiang, Qun Liu

The rapid development of large language models (LLMs) has not only provided numerous opportunities but also presented significant challenges.

Instruction Following

M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models

1 code implementation • 30 Oct 2023 • Wai-Chung Kwan, Xingshan Zeng, YuFei Wang, Yusen Sun, Liangyou Li, Lifeng Shang, Qun Liu, Kam-Fai Wong

In this paper, we propose M4LE, a Multi-ability, Multi-range, Multi-task, Multi-domain benchmark for Long-context Evaluation.

8k Semantic Retrieval

FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models

1 code implementation • 31 Oct 2023 • Yuxin Jiang, YuFei Wang, Xingshan Zeng, Wanjun Zhong, Liangyou Li, Fei Mi, Lifeng Shang, Xin Jiang, Qun Liu, Wei Wang

To fill this research gap, in this paper, we propose FollowBench, a Multi-level Fine-grained Constraints Following Benchmark for LLMs.

Instruction Following

Data Management For Large Language Models: A Survey

1 code implementation • 4 Dec 2023 • Zige Wang, Wanjun Zhong, YuFei Wang, Qi Zhu, Fei Mi, Baojun Wang, Lifeng Shang, Xin Jiang, Qun Liu

Data plays a fundamental role in the training of Large Language Models (LLMs).

Management

Preparing Lessons for Progressive Training on Language Models

1 code implementation • 17 Jan 2024 • Yu Pan, Ye Yuan, Yichun Yin, Jiaxin Shi, Zenglin Xu, Ming Zhang, Lifeng Shang, Xin Jiang, Qun Liu

The rapid progress of Transformers in artificial intelligence has come at the cost of increased resource consumption and greenhouse gas emissions due to growing model sizes.

PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models

1 code implementation • 26 Jan 2024 • Haochen Tan, Zhijiang Guo, Zhan Shi, Lu Xu, Zhili Liu, Yunlong Feng, Xiaoguang Li, Yasheng Wang, Lifeng Shang, Qun Liu, Linqi Song

LLMs are prompted to generate extensive content in response to these meta-questions.

Text Generation

YODA: Teacher-Student Progressive Learning for Language Models

no code implementations • 28 Jan 2024 • Jianqiao Lu, Wanjun Zhong, YuFei Wang, Zhijiang Guo, Qi Zhu, Wenyong Huang, Yanlin Wang, Fei Mi, Baojun Wang, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu

With the teacher's guidance, the student learns to iteratively refine its answer with feedback, and forms a robust and comprehensive understanding of the posed questions.

GSM8K Math

Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios

1 code implementation • 30 Jan 2024 • Shijue Huang, Wanjun Zhong, Jianqiao Lu, Qi Zhu, Jiahui Gao, Weiwen Liu, Yutai Hou, Xingshan Zeng, Yasheng Wang, Lifeng Shang, Xin Jiang, Ruifeng Xu, Qun Liu

The recent trend of using Large Language Models (LLMs) as tool agents in real-world applications underscores the necessity for comprehensive evaluations of their capabilities, particularly in complex scenarios involving planning, creating, and using tools.

Benchmarking

MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large Language Models

1 code implementation • 30 Jan 2024 • Wai-Chung Kwan, Xingshan Zeng, Yuxin Jiang, YuFei Wang, Liangyou Li, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong

Large language models (LLMs) are increasingly relied upon for complex multi-turn conversations across diverse real-world applications.

Learning to Edit: Aligning LLMs with Knowledge Editing

1 code implementation • 19 Feb 2024 • Yuxin Jiang, YuFei Wang, Chuhan Wu, Wanjun Zhong, Xingshan Zeng, Jiahui Gao, Liangyou Li, Xin Jiang, Lifeng Shang, Ruiming Tang, Qun Liu, Wei Wang

Knowledge editing techniques, aiming to efficiently modify a minor proportion of knowledge in large language models (LLMs) without negatively impacting performance across other inputs, have garnered widespread attention.

knowledge editing Philosophy

Does the Generator Mind its Contexts? An Analysis of Generative Model Faithfulness under Context Transfer

no code implementations • 22 Feb 2024 • Xinshuo Hu, Baotian Hu, Dongfang Li, Xiaoguang Li, Lifeng Shang

The present study introduces the knowledge-augmented generator, which is specifically designed to produce information that remains grounded in contextual knowledge, regardless of alterations in the context.

Generative Question Answering Hallucination +1

MTRec: Multi-Task Learning over BERT for News Recommendation

1 code implementation • Findings (ACL) 2022 • Qiwei Bi, Jian Li, Lifeng Shang, Xin Jiang, Qun Liu, Hanfang Yang

With the adoption of large pre-trained models like BERT in news recommendation, the above way to incorporate multi-field information may encounter challenges: the shallow feature encoding to compress the category and entity information is not compatible with the deep BERT encoding.

Multi-Task Learning News Recommendation

Controlled Text Generation Using Dictionary Prior in Variational Autoencoders

no code implementations • Findings (ACL) 2022 • Xianghong Fang, Jian Li, Lifeng Shang, Xin Jiang, Qun Liu, Dit-Yan Yeung

While variational autoencoders (VAEs) have been widely applied in text generation tasks, they are troubled by two challenges: insufficient representation capacity and poor controllability.

Contrastive Learning Language Modelling +2

MINER: Multi-Interest Matching Network for News Recommendation

1 code implementation • Findings (ACL) 2022 • Jian Li, Jieming Zhu, Qiwei Bi, Guohao Cai, Lifeng Shang, Zhenhua Dong, Xin Jiang, Qun Liu

Accurately matching users' interests and candidate news is the key to news recommendation.

News Recommendation
