Search Results for author: Fei Huang

Found 214 papers, 124 papers with code

PALM: Pre-training an Autoencoding\&Autoregressive Language Model for Context-conditioned Generation

no code implementations EMNLP 2020 Bin Bi, Chenliang Li, Chen Wu, Ming Yan, Wei Wang, Songfang Huang, Fei Huang, Luo Si

An extensive set of experiments show that PALM achieves new state-of-the-art results on a variety of language generation benchmarks covering generative question answering (Rank 1 on the official MARCO leaderboard), abstractive summarization on CNN/DailyMail as well as Gigaword, question generation on SQuAD, and conversational response generation on Cornell Movie Dialogues.

Abstractive Text Summarization Conversational Response Generation +8

Alibaba Speech Translation Systems for IWSLT 2018

no code implementations IWSLT (EMNLP) 2018 Nguyen Bach, Hongjie Chen, Kai Fan, Cheung-Chi Leung, Bo Li, Chongjia Ni, Rong Tong, Pei Zhang, Boxing Chen, Bin Ma, Fei Huang

This work describes the En→De Alibaba speech translation system developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2018.

Sentence Translation

S^4-Tuning: A Simple Cross-lingual Sub-network Tuning Method

no code implementations ACL 2022 Runxin Xu, Fuli Luo, Baobao Chang, Songfang Huang, Fei Huang

The emergence of multilingual pre-trained language models makes it possible to adapt to target languages with only few labeled examples. However, vanilla fine-tuning tends to achieve degenerated and unstable results, owing to the Language Interference among different languages, and Parameter Overload under the few-sample transfer learning scenarios. To address two problems elegantly, we propose S^4-Tuning, a Simple Cross-lingual Sub-network Tuning method.

Transfer Learning

Bilingual Methods for Adaptive Training Data Selection for Machine Translation

no code implementations AMTA 2016 Boxing Chen, Roland Kuhn, George Foster, Colin Cherry, Fei Huang

In this paper, we propose a new data selection method which uses semi-supervised convolutional neural networks based on bitokens (Bi-SSCNNs) for training machine translation systems from a large bilingual corpus.

Machine Translation NMT +2

Rethinking Denoised Auto-Encoding in Language Pre-Training

no code implementations EMNLP 2021 Fuli Luo, Pengcheng Yang, Shicheng Li, Xuancheng Ren, Xu sun, Songfang Huang, Fei Huang

Pre-trained self-supervised models such as BERT have achieved striking success in learning sequence representations, especially for natural language processing.

Natural Language Understanding Sentence

A Survey on Self-Evolution of Large Language Models

no code implementations22 Apr 2024 Zhengwei Tao, Ting-En Lin, Xiancai Chen, Hangyu Li, Yuchuan Wu, Yongbin Li, Zhi Jin, Fei Huang, DaCheng Tao, Jingren Zhou

Large language models (LLMs) have significantly advanced in various fields and intelligent agent applications.

Exploring Key Point Analysis with Pairwise Generation and Graph Partitioning

1 code implementation17 Apr 2024 Xiao Li, Yong Jiang, Shen Huang, Pengjun Xie, Gong Cheng, Fei Huang

Our objective is to train a generative model that can simultaneously provide a score indicating the presence of shared key point between a pair of arguments and generate the shared key point.

Argument Mining graph partitioning +2

Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts

no code implementations2 Apr 2024 Zhuo Chen, Xinyu Wang, Yong Jiang, Pengjun Xie, Fei Huang, Kewei Tu

With our method, the origin language models can cover several times longer contexts while keeping the computing requirements close to the baseline.

In-Context Learning Language Modelling +2

OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition

1 code implementation28 Mar 2024 Jianqiang Wan, Sibo Song, Wenwen Yu, Yuliang Liu, Wenqing Cheng, Fei Huang, Xiang Bai, Cong Yao, Zhibo Yang

Recently, visually-situated text parsing (VsTP) has experienced notable advancements, driven by the increasing demand for automated document understanding and the emergence of Generative Large Language Models (LLMs) capable of processing document-based questions.

document understanding Key Information Extraction +3

Fine-Tuning Language Models with Reward Learning on Policy

1 code implementation28 Mar 2024 Hao Lang, Fei Huang, Yongbin Li

RLHF contains three steps, i. e., human preference collecting, reward learning, and policy optimization, which are usually performed serially.


ReAct Meets ActRe: When Language Agents Enjoy Training Data Autonomy

no code implementations21 Mar 2024 Zonghan Yang, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

In WebShop, the 1-shot performance of the A$^3$T agent matches human average, and 4 rounds of iterative refinement lead to the performance approaching human experts.

Policy Gradient Methods

RoleInteract: Evaluating the Social Interaction of Role-Playing Agents

1 code implementation20 Mar 2024 Hongzhan Chen, Hehong Chen, Ming Yan, Wenshen Xu, Xing Gao, Weizhou Shen, Xiaojun Quan, Chenliang Li, Ji Zhang, Fei Huang, Jingren Zhou

In this paper, we introduce RoleInteract, the first benchmark designed to systematically evaluate the sociality of role-playing conversational agents at both individual and group levels of social interactions.

mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding

1 code implementation19 Mar 2024 Anwen Hu, Haiyang Xu, Jiabo Ye, Ming Yan, Liang Zhang, Bo Zhang, Chen Li, Ji Zhang, Qin Jin, Fei Huang, Jingren Zhou

In this work, we emphasize the importance of structure information in Visual Document Understanding and propose the Unified Structure Learning to boost the performance of MLLMs.

document understanding Optical Character Recognition (OCR)

Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment

1 code implementation17 Mar 2024 Feifan Song, Bowen Yu, Hao Lang, Haiyang Yu, Fei Huang, Houfeng Wang, Yongbin Li

Additionally, the concept of diversity for prompts can be more complex than responses that are typically quantified by single digits.

Data Augmentation

Improving Cross-lingual Representation for Semantic Retrieval with Code-switching

no code implementations3 Mar 2024 Mieradilijiang Maimaiti, Yuanhang Zheng, Ji Zhang, Fei Huang, Yue Zhang, Wenpei Luo, Kaiyu Huang

Semantic Retrieval (SR) has become an indispensable part of the FAQ system in the task-oriented question-answering (QA) dialogue scenario.

Question Answering Retrieval +3

Semantics-enhanced Cross-modal Masked Image Modeling for Vision-Language Pre-training

no code implementations1 Mar 2024 Haowei Liu, Yaya Shi, Haiyang Xu, Chunfeng Yuan, Qinghao Ye, Chenliang Li, Ming Yan, Ji Zhang, Fei Huang, Bing Li, Weiming Hu

In vision-language pre-training (VLP), masked image modeling (MIM) has recently been introduced for fine-grained cross-modal alignment.

Representation Learning

Let LLMs Take on the Latest Challenges! A Chinese Dynamic Question Answering Benchmark

1 code implementation29 Feb 2024 Zhikun Xu, Yinghui Li, Ruixue Ding, Xinyu Wang, Boli Chen, Yong Jiang, Hai-Tao Zheng, Wenlian Lu, Pengjun Xie, Fei Huang

To promote the improvement of Chinese LLMs' ability to answer dynamic questions, in this paper, we introduce CDQA, a Chinese Dynamic QA benchmark containing question-answer pairs related to the latest news on the Chinese Internet.

Question Answering

Training-Free Long-Context Scaling of Large Language Models

1 code implementation27 Feb 2024 Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, Lingpeng Kong

The ability of Large Language Models (LLMs) to process and generate coherent text is markedly weakened when the number of input tokens exceeds their pretraining length.


Unifying Latent and Lexicon Representations for Effective Video-Text Retrieval

no code implementations26 Feb 2024 Haowei Liu, Yaya Shi, Haiyang Xu, Chunfeng Yuan, Qinghao Ye, Chenliang Li, Ming Yan, Ji Zhang, Fei Huang, Bing Li, Weiming Hu

In this work, we propose the UNIFY framework, which learns lexicon representations to capture fine-grained semantics and combines the strengths of latent and lexicon representations for video-text retrieval.

Retrieval Text Retrieval +1

Budget-Constrained Tool Learning with Planning

1 code implementation25 Feb 2024 Yuanhang Zheng, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

Despite intensive efforts devoted to tool learning, the problem of budget-constrained tool learning, which focuses on resolving user queries within a specific budget constraint, has been widely overlooked.

Self-Retrieval: Building an Information Retrieval System with One Large Language Model

no code implementations23 Feb 2024 Qiaoyu Tang, Jiawei Chen, Bowen Yu, Yaojie Lu, Cheng Fu, Haiyang Yu, Hongyu Lin, Fei Huang, Ben He, Xianpei Han, Le Sun, Yongbin Li

The rise of large language models (LLMs) has transformed the role of information retrieval (IR) systems in the way to humans accessing information.

Information Retrieval Language Modelling +2

Model Composition for Multimodal Large Language Models

no code implementations20 Feb 2024 Chi Chen, Yiyang Du, Zheng Fang, Ziyue Wang, Fuwen Luo, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Maosong Sun, Yang Liu

In this paper, we propose a new paradigm through the model composition of existing MLLMs to create a new model that retains the modal understanding capabilities of each original model.

PANDA: Preference Adaptation for Enhancing Domain-Specific Abilities of LLMs

no code implementations20 Feb 2024 An Liu, Zonghan Yang, Zhenhe Zhang, Qingyuan Hu, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

While Large language models (LLMs) have demonstrated considerable capabilities across various natural language tasks, they often fall short of the performance achieved by domain-specific state-of-the-art models.

text-classification Text Classification

Browse and Concentrate: Comprehending Multimodal Content via prior-LLM Context Fusion

1 code implementation19 Feb 2024 Ziyue Wang, Chi Chen, Yiqi Zhu, Fuwen Luo, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Maosong Sun, Yang Liu

With the bloom of Large Language Models (LLMs), Multimodal Large Language Models (MLLMs) that incorporate LLMs with pre-trained vision models have recently demonstrated impressive performance across diverse vision-language tasks.

Meta Ranking: Less Capable Language Models are Capable for Single Response Judgement

1 code implementation19 Feb 2024 Zijun Liu, Boqun Kou, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu

Although Large Language Models (LLMs) have demonstrated strong performance on a wide range of tasks, they still face reliability challenges such as hallucination.


AI Hospital: Interactive Evaluation and Collaboration of LLMs as Intern Doctors for Clinical Diagnosis

1 code implementation15 Feb 2024 Zhihao Fan, Jialong Tang, Wei Chen, Siyuan Wang, Zhongyu Wei, Jun Xi, Fei Huang, Jingren Zhou

To simulate the procedure, we collect high-quality medical records to create patient, examiner, and medical director agents.

Question Answering

Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception

1 code implementation29 Jan 2024 Junyang Wang, Haiyang Xu, Jiabo Ye, Ming Yan, Weizhou Shen, Ji Zhang, Fei Huang, Jitao Sang

To assess the performance of Mobile-Agent, we introduced Mobile-Eval, a benchmark for evaluating mobile device operations.

Small LLMs Are Weak Tool Learners: A Multi-LLM Agent

1 code implementation14 Jan 2024 Weizhou Shen, Chenliang Li, Hongzhan Chen, Ming Yan, Xiaojun Quan, Hehong Chen, Ji Zhang, Fei Huang

Each component is implemented by a single LLM that focuses on a specific capability and collaborates with others to accomplish the task.

Language Modelling Large Language Model

Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection

no code implementations11 Jan 2024 Wei Ye, Chaoya Jiang, Haiyang Xu, Chenhao Ye, Chenliang Li, Ming Yan, Shikun Zhang, Songhang Huang, Fei Huang

Vision Transformers (ViTs) have become increasingly popular in large-scale Vision and Language Pre-training (VLP) models.

LORE++: Logical Location Regression Network for Table Structure Recognition with Pre-training

no code implementations3 Jan 2024 Rujiao Long, Hangdi Xing, Zhibo Yang, Qi Zheng, Zhi Yu, Cong Yao, Fei Huang

We model TSR as a logical location regression problem and propose a new TSR framework called LORE, standing for LOgical location REgression network, which for the first time regresses logical location as well as spatial location of table cells in a unified network.


DialCLIP: Empowering CLIP as Multi-Modal Dialog Retriever

no code implementations2 Jan 2024 Zhichao Yin, Binyuan Hui, Min Yang, Fei Huang, Yongbin Li

Recently, substantial advancements in pre-trained vision-language models have greatly enhanced the capabilities of multi-modal dialog systems.

Language Modelling Retrieval

Unifying Structured Data as Graph for Data-to-Text Pre-Training

1 code implementation2 Jan 2024 Shujie Li, Liang Li, Ruiying Geng, Min Yang, Binhua Li, Guanghu Yuan, Wanwei He, Shao Yuan, Can Ma, Fei Huang, Yongbin Li

In this paper, we unify different types of structured data (i. e., table, key-value data, knowledge graph) into the graph format and cast different data-to-text generation tasks as graph-to-text generation.

Data-to-Text Generation

EcomGPT-CT: Continual Pre-training of E-commerce Large Language Models with Semi-structured Data

no code implementations25 Dec 2023 Shirong Ma, Shen Huang, Shulin Huang, Xiaobin Wang, Yangning Li, Hai-Tao Zheng, Pengjun Xie, Fei Huang, Yong Jiang

Experimental results demonstrate the effectiveness of continual pre-training of E-commerce LLMs and the efficacy of our devised data mixing strategy.

In-Context Learning

One Shot Learning as Instruction Data Prospector for Large Language Models

1 code implementation16 Dec 2023 Yunshui Li, Binyuan Hui, Xiaobo Xia, Jiaxi Yang, Min Yang, Lei Zhang, Shuzheng Si, Junhao Liu, Tongliang Liu, Fei Huang, Yongbin Li

Nuggets assesses the potential of individual instruction examples to act as effective one shot examples, thereby identifying those that can significantly enhance diverse task performance.

One-Shot Learning

Hallucination Augmented Contrastive Learning for Multimodal Large Language Model

1 code implementation12 Dec 2023 Chaoya Jiang, Haiyang Xu, Mengfan Dong, Jiaxing Chen, Wei Ye, Ming Yan, Qinghao Ye, Ji Zhang, Fei Huang, Shikun Zhang

We first analyzed the representation distribution of textual and visual tokens in MLLM, revealing two important findings: 1) there is a significant gap between textual and visual representations, indicating unsatisfactory cross-modal representation alignment; 2) representations of texts that contain and do not contain hallucinations are entangled, making it challenging to distinguish them.

Contrastive Learning Hallucination +4

Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use

1 code implementation7 Dec 2023 Yuhan Chen, Ang Lv, Ting-En Lin, Changyu Chen, Yuchuan Wu, Fei Huang, Yongbin Li, Rui Yan

Specifically, the crucial information in the context will be potentially overlooked by model when it is positioned in the trough zone of the attention waveform, leading to decreased performance.

Trajectory Planning

Jointly spatial-temporal representation learning for individual trajectories

no code implementations7 Dec 2023 Fei Huang, Jianrong Lv, Yang Yue

The proposed ST-GraphRL consists of three compositions: (i) a weighted directed spatial-temporal graph to explicitly construct mobility interactions in both space and time dimensions; (ii) a two-stage jointly encoder (i. e., decoupling and fusion), to learn entangled spatial-temporal dependencies by independently decomposing and jointly aggregating space and time information; (iii) a decoder guides ST-GraphRL to learn explicit mobility regularities by simulating the spatial-temporal distributions of trajectories.

Representation Learning

mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model

1 code implementation30 Nov 2023 Anwen Hu, Yaya Shi, Haiyang Xu, Jiabo Ye, Qinghao Ye, Ming Yan, Chenliang Li, Qi Qian, Ji Zhang, Fei Huang

In this work, towards a more versatile copilot for academic paper writing, we mainly focus on strengthening the multi-modal diagram analysis ability of Multimodal LLMs.

Language Modelling Large Language Model

Speculative Contrastive Decoding

no code implementations15 Nov 2023 Hongyi Yuan, Keming Lu, Fei Huang, Zheng Yuan, Chang Zhou

Large language models~(LLMs) exhibit exceptional performance in language tasks, yet their auto-regressive inference is limited due to high computational requirements and is sub-optimal due to the exposure bias.

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

1 code implementation6 Nov 2023 Le Yu, Bowen Yu, Haiyang Yu, Fei Huang, Yongbin Li

Then, we use DARE as a versatile plug-and-play technique to sparsify delta parameters of multiple SFT homologous models for mitigating parameter interference and merge them into a single model by parameter fusing.

GSM8K Instruction Following

Improving Factual Consistency of Text Summarization by Adversarially Decoupling Comprehension and Embellishment Abilities of LLMs

no code implementations30 Oct 2023 Huawen Feng, Yan Fan, Xiong Liu, Ting-En Lin, Zekun Yao, Yuchuan Wu, Fei Huang, Yongbin Li, Qianli Ma

Despite the recent progress in text summarization made by large language models (LLMs), they often generate summaries that are factually inconsistent with original articles, known as "hallucinations" in text generation.

Text Generation Text Summarization

Diversify Question Generation with Retrieval-Augmented Style Transfer

1 code implementation23 Oct 2023 Qi Gou, Zehua Xia, Bowen Yu, Haiyang Yu, Fei Huang, Yongbin Li, Nguyen Cam-Tu

Given a textual passage and an answer, humans are able to ask questions with various expressions, but this ability is still challenging for most question generation (QG) systems.

Question Answering Question Generation +4

Improving Seq2Seq Grammatical Error Correction via Decoding Interventions

1 code implementation23 Oct 2023 Houquan Zhou, Yumeng Liu, Zhenghua Li, Min Zhang, Bo Zhang, Chen Li, Ji Zhang, Fei Huang

In this paper, we propose a unified decoding intervention framework that employs an external critic to assess the appropriateness of the token to be generated incrementally, and then dynamically influence the choice of the next token.

Grammatical Error Correction Language Modelling

Improving Question Generation with Multi-level Content Planning

1 code implementation20 Oct 2023 Zehua Xia, Qi Gou, Bowen Yu, Haiyang Yu, Fei Huang, Yongbin Li, Cam-Tu Nguyen

Previous studies have suggested that key phrase selection is essential for question generation (QG), yet it is still challenging to connect such disjointed phrases into meaningful questions, particularly for long context.

Answer Generation Question Generation +2

FactCHD: Benchmarking Fact-Conflicting Hallucination Detection

1 code implementation18 Oct 2023 Xiang Chen, Duanzheng Song, Honghao Gui, Chenxi Wang, Ningyu Zhang, Jiang Yong, Fei Huang, Chengfei Lv, Dan Zhang, Huajun Chen

Despite their impressive generative capabilities, LLMs are hindered by fact-conflicting hallucinations in real-world applications.

Benchmarking Hallucination

Editing Personality for Large Language Models

1 code implementation3 Oct 2023 Shengyu Mao, Xiaohan Wang, Mengru Wang, Yong Jiang, Pengjun Xie, Fei Huang, Ningyu Zhang

This task seeks to adjust the models' responses to opinion-related questions on specified topics since an individual's personality often manifests in the form of their expressed opinions, thereby showcasing different personality traits.

VDialogUE: A Unified Evaluation Benchmark for Visually-grounded Dialogue

no code implementations14 Sep 2023 Yunshui Li, Binyuan Hui, Zhaochao Yin, Wanwei He, Run Luo, Yuxing Long, Min Yang, Fei Huang, Yongbin Li

Visually-grounded dialog systems, which integrate multiple modes of communication such as text and visual inputs, have become an increasingly popular area of investigation.

Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking

1 code implementation4 Sep 2023 Yong Cao, Ruixue Ding, Boli Chen, Xianzhi Li, Min Chen, Daniel Hershcovich, Pengjun Xie, Fei Huang

Chinese geographic re-ranking task aims to find the most relevant addresses among retrieved candidates, which is crucial for location-related services such as navigation maps.

Chunking Multi-Task Learning +1

ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models

1 code implementation2 Sep 2023 Chenliang Li, Hehong Chen, Ming Yan, Weizhou Shen, Haiyang Xu, Zhikai Wu, Zhicheng Zhang, Wenmeng Zhou, Yingda Chen, Chen Cheng, Hongzhu Shi, Ji Zhang, Fei Huang, Jingren Zhou

Large language models (LLMs) have recently demonstrated remarkable capabilities to comprehend human intentions, engage in reasoning, and design planning-like behavior.

EcomGPT: Instruction-tuning Large Language Models with Chain-of-Task Tasks for E-commerce

1 code implementation14 Aug 2023 Yangning Li, Shirong Ma, Xiaobin Wang, Shen Huang, Chengyue Jiang, Hai-Tao Zheng, Pengjun Xie, Fei Huang, Yong Jiang

EcomInstruct scales up the data size and task diversity by constructing atomic tasks with E-commerce basic data types, such as product information, user reviews.

Instruction Following Language Modelling +2

A Preliminary Study of the Intrinsic Relationship between Complexity and Alignment

1 code implementation10 Aug 2023 Yingxiu Zhao, Bowen Yu, Binyuan Hui, Haiyang Yu, Fei Huang, Yongbin Li, Nevin L. Zhang

Training large language models (LLMs) with open-domain instruction data has yielded remarkable success in aligning to end tasks and human preferences.

Wider and Deeper LLM Networks are Fairer LLM Evaluators

1 code implementation3 Aug 2023 Xinghua Zhang, Bowen Yu, Haiyang Yu, Yangyu Lv, Tingwen Liu, Fei Huang, Hongbo Xu, Yongbin Li

Each perspective corresponds to the role of a specific LLM neuron in the first layer.

CValues: Measuring the Values of Chinese Large Language Models from Safety to Responsibility

1 code implementation19 Jul 2023 Guohai Xu, Jiayi Liu, Ming Yan, Haotian Xu, Jinghui Si, Zhuoran Zhou, Peng Yi, Xing Gao, Jitao Sang, Rong Zhang, Ji Zhang, Chao Peng, Fei Huang, Jingren Zhou

In this paper, we present CValues, the first Chinese human values evaluation benchmark to measure the alignment ability of LLMs in terms of both safety and responsibility criteria.

BUS:Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization

no code implementations17 Jul 2023 Chaoya Jiang, Haiyang Xu, Wei Ye, Qinghao Ye, Chenliang Li, Ming Yan, Bin Bi, Shikun Zhang, Fei Huang, Songfang Huang

Specifically, We incorporate a Text-Semantics-Aware Patch Selector (TSPS) into the ViT backbone to perform a coarse-grained visual token extraction and then attach a flexible Transformer-based Patch Abstraction Decoder (PAD) upon the backbone for top-level visual abstraction.

Text Summarization

DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering

1 code implementation13 Jul 2023 Pei Ke, Fei Huang, Fei Mi, Yasheng Wang, Qun Liu, Xiaoyan Zhu, Minlie Huang

Existing evaluation metrics for natural language generation (NLG) tasks face the challenges on generalization ability and interpretability.

Dialogue Generation nlg evaluation +3

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

1 code implementation4 Jul 2023 Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Yuhao Dan, Chenlin Zhao, Guohai Xu, Chenliang Li, Junfeng Tian, Qian Qi, Ji Zhang, Fei Huang

Nevertheless, without in-domain training, these models tend to ignore fine-grained OCR features, such as sophisticated tables or large blocks of text, which are essential for OCR-free document understanding.

document understanding Language Modelling +2

Preference Ranking Optimization for Human Alignment

1 code implementation30 Jun 2023 Feifan Song, Bowen Yu, Minghao Li, Haiyang Yu, Fei Huang, Yongbin Li, Houfeng Wang

In this manner, PRO effectively transforms human alignment into aligning the probability ranking of n responses generated by LLM with the preference ranking of humans towards these responses.

Unified Language Representation for Question Answering over Text, Tables, and Images

no code implementations29 Jun 2023 Bowen Yu, Cheng Fu, Haiyang Yu, Fei Huang, Yongbin Li

When trying to answer complex questions, people often rely on multiple sources of information, such as visual, textual, and tabular data.

Question Answering Retrieval

DiffDTM: A conditional structure-free framework for bioactive molecules generation targeted for dual proteins

no code implementations24 Jun 2023 Lei Huang, Zheng Yuan, Huihui Yan, Rong Sheng, Linjing Liu, Fuzhou Wang, Weidun Xie, Nanjun Chen, Fei Huang, Songfang Huang, Ka-Chun Wong, Yaoyun Zhang

However, molecule generation targeted for dual protein targets still faces formidable challenges including protein 3D structure data requisition for model training, auto-regressive sampling, and model generalization for unseen targets.

CATS: A Pragmatic Chinese Answer-to-Sequence Dataset with Large Scale and High Quality

no code implementations20 Jun 2023 Liang Li, Ruiying Geng, Chengyang Fang, Bing Li, Can Ma, Rongyu Cao, Binhua Li, Fei Huang, Yongbin Li

To alleviate these limitations, in this paper, we present CATS, a pragmatic Chinese answer-to-sequence dataset with large scale and high quality.

Universal Information Extraction with Meta-Pretrained Self-Retrieval

no code implementations18 Jun 2023 Xin Cong. Bowen Yu, Mengcheng Fang, Tingwen Liu, Haiyang Yu, Zhongkai Hu, Fei Huang, Yongbin Li, Bin Wang

Inspired by the fact that large amount of knowledge are stored in the pretrained language models~(PLM) and can be retrieved explicitly, in this paper, we propose MetaRetriever to retrieve task-specific knowledge from PLMs to enhance universal IE.


Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks

1 code implementation7 Jun 2023 Haiyang Xu, Qinghao Ye, Xuan Wu, Ming Yan, Yuan Miao, Jiabo Ye, Guohai Xu, Anwen Hu, Yaya Shi, Guangwei Xu, Chenliang Li, Qi Qian, Maofei Que, Ji Zhang, Xiao Zeng, Fei Huang

In addition, to facilitate a comprehensive evaluation of video-language models, we carefully build the largest human-annotated Chinese benchmarks covering three popular video-language tasks of cross-modal retrieval, video captioning, and video category classification.

Cross-Modal Retrieval Language Modelling +3

Multimodal Recommendation Dialog with Subjective Preference: A New Challenge and Benchmark

1 code implementation26 May 2023 Yuxing Long, Binyuan Hui, Caixia Yuan1, Fei Huang, Yongbin Li, Xiaojie Wang

Existing multimodal task-oriented dialog data fails to demonstrate the diverse expressions of user subjective preferences and recommendation acts in the real-life shopping scenario.

Multimodal Recommendation

NaSGEC: a Multi-Domain Chinese Grammatical Error Correction Dataset from Native Speaker Texts

1 code implementation25 May 2023 Yue Zhang, Bo Zhang, Haochen Jiang, Zhenghua Li, Chen Li, Fei Huang, Min Zhang

We introduce NaSGEC, a new dataset to facilitate research on Chinese grammatical error correction (CGEC) for native speaker texts from multiple domains.

Grammatical Error Correction

PaCE: Unified Multi-modal Dialogue Pre-training with Progressive and Compositional Experts

1 code implementation24 May 2023 Yunshui Li, Binyuan Hui, Zhichao Yin, Min Yang, Fei Huang, Yongbin Li

It utilizes a combination of several fundamental experts to accommodate multiple dialogue-related tasks and can be pre-trained using limited dialogue and extensive non-dialogue multi-modal data.

Dialogue State Tracking Image Retrieval +4

Optimizing Non-Autoregressive Transformers with Contrastive Learning

no code implementations23 May 2023 Chenxin An, Jiangtao Feng, Fei Huang, Xipeng Qiu, Lingpeng Kong

In this paper, we propose to ease the difficulty of modality learning via sampling from the model distribution instead of the data distribution.

Contrastive Learning Machine Translation +2

Iterative Forward Tuning Boosts In-context Learning in Language Models

no code implementations22 May 2023 Jiaxi Yang, Binyuan Hui, Min Yang, Binhua Li, Fei Huang, Yongbin Li

In this paper, we propose an effective and efficient two-stage framework to boost ICL in LLMs by exploiting a dual form between Transformer attention and gradient descent-based optimization.

Decision Making In-Context Learning +1

Speech-Text Dialog Pre-training for Spoken Dialog Understanding with Explicit Cross-Modal Alignment

1 code implementation19 May 2023 Tianshu Yu, Haoyu Gao, Ting-En Lin, Min Yang, Yuchuan Wu, Wentao Ma, Chao Wang, Fei Huang, Yongbin Li

In this paper, we propose Speech-text dialog Pre-training for spoken dialog understanding with ExpliCiT cRoss-Modal Alignment (SPECTRA), which is the first-ever speech-text dialog pre-training model.

Emotion Recognition in Conversation Multimodal Intent Recognition +1

Causal Document-Grounded Dialogue Pre-training

1 code implementation18 May 2023 Yingxiu Zhao, Bowen Yu, Haiyang Yu, Bowen Li, Jinyang Li, Chao Wang, Fei Huang, Yongbin Li, Nevin L. Zhang

To tackle this issue, we are the first to present a causally-complete dataset construction strategy for building million-level DocGD pre-training corpora.

Knowledge Rumination for Pre-trained Language Models

1 code implementation15 May 2023 Yunzhi Yao, Peng Wang, Shengyu Mao, Chuanqi Tan, Fei Huang, Huajun Chen, Ningyu Zhang

Previous studies have revealed that vanilla pre-trained language models (PLMs) lack the capacity to handle knowledge-intensive NLP tasks alone; thus, several works have attempted to integrate external knowledge into PLMs.

Language Modelling

Distinguish Before Answer: Generating Contrastive Explanation as Knowledge for Commonsense Question Answering

no code implementations14 May 2023 Qianglong Chen, Guohai Xu, Ming Yan, Ji Zhang, Fei Huang, Luo Si, Yin Zhang

Existing knowledge-enhanced methods have achieved remarkable results in certain QA tasks via obtaining diverse knowledge from different knowledge bases.

Explanation Generation Question Answering

GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark

no code implementations11 May 2023 Dongyang Li, Ruixue Ding, Qiang Zhang, Zheng Li, Boli Chen, Pengjun Xie, Yao Xu, Xin Li, Ning Guo, Fei Huang, Xiaofeng He

With a fast developing pace of geographic applications, automatable and intelligent models are essential to be designed to handle the large volume of information.

Entity Alignment Natural Language Understanding

Domain Incremental Lifelong Learning in an Open World

1 code implementation11 May 2023 Yi Dai, Hao Lang, Yinhe Zheng, Bowen Yu, Fei Huang, Yongbin Li

Specifically, we dedicate task-level prompts to capture task-specific knowledge to retain high LL performances and maintain instance-level prompts to learn knowledge shared across input samples to improve the model's generalization performance.

Language Modelling

Long-Tailed Question Answering in an Open World

1 code implementation11 May 2023 Yi Dai, Hao Lang, Yinhe Zheng, Fei Huang, Yongbin Li

A retrieve-then-rerank frame is further introduced to select in-context examples, which guild the LM to generate text that express knowledge for QA tasks.

Knowledge Distillation Language Modelling +1

Out-of-Domain Intent Detection Considering Multi-Turn Dialogue Contexts

no code implementations5 May 2023 Hao Lang, Yinhe Zheng, Binyuan Hui, Fei Huang, Yongbin Li

Out-of-Domain (OOD) intent detection is vital for practical dialogue systems, and it usually requires considering multi-turn dialogue contexts.

Intent Detection

A Survey on Out-of-Distribution Detection in NLP

no code implementations5 May 2023 Hao Lang, Yinhe Zheng, Yixuan Li, Jian Sun, Fei Huang, Yongbin Li

Out-of-distribution (OOD) detection is essential for the reliable and safe deployment of machine learning systems in the real world.

Out-of-Distribution Detection Out of Distribution (OOD) Detection

Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs

no code implementations NeurIPS 2023 Jinyang Li, Binyuan Hui, Ge Qu, Jiaxi Yang, Binhua Li, Bowen Li, Bailin Wang, Bowen Qin, Rongyu Cao, Ruiying Geng, Nan Huo, Xuanhe Zhou, Chenhao Ma, Guoliang Li, Kevin C. C. Chang, Fei Huang, Reynold Cheng, Yongbin Li

Our emphasis on database values highlights the new challenges of dirty database contents, external knowledge between NL questions and database contents, and SQL efficiency, particularly in the context of massive databases.

Semantic Parsing SQL Parsing +1

Transforming Visual Scene Graphs to Image Captions

1 code implementation3 May 2023 Xu Yang, Jiawei Peng, Zihua Wang, Haiyang Xu, Qinghao Ye, Chenliang Li, Songfang Huang, Fei Huang, Zhangzikang Li, Yu Zhang

In TSG, we apply multi-head attention (MHA) to design the Graph Neural Network (GNN) for embedding scene graphs.

Attribute Descriptive +1

Directed Acyclic Transformer Pre-training for High-quality Non-autoregressive Text Generation

1 code implementation24 Apr 2023 Fei Huang, Pei Ke, Minlie Huang

Non-AutoRegressive (NAR) text generation models have drawn much attention because of their significantly faster decoding speed and good generation quality in machine translation.

Machine Translation Text Generation

VECO 2.0: Cross-lingual Language Model Pre-training with Multi-granularity Contrastive Learning

no code implementations17 Apr 2023 Zhen-Ru Zhang, Chuanqi Tan, Songfang Huang, Fei Huang

Recent studies have demonstrated the potential of cross-lingual transferability by training a unified Transformer encoder for multiple languages.

Contrastive Learning Language Modelling +1

ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented Instruction Tuning for Digital Human

1 code implementation16 Apr 2023 Junfeng Tian, Hehong Chen, Guohai Xu, Ming Yan, Xing Gao, Jianhai Zhang, Chenliang Li, Jiayi Liu, Wenshen Xu, Haiyang Xu, Qi Qian, Wei Wang, Qinghao Ye, Jiejing Zhang, Ji Zhang, Fei Huang, Jingren Zhou

In this paper, we present ChatPLUG, a Chinese open-domain dialogue system for digital human applications that instruction finetunes on a wide range of dialogue tasks in a unified internet-augmented format.

World Knowledge

RRHF: Rank Responses to Align Language Models with Human Feedback without tears

1 code implementation11 Apr 2023 Zheng Yuan, Hongyi Yuan, Chuanqi Tan, Wei Wang, Songfang Huang, Fei Huang

Reinforcement Learning from Human Feedback (RLHF) facilitates the alignment of large language models with human preferences, significantly enhancing the quality of interactions between humans and models.

Language Modelling Large Language Model

Revisiting Automatic Question Summarization Evaluation in the Biomedical Domain

no code implementations18 Mar 2023 Hongyi Yuan, Yaoyun Zhang, Fei Huang, Songfang Huang

To better understand whether commonly used evaluation metrics are capable of evaluating automatic summarization in the biomedical domain, we conduct human evaluations of summarization quality from four different aspects of a biomedical question summarization task.

Text Generation

RAMM: Retrieval-augmented Biomedical Visual Question Answering with Multi-modal Pre-training

1 code implementation1 Mar 2023 Zheng Yuan, Qiao Jin, Chuanqi Tan, Zhengyun Zhao, Hongyi Yuan, Fei Huang, Songfang Huang

We propose to retrieve similar image-text pairs based on ITC from pretraining datasets and introduce a novel retrieval-attention module to fuse the representation of the image and the question with the retrieved images and texts.

Question Answering Retrieval +1

Molecular Geometry-aware Transformer for accurate 3D Atomic System modeling

no code implementations2 Feb 2023 Zheng Yuan, Yaoyun Zhang, Chuanqi Tan, Wei Wang, Fei Huang, Songfang Huang

To alleviate this limitation, we propose Moleformer, a novel Transformer architecture that takes nodes (atoms) and edges (bonds and nonbonding atom pairs) as inputs and models the interactions among them using rotational and translational invariant geometry-aware spatial encoding.

Initial Structure to Relaxed Energy (IS2RE), Direct

mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video

4 code implementations1 Feb 2023 Haiyang Xu, Qinghao Ye, Ming Yan, Yaya Shi, Jiabo Ye, Yuanhong Xu, Chenliang Li, Bin Bi, Qi Qian, Wei Wang, Guohai Xu, Ji Zhang, Songfang Huang, Fei Huang, Jingren Zhou

In contrast to predominant paradigms of solely relying on sequence-to-sequence generation or encoder-based instance discrimination, mPLUG-2 introduces a multi-module composition network by sharing common universal modules for modality collaboration and disentangling different modality modules to deal with modality entanglement.

Action Classification Image Classification +7

Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning

1 code implementation31 Jan 2023 Yunhu Ye, Binyuan Hui, Min Yang, Binhua Li, Fei Huang, Yongbin Li

To alleviate the above challenges, we exploit large language models (LLMs) as decomposers for effective table-based reasoning, which (i) decompose huge evidence (a huge table) into sub-evidence (a small table) to mitigate the interference of useless information for table reasoning; and (ii) decompose complex questions into simpler sub-questions for text reasoning.

Hallucination Semantic Parsing +1

One Model for All Domains: Collaborative Domain-Prefix Tuning for Cross-Domain NER

2 code implementations25 Jan 2023 Xiang Chen, Lei LI, Shuofei Qiao, Ningyu Zhang, Chuanqi Tan, Yong Jiang, Fei Huang, Huajun Chen

Previous typical solutions mainly obtain a NER model by pre-trained language models (PLMs) with data from a rich-resource domain and adapt it to the target domain.

NER Text Generation

Graphix-T5: Mixing Pre-Trained Transformers with Graph-Aware Layers for Text-to-SQL Parsing

1 code implementation18 Jan 2023 Jinyang Li, Binyuan Hui, Reynold Cheng, Bowen Qin, Chenhao Ma, Nan Huo, Fei Huang, Wenyu Du, Luo Si, Yongbin Li

Recently, the pre-trained text-to-text transformer model, namely T5, though not specialized for text-to-SQL parsing, has achieved state-of-the-art performance on standard benchmarks targeting domain generalization.

Domain Generalization Inductive Bias +3

MGeo: Multi-Modal Geographic Pre-Training Method

1 code implementation11 Jan 2023 Ruixue Ding, Boli Chen, Pengjun Xie, Fei Huang, Xin Li, Qiang Zhang, Yao Xu

Single-modal PTMs can barely make use of the important GC and therefore have limited performance.

Language Modelling

Learning Trajectory-Word Alignments for Video-Language Tasks

no code implementations ICCV 2023 Xu Yang, Zhangzikang Li, Haiyang Xu, Hanwang Zhang, Qinghao Ye, Chenliang Li, Ming Yan, Yu Zhang, Fei Huang, Songfang Huang

To amend this, we propose a novel TW-BERT to learn Trajectory-Word alignment by a newly designed trajectory-to-word (T2W) attention for solving video-language tasks.

Question Answering Retrieval +4

BUS: Efficient and Effective Vision-Language Pre-Training with Bottom-Up Patch Summarization.

no code implementations ICCV 2023 Chaoya Jiang, Haiyang Xu, Wei Ye, Qinghao Ye, Chenliang Li, Ming Yan, Bin Bi, Shikun Zhang, Fei Huang, Songfang Huang

In this paper, we propose a Bottom-Up Patch Summarization approach named BUS which is inspired by the Document Summarization Task in NLP to learn a concise visual summary of lengthy visual token sequences, guided by textual semantics.

Abstractive Text Summarization Document Summarization

HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training

no code implementations ICCV 2023 Qinghao Ye, Guohai Xu, Ming Yan, Haiyang Xu, Qi Qian, Ji Zhang, Fei Huang

We achieve state-of-the-art results on 15 well-established video-language understanding and generation tasks, especially on temporal-oriented datasets (e. g., SSv2-Template and SSv2-Label) with 8. 6% and 11. 1% improvement respectively.

TGIF-Action TGIF-Frame +7

Reasoning with Language Model Prompting: A Survey

2 code implementations19 Dec 2022 Shuofei Qiao, Yixin Ou, Ningyu Zhang, Xiang Chen, Yunzhi Yao, Shumin Deng, Chuanqi Tan, Fei Huang, Huajun Chen

Reasoning, as an essential ability for complex problem-solving, can provide back-end support for various real-world applications, such as medical diagnosis, negotiation, etc.

Arithmetic Reasoning Common Sense Reasoning +4

HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation

1 code implementation17 Dec 2022 Hongyi Yuan, Zheng Yuan, Chuanqi Tan, Fei Huang, Songfang Huang

Unlike previous works that only add noise to inputs or parameters, we argue that the hidden representations of Transformers layers convey more diverse and meaningful language information.

Language Modelling Natural Language Inference

Chaining Simultaneous Thoughts for Numerical Reasoning

no code implementations29 Nov 2022 Zhihong Shao, Fei Huang, Minlie Huang

Given that rich information is hidden behind ubiquitous numbers in text, numerical reasoning over text should be an essential skill of AI systems.

Towards Generalizable and Robust Text-to-SQL Parsing

1 code implementation23 Oct 2022 Chang Gao, Bowen Li, Wenxuan Zhang, Wai Lam, Binhua Li, Fei Huang, Luo Si, Yongbin Li

Text-to-SQL parsing tackles the problem of mapping natural language questions to executable SQL queries.

SQL Parsing Text-To-SQL

STAR: SQL Guided Pre-Training for Context-dependent Text-to-SQL Parsing

1 code implementation21 Oct 2022 ZeFeng Cai, Xiangyu Li, Binyuan Hui, Min Yang, Bowen Li, Binhua Li, Zheng Cao, Weijie Li, Fei Huang, Luo Si, Yongbin Li

Concretely, we propose two novel pre-training objectives which respectively explore the context-dependent interactions of NL utterances and SQL queries within each text-to-SQL conversation: (i) schema state tracking (SST) objective that tracks and explores the schema states of context-dependent SQL queries in the form of schema-states by predicting and updating the value of each schema slot during interaction; (ii) utterance dependency tracking (UDT) objective that employs weighted contrastive learning to pull together two semantically similar NL utterances and push away the representations of semantically dissimilar NL utterances within each conversation.

Contrastive Learning SQL Parsing +1

Doc2Bot: Accessing Heterogeneous Documents via Conversational Bots

no code implementations20 Oct 2022 Haomin Fu, Yeqin Zhang, Haiyang Yu, Jian Sun, Fei Huang, Luo Si, Yongbin Li, Cam-Tu Nguyen

This paper introduces Doc2Bot, a novel dataset for building machines that help users seek information via conversations.

dialog state tracking Response Generation

Forging Multiple Training Objectives for Pre-trained Language Models via Meta-Learning

2 code implementations19 Oct 2022 Hongqiu Wu, Ruixue Ding, Hai Zhao, Boli Chen, Pengjun Xie, Fei Huang, Min Zhang

Multiple pre-training objectives fill the vacancy of the understanding capability of single-objective language modeling, which serves the ultimate purpose of pre-trained language models (PrLMs), generalizing well on a mass of scenarios.

Language Modelling Meta-Learning

Entity-to-Text based Data Augmentation for various Named Entity Recognition Tasks

no code implementations19 Oct 2022 Xuming Hu, Yong Jiang, Aiwei Liu, Zhongqiang Huang, Pengjun Xie, Fei Huang, Lijie Wen, Philip S. Yu

Data augmentation techniques have been used to alleviate the problem of scarce labeled data in various NER tasks (flat, nested, and discontinuous NER tasks).

Data Augmentation named-entity-recognition +3

SPACE-3: Unified Dialog Model Pre-training for Task-Oriented Dialog Understanding and Generation

1 code implementation14 Sep 2022 Wanwei He, Yinpei Dai, Min Yang, Jian Sun, Fei Huang, Luo Si, Yongbin Li

To capture the structured dialog semantics, we pre-train the dialog understanding module via a novel tree-induced semi-supervised contrastive learning objective with the help of extra dialog annotations.

Contrastive Learning dialog state tracking +1

SUN: Exploring Intrinsic Uncertainties in Text-to-SQL Parsers

1 code implementation COLING 2022 Bowen Qin, Lihan Wang, Binyuan Hui, Bowen Li, Xiangpeng Wei, Binhua Li, Fei Huang, Luo Si, Min Yang, Yongbin Li

To improve the generalizability and stability of neural text-to-SQL parsers, we propose a model uncertainty constraint to refine the query representations by enforcing the output representations of different perturbed encoding networks to be consistent with each other.

SQL Parsing Text-To-SQL

A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions

no code implementations29 Aug 2022 Bowen Qin, Binyuan Hui, Lihan Wang, Min Yang, Jinyang Li, Binhua Li, Ruiying Geng, Rongyu Cao, Jian Sun, Luo Si, Fei Huang, Yongbin Li

In recent years, deep neural networks have significantly advanced this task by neural generation models, which automatically learn a mapping function from an input NL question to an output SQL query.

SQL Parsing Text-To-SQL

Proton: Probing Schema Linking Information from Pre-trained Language Models for Text-to-SQL Parsing

2 code implementations28 Jun 2022 Lihan Wang, Bowen Qin, Binyuan Hui, Bowen Li, Min Yang, Bailin Wang, Binhua Li, Fei Huang, Luo Si, Yongbin Li

The importance of building text-to-SQL parsers which can be applied to new databases has long been acknowledged, and a critical step to achieve this goal is schema linking, i. e., properly recognizing mentions of unseen columns or tables when generating SQLs.

SQL Parsing Text-To-SQL

Adversarial Self-Attention for Language Understanding

1 code implementation25 Jun 2022 Hongqiu Wu, Ruixue Ding, Hai Zhao, Pengjun Xie, Fei Huang, Min Zhang

Deep neural models (e. g. Transformer) naturally learn spurious features, which create a ``shortcut'' between the labels and inputs, thus impairing the generalization and robustness.

Machine Reading Comprehension Named Entity Recognition (NER) +4

On the Learning of Non-Autoregressive Transformers

no code implementations13 Jun 2022 Fei Huang, Tianhua Tao, Hao Zhou, Lei LI, Minlie Huang

Non-autoregressive Transformer (NAT) is a family of text generation models, which aims to reduce the decoding latency by predicting the whole sentences in parallel.

Text Generation

Duplex Conversation: Towards Human-like Interaction in Spoken Dialogue Systems

no code implementations30 May 2022 Ting-En Lin, Yuchuan Wu, Fei Huang, Luo Si, Jian Sun, Yongbin Li

In this paper, we present Duplex Conversation, a multi-turn, multimodal spoken dialogue system that enables telephone-based agents to interact with customers like a human.

Data Augmentation Spoken Dialogue Systems

Decoupling Knowledge from Memorization: Retrieval-augmented Prompt Learning

2 code implementations29 May 2022 Xiang Chen, Lei LI, Ningyu Zhang, Xiaozhuan Liang, Shumin Deng, Chuanqi Tan, Fei Huang, Luo Si, Huajun Chen

Specifically, vanilla prompt learning may struggle to utilize atypical instances by rote during fully-supervised training or overfit shallow patterns with low-shot data.

Few-Shot Text Classification Memorization +5

mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections

3 code implementations24 May 2022 Chenliang Li, Haiyang Xu, Junfeng Tian, Wei Wang, Ming Yan, Bin Bi, Jiabo Ye, Hehong Chen, Guohai Xu, Zheng Cao, Ji Zhang, Songfang Huang, Fei Huang, Jingren Zhou, Luo Si

Large-scale pretrained foundation models have been an emerging paradigm for building artificial intelligence (AI) systems, which can be quickly adapted to a wide range of downstream tasks.

Computational Efficiency Image Captioning +6

Directed Acyclic Transformer for Non-Autoregressive Machine Translation

1 code implementation16 May 2022 Fei Huang, Hao Zhou, Yang Liu, Hang Li, Minlie Huang

Non-autoregressive Transformers (NATs) significantly reduce the decoding latency by generating all tokens in parallel.

Knowledge Distillation Machine Translation +1

Meta-Learning Based Knowledge Extrapolation for Knowledge Graphs in the Federated Setting

1 code implementation10 May 2022 Mingyang Chen, Wen Zhang, Zhen Yao, Xiangnan Chen, Mengxiao Ding, Fei Huang, Huajun Chen

We study the knowledge extrapolation problem to embed new components (i. e., entities and relations) that come with emerging knowledge graphs (KGs) in the federated setting.

Knowledge Graphs Link Prediction +1

Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extraction

1 code implementation7 May 2022 Xiang Chen, Ningyu Zhang, Lei LI, Yunzhi Yao, Shumin Deng, Chuanqi Tan, Fei Huang, Luo Si, Huajun Chen

To deal with these issues, we propose a novel Hierarchical Visual Prefix fusion NeTwork (HVPNeT) for visual-enhanced entity and relation extraction, aiming to achieve more effective and robust performance.

named-entity-recognition Named Entity Recognition +3

Relation Extraction as Open-book Examination: Retrieval-enhanced Prompt Tuning

1 code implementation4 May 2022 Xiang Chen, Lei LI, Ningyu Zhang, Chuanqi Tan, Fei Huang, Luo Si, Huajun Chen

Note that the previous parametric learning paradigm can be viewed as memorization regarding training data as a book and inference as the close-book test.

Few-Shot Learning Memorization +3

Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion

1 code implementation4 May 2022 Xiang Chen, Ningyu Zhang, Lei LI, Shumin Deng, Chuanqi Tan, Changliang Xu, Fei Huang, Luo Si, Huajun Chen

Since most MKGs are far from complete, extensive knowledge graph completion studies have been proposed focusing on the multimodal entity, relation extraction and link prediction.

Information Retrieval Link Prediction +4

MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Grammatical Error Correction

2 code implementations NAACL 2022 Yue Zhang, Zhenghua Li, Zuyi Bao, Jiacheng Li, Bo Zhang, Chen Li, Fei Huang, Min Zhang

This paper presents MuCGEC, a multi-reference multi-source evaluation dataset for Chinese Grammatical Error Correction (CGEC), consisting of 7, 063 sentences collected from three Chinese-as-a-Second-Language (CSL) learner sources.

Grammatical Error Correction Sentence

Knowledgeable Salient Span Mask for Enhancing Language Models as Knowledge Base

no code implementations17 Apr 2022 Cunxiang Wang, Fuli Luo, Yanyang Li, Runxin Xu, Fei Huang, Yue Zhang

Pre-trained language models (PLMs) like BERT have made significant progress in various downstream NLP tasks.

Self-Supervised Learning

Image Captioning In the Transformer Age

1 code implementation15 Apr 2022 Yang Xu, Li Li, Haiyang Xu, Songfang Huang, Fei Huang, Jianfei Cai

This drawback inspires the researchers to develop a homogeneous architecture that facilitates end-to-end training, for which Transformer is the perfect one that has proven its huge potential in both vision and language domains and thus can be used as the basic component of the visual encoder and language decoder in an IC pipeline.

Image Captioning Self-Supervised Learning

Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency

no code implementations ACL 2022 Yanyang Li, Fuli Luo, Runxin Xu, Songfang Huang, Fei Huang, LiWei Wang

Structured pruning has been extensively studied on monolingual pre-trained language models and is yet to be fully evaluated on their multilingual counterparts.