Search Results for author: Jianfeng Gao

Found 322 papers, 190 papers with code

DeBERTa: Decoding-enhanced BERT with Disentangled Attention

9 code implementations • ICLR 2021 • Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen

Recent progress in pre-trained neural language models has significantly improved the performance of many natural language processing (NLP) tasks.

Ranked #1 on Common Sense Reasoning on SWAG

Common Sense Reasoning Coreference Resolution +10

124,889

Paper
Code

On the Variance of the Adaptive Learning Rate and Beyond

21 code implementations • ICLR 2020 • Liyuan Liu, Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Jiawei Han

The learning rate warmup heuristic achieves remarkable success in stabilizing training, accelerating convergence and improving generalization for adaptive stochastic optimization algorithms like RMSprop and Adam.

Image Classification Language Modelling +3

47,906

Paper
Code

MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition

11 code implementations • 27 Jul 2016 • Yandong Guo, Lei Zhang, Yuxiao Hu, Xiaodong He, Jianfeng Gao

In this paper, we design a benchmark task and provide the associated datasets for recognizing face images and link them to corresponding entity keys in a knowledge base.

Face Recognition Image Captioning

21,246

Paper
Code

Embedding Entities and Relations for Learning and Inference in Knowledge Bases

9 code implementations • 20 Dec 2014 • Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, Li Deng

We consider learning representations of entities and relations in KBs using the neural-embedding approach.

Ranked #10 on Link Prediction on UMLS

Link Prediction

20,095

Paper
Code

Unified Language Model Pre-training for Natural Language Understanding and Generation

9 code implementations • NeurIPS 2019 • Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon

This paper presents a new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks.

Ranked #2 on Generative Question Answering on CoQA (using extra training data)

Abstractive Text Summarization Document Summarization +7

18,315

Paper
Code

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

3 code implementations • 28 Feb 2020 • Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Songhao Piao, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon

We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM).

Ranked #4 on Question Generation on SQuAD1.1 (using extra training data)

Abstractive Text Summarization Language Modelling +3

18,315

Paper
Code

Pseudo-Masked Language Models for Unified Language Model Pre-Training

1 code implementation • ICML 2020 • Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Jianfeng Gao, Songhao Piao, Ming Zhou, Hsiao-Wuen Hon

Language Modelling Natural Language Understanding +1

18,315

Paper
Code

MIND: A Large-scale Dataset for News Recommendation

2 code implementations • ACL 2020 • Fangzhao Wu, Ying Qiao, Jiun-Hung Chen, Chuhan Wu, Tao Qi, Jianxun Lian, Danyang Liu, Xing Xie, Jianfeng Gao, Winnie Wu, Ming Zhou

News recommendation is an important technique for personalized news service.

News Recommendation

17,957

Paper
Code

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models

1 code implementation • 18 Sep 2023 • Yadong Lu, Chunyuan Li, Haotian Liu, Jianwei Yang, Jianfeng Gao, Yelong Shen

We find that scaling LMM consistently enhances model performance and improves language capabilities, and performance of LoRA/QLoRA tuning of LMM are comparable to the performance of full-model fine-tuning.

Ranked #47 on Visual Question Answering on MM-Vet

Visual Question Answering

16,058

Paper
Code

Segment Everything Everywhere All at Once

2 code implementations • NeurIPS 2023 • Xueyan Zou, Jianwei Yang, Hao Zhang, Feng Li, Linjie Li, JianFeng Wang, Lijuan Wang, Jianfeng Gao, Yong Jae Lee

In SEEM, we propose a novel decoding mechanism that enables diverse prompting for all types of segmentation tasks, aiming at a universal segmentation interface that behaves like large language models (LLMs).

Image Segmentation Interactive Segmentation +4

13,443

Paper
Code

Focal Modulation Networks

6 code implementations • 22 Mar 2022 • Jianwei Yang, Chunyuan Li, Xiyang Dai, Lu Yuan, Jianfeng Gao

For semantic segmentation with UPerNet, FocalNet base at single-scale outperforms Swin by 2. 4, and beats Swin at multi-scale (50. 5 v. s.

Ranked #8 on Object Detection on COCO minival (using extra training data)

Image Classification Object Detection +2

12,048

Paper
Code

Learning deep structured semantic models for web search using clickthrough data

5 code implementations • CIKM 2013 • Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, Larry Heck

The proposed deep structured semantic models are discriminatively trained by maximizing the conditional likelihood of the clicked documents given a query using the clickthrough data.

Document Ranking

4,096

Paper
Code

Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V

3 code implementations • 17 Oct 2023 • Jianwei Yang, Hao Zhang, Feng Li, Xueyan Zou, Chunyuan Li, Jianfeng Gao

We present Set-of-Mark (SoM), a new visual prompting method, to unleash the visual grounding abilities of large multimodal models (LMMs), such as GPT-4V.

Interactive Segmentation Referring Expression +4

4,031

Paper
Code

LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing

2 code implementations • 1 Nov 2023 • Wei-Ge Chen, Irina Spiridonova, Jianwei Yang, Jianfeng Gao, Chunyuan Li

LLaVA-Interactive is a research prototype for multimodal human-AI interaction.

Image Generation Image Segmentation +1

4,031

Paper
Code

Instruction Tuning with GPT-4

2 code implementations • 6 Apr 2023 • Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, Jianfeng Gao

Prior work has shown that finetuning large language models (LLMs) using machine-generated instruction-following data enables such models to achieve remarkable zero-shot capabilities on new tasks, and no human-written instructions are needed.

Instruction Following

3,961

Paper
Code

DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation

6 code implementations • 1 Nov 2019 • Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan

We present a large, tunable neural conversational response generation model, DialoGPT (dialogue generative pre-trained transformer).

Conversational Response Generation Response Generation

2,321

Paper
Code

DIALOGPT : Large-Scale Generative Pre-training for Conversational Response Generation

1 code implementation • ACL 2020 • Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan

We present a large, tunable neural conversational response generation model, DIALOGPT (dialogue generative pre-trained transformer).

Conversational Response Generation Response Generation

2,321

Paper
Code

Multi-Task Deep Neural Networks for Natural Language Understanding

7 code implementations • ACL 2019 • Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao

In this paper, we present a Multi-Task Deep Neural Network (MT-DNN) for learning representations across multiple natural language understanding (NLU) tasks.

Ranked #2 on Natural Language Inference on SciTail

Domain Adaptation Language Modelling +5

2,199

Paper
Code

Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding

3 code implementations • 20 Apr 2019 • Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao

This paper explores the use of knowledge distillation to improve a Multi-Task Deep Neural Network (MT-DNN) (Liu et al., 2019) for learning text representations across multiple natural language understanding tasks.

Ranked #1 on Semantic Textual Similarity on SentEval

Ensemble Learning Knowledge Distillation +5

2,199

Paper
Code

A Hybrid Neural Network Model for Commonsense Reasoning

3 code implementations • WS 2019 • Pengcheng He, Xiaodong Liu, Weizhu Chen, Jianfeng Gao

An HNN consists of two component models, a masked language model and a semantic similarity model, which share a BERT-based contextual encoder but use different model-specific input and output layers.

Ranked #1 on Natural Language Understanding on PDP60

Common Sense Reasoning Coreference Resolution +6

2,199

Paper
Code

SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization

6 code implementations • ACL 2020 • Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Tuo Zhao

However, due to limited data resources from downstream tasks and the extremely large capacity of pre-trained models, aggressive fine-tuning often causes the adapted model to overfit the data of downstream tasks and forget the knowledge of the pre-trained model.

Ranked #1 on Natural Language Inference on QNLI

Linguistic Acceptability Natural Language Inference +4

2,199

Paper
Code

The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding

3 code implementations • ACL 2020 • Xiaodong Liu, Yu Wang, Jianshu ji, Hao Cheng, Xueyun Zhu, Emmanuel Awa, Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao, Jianfeng Gao

We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models.

Knowledge Distillation Multi-Task Learning +2

2,199

Paper
Code

Adversarial Training for Large Neural Language Models

3 code implementations • 20 Apr 2020 • Xiaodong Liu, Hao Cheng, Pengcheng He, Weizhu Chen, Yu Wang, Hoifung Poon, Jianfeng Gao

In natural language processing (NLP), pre-training large neural language models such as BERT have demonstrated impressive gain in generalization for a variety of tasks, with further improvement from adversarial fine-tuning.

Ranked #6 on Natural Language Inference on ANLI test (using extra training data)

Natural Language Inference Natural Language Understanding

2,199

Paper
Code

Posterior Differential Regularization with f-divergence for Improving Model Robustness

2 code implementations • NAACL 2021 • Hao Cheng, Xiaodong Liu, Lis Pereira, YaoLiang Yu, Jianfeng Gao

Theoretically, we provide a connection of two recent methods, Jacobian Regularization and Virtual Adversarial Training, under this framework.

Domain Generalization

2,199

Paper
Code

Targeted Adversarial Training for Natural Language Understanding

1 code implementation • NAACL 2021 • Lis Pereira, Xiaodong Liu, Hao Cheng, Hoifung Poon, Jianfeng Gao, Ichiro Kobayashi

We present a simple yet effective Targeted Adversarial Training (TAT) algorithm to improve adversarial training for natural language understanding.

Natural Language Understanding

2,199

Paper
Code

Grounded Language-Image Pre-training

2 code implementations • CVPR 2022 • Liunian Harold Li, Pengchuan Zhang, Haotian Zhang, Jianwei Yang, Chunyuan Li, Yiwu Zhong, Lijuan Wang, Lu Yuan, Lei Zhang, Jenq-Neng Hwang, Kai-Wei Chang, Jianfeng Gao

The unification brings two benefits: 1) it allows GLIP to learn from both detection and grounding data to improve both tasks and bootstrap a good grounding model; 2) GLIP can leverage massive image-text pairs by generating grounding boxes in a self-training fashion, making the learned representation semantic-rich.

Ranked #1 on 2D Object Detection on RF100

Described Object Detection Few-Shot Object Detection +1

1,952

Paper
Code

ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models

8 code implementations • 19 Apr 2022 • Chunyuan Li, Haotian Liu, Liunian Harold Li, Pengchuan Zhang, Jyoti Aneja, Jianwei Yang, Ping Jin, Houdong Hu, Zicheng Liu, Yong Jae Lee, Jianfeng Gao

In general, these language-augmented visual models demonstrate strong transferability to a variety of datasets and tasks.

Ranked #1 on Object Detection on ELEVATER

Fairness Few-Shot Image Classification +4

1,952

Paper
Code

GLIPv2: Unifying Localization and Vision-Language Understanding

1 code implementation • 12 Jun 2022 • Haotian Zhang, Pengchuan Zhang, Xiaowei Hu, Yen-Chun Chen, Liunian Harold Li, Xiyang Dai, Lijuan Wang, Lu Yuan, Jenq-Neng Hwang, Jianfeng Gao

We present GLIPv2, a grounded VL understanding model, that serves both localization tasks (e. g., object detection, instance segmentation) and Vision-Language (VL) understanding tasks (e. g., VQA, image captioning).

Ranked #1 on Phrase Grounding on Flickr30k Entities Test (using extra training data)

Contrastive Learning Image Captioning +7

1,952

Paper
Code

Semantic-SAM: Segment and Recognize Anything at Any Granularity

1 code implementation • 10 Jul 2023 • Feng Li, Hao Zhang, Peize Sun, Xueyan Zou, Shilong Liu, Jianwei Yang, Chunyuan Li, Lei Zhang, Jianfeng Gao

In this paper, we introduce Semantic-SAM, a universal image segmentation model to enable segment and recognize anything at any desired granularity.

Image Segmentation Segmentation +1

1,915

Paper
Code

Visual In-Context Prompting

3 code implementations • 22 Nov 2023 • Feng Li, Qing Jiang, Hao Zhang, Tianhe Ren, Shilong Liu, Xueyan Zou, Huaizhe xu, Hongyang Li, Chunyuan Li, Jianwei Yang, Lei Zhang, Jianfeng Gao

In-context prompting in large language models (LLMs) has become a prevalent approach to improve zero-shot capabilities, but this idea is less explored in the vision domain.

Segmentation Visual Prompting

1,915

Paper
Code

DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing

2 code implementations • 18 Nov 2021 • Pengcheng He, Jianfeng Gao, Weizhu Chen

We thus propose a new gradient-disentangled embedding sharing method that avoids the tug-of-war dynamics, improving both training efficiency and the quality of the pre-trained model.

Ranked #1 on Natural Language Inference on MRPC

Natural Language Inference Natural Language Understanding +2

1,842

Paper
Code

Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization

1 code implementation • 21 Aug 2022 • Pengcheng He, Baolin Peng, Liyang Lu, Song Wang, Jie Mei, Yang Liu, Ruochen Xu, Hany Hassan Awadalla, Yu Shi, Chenguang Zhu, Wayne Xiong, Michael Zeng, Jianfeng Gao, Xuedong Huang

Z-Code++ creates new state of the art on 9 out of 13 text summarization tasks across 5 languages.

Abstractive Text Summarization Language Modelling +1

1,842

Paper
Code

GLIGEN: Open-Set Grounded Text-to-Image Generation

1 code implementation • CVPR 2023 • Yuheng Li, Haotian Liu, Qingyang Wu, Fangzhou Mu, Jianwei Yang, Jianfeng Gao, Chunyuan Li, Yong Jae Lee

Large-scale text-to-image diffusion models have made amazing advances.

Ranked #4 on Conditional Text-to-Image Synthesis on COCO-MIG

Conditional Text-to-Image Synthesis Image Inpainting

1,787

Paper
Code

Generalized Decoding for Pixel, Image, and Language

1 code implementation • CVPR 2023 • Xueyan Zou, Zi-Yi Dou, Jianwei Yang, Zhe Gan, Linjie Li, Chunyuan Li, Xiyang Dai, Harkirat Behl, JianFeng Wang, Lu Yuan, Nanyun Peng, Lijuan Wang, Yong Jae Lee, Jianfeng Gao

We present X-Decoder, a generalized decoding model that can predict pixel-level segmentation and language tokens seamlessly.

Ranked #4 on Instance Segmentation on ADE20K val (using extra training data)

Image Segmentation Panoptic Segmentation +3

1,245

Paper
Code

A Simple Framework for Open-Vocabulary Segmentation and Detection

2 code implementations • ICCV 2023 • Hao Zhang, Feng Li, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianfeng Gao, Jianwei Yang, Lei Zhang

We present OpenSeeD, a simple Open-vocabulary Segmentation and Detection framework that jointly learns from different segmentation and detection datasets.

Ranked #2 on Instance Segmentation on ADE20K val (using extra training data)

Instance Segmentation Panoptic Segmentation +2

1,245

Paper
Code

Unified Vision-Language Pre-Training for Image Captioning and VQA

3 code implementations • 24 Sep 2019 • Luowei Zhou, Hamid Palangi, Lei Zhang, Houdong Hu, Jason J. Corso, Jianfeng Gao

The model is unified in that (1) it can be fine-tuned for either vision-language generation (e. g., image captioning) or understanding (e. g., visual question answering) tasks, and (2) it uses a shared multi-layer transformer network for both encoding and decoding, which differs from many existing methods where the encoder and decoder are implemented using separate models.

Ranked #1 on Image Captioning on Flickr30k Captions test

Image Captioning Question Answering +2

1,199

Paper
Code

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks

4 code implementations • ECCV 2020 • Xiujun Li, Xi Yin, Chunyuan Li, Pengchuan Zhang, Xiao-Wei Hu, Lei Zhang, Lijuan Wang, Houdong Hu, Li Dong, Furu Wei, Yejin Choi, Jianfeng Gao

Large-scale pre-training methods of learning cross-modal representations on image-text pairs are becoming popular for vision-language tasks.

Ranked #1 on Image Retrieval on MS COCO (Recall@10 metric)

Image Captioning Image Retrieval +3

1,199

Paper
Code

Focal Self-attention for Local-Global Interactions in Vision Transformers

3 code implementations • 1 Jul 2021 • Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Xiyang Dai, Bin Xiao, Lu Yuan, Jianfeng Gao

With focal self-attention, we propose a new variant of Vision Transformer models, called Focal Transformer, which achieves superior performance over the state-of-the-art vision Transformers on a range of public image classification and object detection benchmarks.

Ranked #17 on Instance Segmentation on COCO test-dev

Image Classification Instance Segmentation +3

1,185

Paper
Code

Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer

1 code implementation • NeurIPS 2021 • Ge Yang, Edward Hu, Igor Babuschkin, Szymon Sidor, Xiaodong Liu, David Farhi, Nick Ryder, Jakub Pachocki, Weizhu Chen, Jianfeng Gao

Hyperparameter (HP) tuning in deep learning is an expensive process, prohibitively so for neural networks (NNs) with billions of parameters. We show that, in the recently discovered Maximal Update Parametrization ($\mu$P), many optimal HPs remain stable even as model size changes.

1,166

Paper
Code

Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer

3 code implementations • 7 Mar 2022 • Greg Yang, Edward J. Hu, Igor Babuschkin, Szymon Sidor, Xiaodong Liu, David Farhi, Nick Ryder, Jakub Pachocki, Weizhu Chen, Jianfeng Gao

Hyperparameter (HP) tuning in deep learning is an expensive process, prohibitively so for neural networks (NNs) with billions of parameters.

1,166

Paper
Code

VinVL: Revisiting Visual Representations in Vision-Language Models

7 code implementations • CVPR 2021 • Pengchuan Zhang, Xiujun Li, Xiaowei Hu, Jianwei Yang, Lei Zhang, Lijuan Wang, Yejin Choi, Jianfeng Gao

In our experiments we feed the visual features generated by the new object detection model into a Transformer-based VL fusion model \oscar \cite{li2020oscar}, and utilize an improved approach \short\ to pre-train the VL model and fine-tune it on a wide range of downstream VL tasks.

Ranked #2 on Image-text matching on CommercialAdsDataset

Image Captioning Image-text matching +4

1,027

Paper
Code

Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models

1 code implementation • NeurIPS 2023 • Pan Lu, Baolin Peng, Hao Cheng, Michel Galley, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Jianfeng Gao

At the heart of Chameleon is an LLM-based planner that assembles a sequence of tools to execute to generate the final response.

Logical Reasoning

1,017

Paper
Code

Vision-Language Pre-training: Basics, Recent Advances, and Future Trends

1 code implementation • 17 Oct 2022 • Zhe Gan, Linjie Li, Chunyuan Li, Lijuan Wang, Zicheng Liu, Jianfeng Gao

This paper surveys vision-language pre-training (VLP) methods for multimodal intelligence that have been developed in the last few years.

Few-Shot Learning Image Captioning +11

999

Paper
Code

Multimodal Foundation Models: From Specialists to General-Purpose Assistants

1 code implementation • 18 Sep 2023 • Chunyuan Li, Zhe Gan, Zhengyuan Yang, Jianwei Yang, Linjie Li, Lijuan Wang, Jianfeng Gao

This paper presents a comprehensive survey of the taxonomy and evolution of multimodal foundation models that demonstrate vision and vision-language capabilities, focusing on the transition from specialist models to general-purpose assistants.

Text-to-Image Generation

999

Paper
Code

A Diversity-Promoting Objective Function for Neural Conversation Models

15 code implementations • NAACL 2016 • Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, Bill Dolan

Sequence-to-sequence neural network models for generation of conversational responses tend to generate safe, commonplace responses (e. g., "I don't know") regardless of the input.

Conversational Response Generation Response Generation

892

Paper
Code

GODEL: Large-Scale Pre-Training for Goal-Directed Dialog

1 code implementation • 22 Jun 2022 • Baolin Peng, Michel Galley, Pengcheng He, Chris Brockett, Lars Liden, Elnaz Nouri, Zhou Yu, Bill Dolan, Jianfeng Gao

We introduce GODEL (Grounded Open Dialogue Language Model), a large pre-trained language model for dialog.

Language Modelling Open-Domain Dialog

834

Paper
Code

ConvLab: Multi-Domain End-to-End Dialog System Platform

2 code implementations • ACL 2019 • Sungjin Lee, Qi Zhu, Ryuichi Takanobu, Xiang Li, Yaoqin Zhang, Zheng Zhang, Jinchao Li, Baolin Peng, Xiujun Li, Minlie Huang, Jianfeng Gao

We present ConvLab, an open-source multi-domain end-to-end dialog system platform, that enables researchers to quickly set up experiments with reusable components and compare a large set of different approaches, ranging from conventional pipeline systems to end-to-end neural models, in common environments.

819

Paper
Code

End-to-End Task-Completion Neural Dialogue Systems

13 code implementations • IJCNLP 2017 • Xiujun Li, Yun-Nung Chen, Lihong Li, Jianfeng Gao, Asli Celikyilmaz

One of the major drawbacks of modularized task-completion dialogue systems is that each module is trained individually, which presents several challenges.

Chatbot

807

Paper
Code

A User Simulator for Task-Completion Dialogues

10 code implementations • 17 Dec 2016 • Xiujun Li, Zachary C. Lipton, Bhuwan Dhingra, Lihong Li, Jianfeng Gao, Yun-Nung Chen

Then, one can train reinforcement learning agents in an online fashion as they interact with the simulator.

reinforcement-learning Reinforcement Learning (RL) +2

807

Paper
Code

RegionCLIP: Region-based Language-Image Pretraining

1 code implementation • CVPR 2022 • Yiwu Zhong, Jianwei Yang, Pengchuan Zhang, Chunyuan Li, Noel Codella, Liunian Harold Li, Luowei Zhou, Xiyang Dai, Lu Yuan, Yin Li, Jianfeng Gao

However, we show that directly applying such models to recognize image regions for object detection leads to poor performance due to a domain shift: CLIP was trained to match an image as a whole to a text description, without capturing the fine-grained alignment between image regions and text spans.

Ranked #11 on Open Vocabulary Object Detection on MSCOCO (using extra training data)

Image Classification Object +3

644

Paper
Code

LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

1 code implementation • 9 Nov 2023 • Shilong Liu, Hao Cheng, Haotian Liu, Hao Zhang, Feng Li, Tianhe Ren, Xueyan Zou, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang, Jianfeng Gao, Chunyuan Li

LLaVA-Plus is a general-purpose multimodal assistant that expands the capabilities of large multimodal models.

Ranked #1 on LMM real-life tasks on Leaderboard

Instruction Following LLM real-life tasks +3

621

Paper
Code

Deep Learning Based Text Classification: A Comprehensive Review

2 code implementations • 6 Apr 2020 • Shervin Minaee, Nal Kalchbrenner, Erik Cambria, Narjes Nikzad, Meysam Chenaghlu, Jianfeng Gao

Deep learning based models have surpassed classical machine learning based approaches in various text classification tasks, including sentiment analysis, news categorization, question answering, and natural language inference.

BIG-bench Machine Learning General Classification +5

570

Paper
Code

Focal Attention for Long-Range Interactions in Vision Transformers

1 code implementation • NeurIPS 2021 • Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Xiyang Dai, Bin Xiao, Lu Yuan, Jianfeng Gao

With focal attention, we propose a new variant of Vision Transformer models, called Focal Transformers, which achieve superior performance over the state-of-the-art (SoTA) Vision Transformers on a range of public image classification and object detection benchmarks.

Image Classification object-detection +2

542

Paper
Code

Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models

1 code implementation • 27 Feb 2024 • Yixin Liu, Kai Zhang, Yuan Li, Zhiling Yan, Chujie Gao, Ruoxi Chen, Zhengqing Yuan, Yue Huang, Hanchi Sun, Jianfeng Gao, Lifang He, Lichao Sun

Sora is a text-to-video generative AI model, released by OpenAI in February 2024.

Marketing Video Generation

449

Paper
Code

CMT in TREC-COVID Round 2: Mitigating the Generalization Gaps from Web to Special Domain Search

3 code implementations • 3 Nov 2020 • Chenyan Xiong, Zhenghao Liu, Si Sun, Zhuyun Dai, Kaitao Zhang, Shi Yu, Zhiyuan Liu, Hoifung Poon, Jianfeng Gao, Paul Bennett

Neural rankers based on deep pretrained language models (LMs) have been shown to improve many information retrieval benchmarks.

Domain Adaptation Few-Shot Learning +2

443

Paper
Code

ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems

1 code implementation • ACL 2020 • Qi Zhu, Zheng Zhang, Yan Fang, Xiang Li, Ryuichi Takanobu, Jinchao Li, Baolin Peng, Jianfeng Gao, Xiaoyan Zhu, Minlie Huang

We present ConvLab-2, an open-source toolkit that enables researchers to build task-oriented dialogue systems with state-of-the-art models, perform an end-to-end evaluation, and diagnose the weakness of systems.

Task-Oriented Dialogue Systems

442

Paper
Code

Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding

3 code implementations • ICCV 2021 • Pengchuan Zhang, Xiyang Dai, Jianwei Yang, Bin Xiao, Lu Yuan, Lei Zhang, Jianfeng Gao

This paper presents a new Vision Transformer (ViT) architecture Multi-Scale Vision Longformer, which significantly enhances the ViT of \cite{dosovitskiy2020image} for encoding high-resolution images using two techniques.

Ranked #45 on Instance Segmentation on COCO minival

Image Classification Instance Segmentation +2

403

Paper
Code

Efficient Self-supervised Vision Transformers for Representation Learning

1 code implementation • ICLR 2022 • Chunyuan Li, Jianwei Yang, Pengchuan Zhang, Mei Gao, Bin Xiao, Xiyang Dai, Lu Yuan, Jianfeng Gao

This paper investigates two techniques for developing efficient self-supervised vision transformers (EsViT) for visual representation learning.

Ranked #16 on Self-Supervised Image Classification on ImageNet

Representation Learning Self-Supervised Image Classification

403

Paper
Code

Image Scene Graph Generation (SGG) Benchmark

1 code implementation • 27 Jul 2021 • Xiaotian Han, Jianwei Yang, Houdong Hu, Lei Zhang, Jianfeng Gao, Pengchuan Zhang

There is a surge of interest in image scene graph generation (object, attribute and relationship detection) due to the need of building fine-grained image understanding models that go beyond object detection.

Attribute Graph Generation +6

375

Paper
Code

Florence: A New Foundation Model for Computer Vision

1 code implementation • 22 Nov 2021 • Lu Yuan, Dongdong Chen, Yi-Ling Chen, Noel Codella, Xiyang Dai, Jianfeng Gao, Houdong Hu, Xuedong Huang, Boxin Li, Chunyuan Li, Ce Liu, Mengchen Liu, Zicheng Liu, Yumao Lu, Yu Shi, Lijuan Wang, JianFeng Wang, Bin Xiao, Zhen Xiao, Jianwei Yang, Michael Zeng, Luowei Zhou, Pengchuan Zhang

Computer vision foundation models, which are trained on diverse, large-scale dataset and can be adapted to a wide range of downstream tasks, are critical for this mission to solve real-world computer vision applications.

Ranked #1 on Action Recognition In Videos on Kinetics-600

Action Classification Action Recognition In Videos +12

367

Paper
Code

Unified Contrastive Learning in Image-Text-Label Space

1 code implementation • CVPR 2022 • Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Bin Xiao, Ce Liu, Lu Yuan, Jianfeng Gao

Particularly, it attains gains up to 9. 2% and 14. 5% in average on zero-shot recognition benchmarks over the language-image contrastive learning and supervised learning methods, respectively.

Contrastive Learning Image Classification +2

367

Paper
Code

K-LITE: Learning Transferable Visual Models with External Knowledge

2 code implementations • 20 Apr 2022 • Sheng Shen, Chunyuan Li, Xiaowei Hu, Jianwei Yang, Yujia Xie, Pengchuan Zhang, Zhe Gan, Lijuan Wang, Lu Yuan, Ce Liu, Kurt Keutzer, Trevor Darrell, Anna Rohrbach, Jianfeng Gao

We propose K-LITE, a simple strategy to leverage external knowledge for building transferable visual systems: In training, it enriches entities in text with WordNet and Wiktionary knowledge, leading to an efficient and scalable approach to learning image representations that uses knowledge about the visual concepts.

Benchmarking Descriptive +4

367

Paper
Code

Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space

1 code implementation • EMNLP 2020 • Chunyuan Li, Xiang Gao, Yuan Li, Baolin Peng, Xiujun Li, Yizhe Zhang, Jianfeng Gao

We hope that our first pre-trained big VAE language model itself and results can help the NLP community renew the interests of deep generative models in the era of large-scale pre-training, and make these principled methods more practical.

Language Modelling Representation Learning +1

355

Paper
Code

MS MARCO: A Human Generated MAchine Reading COmprehension Dataset

12 code implementations • 28 Nov 2016 • Payal Bajaj, Daniel Campos, Nick Craswell, Li Deng, Jianfeng Gao, Xiaodong Liu, Rangan Majumder, Andrew McNamara, Bhaskar Mitra, Tri Nguyen, Mir Rosenberg, Xia Song, Alina Stoica, Saurabh Tiwary, Tong Wang

The size of the dataset and the fact that the questions are derived from real user search queries distinguishes MS MARCO from other well-known publicly available datasets for machine reading comprehension and question-answering.

Benchmarking Machine Reading Comprehension +1

338

Paper
Code

Understanding the Difficulty of Training Transformers

2 code implementations • EMNLP 2020 • Liyuan Liu, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Jiawei Han

Transformers have proved effective in many NLP tasks.

Ranked #5 on Machine Translation on WMT2014 English-French

Machine Translation

322

Paper
Code

Very Deep Transformers for Neural Machine Translation

4 code implementations • 18 Aug 2020 • Xiaodong Liu, Kevin Duh, Liyuan Liu, Jianfeng Gao

We explore the application of very deep Transformer models for Neural Machine Translation (NMT).

Ranked #1 on Machine Translation on WMT2014 English-French (using extra training data)

Machine Translation NMT +1

322

Paper
Code

Object-driven Text-to-Image Synthesis via Adversarial Training

1 code implementation • CVPR 2019 • Wenbo Li, Pengchuan Zhang, Lei Zhang, Qiuyuan Huang, Xiaodong He, Siwei Lyu, Jianfeng Gao

In this paper, we propose Object-driven Attentive Generative Adversarial Newtorks (Obj-GANs) that allow object-centered text-to-image synthesis for complex scenes.

Image Generation Object

283

Paper
Code

TrustLLM: Trustworthiness in Large Language Models

1 code implementation • 10 Jan 2024 • Lichao Sun, Yue Huang, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao liu, Heng Ji, Hongyi Wang, huan zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, Joaquin Vanschoren, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao

This paper introduces TrustLLM, a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions.

Ethics Fairness

271

Paper
Code

LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models

1 code implementation • 5 Dec 2023 • Hao Zhang, Hongyang Li, Feng Li, Tianhe Ren, Xueyan Zou, Shilong Liu, Shijia Huang, Jianfeng Gao, Lei Zhang, Chunyuan Li, Jianwei Yang

To address this issue, we have created GVC data that allows for the combination of grounding and chat capabilities.

233

Paper
Code

StoryGAN: A Sequential Conditional GAN for Story Visualization

1 code implementation • CVPR 2019 • Yitong Li, Zhe Gan, Yelong Shen, Jingjing Liu, Yu Cheng, Yuexin Wu, Lawrence Carin, David Carlson, Jianfeng Gao

We therefore propose a new story-to-image-sequence generation model, StoryGAN, based on the sequential conditional GAN framework.

Sentence Story Visualization +1

231

Paper
Code

Deep Reinforcement Learning for Dialogue Generation

8 code implementations • EMNLP 2016 • Jiwei Li, Will Monroe, Alan Ritter, Michel Galley, Jianfeng Gao, Dan Jurafsky

Recent neural models of dialogue generation offer great promise for generating responses for conversational agents, but tend to be shortsighted, predicting utterances one at a time while ignoring their influence on future outcomes.

Dialogue Generation Policy Gradient Methods +2

194

Paper
Code

Microsoft Dialogue Challenge: Building End-to-End Task-Completion Dialogue Systems

2 code implementations • 29 Jul 2018 • Xiujun Li, Yu Wang, Siqi Sun, Sarah Panda, Jingjing Liu, Jianfeng Gao

This proposal introduces a Dialogue Challenge for building end-to-end task-completion dialogue systems, with the goal of encouraging the dialogue research community to collaborate and benchmark on standard datasets and unified experimental environment.

192

Paper
Code

Few-shot Natural Language Generation for Task-Oriented Dialog

2 code implementations • Findings of the Association for Computational Linguistics 2020 • Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Michael Zeng, Jianfeng Gao

It is pre-trained on a large set of annotated NLG corpus to acquire the controllable generation ability, and fine-tuned with only a few domain-specific labels to adapt to new domains.

Ranked #4 on Data-to-Text Generation on MULTIWOZ 2.1

Data-to-Text Generation Few-Shot Learning

188

Paper
Code

Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access

1 code implementation • ACL 2017 • Bhuwan Dhingra, Lihong Li, Xiujun Li, Jianfeng Gao, Yun-Nung Chen, Faisal Ahmed, Li Deng

In this paper, we address this limitation by replacing symbolic queries with an induced "soft" posterior distribution over the KB that indicates which entities the user is interested in.

reinforcement-learning Reinforcement Learning (RL) +2

186

Paper
Code

MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts

1 code implementation • 3 Oct 2023 • Pan Lu, Hritik Bansal, Tony Xia, Jiacheng Liu, Chunyuan Li, Hannaneh Hajishirzi, Hao Cheng, Kai-Wei Chang, Michel Galley, Jianfeng Gao

To bridge this gap, we present MathVista, a benchmark designed to combine challenges from diverse mathematical and visual tasks.

Chatbot Image Captioning +5

177

Paper
Code

A Knowledge-Grounded Neural Conversation Model

2 code implementations • 7 Feb 2017 • Marjan Ghazvininejad, Chris Brockett, Ming-Wei Chang, Bill Dolan, Jianfeng Gao, Wen-tau Yih, Michel Galley

We generalize the widely-used Seq2Seq approach by conditioning responses on both conversation history and external "facts", allowing the model to be versatile and applicable in an open-domain setting.

Slot Filling

174

Paper
Code

Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing

2 code implementations • NAACL 2019 • Hao Fu, Chunyuan Li, Xiaodong Liu, Jianfeng Gao, Asli Celikyilmaz, Lawrence Carin

Variational autoencoders (VAEs) with an auto-regressive decoder have been applied for many natural language processing (NLP) tasks.

Language Modelling Response Generation +1

172

Paper
Code

Feature Quantization Improves GAN Training

2 code implementations • ICML 2020 • Yang Zhao, Chunyuan Li, Ping Yu, Jianfeng Gao, Changyou Chen

The instability in GAN training has been a long-standing problem despite remarkable research efforts.

Ranked #1 on Conditional Image Generation on CIFAR-100

Conditional Image Generation Face Generation +3

169

Paper
Code

Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning

3 code implementations • ACL 2018 • Baolin Peng, Xiujun Li, Jianfeng Gao, Jingjing Liu, Kam-Fai Wong, Shang-Yu Su

During dialogue policy learning, the world model is constantly updated with real user experience to approach real user behavior, and in turn, the dialogue agent is optimized using both real experience and simulated experience.

Reinforcement Learning (RL) Task-Completion Dialogue Policy Learning

150

Paper
Code

From Captions to Visual Concepts and Back

1 code implementation • CVPR 2015 • Hao Fang, Saurabh Gupta, Forrest Iandola, Rupesh Srivastava, Li Deng, Piotr Dollár, Jianfeng Gao, Xiaodong He, Margaret Mitchell, John C. Platt, C. Lawrence Zitnick, Geoffrey Zweig

The language model learns from a set of over 400, 000 image descriptions to capture the statistics of word usage.

Ranked #1 on Image Captioning on COCO Captions test

Image Captioning Language Modelling +3

150

Paper
Code

XtremeDistilTransformers: Task Transfer for Task-agnostic Distillation

1 code implementation • 8 Jun 2021 • Subhabrata Mukherjee, Ahmed Hassan Awadallah, Jianfeng Gao

While deep and large pre-trained models are the state-of-the-art for various natural language processing tasks, their huge size poses significant challenges for practical uses in resource constrained settings.

Knowledge Distillation NER +1

150

Paper
Code

Stochastic Answer Networks for Machine Reading Comprehension

5 code implementations • ACL 2018 • Xiaodong Liu, Yelong Shen, Kevin Duh, Jianfeng Gao

We propose a simple yet robust stochastic answer network (SAN) that simulates multi-step reasoning in machine reading comprehension.

Ranked #24 on Question Answering on SQuAD1.1 dev

Machine Reading Comprehension Question Answering +2

148

Paper
Code

Stochastic Answer Networks for Natural Language Inference

3 code implementations • 21 Apr 2018 • Xiaodong Liu, Kevin Duh, Jianfeng Gao

We propose a stochastic answer network (SAN) to explore multi-step inference strategies in Natural Language Inference.

Ranked #32 on Natural Language Inference on SNLI

Natural Language Inference

148

Paper
Code

Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension

5 code implementations • NAACL 2019 • Yichong Xu, Xiaodong Liu, Yelong Shen, Jingjing Liu, Jianfeng Gao

We propose a multi-task learning framework to learn a joint Machine Reading Comprehension (MRC) model that can be applied to a wide range of MRC tasks in different domains.

Machine Reading Comprehension Machine Translation +3

148

Paper
Code

Stochastic Answer Networks for SQuAD 2.0

5 code implementations • 24 Sep 2018 • Xiaodong Liu, Wei Li, Yuwei Fang, Aerin Kim, Kevin Duh, Jianfeng Gao

This paper presents an extension of the Stochastic Answer Network (SAN), one of the state-of-the-art machine reading comprehension models, to be able to judge whether a question is unanswerable or not.

Machine Reading Comprehension Question Answering

148

Paper
Code

Open-domain Question Answering via Chain of Reasoning over Heterogeneous Knowledge

2 code implementations • 22 Oct 2022 • Kaixin Ma, Hao Cheng, Xiaodong Liu, Eric Nyberg, Jianfeng Gao

We propose a novel open-domain question answering (ODQA) framework for answering single/multi-hop questions across heterogeneous knowledge sources.

Open-Domain Question Answering

143

Paper
Code

Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone

1 code implementation • NeurIPS 2022 • Zi-Yi Dou, Aishwarya Kamath, Zhe Gan, Pengchuan Zhang, JianFeng Wang, Linjie Li, Zicheng Liu, Ce Liu, Yann Lecun, Nanyun Peng, Jianfeng Gao, Lijuan Wang

Vision-language (VL) pre-training has recently received considerable attention.

Ranked #1 on Phrase Grounding on Flickr30k Entities Dev

Described Object Detection Image Captioning +5

123

Paper
Code

Learning Customized Visual Models with Retrieval-Augmented Knowledge

1 code implementation • CVPR 2023 • Haotian Liu, Kilho Son, Jianwei Yang, Ce Liu, Jianfeng Gao, Yong Jae Lee, Chunyuan Li

Image-text contrastive learning models such as CLIP have demonstrated strong task transfer ability.

Ranked #1 on Semi-Supervised Image Classification on ImageNet - 1% labeled data (using extra training data)

Contrastive Learning Retrieval +3

117

Paper
Code

AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning

1 code implementation • 24 May 2022 • Yaqing Wang, Sahaj Agarwal, Subhabrata Mukherjee, Xiaodong Liu, Jing Gao, Ahmed Hassan Awadallah, Jianfeng Gao

Standard fine-tuning of large pre-trained language models (PLMs) for downstream tasks requires updating hundreds of millions to billions of parameters, and storing a large copy of the PLM weights for every task resulting in increased cost for storing, sharing and serving the models.

Natural Language Understanding Sparse Learning

116

Paper
Code

AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning

1 code implementation • 31 Oct 2022 • Yaqing Wang, Sahaj Agarwal, Subhabrata Mukherjee, Xiaodong Liu, Jing Gao, Ahmed Hassan Awadallah, Jianfeng Gao

116

Paper
Code

Semantic Parsing via Staged Query Graph Generation: Question Answering with Knowledge Base

1 code implementation • IJCNLP 2015 • Wen-tau Yih, Ming-Wei Chang, Xiaodong He, Jianfeng Gao

Entity Linking Graph Generation +3

112

Paper
Code

RetGen: A Joint framework for Retrieval and Grounded Text Generation Modeling

1 code implementation • 14 May 2021 • Yizhe Zhang, Siqi Sun, Xiang Gao, Yuwei Fang, Chris Brockett, Michel Galley, Jianfeng Gao, Bill Dolan

We propose a framework that alleviates this data constraint by jointly training a grounded generator and document retriever on the language model signal.

Dialogue Generation Language Modelling +1

111

Paper
Code

Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention

2 code implementations • 6 Dec 2021 • Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng, Xuedong Huang

In particular, we focus on the task of Commonsense Reasoning, demonstrating that the proposed external attention mechanism can augment existing transformer models and significantly improve the model's reasoning capabilities.

Ranked #1 on Common Sense Reasoning on CommonsenseQA (using extra training data)

Common Sense Reasoning

106

Paper
Code

GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation

2 code implementations • 13 Nov 2023 • An Yan, Zhengyuan Yang, Wanrong Zhu, Kevin Lin, Linjie Li, JianFeng Wang, Jianwei Yang, Yiwu Zhong, Julian McAuley, Jianfeng Gao, Zicheng Liu, Lijuan Wang

We first benchmark MM-Navigator on our collected iOS screen dataset.

Action Localization

106

Paper
Code

HittER: Hierarchical Transformers for Knowledge Graph Embeddings

2 code implementations • EMNLP 2021 • Sanxing Chen, Xiaodong Liu, Jianfeng Gao, Jian Jiao, Ruofei Zhang, Yangfeng Ji

Our proposed model consists of two different Transformer blocks: the bottom block extracts features of each entity-relation pair in the local neighborhood of the source entity and the top block aggregates the relational information from outputs of the bottom block.

Ranked #1 on Link Prediction on FB15k-237 (Hit@10 metric)

Knowledge Graph Embeddings Link Prediction +2

105

Paper
Code

Stacked Attention Networks for Image Question Answering

16 code implementations • CVPR 2016 • Zichao Yang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Smola

Thus, we develop a multiple-layer SAN in which we query an image multiple times to infer the answer progressively.

Ranked #5 on Visual Question Answering (VQA) on VQA v1 test-std

Visual Question Answering (VQA)

104

Paper
Code

ConvLab-3: A Flexible Dialogue System Toolkit Based on a Unified Data Format

1 code implementation • 30 Nov 2022 • Qi Zhu, Christian Geishauser, Hsien-Chin Lin, Carel van Niekerk, Baolin Peng, Zheng Zhang, Michael Heck, Nurul Lubis, Dazhen Wan, Xiaochen Zhu, Jianfeng Gao, Milica Gašić, Minlie Huang

Task-oriented dialogue (TOD) systems function as digital assistants, guiding users through various tasks such as booking flights or finding restaurants.

Reinforcement Learning (RL) Transfer Learning

Paper
Code

BiomedCLIP: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs

3 code implementations • 2 Mar 2023 • Sheng Zhang, Yanbo Xu, Naoto Usuyama, Hanwen Xu, Jaspreet Bagga, Robert Tinn, Sam Preston, Rajesh Rao, Mu Wei, Naveen Valluri, Cliff Wong, Andrea Tupini, Yu Wang, Matt Mazzola, Swadheen Shukla, Lars Liden, Jianfeng Gao, Matthew P. Lungren, Tristan Naumann, Sheng Wang, Hoifung Poon

Therefore, training an effective generalist biomedical model requires high-quality multimodal data, such as parallel image-text pairs.

Ranked #3 on Medical Visual Question Answering on SLAKE-English

Medical Visual Question Answering Pneumonia Detection +3

Paper
Code

Learning a Decision Tree Algorithm with Transformers

1 code implementation • 6 Feb 2024 • Yufan Zhuang, Liyuan Liu, Chandan Singh, Jingbo Shang, Jianfeng Gao

We then train MetaTree to produce the trees that achieve strong generalization performance.

Paper
Code

Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs

1 code implementation • 3 Nov 2023 • Qingru Zhang, Chandan Singh, Liyuan Liu, Xiaodong Liu, Bin Yu, Jianfeng Gao, Tuo Zhao

In human-written articles, we often leverage the subtleties of text style, such as bold and italics, to guide the attention of readers.

Paper
Code

Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training

1 code implementation • CVPR 2020 • Weituo Hao, Chunyuan Li, Xiujun Li, Lawrence Carin, Jianfeng Gao

By training on a large amount of image-text-action triplets in a self-supervised learning manner, the pre-trained model provides generic representations of visual environments and language instructions.

Ranked #1 on Visual Navigation on Help, Anna! (HANNA)

Navigate Self-Supervised Learning +2

Paper
Code

PIQA: Reasoning about Physical Commonsense in Natural Language

2 code implementations • 26 Nov 2019 • Yonatan Bisk, Rowan Zellers, Ronan Le Bras, Jianfeng Gao, Yejin Choi

Questions requiring this kind of physical commonsense pose a challenge to today's natural language understanding systems.

Ranked #36 on Question Answering on PIQA

Natural Language Understanding Physical Commonsense Reasoning +1

Paper
Code

Conversing by Reading: Contentful Neural Conversation with On-demand Machine Reading

1 code implementation • ACL 2019 • Lianhui Qin, Michel Galley, Chris Brockett, Xiaodong Liu, Xiang Gao, Bill Dolan, Yejin Choi, Jianfeng Gao

Although neural conversation models are effective in learning how to produce fluent responses, their primary challenge lies in knowing what to say to make the conversation contentful and non-vacuous.

Informativeness Reading Comprehension +1

Paper
Code

Data Augmentation for Spoken Language Understanding via Pretrained Language Models

1 code implementation • 29 Apr 2020 • Baolin Peng, Chenguang Zhu, Michael Zeng, Jianfeng Gao

The training of spoken language understanding (SLU) models often faces the problem of data scarcity.

Data Augmentation Spoken Language Understanding +1

Paper
Code

SOLOIST: Building Task Bots at Scale with Transfer Learning and Machine Teaching

1 code implementation • 11 May 2020 • Baolin Peng, Chunyuan Li, Jinchao Li, Shahin Shayandeh, Lars Liden, Jianfeng Gao

We present a new method SOLOIST that uses transfer learning and machine teaching to build task bots at scale.

Ranked #4 on End-To-End Dialogue Modelling on MULTIWOZ 2.0

End-To-End Dialogue Modelling Few-Shot Learning +4

Paper
Code

Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning

1 code implementation • 30 Aug 2022 • Sheng Zhang, Hao Cheng, Jianfeng Gao, Hoifung Poon

We present a bi-encoder framework for named entity recognition (NER), which applies contrastive learning to map candidate text spans and entity types into the same vector representation space.

Ranked #1 on Named Entity Recognition (NER) on BC5CDR

Contrastive Learning Metric Learning +5

Paper
Code

End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager

1 code implementation • 3 Dec 2016 • Xuesong Yang, Yun-Nung Chen, Dilek Hakkani-Tur, Paul Crook, Xiujun Li, Jianfeng Gao, Li Deng

Natural language understanding and dialogue policy learning are both essential in conversational systems that predict the next system actions in response to a current user utterance.

Natural Language Understanding

Paper
Code

Augmenting Interpretable Models with LLMs during Training

4 code implementations • 23 Sep 2022 • Chandan Singh, Armin Askari, Rich Caruana, Jianfeng Gao

Recent large language models (LLMs) have demonstrated remarkable prediction performance for a growing array of tasks.

Additive models Language Modelling +3

Paper
Code

Explaining Patterns in Data with Language Models via Interpretable Autoprompting

2 code implementations • 4 Oct 2022 • Chandan Singh, John X. Morris, Jyoti Aneja, Alexander M. Rush, Jianfeng Gao

Large language models (LLMs) have displayed an impressive ability to harness natural language to perform complex tasks.

Explanation Generation Natural Language Understanding +2

Paper
Code

Explaining black box text modules in natural language with language models

2 code implementations • 17 May 2023 • Chandan Singh, Aliyah R. Hsu, Richard Antonello, Shailee Jain, Alexander G. Huth, Bin Yu, Jianfeng Gao

Here, we ask whether we can automatically obtain natural language explanations for black box text modules.

Explanation Generation

Paper
Code

Rethinking Interpretability in the Era of Large Language Models

1 code implementation • 30 Jan 2024 • Chandan Singh, Jeevana Priya Inala, Michel Galley, Rich Caruana, Jianfeng Gao

We highlight two emerging research priorities for LLM interpretation: using LLMs to directly analyze new datasets and to generate interactive explanations.

Interpretable Machine Learning

Paper
Code

Semantic Compositional Networks for Visual Captioning

1 code implementation • CVPR 2017 • Zhe Gan, Chuang Gan, Xiaodong He, Yunchen Pu, Kenneth Tran, Jianfeng Gao, Lawrence Carin, Li Deng

The degree to which each member of the ensemble is used to generate an image caption is tied to the image-dependent probability of the corresponding tag.

Image Captioning Semantic Composition +1

Paper
Code

Generation-Augmented Retrieval for Open-domain Question Answering

1 code implementation • ACL 2021 • Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen

We demonstrate that the generated contexts substantially enrich the semantics of the queries and GAR with sparse representations (BM25) achieves comparable or better performance than state-of-the-art dense retrieval methods such as DPR.

Ranked #9 on Passage Retrieval on Natural Questions

Natural Questions Open-Domain Question Answering +4

Paper
Code

Rider: Reader-Guided Passage Reranking for Open-Domain Question Answering

1 code implementation • 1 Jan 2021 • Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen

Current open-domain question answering systems often follow a Retriever-Reader architecture, where the retriever first retrieves relevant passages and the reader then reads the retrieved passages to form an answer.

Natural Questions Open-Domain Question Answering +2

Paper
Code

Reader-Guided Passage Reranking for Open-Domain Question Answering

1 code implementation • Findings (ACL) 2021 • Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen

Open-Domain Question Answering

Paper
Code

M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training

1 code implementation • CVPR 2021 • Minheng Ni, Haoyang Huang, Lin Su, Edward Cui, Taroon Bharti, Lijuan Wang, Jianfeng Gao, Dongdong Zhang, Nan Duan

We present M3P, a Multitask Multilingual Multimodal Pre-trained model that combines multilingual pre-training and multimodal pre-training into a unified framework via multitask pre-training.

Image Captioning Image Retrieval +4

Paper
Code

PlotMachines: Outline-Conditioned Generation with Dynamic Plot State Tracking

2 code implementations • EMNLP 2020 • Hannah Rashkin, Asli Celikyilmaz, Yejin Choi, Jianfeng Gao

We propose the task of outline-conditioned story generation: given an outline as a set of phrases that describe key characters and events to appear in a story, the task is to generate a coherent narrative that is consistent with the provided outline.

Story Generation

Paper
Code

Few-Shot Generative Conversational Query Rewriting

1 code implementation • 9 Jun 2020 • Shi Yu, Jiahua Liu, Jingqin Yang, Chenyan Xiong, Paul Bennett, Jianfeng Gao, Zhiyuan Liu

Conversational query rewriting aims to reformulate a concise conversational query to a fully specified, context-independent query that can be effectively handled by existing information retrieval systems.

Information Retrieval Retrieval +2

Paper
Code

Structuring Latent Spaces for Stylized Response Generation

1 code implementation • IJCNLP 2019 • Xiang Gao, Yizhe Zhang, Sungjin Lee, Michel Galley, Chris Brockett, Jianfeng Gao, Bill Dolan

This structure allows the system to generate stylized relevant responses by sampling in the neighborhood of the conversation model prediction, and continuously control the style level.

Response Generation Style Transfer

Paper
Code

Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

1 code implementation • CVPR 2019 • Liyiming Ke, Xiujun Li, Yonatan Bisk, Ari Holtzman, Zhe Gan, Jingjing Liu, Jianfeng Gao, Yejin Choi, Siddhartha Srinivasa

We present the Frontier Aware Search with backTracking (FAST) Navigator, a general framework for action decoding, that achieves state-of-the-art results on the Room-to-Room (R2R) Vision-and-Language navigation challenge of Anderson et.

Ranked #3 on Vision-Language Navigation on Room2Room

Vision and Language Navigation Vision-Language Navigation

Paper
Code

Taming Sparsely Activated Transformer with Stochastic Experts

1 code implementation • ICLR 2022 • Simiao Zuo, Xiaodong Liu, Jian Jiao, Young Jin Kim, Hany Hassan, Ruofei Zhang, Tuo Zhao, Jianfeng Gao

While most on-going research focuses on improving SAMs models by exploring methods of routing inputs to experts, our analysis reveals that such research might not lead to the solution we expect, i. e., the commonly-used routing methods based on gating mechanisms do not work better than randomly routing inputs to experts.

Machine Translation Translation

Paper
Code

Implicit Deep Latent Variable Models for Text Generation

1 code implementation • IJCNLP 2019 • Le Fang, Chunyuan Li, Jianfeng Gao, Wen Dong, Changyou Chen

Deep latent variable models (LVM) such as variational auto-encoder (VAE) have recently played an important role in text generation.

Language Modelling Response Generation +2

Paper
Code

Few-Shot Named Entity Recognition: A Comprehensive Study

2 code implementations • 29 Dec 2020 • Jiaxin Huang, Chunyuan Li, Krishan Subudhi, Damien Jose, Shobana Balakrishnan, Weizhu Chen, Baolin Peng, Jianfeng Gao, Jiawei Han

This paper presents a comprehensive study to efficiently build named entity recognition (NER) systems when a small number of in-domain labeled data is available.

Few-Shot Learning named-entity-recognition +2

Paper
Code

Visually-Augmented Language Modeling

1 code implementation • 20 May 2022 • Weizhi Wang, Li Dong, Hao Cheng, Haoyu Song, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei

With the visually-augmented context, VaLM uses a visual knowledge fusion layer to enable multimodal grounded language modeling by attending to both text context and visual knowledge in images.

Image Retrieval Language Modelling +1

Paper
Code

Open Domain Question Answering with A Unified Knowledge Interface

1 code implementation • ACL 2022 • Kaixin Ma, Hao Cheng, Xiaodong Liu, Eric Nyberg, Jianfeng Gao

The retriever-reader framework is popular for open-domain question answering (ODQA) due to its ability to use explicit knowledge.

Data-to-Text Generation Natural Questions +2

Paper
Code

Chain-of-Skills: A Configurable Model for Open-domain Question Answering

1 code implementation • 4 May 2023 • Kaixin Ma, Hao Cheng, Yu Zhang, Xiaodong Liu, Eric Nyberg, Jianfeng Gao

Our approach outperforms recent self-supervised retrievers in zero-shot evaluations and achieves state-of-the-art fine-tuned retrieval performance on NQ, HotpotQA and OTT-QA.

Ranked #4 on Question Answering on HotpotQA

Open-Domain Question Answering Retrieval +1

Paper
Code

TopicRNN: A Recurrent Neural Network with Long-Range Semantic Dependency

1 code implementation • 5 Nov 2016 • Adji B. Dieng, Chong Wang, Jianfeng Gao, John Paisley

The proposed TopicRNN model integrates the merits of RNNs and latent topic models: it captures local (syntactic) dependencies using an RNN and global (semantic) dependencies using latent topics.

Language Modelling Sentiment Analysis +1

Paper
Code

KAT: A Knowledge Augmented Transformer for Vision-and-Language

1 code implementation • NAACL 2022 • Liangke Gui, Borui Wang, Qiuyuan Huang, Alex Hauptmann, Yonatan Bisk, Jianfeng Gao

The primary focus of recent work with largescale transformers has been on optimizing the amount of information packed into the model's parameters.

Answer Generation Retrieval +1

Paper
Code

Agent AI: Surveying the Horizons of Multimodal Interaction

1 code implementation • 7 Jan 2024 • Zane Durante, Qiuyuan Huang, Naoki Wake, Ran Gong, Jae Sung Park, Bidipta Sarkar, Rohan Taori, Yusuke Noda, Demetri Terzopoulos, Yejin Choi, Katsushi Ikeuchi, Hoi Vo, Li Fei-Fei, Jianfeng Gao

To accelerate research on agent-based multimodal intelligence, we define "Agent AI" as a class of interactive systems that can perceive visual stimuli, language inputs, and other environmentally-grounded data, and can produce meaningful embodied actions.

Paper
Code

Enhancing the Transformer with Explicit Relational Encoding for Math Problem Solving

3 code implementations • 15 Oct 2019 • Imanol Schlag, Paul Smolensky, Roland Fernandez, Nebojsa Jojic, Jürgen Schmidhuber, Jianfeng Gao

We incorporate Tensor-Product Representations within the Transformer in order to better support the explicit representation of relation structure.

Ranked #1 on Question Answering on Mathematics Dataset

Math Question Answering

Paper
Code

EmailSum: Abstractive Email Thread Summarization

1 code implementation • ACL 2021 • Shiyue Zhang, Asli Celikyilmaz, Jianfeng Gao, Mohit Bansal

Furthermore, we find that widely used automatic evaluation metrics (ROUGE, BERTScore) are weakly correlated with human judgments on this email thread summarization task.

Ranked #1 on Email Thread Summarization on EmailSum (short)

Abstractive Text Summarization Email Thread Summarization

Paper
Code

Self-Verification Improves Few-Shot Clinical Information Extraction

1 code implementation • 30 May 2023 • Zelalem Gero, Chandan Singh, Hao Cheng, Tristan Naumann, Michel Galley, Jianfeng Gao, Hoifung Poon

Extracting patient information from unstructured text is a critical task in health decision-support and clinical research.

In-Context Learning

Paper
Code

Efficient Long Sequence Modeling via State Space Augmented Transformer

1 code implementation • 15 Dec 2022 • Simiao Zuo, Xiaodong Liu, Jian Jiao, Denis Charles, Eren Manavoglu, Tuo Zhao, Jianfeng Gao

Specifically, we augment a SSM into the bottom layer of SPADE, and we employ efficient local attention methods for the other layers.

Computational Efficiency Language Modelling +2

Paper
Code

Deep Reinforcement Learning with a Natural Language Action Space

3 code implementations • ACL 2016 • Ji He, Jianshu Chen, Xiaodong He, Jianfeng Gao, Lihong Li, Li Deng, Mari Ostendorf

This paper introduces a novel architecture for reinforcement learning with deep neural networks designed to handle state and action spaces characterized by natural language, as found in text-based games.

Q-Learning reinforcement-learning +2

Paper
Code

AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation

1 code implementation • 14 Oct 2022 • Ganesh Jawahar, Subhabrata Mukherjee, Xiaodong Liu, Young Jin Kim, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Ahmed Hassan Awadallah, Sebastien Bubeck, Jianfeng Gao

Furthermore, existing MoE works do not consider computational constraints (e. g., FLOPs, latency) to guide their design.

Machine Translation Neural Architecture Search +1

Paper
Code

Towards Amortized Ranking-Critical Training for Collaborative Filtering

1 code implementation • 10 Jun 2019 • Sam Lobel, Chunyuan Li, Jianfeng Gao, Lawrence Carin

In this paper we investigate new methods for training collaborative filtering models based on actor-critic reinforcement learning, to directly optimize the non-differentiable quality metrics of interest.

Ranked #4 on Recommendation Systems on Million Song Dataset

Collaborative Filtering Learning-To-Rank +1

Paper
Code

RaCT: Toward Amortized Ranking-Critical Training For Collaborative Filtering

1 code implementation • ICLR 2020 • Sam Lobel*, Chunyuan Li*, Jianfeng Gao, Lawrence Carin

We investigate new methods for training collaborative filtering models based on actor-critic reinforcement learning, to more directly maximize ranking-based objective functions.

Collaborative Filtering Learning-To-Rank +2

Paper
Code

Training Vision-Language Transformers from Captions

1 code implementation • 19 May 2022 • Liangke Gui, Yingshan Chang, Qiuyuan Huang, Subhojit Som, Alex Hauptmann, Jianfeng Gao, Yonatan Bisk

Vision-Language Transformers can be learned without low-level human labels (e. g. class labels, bounding boxes, etc).

Paper
Code

REO-Relevance, Extraness, Omission: A Fine-grained Evaluation for Image Captioning

1 code implementation • IJCNLP 2019 • Ming Jiang, Junjie Hu, Qiuyuan Huang, Lei Zhang, Jana Diesner, Jianfeng Gao

In this study, we present a fine-grained evaluation method REO for automatically measuring the performance of image captioning systems.

Image Captioning

Paper
Code

TIGEr: Text-to-Image Grounding for Image Caption Evaluation

1 code implementation • IJCNLP 2019 • Ming Jiang, Qiuyuan Huang, Lei Zhang, Xin Wang, Pengchuan Zhang, Zhe Gan, Jana Diesner, Jianfeng Gao

This paper presents a new metric called TIGEr for the automatic evaluation of image captioning systems.

Image Captioning Text Matching

Paper
Code

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding

1 code implementation • 4 Nov 2021 • Subhabrata Mukherjee, Xiaodong Liu, Guoqing Zheng, Saghar Hosseini, Hao Cheng, Greg Yang, Christopher Meek, Ahmed Hassan Awadallah, Jianfeng Gao

We demonstrate that while recent models reach human performance when they have access to large amounts of labeled data, there is a huge gap in performance in the few-shot setting for most tasks.

Few-Shot Learning Natural Language Understanding

Paper
Code

LiST: Lite Prompted Self-training Makes Parameter-Efficient Few-shot Learners

1 code implementation • Findings (NAACL) 2022 • Yaqing Wang, Subhabrata Mukherjee, Xiaodong Liu, Jing Gao, Ahmed Hassan Awadallah, Jianfeng Gao

The first is the use of self-training to leverage large amounts of unlabeled data for prompt-based FN in few-shot settings.

Few-Shot Learning In-Context Learning

Paper
Code

No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models

1 code implementation • ICLR 2022 • Chen Liang, Haoming Jiang, Simiao Zuo, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao

Analysis shows that the proposed schedule indeed reduces the redundancy and improves generalization performance.

Image Classification Machine Translation +2

Paper
Code

Interactive Semantic Parsing for If-Then Recipes via Hierarchical Reinforcement Learning

1 code implementation • 21 Aug 2018 • Ziyu Yao, Xiujun Li, Jianfeng Gao, Brian Sadler, Huan Sun

Given a text description, most existing semantic parsers synthesize a program in one shot.

Hierarchical Reinforcement Learning reinforcement-learning +2

Paper
Code

Discriminative Deep Dyna-Q: Robust Planning for Dialogue Policy Learning

3 code implementations • EMNLP 2018 • Shang-Yu Su, Xiujun Li, Jianfeng Gao, Jingjing Liu, Yun-Nung Chen

This paper presents a Discriminative Deep Dyna-Q (D3Q) approach to improving the effectiveness and robustness of Deep Dyna-Q (DDQ), a recently proposed framework that extends the Dyna-Q algorithm to integrate planning for task-completion dialogue policy learning.

Task-Completion Dialogue Policy Learning

Paper
Code

What Makes A Good Story? Designing Composite Rewards for Visual Storytelling

1 code implementation • 11 Sep 2019 • Junjie Hu, Yu Cheng, Zhe Gan, Jingjing Liu, Jianfeng Gao, Graham Neubig

Previous storytelling approaches mostly focused on optimizing traditional metrics such as BLEU, ROUGE and CIDEr.

Ranked #10 on Visual Storytelling on VIST

Visual Storytelling

Paper
Code

Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions

1 code implementation • 28 May 2022 • Ansong Ni, Jeevana Priya Inala, Chenglong Wang, Oleksandr Polozov, Christopher Meek, Dragomir Radev, Jianfeng Gao

We show that our use of self-sampled correct and partially-correct solutions can benefit learning and help guide the sampling process, leading to more efficient exploration of the solution space.

Ranked #138 on Arithmetic Reasoning on GSM8K

Arithmetic Reasoning Efficient Exploration +3

Paper
Code

Tree Prompting: Efficient Task Adaptation without Fine-Tuning

2 code implementations • 21 Oct 2023 • John X. Morris, Chandan Singh, Alexander M. Rush, Jianfeng Gao, Yuntian Deng

Prompting language models (LMs) is the main interface for applying them to new tasks.

Classification Decision Making

Paper
Code

WebQA: Multihop and Multimodal QA

2 code implementations • CVPR 2022 • Yingshan Chang, Mridu Narang, Hisami Suzuki, Guihong Cao, Jianfeng Gao, Yonatan Bisk

Scaling Visual Question Answering (VQA) to the open-domain and multi-hop nature of web searches, requires fundamental advances in visual representation learning, knowledge aggregation, and language generation.

Image Retrieval Multimodal Reasoning +4

Paper
Code

A Persona-Based Neural Conversation Model

1 code implementation • ACL 2016 • Jiwei Li, Michel Galley, Chris Brockett, Georgios P. Spithourakis, Jianfeng Gao, Bill Dolan

We present persona-based models for handling the issue of speaker consistency in neural response generation.

Response Generation

Paper
Code

A Hybrid Retrieval-Generation Neural Conversation Model

1 code implementation • 19 Apr 2019 • Liu Yang, Junjie Hu, Minghui Qiu, Chen Qu, Jianfeng Gao, W. Bruce Croft, Xiaodong Liu, Yelong Shen, Jingjing Liu

In this paper, we propose a hybrid neural conversation model that combines the merits of both response retrieval and generation methods.

Retrieval Text Generation +1

Paper
Code

Fault-Aware Neural Code Rankers

1 code implementation • 4 Jun 2022 • Jeevana Priya Inala, Chenglong Wang, Mei Yang, Andres Codas, Mark Encarnación, Shuvendu K Lahiri, Madanlal Musuvathi, Jianfeng Gao

Large language models (LLMs) have demonstrated an impressive ability to generate code for various programming tasks.

Code Generation

Paper
Code

Generating Informative and Diverse Conversational Responses via Adversarial Information Maximization

4 code implementations • NeurIPS 2018 • Yizhe Zhang, Michel Galley, Jianfeng Gao, Zhe Gan, Xiujun Li, Chris Brockett, Bill Dolan

Responses generated by neural conversational models tend to lack informativeness and diversity.

Conversational Response Generation Informativeness

Paper
Code

Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers

1 code implementation • 21 May 2023 • Linyuan Gong, Chenyan Xiong, Xiaodong Liu, Payal Bajaj, Yiqing Xie, Alvin Cheung, Jianfeng Gao, Xia Song

This paper explores the effectiveness of model-generated signals in improving zero-shot generalization of text-to-text Transformers such as T5.

Zero-shot Generalization

Paper
Code

Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads

1 code implementation • EMNLP 2016 • Ji He, Mari Ostendorf, Xiaodong He, Jianshu Chen, Jianfeng Gao, Lihong Li, Li Deng

We introduce an online popularity prediction and tracking task as a benchmark task for reinforcement learning with a combinatorial, natural language action space.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture

1 code implementation • NeurIPS 2015 • Jianshu Chen, Ji He, Yelong Shen, Lin Xiao, Xiaodong He, Jianfeng Gao, Xinying Song, Li Deng

We develop a fully discriminative learning approach for supervised Latent Dirichlet Allocation (LDA) model using Back Propagation (i. e., BP-sLDA), which maximizes the posterior probability of the prediction variable given the input document.

General Classification Topic Models

Paper
Code

Consistent Dialogue Generation with Self-supervised Feature Learning

1 code implementation • 13 Mar 2019 • Yizhe Zhang, Xiang Gao, Sungjin Lee, Chris Brockett, Michel Galley, Jianfeng Gao, Bill Dolan

Generating responses that are consistent with the dialogue context is one of the central challenges in building engaging conversational agents.

Dialogue Generation Response Generation

Paper
Code

CodeExp: Explanatory Code Document Generation

1 code implementation • 25 Nov 2022 • Haotian Cui, Chenglong Wang, JunJie Huang, Jeevana Priya Inala, Todd Mytkowicz, Bo wang, Jianfeng Gao, Nan Duan

Our experiments show that (1) our refined training dataset lets models achieve better performance in the explanation generation tasks compared to larger unrefined data (15x larger), and (2) fine-tuned models can generate well-structured long docstrings comparable to human-written ones.

Explanation Generation Text Generation

Paper
Code

Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language

1 code implementation • 18 Nov 2020 • Hassan Akbari, Hamid Palangi, Jianwei Yang, Sudha Rao, Asli Celikyilmaz, Roland Fernandez, Paul Smolensky, Jianfeng Gao, Shih-Fu Chang

In this paper, we propose a new model architecture for learning multi-modal neuro-symbolic representations for video captioning.

Dictionary Learning Disentanglement +1

Paper
Code

Mapping Natural-language Problems to Formal-language Solutions Using Structured Neural Representations

2 code implementations • ICML 2020 • Kezhen Chen, Qiuyuan Huang, Hamid Palangi, Paul Smolensky, Kenneth D. Forbus, Jianfeng Gao

The encoder of TP-N2F employs TPR `binding' to encode natural-language symbolic structure in vector space and the decoder uses TPR `unbinding' to generate, in symbolic space, a sequential program represented by relational tuples, each consisting of a relation (or operation) and a number of arguments.

Program Synthesis Text Generation

Paper
Code

A Controllable Model of Grounded Response Generation

1 code implementation • 1 May 2020 • Zeqiu Wu, Michel Galley, Chris Brockett, Yizhe Zhang, Xiang Gao, Chris Quirk, Rik Koncel-Kedziorski, Jianfeng Gao, Hannaneh Hajishirzi, Mari Ostendorf, Bill Dolan

Current end-to-end neural conversation models inherently lack the flexibility to impose semantic control in the response generation process, often resulting in uninteresting responses.

Informativeness Response Generation

Paper
Code

Data Augmentation for Abstractive Query-Focused Multi-Document Summarization

1 code implementation • 2 Mar 2021 • Ramakanth Pasunuru, Asli Celikyilmaz, Michel Galley, Chenyan Xiong, Yizhe Zhang, Mohit Bansal, Jianfeng Gao

The progress in Query-focused Multi-Document Summarization (QMDS) has been limited by the lack of sufficient largescale high-quality training datasets.

Data Augmentation Document Summarization +1

Paper
Code

Adversarial Regularization as Stackelberg Game: An Unrolled Optimization Approach

1 code implementation • EMNLP 2021 • Simiao Zuo, Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He, Jianfeng Gao, Weizhu Chen, Tuo Zhao

Adversarial regularization has been shown to improve the generalization performance of deep learning models in various natural language processing tasks.

Machine Translation Natural Language Understanding +1

Paper
Code

Enriching Transformers with Structured Tensor-Product Representations for Abstractive Summarization

1 code implementation • NAACL 2021 • Yichen Jiang, Asli Celikyilmaz, Paul Smolensky, Paul Soulos, Sudha Rao, Hamid Palangi, Roland Fernandez, Caitlin Smith, Mohit Bansal, Jianfeng Gao

On several syntactic and semantic probing tasks, we demonstrate the emergent structural information in the role vectors and improved syntactic interpretability in the TPR layer outputs.

Abstractive Text Summarization

Paper
Code

Contrastive Multi-document Question Generation

1 code implementation • EACL 2021 • Woon Sang Cho, Yizhe Zhang, Sudha Rao, Asli Celikyilmaz, Chenyan Xiong, Jianfeng Gao, Mengdi Wang, Bill Dolan

In the SL stage, a single-document question generator is trained.

Contrastive Learning Question Generation +3

Paper
Code

RMM: A Recursive Mental Model for Dialog Navigation

1 code implementation • 2 May 2020 • Homero Roman Roman, Yonatan Bisk, Jesse Thomason, Asli Celikyilmaz, Jianfeng Gao

In this paper, we go beyond instruction following and introduce a two-agent task where one agent navigates and asks questions that a second, guiding agent answers.

Answer Generation Instruction Following

Paper
Code

RMM: A Recursive Mental Model for Dialogue Navigation

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Homero Roman Roman, Yonatan Bisk, Jesse Thomason, Asli Celikyilmaz, Jianfeng Gao

In this paper, we go beyond instruction following and introduce a two-agent task where one agent navigates and asks questions that a second, guiding agent answers.

Answer Generation Instruction Following

Paper
Code

Task-Aware Specialization for Efficient and Robust Dense Retrieval for Open-Domain Question Answering

1 code implementation • 11 Oct 2022 • Hao Cheng, Hao Fang, Xiaodong Liu, Jianfeng Gao

Given its effectiveness on knowledge-intensive natural language processing tasks, dense retrieval models have become increasingly popular.

Open-Domain Question Answering Retrieval

Paper
Code

Guided Dialog Policy Learning without Adversarial Learning in the Loop

1 code implementation • 7 Apr 2020 • Ziming Li, Sungjin Lee, Baolin Peng, Jinchao Li, Julia Kiseleva, Maarten de Rijke, Shahin Shayandeh, Jianfeng Gao

Reinforcement Learning (RL) methods have emerged as a popular choice for training an efficient and effective dialogue policy.

Reinforcement Learning (RL)

Paper
Code

Guided Dialogue Policy Learning without Adversarial Learning in the Loop

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Ziming Li, Sungjin Lee, Baolin Peng, Jinchao Li, Julia Kiseleva, Maarten de Rijke, Shahin Shayandeh, Jianfeng Gao

Reinforcement learning methods have emerged as a popular choice for training an efficient and effective dialogue policy.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

Compositional Processing Emerges in Neural Networks Solving Math Problems

1 code implementation • 19 May 2021 • Jacob Russin, Roland Fernandez, Hamid Palangi, Eric Rosen, Nebojsa Jojic, Paul Smolensky, Jianfeng Gao

A longstanding question in cognitive science concerns the learning mechanisms underlying compositionality in human cognition.

Math Mathematical Reasoning

Paper
Code

Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models

1 code implementation • 4 Nov 2021 • Boxin Wang, Chejian Xu, Shuohang Wang, Zhe Gan, Yu Cheng, Jianfeng Gao, Ahmed Hassan Awadallah, Bo Li

In this paper, we present Adversarial GLUE (AdvGLUE), a new multi-task benchmark to quantitatively and thoroughly explore and evaluate the vulnerabilities of modern large-scale language models under various types of adversarial attacks.

Ranked #1 on Adversarial Robustness on AdvGLUE

Adversarial Attack Adversarial Robustness +1

Paper
Code

Teaching Language Models to Self-Improve through Interactive Demonstrations

1 code implementation • 20 Oct 2023 • Xiao Yu, Baolin Peng, Michel Galley, Jianfeng Gao, Zhou Yu

The self-improving ability of large language models (LLMs), enabled by prompting them to analyze and revise their own outputs, has garnered significant interest in recent research.

Math

Paper
Code

Language-Based Image Editing with Recurrent Attentive Models

1 code implementation • CVPR 2018 • Jianbo Chen, Yelong Shen, Jianfeng Gao, Jingjing Liu, Xiaodong Liu

First, we introduce a synthetic dataset, called CoSaL, to evaluate the end-to-end performance of our LBIE system.

Colorization Image Colorization +3

Paper
Code

Bi-directional Attention with Agreement for Dependency Parsing

1 code implementation • EMNLP 2016 • Hao Cheng, Hao Fang, Xiaodong He, Jianfeng Gao, Li Deng

We develop a novel bi-directional attention model for dependency parsing, which learns to agree on headword predictions from the forward and backward parsing directions.

Ranked #4 on Chinese Dependency Parsing on Chinese Pennbank

Dependency Parsing

Paper
Code

Switch-based Active Deep Dyna-Q: Efficient Adaptive Planning for Task-Completion Dialogue Policy Learning

1 code implementation • 19 Nov 2018 • Yuexin Wu, Xiujun Li, Jingjing Liu, Jianfeng Gao, Yiming Yang

Training task-completion dialogue agents with reinforcement learning usually requires a large number of real user experiences.

Active Learning Q-Learning +1

Paper
Code

Robust Navigation with Language Pretraining and Stochastic Sampling

1 code implementation • IJCNLP 2019 • Xiujun Li, Chunyuan Li, Qiaolin Xia, Yonatan Bisk, Asli Celikyilmaz, Jianfeng Gao, Noah Smith, Yejin Choi

Core to the vision-and-language navigation (VLN) challenge is building robust instruction representations and action decoding schemes, which can generalize well to previously unseen instructions and environments.

Vision and Language Navigation

Paper
Code

Differentiable Tree Operations Promote Compositional Generalization

1 code implementation • 1 Jun 2023 • Paul Soulos, Edward Hu, Kate McCurdy, Yunmo Chen, Roland Fernandez, Paul Smolensky, Jianfeng Gao

To facilitate the learning of these symbolic sequences, we introduce a differentiable tree interpreter that compiles high-level symbolic tree operations into subsymbolic matrix operations on tensors.

Semantic Parsing Text Generation

Paper
Code

ARCH: Efficient Adversarial Regularized Training with Caching

1 code implementation • Findings (EMNLP) 2021 • Simiao Zuo, Chen Liang, Haoming Jiang, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao

Adversarial regularization can improve model generalization in many natural language processing tasks.

Machine Translation Natural Language Understanding +1

Paper
Code

Execution-based Evaluation for Data Science Code Generation Models

1 code implementation • 17 Nov 2022 • JunJie Huang, Chenglong Wang, Jipeng Zhang, Cong Yan, Haotian Cui, Jeevana Priya Inala, Colin Clement, Nan Duan, Jianfeng Gao

Code generation models can benefit data scientists' productivity by automatically generating code from context and text descriptions.

Code Generation Model Selection

Paper
Code

Is Self-Repair a Silver Bullet for Code Generation?

1 code implementation • 16 Jun 2023 • Theo X. Olausson, Jeevana Priya Inala, Chenglong Wang, Jianfeng Gao, Armando Solar-Lezama

We hypothesize that this is because self-repair is bottlenecked by the model's ability to provide feedback on its own code; using a stronger model to artificially boost the quality of the feedback, we observe substantially larger performance gains.

Code Generation

Paper
Code

Character-level Deep Conflation for Business Data Analytics

2 code implementations • 8 Feb 2017 • Zhe Gan, P. D. Singh, Ameet Joshi, Xiaodong He, Jianshu Chen, Jianfeng Gao, Li Deng

Connecting different text attributes associated with the same entity (conflation) is important in business data analytics since it could help merge two different tables in a database to provide a more comprehensive profile of an entity.

Paper
Code

HUBERT Untangles BERT to Improve Transfer across NLP Tasks

1 code implementation • 25 Oct 2019 • Mehrad Moradshahi, Hamid Palangi, Monica S. Lam, Paul Smolensky, Jianfeng Gao

We introduce HUBERT which combines the structured-representational power of Tensor-Product Representations (TPRs) and BERT, a pre-trained bidirectional Transformer language model.

Language Modelling

Paper
Code

Localized Symbolic Knowledge Distillation for Visual Commonsense Models

2 code implementations • NeurIPS 2023 • Jae Sung Park, Jack Hessel, Khyathi Raghavi Chandu, Paul Pu Liang, Ximing Lu, Peter West, Youngjae Yu, Qiuyuan Huang, Jianfeng Gao, Ali Farhadi, Yejin Choi

Empirical results and human evaluations in a zero-shot setup demonstrate that our distillation method results in more precise VL models of reasoning compared to a baseline of passing a generated referring expression to an LLM.

Instruction Following Knowledge Distillation +3

Paper
Code

Attribute Structuring Improves LLM-Based Evaluation of Clinical Text Summaries

1 code implementation • 1 Mar 2024 • Zelalem Gero, Chandan Singh, Yiqing Xie, Sheng Zhang, Tristan Naumann, Jianfeng Gao, Hoifung Poon

Summarizing clinical text is crucial in health decision-support and clinical research.

Attribute Text Summarization

Paper
Code

Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing

1 code implementation • 31 Jul 2020 • Yu Gu, Robert Tinn, Hao Cheng, Michael Lucas, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon

In this paper, we challenge this assumption by showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models.

Ranked #2 on Participant Intervention Comparison Outcome Extraction on EBM-NLP (using extra training data)

Continual Pretraining +11

Paper
Code

Language Models as Inductive Reasoners

1 code implementation • 21 Dec 2022 • Zonglin Yang, Li Dong, Xinya Du, Hao Cheng, Erik Cambria, Xiaodong Liu, Jianfeng Gao, Furu Wei

To this end, we propose a new paradigm (task) for inductive reasoning, which is to induce natural language rules from natural language facts, and create a dataset termed DEER containing 1. 2k rule-fact pairs for the task, where rules and facts are written in natural language.

Philosophy

Paper
Code

OPERA: Harmonizing Task-Oriented Dialogs and Information Seeking Experience

1 code implementation • 24 Jun 2022 • Miaoran Li, Baolin Peng, Jianfeng Gao, Zhu Zhang

Existing studies in conversational AI mostly treat task-oriented dialog (TOD) and question answering (QA) as separate tasks.

Question Answering

Paper
Code

Towards Consistent Natural-Language Explanations via Explanation-Consistency Finetuning

1 code implementation • 25 Jan 2024 • Yanda Chen, Chandan Singh, Xiaodong Liu, Simiao Zuo, Bin Yu, He He, Jianfeng Gao

We propose explanation-consistency finetuning (EC-finetuning), a method that adapts LLMs to generate more consistent natural-language explanations on related examples.

Question Answering

Paper
Code

The Neural Painter: Multi-Turn Image Generation

no code implementations • 16 Jun 2018 • Ryan Y. Benmalek, Claire Cardie, Serge Belongie, Xiadong He, Jianfeng Gao

In this work we combine two research threads from Vision/ Graphics and Natural Language Processing to formulate an image generation task conditioned on attributes in a multi-turn setting.

Benchmarking Conditional Image Generation

Paper
Add Code

Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models

no code implementations • NeurIPS 2018 • Minjia Zhang, Xiaodong Liu, Wenhan Wang, Jianfeng Gao, Yuxiong He

Neural language models (NLMs) have recently gained a renewed interest by achieving state-of-the-art performance across many natural language processing (NLP) tasks.

Language Modelling Machine Translation +1

Paper
Add Code

M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search

no code implementations • NeurIPS 2018 • Yelong Shen, Jianshu Chen, Po-Sen Huang, Yuqing Guo, Jianfeng Gao

In order to effectively train the agent from sparse rewards, we combine MCTS with the neural policy to generate trajectories yielding more positive rewards.

Ranked #44 on Link Prediction on WN18RR (Hits@3 metric)

Knowledge Base Completion Link Prediction +2

Paper
Add Code

Discourse-Aware Neural Rewards for Coherent Text Generation

no code implementations • NAACL 2018 • Antoine Bosselut, Asli Celikyilmaz, Xiaodong He, Jianfeng Gao, Po-Sen Huang, Yejin Choi

In this paper, we investigate the use of discourse-aware rewards with reinforcement learning to guide a model to generate long, coherent text.

reinforcement-learning Reinforcement Learning (RL) +3

Paper
Add Code

Link Prediction using Embedded Knowledge Graphs

no code implementations • 14 Nov 2016 • Yelong Shen, Po-Sen Huang, Ming-Wei Chang, Jianfeng Gao

Since large knowledge bases are typically incomplete, missing facts need to be inferred from observed facts in a task called knowledge base completion.

Knowledge Base Completion Knowledge Graphs +1

Paper
Add Code

Subgoal Discovery for Hierarchical Dialogue Policy Learning

no code implementations • EMNLP 2018 • Da Tang, Xiujun Li, Jianfeng Gao, Chong Wang, Lihong Li, Tony Jebara

Experiments with simulated and real users show that our approach performs competitively against a state-of-the-art method that requires human-defined subgoals.

Hierarchical Reinforcement Learning

Paper
Add Code

Combating Reinforcement Learning's Sisyphean Curse with Intrinsic Fear

no code implementations • 3 Nov 2016 • Zachary C. Lipton, Kamyar Azizzadenesheli, Abhishek Kumar, Lihong Li, Jianfeng Gao, Li Deng

We introduce intrinsic fear (IF), a learned reward shaping that guards DRL agents against periodic catastrophes.

Atari Games General Classification +3

Paper
Add Code

Dynamic Fusion Networks for Machine Reading Comprehension

no code implementations • 14 Nov 2017 • Yichong Xu, Jingjing Liu, Jianfeng Gao, Yelong Shen, Xiaodong Liu

This paper presents a novel neural model - Dynamic Fusion Network (DFN), for machine reading comprehension (MRC).

Machine Reading Comprehension

Paper
Add Code

Adversarial Advantage Actor-Critic Model for Task-Completion Dialogue Policy Learning

no code implementations • 31 Oct 2017 • Baolin Peng, Xiujun Li, Jianfeng Gao, Jingjing Liu, Yun-Nung Chen, Kam-Fai Wong

This paper presents a new method --- adversarial advantage actor-critic (Adversarial A2C), which significantly improves the efficiency of dialogue policy learning in task-completion dialogue systems.

Task-Completion Dialogue Policy Learning

Paper
Add Code

BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems

no code implementations • 17 Aug 2016 • Zachary C. Lipton, Xiujun Li, Jianfeng Gao, Lihong Li, Faisal Ahmed, Li Deng

We present a new algorithm that significantly improves the efficiency of exploration for deep Q-learning agents in dialogue systems.

Efficient Exploration Q-Learning +4

Paper
Add Code

BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems

no code implementations • 15 Nov 2017 • Zachary Lipton, Xiujun Li, Jianfeng Gao, Lihong Li, Faisal Ahmed, Li Deng

We present a new algorithm that significantly improves the efficiency of exploration for deep Q-learning agents in dialogue systems.

Efficient Exploration Q-Learning +4

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.