Search Results for author: Xiujun Li

Found 33 papers, 19 papers with code

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks

4 code implementations ECCV 2020 Xiujun Li, Xi Yin, Chunyuan Li, Pengchuan Zhang, Xiao-Wei Hu, Lei Zhang, Lijuan Wang, Houdong Hu, Li Dong, Furu Wei, Yejin Choi, Jianfeng Gao

Large-scale pre-training methods of learning cross-modal representations on image-text pairs are becoming popular for vision-language tasks.

Ranked #1 on Image Retrieval on MS COCO (Recall@10 metric)

Image Captioning Image Retrieval +3

VinVL: Revisiting Visual Representations in Vision-Language Models

7 code implementations CVPR 2021 Pengchuan Zhang, Xiujun Li, Xiaowei Hu, Jianwei Yang, Lei Zhang, Lijuan Wang, Yejin Choi, Jianfeng Gao

In our experiments we feed the visual features generated by the new object detection model into a Transformer-based VL fusion model OSCAR (Li et al., 2020), and utilize an improved approach OSCAR+ to pre-train the VL model and fine-tune it on a wide range of downstream VL tasks.

Image Captioning Image-text matching +4

ConvLab: Multi-Domain End-to-End Dialog System Platform

2 code implementations ACL 2019 Sungjin Lee, Qi Zhu, Ryuichi Takanobu, Xiang Li, Yaoqin Zhang, Zheng Zhang, Jinchao Li, Baolin Peng, Xiujun Li, Minlie Huang, Jianfeng Gao

We present ConvLab, an open-source multi-domain end-to-end dialog system platform, that enables researchers to quickly set up experiments with reusable components and compare a large set of different approaches, ranging from conventional pipeline systems to end-to-end neural models, in common environments.

End-to-End Task-Completion Neural Dialogue Systems

13 code implementations IJCNLP 2017 Xiujun Li, Yun-Nung Chen, Lihong Li, Jianfeng Gao, Asli Celikyilmaz

One of the major drawbacks of modularized task-completion dialogue systems is that each module is trained individually, which presents several challenges.

Chatbot

Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space

1 code implementation EMNLP 2020 Chunyuan Li, Xiang Gao, Yuan Li, Baolin Peng, Xiujun Li, Yizhe Zhang, Jianfeng Gao

We hope that our first pre-trained big VAE language model itself and results can help the NLP community renew the interests of deep generative models in the era of large-scale pre-training, and make these principled methods more practical.

Language Modelling Representation Learning +1

Microsoft Dialogue Challenge: Building End-to-End Task-Completion Dialogue Systems

2 code implementations 29 Jul 2018 Xiujun Li, Yu Wang, Siqi Sun, Sarah Panda, Jingjing Liu, Jianfeng Gao

This proposal introduces a Dialogue Challenge for building end-to-end task-completion dialogue systems, with the goal of encouraging the dialogue research community to collaborate and benchmark on standard datasets and unified experimental environment.

Few-shot Natural Language Generation for Task-Oriented Dialog

2 code implementations Findings of the Association for Computational Linguistics 2020 Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Michael Zeng, Jianfeng Gao

It is pre-trained on a large set of annotated NLG corpus to acquire the controllable generation ability, and fine-tuned with only a few domain-specific labels to adapt to new domains.

Data-to-Text Generation Few-Shot Learning

Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access

1 code implementation ACL 2017 Bhuwan Dhingra, Lihong Li, Xiujun Li, Jianfeng Gao, Yun-Nung Chen, Faisal Ahmed, Li Deng

In this paper, we address this limitation by replacing symbolic queries with an induced "soft" posterior distribution over the KB that indicates which entities the user is interested in.

reinforcement-learning Reinforcement Learning (RL) +2

Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning

3 code implementations ACL 2018 Baolin Peng, Xiujun Li, Jianfeng Gao, Jingjing Liu, Kam-Fai Wong, Shang-Yu Su

During dialogue policy learning, the world model is constantly updated with real user experience to approach real user behavior, and in turn, the dialogue agent is optimized using both real experience and simulated experience.

Reinforcement Learning (RL) Task-Completion Dialogue Policy Learning

LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation

1 code implementation NeurIPS 2023 Yujie Lu, Xianjun Yang, Xiujun Li, Xin Eric Wang, William Yang Wang

Existing automatic evaluation on text-to-image synthesis can only provide an image-text matching score, without considering the object-level compositionality, which results in poor correlation with human judgments.

Attribute Image Generation +2

Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training

1 code implementation CVPR 2020 Weituo Hao, Chunyuan Li, Xiujun Li, Lawrence Carin, Jianfeng Gao

By training on a large amount of image-text-action triplets in a self-supervised learning manner, the pre-trained model provides generic representations of visual environments and language instructions.

Navigate Self-Supervised Learning +2

End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager

1 code implementation 3 Dec 2016 Xuesong Yang, Yun-Nung Chen, Dilek Hakkani-Tur, Paul Crook, Xiujun Li, Jianfeng Gao, Li Deng

Natural language understanding and dialogue policy learning are both essential in conversational systems that predict the next system actions in response to a current user utterance.

Natural Language Understanding

Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

1 code implementation CVPR 2019 Liyiming Ke, Xiujun Li, Yonatan Bisk, Ari Holtzman, Zhe Gan, Jingjing Liu, Jianfeng Gao, Yejin Choi, Siddhartha Srinivasa

We present the Frontier Aware Search with backTracking (FAST) Navigator, a general framework for action decoding, that achieves state-of-the-art results on the Room-to-Room (R2R) Vision-and-Language navigation challenge of Anderson et.

Vision and Language Navigation Vision-Language Navigation

Discriminative Deep Dyna-Q: Robust Planning for Dialogue Policy Learning

3 code implementations EMNLP 2018 Shang-Yu Su, Xiujun Li, Jianfeng Gao, Jingjing Liu, Yun-Nung Chen

This paper presents a Discriminative Deep Dyna-Q (D3Q) approach to improving the effectiveness and robustness of Deep Dyna-Q (DDQ), a recently proposed framework that extends the Dyna-Q algorithm to integrate planning for task-completion dialogue policy learning.

Task-Completion Dialogue Policy Learning

Switch-based Active Deep Dyna-Q: Efficient Adaptive Planning for Task-Completion Dialogue Policy Learning

1 code implementation 19 Nov 2018 Yuexin Wu, Xiujun Li, Jingjing Liu, Jianfeng Gao, Yiming Yang

Training task-completion dialogue agents with reinforcement learning usually requires a large number of real user experiences.

Active Learning Q-Learning +1

Robust Navigation with Language Pretraining and Stochastic Sampling

1 code implementation IJCNLP 2019 Xiujun Li, Chunyuan Li, Qiaolin Xia, Yonatan Bisk, Asli Celikyilmaz, Jianfeng Gao, Noah Smith, Yejin Choi

Core to the vision-and-language navigation (VLN) challenge is building robust instruction representations and action decoding schemes, which can generalize well to previously unseen instructions and environments.

Vision and Language Navigation

Subgoal Discovery for Hierarchical Dialogue Policy Learning

no code implementations EMNLP 2018 Da Tang, Xiujun Li, Jianfeng Gao, Chong Wang, Lihong Li, Tony Jebara

Experiments with simulated and real users show that our approach performs competitively against a state-of-the-art method that requires human-defined subgoals.

Hierarchical Reinforcement Learning

Adversarial Advantage Actor-Critic Model for Task-Completion Dialogue Policy Learning

no code implementations 31 Oct 2017 Baolin Peng, Xiujun Li, Jianfeng Gao, Jingjing Liu, Yun-Nung Chen, Kam-Fai Wong

This paper presents a new method --- adversarial advantage actor-critic (Adversarial A2C), which significantly improves the efficiency of dialogue policy learning in task-completion dialogue systems.

Task-Completion Dialogue Policy Learning

BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems

no code implementations 15 Nov 2017 Zachary Lipton, Xiujun Li, Jianfeng Gao, Lihong Li, Faisal Ahmed, Li Deng

We present a new algorithm that significantly improves the efficiency of exploration for deep Q-learning agents in dialogue systems.

Efficient Exploration Q-Learning +4

Recurrent Reinforcement Learning: A Hybrid Approach

no code implementations 10 Sep 2015 Xiujun Li, Lihong Li, Jianfeng Gao, Xiaodong He, Jianshu Chen, Li Deng, Ji He

Successful applications of reinforcement learning in real-world problems often require dealing with partially observable states.

reinforcement-learning Reinforcement Learning (RL)

Budgeted Policy Learning for Task-Oriented Dialogue Systems

no code implementations ACL 2019 Zhirui Zhang, Xiujun Li, Jianfeng Gao, Enhong Chen

This paper presents a new approach that extends Deep Dyna-Q (DDQ) by incorporating a Budget-Conscious Scheduling (BCS) to best utilize a fixed, small amount of user interactions (budget) for learning task-oriented dialogue agents.

Scheduling Task-Oriented Dialogue Systems

Multi-View Learning for Vision-and-Language Navigation

no code implementations 2 Mar 2020 Qiaolin Xia, Xiujun Li, Chunyuan Li, Yonatan Bisk, Zhifang Sui, Jianfeng Gao, Yejin Choi, Noah A. Smith

Learning to navigate in a visual environment following natural language instructions is a challenging task because natural language instructions are highly variable, ambiguous, and under-specified.

Multi-View Learning Navigate +1

MiniVLM: A Smaller and Faster Vision-Language Model

no code implementations 13 Dec 2020 JianFeng Wang, Xiaowei Hu, Pengchuan Zhang, Xiujun Li, Lijuan Wang, Lei Zhang, Jianfeng Gao, Zicheng Liu

We design a Two-stage Efficient feature Extractor (TEE), inspired by the one-stage EfficientDet network, to significantly reduce the time cost of visual feature extraction by 95%, compared to a baseline model.

Language Modelling

VIM: Probing Multimodal Large Language Models for Visual Embedded Instruction Following

no code implementations 29 Nov 2023 Yujie Lu, Xiujun Li, William Yang Wang, Yejin Choi

We introduce VISUAL EMBEDDED INSTRUCTION (VIM), a new framework designed to evaluate the visual instruction following capability of Multimodal Large Language Models (MLLMs).

In-Context Learning visual instruction following
