Search Results for author: Wenhan Xiong

Found 45 papers, 26 papers with code

FLAME: Factuality-Aware Alignment for Large Language Models

no code implementations2 May 2024 Sheng-Chieh Lin, Luyu Gao, Barlas Oguz, Wenhan Xiong, Jimmy Lin, Wen-tau Yih, Xilun Chen

Furthermore, reward functions used in standard RL can also encourage hallucination, because it guides the LLM to provide more helpful responses on a diverse set of instructions, often preferring longer and more detailed responses.

Hallucination Instruction Following +1

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

1 code implementation12 Apr 2024 Xuezhe Ma, Xiaomeng Yang, Wenhan Xiong, Beidi Chen, Lili Yu, Hao Zhang, Jonathan May, Luke Zettlemoyer, Omer Levy, Chunting Zhou

The quadratic complexity and weak length extrapolation of Transformers limits their ability to scale to long sequences, and while sub-quadratic solutions like linear attention and state space models exist, they empirically underperform Transformers in pretraining efficiency and downstream task accuracy.

Scene-LLM: Extending Language Model for 3D Visual Understanding and Reasoning

no code implementations18 Mar 2024 Rao Fu, Jingyu Liu, Xilun Chen, Yixin Nie, Wenhan Xiong

This paper introduces Scene-LLM, a 3D-visual-language model that enhances embodied agents' abilities in interactive 3D indoor environments by integrating the reasoning strengths of Large Language Models (LLMs).

Dense Captioning Language Modelling +1

The Role of Chain-of-Thought in Complex Vision-Language Reasoning Task

no code implementations15 Nov 2023 Yifan Wu, Pengchuan Zhang, Wenhan Xiong, Barlas Oguz, James C. Gee, Yixin Nie

The study explores the effectiveness of the Chain-of-Thought approach, known for its proficiency in language tasks by breaking them down into sub-tasks and intermediate steps, in improving vision-language tasks that demand sophisticated perception and reasoning.

Visual Reasoning

Effective Long-Context Scaling of Foundation Models

2 code implementations27 Sep 2023 Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma

We also examine the impact of various design choices in the pretraining process, including the data mix and the training curriculum of sequence lengths -- our ablation experiments suggest that having abundant long texts in the pretrain dataset is not the key to achieving strong performance, and we empirically verify that long context continual pretraining is more efficient and similarly effective compared to pretraining from scratch with long sequences.

Continual Pretraining Language Modelling

LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models

1 code implementation30 Aug 2023 Chi Han, Qifan Wang, Hao Peng, Wenhan Xiong, Yu Chen, Heng Ji, Sinong Wang

As a result, their performance suffers drastically on inputs longer than those encountered during training, substantially limiting their applications in real-world tasks involving long contexts such as encoding scientific articles, code repositories, or long dialogues.

2k 4k +1

Code Llama: Open Foundation Models for Code

2 code implementations24 Aug 2023 Baptiste Rozière, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, Xiaoqing Ellen Tan, Yossi Adi, Jingyu Liu, Romain Sauvestre, Tal Remez, Jérémy Rapin, Artyom Kozhevnikov, Ivan Evtimov, Joanna Bitton, Manish Bhatt, Cristian Canton Ferrer, Aaron Grattafiori, Wenhan Xiong, Alexandre Défossez, Jade Copet, Faisal Azhar, Hugo Touvron, Louis Martin, Nicolas Usunier, Thomas Scialom, Gabriel Synnaeve

We release Code Llama, a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks.

16k Code Generation +1

Prompting Large Language Models with Speech Recognition Abilities

no code implementations21 Jul 2023 Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Junteng Jia, Yuan Shangguan, Ke Li, Jinxi Guo, Wenhan Xiong, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer

Furthermore, we perform ablation studies to investigate whether the LLM can be completely frozen during training to maintain its original capabilities, scaling up the audio encoder, and increasing the audio encoder striding to generate fewer embeddings.

Abstractive Text Summarization Automatic Speech Recognition +3

Coarse-to-Fine Contrastive Learning in Image-Text-Graph Space for Improved Vision-Language Compositionality

no code implementations23 May 2023 Harman Singh, Pengchuan Zhang, Qifan Wang, Mengjiao Wang, Wenhan Xiong, Jingfei Du, Yu Chen

Along with this, we propose novel negative mining techniques in the scene graph space for improving attribute binding and relation understanding.

 Ranked #1 on Image Retrieval on CREPE (Compositional REPresentation Evaluation) (Recall@1 (HN-Comp, UC) metric)

Attribute Contrastive Learning +4

Multi-Head State Space Model for Speech Recognition

no code implementations21 May 2023 Yassir Fathullah, Chunyang Wu, Yuan Shangguan, Junteng Jia, Wenhan Xiong, Jay Mahadeokar, Chunxi Liu, Yangyang Shi, Ozlem Kalinli, Mike Seltzer, Mark J. F. Gales

State space models (SSMs) have recently shown promising results on small-scale sequence and language modelling tasks, rivalling and outperforming many attention-based approaches.

Language Modelling speech-recognition +1

VideoOFA: Two-Stage Pre-Training for Video-to-Text Generation

no code implementations4 May 2023 Xilun Chen, Lili Yu, Wenhan Xiong, Barlas Oğuz, Yashar Mehdad, Wen-tau Yih

We propose a new two-stage pre-training framework for video-to-text generation tasks such as video captioning and video question answering: A generative encoder-decoder model is first jointly pre-trained on massive image-text data to learn fundamental vision-language concepts, and then adapted to video data in an intermediate video-text pre-training stage to learn video-specific skills such as spatio-temporal reasoning.

Decoder Question Answering +4

3DGen: Triplane Latent Diffusion for Textured Mesh Generation

no code implementations9 Mar 2023 Anchit Gupta, Wenhan Xiong, Yixin Nie, Ian Jones, Barlas Oğuz

We take another step along this direction, combining these developments in a two-step pipeline consisting of 1) a triplane VAE which can learn latent representations of textured meshes and 2) a conditional diffusion model which generates the triplane features.

Image Generation Texture Synthesis

CLIP-Layout: Style-Consistent Indoor Scene Synthesis with Semantic Furniture Embedding

1 code implementation7 Mar 2023 Jingyu Liu, Wenhan Xiong, Ian Jones, Yixin Nie, Anchit Gupta, Barlas Oğuz

Whether heuristic or learned, these methods ignore instance-level visual attributes of objects, and as a result may produce visually less coherent scenes.

Indoor Scene Synthesis Scene Generation

Bridging the Training-Inference Gap for Dense Phrase Retrieval

no code implementations25 Oct 2022 Gyuwan Kim, Jinhyuk Lee, Barlas Oguz, Wenhan Xiong, Yizhe Zhang, Yashar Mehdad, William Yang Wang

Building dense retrievers requires a series of standard procedures, including training and validating neural models and creating indexes for efficient search.

Open-Domain Question Answering Passage Retrieval +1

SCROLLS: Standardized CompaRison Over Long Language Sequences

2 code implementations10 Jan 2022 Uri Shaham, Elad Segal, Maor Ivgi, Avia Efrat, Ori Yoran, Adi Haviv, Ankit Gupta, Wenhan Xiong, Mor Geva, Jonathan Berant, Omer Levy

NLP benchmarks have largely focused on short texts, such as sentences and paragraphs, even though long texts comprise a considerable amount of natural language in the wild.

Decoder Long-range modeling +2

Boosted Dense Retriever

no code implementations NAACL 2022 Patrick Lewis, Barlas Oğuz, Wenhan Xiong, Fabio Petroni, Wen-tau Yih, Sebastian Riedel

DrBoost is trained in stages: each component model is learned sequentially and specialized by focusing only on retrieval mistakes made by the current ensemble.

Quantization Retrieval

Unified Multimodal Pre-training and Prompt-based Tuning for Vision-Language Understanding and Generation

no code implementations10 Dec 2021 Tianyi Liu, Zuxuan Wu, Wenhan Xiong, Jingjing Chen, Yu-Gang Jiang

Our experiments show that there is a trade-off between understanding tasks and generation tasks while using the same model, and a feasible way to improve both tasks is to use more data.

Image-text matching Language Modelling +8

Zero-shot Fact Verification by Claim Generation

1 code implementation ACL 2021 Liangming Pan, Wenhu Chen, Wenhan Xiong, Min-Yen Kan, William Yang Wang

However, for each new domain that requires fact verification, creating a dataset by manually writing claims and linking them to their supporting evidence is expensive.

2k Fact Verification

Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval

1 code implementation ICLR 2021 Wenhan Xiong, Xiang Lorraine Li, Srini Iyer, Jingfei Du, Patrick Lewis, William Yang Wang, Yashar Mehdad, Wen-tau Yih, Sebastian Riedel, Douwe Kiela, Barlas Oğuz

We propose a simple and efficient multi-hop dense retrieval approach for answering complex open-domain questions, which achieves state-of-the-art performance on two multi-hop datasets, HotpotQA and multi-evidence FEVER.

Question Answering Retrieval

Do Multi-hop Readers Dream of Reasoning Chains?

1 code implementation WS 2019 Haoyu Wang, Mo Yu, Xiaoxiao Guo, Rajarshi Das, Wenhan Xiong, Tian Gao

General Question Answering (QA) systems over texts require the multi-hop reasoning capability, i. e. the ability to reason with information collected from multiple passages to derive the answer.

Question Answering

Simple yet Effective Bridge Reasoning for Open-Domain Multi-Hop Question Answering

no code implementations WS 2019 Wenhan Xiong, Mo Yu, Xiaoxiao Guo, Hong Wang, Shiyu Chang, Murray Campbell, William Yang Wang

To resolve this issue, we introduce a new sub-problem of open-domain multi-hop QA, which aims to recognize the bridge (\emph{i. e.}, the anchor that links to the answer passage) from the context of a set of start passages with a reading comprehension model.

Information Retrieval Multi-hop Question Answering +3

Neural Correction Model for Open-Domain Named Entity Recognition

1 code implementation13 Sep 2019 Mengdi Zhu, Zheye Deng, Wenhan Xiong, Mo Yu, Ming Zhang, William Yang Wang

In this work, to address the low precision and recall problems, we first utilize DBpedia as the source of distant supervision to annotate abstracts from Wikipedia and design a neural correction model trained with a human-annotated NER dataset, DocRED, to correct the false entity labels.

Multi-Task Learning named-entity-recognition +4

Meta Reasoning over Knowledge Graphs

no code implementations13 Aug 2019 Hong Wang, Wenhan Xiong, Mo Yu, Xiaoxiao Guo, Shiyu Chang, William Yang Wang

The ability to reason over learned knowledge is an innate ability for humans and humans can easily master new reasoning rules with only a few demonstrations.

Few-Shot Learning Knowledge Base Completion +1

TWEETQA: A Social Media Focused Question Answering Dataset

no code implementations ACL 2019 Wenhan Xiong, Jiawei Wu, Hong Wang, Vivek Kulkarni, Mo Yu, Shiyu Chang, Xiaoxiao Guo, William Yang Wang

With social media becoming increasingly pop-ular on which lots of news and real-time eventsare reported, developing automated questionanswering systems is critical to the effective-ness of many applications that rely on real-time knowledge.

Question Answering

Self-Supervised Learning for Contextualized Extractive Summarization

2 code implementations ACL 2019 Hong Wang, Xin Wang, Wenhan Xiong, Mo Yu, Xiaoxiao Guo, Shiyu Chang, William Yang Wang

Existing models for extractive summarization are usually trained from scratch with a cross-entropy loss, which does not explicitly capture the global context at the document level.

Extractive Summarization Self-Supervised Learning

Improving Question Answering over Incomplete KBs with Knowledge-Aware Reader

2 code implementations ACL 2019 Wenhan Xiong, Mo Yu, Shiyu Chang, Xiaoxiao Guo, William Yang Wang

We propose a new end-to-end question answering model, which learns to aggregate answer evidence from an incomplete knowledge base (KB) and a set of retrieved text snippets.

Question Answering

Sentence Embedding Alignment for Lifelong Relation Extraction

2 code implementations NAACL 2019 Hong Wang, Wenhan Xiong, Mo Yu, Xiaoxiao Guo, Shiyu Chang, William Yang Wang

We formulate such a challenging problem as lifelong relation extraction and investigate memory-efficient incremental learning methods without catastrophically forgetting knowledge learned from previous tasks.

Incremental Learning Relation +4

Imposing Label-Relational Inductive Bias for Extremely Fine-Grained Entity Typing

1 code implementation NAACL 2019 Wenhan Xiong, Jiawei Wu, Deren Lei, Mo Yu, Shiyu Chang, Xiaoxiao Guo, William Yang Wang

Existing entity typing systems usually exploit the type hierarchy provided by knowledge base (KB) schema to model label correlations and thus improve the overall performance.

Entity Typing Inductive Bias

SafeRoute: Learning to Navigate Streets Safely in an Urban Environment

1 code implementation3 Nov 2018 Sharon Levy, Wenhan Xiong, Elizabeth Belding, William Yang Wang

We propose SafeRoute, a novel solution to the problem of navigating cities and avoiding street harassment and crime.

Navigate Representation Learning

One-Shot Relational Learning for Knowledge Graphs

1 code implementation EMNLP 2018 Wenhan Xiong, Mo Yu, Shiyu Chang, Xiaoxiao Guo, William Yang Wang

Knowledge graphs (KGs) are the key components of various natural language processing applications.

Relational Reasoning

Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents

3 code implementations16 Jun 2018 Wenhan Xiong, Xiaoxiao Guo, Mo Yu, Shiyu Chang, Bo-Wen Zhou, William Yang Wang

We investigate the task of learning to follow natural language instructions by jointly reasoning with visual observations and language inputs.

Efficient Exploration reinforcement-learning +1

Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation

1 code implementation ECCV 2018 Xin Wang, Wenhan Xiong, Hongmin Wang, William Yang Wang

In this paper, we take a radical approach to bridge the gap between synthetic studies and real-world practices---We propose a novel, planned-ahead hybrid reinforcement learning model that combines model-free and model-based reinforcement learning to solve a real-world vision-language navigation task.

Model-based Reinforcement Learning reinforcement-learning +4

Variational Knowledge Graph Reasoning

no code implementations NAACL 2018 Wenhu Chen, Wenhan Xiong, Xifeng Yan, William Wang

Inferring missing links in knowledge graphs (KG) has attracted a lot of attention from the research community.

Knowledge Graphs Link Prediction +2

Cannot find the paper you are looking for? You can Submit a new open access paper.