Search Results for author: Nan Yang

Found 85 papers, 29 papers with code

Unified Language Model Pre-training for Natural Language Understanding and Generation

9 code implementations NeurIPS 2019 Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon

This paper presents a new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks.

Ranked #2 on Generative Question Answering on CoQA (using extra training data)

Abstractive Text Summarization Document Summarization +7

MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers

1 code implementation NeurIPS 2020 Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, Ming Zhou

The small model (student) is trained by deeply mimicking the self-attention module, which plays a vital role in Transformer networks, of the large model (teacher).

Zero-shot Text Search

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

3 code implementations 28 Feb 2020 Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Songhao Piao, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon

We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM).

Ranked #4 on Question Generation on SQuAD1.1 (using extra training data)

Abstractive Text Summarization Language Modelling +3

Pseudo-Masked Language Models for Unified Language Model Pre-Training

1 code implementation ICML 2020 Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Jianfeng Gao, Songhao Piao, Ming Zhou, Hsiao-Wuen Hon

We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM).

Language Modelling Natural Language Understanding +1

s2s-ft: Fine-Tuning Pretrained Transformer Encoders for Sequence-to-Sequence Learning

1 code implementation 26 Oct 2021 Hangbo Bao, Li Dong, Wenhui Wang, Nan Yang, Furu Wei

Pretrained bidirectional Transformers, such as BERT, have achieved significant improvements in a wide variety of language understanding tasks, while it is not straightforward to directly apply them for natural language generation.

Abstractive Text Summarization Question Generation +2

SimLM: Pre-training with Representation Bottleneck for Dense Passage Retrieval

1 code implementation 6 Jul 2022 Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei

It employs a simple bottleneck architecture that learns to compress the passage information into a dense vector through self-supervised pre-training.

Language Modelling Passage Retrieval +1

Multilingual E5 Text Embeddings: A Technical Report

1 code implementation 8 Feb 2024 Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, Furu Wei

This technical report presents the training methodology and evaluation results of the open-source multilingual E5 text embedding models, released in mid-2023.

InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training

4 code implementations NAACL 2021 Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian-Ling Mao, He-Yan Huang, Ming Zhou

In this work, we present an information-theoretic framework that formulates cross-lingual language model pre-training as maximizing mutual information between multilingual-multi-granularity texts.

Contrastive Learning Cross-Lingual Transfer +2

Learning to Retrieve In-Context Examples for Large Language Models

2 code implementations 14 Jul 2023 Liang Wang, Nan Yang, Furu Wei

Our framework initially trains a reward model based on LLM feedback to evaluate the quality of candidate examples, followed by knowledge distillation to train a bi-encoder based dense retriever.

In-Context Learning Knowledge Distillation

Generative Representational Instruction Tuning

2 code implementations 15 Feb 2024 Niklas Muennighoff, Hongjin Su, Liang Wang, Nan Yang, Furu Wei, Tao Yu, Amanpreet Singh, Douwe Kiela

Notably, we find that GRIT matches training on only generative or embedding data, thus we can unify both at no performance loss.

Language Modelling Large Language Model +1

MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera

1 code implementation CVPR 2021 Felix Wimbauer, Nan Yang, Lukas von Stumberg, Niclas Zeller, Daniel Cremers

Unlike other multi-view stereo methods, MonoRec is able to reconstruct both static and moving objects by leveraging the predicted masks.

Fine-Tuning LLaMA for Multi-Stage Text Retrieval

1 code implementation 12 Oct 2023 Xueguang Ma, Liang Wang, Nan Yang, Furu Wei, Jimmy Lin

Our findings demonstrate that the effectiveness of large language models indeed surpasses that of smaller models.

Passage Retrieval Retrieval +1

Behind the Scenes: Density Fields for Single View Reconstruction

2 code implementations CVPR 2023 Felix Wimbauer, Nan Yang, Christian Rupprecht, Daniel Cremers

By directly sampling color from the available views instead of storing color in the density field, our scene representation becomes significantly less complex compared to NeRFs, and a neural network can predict it in a single forward pass.

Depth Estimation Depth Prediction +1

Neural Question Generation from Text: A Preliminary Study

6 code implementations 6 Apr 2017 Qingyu Zhou, Nan Yang, Furu Wei, Chuanqi Tan, Hangbo Bao, Ming Zhou

Automatic question generation aims to generate questions from a text passage where the generated questions can be answered by certain sub-spans of the given passage.

Position Question Generation +2

PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training

1 code implementation 19 Sep 2023 Dawei Zhu, Nan Yang, Liang Wang, YiFan Song, Wenhao Wu, Furu Wei, Sujian Li

To decouple train length from target length for efficient context window extension, we propose Positional Skip-wisE (PoSE) training that smartly simulates long inputs using a fixed context window.

2k Position

Improving Text Embeddings with Large Language Models

1 code implementation 31 Dec 2023 Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, Furu Wei

In this paper, we introduce a novel and simple method for obtaining high-quality text embeddings using only synthetic data and less than 1k training steps.

Selective Encoding for Abstractive Sentence Summarization

2 code implementations ACL 2017 Qingyu Zhou, Nan Yang, Furu Wei, Ming Zhou

We propose a selective encoding model to extend the sequence-to-sequence framework for abstractive sentence summarization.

Sentence Sentence Summarization

Multiview Identifiers Enhanced Generative Retrieval

1 code implementation 26 May 2023 Yongqi Li, Nan Yang, Liang Wang, Furu Wei, Wenjie Li

Instead of simply matching a query to pre-existing passages, generative retrieval generates identifier strings of passages as the retrieval target.

Retrieval

Learning to Rank in Generative Retrieval

2 code implementations 27 Jun 2023 Yongqi Li, Nan Yang, Liang Wang, Furu Wei, Wenjie Li

However, only learning to generate is insufficient for generative retrieval.

Learning-To-Rank Passage Ranking +3

Learning Diverse Document Representations with Deep Query Interactions for Dense Retrieval

1 code implementation 8 Aug 2022 Zehan Li, Nan Yang, Liang Wang, Furu Wei

In this paper, we propose a new dense retrieval model which learns diverse document representations with deep query interactions.

Retrieval

LongEmbed: Extending Embedding Models for Long Context Retrieval

1 code implementation 18 Apr 2024 Dawei Zhu, Liang Wang, Nan Yang, YiFan Song, Wenhao Wu, Furu Wei, Sujian Li

This paper explores context window extension of existing embedding models, pushing the limit to 32k without requiring additional training.

4k 8k +1

Sequential Copying Networks

1 code implementation 6 Jul 2018 Qingyu Zhou, Nan Yang, Furu Wei, Ming Zhou

Copying mechanism shows effectiveness in sequence-to-sequence based neural network models for text generation tasks, such as abstractive sentence summarization and question generation.

Question Generation Question-Generation +3

Challenges in Monocular Visual Odometry: Photometric Calibration, Motion Bias and Rolling Shutter Effect

no code implementations 11 May 2017 Nan Yang, Rui Wang, Xiang Gao, Daniel Cremers

Monocular visual odometry (VO) and simultaneous localization and mapping (SLAM) have seen tremendous improvements in accuracy, robustness and efficiency, and have gained increasing popularity over recent years.

Monocular Visual Odometry Simultaneous Localization and Mapping

Relaxed Wasserstein with Applications to GANs

no code implementations 19 May 2017 Xin Guo, Johnny Hong, Tianyi Lin, Nan Yang

Wasserstein Generative Adversarial Networks (WGANs) provide a versatile class of models, which have attracted great attention in various applications.

Image Generation

S-Net: From Answer Extraction to Answer Generation for Machine Reading Comprehension

no code implementations 15 Jun 2017 Chuanqi Tan, Furu Wei, Nan Yang, Bowen Du, Weifeng Lv, Ming Zhou

We build the answer extraction model with state-of-the-art neural networks for single passage reading comprehension, and propose an additional task of passage ranking to help answer extraction in multiple passages.

Answer Generation Machine Reading Comprehension +1

Ambiguity set and learning via Bregman and Wasserstein

no code implementations 23 May 2017 Xin Guo, Johnny Hong, Nan Yang

Construction of ambiguity set in robust optimization relies on the choice of divergences between probability distributions.

BIG-bench Machine Learning

Brains and pseudorandom generators

no code implementations 26 Nov 2013 Vašek Chvátal, Mark Goldsmith, Nan Yang

In a pioneering classic, Warren McCulloch and Walter Pitts proposed a model of the central nervous system; motivated by EEG recordings of normal brain activity, Chvátal and Goldsmith asked whether or not this model can be engineered to provide pseudorandom number generators.

EEG

Jointly Modeling Topics and Intents with Global Order Structure

no code implementations 7 Dec 2015 Bei Chen, Jun Zhu, Nan Yang, Tian Tian, Ming Zhou, Bo Zhang

Modeling document structure is of great importance for discourse analysis and related applications.

Radical-Enhanced Chinese Character Embedding

no code implementations 18 Apr 2014 Yaming Sun, Lei Lin, Duyu Tang, Nan Yang, Zhenzhou Ji, Xiaolong Wang

We present a method to leverage radical for learning Chinese character embedding.

Chinese Word Segmentation

Attention-Guided Answer Distillation for Machine Reading Comprehension

no code implementations EMNLP 2018 Minghao Hu, Yuxing Peng, Furu Wei, Zhen Huang, Dongsheng Li, Nan Yang, Ming Zhou

Despite that current reading comprehension systems have achieved significant advancements, their promising performances are often obtained at the cost of making an ensemble of numerous models.

Knowledge Distillation Machine Reading Comprehension

Sequence-to-Dependency Neural Machine Translation

no code implementations ACL 2017 Shuangzhi Wu, Dong-dong Zhang, Nan Yang, Mu Li, Ming Zhou

Nowadays a typical Neural Machine Translation (NMT) model generates translations from left to right as a linear sequence, during which latent syntactic structures of the target sentences are not explicitly concerned.

Machine Translation NMT +1

Improving Attention Modeling with Implicit Distortion and Fertility for Machine Translation

no code implementations COLING 2016 Shi Feng, Shujie Liu, Nan Yang, Mu Li, Ming Zhou, Kenny Q. Zhu

In neural machine translation, the attention mechanism facilitates the translation process by producing a soft alignment between the source sentence and the target sentence.

Machine Translation Sentence +1

Multi-Frame GAN: Image Enhancement for Stereo Visual Odometry in Low Light

no code implementations 15 Oct 2019 Eunah Jung, Nan Yang, Daniel Cremers

We propose the concept of a multi-frame GAN (MFGAN) and demonstrate its potential as an image sequence enhancement for stereo visual odometry in low light conditions.

Image Enhancement Optical Flow Estimation +2

D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry

no code implementations CVPR 2020 Nan Yang, Lukas von Stumberg, Rui Wang, Daniel Cremers

We propose D3VO as a novel framework for monocular visual odometry that exploits deep networks on three levels -- deep depth, pose and uncertainty estimation.

Monocular Depth Estimation Monocular Visual Odometry

LM-Reloc: Levenberg-Marquardt Based Direct Visual Relocalization

no code implementations 13 Oct 2020 Lukas von Stumberg, Patrick Wenzel, Nan Yang, Daniel Cremers

The learned features significantly improve the robustness of direct image alignment, especially for relocalization across different conditions.

Pose Estimation

Coverage Analysis for 3D Terahertz Communication Systems with Blockage and Directional Antennas

no code implementations 16 Apr 2020 Akram Shafie, Nan Yang, Zhuo Sun, Salman Durrani

We further show that the coverage performance improvement brought by increasing the antenna directivity at APs is higher than that brought by increasing the antenna directivity at UEs.

Expected Density of Cooperative Bacteria in a 2D Quorum Sensing Based Molecular Communication System

no code implementations 1 Dec 2018 Yuting Fang, Adam Noel, Andrew W. Eckford, Nan Yang

The number of molecules observed at each randomly-distributed bacterium is first derived by characterizing the diffusion and degradation of signaling molecules within the population.

Hybrid Beamforming for Terahertz Multi-Carrier Systems over Frequency Selective Fading

no code implementations 14 Oct 2019 Hang Yuan, Nan Yang, Kai Yang, Chong Han, Jianping An

We consider a three-dimensional wideband THz channel by incorporating the joint effect of molecular absorption, high sparsity, and multi-path fading, and consider the carrier frequency offset in multi-carrier systems.

Directional Modulation-Enabled Secure Transmission with Intelligent Reflecting Surface

no code implementations 7 Jul 2020 Liangling Lai, Jinsong Hu, Youjia Chen, Haifeng Zheng, Nan Yang

We propose a new secure transmission scheme which uses directional modulation (DM) with artificial noise and is aided by the intelligent reflecting surface (IRS).

Position

Coverage Analysis for 3D Terahertz Communication Systems

no code implementations 20 Apr 2021 Akram Shafie, Nan Yang, Salman Durrani, Xiangyun Zhou, Chong Han, Markku Juntti

We conduct novel coverage probability analysis of downlink transmission in a three-dimensional (3D) terahertz (THz) communication (THzCom) system.

SINGA-Easy: An Easy-to-Use Framework for MultiModal Analysis

no code implementations 3 Aug 2021 Naili Xing, Sai Ho Yeung, ChengHao Cai, Teck Khim Ng, Wei Wang, Kaiyuan Yang, Nan Yang, Meihui Zhang, Gang Chen, Beng Chin Ooi

Specifically, in terms of usability, it is demanding for non-experts to implement deep learning models, obtain the right settings for the entire machine learning pipeline, manage models and datasets, and exploit external data sources all together.

Image Classification

Spectrum Allocation with Adaptive Sub-band Bandwidth for Terahertz Communication Systems

no code implementations 10 Nov 2021 Akram Shafie, Nan Yang, Sheeraz Alvi, Chong Han, Salman Durrani, Josep M. Jornet

Aided by numerical results, we show that by enabling and optimizing ASB, significantly higher throughput can be achieved as compared to adopting equal sub-band bandwidth, and this throughput gain is most profound when the power budget constraint is more stringent.

Novel Spectrum Allocation Among Multiple Transmission Windows for Terahertz Communication Systems

no code implementations 6 Jul 2022 Akram Shafie, Nan Yang, Chong Han, Josep M. Jornet

We also show that a further data rate gain can be obtained by optimally determining the unused spectra at the edges of TWs, as compared to avoiding using pre-defined spectra at the edges of TWs.

Terahertz Communications for 6G and Beyond Wireless Networks: Challenges, Key Advancements, and Opportunities

no code implementations 22 Jul 2022 Akram Shafie, Nan Yang, Chong Han, Josep Miquel Jornet, Markku Juntti, Thomas Kurner

The unprecedented increase in wireless data traffic, predicted to occur within the next decade, is motivating academia and industries to look beyond contemporary wireless standards and conceptualize the sixth-generation (6G) wireless networks.

Management

An Unsupervised Learning Approach for Spectrum Allocation in Terahertz Communication Systems

no code implementations 7 Aug 2022 Akram Shafie, Chunhui Li, Nan Yang, Xiangyun Zhou, Trung Q. Duong

Numerical results demonstrate that, compared to existing approaches, our proposed unsupervised learning-based approach achieves a higher data rate, especially when the molecular absorption coefficient within the spectrum of interest varies in a highly non-linear manner.

CCR: Facial Image Editing with Continuity, Consistency and Reversibility

no code implementations 22 Sep 2022 Nan Yang, Xin Luan, Huidi Jia, Zhi Han, Yandong Tang

In this work, we put forward three concepts and corresponding definitions: editing continuity, consistency, and reversibility.

Attribute

Adversarial Transformer for Repairing Human Airway Segmentation

no code implementations 21 Oct 2022 Zeyu Tang, Nan Yang, Simon Walsh, Guang Yang

Discontinuity in the delineation of peripheral bronchioles hinders the potential clinical application of automated airway segmentation models.

Segmentation

Terahertz Communications for Massive Connectivity and Security in 6G and Beyond Era

no code implementations 25 Oct 2022 Nan Yang, Akram Shafie

Terahertz (THz) communications (THzCom) has experienced a meteoric rise of interest, due to its benefits for ultra-high data rate transmission in the sixth generation (6G) and beyond era.

Management

4Seasons: Benchmarking Visual SLAM and Long-Term Localization for Autonomous Driving in Challenging Conditions

no code implementations 31 Dec 2022 Patrick Wenzel, Nan Yang, Rui Wang, Niclas Zeller, Daniel Cremers

In this paper, we present a novel visual SLAM and long-term localization benchmark for autonomous driving in challenging conditions based on the large-scale 4Seasons dataset.

Autonomous Driving Benchmarking +2

Random Padding Data Augmentation

no code implementations 17 Feb 2023 Nan Yang, Laicheng Zhong, Fan Huang, Dong Yuan, Wei Bao

Random Padding is parameter-free, simple to construct, and compatible with the majority of CNN-based recognition models.

Data Augmentation Image Classification +1

FedIL: Federated Incremental Learning from Decentralized Unlabeled Data with Convergence Analysis

no code implementations 23 Feb 2023 Nan Yang, Dong Yuan, Charles Z Liu, Yongkun Deng, Wei Bao

Most existing federated learning methods assume that clients have fully labeled data to train on, while in reality, it is hard for the clients to get task-specific labels due to users' privacy concerns, high labeling costs, or lack of expertise.

Federated Learning Incremental Learning +1

Real-time scheduling of renewable power systems through planning-based reinforcement learning

no code implementations 9 Mar 2023 Shaohuai Liu, Jinbo Liu, Weirui Ye, Nan Yang, Guanglun Zhang, Haiwang Zhong, Chongqing Kang, Qirong Jiang, Xuri Song, Fangchun Di, Yang Gao

The well-trained scheduling agent significantly reduces renewable curtailment and load shedding, which are issues arising from traditional scheduling's reliance on inaccurate day-ahead forecasts.

reinforcement-learning Reinforcement Learning (RL) +1

Query2doc: Query Expansion with Large Language Models

no code implementations 14 Mar 2023 Liang Wang, Nan Yang, Furu Wei

This paper introduces a simple yet effective query expansion approach, denoted as query2doc, to improve both sparse and dense retrieval systems.

Memorization Retrieval

FedMAE: Federated Self-Supervised Learning with One-Block Masked Auto-Encoder

no code implementations 20 Mar 2023 Nan Yang, Xuanyu Chen, Charles Z. Liu, Dong Yuan, Wei Bao, Lizhen Cui

Latest federated learning (FL) methods started to focus on how to use unlabeled data in clients for training due to users' privacy concerns, high labeling costs, or lack of expertise.

Federated Learning Image Reconstruction +1

Combining Adversaries with Anti-adversaries in Training

no code implementations 25 Apr 2023 Xiaoling Zhou, Nan Yang, Ou Wu

On the basis of our theoretical findings, a more general learning objective that combines adversaries and anti-adversaries with varied bounds on each training sample is presented.

Fairness Meta-Learning

UAV-assisted IoT Monitoring Network: Adaptive Multiuser Access for Low-Latency and High-Reliability Under Bursty Traffic

no code implementations 25 Apr 2023 Nilupuli Senadhira, Salman Durrani, Sheeraz A. Alvi, Nan Yang, Xiangyun Zhou

In this work, we propose an adaptive system design for an Internet of Things (IoT) monitoring network with latency and reliability requirements, where IoT devices generate time-critical and event-triggered bursty traffic, and an unmanned aerial vehicle (UAV) aggregates and relays sensed data to the base station.

Incremental Dense Reconstruction from Monocular Video with Guided Sparse Feature Volume Fusion

no code implementations 24 May 2023 Xingxing Zuo, Nan Yang, Nathaniel Merrill, Binbin Xu, Stefan Leutenegger

Incrementally recovering 3D dense structures from monocular videos is of paramount importance since it enables various robotics and AR applications.

Large Search Model: Redefining Search Stack in the Era of LLMs

no code implementations 23 Oct 2023 Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, Furu Wei

Modern search engines are built on a stack of different components, including query understanding, retrieval, multi-stage ranking, and question answering, among others.

Language Modelling Large Language Model +3

Enhancing Traffic Object Detection in Variable Illumination with RGB-Event Fusion

no code implementations 1 Nov 2023 Zhanwen Liu, Nan Yang, Yang Wang, Yuke Li, Xiangmo Zhao, Fei-Yue Wang

To address this issue, we introduce bio-inspired event cameras and propose a novel Structure-aware Fusion Network (SFNet) that extracts sharp and complete object structures from the event stream to compensate for the lost information in images through cross-modality fusion, enabling the network to obtain illumination-robust representations for traffic object detection.

Object object-detection +2

Time-Frequency Localization Characteristics of the Delay-Doppler Plane Orthogonal Pulse

no code implementations 13 Nov 2023 Akram Shafie, Jinhong Yuan, Nan Yang, Hai Lin

Furthermore, we determine the TFA for the recently proposed generalized design of the DDOP.

Event-driven Real-time Retrieval in Web Search

no code implementations 1 Dec 2023 Nan Yang, Shusen Zhang, Yannan Zhang, Xiaoling Bai, Hualong Deng, Tianhua Zhou, Jin Ma

The Event information is then integrated with the query through a cross-attention mechanism, resulting in a time-context query representation.

Information Retrieval Retrieval

Data is all you need: Finetuning LLMs for Chip Design via an Automated design-data augmentation framework

no code implementations 17 Mar 2024 Kaiyan Chang, Kun Wang, Nan Yang, Ying Wang, Dantong Jin, Wenlong Zhu, Zhirong Chen, Cangyuan Li, Hao Yan, Yunhao Zhou, Zhuoliang Zhao, Yuan Cheng, Yudong Pan, Yiqi Liu, Mengdi Wang, Shengwen Liang, Yinhe Han, Huawei Li, Xiaowei Li

Our 13B model (ChipGPT-FT) has a pass rate improvement compared with GPT-3.5 in Verilog generation and outperforms in EDA script (i.e., SiliconCompiler) generation with only 200 EDA script data.

Data Augmentation
