Search Results for author: Tao Qin

Found 197 papers, 97 papers with code

Global Ranking Using Continuous Conditional Random Fields

no code implementations NeurIPS 2008 Tao Qin, Tie-Yan Liu, Xu-Dong Zhang, De-Sheng Wang, Hang Li

It can naturally represent the content information of objects as well as the relation information between objects, necessary for global ranking.

Information Retrieval Learning-To-Rank +1

A New Probabilistic Model for Rank Aggregation

no code implementations NeurIPS 2010 Tao Qin, Xiubo Geng, Tie-Yan Liu

To avoid these limitations, in this paper, we propose a new model, which is defined with a coset-permutation distance, and models the generation of a permutation as a stagewise process.

Pure Price of Anarchy for Generalized Second Price Auction

no code implementations23 May 2013 Wenkui Ding, Tao Wu, Tao Qin, Tie-Yan Liu

Previous studies have shown that the pure Price Of Anarchy (POA) of GSP is 1.25 when there are two ad slots and 1.259 when there are three ad slots.

Computer Science and Game Theory

Introducing LETOR 4.0 Datasets

3 code implementations9 Jun 2013 Tao Qin, Tie-Yan Liu

We call the two query sets MQ2007 and MQ2008 for short.

Learning-To-Rank

Estimation Bias in Multi-Armed Bandit Algorithms for Search Advertising

no code implementations NeurIPS 2013 Min Xu, Tao Qin, Tie-Yan Liu

In search advertising, the search engine needs to select the most profitable advertisements to display, which can be formulated as an instance of online learning with partial feedback, also known as the stochastic multi-armed bandit (MAB) problem.

Selection bias

Agent Behavior Prediction and Its Generalization Analysis

no code implementations19 Apr 2014 Fei Tian, Haifang Li, Wei Chen, Tao Qin, Enhong Chen, Tie-Yan Liu

Then we prove a generalization bound for the machine learning algorithms on the behavior data generated by the new Markov chain, which depends on both the Markovian parameters and the covering number of the function class compounded by the loss function for behavior prediction and the behavior prediction model.

BIG-bench Machine Learning

Generalization Analysis for Game-Theoretic Machine Learning

no code implementations9 Oct 2014 Haifang Li, Fei Tian, Wei Chen, Tao Qin, Tie-Yan Liu

For Internet applications like sponsored search, caution needs to be taken when using machine learning to optimize their mechanisms (e.g., auction), since self-interested agents in these applications may change their behaviors (and thus the data distribution) in response to the mechanisms.

BIG-bench Machine Learning

Thompson Sampling for Budgeted Multi-armed Bandits

no code implementations1 May 2015 Yingce Xia, Haifang Li, Tao Qin, Nenghai Yu, Tie-Yan Liu

In this paper, we extend Thompson sampling to the Budgeted MAB, where there is a random cost for pulling an arm and the total cost is constrained by a budget.
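
The budgeted setting is easy to picture with a small sketch. The snippet below assumes Bernoulli rewards and costs with Beta(1, 1) posteriors, and pulls the arm with the highest sampled reward-to-cost ratio; the arm statistics and the ratio rule are illustrative simplifications, not the paper's exact algorithm.

```python
import random

# Minimal sketch of Thompson sampling for budgeted bandits, assuming
# Bernoulli rewards and costs with Beta(1, 1) priors. The arm statistics
# and the sampled reward-to-cost ratio rule are illustrative only.
true_reward = [0.8, 0.5, 0.3]
true_cost = [0.9, 0.4, 0.2]
K = len(true_reward)
reward_ab = [[1.0, 1.0] for _ in range(K)]  # Beta posterior params (reward)
cost_ab = [[1.0, 1.0] for _ in range(K)]    # Beta posterior params (cost)

budget, spent, total_reward = 200.0, 0.0, 0.0
while spent < budget:
    # Sample plausible reward and cost rates from the posteriors and pull
    # the arm with the best sampled reward-to-cost ratio.
    ratios = [random.betavariate(*reward_ab[i]) /
              max(random.betavariate(*cost_ab[i]), 1e-6) for i in range(K)]
    i = max(range(K), key=lambda a: ratios[a])
    r = float(random.random() < true_reward[i])
    c = float(random.random() < true_cost[i])
    reward_ab[i][0] += r; reward_ab[i][1] += 1.0 - r
    cost_ab[i][0] += c; cost_ab[i][1] += 1.0 - c
    spent += c
    total_reward += r

print(f"reward collected under budget {budget:.0f}: {total_reward:.0f}")
```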

Multi-Armed Bandits Thompson Sampling

LightRNN: Memory and Computation-Efficient Recurrent Neural Networks

no code implementations NeurIPS 2016 Xiang Li, Tao Qin, Jian Yang, Tie-Yan Liu

Based on the 2-Component shared embedding, we design a new RNN algorithm and evaluate it using the language modeling task on several benchmark datasets.
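
To make the 2-Component shared embedding concrete, here is a minimal sketch: each word sits in a roughly sqrt(V)-by-sqrt(V) table, and its vector is composed from a shared row vector and a shared column vector, so the table stores 2*sqrt(V) vectors instead of V. The sequential placement and additive composition are assumptions for illustration; the paper learns the table allocation.

```python
import numpy as np

# Toy sketch of a 2-component shared embedding: a vocabulary of V words
# arranged in a sqrt(V) x sqrt(V) table, each word's vector composed from
# its row vector and its column vector.
V, dim = 10000, 64
side = int(np.ceil(np.sqrt(V)))             # table side length (100 here)
row_emb = np.random.randn(side, dim) * 0.1  # shared row embeddings
col_emb = np.random.randn(side, dim) * 0.1  # shared column embeddings

def embed(word_id: int) -> np.ndarray:
    """Compose a word vector from its (row, column) slot in the table."""
    r, c = divmod(word_id, side)
    return row_emb[r] + col_emb[c]

# 2 * 100 * 64 parameters instead of 10000 * 64: a ~50x reduction.
print(embed(4242).shape, 2 * side * dim, "vs", V * dim)
```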

Language Modelling Machine Translation

Dual Learning for Machine Translation

1 code implementation NeurIPS 2016 Yingce Xia, Di He, Tao Qin, Li-Wei Wang, Nenghai Yu, Tie-Yan Liu, Wei-Ying Ma

Based on the feedback signals generated during this process (e.g., the language-model likelihood of the output of a model, and the reconstruction error of the original sentence after the primal and dual translations), we can iteratively update the two models until convergence (e.g., using policy gradient methods).
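
A schematic sketch of this feedback loop follows; `translate_ab`, `translate_ba`, and `lm_score_b` are hypothetical stubs standing in for the two translation models and the target-side language model, and the reward is simply returned rather than driving a policy-gradient update as in the paper.

```python
# Schematic sketch of the dual-learning feedback loop with stand-in
# components, not trained models.
def translate_ab(sentence):            # primal model stub (A -> B)
    return sentence[::-1]

def translate_ba(sentence):            # dual model stub (B -> A)
    return sentence[::-1]

def lm_score_b(sentence):              # language-model log-likelihood stub
    return -0.1 * len(sentence)

def dual_step(x, alpha=0.5):
    """One round trip x -> y -> x' producing the two feedback signals."""
    y = translate_ab(x)                # primal translation
    x_rec = translate_ba(y)            # dual (back-)translation
    r_lm = lm_score_b(y)               # fluency of the intermediate output
    r_rec = -sum(a != b for a, b in zip(x, x_rec))  # reconstruction error
    return alpha * r_lm + (1 - alpha) * r_rec

print(dual_step(["the", "cat", "sat"]))
```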

Language Modelling Machine Translation +4

Randomized Mechanisms for Selling Reserved Instances in Cloud

no code implementations22 Nov 2016 Jia Zhang, Weidong Ma, Tao Qin, Xiaoming Sun, Tie-Yan Liu

We then extend our mechanism to the general case and achieve a competitive ratio $\frac{1}{42\log k\log T}$ for both social welfare and revenue, where $T$ is the ratio of the maximum request length to the minimum request length and $k$ is the ratio of the maximum request value density to the minimum request value density.

Cloud Computing

Adversarial Neural Machine Translation

no code implementations20 Apr 2017 Lijun Wu, Yingce Xia, Li Zhao, Fei Tian, Tao Qin, Jian-Huang Lai, Tie-Yan Liu

The goal of the adversary is to differentiate the translation result generated by the NMT model from that by human.

Machine Translation NMT +1

Reinforcement Learning for Learning Rate Control

no code implementations31 May 2017 Chang Xu, Tao Qin, Gang Wang, Tie-Yan Liu

Stochastic gradient descent (SGD), which updates the model parameters by adding a local gradient times a learning rate at each step, is widely used in model training of machine learning algorithms such as neural networks.
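
The interface studied here can be pictured on a toy problem: at each step a controller observes training signals and emits a learning rate, and SGD applies it. The hand-written decay rule below is a placeholder for the learned RL policy, and the quadratic objective is an assumption for illustration.

```python
import numpy as np

# Toy sketch of learning-rate control on f(w) = ||w||^2 / 2, whose
# gradient is simply w. `controller` is a stand-in for the RL policy.
def controller(step: int, loss: float, grad_norm: float) -> float:
    """Stand-in policy: decay the rate, shrink it when gradients are large."""
    return 0.5 / (1 + 0.1 * step) / (1 + grad_norm)

w = np.array([3.0, -2.0])
for step in range(20):
    grad = w                          # gradient of ||w||^2 / 2
    loss = 0.5 * float(w @ w)
    lr = controller(step, loss, float(np.linalg.norm(grad)))
    w = w - lr * grad                 # the per-step SGD update
print("final loss:", 0.5 * float(w @ w))
```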

reinforcement-learning Reinforcement Learning (RL)

Question Answering and Question Generation as Dual Tasks

no code implementations7 Jun 2017 Duyu Tang, Nan Duan, Tao Qin, Zhao Yan, Ming Zhou

On one side, the QA model judges whether the generated question of a QG model is relevant to the answer.

Question Answering Question Generation +1

Dual Supervised Learning

1 code implementation ICML 2017 Yingce Xia, Tao Qin, Wei Chen, Jiang Bian, Nenghai Yu, Tie-Yan Liu

Many supervised learning tasks emerge in dual forms, e.g., English-to-French translation vs. French-to-English translation, speech recognition vs. text to speech, and image classification vs. image generation.

General Classification Image Classification +6

Decoding with Value Networks for Neural Machine Translation

no code implementations NeurIPS 2017 Di He, Hanqing Lu, Yingce Xia, Tao Qin, Li-Wei Wang, Tie-Yan Liu

Inspired by the success and methodology of AlphaGo, in this paper we propose using a prediction network to improve beam search, which takes the source sentence $x$, the currently available decoding output $y_1,\cdots, y_{t-1}$ and a candidate word $w$ at step $t$ as inputs and predicts the long-term value (e.g., BLEU score) of the partial target sentence if it is completed by the NMT model.
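
The scoring rule can be sketched as a beam search that ranks partial hypotheses by a mix of model log-probability and a value estimate. `next_token_logprobs` and `value_net` below are toy stand-ins for the NMT model and the trained prediction network, and the mixing weight `lam` is an assumption.

```python
import heapq
import math

# Sketch of beam search that ranks hypotheses by a mix of model
# log-probability and a value-network estimate.
VOCAB = ["a", "b", "</s>"]

def next_token_logprobs(prefix):
    return {w: math.log(p) for w, p in zip(VOCAB, [0.5, 0.3, 0.2])}

def value_net(prefix):
    # Hypothetical long-term value (e.g., predicted BLEU) of the prefix.
    return -0.05 * len(prefix)

def beam_search(beam_size=2, max_len=5, lam=0.5):
    beams = [(0.0, [])]                      # (cumulative log-prob, tokens)
    for _ in range(max_len):
        candidates = []
        for logp, seq in beams:
            if seq and seq[-1] == "</s>":    # keep finished hypotheses
                candidates.append((logp, seq))
                continue
            for w, lw in next_token_logprobs(seq).items():
                candidates.append((logp + lw, seq + [w]))
        beams = heapq.nlargest(              # rank by the mixed score
            beam_size, candidates,
            key=lambda c: (1 - lam) * c[0] + lam * value_net(c[1]))
    return beams

print(beam_search())
```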

Machine Translation NMT +2

Deliberation Networks: Sequence Generation Beyond One-Pass Decoding

no code implementations NeurIPS 2017 Yingce Xia, Fei Tian, Lijun Wu, Jianxin Lin, Tao Qin, Nenghai Yu, Tie-Yan Liu

In this work, we introduce the deliberation process into the encoder-decoder framework and propose deliberation networks for sequence generation.

Image Captioning Machine Translation +3

Conditional Image-to-Image Translation

no code implementations CVPR 2018 Jianxin Lin, Yingce Xia, Tao Qin, Zhibo Chen, Tie-Yan Liu

In this paper, we study a new problem, conditional image-to-image translation, which is to translate an image from the source domain to the target domain conditioned on a given image in the target domain.

Image-to-Image Translation Translation

Learning to Teach

no code implementations ICLR 2018 Yang Fan, Fei Tian, Tao Qin, Xiang-Yang Li, Tie-Yan Liu

Teaching plays a very important role in our society, by spreading human knowledge and educating our next generations.

BIG-bench Machine Learning Image Classification

Efficient Sequence Learning with Group Recurrent Networks

no code implementations NAACL 2018 Fei Gao, Lijun Wu, Li Zhao, Tao Qin, Xue-Qi Cheng, Tie-Yan Liu

Recurrent neural networks have achieved state-of-the-art results in many artificial intelligence tasks, such as language modeling, neural machine translation, speech recognition and so on.

Language Modelling Machine Translation +3

Dense Information Flow for Neural Machine Translation

1 code implementation NAACL 2018 Yanyao Shen, Xu Tan, Di He, Tao Qin, Tie-Yan Liu

Recently, neural machine translation has achieved remarkable progress by introducing well-designed deep neural networks into its encoder-decoder framework.

Machine Translation NMT +1

Towards Binary-Valued Gates for Robust LSTM Training

1 code implementation ICML 2018 Zhuohan Li, Di He, Fei Tian, Wei Chen, Tao Qin, Li-Wei Wang, Tie-Yan Liu

Long Short-Term Memory (LSTM) is one of the most widely used recurrent structures in sequence modeling.

Double Path Networks for Sequence to Sequence Learning

1 code implementation COLING 2018 Kaitao Song, Xu Tan, Di He, Jianfeng Lu, Tao Qin, Tie-Yan Liu

In this work we propose Double Path Networks for Sequence to Sequence learning (DPN-S2S), which leverage the advantages of both models by using double path information fusion.

Model-Level Dual Learning

no code implementations ICML 2018 Yingce Xia, Xu Tan, Fei Tian, Tao Qin, Nenghai Yu, Tie-Yan Liu

Many artificial intelligence tasks appear in dual forms like English$\leftrightarrow$French translation and speech$\leftrightarrow$text transformation.

Machine Translation Sentiment Analysis +1

Neural Architecture Optimization

5 code implementations NeurIPS 2018 Renqian Luo, Fei Tian, Tao Qin, Enhong Chen, Tie-Yan Liu

The performance predictor and the encoder enable us to perform gradient based optimization in the continuous space to find the embedding of a new architecture with potentially better accuracy.
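
The search step can be sketched with a linear surrogate for the performance predictor: encode an architecture into a continuous embedding, take a few gradient-ascent steps on the predicted accuracy, and (in the full method) decode the result back into a discrete architecture. The hash-based encoder and linear predictor here are stand-ins, not the paper's LSTM encoder/decoder.

```python
import numpy as np

# Sketch of gradient-based search in a continuous architecture space.
# f(z) = w . z stands in for the trained performance predictor.
dim = 8
rng = np.random.default_rng(0)
w = rng.normal(size=dim)                  # surrogate predictor weights

def encode(arch_tokens):
    """Toy encoder: hash operator tokens into a vector in R^dim."""
    z = np.zeros(dim)
    for i, t in enumerate(arch_tokens):
        z[i % dim] += (hash(t) % 7 - 3) / 10.0
    return z

def predict_accuracy(z):
    return float(w @ z)                   # surrogate accuracy f(z)

z = encode(["conv3x3", "sep5x5", "skip", "maxpool"])
before = predict_accuracy(z)
for _ in range(10):
    z = z + 0.1 * w                       # gradient of f(z) is w: ascend
print(f"predicted accuracy: {before:.3f} -> {predict_accuracy(z):.3f}")
# A decoder would then map the improved embedding back to an architecture.
```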

Evolutionary Algorithms General Classification +3

A Study of Reinforcement Learning for Neural Machine Translation

1 code implementation EMNLP 2018 Lijun Wu, Fei Tian, Tao Qin, Jian-Huang Lai, Tie-Yan Liu

Recent studies have shown that reinforcement learning (RL) is an effective approach for improving the performance of neural machine translation (NMT) system.

Machine Translation NMT +3

Beyond Error Propagation in Neural Machine Translation: Characteristics of Language Also Matter

no code implementations EMNLP 2018 Lijun Wu, Xu Tan, Di He, Fei Tian, Tao Qin, Jian-Huang Lai, Tie-Yan Liu

Many previous works have discussed the relationship between error propagation and the accuracy drop problem (i.e., the left part of the translated sentence is often better than its right part in left-to-right decoding models).

Machine Translation Sentence +2

FRAGE: Frequency-Agnostic Word Representation

2 code implementations NeurIPS 2018 Chengyue Gong, Di He, Xu Tan, Tao Qin, Li-Wei Wang, Tie-Yan Liu

Continuous word representation (aka word embedding) is a basic building block in many neural network-based models used in natural language processing tasks.

Language Modelling Machine Translation +5

Learning to Teach with Dynamic Loss Functions

no code implementations NeurIPS 2018 Lijun Wu, Fei Tian, Yingce Xia, Yang Fan, Tao Qin, Jian-Huang Lai, Tie-Yan Liu

Different from typical learning settings in which the loss function of a machine learning model is predefined and fixed, in our framework, the loss function of a machine learning model (we call it student) is defined by another machine learning model (we call it teacher).

BIG-bench Machine Learning Image Classification +1

Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input

no code implementations23 Dec 2018 Junliang Guo, Xu Tan, Di He, Tao Qin, Linli Xu, Tie-Yan Liu

Non-autoregressive translation (NAT) models, which remove the dependence on previous target tokens from the inputs of the decoder, achieve significant inference speedup but at the cost of inferior accuracy compared to autoregressive translation (AT) models.

Machine Translation Sentence +2

Non-Autoregressive Machine Translation with Auxiliary Regularization

no code implementations22 Feb 2019 Yiren Wang, Fei Tian, Di He, Tao Qin, ChengXiang Zhai, Tie-Yan Liu

However, the high efficiency has come at the cost of not capturing the sequential dependency on the target side of translation, which causes NAT to suffer from two kinds of translation errors: 1) repeated translations (due to indistinguishable adjacent decoder hidden states), and 2) incomplete translations (due to incomplete transfer of source side information via the decoder hidden states).

Machine Translation Sentence +1

Multilingual Neural Machine Translation with Knowledge Distillation

1 code implementation ICLR 2019 Xu Tan, Yi Ren, Di He, Tao Qin, Zhou Zhao, Tie-Yan Liu

Multilingual machine translation, which translates multiple languages with a single model, has attracted much attention due to its efficiency of offline training and online serving.

Knowledge Distillation Machine Translation +1

Competitive Bridge Bidding with Deep Neural Networks

no code implementations3 Mar 2019 Jiang Rong, Tao Qin, Bo An

Second, based on the analysis of the impact of other players' unknown cards on one's final rewards, we design two neural networks to deal with imperfect information: the first infers the cards of the partner, and the second takes the outputs of the first as part of its input to select a bid.

Multi-Agent Dual Learning

no code implementations ICLR 2019 Yiren Wang, Yingce Xia, Tianyu He, Fei Tian, Tao Qin, ChengXiang Zhai, Tie-Yan Liu

Dual learning has attracted much attention in machine learning, computer vision and natural language processing communities.

Machine Translation Translation

Dual Learning: Theoretical Study and Algorithmic Extensions

no code implementations ICLR 2019 Zhibing Zhao, Yingce Xia, Tao Qin, Tie-Yan Liu

Based on the theoretical discoveries, we extend dual learning by introducing more related mappings and propose highly symmetric frameworks, cycle dual learning and multipath dual learning, in both of which we can leverage the feedback signals from additional domains to improve the qualities of the mappings.

Machine Translation Translation

Hint-based Training for Non-Autoregressive Translation

no code implementations ICLR 2019 Zhuohan Li, Di He, Fei Tian, Tao Qin, Li-Wei Wang, Tie-Yan Liu

To improve the accuracy of NART models, in this paper, we propose to leverage the hints from a well-trained ART model to train the NART model.

Machine Translation Translation

MASS: Masked Sequence to Sequence Pre-training for Language Generation

7 code implementations7 May 2019 Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu

Pre-training and fine-tuning, e.g., BERT, have achieved great success in language understanding by transferring knowledge from a rich-resource pre-training task to low/zero-resource downstream tasks.

Conversational Response Generation Response Generation +5

Almost Unsupervised Text to Speech and Automatic Speech Recognition

no code implementations13 May 2019 Yi Ren, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu

Text to speech (TTS) and automatic speech recognition (ASR) are two dual tasks in speech processing, and both achieve impressive performance thanks to recent advances in deep learning and large amounts of aligned speech and text data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

FastSpeech: Fast, Robust and Controllable Text to Speech

21 code implementations NeurIPS 2019 Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu

In this work, we propose a novel feed-forward network based on Transformer to generate mel-spectrogram in parallel for TTS.

Ranked #10 on Text-To-Speech Synthesis on LJSpeech (using extra training data)

Speech Synthesis Text-To-Speech Synthesis

FastSpeech: Fast, Robust and Controllable Text to Speech

11 code implementations22 May 2019 Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu

Compared with traditional concatenative and statistical parametric approaches, neural network based end-to-end models suffer from slow inference speed, and the synthesized speech is usually not robust (i.e., some words are skipped or repeated) and lacks controllability (voice speed or prosody control).

Text-To-Speech Synthesis

Soft Contextual Data Augmentation for Neural Machine Translation

1 code implementation ACL 2019 Jinhua Zhu, Fei Gao, Lijun Wu, Yingce Xia, Tao Qin, Wengang Zhou, Xue-Qi Cheng, Tie-Yan Liu

While data augmentation is an important trick to boost the accuracy of deep learning methods in computer vision tasks, its study in natural language tasks is still very limited.

Data Augmentation Language Modelling +3

Image-to-Image Translation with Multi-Path Consistency Regularization

no code implementations29 May 2019 Jianxin Lin, Yingce Xia, Yijun Wang, Tao Qin, Zhibo Chen

In this work, we introduce a new kind of loss, multi-path consistency loss, which evaluates the differences between direct translation $\mathcal{D}_s\to\mathcal{D}_t$ and indirect translation $\mathcal{D}_s\to\mathcal{D}_a\to\mathcal{D}_t$ with $\mathcal{D}_a$ as an auxiliary domain, to regularize training.
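
The loss itself is simple to sketch: translate directly and via the auxiliary domain, then penalize the gap between the two results. The affine maps below are toy stand-ins for the three translators, and the L1 distance is an illustrative choice.

```python
import numpy as np

# Sketch of the multi-path consistency loss on toy "images" (arrays).
# G_st, G_sa, G_at stand in for source->target, source->auxiliary, and
# auxiliary->target translators.
def G_st(x): return x * 0.9 + 0.1
def G_sa(x): return x * 0.8 + 0.2
def G_at(x): return x * 1.1 - 0.05

def multipath_consistency_loss(x):
    direct = G_st(x)            # D_s -> D_t
    indirect = G_at(G_sa(x))    # D_s -> D_a -> D_t
    return float(np.mean(np.abs(direct - indirect)))  # L1 gap regularizer

x = np.random.rand(4, 4)
print("consistency loss:", multipath_consistency_loss(x))
```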

Face to Face Translation Image-to-Image Translation +1

Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View

2 code implementations ICLR 2020 Yiping Lu, Zhuohan Li, Di He, Zhiqing Sun, Bin Dong, Tao Qin, Li-Wei Wang, Tie-Yan Liu

In this paper, we provide a novel perspective towards understanding the architecture: we show that the Transformer can be mathematically interpreted as a numerical Ordinary Differential Equation (ODE) solver for a convection-diffusion equation in a multi-particle dynamic system.

Position Sentence

Unsupervised Pivot Translation for Distant Languages

no code implementations ACL 2019 Yichong Leng, Xu Tan, Tao Qin, Xiang-Yang Li, Tie-Yan Liu

In this work, we introduce unsupervised pivot translation for distant languages, which translates a language to a distant language through multiple hops, and the unsupervised translation on each hop is relatively easier than the original direct translation.

Machine Translation NMT +1

What and Where to Translate: Local Mask-based Image-to-Image Translation

no code implementations9 Jun 2019 Wonwoong Cho, Seunghwan Choi, Junwoo Park, David Keetae Park, Tao Qin, Jaegul Choo

First, those methods extract style from an entire exemplar which includes noisy information, which impedes a translation model from properly extracting the intended style of the exemplar.

Image-to-Image Translation Translation

Depth Growing for Neural Machine Translation

1 code implementation ACL 2019 Lijun Wu, Yiren Wang, Yingce Xia, Fei Tian, Fei Gao, Tao Qin, Jian-Huang Lai, Tie-Yan Liu

While very deep neural networks have shown effectiveness for computer vision and text classification applications, how to increase the network depth of neural machine translation (NMT) models for better translation quality remains a challenging problem.

Machine Translation NMT +3

Representation Degeneration Problem in Training Natural Language Generation Models

1 code implementation ICLR 2019 Jun Gao, Di He, Xu Tan, Tao Qin, Li-Wei Wang, Tie-Yan Liu

We study an interesting problem in training neural network-based models for natural language generation tasks, which we call the representation degeneration problem.

Language Modelling Machine Translation +3

Language Graph Distillation for Low-Resource Machine Translation

no code implementations17 Aug 2019 Tianyu He, Jiale Chen, Xu Tan, Tao Qin

Neural machine translation on low-resource language is challenging due to the lack of bilingual sentence pairs.

Knowledge Distillation Machine Translation +3

Multilingual Neural Machine Translation with Language Clustering

no code implementations IJCNLP 2019 Xu Tan, Jiale Chen, Di He, Yingce Xia, Tao Qin, Tie-Yan Liu

We study two methods for language clustering: (1) using prior knowledge, where we cluster languages according to language family, and (2) using language embedding, in which we represent each language by an embedding vector and cluster them in the embedding space.

Clustering Machine Translation +2

Efficient Bidirectional Neural Machine Translation

no code implementations25 Aug 2019 Xu Tan, Yingce Xia, Lijun Wu, Tao Qin

In this paper, we propose an efficient method to generate a sequence in both left-to-right and right-to-left manners using a single encoder and decoder, combining the advantages of both generation directions.

Machine Translation Translation

Hint-Based Training for Non-Autoregressive Machine Translation

1 code implementation IJCNLP 2019 Zhuohan Li, Zi Lin, Di He, Fei Tian, Tao Qin, Li-Wei Wang, Tie-Yan Liu

Due to the unparallelizable nature of the autoregressive factorization, AutoRegressive Translation (ART) models have to generate tokens sequentially during decoding and thus suffer from high inference latency.

Machine Translation Translation

Balanced One-shot Neural Architecture Optimization

1 code implementation24 Sep 2019 Renqian Luo, Tao Qin, Enhong Chen

One-shot NAS was proposed to reduce the expense but shows inferior performance compared to conventional NAS and is not adequately stable.

Neural Architecture Search

Independence-aware Advantage Estimation

no code implementations25 Sep 2019 Pushi Zhang, Li Zhao, Guoqing Liu, Jiang Bian, Minglie Huang, Tao Qin, Tie-Yan Liu

Most of existing advantage function estimation methods in reinforcement learning suffer from the problem of high variance, which scales unfavorably with the time horizon.

Demonstration Actor Critic

no code implementations25 Sep 2019 Guoqing Liu, Li Zhao, Pushi Zhang, Jiang Bian, Tao Qin, Nenghai Yu, Tie-Yan Liu

One approach leverages demonstration data in a supervised manner, which is simple and direct, but can only provide supervision signal over those states seen in the demonstrations.

Exploiting Monolingual Data at Scale for Neural Machine Translation

no code implementations IJCNLP 2019 Lijun Wu, Yiren Wang, Yingce Xia, Tao Qin, Jian-Huang Lai, Tie-Yan Liu

In this work, we study how to use both the source-side and target-side monolingual data for NMT, and propose an effective strategy leveraging both of them.

 Ranked #1 on Machine Translation on WMT2016 English-German (SacreBLEU metric, using extra training data)

Machine Translation NMT +1

Machine Translation With Weakly Paired Documents

no code implementations IJCNLP 2019 Lijun Wu, Jinhua Zhu, Di He, Fei Gao, Tao Qin, Jian-Huang Lai, Tie-Yan Liu

1) We provide a simple approach to mine implicitly bilingual sentence pairs from document pairs which can then be used as supervised training signals.

Sentence Translation +1

Fully Parameterized Quantile Function for Distributional Reinforcement Learning

6 code implementations NeurIPS 2019 Derek Yang, Li Zhao, Zichuan Lin, Tao Qin, Jiang Bian, Tie-Yan Liu

The key challenge in practical distributional RL algorithms lies in how to parameterize estimated distributions so as to better approximate the true continuous distribution.
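
One core ingredient can be sketched as the quantile (Huber) regression loss that fits parameterized quantile values to sampled returns; the fraction-proposal network the paper uses to choose the quantile fractions is omitted here, and the shapes and sample distribution below are illustrative assumptions.

```python
import numpy as np

# Sketch of the quantile Huber loss used to fit parameterized return
# quantiles in distributional RL.
def quantile_huber_loss(pred, target, tau, kappa=1.0):
    """Asymmetric Huber loss: under-/over-estimation weighted by tau."""
    u = target - pred                        # broadcast to (n_samples, n_tau)
    huber = np.where(np.abs(u) <= kappa,
                     0.5 * u ** 2,
                     kappa * (np.abs(u) - 0.5 * kappa))
    return float(np.mean(np.abs(tau - (u < 0)) * huber / kappa))

taus = np.array([0.1, 0.5, 0.9])                      # quantile fractions
pred = np.array([-1.0, 0.0, 1.0])                     # current quantile estimates
samples = np.random.normal(0.0, 1.0, size=(256, 1))   # sampled returns
print("quantile huber loss:", quantile_huber_loss(pred, samples, taus))
```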

Ranked #3 on Atari Games on Atari 2600 Skiing (using extra training data)

Atari Games Distributional Reinforcement Learning +2

Distributional Reward Decomposition for Reinforcement Learning

no code implementations NeurIPS 2019 Zichuan Lin, Li Zhao, Derek Yang, Tao Qin, Guangwen Yang, Tie-Yan Liu

Many reinforcement learning (RL) tasks have specific properties that can be leveraged to modify existing RL algorithms to adapt to those tasks and further improve performance, and a general class of such properties is the multiple reward channel.

reinforcement-learning Reinforcement Learning (RL)

Fine-Tuning by Curriculum Learning for Non-Autoregressive Neural Machine Translation

2 code implementations20 Nov 2019 Junliang Guo, Xu Tan, Linli Xu, Tao Qin, Enhong Chen, Tie-Yan Liu

Non-autoregressive translation (NAT) models remove the dependence on previous target tokens and generate all target tokens in parallel, resulting in significant inference speedup but at the cost of inferior translation accuracy compared to autoregressive translation (AT) models.

Machine Translation Translation

Neural Machine Translation with Soft Prototype

1 code implementation NeurIPS 2019 Yiren Wang, Yingce Xia, Fei Tian, Fei Gao, Tao Qin, Cheng Xiang Zhai, Tie-Yan Liu

Neural machine translation models usually use the encoder-decoder framework and generate translation from left to right (or right to left) without fully utilizing the target-side global information.

Machine Translation Translation

Normalization Helps Training of Quantized LSTM

1 code implementation NeurIPS 2019 Lu Hou, Jinhua Zhu, James Kwok, Fei Gao, Tao Qin, Tie-Yan Liu

The long short-term memory (LSTM), though powerful, is memory- and computation-expensive.

Quantization

A Study of Multilingual Neural Machine Translation

no code implementations25 Dec 2019 Xu Tan, Yichong Leng, Jiale Chen, Yi Ren, Tao Qin, Tie-Yan Liu

Multilingual neural machine translation (NMT) has recently been investigated from different aspects (e.g., pivot translation, zero-shot translation, fine-tuning, or training from scratch) and in different settings (e.g., rich resource and low resource, one-to-many, and many-to-one translation).

Machine Translation NMT +1

Incorporating BERT into Neural Machine Translation

3 code implementations ICLR 2020 Jinhua Zhu, Yingce Xia, Lijun Wu, Di He, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu

While BERT is more commonly used for fine-tuning instead of contextual embedding for downstream language understanding tasks, in NMT, our preliminary exploration shows that using BERT as contextual embedding is better than using it for fine-tuning.

Natural Language Understanding NMT +5

Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View

no code implementations ICLR Workshop DeepDiffEq 2019 Yiping Lu*, Zhuohan Li*, Di He, Zhiqing Sun, Bin Dong, Tao Qin, LiWei Wang, Tie-Yan Liu

In particular, how words in a sentence are abstracted into contexts by passing through the layers of the Transformer can be interpreted as approximating multiple particles' movement in the space using the Lie-Trotter splitting scheme and Euler's method.

Sentence

Suphx: Mastering Mahjong with Deep Reinforcement Learning

no code implementations30 Mar 2020 Junjie Li, Sotetsu Koyamada, Qiwei Ye, Guoqing Liu, Chao Wang, Ruihan Yang, Li Zhao, Tao Qin, Tie-Yan Liu, Hsiao-Wuen Hon

Artificial Intelligence (AI) has achieved great success in many domains, and game AI is widely regarded as its beachhead since the dawn of AI.

reinforcement-learning Reinforcement Learning (RL)

MPNet: Masked and Permuted Pre-training for Language Understanding

6 code implementations NeurIPS 2020 Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu

Since BERT neglects dependency among predicted tokens, XLNet introduces permuted language modeling (PLM) for pre-training to address this problem.

Ranked #16 on Only Connect Walls Dataset Task 1 (Grouping) on OCW (using extra training data)

Language Modelling Masked Language Modeling +3

LightPAFF: A Two-Stage Distillation Framework for Pre-training and Fine-tuning

no code implementations27 Apr 2020 Kaitao Song, Hao Sun, Xu Tan, Tao Qin, Jianfeng Lu, Hongzhi Liu, Tie-Yan Liu

While pre-training and fine-tuning, e.g., BERT, GPT-2, have achieved great success in language understanding and generation tasks, the pre-trained models are usually too big for online deployment in terms of both memory cost and inference speed, which hinders them from practical online usage.

Knowledge Distillation Language Modelling

Dual Learning: Theoretical Study and an Algorithmic Extension

no code implementations17 May 2020 Zhibing Zhao, Yingce Xia, Tao Qin, Lirong Xia, Tie-Yan Liu

Dual learning has been successfully applied in many machine learning applications including machine translation, image-to-image transformation, etc.

Machine Translation Translation

MultiSpeech: Multi-Speaker Text to Speech with Transformer

1 code implementation8 Jun 2020 Mingjian Chen, Xu Tan, Yi Ren, Jin Xu, Hao Sun, Sheng Zhao, Tao Qin, Tie-Yan Liu

Transformer-based text to speech (TTS) models (e.g., Transformer TTS, FastSpeech) have shown the advantages of training and inference efficiency over RNN-based models (e.g., Tacotron) due to their parallel computation in training and/or inference.

FastSpeech 2: Fast and High-Quality End-to-End Text to Speech

32 code implementations ICLR 2021 Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu

In this paper, we propose FastSpeech 2, which addresses the issues in FastSpeech and better solves the one-to-many mapping problem in TTS by 1) directly training the model with the ground-truth target instead of the simplified output from the teacher, and 2) introducing more variation information of speech (e.g., pitch, energy and more accurate duration) as conditional inputs.
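
Point 2) can be sketched as follows: predicted pitch and energy are quantized into bins and looked up as embeddings that are added to the hidden states. Bin counts, value ranges, and shapes below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

# Toy sketch of conditioning on variance information: predicted pitch and
# energy are quantized into bins, and the corresponding embeddings are
# added to the hidden states.
T, dim, n_bins = 6, 16, 32
rng = np.random.default_rng(1)
hidden = rng.normal(size=(T, dim))              # hidden states
pitch_emb = rng.normal(size=(n_bins, dim)) * 0.1
energy_emb = rng.normal(size=(n_bins, dim)) * 0.1

def quantize(x, lo, hi, n):
    idx = ((x - lo) / (hi - lo) * (n - 1)).astype(int)
    return np.clip(idx, 0, n - 1)

pitch = rng.uniform(80, 400, size=T)            # predicted F0 (Hz)
energy = rng.uniform(0, 1, size=T)              # predicted energy
conditioned = (hidden
               + pitch_emb[quantize(pitch, 80, 400, n_bins)]
               + energy_emb[quantize(energy, 0, 1, n_bins)])
print(conditioned.shape)
```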

Ranked #6 on Text-To-Speech Synthesis on LJSpeech (using extra training data)

Knowledge Distillation Speech Synthesis +1

UWSpeech: Speech to Speech Translation for Unwritten Languages

no code implementations14 Jun 2020 Chen Zhang, Xu Tan, Yi Ren, Tao Qin, Ke-jun Zhang, Tie-Yan Liu

Existing speech to speech translation systems heavily rely on the text of target language: they usually translate source language either to target text and then synthesize target speech from text, or directly to target speech with target text for auxiliary training.

speech-recognition Speech Recognition +2

Multi-branch Attentive Transformer

1 code implementation18 Jun 2020 Yang Fan, Shufang Xie, Yingce Xia, Lijun Wu, Tao Qin, Xiang-Yang Li, Tie-Yan Liu

While the multi-branch architecture is one of the key ingredients to the success of computer vision tasks, it has not been well investigated in natural language processing, especially sequence learning tasks.

Code Generation Machine Translation +2

SimulSpeech: End-to-End Simultaneous Speech to Text Translation

no code implementations ACL 2020 Yi Ren, Jinglin Liu, Xu Tan, Chen Zhang, Tao Qin, Zhou Zhao, Tie-Yan Liu

In this work, we develop SimulSpeech, an end-to-end simultaneous speech to text translation system which translates speech in source language to text in target language concurrently.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +7

DeepSinger: Singing Voice Synthesis with Data Mined From the Web

no code implementations9 Jul 2020 Yi Ren, Xu Tan, Tao Qin, Jian Luan, Zhou Zhao, Tie-Yan Liu

DeepSinger has several advantages over previous SVS systems: 1) to the best of our knowledge, it is the first SVS system that directly mines training data from music websites, 2) the lyrics-to-singing alignment model further avoids any human efforts for alignment labeling and greatly reduces labeling cost, 3) the singing model based on a feed-forward Transformer is simple and efficient, by removing the complicated acoustic feature modeling in parametric synthesis and leveraging a reference encoder to capture the timbre of a singer from noisy singing data, and 4) it can synthesize singing voices in multiple languages and multiple singers.

Sentence Singing Voice Synthesis

Accuracy Prediction with Non-neural Model for Neural Architecture Search

1 code implementation9 Jul 2020 Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Enhong Chen, Tie-Yan Liu

Considering that most architectures are represented as sequences of discrete symbols which are more like tabular data and preferred by non-neural predictors, in this paper, we study an alternative approach which uses non-neural model for accuracy prediction.

Neural Architecture Search

Learning to Reweight with Deep Interactions

no code implementations9 Jul 2020 Yang Fan, Yingce Xia, Lijun Wu, Shufang Xie, Weiqing Liu, Jiang Bian, Tao Qin, Xiang-Yang Li

Recently, the concept of teaching has been introduced into machine learning, in which a teacher model is used to guide the training of a student model (which will be used in real tasks) through data selection, loss function design, etc.

Image Classification Machine Translation +1

Temporally Correlated Task Scheduling for Sequence Learning

2 code implementations10 Jul 2020 Xueqing Wu, Lewen Wang, Yingce Xia, Weiqing Liu, Lijun Wu, Shufang Xie, Tao Qin, Tie-Yan Liu

In many applications, a sequence learning task is usually associated with multiple temporally correlated auxiliary tasks, which are different in terms of how much input information to use or which future step to predict.

Machine Translation Scheduling +1

Learning to Match Distributions for Domain Adaptation

1 code implementation17 Jul 2020 Chaohui Yu, Jindong Wang, Chang Liu, Tao Qin, Renjun Xu, Wenjie Feng, Yiqiang Chen, Tie-Yan Liu

However, it remains challenging to determine which method is suitable for a given application since they are built with certain priors or bias.

Domain Adaptation Inductive Bias

LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition

no code implementations9 Aug 2020 Jin Xu, Xu Tan, Yi Ren, Tao Qin, Jian Li, Sheng Zhao, Tie-Yan Liu

However, there are more than 6,000 languages in the world and most of them lack speech training data, which poses significant challenges when building TTS and ASR systems for extremely low-resource languages.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

PopMAG: Pop Music Accompaniment Generation

1 code implementation18 Aug 2020 Yi Ren, Jinzheng He, Xu Tan, Tao Qin, Zhou Zhao, Tie-Yan Liu

To improve harmony, in this paper, we propose a novel MUlti-track MIDI representation (MuMIDI), which enables simultaneous multi-track generation in a single sequence and explicitly models the dependency of the notes from different tracks.

Music Modeling

HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis

1 code implementation3 Sep 2020 Jiawei Chen, Xu Tan, Jian Luan, Tao Qin, Tie-Yan Liu

To tackle the difficulty of singing modeling caused by high sampling rate (wider frequency band and longer waveform), we introduce multi-scale adversarial training in both the acoustic model and vocoder to improve singing modeling.

Singing Voice Synthesis Vocal Bursts Intensity Prediction

Knowledge-Aware Procedural Text Understanding with Multi-Stage Training

no code implementations28 Sep 2020 Zhihan Zhang, Xiubo Geng, Tao Qin, Yunfang Wu, Daxin Jiang

In this work, we focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process.

Procedural Text Understanding

Towards Interpretable Reasoning over Paragraph Effects in Situation

1 code implementation EMNLP 2020 Mucheng Ren, Xiubo Geng, Tao Qin, Heyan Huang, Daxin Jiang

We focus on the task of reasoning over paragraph effects in situation, which requires a model to understand the cause and effect described in a background paragraph, and apply the knowledge to a novel situation.

Masked Contrastive Representation Learning for Reinforcement Learning

1 code implementation15 Oct 2020 Jinhua Zhu, Yingce Xia, Lijun Wu, Jiajun Deng, Wengang Zhou, Tao Qin, Houqiang Li

During inference, the CNN encoder and the policy network are used to take actions, and the Transformer module is discarded.

Atari Games Contrastive Learning +3

Learning Causal Semantic Representation for Out-of-Distribution Prediction

1 code implementation NeurIPS 2021 Chang Liu, Xinwei Sun, Jindong Wang, Haoyue Tang, Tao Li, Tao Qin, Wei Chen, Tie-Yan Liu

Conventional supervised learning methods, especially deep ones, are found to be sensitive to out-of-distribution (OOD) examples, largely because the learned representation mixes the semantic factor with the variation factor due to their domain-specific correlation, while only the semantic factor causes the output.

Domain Adaptation

Latent Causal Invariant Model

no code implementations4 Nov 2020 Xinwei Sun, Botong Wu, Xiangyu Zheng, Chang Liu, Wei Chen, Tao Qin, Tie-Yan Liu

To avoid spurious correlation, we propose a Latent Causal Invariance Model (LaCIM) which pursues causal prediction.

Disentanglement

RD$^2$: Reward Decomposition with Representation Decomposition

no code implementations NeurIPS 2020 Zichuan Lin, Derek Yang, Li Zhao, Tao Qin, Guangwen Yang, Tie-Yan Liu

In this work, we propose a set of novel reward decomposition principles by constraining uniqueness and compactness of different state features/representations relevant to different sub-rewards.

SongMASS: Automatic Song Writing with Pre-training and Alignment Constraint

1 code implementation9 Dec 2020 Zhonghao Sheng, Kaitao Song, Xu Tan, Yi Ren, Wei Ye, Shikun Zhang, Tao Qin

Automatic song writing aims to compose a song (lyric and/or melody) by machine, which is an interesting topic in both academia and industry.

Sentence

Denoising Text to Speech with Frame-Level Noise Modeling

no code implementations17 Dec 2020 Chen Zhang, Yi Ren, Xu Tan, Jinglin Liu, Kejun Zhang, Tao Qin, Sheng Zhao, Tie-Yan Liu

In DenoiSpeech, we handle real-world noisy speech by modeling the fine-grained frame-level noise with a noise condition module, which is jointly trained with the TTS model.

Denoising

ChemistryQA: A Complex Question Answering Dataset from Chemistry

no code implementations1 Jan 2021 Zhuoyu Wei, Wei Ji, Xiubo Geng, Yining Chen, Baihua Chen, Tao Qin, Daxin Jiang

We notice that some real-world QA tasks are more complex, which cannot be solved by end-to-end neural networks or translated to any kind of formal representations.

Machine Reading Comprehension Math +1

Task-Agnostic and Adaptive-Size BERT Compression

no code implementations1 Jan 2021 Jin Xu, Xu Tan, Renqian Luo, Kaitao Song, Li Jian, Tao Qin, Tie-Yan Liu

NAS-BERT trains a big supernet on a carefully designed search space containing various architectures and outputs multiple compressed models with adaptive sizes and latency.

Language Modelling Model Compression +1

Learning to Use Future Information in Simultaneous Translation

1 code implementation1 Jan 2021 Xueqing Wu, Yingce Xia, Lijun Wu, Shufang Xie, Weiqing Liu, Tao Qin, Tie-Yan Liu

For wait-k inference, we observe that wait-m training with $m>k$ in simultaneous NMT (i.e., using more future information for training than inference) generally outperforms wait-k training.
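
Wait-k decoding itself is a simple loop: the decoder starts emitting only after k source tokens have arrived and then keeps a constant lag behind the source stream. `translate_prefix` below is a toy stand-in for the simultaneous NMT model.

```python
# Sketch of wait-k simultaneous decoding: the decoder always lags the
# incoming source stream by k tokens. The translator stub just echoes
# the last visible source token.
def translate_prefix(src_prefix, tgt_prefix):
    return src_prefix[-1].upper()

def wait_k_decode(source_stream, k):
    src, tgt = [], []
    for token in source_stream:
        src.append(token)                 # a new source token arrives
        if len(src) >= k:                 # after waiting k tokens...
            tgt.append(translate_prefix(src, tgt))  # ...emit one target token
    while len(tgt) < len(src):            # flush after the source ends
        tgt.append(translate_prefix(src, tgt))
    return tgt

print(wait_k_decode(["ich", "sehe", "den", "hund"], k=2))
```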

Machine Translation NMT +2

Image-to-Image Translation: Methods and Applications

no code implementations21 Jan 2021 Yingxue Pang, Jianxin Lin, Tao Qin, Zhibo Chen

Image-to-image translation (I2I) aims to transfer images from a source domain to a target domain while preserving the content representations.

Image-to-Image Translation Pose Estimation +2

Return-Based Contrastive Representation Learning for Reinforcement Learning

no code implementations ICLR 2021 Guoqing Liu, Chuheng Zhang, Li Zhao, Tao Qin, Jinhua Zhu, Jian Li, Nenghai Yu, Tie-Yan Liu

Recently, various auxiliary tasks have been proposed to accelerate representation learning and improve sample efficiency in deep reinforcement learning (RL).

Atari Games reinforcement-learning +2

MixSpeech: Data Augmentation for Low-resource Automatic Speech Recognition

no code implementations25 Feb 2021 Linghui Meng, Jin Xu, Xu Tan, Jindong Wang, Tao Qin, Bo Xu

In this paper, we propose MixSpeech, a simple yet effective data augmentation method based on mixup for automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

AdaSpeech: Adaptive Text to Speech for Custom Voice

2 code implementations ICLR 2021 Mingjian Chen, Xu Tan, Bohan Li, Yanqing Liu, Tao Qin, Sheng Zhao, Tie-Yan Liu

2) To better trade off the adaptation parameters and voice quality, we introduce conditional layer normalization in the mel-spectrogram decoder of AdaSpeech, and fine-tune this part in addition to speaker embedding for adaptation.

Generalizing to Unseen Domains: A Survey on Domain Generalization

1 code implementation2 Mar 2021 Jindong Wang, Cuiling Lan, Chang Liu, Yidong Ouyang, Tao Qin, Wang Lu, Yiqiang Chen, Wenjun Zeng, Philip S. Yu

Domain generalization deals with a challenging setting where one or several different but related domain(s) are given, and the goal is to learn a model that can generalize to an unseen test domain.

Domain Generalization Out-of-Distribution Generalization +1

Learning Invariant Representations across Domains and Tasks

no code implementations3 Mar 2021 Jindong Wang, Wenjie Feng, Chang Liu, Chaohui Yu, Mingxuan Du, Renjun Xu, Tao Qin, Tie-Yan Liu

Because it is expensive and time-consuming to collect massive COVID-19 image samples to train deep classification models, transfer learning is a promising approach that transfers knowledge from abundant typical pneumonia datasets for COVID-19 image classification.

Domain Adaptation Image Classification +1

IOT: Instance-wise Layer Reordering for Transformer Structures

1 code implementation ICLR 2021 Jinhua Zhu, Lijun Wu, Yingce Xia, Shufang Xie, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu

Based on this observation, in this work, we break the assumption of the fixed layer order in the Transformer and introduce instance-wise layer reordering into the model structure.

Abstractive Text Summarization Code Generation +2

AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data

1 code implementation20 Apr 2021 Yuzi Yan, Xu Tan, Bohan Li, Tao Qin, Sheng Zhao, Yuan Shen, Tie-Yan Liu

In adaptation, we use untranscribed speech data for speech reconstruction and only fine-tune the TTS decoder.

FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition

1 code implementation NeurIPS 2021 Yichong Leng, Xu Tan, Linchen Zhu, Jin Xu, Renqian Luo, Linquan Liu, Tao Qin, Xiang-Yang Li, Ed Lin, Tie-Yan Liu

A straightforward solution to reduce latency, inspired by non-autoregressive (NAR) neural machine translation, is to use an NAR sequence generation model for ASR error correction, which, however, comes at the cost of significantly increased ASR error rate.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Exploiting Adapters for Cross-lingual Low-resource Speech Recognition

2 code implementations18 May 2021 Wenxin Hou, Han Zhu, Yidong Wang, Jindong Wang, Tao Qin, Renjun Xu, Takahiro Shinozaki

Based on our previous MetaAdapter, which implicitly leverages adapters, we propose a novel algorithm called SimAdapter for explicitly learning knowledge from adapters.

Cross-Lingual ASR General Knowledge +3

Distance-Enhanced Graph Neural Network for Link Prediction

1 code implementation NA 2021 Boling Li, Yingce Xia, Shufang Xie, Lijun Wu, Tao Qin

To overcome this difficulty, we propose an anchor-based distance: first, we randomly select K anchor vertices from the graph, and then calculate the shortest distances of all vertices in the graph to them.
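
A minimal sketch of this anchor-based distance, assuming an unweighted graph stored as an adjacency list: pick K random anchors and run one BFS per anchor, giving every vertex a K-dimensional distance vector.

```python
from collections import deque
import random

# Sketch of the anchor-based distance feature: one BFS per random anchor
# yields each vertex's shortest-path distance to that anchor.
def bfs_distances(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def anchor_distances(adj, k, seed=0):
    random.seed(seed)
    anchors = random.sample(list(adj), k)
    per_anchor = [bfs_distances(adj, a) for a in anchors]
    # K-dimensional distance vector per vertex (inf if unreachable).
    return {v: [d.get(v, float("inf")) for d in per_anchor] for v in adj}

adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
print(anchor_distances(adj, k=2))
```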

Link Prediction Link Property Prediction

Learning Structures for Deep Neural Networks

no code implementations27 May 2021 Jinhui Yuan, Fei Pan, Chunting Zhou, Tao Qin, Tie-Yan Liu

We further establish connections between this principle and the theory of Bayesian optimal classification, and empirically verify that larger entropy of the outputs of a deep neural network indeed corresponds to a better classification accuracy.

Classification Image Classification

MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training

2 code implementations Findings (ACL) 2021 Mingliang Zeng, Xu Tan, Rui Wang, Zeqian Ju, Tao Qin, Tie-Yan Liu

Inspired by the success of pre-training models in natural language processing, in this paper, we develop MusicBERT, a large-scale pre-trained model for music understanding.

Classification Emotion Classification +2

Dual-view Molecule Pre-training

1 code implementation17 Jun 2021 Jinhua Zhu, Yingce Xia, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu

After pre-training, we can use either the Transformer branch (this one is recommended according to empirical results), the GNN branch, or both for downstream tasks.

Molecular Property Prediction Property Prediction +2

A Survey on Neural Speech Synthesis

1 code implementation29 Jun 2021 Xu Tan, Tao Qin, Frank Soong, Tie-Yan Liu

Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural speech given text, is a hot research topic in speech, language, and machine learning communities and has broad applications in the industry.

Speech Synthesis

On the Generative Utility of Cyclic Conditionals

1 code implementation NeurIPS 2021 Chang Liu, Haoyue Tang, Tao Qin, Jintao Wang, Tie-Yan Liu

This is motivated by the observation that deep generative models, in addition to a likelihood model $p(x|z)$, often also use an inference model $q(z|x)$ for extracting representation, but they rely on a usually uninformative prior distribution $p(z)$ to define a joint distribution, which may lead to problems like posterior collapse and manifold mismatch.

Supervised Off-Policy Ranking

1 code implementation3 Jul 2021 Yue Jin, Yue Zhang, Tao Qin, Xudong Zhang, Jian Yuan, Houqiang Li, Tie-Yan Liu

Inspired by the two observations, in this work, we study a new problem, supervised off-policy ranking (SOPR), which aims to rank a set of target policies based on supervised learning by leveraging off-policy data and policies with known performance.

Off-policy evaluation

DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling

1 code implementation ACL 2021 Lanqing Xue, Kaitao Song, Duocai Wu, Xu Tan, Nevin L. Zhang, Tao Qin, Wei-Qiang Zhang, Tie-Yan Liu

In this paper, we develop DeepRapper, a Transformer-based rap generation system that can model both rhymes and rhythms.

Language Modelling

AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style

no code implementations6 Jul 2021 Yuzi Yan, Xu Tan, Bohan Li, Guangyan Zhang, Tao Qin, Sheng Zhao, Yuan Shen, Wei-Qiang Zhang, Tie-Yan Liu

While recent text to speech (TTS) models perform very well in synthesizing reading-style (e.g., audiobook) speech, it is still challenging to synthesize spontaneous-style speech (e.g., podcast or conversation), mainly because of two reasons: 1) the lack of training data for spontaneous speech; 2) the difficulty in modeling the filled pauses (um and uh) and diverse rhythms in spontaneous speech.

A Survey on Low-Resource Neural Machine Translation

no code implementations9 Jul 2021 Rui Wang, Xu Tan, Renqian Luo, Tao Qin, Tie-Yan Liu

Neural approaches have achieved state-of-the-art accuracy on machine translation but suffer from the high cost of collecting large scale parallel data.

Low-Resource Neural Machine Translation NMT +1

AdaRNN: Adaptive Learning and Forecasting of Time Series

2 code implementations10 Aug 2021 Yuntao Du, Jindong Wang, Wenjie Feng, Sinno Pan, Tao Qin, Renjun Xu, Chongjun Wang

This paper proposes Adaptive RNNs (AdaRNN) to tackle the TCS problem by building an adaptive model that generalizes well on the unseen test data.

Human Activity Recognition Time Series +1

Analyzing and Mitigating Interference in Neural Architecture Search

no code implementations29 Aug 2021 Jin Xu, Xu Tan, Kaitao Song, Renqian Luo, Yichong Leng, Tao Qin, Tie-Yan Liu, Jian Li

In this paper, we investigate the interference issue by sampling different child models and calculating the gradient similarity of shared operators, and observe: 1) the interference on a shared operator between two child models is positively correlated with the number of different operators; 2) the interference is smaller when the inputs and outputs of the shared operator are more similar.

Neural Architecture Search Reading Comprehension

TeleMelody: Lyric-to-Melody Generation with a Template-Based Two-Stage Method

1 code implementation20 Sep 2021 Zeqian Ju, Peiling Lu, Xu Tan, Rui Wang, Chen Zhang, Songruoyao Wu, Kejun Zhang, Xiangyang Li, Tao Qin, Tie-Yan Liu

In this paper, we develop TeleMelody, a two-stage lyric-to-melody generation system with music template (e.g., tonality, chord progression, rhythm pattern, and cadence) to bridge the gap between lyrics and melodies (i.e., the system consists of a lyric-to-template module and a template-to-melody module).

Discovering Drug-Target Interaction Knowledge from Biomedical Literature

no code implementations27 Sep 2021 Yutai Hou, Yingce Xia, Lijun Wu, Shufang Xie, Yang Fan, Jinhua Zhu, Wanxiang Che, Tao Qin, Tie-Yan Liu

We regard the DTI triplets as a sequence and use a Transformer-based model to directly generate them without using the detailed annotations of entities and relations.

Multi-Agent Reinforcement Learning with Shared Resource in Inventory Management

no code implementations29 Sep 2021 Mingxiao Feng, Guozi Liu, Li Zhao, Lei Song, Jiang Bian, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu

We consider inventory management (IM) problem for a single store with a large number of SKUs (stock keeping units) in this paper, where we need to make replenishment decisions for each SKU to balance its supply and demand.

Management Multi-agent Reinforcement Learning +2

Particle Based Stochastic Policy Optimization

no code implementations29 Sep 2021 Qiwei Ye, Yuxuan Song, Chang Liu, Fangyun Wei, Tao Qin, Tie-Yan Liu

Stochastic policies have been widely applied for their good properties in exploration and uncertainty quantification.

MuJoCo Games Offline RL +2

Exploiting Class Activation Value for Partial-Label Learning

3 code implementations ICLR 2022 Fei Zhang, Lei Feng, Bo Han, Tongliang Liu, Gang Niu, Tao Qin, Masashi Sugiyama

As the first contribution, we empirically show that the class activation map (CAM), a simple technique for discriminating the learning patterns of each class in images, is surprisingly better at making accurate predictions than the model itself on selecting the true label from candidate labels.
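
The selection step can be sketched in a few lines: compute class activation values from the last convolutional features and the classifier weights, then restrict the argmax to the candidate label set. The shapes and the global-average-pooling shortcut are illustrative assumptions.

```python
import numpy as np

# Sketch of using class activation values to pick the true label from a
# partial-label candidate set: CAM scores come from the final feature
# maps weighted by the classifier weights (pooled to one score per class).
rng = np.random.default_rng(2)
n_classes, n_channels, h, w = 10, 8, 7, 7
feature_maps = rng.random((n_channels, h, w))          # last conv features
fc_weights = rng.normal(size=(n_classes, n_channels))  # classifier weights

# Class activation value: weighted sum of channel-wise pooled features.
pooled = feature_maps.mean(axis=(1, 2))                # (n_channels,)
cam_scores = fc_weights @ pooled                       # one score per class

candidates = [2, 5, 7]                                 # candidate label set
chosen = max(candidates, key=lambda c: cam_scores[c])
print("disambiguated label:", chosen)
```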

Multi-class Classification Partial Label Learning

Target-Side Data Augmentation for Sequence Generation

1 code implementation ICLR 2022 Shufang Xie, Ang Lv, Yingce Xia, Lijun Wu, Tao Qin, Rui Yan, Tie-Yan Liu

Autoregressive sequence generation, a prevalent task in machine learning and natural language processing, generates every target token conditioned on both a source input and previously generated target tokens.

Abstractive Text Summarization Data Augmentation +2

FastCorrect 2: Fast Error Correction on Multiple Candidates for Automatic Speech Recognition

1 code implementation Findings (EMNLP) 2021 Yichong Leng, Xu Tan, Rui Wang, Linchen Zhu, Jin Xu, Wenjie Liu, Linquan Liu, Tao Qin, Xiang-Yang Li, Edward Lin, Tie-Yan Liu

Although multiple candidates are generated by an ASR system through beam search, current error correction approaches can only correct one sentence at a time, failing to leverage the voting effect from multiple candidates to better detect and correct error tokens.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

GNN is a Counter? Revisiting GNN for Question Answering

no code implementations ICLR 2022 Kuan Wang, Yuyu Zhang, Diyi Yang, Le Song, Tao Qin

To open the black box of GNN and investigate these problems, we dissect state-of-the-art GNN modules for QA and analyze their reasoning capability.

Knowledge Graphs Question Answering

Distributional Reinforcement Learning for Multi-Dimensional Reward Functions

1 code implementation NeurIPS 2021 Pushi Zhang, Xiaoyu Chen, Li Zhao, Wei Xiong, Tao Qin, Tie-Yan Liu

To fully inherit the benefits of distributional RL and hybrid reward architectures, we introduce Multi-Dimensional Distributional DQN (MD3QN), which extends distributional RL to model the joint return distribution from multiple reward sources.

Distributional Reinforcement Learning reinforcement-learning +1

Pre-training Co-evolutionary Protein Representation via A Pairwise Masked Language Model

no code implementations29 Oct 2021 Liang He, Shizhuo Zhang, Lijun Wu, Huanhuan Xia, Fusong Ju, He Zhang, Siyuan Liu, Yingce Xia, Jianwei Zhu, Pan Deng, Bin Shao, Tao Qin, Tie-Yan Liu

The key problem in the protein sequence representation learning is to capture the co-evolutionary information reflected by the inter-residue co-variation in the sequences.

Language Modelling Multiple Sequence Alignment +1

Towards Generating Real-World Time Series Data

1 code implementation16 Nov 2021 Hengzhi Pei, Kan Ren, Yuqing Yang, Chang Liu, Tao Qin, Dongsheng Li

In this paper, we propose a novel generative framework for RTS data - RTSGAN to tackle the aforementioned challenges.

Generative Adversarial Network Time Series +1

Recovering Latent Causal Factor for Generalization to Distributional Shifts

1 code implementation NeurIPS 2021 Xinwei Sun, Botong Wu, Xiangyu Zheng, Chang Liu, Wei Chen, Tao Qin, Tie-Yan Liu

To avoid such a spurious correlation, we propose Latent Causal Invariance Models (LaCIM) that specify the underlying causal structure of the data and the source of distributional shifts, guiding us to pursue only the causal factor for prediction.

Speech-T: Transducer for Text to Speech and Beyond

no code implementations NeurIPS 2021 Jiawei Chen, Xu Tan, Yichong Leng, Jin Xu, Guihua Wen, Tao Qin, Tie-Yan Liu

Experiments on LJSpeech datasets demonstrate that Speech-T 1) is more robust than the attention based autoregressive TTS model due to its inherent monotonic alignments between text and speech; 2) naturally supports streaming TTS with good voice quality; and 3) enjoys the benefit of joint modeling TTS and ASR in a single network.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Adaptive Memory Networks with Self-supervised Learning for Unsupervised Anomaly Detection

no code implementations3 Jan 2022 Yuxin Zhang, Jindong Wang, Yiqiang Chen, Han Yu, Tao Qin

In this paper, we propose a novel approach called Adaptive Memory Network with Self-supervised Learning (AMSL) to address these challenges and enhance the generalization ability in unsupervised anomaly detection.

Self-Supervised Learning Sleep Stage Detection +3

You May Not Need Ratio Clipping in PPO

no code implementations31 Jan 2022 Mingfei Sun, Vitaly Kurin, Guoqing Liu, Sam Devlin, Tao Qin, Katja Hofmann, Shimon Whiteson

Furthermore, we show that ESPO can be easily scaled up to distributed training with many workers, delivering strong performance as well.

Continuous Control

Direct Molecular Conformation Generation

1 code implementation3 Feb 2022 Jinhua Zhu, Yingce Xia, Chang Liu, Lijun Wu, Shufang Xie, Yusong Wang, Tong Wang, Tao Qin, Wengang Zhou, Houqiang Li, Haiguang Liu, Tie-Yan Liu

Molecular conformation generation aims to generate three-dimensional coordinates of all the atoms in a molecule and is an important task in bioinformatics and pharmacology.

Molecular Docking

Revisiting Over-Smoothness in Text to Speech

no code implementations ACL 2022 Yi Ren, Xu Tan, Tao Qin, Zhou Zhao, Tie-Yan Liu

Then we conduct a comprehensive study on NAR-TTS models that use some advanced modeling methods.

Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech

no code implementations31 Mar 2022 Guangyan Zhang, Kaitao Song, Xu Tan, Daxin Tan, Yuzi Yan, Yanqing Liu, Gang Wang, Wei Zhou, Tao Qin, Tan Lee, Sheng Zhao

However, these works apply pre-training with character-based units to enhance the TTS phoneme encoder, which is inconsistent with the TTS fine-tuning that takes phonemes as input.

AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios

no code implementations1 Apr 2022 Yihan Wu, Xu Tan, Bohan Li, Lei He, Sheng Zhao, Ruihua Song, Tao Qin, Tie-Yan Liu

We model the speaker characteristics systematically to improve the generalization on new speakers.

Speech Synthesis

A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond

1 code implementation20 Apr 2022 Yisheng Xiao, Lijun Wu, Junliang Guo, Juntao Li, Min Zhang, Tao Qin, Tie-Yan Liu

While NAR generation can significantly accelerate inference speed for machine translation, the speedup comes at the cost of sacrificed translation accuracy compared to its counterpart, autoregressive (AR) generation.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +11

NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality

3 code implementations9 May 2022 Xu Tan, Jiawei Chen, Haohe Liu, Jian Cong, Chen Zhang, Yanqing Liu, Xi Wang, Yichong Leng, YuanHao Yi, Lei He, Frank Soong, Tao Qin, Sheng Zhao, Tie-Yan Liu

In this paper, we answer these questions by first defining the human-level quality based on the statistical significance of subjective measure and introducing appropriate guidelines to judge it, and then developing a TTS system called NaturalSpeech that achieves human-level quality on a benchmark dataset.

 Ranked #1 on Text-To-Speech Synthesis on LJSpeech (using extra training data)

Sentence Speech Synthesis +1

Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling

1 code implementation25 May 2022 Kaitao Song, Yichong Leng, Xu Tan, Yicheng Zou, Tao Qin, Dongsheng Li

Previous works on sentence scoring mainly adopted either causal language modeling (CLM) like GPT or masked language modeling (MLM) like BERT, which have some limitations: 1) CLM only utilizes unidirectional information for the probability estimation of a sentence without considering bidirectional context, which affects the scoring quality; 2) MLM can only estimate the probability of partial tokens at a time and thus requires multiple forward passes to estimate the probability of the whole sentence, which incurs large computation and time cost.
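
The contrast in scoring cost can be sketched with a toy bigram table standing in for the actual language models: a causal LM scores the whole sentence in one left-to-right pass, while the masked-LM style needs one pass per token. The bigram table and fallback probability below are assumptions for illustration only.

```python
import math

# Toy contrast of the two scoring regimes the abstract describes.
BIGRAM = {("<s>", "the"): 0.4, ("the", "cat"): 0.3, ("cat", "sat"): 0.5,
          ("sat", "</s>"): 0.6}

def clm_score(tokens):
    """Causal LM: one left-to-right pass, summing log P(w_i | w_<i)."""
    seq = ["<s>"] + tokens + ["</s>"]
    return sum(math.log(BIGRAM.get((a, b), 1e-4))
               for a, b in zip(seq, seq[1:]))

def mlm_score(tokens):
    """Masked-LM style: one forward pass *per token*, masking each in
    turn (approximated here with the same bigram table)."""
    total, passes = 0.0, 0
    seq = ["<s>"] + tokens + ["</s>"]
    for i in range(1, len(seq) - 1):
        total += math.log(BIGRAM.get((seq[i - 1], seq[i]), 1e-4))
        passes += 1                   # N forward passes for N tokens
    return total, passes

print(clm_score(["the", "cat", "sat"]))
print(mlm_score(["the", "cat", "sat"]))
```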

Causal Language Modeling Language Modelling +2

Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret

1 code implementation 25 May 2022 Jiawei Huang, Li Zhao, Tao Qin, Wei Chen, Nan Jiang, Tie-Yan Liu

We propose a new learning framework that captures the tiered structure of many real-world user-interaction applications, where the users can be divided into two groups based on their different tolerance for exploration risks and should be treated separately.

reinforcement-learning Reinforcement Learning (RL)

BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis

1 code implementation 30 May 2022 Yichong Leng, Zehua Chen, Junliang Guo, Haohe Liu, Jiawei Chen, Xu Tan, Danilo Mandic, Lei He, Xiang-Yang Li, Tao Qin, Sheng Zhao, Tie-Yan Liu

Combining this novel perspective of two-stage synthesis with advanced generative models (i.e., the diffusion models), the proposed BinauralGrad is able to generate accurate and high-fidelity binaural audio samples.

Audio Synthesis

SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction

2 code implementations 20 Jun 2022 Qizhi Pei, Lijun Wu, Jinhua Zhu, Yingce Xia, Shufang Xie, Tao Qin, Haiguang Liu, Tie-Yan Liu, Rui Yan

Accurate prediction of Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery, facilitating the identification of drugs that can effectively interact with specific targets and regulate their activities.

Drug Discovery Language Modelling +2

RetroGraph: Retrosynthetic Planning with Graph Search

1 code implementation 23 Jun 2022 Shufang Xie, Rui Yan, Peng Han, Yingce Xia, Lijun Wu, Chenjuan Guo, Bin Yang, Tao Qin

We observe that the same intermediate molecules are visited many times during the search, yet they are usually treated independently in previous tree-based methods (e.g., AND-OR tree search, Monte Carlo tree search).
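
The benefit of sharing intermediate molecules can be sketched in a few lines: caching each molecule's result deduplicates work that a pure tree search would repeat. The `expand`/`is_purchasable` rules below are invented toy stand-ins, not RetroGraph's algorithm.

```python
# Toy building blocks and one-step retrosynthesis rules (hypothetical).
def is_purchasable(mol):
    return mol in {"A", "B"}

def expand(mol):
    # Molecule -> candidate precursor sets (one reaction's reactants each).
    rules = {"T": [("X", "Y")], "X": [("A", "B")], "Y": [("A", "X")]}
    return rules.get(mol, [])

cache = {}

def solvable(mol):
    if mol in cache:                   # shared node: solved once, reused
        return cache[mol]
    if is_purchasable(mol):
        cache[mol] = True
        return True
    ok = any(all(solvable(p) for p in precursors)
             for precursors in expand(mol))
    cache[mol] = ok
    return ok

print(solvable("T"))  # "X" is reached via two routes but expanded once
```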

Drug Discovery Multi-step retrosynthesis

Building Multilingual Machine Translation Systems That Serve Arbitrary X-Y Translations

no code implementations 30 Jun 2022 Akiko Eriguchi, Shufang Xie, Tao Qin, Hany Hassan Awadalla

Multilingual Neural Machine Translation (MNMT) enables one system to translate sentences from multiple source languages to multiple target languages, greatly reducing deployment costs compared with conventional bilingual systems.

Machine Translation Translation

A Study of Syntactic Multi-Modality in Non-Autoregressive Machine Translation

no code implementations NAACL 2022 Kexun Zhang, Rui Wang, Xu Tan, Junliang Guo, Yi Ren, Tao Qin, Tie-Yan Liu

Furthermore, we take the best of both and design a new loss function to better handle the complicated syntactic multi-modality in real-world datasets.

Machine Translation Translation

Unified 2D and 3D Pre-Training of Molecular Representations

1 code implementation 14 Jul 2022 Jinhua Zhu, Yingce Xia, Lijun Wu, Shufang Xie, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu

The model is pre-trained on three tasks: reconstruction of masked atoms and coordinates, 3D conformation generation conditioned on 2D graph, and 2D graph generation conditioned on 3D conformation.

Graph Generation Molecular Property Prediction +3

Inspector: Pixel-Based Automated Game Testing via Exploration, Detection, and Investigation

no code implementations 18 Jul 2022 Guoqing Liu, Mengzhang Cai, Li Zhao, Tao Qin, Adrian Brown, Jimmy Bischoff, Tie-Yan Liu

In this work, we propose using only screenshots/pixels as input for automated game testing and build a general game testing agent, Inspector, that can be easily applied to different games without deep integration with games.

Imitation Learning Object

Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation

1 code implementation 11 Aug 2022 Ang Lv, Xu Tan, Tao Qin, Tie-Yan Liu, Rui Yan

These characteristics cannot be well handled by neural generation models that learn lyric-to-melody mapping in an end-to-end way, due to several issues: (1) lack of aligned lyric-melody training data to sufficiently learn lyric-melody feature alignment; (2) lack of controllability in generation to explicitly align the lyric-melody features.

Language Modelling Retrieval

MeloForm: Generating Melody with Musical Form based on Expert Systems and Neural Networks

1 code implementation 30 Aug 2022 Peiling Lu, Xu Tan, Botao Yu, Tao Qin, Sheng Zhao, Tie-Yan Liu

Specifically, 1) we design an expert system to generate a melody by developing musical elements from motifs to phrases and then to sections, with repetitions and variations according to a pre-given musical form; 2) since the generated melody can lack musical richness, we design a Transformer-based refinement model to improve the melody without changing its musical form.
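
As a toy illustration of this expert-system development, the sketch below grows a motif into phrases and a section by repetition and variation; the variation rule (transposing by a step) is invented for illustration, not MeloForm's rule set.

```python
import random

random.seed(0)
motif = [60, 62, 64, 62]                 # MIDI pitches of a short motif

def vary(m):
    # Hypothetical variation: transpose the motif up or down a step.
    shift = random.choice([-2, 2])
    return [p + shift for p in m]

phrase_a = motif + vary(motif)           # repetition plus variation
phrase_b = vary(motif) + motif
section = phrase_a + phrase_a + phrase_b  # e.g. an AAB musical form
print(section)
```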

Music Generation

Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation

1 code implementation 19 Oct 2022 Botao Yu, Peiling Lu, Rui Wang, Wei Hu, Xu Tan, Wei Ye, Shikun Zhang, Tao Qin, Tie-Yan Liu

A recent trend is to use Transformer or its variants in music generation, which is, however, suboptimal, because the full attention cannot efficiently model the typically long music sequences (e.g., over 10,000 tokens), and the existing models have shortcomings in generating musical repetition structures.
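
The fine-/coarse-grained idea can be sketched as an attention mask: tokens attend at token level within structure-related bars and only to per-bar summary tokens elsewhere. The bar layout and relatedness below are made up for illustration, not Museformer's actual scheme.

```python
import numpy as np

tokens_per_bar, n_bars = 4, 3
n = n_bars * (tokens_per_bar + 1)          # +1 summary token per bar
related = {0: {0}, 1: {1, 0}, 2: {2, 0}}   # e.g. every bar relates to bar 0

def bar_of(idx):
    return idx // (tokens_per_bar + 1)

def is_summary(idx):
    return idx % (tokens_per_bar + 1) == tokens_per_bar

mask = np.zeros((n, n), dtype=bool)        # True = attention allowed
for q in range(n):
    for k in range(n):
        if bar_of(k) in related[bar_of(q)]:
            mask[q, k] = True              # fine-grained: all tokens
        elif is_summary(k):
            mask[q, k] = True              # coarse-grained: summary only

print(mask.sum(), "allowed pairs instead of", n * n)
```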

Music Generation

BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining

2 code implementations 19 Oct 2022 Renqian Luo, Liai Sun, Yingce Xia, Tao Qin, Sheng Zhang, Hoifung Poon, Tie-Yan Liu

Pre-trained language models have attracted increasing attention in the biomedical domain, inspired by their great success in the general natural language domain.

 Ranked #1 on Document Classification on HOC (Micro F1 metric)

Document Classification Language Modelling +3

Incorporating Pre-training Paradigm for Antibody Sequence-Structure Co-design

no code implementations 26 Oct 2022 Kaiyuan Gao, Lijun Wu, Jinhua Zhu, Tianbo Peng, Yingce Xia, Liang He, Shufang Xie, Tao Qin, Haiguang Liu, Kun He, Tie-Yan Liu

Specifically, we first pre-train an antibody language model on sequence data, then propose a one-shot scheme for CDR sequence and structure generation to avoid the heavy cost and error propagation of autoregressive decoding, and finally leverage the pre-trained antibody model in an antigen-specific antibody generation model with carefully designed modules.

Language Modelling Specificity

SoftCorrect: Error Correction with Soft Detection for Automatic Speech Recognition

1 code implementation 2 Dec 2022 Yichong Leng, Xu Tan, Wenjie Liu, Kaitao Song, Rui Wang, Xiang-Yang Li, Tao Qin, Edward Lin, Tie-Yan Liu

In this paper, we propose SoftCorrect with a soft error detection mechanism to avoid the limitations of both explicit and implicit error detection.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

TD3 with Reverse KL Regularizer for Offline Reinforcement Learning from Mixed Datasets

1 code implementation 5 Dec 2022 Yuanying Cai, Chuheng Zhang, Li Zhao, Wei Shen, Xuyun Zhang, Lei Song, Jiang Bian, Tao Qin, Tie-Yan Liu

There are two challenges for this setting: 1) The optimal trade-off between optimizing the RL signal and the behavior cloning (BC) signal changes across states due to the variation of the action coverage induced by different behavior policies.
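
A toy sketch of this first challenge, with a made-up coverage proxy and loss weighting (not the paper's reverse-KL regularizer): where the behavior data covers many actions the RL signal should dominate, and where coverage is narrow the BC signal should dominate.

```python
import numpy as np

rng = np.random.default_rng(0)

def action_coverage(state_actions):
    # Crude coverage proxy: spread of actions observed at this state.
    return float(np.std(state_actions))

def combined_loss(rl_loss, bc_loss, coverage, scale=1.0):
    alpha = np.exp(-scale * coverage)      # narrow coverage -> alpha near 1
    return (1 - alpha) * rl_loss + alpha * bc_loss

wide = rng.uniform(-1, 1, size=100)        # many behavior policies mixed
narrow = rng.normal(0.0, 0.01, size=100)   # single deterministic policy

for name, acts in [("wide", wide), ("narrow", narrow)]:
    print(name, combined_loss(rl_loss=1.0, bc_loss=2.0,
                              coverage=action_coverage(acts)))
```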

D4RL Offline RL +2

An Adaptive Deep RL Method for Non-Stationary Environments with Piecewise Stable Context

no code implementations 24 Dec 2022 Xiaoyu Chen, Xiangming Zhu, Yufeng Zheng, Pushi Zhang, Li Zhao, Wenxue Cheng, Peng Cheng, Yongqiang Xiong, Tao Qin, Jianyu Chen, Tie-Yan Liu

One of the key challenges in deploying RL to real-world applications is to adapt to variations of unknown environment contexts, such as changing terrains in robotic tasks and fluctuated bandwidth in congestion control.

Regeneration Learning: A Learning Paradigm for Data Generation

no code implementations 21 Jan 2023 Xu Tan, Tao Qin, Jiang Bian, Tie-Yan Liu, Yoshua Bengio

Regeneration learning extends the concept of representation learning to data generation tasks, and can be regarded as a counterpart of traditional representation learning, since 1) regeneration learning handles the abstraction (Y') of the target data Y for data generation while traditional representation learning handles the abstraction (X') of source data X for data understanding; 2) both the processes of Y'-->Y in regeneration learning and X-->X' in representation learning can be learned in a self-supervised way (e.g., pre-training); 3) both the mappings from X to Y' in regeneration learning and from X' to Y in representation learning are simpler than the direct mapping from X to Y.
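
A schematic sketch of the X --> Y' --> Y decomposition, with invented stand-ins for both stages (the "contour" abstraction and "waveform" output are toy examples, not the paper's):

```python
def predict_abstraction(x):
    # Stage 1 (X -> Y'): predicting a compact abstraction of the target
    # is simpler than predicting the full target directly.
    return [len(word) for word in x.split()]       # toy "pitch contour"

def regenerate_target(y_prime):
    # Stage 2 (Y' -> Y): can be pre-trained self-supervised on Y alone.
    return [v * 0.1 for v in y_prime]              # toy "waveform"

x = "sing me a song"
y_prime = predict_abstraction(x)
y = regenerate_target(y_prime)
print(y_prime, y)
```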

Image Generation Representation Learning +6

N-Gram Nearest Neighbor Machine Translation

no code implementations 30 Jan 2023 Rui Lv, Junliang Guo, Rui Wang, Xu Tan, Qi Liu, Tao Qin

Nearest neighbor machine translation augments Autoregressive Translation (AT) with k-nearest-neighbor retrieval, by comparing the similarity between the token-level context representations of the target tokens in the query and the datastore.
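
One decoding step of this scheme can be sketched as follows; the datastore contents, neighbor count, and interpolation weight are illustrative, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d = 8, 4
p_model = rng.dirichlet(np.ones(vocab))            # model's next-token dist

keys = rng.normal(size=(32, d))                    # stored context reps
values = rng.integers(0, vocab, size=32)           # their target tokens
query = rng.normal(size=d)                         # current context rep

dist = np.linalg.norm(keys - query, axis=1)
nn = np.argsort(dist)[:4]                          # k = 4 nearest neighbors
w = np.exp(-dist[nn]); w /= w.sum()                # softmax over -distance
p_knn = np.zeros(vocab)
np.add.at(p_knn, values[nn], w)                    # aggregate weight per token

lam = 0.5                                          # interpolation weight
p = lam * p_knn + (1 - lam) * p_model
print(p.argmax(), p.round(3))
```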

Domain Adaptation Machine Translation +2

Retrosynthetic Planning with Dual Value Networks

1 code implementation 31 Jan 2023 Guoqing Liu, Di Xue, Shufang Xie, Yingce Xia, Austin Tripp, Krzysztof Maziarz, Marwin Segler, Tao Qin, Zongzhang Zhang, Tie-Yan Liu

Retrosynthesis, which aims to find a route to synthesize a target molecule from commercially available starting materials, is a critical task in drug discovery and materials design.

Drug Discovery Multi-step retrosynthesis +2

De Novo Molecular Generation via Connection-aware Motif Mining

1 code implementation 2 Feb 2023 Zijie Geng, Shufang Xie, Yingce Xia, Lijun Wu, Tao Qin, Jie Wang, Yongdong Zhang, Feng Wu, Tie-Yan Liu

The obtained motif vocabulary consists of not only molecular motifs (i.e., the frequent fragments), but also their connection information, indicating how the motifs are connected with each other.

AMOM: Adaptive Masking over Masking for Conditional Masked Language Model

1 code implementation 13 Mar 2023 Yisheng Xiao, Ruiyang Xu, Lijun Wu, Juntao Li, Tao Qin, Tie-Yan Liu, Min Zhang

Experiments on 3 different tasks (neural machine translation, summarization, and code generation) with 15 datasets in total confirm that our proposed simple method achieves significant performance improvement over the strong CMLM model.

Code Generation Language Modelling +2

NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers

1 code implementation 18 Apr 2023 Kai Shen, Zeqian Ju, Xu Tan, Yanqing Liu, Yichong Leng, Lei He, Tao Qin, Sheng Zhao, Jiang Bian

To enhance the zero-shot capability that is important to achieve diverse speech synthesis, we design a speech prompting mechanism to facilitate in-context learning in the diffusion model and the duration/pitch predictor.

In-Context Learning Speech Synthesis

Pointerformer: Deep Reinforced Multi-Pointer Transformer for the Traveling Salesman Problem

1 code implementation 19 Apr 2023 Yan Jin, Yuandong Ding, Xuanhao Pan, Kun He, Li Zhao, Tao Qin, Lei Song, Jiang Bian

Traveling Salesman Problem (TSP), as a classic routing optimization problem originally arising in the domain of transportation and logistics, has become a critical task in broader domains, such as manufacturing and biology.

Traveling Salesman Problem

ResiDual: Transformer with Dual Residual Connections

1 code implementation 28 Apr 2023 Shufang Xie, Huishuai Zhang, Junliang Guo, Xu Tan, Jiang Bian, Hany Hassan Awadalla, Arul Menezes, Tao Qin, Rui Yan

In this paper, we propose ResiDual, a novel Transformer architecture with Pre-Post-LN (PPLN), which fuses the connections of Post-LN and Pre-LN and inherits their advantages while avoiding their limitations.
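
One plausible reading of fusing the two residual styles, sketched with a toy sublayer: run a Post-LN-style stream and a Pre-LN-style accumulator side by side and combine them at the end. This is an illustration of the idea under that assumption, not the exact ResiDual architecture.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + eps)

def sublayer(x, w):
    return np.tanh(x @ w)                          # toy attention/FFN stand-in

rng = np.random.default_rng(0)
d, n_layers = 16, 4
x = rng.normal(size=(2, d))
ws = [rng.normal(scale=0.1, size=(d, d)) for _ in range(n_layers)]

post = x                   # Post-LN stream: normalize after each residual
pre_acc = x.copy()         # Pre-LN stream: raw residual accumulator
for w in ws:
    out = sublayer(post, w)
    post = layer_norm(post + out)                  # Post-LN update
    pre_acc = pre_acc + out                        # Pre-LN-style shortcut

y = post + layer_norm(pre_acc)                     # fuse both streams
print(y.shape)
```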

Machine Translation

O-GNN: Incorporating Ring Priors into Molecular Modeling

1 code implementation ICLR 2023 Jinhua Zhu, Kehan Wu, Bohan Wang, Yingce Xia, Shufang Xie, Qi Meng, Lijun Wu, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu

Despite the recent success of molecular modeling with graph neural networks (GNNs), few models explicitly take rings in compounds into consideration, consequently limiting the expressiveness of the models.

 Ranked #1 on Graph Regression on PCQM4M-LSC (Validation MAE metric)

Graph Regression Molecular Property Prediction +3

MolXPT: Wrapping Molecules with Text for Generative Pre-training

no code implementations 18 May 2023 Zequn Liu, Wei Zhang, Yingce Xia, Lijun Wu, Shufang Xie, Tao Qin, Ming Zhang, Tie-Yan Liu

Considering that text is the most important record for scientific discovery, in this paper, we propose MolXPT, a unified language model of text and molecules pre-trained on SMILES (a sequence representation of molecules) wrapped by text.

Language Modelling Molecular Property Prediction +3

Extract and Attend: Improving Entity Translation in Neural Machine Translation

no code implementations 4 Jun 2023 Zixin Zeng, Rui Wang, Yichong Leng, Junliang Guo, Xu Tan, Tao Qin, Tie-Yan Liu

Inspired by this translation process, we propose an Extract-and-Attend approach to enhance entity translation in NMT, where the translation candidates of source entities are first extracted from a dictionary and then attended to by the NMT model to generate the target sentence.
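
The extract step reduces to a bilingual dictionary lookup whose hits are handed to the decoder as extra context to attend to; a minimal sketch with hypothetical dictionary entries:

```python
# Hypothetical bilingual dictionary entries for illustration only.
entity_dict = {"Paris": "巴黎", "Seine": "塞纳河"}

def extract_candidates(src_tokens):
    # Look up source entities; hits become translation candidates.
    return [entity_dict[t] for t in src_tokens if t in entity_dict]

src = "The Seine flows through Paris".split()
candidates = extract_candidates(src)
# These candidates would be appended as an extra sequence for the
# decoder to attend to alongside the source sentence.
print(candidates)
```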

Machine Translation NMT +2

Retrosynthesis Prediction with Local Template Retrieval

no code implementations 7 Jun 2023 Shufang Xie, Rui Yan, Junliang Guo, Yingce Xia, Lijun Wu, Tao Qin

Furthermore, we propose a lightweight adapter to adjust the weights when combining neural network and KNN predictions, conditioned on the hidden representation and the retrieved templates.

Drug Discovery Retrieval +1

PromptTTS 2: Describing and Generating Voices with Text Prompt

no code implementations 5 Sep 2023 Yichong Leng, Zhifang Guo, Kai Shen, Xu Tan, Zeqian Ju, Yanqing Liu, Yufei Liu, Dongchao Yang, Leying Zhang, Kaitao Song, Lei He, Xiang-Yang Li, Sheng Zhao, Tao Qin, Jiang Bian

TTS approaches based on text prompts face two main challenges: 1) the one-to-many problem, where not all details of voice variability can be described in the text prompt, and 2) the limited availability of text prompt datasets, since writing text prompts for speech requires vendors and costly data labeling.

Language Modelling Large Language Model

FABind: Fast and Accurate Protein-Ligand Binding

1 code implementation NeurIPS 2023 Qizhi Pei, Kaiyuan Gao, Lijun Wu, Jinhua Zhu, Yingce Xia, Shufang Xie, Tao Qin, Kun He, Tie-Yan Liu, Rui Yan

In this work, we propose FABind, an end-to-end model that combines pocket prediction and docking to achieve accurate and fast protein-ligand binding.

Drug Discovery Pose Estimation +1

BioT5+: Towards Generalized Biological Understanding with IUPAC Integration and Multi-task Tuning

1 code implementation 27 Feb 2024 Qizhi Pei, Lijun Wu, Kaiyuan Gao, Xiaozhuan Liang, Yin Fang, Jinhua Zhu, Shufang Xie, Tao Qin, Rui Yan

However, previous efforts like BioT5 faced challenges in generalizing across diverse tasks and lacked a nuanced understanding of molecular structures, particularly in their textual representations (e.g., IUPAC).

Forward reaction prediction Molecule Captioning +3

Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey

2 code implementations 3 Mar 2024 Qizhi Pei, Lijun Wu, Kaiyuan Gao, Jinhua Zhu, Yue Wang, Zun Wang, Tao Qin, Rui Yan

The integration of biomolecular modeling with natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry and biology.

Property Prediction

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

no code implementations 5 Mar 2024 Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Yanqing Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiang-Yang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao

Specifically, 1) we design a neural codec with factorized vector quantization (FVQ) to disentangle speech waveform into subspaces of content, prosody, timbre, and acoustic details; 2) we propose a factorized diffusion model to generate attributes in each subspace following its corresponding prompt.
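
A toy sketch of the factorized quantization idea: the latent is split into attribute subspaces, each quantized against its own codebook. Subspace sizes and codebook dimensions below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
subspaces = {"content": 8, "prosody": 4, "timbre": 4, "acoustic": 8}
codebooks = {k: rng.normal(size=(16, d)) for k, d in subspaces.items()}

z = rng.normal(size=sum(subspaces.values()))       # one frame's latent

codes, offset = {}, 0
for name, d in subspaces.items():
    part = z[offset:offset + d]                    # this attribute's slice
    dists = np.linalg.norm(codebooks[name] - part, axis=1)
    codes[name] = int(dists.argmin())              # nearest codebook entry
    offset += d

print(codes)  # each attribute controlled by its own discrete code
```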

Quantization Speech Synthesis

Machine Translation With Weakly Paired Bilingual Documents

no code implementations ICLR 2019 Lijun Wu, Jinhua Zhu, Di He, Fei Gao, Xu Tan, Tao Qin, Tie-Yan Liu

Neural machine translation, which achieves near human-level performance in some languages, strongly relies on the availability of large amounts of parallel sentences, which hinders its applicability to low-resource language pairs.

Sentence Translation +1

mixSeq: A Simple Data Augmentation Method for Neural Machine Translation

no code implementations ACL (IWSLT) 2021 Xueqing Wu, Yingce Xia, Jinhua Zhu, Lijun Wu, Shufang Xie, Yang Fan, Tao Qin

Data augmentation, which refers to manipulating the inputs (e.g., adding random noise, masking specific parts) to enlarge the dataset, has been widely adopted in machine learning.

Data Augmentation Machine Translation +1
