no code implementations • ACL (IWSLT) 2021 • Xueqing Wu, Yingce Xia, Jinhua Zhu, Lijun Wu, Shufang Xie, Yang Fan, Tao Qin
Data augmentation, which refers to manipulating the inputs (e.g., adding random noise, masking specific parts) to enlarge the dataset, has been widely adopted in machine learning.
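As a minimal illustration of the two manipulations named above (random noise and masking), the sketch below perturbs a token sequence; the function name and the mask_prob/swap_prob parameters are illustrative, not taken from the paper.

```python
import random

def augment(tokens, mask_token="<mask>", mask_prob=0.1, swap_prob=0.1):
    """Randomly mask tokens and lightly shuffle positions to enlarge a dataset."""
    out = list(tokens)
    for i in range(len(out)):
        if random.random() < mask_prob:
            out[i] = mask_token  # masking specific parts
    if len(out) > 1 and random.random() < swap_prob:
        i = random.randrange(len(out) - 1)
        out[i], out[i + 1] = out[i + 1], out[i]  # a small dose of positional noise
    return out

print(augment("the quick brown fox jumps".split()))
```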
no code implementations • 9 Mar 2025 • Gongbo Zhang, Yanting Li, Renqian Luo, Pipi Hu, Zeru Zhao, Lingbo Li, Guoqing Liu, Zun Wang, Ran Bi, Kaiyuan Gao, Liya Guo, Yu Xie, Chang Liu, Jia Zhang, Tian Xie, Robert Pinsler, Claudio Zeni, Ziheng Lu, Yingce Xia, Marwin Segler, Maik Riechert, Li Yuan, Lei Chen, Haiguang Liu, Tao Qin
We validate the effectiveness of UniGenX on material and small molecule generation tasks, achieving a significant leap in state-of-the-art performance for material crystal structure prediction and establishing new state-of-the-art results for small molecule structure prediction, de novo design, and conditional generation.
no code implementations • 15 Feb 2025 • Mingqian Ma, Guoqing Liu, Chuan Cao, Pan Deng, Tri Dao, Albert Gu, Peiran Jin, Zhao Yang, Yingce Xia, Renqian Luo, Pipi Hu, Zun Wang, Yuan-Jyue Chen, Haiguang Liu, Tao Qin
To address these challenges, we propose HybriDNA, a decoder-only DNA language model that incorporates a hybrid Transformer-Mamba2 architecture, seamlessly integrating the strengths of attention mechanisms with selective state-space models.
no code implementations • 11 Feb 2025 • Yingce Xia, Peiran Jin, Shufang Xie, Liang He, Chuan Cao, Renqian Luo, Guoqing Liu, Yue Wang, Zequn Liu, Yuan-Jyue Chen, Zekun Guo, Yeqi Bai, Pan Deng, Yaosen Min, Ziheng Lu, Hongxia Hao, Han Yang, Jielan Li, Chang Liu, Jia Zhang, Jianwei Zhu, Kehan Wu, Wei Zhang, Kaiyuan Gao, Qizhi Pei, Qian Wang, Xixian Liu, Yanting Li, Houtian Zhu, Yeqing Lu, Mingqian Ma, Zun Wang, Tian Xie, Krzysztof Maziarz, Marwin Segler, Zhao Yang, Zilong Chen, Yu Shi, Shuxin Zheng, Lijun Wu, Chen Hu, Peggy Dai, Tie-Yan Liu, Haiguang Liu, Tao Qin
Foundation models have revolutionized natural language processing and artificial intelligence, significantly enhancing how machines comprehend and generate human languages.
no code implementations • 21 Jul 2024 • Qizhi Pei, Lijun Wu, Zhenyu He, Jinhua Zhu, Yingce Xia, Shufang Xie, Rui Yan
Specifically, we propose a label aggregation with pair-wise retrieval and a representation aggregation with point-wise retrieval of the nearest neighbors.
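A rough sketch of the two aggregation ideas, under the assumption of dense vector representations and Euclidean retrieval; the function names, the exponential weighting, and k are illustrative stand-ins for the paper's pair-wise and point-wise retrieval schemes.

```python
import numpy as np

def label_aggregation(query, keys, labels, k=8):
    """Average the labels of the k nearest stored examples, weighted by distance."""
    d = np.linalg.norm(keys - query, axis=1)
    idx = np.argsort(d)[:k]
    w = np.exp(-d[idx])  # closer neighbors get larger weights
    return (w @ labels[idx]) / w.sum()

def representation_aggregation(query, keys, k=8):
    """Mix the query representation with its nearest-neighbor representations."""
    d = np.linalg.norm(keys - query, axis=1)
    idx = np.argsort(d)[:k]
    w = np.exp(-d[idx])
    w /= w.sum()
    return 0.5 * query + 0.5 * (w @ keys[idx])
```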
1 code implementation • 11 Oct 2023 • Qizhi Pei, Wei Zhang, Jinhua Zhu, Kehan Wu, Kaiyuan Gao, Lijun Wu, Yingce Xia, Rui Yan
Recent advancements in biological research leverage the integration of molecules, proteins, and natural language to enhance drug discovery.
Ranked #4 on Text-based de novo Molecule Generation on ChEBI-20
1 code implementation • NeurIPS 2023 • Qizhi Pei, Kaiyuan Gao, Lijun Wu, Jinhua Zhu, Yingce Xia, Shufang Xie, Tao Qin, Kun He, Tie-Yan Liu, Rui Yan
In this work, we propose $\mathbf{FABind}$, an end-to-end model that combines pocket prediction and docking to achieve accurate and fast protein-ligand binding.
Ranked #4 on Blind Docking on PDBBind
no code implementations • 7 Jun 2023 • Shufang Xie, Rui Yan, Junliang Guo, Yingce Xia, Lijun Wu, Tao Qin
Furthermore, we propose a lightweight adapter to adjust the weights when combining neural network and KNN predictions, conditioned on the hidden representation and the retrieved templates.
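A minimal sketch of such an adapter, assuming the model and the kNN component each produce a probability distribution; the gate architecture and layer sizes are illustrative, not the paper's.

```python
import torch.nn as nn

class KNNAdapter(nn.Module):
    """Predict a per-example interpolation weight from the hidden representation,
    then mix the neural network's and the kNN component's distributions."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(hidden_dim, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),
        )

    def forward(self, hidden, p_model, p_knn):
        lam = self.gate(hidden)                   # weight in (0, 1), shape (B, 1)
        return lam * p_knn + (1 - lam) * p_model  # mixed prediction, shape (B, V)
```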
no code implementations • 18 May 2023 • Zequn Liu, Wei Zhang, Yingce Xia, Lijun Wu, Shufang Xie, Tao Qin, Ming Zhang, Tie-Yan Liu
Considering that text is the most important record for scientific discovery, in this paper, we propose MolXPT, a unified language model of text and molecules pre-trained on SMILES (a sequence representation of molecules) wrapped by text.
Ranked #1 on Molecular Property Prediction on BACE
1 code implementation • 12 May 2023 • Griffin Adams, Bichlien H Nguyen, Jake Smith, Yingce Xia, Shufang Xie, Anna Ostropolets, Budhaditya Deb, Yuan-Jyue Chen, Tristan Naumann, Noémie Elhadad
Summarization models often generate text that is poorly calibrated to quality metrics because they are trained to maximize the likelihood of a single reference (MLE).
1 code implementation • ICLR 2023 • Jinhua Zhu, Kehan Wu, Bohan Wang, Yingce Xia, Shufang Xie, Qi Meng, Lijun Wu, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu
Despite the recent success of molecular modeling with graph neural networks (GNNs), few models explicitly take rings in compounds into consideration, consequently limiting the expressiveness of the models.
Ranked #1 on Graph Regression on PCQM4M-LSC (Validation MAE metric)
1 code implementation • 2 Feb 2023 • Zijie Geng, Shufang Xie, Yingce Xia, Lijun Wu, Tao Qin, Jie Wang, Yongdong Zhang, Feng Wu, Tie-Yan Liu
The obtained motif vocabulary consists of not only molecular motifs (i.e., the frequent fragments), but also their connection information, indicating how the motifs are connected with each other.
1 code implementation • 31 Jan 2023 • Guoqing Liu, Di Xue, Shufang Xie, Yingce Xia, Austin Tripp, Krzysztof Maziarz, Marwin Segler, Tao Qin, Zongzhang Zhang, Tie-Yan Liu
Retrosynthesis, which aims to find a route to synthesize a target molecule from commercially available starting materials, is a critical task in drug discovery and materials design.
Ranked #1 on Multi-step retrosynthesis on USPTO-190
no code implementations • 26 Oct 2022 • Kaiyuan Gao, Lijun Wu, Jinhua Zhu, Tianbo Peng, Yingce Xia, Liang He, Shufang Xie, Tao Qin, Haiguang Liu, Kun He, Tie-Yan Liu
Specifically, we first pre-train an antibody language model on sequence data, then propose a one-shot approach to generating the sequence and structure of CDRs that avoids the heavy cost and error propagation of autoregressive decoding, and finally leverage the pre-trained antibody model in an antigen-specific antibody generation model with carefully designed modules.
3 code implementations • 19 Oct 2022 • Renqian Luo, Liai Sun, Yingce Xia, Tao Qin, Sheng Zhang, Hoifung Poon, Tie-Yan Liu
Pre-trained language models have attracted increasing attention in the biomedical domain, inspired by their great success in the general natural language domain.
Ranked #1 on Document Classification on HOC (Micro F1 metric)
1 code implementation • 30 Aug 2022 • Kehan Wu, Yingce Xia, Yang Fan, Pan Deng, Haiguang Liu, Lijun Wu, Shufang Xie, Tong Wang, Tao Qin, Tie-Yan Liu
Structure-based drug design is attracting growing attention in computer-aided drug discovery.
1 code implementation • 14 Jul 2022 • Jinhua Zhu, Yingce Xia, Lijun Wu, Shufang Xie, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu
The model is pre-trained on three tasks: reconstruction of masked atoms and coordinates, 3D conformation generation conditioned on 2D graph, and 2D graph generation conditioned on 3D conformation.
1 code implementation • 23 Jun 2022 • Shufang Xie, Rui Yan, Peng Han, Yingce Xia, Lijun Wu, Chenjuan Guo, Bin Yang, Tao Qin
We observe that the same intermediate molecules are visited many times during the search process, yet they are treated independently in previous tree-based methods (e.g., AND-OR tree search, Monte Carlo tree search).
Ranked #2 on Multi-step retrosynthesis on USPTO-190
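One way to exploit that observation is to memoize results per molecule so repeated visits reuse earlier work. The sketch below does this for a cost-minimizing search, with `expand` and `is_purchasable` as hypothetical helpers; the paper's graph-based search differs in detail.

```python
def best_cost(mol, cache, expand, is_purchasable, depth=0, max_depth=10):
    """Minimum synthesis cost for `mol`, reusing cached results for molecules
    already visited (the depth-limit interaction with the cache is ignored
    here for brevity)."""
    if mol in cache:
        return cache[mol]  # reuse earlier work instead of re-expanding
    if is_purchasable(mol):
        cache[mol] = 0.0
        return 0.0
    if depth >= max_depth:
        return float("inf")
    best = float("inf")
    for cost, reactants in expand(mol):  # candidate one-step retro-reactions
        total = cost + sum(
            best_cost(r, cache, expand, is_purchasable, depth + 1, max_depth)
            for r in reactants
        )
        best = min(best, total)
    cache[mol] = best
    return best
```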
2 code implementations • 20 Jun 2022 • Qizhi Pei, Lijun Wu, Jinhua Zhu, Yingce Xia, Shufang Xie, Tao Qin, Haiguang Liu, Tie-Yan Liu, Rui Yan
Accurate prediction of Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery, facilitating the identification of drugs that can effectively interact with specific targets and regulate their activities.
Ranked #1 on Drug Discovery on KIBA
1 code implementation • 3 Feb 2022 • Jinhua Zhu, Yingce Xia, Chang Liu, Lijun Wu, Shufang Xie, Yusong Wang, Tong Wang, Tao Qin, Wengang Zhou, Houqiang Li, Haiguang Liu, Tie-Yan Liu
Molecular conformation generation aims to generate three-dimensional coordinates of all the atoms in a molecule and is an important task in bioinformatics and pharmacology.
1 code implementation • 11 Jan 2022 • Wendi Li, Xiao Yang, Weiqing Liu, Yingce Xia, Jiang Bian
To handle concept drift, previous methods first detect when/where the concept drift happens and then adapt models to fit the distribution of the latest data.
1 code implementation • 12 Dec 2021 • Wentao Xu, Yingce Xia, Weiqing Liu, Jiang Bian, Jian Yin, Tie-Yan Liu
Next, we use a tree-attention aggregator to incorporate the graph structure information into the aggregation module on the meta-path.
1 code implementation • NeurIPS 2021 • Jinpeng Li, Yingce Xia, Rui Yan, Hongda Sun, Dongyan Zhao, Tie-Yan Liu
Since there is no parallel data between contexts and responses of the target style S1, existing works mainly use back-translation to generate stylized synthetic data for training, using data about the context, the target style S1, and an intermediate style S0.
no code implementations • 29 Oct 2021 • Liang He, Shizhuo Zhang, Lijun Wu, Huanhuan Xia, Fusong Ju, He Zhang, Siyuan Liu, Yingce Xia, Jianwei Zhu, Pan Deng, Bin Shao, Tao Qin, Tie-Yan Liu
The key problem in protein sequence representation learning is to capture the co-evolutionary information reflected by the inter-residue co-variation in the sequences.
2 code implementations • 26 Oct 2021 • Wentao Xu, Weiqing Liu, Lewen Wang, Yingce Xia, Jiang Bian, Jian Yin, Tie-Yan Liu
To overcome the shortcomings of previous work, we propose a novel stock trend forecasting framework that can adequately mine the concept-oriented shared information from predefined concepts and hidden concepts.
no code implementations • 29 Sep 2021 • Mengji Zhang, Yingce Xia, Nian Wu, Kun Qian, Jianyang Zeng
Manually interpreting the MS/MS spectrum into the molecules (i.e., the simplified molecular-input line-entry system, SMILES) is often costly and cumbersome, mainly due to the synthesis and labeling of isotopes and the requirement of expert knowledge.
1 code implementation • ICLR 2022 • Shufang Xie, Ang Lv, Yingce Xia, Lijun Wu, Tao Qin, Rui Yan, Tie-Yan Liu
Autoregressive sequence generation, a prevalent task in machine learning and natural language processing, generates every target token conditioned on both a source input and previously generated target tokens.
no code implementations • 27 Sep 2021 • Yutai Hou, Yingce Xia, Lijun Wu, Shufang Xie, Yang Fan, Jinhua Zhu, Wanxiang Che, Tao Qin, Tie-Yan Liu
We regard the DTI triplets as a sequence and use a Transformer-based model to directly generate them without using the detailed annotations of entities and relations.
1 code implementation • 17 Jun 2021 • Jinhua Zhu, Yingce Xia, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu
After pre-training, we can use either the Transformer branch (recommended, according to our empirical results), the GNN branch, or both for downstream tasks.
Ranked #1 on Molecular Property Prediction on HIV dataset
1 code implementation • NA 2021 • Boling Li, Yingce Xia, Shufang Xie, Lijun Wu, Tao Qin
To overcome this difficulty, we propose an anchor-based distance: first, we randomly select K anchor vertices from the graph and then calculate the shortest distances of all vertices in the graph to them.
Ranked #1 on Link Property Prediction on ogbl-ddi
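A minimal sketch of that anchor-based distance on an unweighted graph, using BFS for shortest paths; K, the seed, and the -1 marker for unreachable vertices are illustrative choices, not the paper's exact setup.

```python
import random
from collections import deque

def anchor_distances(adj, k=8, seed=0):
    """Give every vertex a K-dimensional feature: its shortest-path distance
    (via BFS, so unweighted) to each of K randomly chosen anchor vertices."""
    rng = random.Random(seed)
    nodes = list(adj)
    anchors = rng.sample(nodes, min(k, len(nodes)))
    feats = {v: [] for v in nodes}
    for a in anchors:
        dist, queue = {a: 0}, deque([a])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for v in nodes:
            feats[v].append(dist.get(v, -1))  # -1 marks unreachable vertices
    return feats
```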
no code implementations • NAACL 2021 • Zhen Wu, Lijun Wu, Qi Meng, Yingce Xia, Shufang Xie, Tao Qin, Xinyu Dai, Tie-Yan Liu
Therefore, in this paper, we integrate different dropout techniques into the training of Transformer models.
Ranked #5 on Machine Translation on IWSLT2014 English-German
1 code implementation • ICLR 2021 • Jinhua Zhu, Lijun Wu, Yingce Xia, Shufang Xie, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu
Based on this observation, in this work, we break the assumption of the fixed layer order in the Transformer and introduce instance-wise layer reordering into the model structure.
1 code implementation • 1 Jan 2021 • Xueqing Wu, Yingce Xia, Lijun Wu, Shufang Xie, Weiqing Liu, Tao Qin, Tie-Yan Liu
For wait-k inference, we observe that wait-m training with $m>k$ in simultaneous NMT (i.e., using more future information for training than inference) generally outperforms wait-k training.
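For reference, a sketch of the wait-k inference policy referred to above: read k source tokens, then alternate writing one target token per newly read source token. `step_decoder` is a hypothetical callable mapping (read prefix, produced tokens) to the next token or None, not the paper's API.

```python
def wait_k_decode(source_stream, step_decoder, k=3, max_len=50):
    """Read k source tokens first, then alternate: one new source token in,
    one target token out; flush the rest once the source is exhausted."""
    read, produced = [], []
    for tok in source_stream:
        read.append(tok)
        if len(read) >= k:
            nxt = step_decoder(read, produced)
            if nxt is None:  # the model chose to stop early
                return produced
            produced.append(nxt)
    while len(produced) < max_len:  # source finished: decode the remainder
        nxt = step_decoder(read, produced)
        if nxt is None:
            break
        produced.append(nxt)
    return produced
```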
no code implementations • 19 Oct 2020 • Hao Wang, Jia Zhang, Yingce Xia, Jiang Bian, Chao Zhang, Tie-Yan Liu
However, most existing studies overlook code's intrinsic structural logic, which contains a wealth of semantic information, and thus fail to capture the intrinsic features of code.
1 code implementation • 15 Oct 2020 • Jinhua Zhu, Yingce Xia, Lijun Wu, Jiajun Deng, Wengang Zhou, Tao Qin, Houqiang Li
During inference, the CNN encoder and the policy network are used to take actions, and the Transformer module is discarded.
2 code implementations • 10 Jul 2020 • Xueqing Wu, Lewen Wang, Yingce Xia, Weiqing Liu, Lijun Wu, Shufang Xie, Tao Qin, Tie-Yan Liu
In many applications, a sequence learning task is usually associated with multiple temporally correlated auxiliary tasks, which are different in terms of how much input information to use or which future step to predict.
no code implementations • 9 Jul 2020 • Yang Fan, Yingce Xia, Lijun Wu, Shufang Xie, Weiqing Liu, Jiang Bian, Tao Qin, Xiang-Yang Li
Recently, the concept of teaching has been introduced into machine learning, in which a teacher model is used to guide the training of a student model (which will be used in real tasks) through data selection, loss function design, etc.
1 code implementation • 18 Jun 2020 • Yang Fan, Shufang Xie, Yingce Xia, Lijun Wu, Tao Qin, Xiang-Yang Li, Tie-Yan Liu
While the multi-branch architecture is one of the key ingredients to the success of computer vision tasks, it has not been well investigated in natural language processing, especially sequence learning tasks.
Ranked #4 on Machine Translation on WMT2014 English-German (SacreBLEU metric)
no code implementations • 17 May 2020 • Zhibing Zhao, Yingce Xia, Tao Qin, Lirong Xia, Tie-Yan Liu
Dual learning has been successfully applied in many machine learning applications including machine translation, image-to-image transformation, etc.
1 code implementation • ECCV 2020 • Jianxin Lin, Yingxue Pang, Yingce Xia, Zhibo Chen, Jiebo Luo
With TuiGAN, an image is translated in a coarse-to-fine manner where the generated image is gradually refined from global structures to local details.
3 code implementations • ICLR 2020 • Jinhua Zhu, Yingce Xia, Lijun Wu, Di He, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu
While BERT is more commonly used for fine-tuning than as a contextual embedding in downstream language understanding tasks, our preliminary exploration in NMT finds that using BERT as a contextual embedding is better than using it for fine-tuning.
1 code implementation • NeurIPS 2019 • Yiren Wang, Yingce Xia, Fei Tian, Fei Gao, Tao Qin, Cheng Xiang Zhai, Tie-Yan Liu
Neural machine translation models usually use the encoder-decoder framework and generate translation from left to right (or right to left) without fully utilizing the target-side global information.
no code implementations • WS 2019 • Yingce Xia, Xu Tan, Fei Tian, Fei Gao, Weicong Chen, Yang Fan, Linyuan Gong, Yichong Leng, Renqian Luo, Yiren Wang, Lijun Wu, Jinhua Zhu, Tao Qin, Tie-Yan Liu
We, Microsoft Research Asia, made submissions to 11 language directions in the WMT19 news translation tasks.
no code implementations • IJCNLP 2019 • Lijun Wu, Yiren Wang, Yingce Xia, Tao Qin, Jian-Huang Lai, Tie-Yan Liu
In this work, we study how to use both the source-side and target-side monolingual data for NMT, and propose an effective strategy leveraging both of them.
Ranked #1 on Machine Translation on WMT2016 English-German (SacreBLEU metric, using extra training data)
no code implementations • 25 Aug 2019 • Xu Tan, Yingce Xia, Lijun Wu, Tao Qin
In this paper, we propose an efficient method to generate a sequence in both left-to-right and right-to-left manners using a single encoder and decoder, combining the advantages of both generation directions.
no code implementations • IJCNLP 2019 • Xu Tan, Jiale Chen, Di He, Yingce Xia, Tao Qin, Tie-Yan Liu
We study two methods for language clustering: (1) using prior knowledge, where we cluster languages according to language family, and (2) using language embedding, in which we represent each language by an embedding vector and cluster them in the embedding space.
1 code implementation • ACL 2019 • Lijun Wu, Yiren Wang, Yingce Xia, Fei Tian, Fei Gao, Tao Qin, Jian-Huang Lai, Tie-Yan Liu
While very deep neural networks have shown effectiveness for computer vision and text classification applications, how to increase the network depth of neural machine translation (NMT) models for better translation quality remains a challenging problem.
Ranked #11 on Machine Translation on WMT2014 English-French
1 code implementation • 1 Jun 2019 • Jianxin Lin, Yingce Xia, Sen Liu, Shuqin Zhao, Zhibo Chen
Image-to-image translation models have shown remarkable ability on transferring images among different domains.
no code implementations • 29 May 2019 • Jianxin Lin, Yingce Xia, Yijun Wang, Tao Qin, Zhibo Chen
In this work, we introduce a new kind of loss, multi-path consistency loss, which evaluates the differences between direct translation $\mathcal{D}_s\to\mathcal{D}_t$ and indirect translation $\mathcal{D}_s\to\mathcal{D}_a\to\mathcal{D}_t$ with $\mathcal{D}_a$ as an auxiliary domain, to regularize training.
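A minimal sketch of that loss for image tensors, assuming generator networks for each domain pair; the L1 distance is an illustrative choice of discrepancy measure, not necessarily the paper's.

```python
import torch.nn.functional as F

def multi_path_consistency_loss(x_s, G_st, G_sa, G_at):
    """Penalize disagreement between the direct translation s -> t and the
    indirect translation s -> a -> t through an auxiliary domain a."""
    direct = G_st(x_s)            # D_s -> D_t
    indirect = G_at(G_sa(x_s))    # D_s -> D_a -> D_t
    return F.l1_loss(direct, indirect)
```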
1 code implementation • ACL 2019 • Jinhua Zhu, Fei Gao, Lijun Wu, Yingce Xia, Tao Qin, Wengang Zhou, Xue-Qi Cheng, Tie-Yan Liu
While data augmentation is an important trick to boost the accuracy of deep learning methods in computer vision tasks, its study in natural language tasks is still very limited.
no code implementations • ICLR 2019 • Yiren Wang, Yingce Xia, Tianyu He, Fei Tian, Tao Qin, ChengXiang Zhai, Tie-Yan Liu
Dual learning has attracted much attention in machine learning, computer vision and natural language processing communities.
Ranked #1 on Machine Translation on WMT2016 English-German
no code implementations • ICLR 2019 • Zhibing Zhao, Yingce Xia, Tao Qin, Tie-Yan Liu
Based on the theoretical discoveries, we extend dual learning by introducing more related mappings and propose highly symmetric frameworks, cycle dual learning and multipath dual learning, in both of which we can leverage the feedback signals from additional domains to improve the qualities of the mappings.
1 code implementation • 11 Feb 2019 • Jianxin Lin, Zhibo Chen, Yingce Xia, Sen Liu, Tao Qin, Jiebo Luo
After pre-training, this network is used to extract the domain-specific features of each image.
no code implementations • NeurIPS 2018 • Tianyu He, Xu Tan, Yingce Xia, Di He, Tao Qin, Zhibo Chen, Tie-Yan Liu
Neural Machine Translation (NMT) has achieved remarkable progress with the rapid evolution of model structures.
no code implementations • NeurIPS 2018 • Lijun Wu, Fei Tian, Yingce Xia, Yang Fan, Tao Qin, Jian-Huang Lai, Tie-Yan Liu
Different from typical learning settings in which the loss function of a machine learning model is predefined and fixed, in our framework, the loss function of a machine learning model (we call it student) is defined by another machine learning model (we call it teacher).
no code implementations • ICML 2018 • Yingce Xia, Xu Tan, Fei Tian, Tao Qin, Nenghai Yu, Tie-Yan Liu
Many artificial intelligence tasks appear in dual forms like English$\leftrightarrow$French translation and speech$\leftrightarrow$text transformation.
no code implementations • 25 May 2018 • Haifang Li, Yingce Xia, Wensheng Zhang
Policy evaluation with linear function approximation is an important problem in reinforcement learning.
no code implementations • CVPR 2018 • Jianxin Lin, Yingce Xia, Tao Qin, Zhibo Chen, Tie-Yan Liu
In this paper, we study a new problem, conditional image-to-image translation, which is to translate an image from the source domain to the target domain conditioned on a given image in the target domain.
2 code implementations • 15 Mar 2018 • Hany Hassan, Anthony Aue, Chang Chen, Vishal Chowdhary, Jonathan Clark, Christian Federmann, Xuedong Huang, Marcin Junczys-Dowmunt, William Lewis, Mu Li, Shujie Liu, Tie-Yan Liu, Renqian Luo, Arul Menezes, Tao Qin, Frank Seide, Xu Tan, Fei Tian, Lijun Wu, Shuangzhi Wu, Yingce Xia, Dong-dong Zhang, Zhirui Zhang, Ming Zhou
Machine translation has made rapid advances in recent years.
Ranked #3 on Machine Translation on WMT 2017 English-Chinese
no code implementations • NeurIPS 2017 • Yingce Xia, Fei Tian, Lijun Wu, Jianxin Lin, Tao Qin, Nenghai Yu, Tie-Yan Liu
In this work, we introduce the deliberation process into the encoder-decoder framework and propose deliberation networks for sequence generation.
Ranked #22 on Machine Translation on WMT2014 English-French
no code implementations • NeurIPS 2017 • Di He, Hanqing Lu, Yingce Xia, Tao Qin, Li-Wei Wang, Tie-Yan Liu
Inspired by the success and methodology of AlphaGo, in this paper we propose using a prediction network to improve beam search, which takes the source sentence $x$, the currently available decoding output $y_1,\cdots, y_{t-1}$ and a candidate word $w$ at step $t$ as inputs and predicts the long-term value (e.g., BLEU score) of the partial target sentence if it is completed by the NMT model.
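A sketch of how such a predicted long-term value could be blended into beam scoring; the hypothesis layout, `value_fn`, and the linear combination weight `alpha` are assumptions, not the paper's exact formulation.

```python
def rescore_beam(beam, value_fn, alpha=0.5):
    """Rank partial hypotheses by blending the model's log-probability with a
    learned estimate of long-term value (e.g., BLEU of the completed sentence)."""
    return sorted(
        beam,
        key=lambda hyp: alpha * hyp["logp"] + (1 - alpha) * value_fn(hyp),
        reverse=True,
    )
```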
1 code implementation • ICML 2017 • Yingce Xia, Tao Qin, Wei Chen, Jiang Bian, Nenghai Yu, Tie-Yan Liu
Many supervised learning tasks emerge in dual forms, e.g., English-to-French translation vs. French-to-English translation, speech recognition vs. text to speech, and image classification vs. image generation.
no code implementations • 20 Apr 2017 • Lijun Wu, Yingce Xia, Li Zhao, Fei Tian, Tao Qin, Jian-Huang Lai, Tie-Yan Liu
The goal of the adversary is to differentiate the translation result generated by the NMT model from that by human.
1 code implementation • NeurIPS 2016 • Yingce Xia, Di He, Tao Qin, Li-Wei Wang, Nenghai Yu, Tie-Yan Liu, Wei-Ying Ma
Based on the feedback signals generated during this process (e.g., the language-model likelihood of the output of a model, and the reconstruction error of the original sentence after the primal and dual translations), we can iteratively update the two models until convergence (e.g., using the policy gradient methods).
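A compact sketch of that feedback signal, assuming hypothetical model objects with `sample` and `logprob` methods and an external `lm_score`; the paper optimizes such a reward with policy gradient, which is omitted here.

```python
def dual_learning_reward(x, forward_model, backward_model, lm_score, beta=0.5):
    """Feedback signal for one primal step: fluency of the sampled output plus
    how well the dual model reconstructs the original sentence from it."""
    y = forward_model.sample(x)                 # primal translation of x
    r_lm = lm_score(y)                          # language-model likelihood of y
    r_rec = backward_model.logprob(x, given=y)  # reconstruction of x from y
    return beta * r_lm + (1 - beta) * r_rec
```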
no code implementations • 1 May 2015 • Yingce Xia, Haifang Li, Tao Qin, Nenghai Yu, Tie-Yan Liu
In this paper, we extend the Thompson sampling to Budgeted MAB, where there is random cost for pulling an arm and the total cost is constrained by a budget.
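A minimal sketch of one Thompson-sampling step in this budgeted setting, assuming Bernoulli rewards and costs with per-arm Beta posteriors; pulling the arm with the highest sampled reward/cost ratio is one simple instantiation, not necessarily the paper's exact rule.

```python
import random

def budgeted_thompson_step(arms, budget):
    """One round: sample a reward belief and a cost belief per arm from Beta
    posteriors, then pull the arm with the best sampled reward/cost ratio."""
    if budget <= 0:
        return None  # budget exhausted, stop pulling

    def sampled_ratio(arm):
        r = random.betavariate(arm["a_r"], arm["b_r"])             # reward draw
        c = max(random.betavariate(arm["a_c"], arm["b_c"]), 1e-6)  # cost draw
        return r / c

    return max(arms, key=sampled_ratio)
```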