no code implementations • 2 Mar 2024 • Chiyu Zhang, Honglong Cai, Yuezhang Li, Yuexin Wu, Le Hou, Muhammad Abdul-Mageed
Text Style Transfer (TST) seeks to alter the style of text while retaining its core content.
no code implementations • 5 Feb 2024 • Zihan Wang, Yunxuan Li, Yuexin Wu, Liangchen Luo, Le Hou, Hongkun Yu, Jingbo Shang
In this paper, to avoid the expensive effort of human annotation on the verifier training data, we introduce Model-induced Process Supervision (MiPS), a novel method for automating data curation.
no code implementations • 2 Oct 2023 • Ziqi Wang, Le Hou, Tianjian Lu, Yuexin Wu, Yunxuan Li, Hongkun Yu, Heng Ji
Specifically, we reformulate the training objective of reinforcement learning from human feedback (RLHF) -- instead of maximizing response quality for a given input, we maximize the quality gap of the response conditioned on a reference response.
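A minimal sketch of the relative objective described above, in plain Python. The names (`quality`, `gap_reward`) and the toy quality model are illustrative assumptions, not the paper's implementation; the point is only the shift from absolute response quality to the quality gap over a reference response.

```python
# Hypothetical sketch: reward the quality *gap* over a reference response
# rather than the absolute quality of the response alone.

def absolute_reward(quality, prompt, response):
    """Standard RLHF-style reward: quality of the response by itself."""
    return quality(prompt, response)

def gap_reward(quality, prompt, response, reference):
    """Relative reward: how much the response improves on the reference."""
    return quality(prompt, response) - quality(prompt, reference)

# Toy quality model for demonstration: longer answers score higher.
toy_quality = lambda prompt, response: len(response)

print(gap_reward(toy_quality, "q", "a detailed answer", "short"))
```

Under the gap objective, a response that merely matches the reference earns zero reward, so optimization pressure concentrates on genuine improvements.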
no code implementations • 24 May 2023 • Sheng Shen, Le Hou, Yanqi Zhou, Nan Du, Shayne Longpre, Jason Wei, Hyung Won Chung, Barret Zoph, William Fedus, Xinyun Chen, Tu Vu, Yuexin Wu, Wuyang Chen, Albert Webson, Yunxuan Li, Vincent Zhao, Hongkun Yu, Kurt Keutzer, Trevor Darrell, Denny Zhou
Sparse Mixture-of-Experts (MoE) is a neural architecture design that can be utilized to add learnable parameters to Large Language Models (LLMs) without increasing inference cost.
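The inference-cost property can be illustrated with a toy top-1 routing sketch (an assumption-laden simplification, not the paper's architecture): parameters grow with the number of experts, but each token activates only one expert, so per-token compute stays flat.

```python
# Minimal sketch of sparse top-1 Mixture-of-Experts routing (illustrative).
# More experts = more learnable parameters, but each token runs through
# exactly one expert, so inference cost per token does not grow.

def top1_route(gate_scores):
    """Pick the index of the expert with the highest gate score."""
    return max(range(len(gate_scores)), key=lambda i: gate_scores[i])

def moe_layer(token, experts, gate):
    """Apply only the selected expert; the rest are skipped entirely."""
    scores = gate(token)
    return experts[top1_route(scores)](token)

# Toy example: two "experts" are simple scalar functions,
# and the gate routes small inputs to expert 0, large ones to expert 1.
experts = [lambda x: x * 2, lambda x: x + 100]
gate = lambda x: [1.0, 0.0] if x < 10 else [0.0, 1.0]

print(moe_layer(3, experts, gate))    # expert 0: 3 * 2 = 6
print(moe_layer(50, experts, gate))   # expert 1: 50 + 100 = 150
```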
1 code implementation • 18 Apr 2023 • Yuexin Wu, I-Chan Huang, Xiaolei Huang
Experiments demonstrate the effectiveness of our approach in enhancing overall model robustness as well as robustness on infrequent tokens.
1 code implementation • 21 Oct 2022 • Ziqi Wang, Yuexin Wu, Frederick Liu, Daogao Liu, Le Hou, Hongkun Yu, Jing Li, Heng Ji
However, these data augmentation methods either potentially cause shifts in decision boundaries (representation interpolation), are not expressive enough (token replacement), or introduce too much computational overhead (augmentation with models).
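For concreteness, here is a sketch of token replacement, one of the augmentation families contrasted above. The helper and synonym table are hypothetical examples, not the paper's method; they illustrate why this family is cheap and label-preserving but limited in expressiveness, since it can only swap individual tokens.

```python
import random

def token_replace(tokens, synonyms, p=0.15, rng=random):
    """Replace each token with a random synonym with probability p."""
    out = []
    for tok in tokens:
        if tok in synonyms and rng.random() < p:
            out.append(rng.choice(synonyms[tok]))
        else:
            out.append(tok)
    return out

# Toy synonym table (illustrative only).
synonyms = {"quick": ["fast", "rapid"], "happy": ["glad"]}
print(token_replace(["the", "quick", "fox"], synonyms, p=1.0))
```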
no code implementations • 20 Oct 2022 • Jiaxin Huang, Shixiang Shane Gu, Le Hou, Yuexin Wu, Xuezhi Wang, Hongkun Yu, Jiawei Han
We show that our approach improves the general reasoning ability of a 540B-parameter LLM (74.4%->82.1% on GSM8K, 78.2%->83.0% on DROP, 90.0%->94.4% on OpenBookQA, and 63.4%->67.9% on ANLI-A3) and achieves state-of-the-art-level performance, without any ground truth label.
Ranked #1 on Question Answering on DROP
1 code implementation • *SEM (NAACL) 2022 • Yuexin Wu, Xiaolei Huang
Unsupervised domain adaptation (UDA) improves model performance using only the available annotations from the source domain together with unlabeled data from the target domain.
no code implementations • ACL 2022 • Le Hou, Richard Yuanzhe Pang, Tianyi Zhou, Yuexin Wu, Xinying Song, Xiaodan Song, Denny Zhou
Transformer-based models generally allocate the same amount of computation for each token in a given sequence.
1 code implementation • 24 Feb 2022 • Zhuoning Yuan, Yuexin Wu, Zi-Hao Qiu, Xianzhi Du, Lijun Zhang, Denny Zhou, Tianbao Yang
In this paper, we study contrastive learning from an optimization perspective, aiming to analyze and address a fundamental issue of existing contrastive learning methods that either rely on a large batch size or a large dictionary of feature vectors.
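The batch-size dependence noted above can be seen directly in the standard InfoNCE contrastive loss, sketched below. This is a generic illustration of the issue, not the paper's proposed objective: the denominator sums over in-batch negatives, so the loss value (and its gradients) changes with the number of negatives available.

```python
import math

def info_nce(pos_sim, neg_sims, temperature=0.1):
    """InfoNCE: -log( exp(pos/t) / (exp(pos/t) + sum_j exp(neg_j/t)) )."""
    logits = [pos_sim / temperature] + [s / temperature for s in neg_sims]
    m = max(logits)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(pos_sim / temperature - log_denom)

# Same positive similarity, different numbers of negatives:
few = info_nce(0.9, [0.1] * 8)     # small batch
many = info_nce(0.9, [0.1] * 512)  # large batch
print(few, many)  # the loss grows with the negative pool
```

This sensitivity is why methods in this line of work either use very large batches or maintain a large feature dictionary; the paper studies the problem from an optimization perspective instead.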
1 code implementation • 9 Dec 2020 • Yuexin Wu, Xiaolei Huang
Rating prediction is a core problem in recommender systems: it quantifies a user's preferences toward items. However, rating imbalance is naturally rooted in real-world user ratings, causing biased predictions and poor performance on tail ratings.
1 code implementation • 8 Dec 2020 • Yuexin Wu, Tianyu Gao, Sihao Wang, Zhongmin Xiong
As the first attempt in this field to address this problem, we propose a flexible dual-optimizer model to gain robustness from both regression loss and classification loss.
1 code implementation • 12 Jun 2020 • Donghan Yu, Yiming Yang, Ruohong Zhang, Yuexin Wu
Recently, a considerable literature has grown up around the theme of Graph Convolutional Network (GCN).
4 code implementations • 17 Nov 2019 • Donghan Yu, Ruohong Zhang, Zhengbao Jiang, Yuexin Wu, Yiming Yang
Graph Convolutional Networks (GCNs) have received increasing attention in the machine learning community for effectively leveraging both the content features of nodes and the linkage patterns across graphs in various applications.
no code implementations • 16 Oct 2019 • Yuexin Wu, Yichong Xu, Aarti Singh, Yiming Yang, Artur Dubrawski
Graph Neural Networks (GNNs) for prediction tasks such as node classification and edge prediction have received increasing attention in recent machine learning research on graph-structured data.
no code implementations • 25 Sep 2019 • Yuexin Wu, Yichong Xu, Aarti Singh, Artur Dubrawski, Yiming Yang
Graph Neural Networks (GNNs) for prediction tasks such as node classification and edge prediction have received increasing attention in recent machine learning research on graph-structured data.
1 code implementation • CVPR 2019 • Yitong Li, Zhe Gan, Yelong Shen, Jingjing Liu, Yu Cheng, Yuexin Wu, Lawrence Carin, David Carlson, Jianfeng Gao
We therefore propose a new story-to-image-sequence generation model, StoryGAN, based on the sequential conditional GAN framework.
1 code implementation • 19 Nov 2018 • Yuexin Wu, Xiujun Li, Jingjing Liu, Jianfeng Gao, Yiming Yang
Training task-completion dialogue agents with reinforcement learning usually requires a large number of real user experiences.
1 code implementation • EMNLP 2018 • Ruochen Xu, Yiming Yang, Naoki Otani, Yuexin Wu
Supervised methods for this problem rely on the availability of cross-lingual supervision, either using parallel corpora or bilingual lexicons as the labeled data for training, which may not be available for many low resource languages.
1 code implementation • WS 2018 • Junjie Hu, Wei-Cheng Chang, Yuexin Wu, Graham Neubig
In this paper, we propose a method to effectively encode the local and global contextual information for each target word using a three-part neural network approach.
1 code implementation • ICML 2017 • Hanxiao Liu, Yuexin Wu, Yiming Yang
Large-scale multi-relational embedding refers to the task of learning the latent representations for entities and relations in large knowledge graphs.
Ranked #24 on Link Prediction on WN18
no code implementations • NeurIPS 2016 • Zhilin Yang, Ye Yuan, Yuexin Wu, Ruslan Salakhutdinov, William W. Cohen
We propose a novel extension of the encoder-decoder framework, called a review network.
no code implementations • 8 Dec 2014 • Yichao Zhou, Yuexin Wu, Jianyang Zeng
The computation of the global minimum energy conformation (GMEC) is an important and challenging topic in structure-based computational protein design.