no code implementations • 25 Mar 2024 • Shujian Zhang, Lemeng Wu, Chengyue Gong, Xingchao Liu
Extensive experiments and ablation studies demonstrate that our method is general, effective, and beneficial for many NLP tasks.
no code implementations • 29 Jan 2024 • Khai Nguyen, Shujian Zhang, Tam Le, Nhat Ho
From the RPD, we derive the random-path slicing distribution (RPSD) and two variants of sliced Wasserstein, i.e., the Random-Path Projection Sliced Wasserstein (RPSW) and the Importance Weighted Random-Path Projection Sliced Wasserstein (IWRPSW).
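The construction admits a short Monte Carlo sketch, assuming (as a simplification) that each random-path projecting direction is the normalized difference of a random pair of points, one drawn from each input measure; `rpsw` and its signature are illustrative names, not from the paper's code.

```python
import numpy as np

def rpsw(X, Y, n_projections=100, p=2, rng=None):
    """Monte Carlo sketch of Random-Path Projection Sliced Wasserstein.

    X, Y: (n, d) point clouds (empirical measures); equal sizes assumed
    so the 1-D Wasserstein distance reduces to sorted projections.
    """
    assert X.shape == Y.shape
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    total = 0.0
    for _ in range(n_projections):
        # A "random path": one point from each measure; its direction
        # is the projecting direction (a simplified slicing distribution).
        x, y = X[rng.integers(n)], Y[rng.integers(n)]
        theta = (x - y) / (np.linalg.norm(x - y) + 1e-12)
        px, py = np.sort(X @ theta), np.sort(Y @ theta)
        total += np.mean(np.abs(px - py) ** p)
    return (total / n_projections) ** (1.0 / p)
```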
no code implementations • 4 May 2023 • Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
Ultimately, with this prompt paragraph, AutoML-GPT will automatically conduct the experiments, from data processing through model architecture and hyperparameter tuning to a predicted training log.
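As a rough illustration of what such a prompt paragraph might look like, the hypothetical helper below fuses a data card and a model card into a single instruction string; every field name here is an assumption, not AutoML-GPT's actual format.

```python
def build_prompt(data_card: dict, model_card: dict) -> str:
    """Hypothetical sketch: fuse data and model descriptions into the
    single prompt paragraph that drives the automated pipeline."""
    return (
        f"Dataset: {data_card['name']} ({data_card['task']}), "
        f"{data_card['size']} examples, input type {data_card['input_type']}. "
        f"Model: {model_card['architecture']} with {model_card['params']} parameters. "
        "Plan and report: data processing steps, model architecture choices, "
        "hyperparameter settings, and a predicted training log."
    )

prompt = build_prompt(
    {"name": "AG News", "task": "text classification",
     "size": 120_000, "input_type": "text"},
    {"architecture": "BERT-base", "params": "110M"},
)
```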
1 code implementation • 29 Apr 2023 • Korawat Tanwisuth, Shujian Zhang, Huangjie Zheng, Pengcheng He, Mingyuan Zhou
Through prompting, large-scale pre-trained models have become more expressive and powerful, gaining significant attention in recent years.
2 code implementations • 20 Feb 2023 • Yihao Feng, Shentao Yang, Shujian Zhang, JianGuo Zhang, Caiming Xiong, Mingyuan Zhou, Huan Wang
Prior works mainly focus on adopting advanced RL techniques to train the ToD agents, while the design of the reward function is not well studied.
no code implementations • 8 Feb 2023 • Korawat Tanwisuth, Shujian Zhang, Pengcheng He, Mingyuan Zhou
Finally, it refines the target model on the target domain data without guidance from the source model.
no code implementations • CVPR 2023 • Xingchao Liu, Lemeng Wu, Shujian Zhang, Chengyue Gong, Wei Ping, Qiang Liu
To further accelerate the computation of the back-propagation, we propose to use a non-uniform discretization to approximate the ODE trajectory, where we measure how straight the trajectory is and gather the straight parts into one discretization step.
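That trajectory-straightening heuristic can be sketched directly: given states sampled at fine uniform steps, greedily extend each discretization segment while the intermediate points stay near the chord. The function below is an illustrative reconstruction, not the paper's code; `straight_segments` and `tol` are assumed names.

```python
import numpy as np

def straight_segments(traj, tol=1e-3):
    """Greedy sketch: group consecutive trajectory points into one
    discretization step while the path stays nearly straight.

    traj: (T, d) array of ODE states at uniform fine steps.
    A segment [i, j] counts as straight if every intermediate point
    lies within `tol` of the chord from traj[i] to traj[j].
    """
    T = len(traj)
    steps, i = [0], 0
    while i < T - 1:
        j = i + 1
        while j + 1 < T:
            chord = traj[j + 1] - traj[i]
            chord /= np.linalg.norm(chord) + 1e-12
            # Perpendicular deviation of intermediate points from the chord.
            devs = traj[i + 1 : j + 1] - traj[i]
            perp = devs - (devs @ chord)[:, None] * chord[None, :]
            if np.abs(perp).max() > tol:
                break
            j += 1
        steps.append(j)
        i = j
    return steps  # indices at which to place discretization points
```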
no code implementations • 2 Nov 2022 • Shujian Zhang, Chengyue Gong, Xingchao Liu
Experiments on different tasks across open question answering, dialogue conversation, and fact verification show that our method consistently outperforms its baselines.
1 code implementation • 12 Oct 2022 • Shentao Yang, Shujian Zhang, Yihao Feng, Mingyuan Zhou
In offline model-based reinforcement learning (offline MBRL), we learn a dynamics model from historically collected data, and subsequently utilize the learned model and fixed datasets for policy learning, without further interacting with the environment.
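The first stage admits a minimal PyTorch sketch: fit a network to predict the next state and reward from the fixed transition buffer. The names (`DynamicsModel`, `fit`) are illustrative, and the architecture is a generic stand-in for whatever model class the paper uses.

```python
import torch
import torch.nn as nn

class DynamicsModel(nn.Module):
    """Sketch: predict next state and reward from (state, action)."""
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim + 1),  # next state + reward
        )

    def forward(self, s, a):
        out = self.net(torch.cat([s, a], dim=-1))
        return out[..., :-1], out[..., -1]  # (next_state, reward)

def fit(model, dataset, epochs=50, lr=1e-3):
    """dataset yields (s, a, s_next, r) batches from the fixed buffer."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for s, a, s_next, r in dataset:
            pred_s, pred_r = model(s, a)
            loss = ((pred_s - s_next) ** 2).mean() + ((pred_r - r) ** 2).mean()
            opt.zero_grad(); loss.backward(); opt.step()
```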
1 code implementation • 14 Jun 2022 • Shentao Yang, Yihao Feng, Shujian Zhang, Mingyuan Zhou
Offline reinforcement learning (RL) extends the paradigm of classical RL algorithms to purely learning from static datasets, without interacting with the underlying environment during the learning process.
no code implementations • Findings (NAACL) 2022 • Shujian Zhang, Chengyue Gong, Xingchao Liu, Pengcheng He, Weizhu Chen, Mingyuan Zhou
Active learning, which effectively collects informative unlabeled data for annotation, reduces the demand for labeled data.
1 code implementation • 2 Dec 2021 • Xingchao Liu, Chengyue Gong, Lemeng Wu, Shujian Zhang, Hao Su, Qiang Liu
We approach text-to-image generation by combining the power of the pretrained CLIP representation with an off-the-shelf image generator (GAN), optimizing in the latent space of the GAN to find images that achieve maximum CLIP score with the given input text.
Ranked #48 on Text-to-Image Generation on MS COCO
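The optimization loop behind this is compact enough to sketch: freeze both CLIP and the generator, and run gradient ascent on the CLIP image-text score over the latent. The sketch below uses OpenAI's `clip` package; the generator `G` and its `z_dim` attribute are placeholders for any off-the-shelf GAN, and CLIP's image preprocessing is elided for brevity.

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP

def optimize_latent(G, text, steps=200, lr=0.05, device="cuda"):
    """Sketch: ascend the CLIP image-text score over a GAN latent.

    G: a frozen image generator mapping latents z -> images in [0, 1]
       at CLIP's input resolution (placeholder for any off-the-shelf GAN).
    """
    model, _ = clip.load("ViT-B/32", device=device)
    tokens = clip.tokenize([text]).to(device)
    with torch.no_grad():
        text_feat = model.encode_text(tokens)
        text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

    z = torch.randn(1, G.z_dim, device=device, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        img = G(z)  # (1, 3, 224, 224); CLIP normalization omitted here
        img_feat = model.encode_image(img)
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
        loss = -(img_feat * text_feat).sum()  # negative CLIP score
        opt.zero_grad(); loss.backward(); opt.step()
    return G(z).detach()
```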
1 code implementation • NeurIPS 2021 • Shujian Zhang, Xinjie Fan, Huangjie Zheng, Korawat Tanwisuth, Mingyuan Zhou
The neural attention mechanism has been incorporated into deep neural networks to achieve state-of-the-art performance in various domains.
1 code implementation • NeurIPS 2021 • Korawat Tanwisuth, Xinjie Fan, Huangjie Zheng, Shujian Zhang, Hao Zhang, Bo Chen, Mingyuan Zhou
Existing methods for unsupervised domain adaptation often rely on minimizing some statistical distance between the source and target samples in the latent space.
no code implementations • 29 Sep 2021 • Shujian Zhang, Zhibin Duan, Huangjie Zheng, Pengcheng He, Bo Chen, Weizhu Chen, Mingyuan Zhou
Crossformer with state sharing not only provides the desired cross-layer guidance and regularization but also reduces the memory requirement.
1 code implementation • EMNLP 2021 • Shujian Zhang, Chengyue Gong, Eunsol Choi
Introducing such multi-label examples at the cost of annotating fewer examples brings clear gains on the natural language inference and entity typing tasks, even when we simply first train with single-label data and then fine-tune with multi-label examples.
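A minimal sketch of that second stage: fine-tune against the empirical annotator distribution with a soft-label cross-entropy, assuming per-example label counts are available; `soft_label_loss` is an illustrative name, not the paper's code.

```python
import torch
import torch.nn.functional as F

def soft_label_loss(logits, label_counts):
    """Sketch: cross-entropy against the empirical annotator
    distribution, e.g. float counts [2., 1., 0.] over three NLI labels.
    """
    target = label_counts / label_counts.sum(dim=-1, keepdim=True)
    return -(target * F.log_softmax(logits, dim=-1)).sum(-1).mean()

# Stage 1: train on single-label data with ordinary cross-entropy.
# Stage 2: fine-tune on the smaller multi-label set with soft targets, e.g.:
#   loss = soft_label_loss(model(batch), batch_label_counts)
```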
no code implementations • 9 Jun 2021 • Shujian Zhang, Xinjie Fan, Bo Chen, Mingyuan Zhou
Attention-based neural networks have achieved state-of-the-art results on a wide range of tasks.
1 code implementation • Findings (ACL) 2021 • Shujian Zhang, Chengyue Gong, Eunsol Choi
We study calibration in question answering, estimating whether the model correctly predicts the answer for each question.
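One standard way to quantify this is expected calibration error, sketched below; this is the generic binned ECE, not necessarily the exact calibration measure used in the paper.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin predictions by confidence and compare average confidence
    to empirical accuracy in each bin (standard binned ECE)."""
    confidences = np.asarray(confidences)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap  # weight by fraction of samples in bin
    return ece
```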
1 code implementation • ICLR 2021 • Xinjie Fan, Shujian Zhang, Korawat Tanwisuth, Xiaoning Qian, Mingyuan Zhou
However, the quality of uncertainty estimation is highly dependent on the dropout probabilities.
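A minimal sketch of making those probabilities learnable: predict per-unit dropout logits from the input itself and sample a relaxed Bernoulli (concrete) mask so gradients flow through the sampling step. This is a simplified stand-in, not the paper's exact module; the names and relaxation details are assumptions.

```python
import torch
import torch.nn as nn

class InputDependentDropout(nn.Module):
    """Sketch: dropout whose keep probabilities are predicted from the
    input and trained end to end via a relaxed Bernoulli sample."""
    def __init__(self, dim, temperature=0.1):
        super().__init__()
        self.logit_net = nn.Linear(dim, dim)  # per-unit keep logits
        self.temperature = temperature

    def forward(self, x):
        logits = self.logit_net(x)
        if self.training:
            # Concrete relaxation: sigmoid((logits + logistic noise) / temp)
            u = torch.rand_like(logits).clamp(1e-6, 1 - 1e-6)
            noise = torch.log(u) - torch.log1p(-u)
            mask = torch.sigmoid((logits + noise) / self.temperature)
        else:
            mask = torch.sigmoid(logits)  # expected keep probability
        return x * mask
```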
no code implementations • 13 Feb 2021 • Shujian Zhang, Chengyue Gong, Eunsol Choi
We depart from the standard practice of collecting a single reference for each training example, and find that collecting multiple references can achieve better accuracy under a fixed annotation budget.
1 code implementation • NeurIPS 2020 • Xinjie Fan, Shujian Zhang, Bo Chen, Mingyuan Zhou
Attention modules, as simple and effective tools, have not only enabled deep neural networks to achieve state-of-the-art results in many domains, but also enhanced their interpretability.