Search Results for author: Shujian Zhang

Found 21 papers, 11 papers with code

Bayesian Attention Modules

1 code implementation NeurIPS 2020 Xinjie Fan, Shujian Zhang, Bo Chen, Mingyuan Zhou

Attention modules, as simple and effective tools, have not only enabled deep neural networks to achieve state-of-the-art results in many domains, but also enhanced their interpretability.

Image Captioning Machine Translation +4
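For context, the deterministic scaled dot-product attention that these modules build on can be sketched in a few lines; this is the standard formulation, not the paper's Bayesian (stochastic) variant:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # weights = softmax(Q K^T / sqrt(d)); each row is a distribution over keys,
    # which is what makes attention maps readable as soft alignments.
    d = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d), axis=-1)
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(6, 8))
V = rng.normal(size=(6, 8))
out, w = attention(Q, K, V)  # out: (4, 8); w: (4, 6), rows sum to 1
```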

Capturing Label Distribution: A Case Study in NLI

no code implementations 13 Feb 2021 Shujian Zhang, Chengyue Gong, Eunsol Choi

We depart from the standard practice of collecting a single reference per training example, and find that collecting multiple references can achieve better accuracy under a fixed annotation budget.

Natural Language Inference

Bayesian Attention Belief Networks

no code implementations 9 Jun 2021 Shujian Zhang, Xinjie Fan, Bo Chen, Mingyuan Zhou

Attention-based neural networks have achieved state-of-the-art results on a wide range of tasks.

Machine Translation Question Answering +2

Learning with Different Amounts of Annotation: From Zero to Many Labels

1 code implementation EMNLP 2021 Shujian Zhang, Chengyue Gong, Eunsol Choi

Introducing such multi-label examples at the cost of annotating fewer examples brings clear gains on the natural language inference and entity typing tasks, even when we simply first train with single-label data and then fine-tune with multi-label examples.

Data Augmentation Entity Typing +1

Crossformer: Transformer with Alternated Cross-Layer Guidance

no code implementations29 Sep 2021 Shujian Zhang, Zhibin Duan, Huangjie Zheng, Pengcheng He, Bo Chen, Weizhu Chen, Mingyuan Zhou

Crossformer with states sharing not only provides the desired cross-layer guidance and regularization but also reduces the memory requirement.

Inductive Bias Machine Translation +3

A Prototype-Oriented Framework for Unsupervised Domain Adaptation

1 code implementation NeurIPS 2021 Korawat Tanwisuth, Xinjie Fan, Huangjie Zheng, Shujian Zhang, Hao Zhang, Bo Chen, Mingyuan Zhou

Existing methods for unsupervised domain adaptation often rely on minimizing some statistical distance between the source and target samples in the latent space.

Unsupervised Domain Adaptation
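One common choice of such a statistical distance between source and target samples is the kernel maximum mean discrepancy (MMD). The sketch below is an illustrative, biased RBF-kernel MMD estimate chosen for brevity; it is not the paper's prototype-oriented method:

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    # Squared MMD with an RBF kernel: mean pairwise similarity within each
    # sample minus twice the cross-sample similarity (biased estimator).
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
src = rng.normal(size=(200, 2))          # "source" features
tgt_near = rng.normal(size=(200, 2))     # target from the same distribution
tgt_far = rng.normal(loc=3.0, size=(200, 2))  # shifted target
mmd_near = rbf_mmd2(src, tgt_near)
mmd_far = rbf_mmd2(src, tgt_far)         # should be much larger
```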

Alignment Attention by Matching Key and Query Distributions

1 code implementation NeurIPS 2021 Shujian Zhang, Xinjie Fan, Huangjie Zheng, Korawat Tanwisuth, Mingyuan Zhou

The neural attention mechanism has been incorporated into deep neural networks to achieve state-of-the-art performance in various domains.

Graph Attention Question Answering +1

FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization

1 code implementation 2 Dec 2021 Xingchao Liu, Chengyue Gong, Lemeng Wu, Shujian Zhang, Hao Su, Qiang Liu

We approach text-to-image generation by combining the power of the pretrained CLIP representation with an off-the-shelf image generator (a GAN), optimizing in the latent space of the GAN to find images that achieve the maximum CLIP score with the given input text.

Counterfactual Navigate +1
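The CLIP+GAN latent optimization can be caricatured as hill-climbing a black-box score over the latent vector z. In this toy sketch the quadratic `score` is a stand-in for the actual CLIP similarity between the generated image and the text prompt, which would require the real models:

```python
import numpy as np

def maximize_score(score, z0, lr=0.1, steps=200, eps=1e-4):
    # Finite-difference gradient ascent on a black-box score over the latent z.
    z = z0.copy()
    for _ in range(steps):
        g = np.zeros_like(z)
        for i in range(len(z)):
            zp = z.copy(); zp[i] += eps
            zm = z.copy(); zm[i] -= eps
            g[i] = (score(zp) - score(zm)) / (2 * eps)
        z += lr * g
    return z

# Toy score peaked at z* = (1, -2): the optimizer should recover it.
target = np.array([1.0, -2.0])
score = lambda z: -np.sum((z - target) ** 2)
z_opt = maximize_score(score, np.zeros(2))
```

In FuseDream itself the score is differentiable, so ascent uses exact gradients through CLIP and the GAN rather than finite differences.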

Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning

1 code implementation 14 Jun 2022 Shentao Yang, Yihao Feng, Shujian Zhang, Mingyuan Zhou

Offline reinforcement learning (RL) extends the paradigm of classical RL algorithms to purely learning from static datasets, without interacting with the underlying environment during the learning process.

Continuous Control Offline RL +2

A Unified Framework for Alternating Offline Model Training and Policy Learning

1 code implementation 12 Oct 2022 Shentao Yang, Shujian Zhang, Yihao Feng, Mingyuan Zhou

In offline model-based reinforcement learning (offline MBRL), we learn a dynamic model from historically collected data, and subsequently utilize the learned model and fixed datasets for policy learning, without further interacting with the environment.

Continuous Control Model-based Reinforcement Learning +2

Passage-Mask: A Learnable Regularization Strategy for Retriever-Reader Models

no code implementations 2 Nov 2022 Shujian Zhang, Chengyue Gong, Xingchao Liu

Experiments on different tasks across open question answering, dialogue conversation, and fact verification show that our method consistently outperforms its baselines.

Answer Generation Fact Verification +2

FlowGrad: Controlling the Output of Generative ODEs With Gradients

no code implementations CVPR 2023 Xingchao Liu, Lemeng Wu, Shujian Zhang, Chengyue Gong, Wei Ping, Qiang Liu

To further accelerate the computation of the back-propagation, we propose to use a non-uniform discretization to approximate the ODE trajectory, where we measure how straight the trajectory is and gather the straight parts into one discretization step.

Image Manipulation
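The straightness-based discretization can be sketched as follows: measure how far intermediate trajectory points deviate from the chord between two states, and greedily collapse near-straight stretches into one step. This is an illustrative reconstruction of the idea, not the authors' code:

```python
import numpy as np

def chord_deviation(traj, i, j):
    # Max distance of intermediate points from the straight line traj[i] -> traj[j].
    a, b = traj[i], traj[j]
    chord = b - a
    denom = np.dot(chord, chord) + 1e-12
    devs = []
    for k in range(i + 1, j):
        t = np.dot(traj[k] - a, chord) / denom      # projection onto the chord
        devs.append(np.linalg.norm(traj[k] - (a + t * chord)))
    return max(devs, default=0.0)

def merge_straight_steps(traj, tol=1e-2):
    # Greedily grow each step while the trajectory stays within `tol` of a
    # chord, so straight stretches become a single discretization step.
    idx, i, n = [0], 0, len(traj) - 1
    while i < n:
        j = i + 1
        while j < n and chord_deviation(traj, i, j + 1) <= tol:
            j += 1
        idx.append(j)
        i = j
    return idx

# A path that is straight, then bends: the straight run collapses to one step.
t = np.linspace(0, 1, 11)[:, None]
straight = np.hstack([t, t])                     # 11 collinear points
bend = np.array([[1.1, 0.9], [1.2, 0.7]])
traj = np.vstack([straight, bend])
steps_idx = merge_straight_steps(traj, tol=1e-3)  # keeps few non-straight steps
```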

A Prototype-Oriented Clustering for Domain Shift with Source Privacy

no code implementations 8 Feb 2023 Korawat Tanwisuth, Shujian Zhang, Pengcheng He, Mingyuan Zhou

Finally, it refines the target model on the target domain data without guidance from the source model.

Clustering

POUF: Prompt-oriented unsupervised fine-tuning for large pre-trained models

1 code implementation 29 Apr 2023 Korawat Tanwisuth, Shujian Zhang, Huangjie Zheng, Pengcheng He, Mingyuan Zhou

Through prompting, large-scale pre-trained models have become more expressive and powerful, gaining significant attention in recent years.

Image Classification Natural Language Inference +1

AutoML-GPT: Automatic Machine Learning with GPT

no code implementations 4 May 2023 Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou

Ultimately, with this prompt paragraph, AutoML-GPT will automatically conduct the experiments, from data processing to model architecture and hyperparameter tuning, and predict the training log.

AutoML

Sliced Wasserstein with Random-Path Projecting Directions

no code implementations 29 Jan 2024 Khai Nguyen, Shujian Zhang, Tam Le, Nhat Ho

From the RPD, we derive the random-path slicing distribution (RPSD) and two variants of sliced Wasserstein, i.e., the Random-Path Projection Sliced Wasserstein (RPSW) and the Importance Weighted Random-Path Projection Sliced Wasserstein (IWRPSW).

Denoising
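The baseline that RPSW and IWRPSW modify is the Monte Carlo sliced Wasserstein distance with projecting directions drawn uniformly on the sphere. A minimal sketch of that baseline (uniform, not random-path, directions), assuming equal sample sizes:

```python
import numpy as np

def sliced_wasserstein(X, Y, n_proj=200, p=2, rng=None):
    # Project both samples onto random unit directions, then average the 1D
    # Wasserstein-p distances, which have a closed form via sorted projections.
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    theta = rng.normal(size=(n_proj, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)  # uniform on sphere
    Xp = np.sort(X @ theta.T, axis=0)                      # (n, n_proj)
    Yp = np.sort(Y @ theta.T, axis=0)
    return float(np.mean(np.abs(Xp - Yp) ** p) ** (1 / p))

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
Y = rng.normal(size=(500, 3)) + 2.0   # same law, shifted by (2, 2, 2)
d_shift = sliced_wasserstein(X, Y, rng=1)   # roughly |shift| / sqrt(d) ≈ 2
d_same = sliced_wasserstein(X, X, rng=1)    # exactly 0 for identical samples
```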

Language Rectified Flow: Advancing Diffusion Language Generation with Probabilistic Flows

no code implementations 25 Mar 2024 Shujian Zhang, Lemeng Wu, Chengyue Gong, Xingchao Liu

Extensive experiments and ablation studies demonstrate that our method can be general, effective, and beneficial for many NLP tasks.

Language Modelling Sentence +1
