Search Results for author: Shujian Zhang

Found 21 papers, 11 papers with code

Bayesian Attention Modules

1 code implementation NeurIPS 2020 Xinjie Fan, Shujian Zhang, Bo Chen, Mingyuan Zhou

Attention modules, as simple and effective tools, have not only enabled deep neural networks to achieve state-of-the-art results in many domains, but also enhanced their interpretability.

Image Captioning Machine Translation +4
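For context, the deterministic scaled dot-product attention that these modules build on can be sketched in a few lines; this is the standard formulation, not the paper's Bayesian (stochastic) variant:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # weights = softmax(Q K^T / sqrt(d)); each row is a distribution over keys,
    # which is what makes attention maps readable as soft alignments.
    d = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d), axis=-1)
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(6, 8))
V = rng.normal(size=(6, 8))
out, w = attention(Q, K, V)  # out: (4, 8); w: (4, 6), rows sum to 1
```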

Capturing Label Distribution: A Case Study in NLI

no code implementations 13 Feb 2021 Shujian Zhang, Chengyue Gong, Eunsol Choi

We depart from the standard practice of collecting a single reference per training example, and find that collecting multiple references can achieve better accuracy under a fixed annotation budget.

Natural Language Inference

Bayesian Attention Belief Networks

no code implementations 9 Jun 2021 Shujian Zhang, Xinjie Fan, Bo Chen, Mingyuan Zhou

Attention-based neural networks have achieved state-of-the-art results on a wide range of tasks.

Machine Translation Question Answering +2

Learning with Different Amounts of Annotation: From Zero to Many Labels

1 code implementation EMNLP 2021 Shujian Zhang, Chengyue Gong, Eunsol Choi

Introducing such multi-label examples at the cost of annotating fewer examples brings clear gains on the natural language inference and entity typing tasks, even when we simply first train with single-label data and then fine-tune with multi-label examples.

Data Augmentation Entity Typing +1

Crossformer: Transformer with Alternated Cross-Layer Guidance

no code implementations29 Sep 2021 Shujian Zhang, Zhibin Duan, Huangjie Zheng, Pengcheng He, Bo Chen, Weizhu Chen, Mingyuan Zhou

Crossformer with states sharing not only provides the desired cross-layer guidance and regularization but also reduces the memory requirement.

Inductive Bias Machine Translation +3

A Prototype-Oriented Framework for Unsupervised Domain Adaptation

1 code implementation NeurIPS 2021 Korawat Tanwisuth, Xinjie Fan, Huangjie Zheng, Shujian Zhang, Hao Zhang, Bo Chen, Mingyuan Zhou

Existing methods for unsupervised domain adaptation often rely on minimizing some statistical distance between the source and target samples in the latent space.

Unsupervised Domain Adaptation
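One common choice of such a statistical distance between source and target samples is the kernel maximum mean discrepancy (MMD). The sketch below is an illustrative, biased RBF-kernel MMD estimate chosen for brevity; it is not the paper's prototype-oriented method:

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    # Squared MMD with an RBF kernel: mean pairwise similarity within each
    # sample minus twice the cross-sample similarity (biased estimator).
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
src = rng.normal(size=(200, 2))          # "source" features
tgt_near = rng.normal(size=(200, 2))     # target from the same distribution
tgt_far = rng.normal(loc=3.0, size=(200, 2))  # shifted target
mmd_near = rbf_mmd2(src, tgt_near)
mmd_far = rbf_mmd2(src, tgt_far)         # should be much larger
```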

Alignment Attention by Matching Key and Query Distributions

1 code implementation NeurIPS 2021 Shujian Zhang, Xinjie Fan, Huangjie Zheng, Korawat Tanwisuth, Mingyuan Zhou

The neural attention mechanism has been incorporated into deep neural networks to achieve state-of-the-art performance in various domains.

Graph Attention Question Answering +1

FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization

1 code implementation 2 Dec 2021 Xingchao Liu, Chengyue Gong, Lemeng Wu, Shujian Zhang, Hao Su, Qiang Liu

We approach text-to-image generation by combining the power of the pretrained CLIP representation with an off-the-shelf image generator (a GAN), optimizing in the latent space of the GAN to find images that achieve the maximum CLIP score with the given input text.

Counterfactual Navigate +1
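The CLIP+GAN latent optimization can be caricatured as hill-climbing a black-box score over the latent vector z. In this toy sketch the quadratic `score` is a stand-in for the actual CLIP similarity between the generated image and the text prompt, which would require the real models:

```python
import numpy as np

def maximize_score(score, z0, lr=0.1, steps=200, eps=1e-4):
    # Finite-difference gradient ascent on a black-box score over the latent z.
    z = z0.copy()
    for _ in range(steps):
        g = np.zeros_like(z)
        for i in range(len(z)):
            zp = z.copy(); zp[i] += eps
            zm = z.copy(); zm[i] -= eps
            g[i] = (score(zp) - score(zm)) / (2 * eps)
        z += lr * g
    return z

# Toy score peaked at z* = (1, -2): the optimizer should recover it.
target = np.array([1.0, -2.0])
score = lambda z: -np.sum((z - target) ** 2)
z_opt = maximize_score(score, np.zeros(2))
```

In FuseDream itself the score is differentiable, so ascent uses exact gradients through CLIP and the GAN rather than finite differences.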

Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning

1 code implementation 14 Jun 2022 Shentao Yang, Yihao Feng, Shujian Zhang, Mingyuan Zhou

Offline reinforcement learning (RL) extends the paradigm of classical RL algorithms to purely learning from static datasets, without interacting with the underlying environment during the learning process.

Continuous Control Offline RL +2

A Unified Framework for Alternating Offline Model Training and Policy Learning

1 code implementation 12 Oct 2022 Shentao Yang, Shujian Zhang, Yihao Feng, Mingyuan Zhou

In offline model-based reinforcement learning (offline MBRL), we learn a dynamic model from historically collected data, and subsequently utilize the learned model and fixed datasets for policy learning, without further interacting with the environment.

Continuous Control Model-based Reinforcement Learning +2

Passage-Mask: A Learnable Regularization Strategy for Retriever-Reader Models

no code implementations 2 Nov 2022 Shujian Zhang, Chengyue Gong, Xingchao Liu

Experiments on different tasks across open question answering, dialogue conversation, and fact verification show that our method consistently outperforms its baselines.

Answer Generation Fact Verification +2

FlowGrad: Controlling the Output of Generative ODEs With Gradients

no code implementations CVPR 2023 Xingchao Liu, Lemeng Wu, Shujian Zhang, Chengyue Gong, Wei Ping, Qiang Liu

To further accelerate the computation of the back-propagation, we propose to use a non-uniform discretization to approximate the ODE trajectory, where we measure how straight the trajectory is and gather the straight parts into one discretization step.

Image Manipulation
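The straightness-based discretization can be sketched as follows: measure how far intermediate trajectory points deviate from the chord between two states, and greedily collapse near-straight stretches into one step. This is an illustrative reconstruction of the idea, not the authors' code:

```python
import numpy as np

def chord_deviation(traj, i, j):
    # Max distance of intermediate points from the straight line traj[i] -> traj[j].
    a, b = traj[i], traj[j]
    chord = b - a
    denom = np.dot(chord, chord) + 1e-12
    devs = []
    for k in range(i + 1, j):
        t = np.dot(traj[k] - a, chord) / denom      # projection onto the chord
        devs.append(np.linalg.norm(traj[k] - (a + t * chord)))
    return max(devs, default=0.0)

def merge_straight_steps(traj, tol=1e-2):
    # Greedily grow each step while the trajectory stays within `tol` of a
    # chord, so straight stretches become a single discretization step.
    idx, i, n = [0], 0, len(traj) - 1
    while i < n:
        j = i + 1
        while j < n and chord_deviation(traj, i, j + 1) <= tol:
            j += 1
        idx.append(j)
        i = j
    return idx

# A path that is straight, then bends: the straight run collapses to one step.
t = np.linspace(0, 1, 11)[:, None]
straight = np.hstack([t, t])                     # 11 collinear points
bend = np.array([[1.1, 0.9], [1.2, 0.7]])
traj = np.vstack([straight, bend])
steps_idx = merge_straight_steps(traj, tol=1e-3)  # keeps few non-straight steps
```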

A Prototype-Oriented Clustering for Domain Shift with Source Privacy

no code implementations 8 Feb 2023 Korawat Tanwisuth, Shujian Zhang, Pengcheng He, Mingyuan Zhou

Finally, it refines the target model on the target domain data without guidance from the source model.

Clustering

POUF: Prompt-oriented unsupervised fine-tuning for large pre-trained models

1 code implementation 29 Apr 2023 Korawat Tanwisuth, Shujian Zhang, Huangjie Zheng, Pengcheng He, Mingyuan Zhou

Through prompting, large-scale pre-trained models have become more expressive and powerful, gaining significant attention in recent years.

Image Classification Natural Language Inference +1

AutoML-GPT: Automatic Machine Learning with GPT

no code implementations 4 May 2023 Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou

Ultimately, with this prompt paragraph, AutoML-GPT will automatically conduct the experiments, from data processing to model architecture and hyperparameter tuning, and predict the training log.

AutoML

Sliced Wasserstein with Random-Path Projecting Directions

no code implementations 29 Jan 2024 Khai Nguyen, Shujian Zhang, Tam Le, Nhat Ho

From the RPD, we derive the random-path slicing distribution (RPSD) and two variants of sliced Wasserstein, i.e., the Random-Path Projection Sliced Wasserstein (RPSW) and the Importance Weighted Random-Path Projection Sliced Wasserstein (IWRPSW).

Denoising
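The baseline that RPSW and IWRPSW modify is the Monte Carlo sliced Wasserstein distance with projecting directions drawn uniformly on the sphere. A minimal sketch of that baseline (uniform, not random-path, directions), assuming equal sample sizes:

```python
import numpy as np

def sliced_wasserstein(X, Y, n_proj=200, p=2, rng=None):
    # Project both samples onto random unit directions, then average the 1D
    # Wasserstein-p distances, which have a closed form via sorted projections.
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    theta = rng.normal(size=(n_proj, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)  # uniform on sphere
    Xp = np.sort(X @ theta.T, axis=0)                      # (n, n_proj)
    Yp = np.sort(Y @ theta.T, axis=0)
    return float(np.mean(np.abs(Xp - Yp) ** p) ** (1 / p))

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
Y = rng.normal(size=(500, 3)) + 2.0   # same law, shifted by (2, 2, 2)
d_shift = sliced_wasserstein(X, Y, rng=1)   # roughly |shift| / sqrt(d) ≈ 2
d_same = sliced_wasserstein(X, X, rng=1)    # exactly 0 for identical samples
```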

Language Rectified Flow: Advancing Diffusion Language Generation with Probabilistic Flows

no code implementations 25 Mar 2024 Shujian Zhang, Lemeng Wu, Chengyue Gong, Xingchao Liu

Extensive experiments and ablation studies demonstrate that our method can be general, effective, and beneficial for many NLP tasks.

Language Modelling Sentence +1
