no code implementations • Findings (NAACL) 2022 • Chengyue Gong, Xiaocong Du, Dhruv Choudhary, Bhargav Bhushanam, Qiang Liu, Arun Kejariwal
On the definition side, we reduce the bias in transfer loss by focusing on the items to which information from high-frequency items can be efficiently transferred.
no code implementations • ICML 2020 • Mao Ye, Chengyue Gong, Lizhen Nie, Denny Zhou, Adam Klivans, Qiang Liu
Theoretically, we show that the small networks pruned using our method achieve provably lower loss than small networks trained from scratch with the same size.
no code implementations • 25 Mar 2024 • Shujian Zhang, Lemeng Wu, Chengyue Gong, Xingchao Liu
Extensive experiments and ablation studies demonstrate that our method is general, effective, and beneficial for many NLP tasks.
no code implementations • 4 May 2023 • Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
Ultimately, with this prompt paragraph, AutoML-GPT will automatically conduct the experiments, from data processing to model architecture and hyperparameter tuning, and produce the predicted training log.
no code implementations • CVPR 2023 • Xingchao Liu, Lemeng Wu, Shujian Zhang, Chengyue Gong, Wei Ping, Qiang Liu
To further accelerate back-propagation, we propose a non-uniform discretization to approximate the ODE trajectory: we measure how straight the trajectory is and gather the straight parts into one discretization step.
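To make the idea concrete, here is a minimal sketch of such a straightness-based step selection; the chord-deviation test and the tolerance `tol` are illustrative assumptions, not the paper's exact straightness measure.

```python
import torch

def straightness_schedule(traj, tol=1e-3):
    """Greedily grow each discretization step along a dense ODE trajectory
    traj (tensor of shape (N+1, dim)): a step is extended while every
    intermediate state stays within `tol` of the straight chord between
    the step's endpoints (schematic straightness test only)."""
    keep, start = [0], 0
    for end in range(2, traj.shape[0]):
        for k in range(start + 1, end):
            alpha = (k - start) / (end - start)
            chord = (1 - alpha) * traj[start] + alpha * traj[end]
            if (traj[k] - chord).norm() > tol:   # segment no longer straight
                keep.append(end - 1)             # close the step just before here
                start = end - 1
                break
    if keep[-1] != traj.shape[0] - 1:
        keep.append(traj.shape[0] - 1)           # always keep the final state
    return keep
```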
1 code implementation • CVPR 2023 • Lemeng Wu, Dilin Wang, Chengyue Gong, Xingchao Liu, Yunyang Xiong, Rakesh Ranjan, Raghuraman Krishnamoorthi, Vikas Chandra, Qiang Liu
We perform evaluations on multiple 3D tasks and find that our PSF performs comparably to the standard diffusion model, outperforming other efficient 3D point cloud generation methods.
no code implementations • 2 Nov 2022 • Shujian Zhang, Chengyue Gong, Xingchao Liu
Experiments on different tasks across open question answering, dialogue conversation, and fact verification show that our method consistently outperforms its baselines.
3 code implementations • 7 Sep 2022 • Xingchao Liu, Chengyue Gong, Qiang Liu
The idea of rectified flow is to learn an ODE that follows, as closely as possible, the straight paths connecting points drawn from π_0 and π_1.
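A minimal sketch of the resulting training objective, assuming a hypothetical `velocity_net(x, t)` model and paired samples from the two distributions:

```python
import torch

def rectified_flow_loss(velocity_net, x0, x1):
    """Rectified-flow objective: regress the learned velocity field
    v(x_t, t) onto the constant slope (x1 - x0) of the straight path
    x_t = t * x1 + (1 - t) * x0. Assumes flattened inputs (batch, dim)."""
    t = torch.rand(x0.shape[0], 1, device=x0.device)  # t ~ Uniform(0, 1)
    xt = t * x1 + (1.0 - t) * x0                      # point on the straight path
    target = x1 - x0                                  # slope of the straight path
    return ((velocity_net(xt, t) - target) ** 2).mean()
```

Because the regression target is the constant slope x1 - x0, a well-fit model traces nearly straight paths that can be simulated with very few discretization steps.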
no code implementations • 2 Sep 2022 • Lemeng Wu, Chengyue Gong, Xingchao Liu, Mao Ye, Qiang Liu
AI-based molecule generation provides a promising approach to a wide range of problems in biomedical science and engineering, such as antibody design, hydrolase engineering, or vaccine development.
no code implementations • Findings (NAACL) 2022 • Shujian Zhang, Chengyue Gong, Xingchao Liu, Pengcheng He, Weizhu Chen, Mingyuan Zhou
Active learning, which effectively collects informative unlabeled data for annotation, reduces the demand for labeled data.
no code implementations • 16 Feb 2022 • Chengyue Gong, Lemeng Wu, Qiang Liu
Although traditional optimization methods focus on finding a single optimal solution, most objective functions in modern machine learning problems, especially in deep learning, have multiple or even infinitely many optima.
1 code implementation • 2 Dec 2021 • Xingchao Liu, Chengyue Gong, Lemeng Wu, Shujian Zhang, Hao Su, Qiang Liu
We approach text-to-image generation by combining the power of the pretrained CLIP representation with an off-the-shelf image generator (GAN), optimizing in the GAN's latent space to find images that maximize the CLIP score for the given input text; a schematic of this loop appears below.
Ranked #48 on Text-to-Image Generation on MS COCO
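A schematic of that latent-space optimization loop, assuming pretrained `generator` and `clip_model` handles (the paper's actual objective adds augmentations and other refinements):

```python
import torch
import torch.nn.functional as F

def optimize_latent(generator, clip_model, text_features, steps=200, lr=0.05):
    """Gradient-ascend a GAN latent z so that the generated image's CLIP
    embedding aligns with the given text embedding. `generator.z_dim`
    and the omitted image resizing/normalization are assumptions."""
    z = torch.randn(1, generator.z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        image = generator(z)                             # synthesize an image from z
        image_features = clip_model.encode_image(image)  # CLIP image embedding
        loss = -F.cosine_similarity(image_features, text_features).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return generator(z).detach()
```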
no code implementations • NeurIPS 2021 • Chengyue Gong, Mao Ye, Qiang Liu
We propose a general method to construct a centroid approximation for the distribution of the maximum points of a random function (a.k.a. the argmax distribution).
no code implementations • NeurIPS 2021 • Chengyue Gong, Xingchao Liu, Qiang Liu
In this work, we consider constrained optimization as a more principled approach for trading off two losses, with special emphasis on lexicographic optimization, a degenerate limit of constrained optimization that optimizes a secondary loss within the optimal set of the main loss.
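As a hedged illustration, one step of a dynamic-barrier-style update for this trade-off might look as follows; `f` and `g` are callables returning scalar primary and secondary losses, and the barrier choice `alpha * ||grad_f||^2` is only one of several possibilities:

```python
import torch

def lexicographic_step(x, f, g, lr=0.01, alpha=1.0):
    """One dynamic-barrier-style step: descend the secondary loss g while
    enforcing at least phi = alpha * ||grad_f||^2 of descent on the
    primary loss f (schematic; barrier choices vary)."""
    x = x.detach().requires_grad_(True)
    grad_f = torch.autograd.grad(f(x), x)[0]
    grad_g = torch.autograd.grad(g(x), x)[0]
    gf_sq = grad_f.pow(2).sum()
    phi = alpha * gf_sq                                  # required descent on f
    lam = ((phi - (grad_f * grad_g).sum()) / gf_sq.clamp_min(1e-12)).clamp_min(0.0)
    return (x - lr * (grad_g + lam * grad_f)).detach()   # combined descent direction
```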
1 code implementation • ICLR 2022 • Chengyue Gong, Dilin Wang, Meng Li, Xinlei Chen, Zhicheng Yan, Yuandong Tian, Qiang Liu, Vikas Chandra
In this work, we observe that the poor performance is due to a gradient conflict issue: the gradients of different sub-networks conflict with those of the supernet more severely in ViTs than in CNNs, which leads to early saturation in training and inferior convergence.
Ranked #7 on Neural Architecture Search on ImageNet
1 code implementation • EMNLP 2021 • Shujian Zhang, Chengyue Gong, Eunsol Choi
Introducing such multi-label examples, at the cost of annotating fewer examples overall, brings clear gains on natural language inference and entity typing tasks, even when we simply train first on single-label data and then fine-tune on multi-label examples.
no code implementations • CVPR 2021 • Chengyue Gong, Tongzheng Ren, Mao Ye, Qiang Liu
The idea is to generate a set of augmented data with random perturbations or transforms, and minimize the maximum, or worst-case, loss over the augmented data.
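A minimal sketch of this worst-case objective, where `augment` is a hypothetical random-augmentation function and `loss_fn` returns per-sample losses:

```python
import torch

def maxup_loss(model, loss_fn, x, y, augment, m=4):
    """Worst-case augmentation objective: draw m random augmentations of
    each input and back-propagate only the maximum per-sample loss.
    `loss_fn` must return per-sample losses (e.g. reduction='none')."""
    losses = torch.stack([loss_fn(model(augment(x)), y) for _ in range(m)])
    return losses.max(dim=0).values.mean()  # minimize the maximum loss
```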
1 code implementation • Findings (ACL) 2021 • Shujian Zhang, Chengyue Gong, Eunsol Choi
We study calibration in question answering, estimating whether the model correctly predicts the answer to each question.
1 code implementation • 26 Apr 2021 • Chengyue Gong, Dilin Wang, Meng Li, Vikas Chandra, Qiang Liu
To alleviate this problem, in this work we introduce novel loss functions into vision transformer training to explicitly encourage diversity across patch representations for more discriminative feature extraction; a simplified sketch of one such regularizer follows below.
Ranked #19 on Semantic Segmentation on Cityscapes val
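One simplified reading of such a diversity loss is a patch-wise cosine-similarity penalty (the paper combines several losses; this is an illustrative sketch only):

```python
import torch
import torch.nn.functional as F

def patch_diversity_penalty(patch_tokens):
    """Mean pairwise cosine similarity between patch representations;
    adding this penalty pushes patches toward more diverse features.
    patch_tokens: (batch, num_patches, dim)."""
    h = F.normalize(patch_tokens, dim=-1)
    sim = h @ h.transpose(1, 2)                            # (batch, P, P) cosine similarities
    p = sim.shape[-1]
    off_diag = sim.sum(dim=(1, 2)) - sim.diagonal(dim1=1, dim2=2).sum(-1)
    return (off_diag / (p * (p - 1))).mean()               # average off-diagonal similarity
```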
2 code implementations • 16 Feb 2021 • Dilin Wang, Chengyue Gong, Meng Li, Qiang Liu, Vikas Chandra
Weight-sharing NAS builds a supernet that assembles all the architectures as its sub-networks and jointly trains the supernet with the sub-networks.
Ranked #12 on Neural Architecture Search on ImageNet
no code implementations • 13 Feb 2021 • Shujian Zhang, Chengyue Gong, Eunsol Choi
We depart from the standard practice of collecting a single reference for each training example, and find that collecting multiple references can achieve better accuracy under a fixed annotation budget.
no code implementations • 1 Jan 2021 • Chengyue Gong, Xingchao Liu, Qiang Liu
We apply our method to the recently proposed MoCo, SimCLR, and SwAV, and find that we can reduce the computational cost with little performance loss on ImageNet linear classification and other downstream tasks.
1 code implementation • CVPR 2021 • Chengyue Gong, Dilin Wang, Meng Li, Vikas Chandra, Qiang Liu
Data augmentation (DA) is an essential technique for training state-of-the-art deep learning systems.
no code implementations • CVPR 2021 • Chengyue Gong, Dilin Wang, Qiang Liu
Semi-supervised learning (SSL) is a key approach toward more data-efficient machine learning, jointly leveraging both labeled and unlabeled data.
2 code implementations • CVPR 2021 • Dilin Wang, Meng Li, Chengyue Gong, Vikas Chandra
Our discovered model family, AttentiveNAS models, achieves top-1 accuracy from 77.3% to 80.7% on ImageNet, and outperforms SOTA models, including BigNAS and Once-for-All networks.
Ranked #21 on Neural Architecture Search on ImageNet
1 code implementation • ACL 2020 • Mao Ye, Chengyue Gong, Qiang Liu
For security reasons, it is of critical importance to develop models with certified robustness that can provably guarantee that the prediction cannot be altered by any possible synonymous word substitution.
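The smoothing-and-voting idea behind such certificates can be sketched as follows; `classifier` and the synonym table are hypothetical stand-ins, and the actual certification bound in the paper requires more than the majority vote itself:

```python
import random
from collections import Counter

def smoothed_predict(classifier, words, synonyms, num_samples=100):
    """Randomized smoothing for text: randomly replace each word with a
    synonym from a (hypothetical) synonym table and return the majority
    vote over the perturbed copies; schematic of the smoothing idea only."""
    votes = Counter()
    for _ in range(num_samples):
        perturbed = [random.choice(synonyms.get(w, [w])) for w in words]
        votes[classifier(perturbed)] += 1
    return votes.most_common(1)[0][0]
```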
1 code implementation • 3 Mar 2020 • Mao Ye, Chengyue Gong, Lizhen Nie, Denny Zhou, Adam Klivans, Qiang Liu
This differs from the existing methods based on backward elimination, which remove redundant neurons from the large network.
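A schematic of the forward-selection direction described above, assuming a hypothetical `eval_loss` that scores a candidate subnetwork:

```python
def greedy_forward_selection(neurons, eval_loss, budget):
    """Forward selection: start from an empty subnetwork and repeatedly
    add the neuron whose inclusion most reduces the loss, until the size
    budget is met. `eval_loss` scores a candidate set of neurons."""
    selected, remaining = [], list(neurons)
    while len(selected) < budget and remaining:
        best = min(remaining, key=lambda n: eval_loss(selected + [n]))
        selected.append(best)
        remaining.remove(best)
    return selected
```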
no code implementations • NeurIPS 2020 • Dinghuai Zhang, Mao Ye, Chengyue Gong, Zhanxing Zhu, Qiang Liu
Randomized classifiers have been shown to provide a promising approach for achieving certified robustness against adversarial attacks in deep learning.
1 code implementation • 20 Feb 2020 • Chengyue Gong, Tongzheng Ren, Mao Ye, Qiang Liu
The idea is to generate a set of augmented data with random perturbations or transforms, and minimize the maximum, or worst-case, loss over the augmented data.
Ranked #187 on Image Classification on ImageNet
1 code implementation • 10 Jun 2019 • Dilin Wang, Chengyue Gong, Qiang Liu
Theoretically, we show that our adversarial mechanism effectively encourages the diversity of the embedding vectors, helping to increase the robustness of models.
Ranked #5 on Language Modelling on Penn Treebank (Word Level)
no code implementations • 12 Dec 2018 • Chengyue Gong, Xu Tan, Di He, Tao Qin
Maximum-likelihood estimation (MLE) is widely used in sequence-to-sequence tasks for model training.
2 code implementations • NeurIPS 2018 • Chengyue Gong, Di He, Xu Tan, Tao Qin, Li-Wei Wang, Tie-Yan Liu
Continuous word representation (aka word embedding) is a basic building block in many neural network-based models used in natural language processing tasks.
Ranked #3 on Machine Translation on IWSLT2015 German-English
no code implementations • NeurIPS 2017 • Chengyue Gong, Win-Bin Huang
A new model, the deep dynamic Poisson factorization model, is proposed in this paper for analyzing sequential count vectors.