1 code implementation • 12 Dec 2024 • Xichen Ye, Yifan Wu, Weizhong Zhang, Xiaoqiang Li, Yifan Chen, Cheng Jin
Previous research has shown that constraining the gradient of the loss function with respect to model-predicted probabilities can enhance model robustness against noisy labels.
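As background for why such a constraint matters (an illustrative sketch, not this paper's method): cross entropy has an unbounded gradient with respect to the predicted probability of the labeled class, while MAE's is bounded, which is why bounded-gradient losses tend to be more robust to mislabeled samples.

```python
import numpy as np

# Illustrative comparison (not the cited paper's method): gradients of two
# losses with respect to the predicted probability p of the labeled class.
p = np.array([0.9, 0.5, 0.1, 0.01])

# Cross entropy: L = -log(p), dL/dp = -1/p  -> unbounded as p -> 0,
# so low-probability (often mislabeled) samples dominate the update.
grad_ce = -1.0 / p

# MAE on probabilities: L = |1 - p| = 1 - p, dL/dp = -1  -> bounded,
# so every sample, clean or noisy, contributes equally.
grad_mae = -np.ones_like(p)

print("p        :", p)
print("CE grad  :", grad_ce)    # [-1.11, -2.0, -10.0, -100.0]
print("MAE grad :", grad_mae)   # [-1.0, -1.0, -1.0, -1.0]
```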
1 code implementation • 4 Dec 2024 • Yifan Wu, Xichen Ye, Songmin Dai, Dengye Pan, Xiaoqiang Li, Weizhong Zhang, Yifan Chen
We recognize the "energy barrier" in OOD detection, which characterizes the energy difference between in-distribution (ID) and OOD samples and eases detection.
Out-of-Distribution Detection
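For context, the snippet below computes only the conventional energy score from energy-based OOD detection, on which the notion of an energy gap between ID and OOD samples rests; it is not the method proposed in this paper.

```python
import torch

def energy_score(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Conventional energy score E(x) = -T * logsumexp(f(x)/T).

    Lower energy is typically assigned to in-distribution (ID) samples and
    higher energy to OOD samples; the gap between the two is the kind of
    energy difference the notion of an "energy barrier" refers to.
    """
    return -temperature * torch.logsumexp(logits / temperature, dim=-1)

# Toy usage: confident (ID-like) logits vs. flat (OOD-like) logits.
id_logits = torch.tensor([[8.0, 0.5, 0.3]])
ood_logits = torch.tensor([[0.4, 0.5, 0.3]])
print(energy_score(id_logits), energy_score(ood_logits))  # ID energy is lower
```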
1 code implementation • 3 Dec 2024 • Xichen Ye, Yifan Wu, Yiwen Xu, Xiaoqiang Li, Weizhong Zhang, Yifan Chen
By replacing MAE in APL with our proposed NNLFs, we enhance APL and present a new framework called Active Negative Loss (ANL).
no code implementations • 14 Aug 2024 • Yibin Wang, Weizhong Zhang, Cheng Jin
In the first stage, RSA enables the latent image to query features from all reference concepts simultaneously, extracting an overall semantic understanding to facilitate establishing the initial semantic layout.
1 code implementation • 21 Jul 2024 • Jipeng Zhang, Yaxuan Qin, Renjie Pi, Weizhong Zhang, Rui Pan, Tong Zhang
Achieving this goal poses non-trivial challenges: 1) data selection requires accurate data representations that reflect the training samples' quality, 2) the diverse nature of instruction datasets must be taken into account, and 3) the coreset selection algorithm must remain efficient for large models.
no code implementations • 15 Jun 2024 • Yuan Gao, Zujing Liu, Weizhong Zhang, Bo Du, Gui-Song Xia
We instead propose a novel optimization-based structural pruning that learns the pruning masks in a probabilistic space directly by optimizing the loss of the pruned model.
1 code implementation • 23 May 2024 • Yibin Wang, Weizhong Zhang, Cheng Jin
Our key idea is to reconstruct the diffusion training process, introducing more refined guidance tailored to this task, to expose and rectify the model's attention at the character level and strengthen its learning of text regions.
1 code implementation • 9 May 2024 • Yuan Gao, Weizhong Zhang, Wenhan Luo, Lin Ma, Jin-Gang Yu, Gui-Song Xia, Jiayi Ma
We aim to exploit additional auxiliary labels from an independent (auxiliary) task to boost the performance of the primary task we focus on, while preserving the single-task inference cost of the primary task.
no code implementations • 7 May 2024 • Zhibo Zhang, Ximing Yang, Weizhong Zhang, Cheng Jin
Cross-modal knowledge transfer enhances point cloud representation learning in LiDAR semantic segmentation.
no code implementations • 25 Apr 2024 • Zhibo Zhang, Ximing Yang, Weizhong Zhang, Cheng Jin
We apply this robust fine-tuning method to mainstream 3D point cloud pre-trained models and evaluate the quality of model parameters and the degradation of downstream task performance.
1 code implementation • 8 Mar 2024 • Yibin Wang, Weizhong Zhang, Jianwei Zheng, Cheng Jin
This prior information is encoded into the attention weights, which are then integrated into the self-attention layers of the generator to guide the synthesis process.
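As a generic illustration of injecting prior information into attention (a hedged sketch; the paper's exact formulation may differ), a prior can be added to the attention logits before the softmax:

```python
import torch
import torch.nn.functional as F

def attention_with_prior(q, k, v, prior_bias):
    """Scaled dot-product attention with an additive prior on the logits.

    q, k, v: (batch, tokens, dim); prior_bias: (tokens, tokens), e.g. a
    layout or correspondence prior. This is a generic sketch of biasing
    attention, not the specific mechanism of the cited paper.
    """
    d = q.size(-1)
    logits = q @ k.transpose(-2, -1) / d ** 0.5   # (batch, tokens, tokens)
    logits = logits + prior_bias                   # inject the prior
    weights = F.softmax(logits, dim=-1)
    return weights @ v

# Toy usage
q = k = v = torch.randn(1, 4, 8)
prior = torch.zeros(4, 4)
prior[0, 1] = 2.0   # encourage token 0 to attend to token 1
out = attention_with_prior(q, k, v, prior)
print(out.shape)  # torch.Size([1, 4, 8])
```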
no code implementations • 28 Feb 2024 • Weilin Wan, Weizhong Zhang, Quan Zhou, Fan Yi, Cheng Jin
Our neural activation prior is based on a key observation: for a channel before the global pooling layer of a fully trained neural network, the probability that a few neurons are activated with a large response by an in-distribution (ID) sample is significantly higher than the probability for an OOD sample.
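One simple way to turn that observation into a score is a per-channel peak-to-mean statistic computed before global pooling; the sketch below is an illustrative reading of the stated prior, not necessarily the paper's exact scoring rule.

```python
import torch

def peak_to_mean_score(feature_map: torch.Tensor) -> torch.Tensor:
    """Illustrative OOD score from pre-global-pooling activations.

    feature_map: (batch, channels, H, W), taken before global average pooling.
    For each channel, compare the largest spatial activation to the channel
    mean; ID samples are assumed to trigger a few strongly activated neurons,
    so a larger ratio suggests an ID sample. This is a sketch of the stated
    observation, not necessarily the paper's exact scoring function.
    """
    b, c, h, w = feature_map.shape
    flat = feature_map.view(b, c, h * w)
    peak = flat.max(dim=-1).values            # (batch, channels)
    mean = flat.mean(dim=-1).clamp_min(1e-6)
    return (peak / mean).mean(dim=-1)          # higher -> more ID-like

scores = peak_to_mean_score(torch.relu(torch.randn(2, 16, 7, 7)))
print(scores.shape)  # torch.Size([2])
```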
no code implementations • CVPR 2024 • Yanlu Cai, Weizhong Zhang, Yuan Wu, Cheng Jin
A natural solution is to artificially synthesize samples, i.e., 2D-3D pose pairs, under a large number of new camera settings.
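In the simplest form, such synthesis amounts to re-projecting an existing 3D pose through a virtual camera with new parameters; the sketch below uses made-up intrinsics and a single random yaw purely for illustration.

```python
import numpy as np

def random_rotation_y(rng):
    """Rotation about the vertical axis by a random angle."""
    a = rng.uniform(0, 2 * np.pi)
    return np.array([[np.cos(a), 0, np.sin(a)],
                     [0, 1, 0],
                     [-np.sin(a), 0, np.cos(a)]])

def synthesize_pair(joints_3d, rng, focal=1000.0, cx=512.0, cy=512.0, depth=4000.0):
    """Project a 3D pose (J, 3), in millimetres, with a random virtual camera.

    Illustrative only: a single yaw rotation, fixed intrinsics, and a fixed
    camera distance stand in for sampling many new camera settings.
    """
    R = random_rotation_y(rng)
    cam = joints_3d @ R.T                   # rotate into the new camera frame
    cam[:, 2] += depth                      # push the pose in front of the camera
    x = focal * cam[:, 0] / cam[:, 2] + cx  # perspective projection
    y = focal * cam[:, 1] / cam[:, 2] + cy
    joints_2d = np.stack([x, y], axis=1)
    return joints_2d, cam                   # a new 2D-3D training pair

rng = np.random.default_rng(0)
pose_3d = rng.normal(scale=300.0, size=(17, 3))   # toy 17-joint skeleton
pose_2d, pose_cam = synthesize_pair(pose_3d, rng)
print(pose_2d.shape, pose_cam.shape)  # (17, 2) (17, 3)
```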
1 code implementation • 19 Dec 2023 • Kaiyi Zhang, Yang Chen, Ximing Yang, Weizhong Zhang, Cheng Jin
Based on this process, we introduce SGAS, a model for part editing that employs two strategies: feature disentanglement and constraint.
1 code implementation • CVPR 2024 • Yibin Wang, Weizhong Zhang, Jianwei Zheng, Cheng Jin
Specifically, we first develop two specialized pre-trained diffusion models, i.e., a Text-driven Diffusion Model (TDM) and a Subject-augmented Diffusion Model (SDM), for scene and person generation, respectively.
no code implementations • 29 Sep 2023 • Yong Lin, Lu Tan, Yifan Hao, Honam Wong, Hanze Dong, Weizhong Zhang, Yujiu Yang, Tong Zhang
Contrary to the conventional wisdom that focuses on learning invariant features for better OOD performance, our findings suggest that incorporating a large number of diverse spurious features weakens their individual contributions, leading to improved overall OOD generalization performance.
no code implementations • 16 Sep 2023 • Zhiwei Zhang, Weizhong Zhang, Yaowei Huang, Kani Chen
In this paper, we identify an underexplored problem in multivariate traffic series prediction: extreme events.
no code implementations • 24 Jan 2023 • Xiao Zhou, Renjie Pi, Weizhong Zhang, Yong Lin, Tong Zhang
The goal of coreset selection in supervised learning is to produce a weighted subset of data such that training only on the subset achieves performance similar to training on the entire dataset.
1 code implementation • 24 Jan 2023 • Xiao Zhou, Yong Lin, Renjie Pi, Weizhong Zhang, Renzhe Xu, Peng Cui, Tong Zhang
The overfitting issue is addressed by considering a bilevel formulation to search for the sample reweighting, in which the generalization complexity depends on the search space of sample weights instead of the model size.
1 code implementation • 21 Nov 2022 • Hanze Dong, Shizhe Diao, Weizhong Zhang, Tong Zhang
The resulting method is significantly more powerful than the standard normalizing flow approach for generating data distributions with multiple modes.
2 code implementations • CVPR 2023 • Renjie Pi, Weizhong Zhang, Yueqi Xie, Jiahui Gao, Xiaoyu Wang, Sunghun Kim, Qifeng Chen
Specifically, we first reserve a short trajectory of global model snapshots on the server.
no code implementations • 10 Nov 2022 • Yueqi Xie, Weizhong Zhang, Renjie Pi, Fangzhao Wu, Qifeng Chen, Xing Xie, Sunghun Kim
Since, at each round, the number of tunable parameters optimized on the server side equals the number of participating clients (and is thus independent of the model size), we are able to train a global model with massive parameters using only a small amount of proxy data (e.g., around one hundred samples).
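A minimal sketch of this idea, that the server optimizes only as many scalars as there are participating clients: learn one aggregation coefficient per client on a small proxy set (the softmax parameterization and helper names below are assumptions for illustration, not the paper's exact procedure).

```python
import torch
import torch.nn.functional as F
from torch.func import functional_call

def optimize_aggregation_weights(model, client_states, proxy_x, proxy_y, steps=50, lr=0.1):
    """Learn one aggregation coefficient per client on a small proxy set.

    Only len(client_states) scalars are optimized (independent of model size);
    the global model is the softmax-weighted combination of client weights.
    A sketch under simplifying assumptions, not the cited paper's exact procedure.
    """
    alpha = torch.zeros(len(client_states), requires_grad=True)
    opt = torch.optim.Adam([alpha], lr=lr)
    names = client_states[0].keys()
    for _ in range(steps):
        w = F.softmax(alpha, dim=0)
        mixed = {n: sum(w[i] * client_states[i][n] for i in range(len(client_states)))
                 for n in names}
        logits = functional_call(model, mixed, (proxy_x,))
        loss = F.cross_entropy(logits, proxy_y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return F.softmax(alpha.detach(), dim=0)

# Toy usage with two fake "clients" of a tiny linear model.
model = torch.nn.Linear(8, 3)
clients = [{k: v + 0.1 * torch.randn_like(v) for k, v in model.state_dict().items()}
           for _ in range(2)]
weights = optimize_aggregation_weights(model, clients, torch.randn(16, 8),
                                       torch.randint(0, 3, (16,)))
print(weights)
```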
no code implementations • 6 Jun 2022 • Zhichao Huang, Yanbo Fan, Chen Liu, Weizhong Zhang, Yong Zhang, Mathieu Salzmann, Sabine Süsstrunk, Jue Wang
While adversarial training and its variants have been shown to be the most effective algorithms for defending against adversarial attacks, their extremely slow training process makes it hard to scale to large datasets like ImageNet.
2 code implementations • 25 May 2022 • Jiahui Gao, Renjie Pi, Yong Lin, Hang Xu, Jiacheng Ye, Zhiyong Wu, Weizhong Zhang, Xiaodan Liang, Zhenguo Li, Lingpeng Kong
In this paradigm, the synthesized data from the PLM acts as the carrier of knowledge, which is used to train a task-specific model with orders of magnitude fewer parameters than the PLM, achieving both higher performance and efficiency than prompt-based zero-shot learning methods on PLMs.
no code implementations • 14 Feb 2022 • Xupeng Shi, Pengfei Zheng, A. Adam Ding, Yuan Gao, Weizhong Zhang
Modern deep neural networks (DNNs) are vulnerable to adversarial attacks, and adversarial training has been shown to be a promising method for improving their adversarial robustness.
no code implementations • 27 Nov 2021 • Weizhong Zhang, Shuang Qiu
To the best of our knowledge, this is the first screening method that brings the dual optimum estimation technique, which carefully explores and exploits the strong convexity and the complex structure of the dual problem, from static screening methods to dynamic screening.
1 code implementation • NeurIPS 2021 • Xiao Zhou, Weizhong Zhang, Zonghao Chen, Shizhe Diao, Tong Zhang
For the latter step, instead of using the chain-rule-based gradient estimators of existing methods, we propose a variance-reduced policy gradient estimator, which requires only two forward passes without backward propagation, thus achieving completely sparse training.
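A hedged sketch of a two-forward-pass policy-gradient estimator for Bernoulli masks, in which an independent second sample serves as a baseline; this is a generic variance-reduction construction, not necessarily the exact estimator proposed in the paper.

```python
import torch

def mask_grad_two_forward(theta, loss_fn):
    """Policy-gradient estimate for Bernoulli mask logits using two forward passes.

    theta: logits of the per-weight keep probabilities p = sigmoid(theta).
    loss_fn: maps a binary mask to a scalar loss (a forward pass only).
    The second, independent sample acts as a baseline and reduces variance;
    no backward pass through the network is needed. A generic sketch, not
    necessarily the exact estimator of the cited paper.
    """
    with torch.no_grad():
        p = torch.sigmoid(theta)
        z1 = torch.bernoulli(p)          # first mask sample
        z2 = torch.bernoulli(p)          # independent baseline sample
        f1, f2 = loss_fn(z1), loss_fn(z2)
        # d/dtheta log Bernoulli(z; sigmoid(theta)) = z - p
        grad = (f1 - f2) * (z1 - p)
    return grad

# Toy usage: the "loss" prefers masks that keep the first half of 10 weights.
target = torch.cat([torch.ones(5), torch.zeros(5)])
loss_fn = lambda z: ((z - target) ** 2).sum()
theta = torch.zeros(10)
for _ in range(500):
    theta -= 0.3 * mask_grad_two_forward(theta, loss_fn)
print(torch.sigmoid(theta).round())  # should approach the target pattern
```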
no code implementations • 10 Nov 2021 • Jiaxin Li, Yan Ding, Weizhong Zhang, Yifan Zhao, Lingxi Guo, Zhe Yang
Augmented reality technology based on image registration is becoming increasingly popular because it facilitates pre-surgery preparation and medical education.
1 code implementation • CVPR 2021 • Xiao Zhou, Weizhong Zhang, Hang Xu, Tong Zhang
Weight pruning is an effective technique to reduce the model size and inference time for deep neural networks in real-world deployments.
no code implementations • 1 Jan 2021 • Xiao Zhou, Weizhong Zhang, Tong Zhang
An appealing feature of ProbMask is that the amount of weight redundancy can be learned automatically via our constraint, so we avoid the problem of tuning pruning rates individually for different layers in a network.
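One way to read the global constraint (an assumption for illustration, not necessarily the paper's projection step): rescale all keep probabilities so that the expected number of kept weights matches a single global budget, letting per-layer pruning rates emerge automatically.

```python
import torch

def project_to_global_budget(probs_per_layer, keep_ratio):
    """Rescale per-weight keep probabilities so the *global* expected number of
    kept weights matches `keep_ratio` of all weights; per-layer rates then fall
    out automatically rather than being tuned by hand.

    A simplified feasibility step (uniform rescaling), assumed for illustration;
    the projection used by the cited paper may differ.
    """
    total = sum(p.numel() for p in probs_per_layer)
    budget = keep_ratio * total
    expected = sum(p.sum() for p in probs_per_layer)
    scale = (budget / expected).clamp(max=1.0)   # never scale probabilities above 1
    return [p * scale for p in probs_per_layer]

# Toy usage: two layers with different initial probabilities.
layers = [torch.full((100,), 0.9), torch.full((400,), 0.6)]
projected = project_to_global_budget(layers, keep_ratio=0.3)
print([p.mean().item() for p in projected],          # per-layer rates now differ
      sum(p.sum() for p in projected).item())        # global expectation = 0.3 * 500
```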
1 code implementation • NeurIPS 2020 • Yihong Gu, Weizhong Zhang, Cong Fang, Jason D. Lee, Tong Zhang
With the help of a new technique called neural network grafting, we demonstrate that even during the entire training process, feature distributions of differently initialized networks remain similar at each layer.
no code implementations • NeurIPS 2018 • Xing Yan, Weizhong Zhang, Lin Ma, Wei Liu, Qi Wu
We propose a parsimonious quantile regression framework to learn the dynamic tail behaviors of financial asset returns.
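For background, the building block of any quantile regression model is the pinball (quantile) loss; the sketch below shows only this generic loss, not the paper's parsimonious parameterization of the quantile function.

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Standard quantile (pinball) loss at level tau in (0, 1).

    Minimizing it makes y_pred estimate the tau-quantile of y_true; the
    asymmetric penalty pushes the prediction into the tail for tau near 0 or 1.
    This is the generic building block of quantile regression, not the cited
    paper's parsimonious model itself.
    """
    diff = y_true - y_pred
    return np.mean(np.maximum(tau * diff, (tau - 1.0) * diff))

# Toy check: for standard normal returns, the 5% quantile is about -1.645.
rng = np.random.default_rng(0)
returns = rng.standard_normal(100_000)
candidates = np.linspace(-3, 0, 301)
losses = [pinball_loss(returns, c, tau=0.05) for c in candidates]
print(candidates[int(np.argmin(losses))])  # close to -1.645
```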
no code implementations • 18 Nov 2019 • Cong Fang, Yihong Gu, Weizhong Zhang, Tong Zhang
This new analysis is consistent with empirical observations that deep neural networks are capable of learning efficient feature representations.
no code implementations • CVPR 2019 • Fangyu Zou, Li Shen, Zequn Jie, Weizhong Zhang, Wei Liu
Adam and RMSProp are two of the most influential adaptive stochastic algorithms for training deep neural networks, yet they have been shown, via a few simple counterexamples, to diverge even in the convex setting.
2 code implementations • 2 Jun 2018 • Yunzhe Tao, Lin Ma, Weizhong Zhang, Jian Liu, Wei Liu, Qiang Du
Time series prediction has been studied in a variety of domains.
no code implementations • ICML 2018 • Weizhong Zhang, Bin Hong, Lin Ma, Wei Liu, Tong Zhang
Relying on this study, we subsequently propose a novel safe screening method to quickly identify the elements guaranteed to be included in (we refer to them as active) or excluded from (inactive) the final optimal solution of SFM during the optimization process.
1 code implementation • ICML 2017 • Weizhong Zhang, Bin Hong, Wei Liu, Jieping Ye, Deng Cai, Xiaofei He, Jie Wang
By noting that sparse SVMs induce sparsities in both feature and sample spaces, we propose a novel approach, which is based on accurate estimations of the primal and dual optima of sparse SVMs, to simultaneously identify the inactive features and samples that are guaranteed to be irrelevant to the outputs.