Search Results for author: Chang Xu

Found 219 papers, 106 papers with code

On Dropping Clusters to Regularize Graph Convolutional Neural Networks

no code implementations ECCV 2020 Xikun Zhang, Chang Xu, DaCheng Tao

Dropout has been widely adopted to regularize graph convolutional networks (GCNs) by randomly zeroing entries of the node feature vectors and obtains promising performance on various tasks.

Action Recognition Skeleton Based Action Recognition
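
The abstract snippet above describes standard dropout as randomly zeroing entries of the node feature vectors. As a minimal sketch of that baseline only (not the cluster-dropping method of the paper), the function below applies inverted dropout to a node feature matrix stored as a list of rows. The name `feature_dropout`, the rescaling by 1/(1-p), and the list-of-lists layout are assumptions made for illustration.

```python
import random


def feature_dropout(features, p, rng):
    """Zero each feature entry independently with probability p.

    Surviving entries are scaled by 1 / (1 - p) (inverted dropout), so the
    expected value of every entry is unchanged. `features` is a list of
    per-node feature rows; `rng` is a random.Random instance.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    keep = 1.0 - p
    return [
        [x / keep if rng.random() >= p else 0.0 for x in row]
        for row in features
    ]


# Hypothetical usage: two nodes with two features each.
node_features = [[1.0, 2.0], [3.0, 4.0]]
dropped = feature_dropout(node_features, p=0.5, rng=random.Random(0))
```

Each entry of `dropped` is either zero or the original value divided by the keep probability.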

Learning Visual Abstract Reasoning through Dual-Stream Networks

1 code implementation 29 Nov 2024 Kai Zhao, Chang Xu, Bailu Si

Visual abstract reasoning tasks present challenges for deep neural networks, exposing limitations in their capabilities.

Visual Reasoning

Unsupervised Multi-view UAV Image Geo-localization via Iterative Rendering

no code implementations 22 Nov 2024 Haoyuan Li, Chang Xu, Wen Yang, Li Mi, Huai Yu, Haijian Zhang

As such, our unsupervised paradigm naturally avoids the problem of region-specific overfitting, enabling generic CVGL for UAV images without feature fine-tuning or data-driven training.

geo-localization Image Generation

FLMarket: Enabling Privacy-preserved Pre-training Data Pricing for Federated Learning

no code implementations 18 Nov 2024 Zhenyu Wen, Wanglei Feng, Di wu, Haozhen Hu, Chang Xu, Bin Qian, Zhen Hong, Cong Wang, Shouling Ji

Federated Learning (FL), as a mainstream privacy-preserving machine learning paradigm, offers promising solutions for privacy-critical domains such as healthcare and finance.

Federated Learning Privacy Preserving

Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation

1 code implementation 14 Nov 2024 Yuheng Shi, Minjing Dong, Chang Xu

Specifically, we introduce Trident, a training-free framework that first splices features extracted by CLIP and DINO from sub-images, then leverages SAM's encoder to create a correlation matrix for global aggregation, enabling a broadened receptive field for effective segmentation.

Segmentation Semantic Segmentation +1

Investigating Memorization in Video Diffusion Models

no code implementations 29 Oct 2024 Chen Chen, Enhuai Liu, Daochang Liu, Mubarak Shah, Chang Xu

Diffusion models, widely used for image and video generation, face a significant limitation: the risk of memorizing and reproducing training data during inference, potentially generating unauthorized copyrighted content.

Memorization Video Generation

Exploring Local Memorization in Diffusion Models via Bright Ending Attention

no code implementations 29 Oct 2024 Chen Chen, Daochang Liu, Mubarak Shah, Chang Xu

Furthermore, driven by our observation that local memorization significantly underperforms in existing tasks of measuring, detecting, and mitigating memorization in diffusion models compared to global memorization, we propose a simple yet effective method to integrate BE and the results of the new localization task into these existing frameworks.

Memorization

Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Model

no code implementations 24 Oct 2024 Jinxu Lin, Linwei Tao, Minjing Dong, Chang Xu

Existing data attribution methods for diffusion models typically quantify the contribution of a training sample by evaluating the change in diffusion loss when the sample is included or excluded from the training process.

Consistency Calibration: Improving Uncertainty Calibration via Consistency among Perturbed Neighbors

no code implementations 16 Oct 2024 Linwei Tao, Haolan Guo, Minjing Dong, Chang Xu

Calibration is crucial in deep learning applications, especially in fields like healthcare and autonomous driving, where accurate confidence estimates are vital for decision-making.

Autonomous Driving Computational Efficiency +1

Feature Clipping for Uncertainty Calibration

no code implementations 16 Oct 2024 Linwei Tao, Minjing Dong, Chang Xu

As the first calibration technique based on feature modification, feature clipping offers a novel approach to improving model calibration, showing significant improvements over both post-hoc and train-time calibration methods and pioneering a new avenue for feature-based model calibration.

MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model

no code implementations 4 Sep 2024 Junjie Li, Yang Liu, Weiqing Liu, Shikai Fang, Lewen Wang, Chang Xu, Jiang Bian

This simulation relies on the most fine-grained structured data in financial markets, such as orders, and thus builds the most fine-grained realistic simulation.

Language Modelling Text Generation

FusionSAM: Latent Space driven Segment Anything Model for Multimodal Fusion and Segmentation

no code implementations 26 Aug 2024 Daixun Li, Weiying Xie, Mingxiang Cao, Yunke Wang, Jiaqing Zhang, Yunsong Li, Leyuan Fang, Chang Xu

In this paper, we introduce SAM into multimodal image segmentation for the first time, proposing a novel framework that combines Latent Space Token Generation (LSTG) and Fusion Mask Prompting (FMP) modules to enhance SAM's multimodal fusion and segmentation capabilities.

Autonomous Driving Image Segmentation +4

Training-free Long Video Generation with Chain of Diffusion Model Experts

no code implementations 24 Aug 2024 Wenhao Li, Yichao Cao, Xiu Su, Xi Lin, Shan You, Mingkai Zheng, Yi Chen, Chang Xu

It can generate high-quality videos with a chain of off-the-shelf diffusion model experts, each responsible for a decoupled subtask.

Denoising Video Generation

Compress Guidance in Conditional Diffusion Sampling

no code implementations 20 Aug 2024 Anh-Dung Dinh, Daochang Liu, Chang Xu

We found that enforcing guidance throughout the sampling process is often counterproductive due to the model-fitting issue, where samples are 'tuned' to match the classifier's parameters rather than generalizing the expected condition.

Diversity

Efficient Image-to-Image Diffusion Classifier for Adversarial Robustness

1 code implementation 16 Aug 2024 Hefei Mei, Minjing Dong, Chang Xu

To alleviate this issue, we redesign the diffusion framework from generating high-quality images to predicting distinguishable image labels.

Adversarial Robustness Image Classification +2

VSSD: Vision Mamba with Non-Causal State Space Duality

2 code implementations 26 Jul 2024 Yuheng Shi, Minjing Dong, Mingjia Li, Chang Xu

Recently, State Space Duality (SSD), an improved variant of SSMs, was introduced in Mamba2 to enhance model performance and efficiency.

Image Classification Mamba +1

Enhancing Fine-grained Object Detection in Aerial Images via Orthogonal Mapping

1 code implementation 25 Jul 2024 Haoran Zhu, Yifan Zhou, Chang Xu, Ruixiang Zhang, Wen Yang

This letter introduces Orthogonal Mapping (OM), a simple yet effective method aimed at addressing the challenge of semantic confusion inherent in FGOD.

object-detection Object Detection In Aerial Images

Training-free Composite Scene Generation for Layout-to-Image Synthesis

1 code implementation 18 Jul 2024 Jiaqi Liu, Tao Huang, Chang Xu

Recent breakthroughs in text-to-image diffusion models have significantly advanced the generation of high-fidelity, photo-realistic images from textual descriptions.

Layout-to-Image Generation Scene Generation

Surgical Triplet Recognition via Diffusion Model

no code implementations 19 Jun 2024 Daochang Liu, Axel Hu, Mubarak Shah, Chang Xu

In this paper, we propose DiffTriplet, a new generative framework for surgical triplet recognition employing the diffusion model, which predicts surgical triplets via iterative denoising.

Action Triplet Recognition Denoising +1

Locating and Extracting Relational Concepts in Large Language Models

1 code implementation 19 Jun 2024 Zijian Wang, Britney White, Chang Xu

In this paper, we identify hidden states that can express entity and relational concepts through causal mediation analysis in fact recall processes.

World Knowledge

JavaBench: A Benchmark of Object-Oriented Code Generation for Evaluating Large Language Models

1 code implementation 10 Jun 2024 Jialun Cao, Zhiyong Chen, Jiarong Wu, Shing-Chi Cheung, Chang Xu

First, we noticed that regarding project-level Java programming, LLMs are far behind undergraduate students (no project can be correctly completed by any studied LLMs, and at most 41.17% Pass@5 in a more relaxed evaluation).

Benchmarking Code Generation +1

Pruning for Robust Concept Erasing in Diffusion Models

no code implementations 26 May 2024 Tianyun Yang, Juan Cao, Chang Xu

Experimental results show a significant enhancement in our model's ability to resist adversarial inputs, achieving nearly a 40% improvement in erasing the NSFW content and a 30% improvement in erasing artwork style.

Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model

1 code implementation 23 May 2024 Yuheng Shi, Minjing Dong, Chang Xu

To improve the performance of SSMs in vision tasks, a multi-scan strategy is widely adopted, which leads to significant redundancy of SSMs.

Mamba State Space Models

Detecting Every Object from Events

1 code implementation 8 Apr 2024 Haitian Zhang, Chang Xu, Xinya Wang, Bingde Liu, Guang Hua, Lei Yu, Wen Yang

Object detection is critical in autonomous driving, and it is more practical yet challenging to localize objects of unknown categories: an endeavour known as Class-Agnostic Object Detection (CAOD).

Autonomous Driving Class-agnostic Object Detection +5

NeRF2Points: Large-Scale Point Cloud Generation From Street Views' Radiance Field Optimization

no code implementations 7 Apr 2024 Peng Tu, Xun Zhou, Mingming Wang, Xiaojun Yang, Bo Peng, Ping Chen, Xiu Su, Yawen Huang, Yefeng Zheng, Chang Xu

Neural Radiance Fields (NeRF) have emerged as a paradigm-shifting methodology for the photorealistic rendering of objects and environments, enabling the synthesis of novel viewpoints with remarkable fidelity.

Autonomous Vehicles Point Cloud Generation

Towards Memorization-Free Diffusion Models

no code implementations CVPR 2024 Chen Chen, Daochang Liu, Chang Xu

Pretrained diffusion models and their outputs are widely accessible due to their exceptional capacity for synthesizing high-quality images and their open-source nature.

Denoising Memorization

ConGeo: Robust Cross-view Geo-localization across Ground View Variations

no code implementations 20 Mar 2024 Li Mi, Chang Xu, Javiera Castillo-Navarro, Syrielle Montariol, Wen Yang, Antoine Bosselut, Devis Tuia

Cross-view geo-localization aims at localizing a ground-level query image by matching it to its corresponding geo-referenced aerial view.

geo-localization

Learning Cross-view Visual Geo-localization without Ground Truth

no code implementations 19 Mar 2024 Haoyuan Li, Chang Xu, Wen Yang, Huai Yu, Gui-Song Xia

We observe that training on unlabeled cross-view images presents significant challenges, including the need to establish relationships within unlabeled data and reconcile view discrepancies between uncertain queries and references.

geo-localization Self-Supervised Learning

Collage Prompting: Budget-Friendly Visual Recognition with GPT-4V

no code implementations 18 Mar 2024 Siyu Xu, Yunke Wang, Daochang Liu, Chang Xu

Based on the observation that the accuracy of GPT-4V's image recognition varies significantly with the order of images within the collage prompt, our method further learns to optimize the arrangement of images for maximum recognition accuracy.

Navigate

Understanding Robustness of Visual State Space Models for Image Classification

1 code implementation 16 Mar 2024 Chengbin Du, Yanxi Li, Chang Xu

VMamba exhibits exceptional generalizability with out-of-distribution data but shows scalability weaknesses against natural adversarial examples and common corruptions.

Adversarial Robustness Image Classification +1

EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba

1 code implementation 15 Mar 2024 Xiaohuan Pei, Tao Huang, Chang Xu

Inspired by this, this work explores the potential of visual state space models in light-weight model design and introduces a novel efficient model variant dubbed EfficientVMamba.

Language Modelling Mamba +1

LocalMamba: Visual State Space Model with Windowed Selective Scan

1 code implementation 14 Mar 2024 Tao Huang, Xiaohuan Pei, Shan You, Fei Wang, Chen Qian, Chang Xu

This paper posits that the key to enhancing Vision Mamba (ViM) lies in optimizing scan directions for sequence modeling.

Mamba State Space Models

Active Generation for Image Classification

no code implementations 11 Mar 2024 Tao Huang, Jiaqi Liu, Shan You, Chang Xu

Recently, the growing capabilities of deep generative models have underscored their potential in enhancing image classification accuracy.

Active Learning Classification +3

MG-TSD: Multi-Granularity Time Series Diffusion Models with Guided Learning Process

1 code implementation 9 Mar 2024 Xinyao Fan, Yueying Wu, Chang Xu, Yuhao Huang, Weiqing Liu, Jiang Bian

However, the effective utilization of their strong modeling ability in the probabilistic time series forecasting task remains an open question, partially due to the challenge of instability arising from their stochastic nature.

Probabilistic Time Series Forecasting Time Series +1

Data-efficient Large Vision Models through Sequential Autoregression

1 code implementation 7 Feb 2024 Jianyuan Guo, Zhiwei Hao, Chengcheng Wang, Yehui Tang, Han Wu, Han Hu, Kai Han, Chang Xu

Training general-purpose vision models on purely sequential visual data, eschewing linguistic inputs, has heralded a new frontier in visual understanding.

Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

1 code implementation 6 Feb 2024 Jianyuan Guo, Hanting Chen, Chengcheng Wang, Kai Han, Chang Xu, Yunhe Wang

Recent advancements in large language models have sparked interest in their extraordinary and near-superhuman capabilities, leading researchers to explore methods for evaluating and optimizing these abilities, which is called superalignment.

Few-Shot Learning Knowledge Distillation +1

Accelerated Cloud for Artificial Intelligence (ACAI)

no code implementations 30 Jan 2024 Dachi Chen, Weitian Ding, Chen Liang, Chang Xu, Junwei Zhang, Majd Sakr

Training an effective machine learning (ML) model is an iterative process that requires effort in multiple dimensions.

Scheduling

Visual Imitation Learning with Calibrated Contrastive Representation

no code implementations 21 Jan 2024 Yunke Wang, Linwei Tao, Bo Du, Yutian Lin, Chang Xu

Adversarial Imitation Learning (AIL) allows the agent to reproduce expert behavior with low-dimensional states and actions.

Contrastive Learning Imitation Learning

Robust Tiny Object Detection in Aerial Images amidst Label Noise

1 code implementation 16 Jan 2024 Haoran Zhu, Chang Xu, Wen Yang, Ruixiang Zhang, Yan Zhang, Gui-Song Xia

In this study, we address the intricate issue of tiny object detection under noisy label supervision.

Denoising Object +2

MG-TSD: Multi-Granularity Time Series Diffusion Models with Guided Learning Process

1 code implementation ICLR 2024 Xinyao Fan, Yueying Wu, Chang Xu, Yuhao Huang, Weiqing Liu, Jiang Bian

To address this challenge, we introduce a novel Multi-Granularity Time Series Diffusion (MG-TSD) model, which achieves state-of-the-art predictive performance by leveraging the inherent granularity levels within the data as given targets at intermediate diffusion steps to guide the learning process of diffusion models.

Probabilistic Time Series Forecasting Time Series +1

Random Entangled Tokens for Adversarially Robust Vision Transformer

no code implementations CVPR 2024 Huihui Gong, Minjing Dong, Siqi Ma, Seyit Camtepe, Surya Nepal, Chang Xu

Recognizing the challenge posed by the structural disparities between ViTs and CNNs, we introduce a novel module, input-independent random entangled self-attention (II-ReSA).

Adversarial Robustness

Hybrid Proposal Refiner: Revisiting DETR Series from the Faster R-CNN Perspective

1 code implementation CVPR 2024 Jinjing Zhao, Fangyun Wei, Chang Xu

We systematically adapt the Faster R-CNN towards the Deformable DETR by integrating or repurposing each component of Deformable DETR and note that Deformable DETR's improved performance over Faster R-CNN is attributed to the adoption of advanced modules such as a superior proposal refiner (e.g., deformable attention rather than RoI Align).

Decoder object-detection +1

Residual Learning in Diffusion Models

no code implementations CVPR 2024 Junyu Zhang, Daochang Liu, Eunbyung Park, Shichao Zhang, Chang Xu

This gap results in a residual in the generated images, adversely impacting the image quality.

Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundation Models

1 code implementation NeurIPS 2023 Yichao Cao, Qingfei Tang, Xiu Su, Chen Song, Shan You, Xiaobo Lu, Chang Xu

We conduct a deep analysis of the three hierarchical features inherent in visual HOI detectors and propose a method for high-level relation extraction aimed at VL foundation models, which we call HO prompt-based learning.

Decoder Human-Object Interaction Detection +2

One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation

1 code implementation NeurIPS 2023 Zhiwei Hao, Jianyuan Guo, Kai Han, Yehui Tang, Han Hu, Yunhe Wang, Chang Xu

To tackle the challenge in distilling heterogeneous models, we propose a simple yet effective one-for-all KD framework called OFA-KD, which significantly improves the distillation performance between heterogeneous architectures.

Knowledge Distillation

Imitation Learning from Purified Demonstrations

1 code implementation 11 Oct 2023 Yunke Wang, Minjing Dong, Yukun Zhao, Bo Du, Chang Xu

In the first step, we apply a forward diffusion process to smooth potential noises in imperfect demonstrations by introducing additional noise.

Imitation Learning Sequential Decision Making
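
The snippet above mentions a forward diffusion process that smooths noise in demonstrations by adding further noise. The sketch below shows the generic DDPM-style closed-form forward step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, applied to a single state-action vector. The linear beta schedule, the function names, and the choice of this particular parameterization are assumptions for illustration; the listing does not state which formulation the paper uses.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed, commonly used schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def alpha_bar(betas, t):
    """Cumulative product of (1 - beta) over the first t steps; equals 1 at t = 0."""
    prod = 1.0
    for beta in betas[:t]:
        prod *= 1.0 - beta
    return prod


def forward_diffuse(x0, t, betas, rng):
    """Sample x_t given x_0 in closed form by mixing the signal with Gaussian noise."""
    ab = alpha_bar(betas, t)
    signal_scale = math.sqrt(ab)
    noise_scale = math.sqrt(1.0 - ab)
    return [signal_scale * x + noise_scale * rng.gauss(0.0, 1.0) for x in x0]


# Hypothetical usage: noise a 3-dimensional demonstration vector to step 200 of 1000.
betas = linear_betas(1000)
noisy = forward_diffuse([0.5, -1.0, 2.0], t=200, betas=betas, rng=random.Random(0))
```

Larger `t` shrinks the signal term and grows the noise term, so differences between clean and imperfect demonstrations become progressively less distinguishable.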

Parameter-Saving Adversarial Training: Reinforcing Multi-Perturbation Robustness via Hypernetworks

no code implementations 28 Sep 2023 Huihui Gong, Minjing Dong, Siqi Ma, Seyit Camtepe, Surya Nepal, Chang Xu

Adversarial training serves as one of the most popular and effective methods to defend against adversarial perturbations.

MM-NeRF: Multimodal-Guided 3D Multi-Style Transfer of Neural Radiance Field

no code implementations 24 Sep 2023 Zijiang Yang, Zhongwei Qiu, Chang Xu, Dongmei Fu

3D style transfer aims to generate stylized views of 3D scenes with specified styles, which requires both high-quality generation and multi-view consistency.

Incremental Learning Style Transfer

Stealthy Physical Masked Face Recognition Attack via Adversarial Style Optimization

no code implementations 18 Sep 2023 Huihui Gong, Minjing Dong, Siqi Ma, Seyit Camtepe, Surya Nepal, Chang Xu

Moreover, to ameliorate the phenomenon of sub-optimization with one fixed style, we propose to discover the optimal style given a target through style optimization in a continuous relaxation manner.

Face Recognition

Efficient Transfer Learning in Diffusion Models via Adversarial Noise

no code implementations 23 Aug 2023 Xiyu Wang, Baijiong Lin, Daochang Liu, Chang Xu

Diffusion Probabilistic Models (DPMs) have demonstrated substantial promise in image generation tasks but heavily rely on the availability of large amounts of training data.

Denoising Diversity +2

Boosting Diffusion Models with an Adaptive Momentum Sampler

no code implementations 23 Aug 2023 Xiyu Wang, Anh-Dung Dinh, Daochang Liu, Chang Xu

Our proposed sampler can be readily applied to a pre-trained diffusion model, utilizing momentum mechanisms and adaptive updating to smooth the reverse sampling process and ensure stable generation, resulting in outputs of enhanced quality.

A Benchmark Study on Calibration

no code implementations 23 Aug 2023 Linwei Tao, Younan Zhu, Haolan Guo, Minjing Dong, Chang Xu

As far as we are aware, our research represents the first large-scale investigation into calibration properties and the premier study of calibration issues within NAS.

Neural Architecture Search

CoNe: Contrast Your Neighbours for Supervised Image Classification

1 code implementation 21 Aug 2023 Mingkai Zheng, Shan You, Lang Huang, Xiu Su, Fei Wang, Chen Qian, Xiaogang Wang, Chang Xu

Moreover, to further boost the performance, we propose "distributional consistency" as a more informative regularization to enable similar instances to have a similar probability distribution.

Classification Image Classification +1

Microstructure-Empowered Stock Factor Extraction and Utilization

no code implementations 16 Aug 2023 Xianfeng Jiao, Zizhong Li, Chang Xu, Yang Liu, Weiqing Liu, Jiang Bian

To address these challenges, we propose a novel framework that aims to effectively extract essential factors from order flow data for diverse downstream tasks across different granularities and scenarios.

Stock Trend Prediction

Model Synthesis for Zero-Shot Model Attribution

1 code implementation 29 Jul 2023 Tianyun Yang, Juan Cao, Danding Wang, Chang Xu

The design of the synthesis technique is motivated by observations on how the basic generative model's architecture building blocks and parameters influence fingerprint patterns, and it is validated through two designed metrics that examine synthetic models' fidelity and diversity.

Attribute Zero-shot Generalization

Re-mine, Learn and Reason: Exploring the Cross-modal Semantic Correlations for Language-guided HOI detection

no code implementations ICCV 2023 Yichao Cao, Qingfei Tang, Feng Yang, Xiu Su, Shan You, Xiaobo Lu, Chang Xu

Human-Object Interaction (HOI) detection is a challenging computer vision task that requires visual models to address the complex interactive relationship between humans and objects and predict HOI triplets.

Human-Object Interaction Detection Sentence +1

What Can Simple Arithmetic Operations Do for Temporal Modeling?

2 code implementations ICCV 2023 Wenhao Wu, Yuxin Song, Zhun Sun, Jingdong Wang, Chang Xu, Wanli Ouyang

We conduct comprehensive ablation studies on the instantiation of ATMs and demonstrate that this module provides powerful temporal modeling capability at a low computational cost.

Action Classification Action Recognition +1

Neural Architecture Retrieval

1 code implementation 16 Jul 2023 Xiaohuan Pei, Yanxi Li, Minjing Dong, Chang Xu

With the increasing number of new neural architecture designs and substantial existing neural architectures, it becomes difficult for the researchers to situate their contributions compared with existing neural architectures or establish the connections between their designs and other relevant ones.

Contrastive Learning Graph Representation Learning +1

GPT Self-Supervision for a Better Data Annotator

no code implementations 7 Jun 2023 Xiaohuan Pei, Yanxi Li, Chang Xu

In the one-shot tuning phase, we sample a data point from the support set as part of the prompt for GPT to generate a textual summary, which is then used to recover the original data.

One-Shot Learning Sentence

VanillaKD: Revisit the Power of Vanilla Knowledge Distillation from Small Scale to Large Scale

1 code implementation 25 May 2023 Zhiwei Hao, Jianyuan Guo, Kai Han, Han Hu, Chang Xu, Yunhe Wang

The tremendous success of large models trained on extensive datasets demonstrates that scale is a key ingredient in achieving superior results.

Data Augmentation Knowledge Distillation

Knowledge Diffusion for Distillation

1 code implementation NeurIPS 2023 Tao Huang, Yuan Zhang, Mingkai Zheng, Shan You, Fei Wang, Chen Qian, Chang Xu

To address this, we propose to denoise student features using a diffusion model trained by teacher features.

Denoising Image Classification +4

Dual Focal Loss for Calibration

1 code implementation 23 May 2023 Linwei Tao, Minjing Dong, Chang Xu

While different variants of focal loss have been explored, it is difficult to find a balance between over-confidence and under-confidence.

Can GPT-4 Perform Neural Architecture Search?

1 code implementation 21 Apr 2023 Mingkai Zheng, Xiu Su, Shan You, Fei Wang, Chen Qian, Chang Xu, Samuel Albanie

We investigate the potential of GPT-4 to perform Neural Architecture Search (NAS), the task of designing effective neural architectures.

Navigate Neural Architecture Search

Dynamic Coarse-to-Fine Learning for Oriented Tiny Object Detection

1 code implementation CVPR 2023 Chang Xu, Jian Ding, Jinwang Wang, Wen Yang, Huai Yu, Lei Yu, Gui-Song Xia

Despite the exploration of adaptive label assignment in recent oriented object detectors, the extreme geometry shape and limited feature of oriented tiny objects still induce severe mismatch and imbalance issues.

object-detection Object Detection +2

Thin Films on the Skin, but not Frictional Agents, Attenuate the Percept of Pleasantness to Brushed Stimuli

no code implementations 28 Feb 2023 Merat Rezaei, Saad S. Nagi, Chang Xu, Sarah McIntyre, Hakan Olausson, Gregory J. Gerling

Brushed stimuli are perceived as pleasant when stroked lightly on the skin surface of a touch receiver at certain velocities.

Friction

Two-in-one Knowledge Distillation for Efficient Facial Forgery Detection

no code implementations 21 Feb 2023 Chuyang Zhou, Jiajun Huang, Daochang Liu, Chengbin Du, Siqi Ma, Surya Nepal, Chang Xu

More specifically, knowledge distillation on both the spatial and frequency branches performs worse than distillation on the spatial branch alone.

Knowledge Distillation Vocal Bursts Valence Prediction

Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning

1 code implementation 13 Feb 2023 Yunke Wang, Bo Du, Chang Xu

The trajectories of an initial agent policy could be closer to those non-optimal expert demonstrations, but within the framework of adversarial imitation learning, agent policy will be optimized to cheat the discriminator and produce trajectories that are similar to those optimal expert demonstrations.

Imitation Learning

Calibrating a Deep Neural Network with Its Predecessors

1 code implementation 13 Feb 2023 Linwei Tao, Minjing Dong, Daochang Liu, Changming Sun, Chang Xu

However, early stopping, as a well-known technique to mitigate overfitting, fails to calibrate networks.

Anti-Compression Contrastive Facial Forgery Detection

no code implementations 13 Feb 2023 Jiajun Huang, Xinqi Zhu, Chengbin Du, Siqi Ma, Surya Nepal, Chang Xu

To enhance the performance of such models, we treat the weakly compressed and strongly compressed data as two views of the original data, which should have similar representations and similar relationships with other samples.

Contrastive Learning

Beyond Pretrained Features: Noisy Image Modeling Provides Adversarial Defense

1 code implementation NeurIPS 2023 Zunzhi You, Daochang Liu, Bohyung Han, Chang Xu

Experimental results demonstrate that, in terms of adversarial robustness, NIM is superior to MIM thanks to its effective denoising capability.

Adversarial Defense Adversarial Robustness +4

Adversarial Robustness via Random Projection Filters

1 code implementation CVPR 2023 Minjing Dong, Chang Xu

Deep Neural Networks show superior performance in various tasks but are vulnerable to adversarial attacks.

Adversarial Robustness Attribute +1

Private Image Generation With Dual-Purpose Auxiliary Classifier

no code implementations CVPR 2023 Chen Chen, Daochang Liu, Siqi Ma, Surya Nepal, Chang Xu

However, apart from this standard utility, we identify the "reversed utility" as another crucial aspect, which computes the accuracy on generated data of a classifier trained using real data, dubbed as real2gen accuracy (r2g%).

Image Generation Privacy Preserving

Trade-Off Between Robustness and Accuracy of Vision Transformers

no code implementations CVPR 2023 Yanxi Li, Chang Xu

Although deep neural networks (DNNs) have shown great successes in computer vision tasks, they are vulnerable to perturbations on inputs, and there exists a trade-off between the natural accuracy and robustness to such perturbations, which is mainly caused by the existence of robust non-predictive features and non-robust predictive features.

ContraFeat: Contrasting Deep Features for Semantic Discovery

no code implementations 14 Dec 2022 Xinqi Zhu, Chang Xu, DaCheng Tao

In this paper, we propose a model that automates this process and achieves state-of-the-art semantic discovery performance.

FastMIM: Expediting Masked Image Modeling Pre-training for Vision

1 code implementation 13 Dec 2022 Jianyuan Guo, Kai Han, Han Wu, Yehui Tang, Yunhe Wang, Chang Xu

This paper presents FastMIM, a simple and generic framework for expediting masked image modeling with the following two steps: (i) pre-training vision backbones with low-resolution input images; and (ii) reconstructing Histograms of Oriented Gradients (HOG) features instead of the original RGB values of the input images.

GhostNetV2: Enhance Cheap Operation with Long-Range Attention

11 code implementations 23 Nov 2022 Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Chao Xu, Yunhe Wang

The convolutional operation can only capture local information in a window region, which prevents performance from being further improved.

Boosting Semi-Supervised Semantic Segmentation with Probabilistic Representations

1 code implementation 26 Oct 2022 Haoyu Xie, Changqi Wang, Mingkai Zheng, Minjing Dong, Shan You, Chong Fu, Chang Xu

In prevalent pixel-wise contrastive learning solutions, the model maps pixels to deterministic representations and regularizes them in the latent space.

Contrastive Learning Semi-Supervised Semantic Segmentation

Learning Differential Operators for Interpretable Time Series Modeling

no code implementations 3 Sep 2022 Yingtao Luo, Chang Xu, Yang Liu, Weiqing Liu, Shun Zheng, Jiang Bian

In this work, we propose a learning framework that can automatically obtain interpretable PDE models from sequential data.

Decision Making Meta-Learning +2

Motion Robust High-Speed Light-Weighted Object Detection With Event Camera

1 code implementation 24 Aug 2022 Bingde Liu, Chang Xu, Wen Yang, Huai Yu, Lei Yu

In this work, we propose a motion robust and high-speed detection pipeline which better leverages the event data.

Data Augmentation object-detection +3

RFLA: Gaussian Receptive Field based Label Assignment for Tiny Object Detection

1 code implementation 18 Aug 2022 Chang Xu, Jinwang Wang, Wen Yang, Huai Yu, Lei Yu, Gui-Song Xia

Then, instead of assigning samples with IoU or center sampling strategy, a new Receptive Field Distance (RFD) is proposed to directly measure the similarity between the Gaussian receptive field and ground truth.

Object object-detection +1

LightViT: Towards Light-Weight Convolution-Free Vision Transformers

1 code implementation 12 Jul 2022 Tao Huang, Lang Huang, Shan You, Fei Wang, Chen Qian, Chang Xu

Vision transformers (ViTs) are usually considered to be less light-weight than convolutional neural networks (CNNs) due to the lack of inductive bias.

Image Classification Inductive Bias +3

Masked Distillation with Receptive Tokens

1 code implementation 29 May 2022 Tao Huang, Yuan Zhang, Shan You, Fei Wang, Chen Qian, Jian Cao, Chang Xu

To obtain a group of masks, the receptive tokens are learned via the regular task loss but with the teacher fixed, and we also leverage a Dice loss to enrich the diversity of learned masks.

object-detection Object Detection +1

Knowledge Distillation from A Stronger Teacher

3 code implementations 21 May 2022 Tao Huang, Shan You, Fei Wang, Chen Qian, Chang Xu

In this paper, we show that simply preserving the relations between the predictions of teacher and student would suffice, and propose a correlation-based loss to capture the intrinsic inter-class relations from the teacher explicitly.

Ranked #3 on Knowledge Distillation on ImageNet (using extra training data)

Image Classification Knowledge Distillation +2
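
The snippet above refers to a correlation-based loss that preserves relations between teacher and student predictions rather than matching them exactly. A minimal sketch of one such loss follows: it is one minus the Pearson correlation between the two class-probability vectors of a single sample. The use of Pearson correlation, the temperature parameter, and all function names are assumptions for illustration and are not taken from the listing.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp((z - peak) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return cov / (norm_u * norm_v)


def correlation_loss(student_logits, teacher_logits, temperature=1.0):
    """1 - Pearson correlation between student and teacher class probabilities."""
    p_student = softmax(student_logits, temperature)
    p_teacher = softmax(teacher_logits, temperature)
    return 1.0 - pearson(p_student, p_teacher)


# Hypothetical usage: the student ranks the classes in the opposite order to the teacher.
loss = correlation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

Because Pearson correlation is invariant to positive scaling and shifting of the probability vector, the loss is near zero whenever the student's class probabilities are a positively scaled and shifted copy of the teacher's, even if the values differ, which is a looser requirement than an exact match.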

Searching for Network Width with Bilaterally Coupled Network

1 code implementation 25 Mar 2022 Xiu Su, Shan You, Jiyang Xie, Fei Wang, Chen Qian, ChangShui Zhang, Chang Xu

In BCNet, each channel is fairly trained and responsible for the same amount of network widths, thus each network width can be evaluated more accurately.

Fairness

DyRep: Bootstrapping Training with Dynamic Re-parameterization

2 code implementations CVPR 2022 Tao Huang, Shan You, Bohan Zhang, Yuxuan Du, Fei Wang, Chen Qian, Chang Xu

Structural re-parameterization (Rep) methods achieve noticeable improvements on simple VGG-style networks.

Weak Augmentation Guided Relational Self-Supervised Learning

1 code implementation16 Mar 2022 Mingkai Zheng, Shan You, Fei Wang, Chen Qian, ChangShui Zhang, Xiaogang Wang, Chang Xu

Self-supervised Learning (SSL) including the mainstream contrastive learning has achieved great success in learning visual representations without data annotations.

Contrastive Learning Relation +2

Multi-Tailed Vision Transformer for Efficient Inference

no code implementations3 Mar 2022 Yunke Wang, Bo Du, Wenyuan Wang, Chang Xu

To satisfy the sequential input of Transformer, the tail of ViT first splits each image into a sequence of visual tokens with a fixed length.

Relational Surrogate Loss Learning

1 code implementation ICLR 2022 Tao Huang, Zekang Li, Hua Lu, Yong Shan, Shusheng Yang, Yang Feng, Fei Wang, Shan You, Chang Xu

Evaluation metrics in machine learning are often hardly taken as loss functions, as they could be non-differentiable and non-decomposable, e.g., average precision and F1 score.

Image Classification Machine Reading Comprehension +3

GhostNets on Heterogeneous Devices via Cheap Operations

8 code implementations10 Jan 2022 Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chunjing Xu, Enhua Wu, Qi Tian

The proposed C-Ghost module can be taken as a plug-and-play component to upgrade existing convolutional neural networks.

DeepFake Disrupter: The Detector of DeepFake Is My Friend

no code implementations CVPR 2022 Xueyu Wang, Jiajun Huang, Siqi Ma, Surya Nepal, Chang Xu

We argue that the detectors do not share a similar perspective as human eyes, which might still be spoofed by the disrupted data.

Face Swapping

An Empirical Study of Adder Neural Networks for Object Detection

no code implementations NeurIPS 2021 Xinghao Chen, Chang Xu, Minjing Dong, Chunjing Xu, Yunhe Wang

Adder neural networks (AdderNets) have shown impressive performance on image classification with only addition operations, which are more energy efficient than traditional convolutional neural networks built with multiplications.

Autonomous Driving Face Detection +3

Handling Long-tailed Feature Distribution in AdderNets

no code implementations NeurIPS 2021 Minjing Dong, Yunhe Wang, Xinghao Chen, Chang Xu

Adder neural networks (ANNs) are designed for low energy cost: they replace expensive multiplications in convolutional neural networks (CNNs) with cheaper additions to yield energy-efficient neural networks and hardware accelerations.

Knowledge Distillation

Towards Stable and Robust AdderNets

no code implementations NeurIPS 2021 Minjing Dong, Yunhe Wang, Xinghao Chen, Chang Xu

Adder neural network (AdderNet) replaces the original convolutions, which involve massive multiplications, with cheap additions while achieving comparable performance, thus yielding a series of energy-efficient neural networks.

Adversarial Robustness

GreedyNASv2: Greedier Search with a Greedy Path Filter

no code implementations CVPR 2022 Tao Huang, Shan You, Fei Wang, Chen Qian, ChangShui Zhang, Xiaogang Wang, Chang Xu

In this paper, we leverage an explicit path filter to capture the characteristics of paths and directly filter those weak ones, so that the search can be thus implemented on the shrunk space more greedily and efficiently.

An Image Patch is a Wave: Phase-Aware Vision MLP

8 code implementations CVPR 2022 Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Yanxi Li, Chao Xu, Yunhe Wang

To dynamically aggregate tokens, we propose to represent each token as a wave function with two parts, amplitude and phase.

Image Classification object-detection +2

A Normalized Gaussian Wasserstein Distance for Tiny Object Detection

3 code implementations26 Oct 2021 Jinwang Wang, Chang Xu, Wen Yang, Lei Yu

Our key observation is that Intersection over Union (IoU) based metrics such as IoU itself and its extensions are very sensitive to the location deviation of the tiny objects, and drastically deteriorate the detection performance when used in anchor-based detectors.

Object object-detection +1
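
The sensitivity of IoU to small location deviations, noted in the entry above, can be checked with a short calculation. The sketch below applies the same 2-pixel diagonal shift to a 6x6 box and a 60x60 box and compares the resulting IoU values. The box sizes, the shift, and the helper names `iou` and `shifted_iou` are illustrative assumptions, not values from the paper.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extent along each axis, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def shifted_iou(size, shift):
    """IoU between a square box of side `size` and a copy shifted diagonally by `shift`."""
    ground_truth = (0.0, 0.0, float(size), float(size))
    prediction = (float(shift), float(shift), float(size + shift), float(size + shift))
    return iou(ground_truth, prediction)


tiny_box_iou = shifted_iou(size=6, shift=2)     # tiny object
large_box_iou = shifted_iou(size=60, shift=2)   # normal-sized object
print(f"6x6 box, 2 px shift:   IoU = {tiny_box_iou:.3f}")
print(f"60x60 box, 2 px shift: IoU = {large_box_iou:.3f}")
```

The same pixel offset costs the tiny box most of its overlap while barely affecting the large one, which is why an IoU threshold tuned for normal-sized objects can reject nearly correct predictions for tiny objects.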

Learning Versatile Convolution Filters for Efficient Visual Recognition

no code implementations20 Sep 2021 Kai Han, Yunhe Wang, Chang Xu, Chunjing Xu, Enhua Wu, DaCheng Tao

A series of secondary filters can be derived from a primary filter with the help of binary masks.

Hire-MLP: Vision MLP via Hierarchical Rearrangement

10 code implementations CVPR 2022 Jianyuan Guo, Yehui Tang, Kai Han, Xinghao Chen, Han Wu, Chao Xu, Chang Xu, Yunhe Wang

Previous vision MLPs such as MLP-Mixer and ResMLP accept linearly flattened image patches as input, making them inflexible for different input sizes and hard to capture spatial information.

Image Classification object-detection +2

DeepFake MNIST+: A DeepFake Facial Animation Dataset

1 code implementation18 Aug 2021 Jiajun Huang, Xueyu Wang, Bo Du, Pei Du, Chang Xu

It includes 10,000 facial animation videos in ten different actions, which can spoof the recent liveness detectors.

DeepFake Detection Face Swapping +1

Neural Architecture Dilation for Adversarial Robustness

no code implementations NeurIPS 2021 Yanxi Li, Zhaohui Yang, Yunhe Wang, Chang Xu

With the tremendous advances in the architecture and scale of convolutional neural networks (CNNs) over the past few decades, they can easily reach or even exceed the performance of humans in certain tasks.

Adversarial Robustness

ReSSL: Relational Self-Supervised Learning with Weak Augmentation

2 code implementations NeurIPS 2021 Mingkai Zheng, Shan You, Fei Wang, Chen Qian, ChangShui Zhang, Xiaogang Wang, Chang Xu

Self-supervised Learning (SSL) including the mainstream contrastive learning has achieved great success in learning visual representations without data annotations.

Contrastive Learning Relation +2

CMT: Convolutional Neural Networks Meet Vision Transformers

14 code implementations CVPR 2022 Jianyuan Guo, Kai Han, Han Wu, Yehui Tang, Xinghao Chen, Yunhe Wang, Chang Xu

Vision transformers have been successfully applied to image recognition tasks due to their ability to capture long-range dependencies within an image.

Putting words into the system's mouth: A targeted attack on neural machine translation using monolingual data poisoning

1 code implementation12 Jul 2021 Jun Wang, Chang Xu, Francisco Guzman, Ahmed El-Kishky, Yuqing Tang, Benjamin I. P. Rubinstein, Trevor Cohn

Neural machine translation systems are known to be vulnerable to adversarial test inputs; however, as we show in this paper, these systems are also vulnerable to training attacks.

Data Poisoning Machine Translation +3

ViTAS: Vision Transformer Architecture Search

1 code implementation25 Jun 2021 Xiu Su, Shan You, Jiyang Xie, Mingkai Zheng, Fei Wang, Chen Qian, ChangShui Zhang, Xiaogang Wang, Chang Xu

Vision transformers (ViTs) inherited the success of NLP but their structures have not been sufficiently investigated and optimized for visual tasks.

Inductive Bias Neural Architecture Search

ReNAS: Relativistic Evaluation of Neural Architecture Search

7 code implementations CVPR 2021 Yixing Xu, Yunhe Wang, Kai Han, Yehui Tang, Shangling Jui, Chunjing Xu, Chang Xu

An effective and efficient architecture performance evaluation scheme is essential for the success of Neural Architecture Search (NAS).

Neural Architecture Search

Positive-Unlabeled Data Purification in the Wild for Object Detection

no code implementations CVPR 2021 Jianyuan Guo, Kai Han, Han Wu, Chao Zhang, Xinghao Chen, Chunjing Xu, Chang Xu, Yunhe Wang

In this paper, we present a positive-unlabeled learning based scheme to expand training data by purifying valuable images from massive unlabeled ones, where the original training data are viewed as positive data and the unlabeled images in the wild are unlabeled data.

Knowledge Distillation object-detection +1

Learning Student Networks in the Wild

1 code implementation CVPR 2021 Hanting Chen, Tianyu Guo, Chang Xu, Wenshuo Li, Chunjing Xu, Chao Xu, Yunhe Wang

Experiments on various datasets demonstrate that the student networks learned by the proposed method can achieve comparable performance with those using the original dataset.

Knowledge Distillation Model Compression

K-shot NAS: Learnable Weight-Sharing for NAS with K-shot Supernets

no code implementations11 Jun 2021 Xiu Su, Shan You, Mingkai Zheng, Fei Wang, Chen Qian, ChangShui Zhang, Chang Xu

The operation weight for each path is represented as a convex combination of items in a dictionary with a simplex code.

Commutative Lie Group VAE for Disentanglement Learning

2 code implementations7 Jun 2021 Xinqi Zhu, Chang Xu, DaCheng Tao

Instead, we propose to encode the data variations with groups, a structure that not only can equivariantly represent variations, but can also be adaptively optimized to preserve the properties of data variations.

Disentanglement

Patch Slimming for Efficient Vision Transformers

no code implementations CVPR 2022 Yehui Tang, Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chao Xu, DaCheng Tao

We first identify the effective patches in the last layer and then use them to guide the patch selection process of previous layers.

Efficient ViTs

Universal Adder Neural Networks

no code implementations29 May 2021 Hanting Chen, Yunhe Wang, Chang Xu, Chao Xu, Chunjing Xu, Tong Zhang

The widely-used convolutions in deep neural networks are exactly cross-correlation to measure the similarity between input feature and convolution filters, which involves massive multiplications between float values.

BCNet: Searching for Network Width with Bilaterally Coupled Network

no code implementations CVPR 2021 Xiu Su, Shan You, Fei Wang, Chen Qian, ChangShui Zhang, Chang Xu

In BCNet, each channel is fairly trained and responsible for the same amount of network widths, thus each network width can be evaluated more accurately.

Where and What? Examining Interpretable Disentangled Representations

1 code implementation CVPR 2021 Xinqi Zhu, Chang Xu, DaCheng Tao

We thus impose a perturbation on a certain dimension of the latent code, and expect to identify the perturbation along this dimension from the generated images so that the encoding of simple variations can be enforced.

Disentanglement Model Selection +1

Distilling Object Detectors via Decoupled Features

1 code implementation CVPR 2021 Jianyuan Guo, Kai Han, Yunhe Wang, Han Wu, Xinghao Chen, Chunjing Xu, Chang Xu

To this end, we present a novel distillation algorithm via decoupled features (DeFeat) for learning a better student detector.

Image Classification Knowledge Distillation +3

Joint Distribution across Representation Space for Out-of-Distribution Detection

no code implementations23 Mar 2021 Jingwei Xu, Siyuan Zhu, Zenan Li, Chang Xu

Specifically, we construct a generative model, called Latent Sequential Gaussian Mixture (LSGM), to depict how the in-distribution latent features are generated in terms of the trace of DNN inference across representation spaces.

Out-of-Distribution Detection Out of Distribution (OOD) Detection

Prioritized Architecture Sampling with Monto-Carlo Tree Search

1 code implementation CVPR 2021 Xiu Su, Tao Huang, Yanxi Li, Shan You, Fei Wang, Chen Qian, ChangShui Zhang, Chang Xu

One-shot neural architecture search (NAS) methods significantly reduce the search cost by considering the whole search space as one network, which only needs to be trained once.

Neural Architecture Search

Learning Frequency-aware Dynamic Network for Efficient Super-Resolution

no code implementations ICCV 2021 Wenbin Xie, Dehua Song, Chang Xu, Chunjing Xu, HUI ZHANG, Yunhe Wang

Extensive experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures to obtain a better tradeoff between visual quality and computational complexity.

Image Super-Resolution

Manifold Regularized Dynamic Network Pruning

7 code implementations CVPR 2021 Yehui Tang, Yunhe Wang, Yixing Xu, Yiping Deng, Chao Xu, DaCheng Tao, Chang Xu

Then, the manifold relationship between instances and the pruned sub-networks will be aligned in the training procedure.

Network Pruning

LocalDrop: A Hybrid Regularization for Deep Neural Networks

no code implementations1 Mar 2021 Ziqing Lu, Chang Xu, Bo Du, Takashi Ishida, Lefei Zhang, Masashi Sugiyama

In neural networks, developing regularization algorithms to settle overfitting is one of the major study areas.

Learning Frequency Domain Approximation for Binary Neural Networks

3 code implementations NeurIPS 2021 Yixing Xu, Kai Han, Chang Xu, Yehui Tang, Chunjing Xu, Yunhe Wang

Binary neural networks (BNNs) represent original full-precision weights and activations into 1-bit with sign function.

Hero: On the Chaos When PATH Meets Modules

no code implementations24 Feb 2021 Ying Wang, Liang Qiao, Chang Xu, Yepang Liu, Shing-Chi Cheung, Na Meng, Hai Yu, Zhiliang Zhu

The results showed that Hero achieved a high detection rate of 98.5% on a DM issue benchmark and found 2,422 new DM issues in 2,356 popular Golang projects.

Software Engineering

REST: Relational Event-driven Stock Trend Forecasting

no code implementations15 Feb 2021 Wentao Xu, Weiqing Liu, Chang Xu, Jiang Bian, Jian Yin, Tie-Yan Liu

To remedy the first shortcoming, we propose to model the stock context and learn the effect of event information on the stocks under different contexts.

Locally Free Weight Sharing for Network Width Search

no code implementations ICLR 2021 Xiu Su, Shan You, Tao Huang, Fei Wang, Chen Qian, ChangShui Zhang, Chang Xu

In this paper, to better evaluate each width, we propose a locally free weight sharing strategy (CafeNet) accordingly.

PTN: A Poisson Transfer Network for Semi-supervised Few-shot Learning

no code implementations20 Dec 2020 Huaxi Huang, Junjie Zhang, Jian Zhang, Qiang Wu, Chang Xu

Second, the extra unlabeled samples are employed to transfer the knowledge from base classes to novel classes through contrastive learning.

Contrastive Learning Few-Shot Learning

Finite particle number description of neutron matter using the unitary correlation operator and high-momentum pair methods

no code implementations3 Dec 2020 Niu Wan, Takayuki Myo, Chang Xu, Hiroshi Toki, Hisashi Horiuchi, Mengjiao Lyu

The central short-range correlation coming from the short-range repulsion in the NN interaction is treated by the unitary correlation operator method (UCOM) and the tensor correlation and spin-orbit effects are described by the two-particle two-hole (2p2h) excitations of nucleon pairs, in which the two nucleons with a large relative momentum are regarded as a high-momentum pair (HM).

Nuclear Theory

Assessing Social License to Operate from the Public Discourse on Social Media

no code implementations COLING 2020 Chang Xu, Cecile Paris, Ross Sparks, Surya Nepal, Keith VanderLinden

Our experimental results show that SIRTA is highly effective in distilling stances from social posts for SLO level assessment, and that the continuous monitoring of SLO levels afforded by SIRTA enables the early detection of critical SLO changes.

text-classification Text Classification +2