Search Results for author: Hongkai Xiong

Found 72 papers, 26 papers with code

Attribute Mix: Semantic Data Augmentation for Fine Grained Recognition

1 code implementation • 6 Apr 2020 • Hao Li, Xiaopeng Zhang, Hongkai Xiong, Qi Tian

In this paper, we propose Attribute Mix, a data augmentation strategy at attribute level to expand the fine-grained samples.

Ranked #22 on Fine-Grained Image Classification on CUB-200-2011

Attribute Data Augmentation +1

567

PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search

8 code implementations • ICLR 2020 • Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Guo-Jun Qi, Qi Tian, Hongkai Xiong

Differentiable architecture search (DARTS) provided a fast solution in finding effective network architectures, but suffered from large memory and computing overheads in jointly training a super-network and searching for an optimal architecture.

Ranked #20 on Neural Architecture Search on CIFAR-10

Neural Architecture Search

429

From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models

1 code implementation • 13 Oct 2023 • Dongsheng Jiang, Yuchen Liu, Songlin Liu, Jin'e Zhao, Hao Zhang, Zhen Gao, Xiaopeng Zhang, Jin Li, Hongkai Xiong

By simply equipping it with an MLP layer for alignment, DINO surpasses CLIP in fine-grained related perception tasks.

Hallucination Image Captioning +3

174

Dual adaptive training of photonic neural networks

1 code implementation • 9 Dec 2022 • Ziyang Zheng, Zhengyang Duan, Hang Chen, Rui Yang, Sheng Gao, Haiou Zhang, Hongkai Xiong, Xing Lin

Photonic neural network (PNN) is a remarkable analog artificial intelligence (AI) accelerator that computes with photons instead of electrons to feature low latency, high energy efficiency, and high parallelism.

Image Classification

Spatial-Temporal Transformer Networks for Traffic Flow Forecasting

1 code implementation • 9 Jan 2020 • Mingxing Xu, Wenrui Dai, Chunmiao Liu, Xing Gao, Weiyao Lin, Guo-Jun Qi, Hongkai Xiong

In this paper, we propose a novel paradigm of Spatial-Temporal Transformer Networks (STTNs) that leverages dynamical directed spatial dependencies and long-range temporal dependencies to improve the accuracy of long-term traffic forecasting.

Traffic Prediction

Masked Autoencoders are Robust Data Augmentors

1 code implementation • 10 Jun 2022 • Haohang Xu, Shuangrui Ding, Xiaopeng Zhang, Hongkai Xiong, Qi Tian

Specifically, MRA consistently enhances the performance on supervised, semi-supervised as well as few-shot classification.

Image Augmentation Image Classification +1

SdAE: Self-distillated Masked Autoencoder

1 code implementation • 31 Jul 2022 • Yabo Chen, Yuchen Liu, Dongsheng Jiang, Xiaopeng Zhang, Wenrui Dai, Hongkai Xiong, Qi Tian

We also analyze how to build good views for the teacher branch to produce latent representation from the perspective of information bottleneck.

Descriptive Self-Supervised Learning

Batch Normalization with Enhanced Linear Transformation

1 code implementation • 28 Nov 2020 • Yuhui Xu, Lingxi Xie, Cihang Xie, Jieru Mei, Siyuan Qiao, Wei Shen, Hongkai Xiong, Alan Yuille

Batch normalization (BN) is a fundamental unit in modern deep networks, in which a linear transformation module was designed for improving BN's flexibility of fitting complex data distributions.

Trained Rank Pruning for Efficient Deep Neural Networks

1 code implementation • 6 Dec 2018 • Yuhui Xu, Yuxi Li, Shuai Zhang, Wei Wen, Botao Wang, Yingyong Qi, Yiran Chen, Weiyao Lin, Hongkai Xiong

We propose Trained Rank Pruning (TRP), which iterates low rank approximation and training.

Quantization

Trained Rank Pruning for Efficient Deep Neural Networks

1 code implementation • 9 Oct 2019 • Yuhui Xu, Yuxi Li, Shuai Zhang, Wei Wen, Botao Wang, Wenrui Dai, Yingyong Qi, Yiran Chen, Weiyao Lin, Hongkai Xiong

To accelerate DNN inference, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations.

TRP: Trained Rank Pruning for Efficient Deep Neural Networks

1 code implementation • 30 Apr 2020 • Yuhui Xu, Yuxi Li, Shuai Zhang, Wei Wen, Botao Wang, Yingyong Qi, Yiran Chen, Weiyao Lin, Hongkai Xiong

The TRP trained network inherently has a low-rank structure, and is approximated with negligible performance loss, thus eliminating the fine-tuning process after low rank decomposition.

Motion-aware Contrastive Video Representation Learning via Foreground-background Merging

1 code implementation • CVPR 2022 • Shuangrui Ding, Maomao Li, Tianyu Yang, Rui Qian, Haohang Xu, Qingyi Chen, Jue Wang, Hongkai Xiong

To alleviate such bias, we propose Foreground-background Merging (FAME) to deliberately compose the moving foreground region of the selected video onto the static background of others.

Action Recognition Contrastive Learning +1

Bag of Instances Aggregation Boosts Self-supervised Distillation

1 code implementation • ICLR 2022 • Haohang Xu, Jiemin Fang, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian

Here, a bag of instances denotes a set of similar samples constructed by the teacher and grouped within a bag, and the goal of distillation is to aggregate compact representations over the student with respect to the instances in a bag.

Contrastive Learning Self-Supervised Learning

Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation

1 code implementation • 29 Nov 2023 • Shuangrui Ding, Rui Qian, Haohang Xu, Dahua Lin, Hongkai Xiong

In this paper, we propose a simple yet effective approach for self-supervised video object segmentation (VOS).

Clustering Object +6

Deep Neural Network Compression with Single and Multiple Level Quantization

1 code implementation • 6 Mar 2018 • Yuhui Xu, Yongzhuang Wang, Aojun Zhou, Weiyao Lin, Hongkai Xiong

In this paper, we propose two novel network quantization approaches, single-level network quantization (SLQ) for high-bit quantization and multi-level network quantization (MLQ) for extremely low-bit quantization (ternary). We are the first to consider the network quantization from both width and depth level.

Neural Network Compression Quantization

MimicNorm: Weight Mean and Last BN Layer Mimic the Dynamic of Batch Normalization

1 code implementation • 19 Oct 2020 • Wen Fei, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong

We leverage the neural tangent kernel (NTK) theory to prove that our weight mean operation whitens activations and transitions the network into the chaotic regime, as a BN layer does, and consequently leads to enhanced convergence.

Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation

1 code implementation • ICCV 2023 • Shuangrui Ding, Peisen Zhao, Xiaopeng Zhang, Rui Qian, Hongkai Xiong, Qi Tian

Based on the STA score, we are able to progressively prune the tokens without introducing any additional parameters or requiring further re-training.

Video Recognition

Hybrid ISTA: Unfolding ISTA With Convergence Guarantees Using Free-Form Deep Neural Networks

1 code implementation • 25 Apr 2022 • Ziyang Zheng, Wenrui Dai, Duoduo Xue, Chenglin Li, Junni Zou, Hongkai Xiong

This framework is general enough to endow arbitrary DNNs with convergence guarantees for solving linear inverse problems.

Compressive Sensing

CPPF++: Uncertainty-Aware Sim2Real Object Pose Estimation by Vote Aggregation

2 code implementations • 24 Nov 2022 • Yang You, Wenhao He, Jin Liu, Hongkai Xiong, Weiming Wang, Cewu Lu

We introduce a novel method, CPPF++, designed for sim-to-real pose estimation.

Pose Estimation

Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance

1 code implementation • 3 Feb 2024 • Xinyu Peng, Ziyang Zheng, Wenrui Dai, Nuoqian Xiao, Chenglin Li, Junni Zou, Hongkai Xiong

In this paper, we propose the first unified interpretation for existing zero-shot methods from the perspective of approximating the conditional posterior mean for the reverse diffusion process of conditional sampling.

Frequency-Aware Transformer for Learned Image Compression

1 code implementation • 25 Oct 2023 • Han Li, Shaohui Li, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong

Learned image compression (LIC) has gained traction as an effective solution for image storage and transmission in recent years.

Image Compression

AiluRus: A Scalable ViT Framework for Dense Prediction

1 code implementation • NeurIPS 2023 • Jin Li, Yaoming Wang, Xiaopeng Zhang, Bowen Shi, Dongsheng Jiang, Chenglin Li, Wenrui Dai, Hongkai Xiong, Qi Tian

Specifically, at the intermediate layer of the ViT, we utilize a spatial-aware density-based clustering algorithm to select representative tokens from the token sequence.

object-detection Object Detection +1

Hierarchical Graph Networks for 3D Human Pose Estimation

1 code implementation • 23 Nov 2021 • Han Li, Bowen Shi, Wenrui Dai, Yabo Chen, Botao Wang, Yu Sun, Min Guo, Chenlin Li, Junni Zou, Hongkai Xiong

Recent 2D-to-3D human pose estimation works tend to utilize the graph structure formed by the topology of the human skeleton.

Ranked #42 on 3D Human Pose Estimation on MPI-INF-3DHP (AUC metric)

3D Human Pose Estimation

Light Field Reconstruction via Deep Adaptive Fusion of Hybrid Lenses

1 code implementation • 14 Feb 2021 • Jing Jin, Mantang Guo, Junhui Hou, Hui Liu, Hongkai Xiong

Besides, to promote the effectiveness of our method trained with simulated hybrid data on real hybrid data captured by a hybrid LF imaging system, we carefully design the network architecture and the training strategy.

Adapting Shortcut With Normalizing Flow: An Efficient Tuning Framework for Visual Recognition

1 code implementation • CVPR 2023 • Yaoming Wang, Bowen Shi, Xiaopeng Zhang, Jin Li, Yuchen Liu, Wenrui Dai, Chenglin Li, Hongkai Xiong, Qi Tian

To mitigate the computational and storage demands, recent research has explored Parameter-Efficient Fine-Tuning (PEFT), which focuses on tuning a minimal number of parameters for efficient adaptation.

Latency-Aware Differentiable Neural Architecture Search

1 code implementation • 17 Jan 2020 • Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Bowen Shi, Qi Tian, Hongkai Xiong

However, these methods have difficulty optimizing the network, so the searched network is often unfriendly to hardware.

Neural Architecture Search

Zigzag Learning for Weakly Supervised Object Detection

no code implementations • CVPR 2018 • Xiaopeng Zhang, Jiashi Feng, Hongkai Xiong, Qi Tian

Unlike them, we propose a zigzag learning strategy to simultaneously discover reliable object instances and prevent the model from overfitting initial seeds.

Ranked #16 on Weakly Supervised Object Detection on PASCAL VOC 2012 test

Object object-detection +1

Action Recognition with Coarse-to-Fine Deep Feature Integration and Asynchronous Fusion

no code implementations • 20 Nov 2017 • Weiyao Lin, Yang Mi, Jianxin Wu, Ke Lu, Hongkai Xiong

In this paper, we propose a novel deep-based framework for action recognition, which improves the recognition accuracy by: 1) deriving more precise features for representing actions, and 2) reducing the asynchrony between different information streams.

Action Recognition Temporal Action Localization

Ensemble of Part Detectors for Simultaneous Classification and Localization

no code implementations • 29 May 2017 • Xiaopeng Zhang, Hongkai Xiong, Weiyao Lin, Qi Tian

Part-based representation has been proven to be effective for a variety of visual applications.

Classification Clustering +4

DNQ: Dynamic Network Quantization

no code implementations • 6 Dec 2018 • Yuhui Xu, Shuai Zhang, Yingyong Qi, Jiaxian Guo, Weiyao Lin, Hongkai Xiong

Network quantization is an effective method for the deployment of neural networks on memory and energy constrained mobile devices.

Quantization

Picking Deep Filter Responses for Fine-Grained Image Recognition

no code implementations • CVPR 2016 • Xiaopeng Zhang, Hongkai Xiong, Wengang Zhou, Weiyao Lin, Qi Tian

Recognizing fine-grained sub-categories such as birds and dogs is extremely challenging due to the highly localized and subtle differences in some specific parts.

Fine-Grained Image Recognition

Group Re-Identification with Multi-grained Matching and Integration

no code implementations • 17 May 2019 • Weiyao Lin, Yuxi Li, Hao Xiao, John See, Junni Zou, Hongkai Xiong, Jingdong Wang, Tao Mei

The task of re-identifying groups of people under different camera views is an important yet less-studied problem. Group re-identification (Re-ID) is a very challenging task since it is not only adversely affected by common issues in traditional single object Re-ID problems such as viewpoint and human pose variations, but it also suffers from changes in group layout and group membership.

iPool -- Information-based Pooling in Hierarchical Graph Neural Networks

no code implementations • 1 Jul 2019 • Xing Gao, Hongkai Xiong, Pascal Frossard

In this paper, we propose a parameter-free pooling operator, called iPool, that retains the most informative features in arbitrary graphs.

Graph Classification

AETv2: AutoEncoding Transformations for Self-Supervised Representation Learning by Minimizing Geodesic Distances in Lie Groups

no code implementations • 16 Nov 2019 • Feng Lin, Haohang Xu, Houqiang Li, Hongkai Xiong, Guo-Jun Qi

For this reason, we should use the geodesic to characterize how an image transforms along the manifold of a transformation group, and adopt its length to measure the deviation between transformations.

Representation Learning Self-Supervised Learning

FLAT: Few-Shot Learning via Autoencoding Transformation Regularizers

no code implementations • 29 Dec 2019 • Haohang Xu, Hongkai Xiong, Guo-Jun Qi

To this end, we present a novel regularization mechanism by learning the change of feature representations induced by a distribution of transformations without using the labels of data examples.

Data Augmentation Few-Shot Learning +1

Human in Events: A Large-Scale Benchmark for Human-centric Video Analysis in Complex Events

no code implementations • 9 May 2020 • Weiyao Lin, Huabin Liu, Shizhan Liu, Yuxi Li, Rui Qian, Tao Wang, Ning Xu, Hongkai Xiong, Guo-Jun Qi, Nicu Sebe

To this end, we present a new large-scale dataset with comprehensive annotations, named Human-in-Events or HiEve (Human-centric video analysis in complex Events), for the understanding of human motions, poses, and actions in a variety of realistic events, especially in crowd & complex events.

Action Recognition Pose Estimation

Graph Pooling with Node Proximity for Hierarchical Representation Learning

no code implementations • 19 Jun 2020 • Xing Gao, Wenrui Dai, Chenglin Li, Hongkai Xiong, Pascal Frossard

In this paper, we propose a novel graph pooling strategy that leverages node proximity to improve the hierarchical representation learning of graph data with their multi-hop topology.

Graph Classification Representation Learning

Distilling Object Detectors with Task Adaptive Regularization

no code implementations • 23 Jun 2020 • Ruoyu Sun, Fuhui Tang, Xiaopeng Zhang, Hongkai Xiong, Qi Tian

Knowledge distillation, which aims at training a smaller student network by transferring knowledge from a larger teacher model, is one of the promising solutions for model miniaturization.

Knowledge Distillation Object +1

K-Shot Contrastive Learning of Visual Features with Multiple Instance Augmentations

no code implementations • 27 Jul 2020 • Haohang Xu, Hongkai Xiong, Guo-Jun Qi

In this paper, we propose the K-Shot Contrastive Learning (KSCL) of visual features by applying multiple augmentations to investigate the sample variations within individual instances.

Contrastive Learning

Monotonic Robust Policy Optimization with Model Discrepancy

no code implementations • 1 Jan 2021 • Yuankun Jiang, Chenglin Li, Junni Zou, Wenrui Dai, Hongkai Xiong

To mitigate the model discrepancy between training and target (testing) environments, domain randomization (DR) can generate plenty of environments with sufficient diversity by randomly sampling environment parameters in the simulator.

PAC-Bayesian Randomized Value Function with Informative Prior

no code implementations • 1 Jan 2021 • Yuankun Jiang, Chenglin Li, Junni Zou, Wenrui Dai, Hongkai Xiong

To address this, in this paper, we propose a Bayesian linear regression with informative prior (IP-BLR) operator to leverage the data-dependent prior in the learning process of randomized value function, which can leverage the statistics of training results from previous iterations.

Reinforcement Learning (RL)

VEM-GCN: Topology Optimization with Variational EM for Graph Convolutional Networks

no code implementations • 1 Jan 2021 • Rui Yang, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong

In the variational E-step, graph topology is optimized by approximating the posterior probability distribution of the latent adjacency matrix with a neural network learned from node embeddings.

Classification General Classification +2

Center-wise Local Image Mixture For Contrastive Representation Learning

no code implementations • 5 Nov 2020 • Hao Li, Xiaopeng Zhang, Hongkai Xiong

Contrastive learning based on instance discrimination trains a model to discriminate different transformations of the anchor sample from other samples, which does not consider the semantic similarity among samples.

Contrastive Learning Data Augmentation +3

Seed the Views: Hierarchical Semantic Alignment for Contrastive Representation Learning

no code implementations • 4 Dec 2020 • Haohang Xu, Xiaopeng Zhang, Hao Li, Lingxi Xie, Hongkai Xiong, Qi Tian

In this paper, we propose a hierarchical semantic alignment strategy via expanding the views generated by a single image to Cross-samples and Multi-level representation, and model the invariance to semantically similar images in a hierarchical way.

Contrastive Learning Representation Learning +2

NCGNN: Node-Level Capsule Graph Neural Network for Semisupervised Classification

no code implementations • 7 Dec 2020 • Rui Yang, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong

Therefore, it can relieve the over-smoothing issue and learn effective node representations over graphs with homophily or heterophily.

Classification Node Classification

Multi-dataset Pretraining: A Unified Model for Semantic Segmentation

no code implementations • 8 Jun 2021 • Bowen Shi, Xiaopeng Zhang, Haohang Xu, Wenrui Dai, Junni Zou, Hongkai Xiong, Qi Tian

This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets regardless of their taxonomy labels, followed by fine-tuning the pretrained model on a specific dataset as usual.

Semantic Segmentation

Message Passing in Graph Convolution Networks via Adaptive Filter Banks

no code implementations • 18 Jun 2021 • Xing Gao, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong, Pascal Frossard

Furthermore, each filter in the spectral domain corresponds to a message passing scheme, and diverse schemes are implemented via the filter bank.

Graph Classification Representation Learning

Graph Neural Networks With Lifting-based Adaptive Graph Wavelets

no code implementations • 3 Aug 2021 • Mingxing Xu, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong, Pascal Frossard

To ensure that the learned graph representations are invariant to node permutations, a layer is employed at the input of the networks to reorder the nodes according to their local topology information.

Graph Representation Learning

Learning Latent Architectural Distribution in Differentiable Neural Architecture Search via Variational Information Maximization

no code implementations • ICCV 2021 • Yaoming Wang, Yuchen Liu, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong

Existing differentiable neural architecture search approaches simply assume the architectural distribution on each edge is independent of each other, which conflicts with the intrinsic properties of architecture.

Neural Architecture Search

Variance Reduced Domain Randomization for Policy Gradient

no code implementations • 29 Sep 2021 • Yuankun Jiang, Chenglin Li, Wenrui Dai, Junni Zou, Hongkai Xiong

In this paper, we theoretically derive a bias-free and state/environment-dependent optimal baseline for DR, and analytically show its ability to achieve further variance reduction over the standard constant and state-dependent baselines for DR. We further propose a variance reduced domain randomization (VRDR) approach for policy gradient methods, to strike a tradeoff between the variance reduction and computational complexity in practice.

Policy Gradient Methods

Understanding Self-supervised Learning via Information Bottleneck Principle

no code implementations • 29 Sep 2021 • Jin Li, Yaoming Wang, Dongsheng Jiang, Xiaopeng Zhang, Wenrui Dai, Hongkai Xiong

To address this issue, we introduce the information bottleneck principle and propose the Self-supervised Variational Information Bottleneck (SVIB) learning framework.

Contrastive Learning Self-Supervised Learning

Graph Convolutional Networks via Adaptive Filter Banks

no code implementations • 29 Sep 2021 • Xing Gao, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong, Pascal Frossard

Graph convolutional networks have been a powerful tool in representation learning of networked data.

Representation Learning

All-optical graph representation learning using integrated diffractive photonic computing units

no code implementations • 23 Apr 2022 • Tao Yan, Rui Yang, Ziyang Zheng, Xing Lin, Hongkai Xiong, Qionghai Dai

Photonic neural networks perform brain-inspired computations using photons instead of electrons, which can achieve substantially improved computing performance.

Graph Representation Learning

LiftPool: Lifting-based Graph Pooling for Hierarchical Graph Representation Learning

no code implementations • 27 Apr 2022 • Mingxing Xu, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong

Subsequently, this local information is aligned and propagated to the preserved nodes to alleviate information loss in graph coarsening.

Graph Classification Graph Representation Learning

Hierarchical Spherical CNNs with Lifting-based Adaptive Wavelets for Pooling and Unpooling

no code implementations • 31 May 2022 • Mingxing Xu, Chenglin Li, Wenrui Dai, Siheng Chen, Junni Zou, Pascal Frossard, Hongkai Xiong

Specifically, adaptive spherical wavelets are learned with a lifting structure that consists of trainable lifting operators (i.e., update and predict operators).

Contrastive Regression for Domain Adaptation on Gaze Estimation

no code implementations • CVPR 2022 • Yaoming Wang, Yangzhou Jiang, Jin Li, Bingbing Ni, Wenrui Dai, Chenglin Li, Hongkai Xiong, Teng Li

Appearance-based Gaze Estimation leverages deep neural networks to regress the gaze direction from monocular images and achieves impressive performance.

Domain Generalization Gaze Estimation +1

Optimization-based Block Coordinate Gradient Coding for Mitigating Partial Stragglers in Distributed Learning

no code implementations • 6 Jun 2022 • Qi Wang, Ying Cui, Chenglin Li, Junni Zou, Hongkai Xiong

To reduce computational complexity, we first transform each to an equivalent but much simpler discrete problem with N ≪ L variables representing the partition of the L coordinates into N blocks, each with identical redundancy.

Dual Contrastive Learning for Spatio-temporal Representation

no code implementations • 12 Jul 2022 • Shuangrui Ding, Rui Qian, Hongkai Xiong

In this way, the static scene and the dynamic motion are simultaneously encoded into the compact RGB representation.

Contrastive Learning Representation Learning

Motion-inductive Self-supervised Object Discovery in Videos

no code implementations • 1 Oct 2022 • Shuangrui Ding, Weidi Xie, Yabo Chen, Rui Qian, Xiaopeng Zhang, Hongkai Xiong, Qi Tian

In this paper, we consider the task of unsupervised object discovery in videos.

Ranked #3 on Unsupervised Object Segmentation on DAVIS 2016

Object Object Discovery +5

Lightweight network towards real-time image denoising on mobile devices

no code implementations • 9 Nov 2022 • Zhuoqun Liu, Meiguang Jin, Ying Chen, Huaida Liu, Canqian Yang, Hongkai Xiong

In this paper, we identify the real bottlenecks that affect the CNN-based models' run-time performance on mobile devices: memory access cost and NPU-incompatible operations, and build the model based on these.

Image Denoising

Pose-Oriented Transformer with Uncertainty-Guided Refinement for 2D-to-3D Human Pose Estimation

no code implementations • 15 Feb 2023 • Han Li, Bowen Shi, Wenrui Dai, Hongwei Zheng, Botao Wang, Yu Sun, Min Guo, Chenlin Li, Junni Zou, Hongkai Xiong

There has been a recent surge of interest in introducing transformers to 3D human pose estimation (HPE) due to their powerful capabilities in modeling long-term dependencies.

3D Human Pose Estimation Position

Learned Lossless Compression for JPEG via Frequency-Domain Prediction

no code implementations • 5 Mar 2023 • Jixiang Luo, Shaohui Li, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong

In this paper, we propose a novel framework for learned lossless compression of JPEG images that achieves end-to-end optimized prediction of the distribution of decoded DCT coefficients.

Dynamic Scenario Representation Learning for Motion Forecasting with Heterogeneous Graph Convolutional Recurrent Networks

no code implementations • 8 Mar 2023 • Xing Gao, Xiaogang Jia, Yikang Li, Hongkai Xiong

Due to the complex and changing interactions in dynamic scenarios, motion forecasting is a challenging problem in autonomous driving.

Ranked #1 on Trajectory Prediction on Argoverse2

Motion Forecasting Representation Learning +1

Promoting Semantic Connectivity: Dual Nearest Neighbors Contrastive Learning for Unsupervised Domain Generalization

no code implementations • CVPR 2023 • Yuchen Liu, Yaoming Wang, Yabo Chen, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong

Then, we propose a novel unsupervised domain generalization approach, namely Dual Nearest Neighbors contrastive learning with strong Augmentation (DN²A).

Contrastive Learning Domain Generalization

Hybrid Distillation: Connecting Masked Autoencoders with Contrastive Learners

no code implementations • 28 Jun 2023 • Bowen Shi, Xiaopeng Zhang, Yaoming Wang, Jin Li, Wenrui Dai, Junni Zou, Hongkai Xiong, Qi Tian

In order to better obtain both discrimination and diversity, we propose a simple but effective Hybrid Distillation strategy, which utilizes both the supervised/CL teacher and the MIM teacher to jointly guide the student model.

Contrastive Learning Representation Learning

ActionPrompt: Action-Guided 3D Human Pose Estimation With Text and Pose Prompting

no code implementations • 18 Jul 2023 • Hongwei Zheng, Han Li, Bowen Shi, Wenrui Dai, Botao Wan, Yu Sun, Min Guo, Hongkai Xiong

Recent 2D-to-3D human pose estimation (HPE) utilizes temporal consistency across sequences to alleviate the depth ambiguity problem but ignores the action-related prior knowledge hidden in the pose sequence.

3D Human Pose Estimation

Towards Unsupervised Domain Generalization for Face Anti-Spoofing

no code implementations • ICCV 2023 • Yuchen Liu, Yabo Chen, Mengran Gou, Chun-Ting Huang, Yaoming Wang, Wenrui Dai, Hongkai Xiong

In this paper, we propose the first Unsupervised Domain Generalization framework for Face Anti-Spoofing, namely UDG-FAS, which could exploit large amounts of easily accessible unlabeled data to learn generalizable features for enhancing the low-data regime of FAS.

Domain Generalization Face Anti-Spoofing

Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views

no code implementations • 7 Dec 2023 • Yabo Chen, Jiemin Fang, YuYang Huang, Taoran Yi, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian

We propose a cascade generation framework constructed with two Zero-1-to-3 models, named Cascade-Zero123, to tackle this issue, which progressively extracts 3D information from the source image.

Transparent objects

Spatial-Temporal DAG Convolutional Networks for End-to-End Joint Effective Connectivity Learning and Resting-State fMRI Classification

no code implementations • 16 Dec 2023 • Rui Yang, Wenrui Dai, Huajun She, Yiping P. Du, Dapeng Wu, Hongkai Xiong

To address these issues in an end-to-end manner, we model the brain network as a directed acyclic graph (DAG) to discover direct causal connections between brain regions and propose the Spatial-Temporal DAG Convolutional Network (ST-DAGCN) to jointly infer effective connectivity and classify rs-fMRI time series by learning brain representations based on a nonlinear structural equation model.

Time Series Time Series Classification

scBiGNN: Bilevel Graph Representation Learning for Cell Type Classification from Single-cell RNA Sequencing Data

no code implementations • 16 Dec 2023 • Rui Yang, Wenrui Dai, Chenglin Li, Junni Zou, Dapeng Wu, Hongkai Xiong

A gene-level GNN is established to adaptively learn gene-gene interactions and cell representations via the self-attention mechanism, and a cell-level GNN builds on the cell-cell graph that is constructed from the cell representations generated by the gene-level GNN.

Classification Graph Representation Learning

UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding

no code implementations • 12 Jan 2024 • Bowen Shi, Peisen Zhao, Zichen Wang, Yuhang Zhang, Yaoming Wang, Jin Li, Wenrui Dai, Junni Zou, Hongkai Xiong, Qi Tian, Xiaopeng Zhang

Vision-language foundation models, represented by Contrastive language-image pre-training (CLIP), have gained increasing attention for jointly understanding both vision and textual tasks.

Panoptic Segmentation Retrieval +1

Measuring the Discrepancy between 3D Geometric Models using Directional Distance Fields

no code implementations • 18 Jan 2024 • Siyu Ren, Junhui Hou, Xiaodong Chen, Hongkai Xiong, Wenping Wang

We then transfer the discrepancy between two 3D geometric models as the discrepancy between their DDFs defined on an identical domain, naturally establishing model correspondence.

Scene Flow Estimation
