Search Results for author: Xinchao Wang

Found 159 papers, 93 papers with code

Collaboration by Competition: Self-coordinated Knowledge Amalgamation for Multi-talent Student Learning

no code implementations ECCV 2020 Sihui Luo, Wenwen Pan, Xinchao Wang, Dazhou Wang, Haihong Tang, Mingli Song

To this end, we propose a self-coordinate knowledge amalgamation network (SOKA-Net) for learning the multi-talent student model.

Hallucinating Visual Instances in Total Absentia

no code implementations ECCV 2020 Jiayan Qiu, Yiding Yang, Xinchao Wang, DaCheng Tao

This seemingly minor difference in fact makes HVITA a much more challenging task, as the restoration algorithm would have to not only infer the category of the object in total absentia, but also hallucinate an object whose appearance is consistent with the background.

Hallucination Image Inpainting +1

Few-Shot Anomaly Detection via Category-Agnostic Registration Learning

1 code implementation 13 Jun 2024 Chaoqin Huang, Haoyan Guan, Aofan Jiang, Yanfeng Wang, Michael Spratling, Xinchao Wang, Ya Zhang

Inspired by how humans detect anomalies, by comparing a query image to known normal ones, this paper proposes a novel few-shot anomaly detection (FSAD) framework.

AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

1 code implementation 11 Jun 2024 Zigeng Chen, Xinyin Ma, Gongfan Fang, Zhenxiong Tan, Xinchao Wang

To address this, we introduce AsyncDiff, a universal and plug-and-play acceleration scheme that enables model parallelism across multiple devices.


MVGamba: Unify 3D Content Generation as State Space Sequence Modeling

no code implementations 10 Jun 2024 Xuanyu Yi, Zike Wu, Qiuhong Shen, Qingshan Xu, Pan Zhou, Joo-Hwee Lim, Shuicheng Yan, Xinchao Wang, Hanwang Zhang

Recent 3D large reconstruction models (LRMs) can generate high-quality 3D content in sub-seconds by integrating multi-view diffusion models with scalable multi-view reconstructors.

3D Generation Attribute

Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching

1 code implementation 3 Jun 2024 Xinyin Ma, Gongfan Fang, Michael Bi Mi, Xinchao Wang

To address the challenge of the exponential search space in deep models for identifying layers to cache and remove, we propose a novel differentiable optimization objective.


GFlow: Recovering 4D World from Monocular Video

no code implementations 28 May 2024 Shizun Wang, Xingyi Yang, Qiuhong Shen, Zhenxiang Jiang, Xinchao Wang

To this end, we introduce GFlow, a new framework that utilizes only 2D priors (depth and optical flow) to lift a video (3D) to a 4D explicit representation, entailing a flow of Gaussian splatting through space and time.

4D reconstruction Optical Flow Estimation

MambaOut: Do We Really Need Mamba for Vision?

1 code implementation 13 May 2024 Weihao Yu, Xinchao Wang

For vision tasks, as image classification does not align with either characteristic, we hypothesize that Mamba is not necessary for this task; detection and segmentation tasks are also not autoregressive, yet they adhere to the long-sequence characteristic, so we believe it is still worthwhile to explore Mamba's potential for these tasks.

Image Classification Instance Segmentation +2

Distilled Datamodel with Reverse Gradient Matching

no code implementations CVPR 2024 Jingwen Ye, Ruonan Yu, Songhua Liu, Xinchao Wang

To investigate the impact of changes in training data on a pre-trained model, a common approach is leave-one-out retraining.

Ungeneralizable Examples

no code implementations CVPR 2024 Jingwen Ye, Xinchao Wang

The training of contemporary deep learning models heavily relies on publicly available data, posing a risk of unauthorized access to online data and raising concerns about data privacy.

MindBridge: A Cross-Subject Brain Decoding Framework

1 code implementation CVPR 2024 Shizun Wang, Songhua Liu, Zhenxiong Tan, Xinchao Wang

Currently, brain decoding is confined to a per-subject-per-model paradigm, limiting its applicability to the same individual for whom the decoding model is trained.

Brain Decoding Data Augmentation +2

Hash3D: Training-free Acceleration for 3D Generation

1 code implementation 9 Apr 2024 Xingyi Yang, Xinchao Wang

The evolution of 3D generative modeling has been notably propelled by the adoption of 2D diffusion models.

3D Generation Image to 3D +1

Unsegment Anything by Simulating Deformation

1 code implementation CVPR 2024 Jiahao Lu, Xingyi Yang, Xinchao Wang

Foundation segmentation models, while powerful, pose a significant risk: they enable users to effortlessly extract any objects from any digital content with a single click, potentially leading to copyright infringement or malicious misuse.


Relation Rectification in Diffusion Model

no code implementations CVPR 2024 Yinwei Wu, Xingyi Yang, Xinchao Wang

Despite their exceptional generative abilities, large text-to-image diffusion models, much like skilled but careless artists, often struggle with accurately depicting visual relationships between objects.


Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images

1 code implementation CVPR 2024 Chaoqin Huang, Aofan Jiang, Jinghao Feng, Ya Zhang, Xinchao Wang, Yanfeng Wang

Recent advancements in large-scale visual-language pre-trained models have led to significant progress in zero-/few-shot anomaly detection within natural image domains.

Anomaly Classification Anomaly Detection

Through the Dual-Prism: A Spectral Perspective on Graph Data Augmentation for Graph Classification

no code implementations 18 Jan 2024 Yutong Xia, Runpeng Yu, Yuxuan Liang, Xavier Bresson, Xinchao Wang, Roger Zimmermann

Graph Neural Networks (GNNs) have become the preferred tool to process graph data, with their efficacy being boosted through graph data augmentation techniques.

Data Augmentation Graph Classification

Neural Lineage

no code implementations CVPR 2024 Runpeng Yu, Xinchao Wang

Given a well-behaved neural network, is it possible to identify its parent, based on which it was tuned?

Mutual-modality Adversarial Attack with Semantic Perturbation

no code implementations 20 Dec 2023 Jingwen Ye, Ruonan Yu, Songhua Liu, Xinchao Wang

Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.

Adversarial Attack


no code implementations 14 Dec 2023 Hanyang Kong, Dongze Lian, Michael Bi Mi, Xinchao Wang

We introduce DreamDrone, an innovative method for generating unbounded flythrough scenes from textual prompts.

Perpetual View Generation Scene Generation

SlimSAM: 0.1% Data Makes Segment Anything Slim

2 code implementations 8 Dec 2023 Zigeng Chen, Gongfan Fang, Xinyin Ma, Xinchao Wang

To address this challenging trade-off, we introduce SlimSAM, a novel data-efficient SAM compression method that achieves superior performance with far less training data.

Generator Born from Classifier

no code implementations NeurIPS 2023 Runpeng Yu, Xinchao Wang

In this paper, we make a bold attempt toward an ambitious task: given a pre-trained classifier, we aim to reconstruct an image generator, without relying on any data samples.

Image Generation

DeepCache: Accelerating Diffusion Models for Free

2 code implementations CVPR 2024 Xinyin Ma, Gongfan Fang, Xinchao Wang

Diffusion models have recently gained unprecedented attention in the field of image synthesis due to their remarkable generative capabilities.

Denoising Image Generation

Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models

1 code implementation 28 Nov 2023 Zhihe Lu, Jiawang Bai, Xin Li, Zeyu Xiao, Xinchao Wang

However, performance advancements are limited when relying solely on intricate algorithmic designs for a single model, even one exhibiting strong performance, e.g., CLIP-ViT-B/16.

Prompt Engineering

GraphAdapter: Tuning Vision-Language Models With Dual Knowledge Graph

1 code implementation NeurIPS 2023 Xin Li, Dongze Lian, Zhihe Lu, Jiawang Bai, Zhibo Chen, Xinchao Wang

To mitigate that, we propose an effective adapter-style tuning strategy, dubbed GraphAdapter, which performs the textual adapter by explicitly modeling the dual-modality structure knowledge (i.e., the correlation of different semantics/classes in textual and visual modalities) with a dual knowledge graph.

Transfer Learning

Priority-Centric Human Motion Generation in Discrete Latent Space

no code implementations ICCV 2023 Hanyang Kong, Kehong Gong, Dongze Lian, Michael Bi Mi, Xinchao Wang

We also present a motion discrete diffusion model that employs an innovative noise schedule, determined by the significance of each motion token within the entire motion sequence.

SG-Former: Self-guided Transformer with Evolving Token Reallocation

1 code implementation ICCV 2023 Sucheng Ren, Xingyi Yang, Songhua Liu, Xinchao Wang

At the heart of our approach is to utilize a significance map, which is estimated through hybrid-scale self-attention and evolves itself during training, to reallocate tokens based on the significance of each region.

Diffusion Model as Representation Learner

1 code implementation ICCV 2023 Xingyi Yang, Xinchao Wang

In this paper, we conduct an in-depth investigation of the representation power of DPMs, and propose a novel knowledge transfer method that leverages the knowledge acquired by generative DPMs for recognition tasks.

Denoising Image Classification +3

Diffusion Models for Image Restoration and Enhancement -- A Comprehensive Survey

1 code implementation 18 Aug 2023 Xin Li, Yulin Ren, Xin Jin, Cuiling Lan, Xingrui Wang, Wenjun Zeng, Xinchao Wang, Zhibo Chen

Image restoration (IR) has been an indispensable and challenging task in the low-level vision field, which strives to improve the subjective quality of images distorted by various forms of degradation.

Deblurring Image Restoration +2

Auxiliary Tasks Benefit 3D Skeleton-based Human Motion Prediction

1 code implementation ICCV 2023 Chenxin Xu, Robby T. Tan, Yuhong Tan, Siheng Chen, Xinchao Wang, Yanfeng Wang

To work with auxiliary tasks, we propose a novel auxiliary-adapted transformer, which can handle incomplete, corrupted motion data and achieve coordinate recovery via capturing spatial-temporal dependencies.

Human motion prediction Human Pose Forecasting +1

MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities

1 code implementation 4 Aug 2023 Weihao Yu, Zhengyuan Yang, Linjie Li, JianFeng Wang, Kevin Lin, Zicheng Liu, Xinchao Wang, Lijuan Wang

Problems include: (1) How to systematically structure and evaluate the complicated multimodal tasks; (2) How to design evaluation metrics that work well across question and answer types; and (3) How to give model insights beyond a simple performance ranking.

Math Zero-Shot Visual Question Answering

PseudoCal: A Source-Free Approach to Unsupervised Uncertainty Calibration in Domain Adaptation

no code implementations 14 Jul 2023 Dapeng Hu, Jian Liang, Xinchao Wang, Chuan-Sheng Foo

The conventional in-domain calibration method, temperature scaling (TempScal), encounters challenges due to domain distribution shifts and the absence of labeled target domain data.

Unsupervised Domain Adaptation

Distribution Shift Inversion for Out-of-Distribution Prediction

1 code implementation CVPR 2023 Runpeng Yu, Songhua Liu, Xingyi Yang, Xinchao Wang

The machine learning community has witnessed the emergence of a myriad of Out-of-Distribution (OoD) algorithms, which address the distribution shift between the training and the testing distribution by searching for a unified predictor or invariant feature representation.

Domain Generalization

Evolving Knowledge Mining for Class Incremental Segmentation

1 code implementation 3 Jun 2023 Zhihe Lu, Shuicheng Yan, Xinchao Wang

In this paper, we for the first time investigate the efficient multi-grained knowledge reuse for CISS, and propose a novel method, Evolving kNowleDge minING (ENDING), employing a frozen backbone.

Class-Incremental Semantic Segmentation Knowledge Distillation

LLM-Pruner: On the Structural Pruning of Large Language Models

1 code implementation NeurIPS 2023 Xinyin Ma, Gongfan Fang, Xinchao Wang

With LLM being a general-purpose task solver, we explore its compression in a task-agnostic manner, which aims to preserve the multi-task solving and language generation ability of the original LLM.

Text Generation Zero-Shot Learning

Structural Pruning for Diffusion Models

1 code implementation NeurIPS 2023 Gongfan Fang, Xinyin Ma, Xinchao Wang

Generative modeling has recently undergone remarkable advancements, primarily propelled by the transformative implications of Diffusion Probabilistic Models (DPMs).

Can SAM Boost Video Super-Resolution?

no code implementations 11 May 2023 Zhihe Lu, Zeyu Xiao, Jiawang Bai, Zhiwei Xiong, Xinchao Wang

To use the SAM-based prior, we propose a simple yet effective module -- SAM-guidEd refinEment Module (SEEM), which can enhance both alignment and fusion procedures by the utilization of semantic information.

Optical Flow Estimation Video Super-Resolution

Deep Graph Reprogramming

no code implementations CVPR 2023 Yongcheng Jing, Chongbin Yuan, Li Ju, Yiding Yang, Xinchao Wang, DaCheng Tao

In this paper, we explore a novel model reusing task tailored for graph neural networks (GNNs), termed as "deep graph reprogramming".

3D Object Recognition Action Recognition +1

Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot Artistic Style Transfer

no code implementations CVPR 2023 Hao Tang, Songhua Liu, Tianwei Lin, Shaoli Huang, Fu Li, Dongliang He, Xinchao Wang

On the other hand, different from the vanilla version, we adopt a learnable scaling operation on content features before content-style feature interaction, which better preserves the original similarity between a pair of content features while ensuring the stylization quality.

Meta-Learning Style Transfer

Segment Anything in Non-Euclidean Domains: Challenges and Opportunities

no code implementations 23 Apr 2023 Yongcheng Jing, Xinchao Wang, DaCheng Tao

The recent work known as Segment Anything (SA) has made significant strides in pushing the boundaries of semantic segmentation into the era of foundation models.

Image Inpainting object-detection +2

Any-to-Any Style Transfer: Making Picasso and Da Vinci Collaborate

1 code implementation 19 Apr 2023 Songhua Liu, Jingwen Ye, Xinchao Wang

Existing approaches either apply the holistic style of the style image in a global manner, or migrate local colors and textures of the style image to the content counterparts in a pre-defined way.

Style Transfer

Anything-3D: Towards Single-view Anything Reconstruction in the Wild

1 code implementation 19 Apr 2023 Qiuhong Shen, Xingyi Yang, Xinchao Wang

3D reconstruction from a single-RGB image in unconstrained real-world scenarios presents numerous challenges due to the inherent diversity and complexity of objects and environments.

3D Reconstruction Semantic Segmentation

InceptionNeXt: When Inception Meets ConvNeXt

11 code implementations CVPR 2024 Weihao Yu, Pan Zhou, Shuicheng Yan, Xinchao Wang

Inspired by the long-range modeling ability of ViTs, large-kernel convolutions have recently been widely studied and adopted to enlarge the receptive field and improve model performance, as in the remarkable work ConvNeXt, which employs 7x7 depthwise convolution.

Image Classification Semantic Segmentation

EqMotion: Equivariant Multi-agent Motion Prediction with Invariant Interaction Reasoning

1 code implementation CVPR 2023 Chenxin Xu, Robby T. Tan, Yuhong Tan, Siheng Chen, Yu Guang Wang, Xinchao Wang, Yanfeng Wang

In motion prediction tasks, maintaining motion equivariance under Euclidean geometric transformations and invariance of agent interaction is a critical and fundamental principle.

Human Pose Forecasting motion prediction +2

Partial Network Cloning

1 code implementation CVPR 2023 Jingwen Ye, Songhua Liu, Xinchao Wang

Unlike prior methods that update all or at least part of the parameters in the target network throughout the knowledge transfer process, PNC conducts partial parametric "cloning" from a source network and then injects the cloned module to the target, without modifying its parameters.

Transfer Learning

DepGraph: Towards Any Structural Pruning

1 code implementation CVPR 2023 Gongfan Fang, Xinyin Ma, Mingli Song, Michael Bi Mi, Xinchao Wang

Structural pruning enables model acceleration by removing structurally-grouped parameters from neural networks.

Network Pruning Neural Network Compression

Dataset Distillation: A Comprehensive Review

1 code implementation 17 Jan 2023 Ruonan Yu, Songhua Liu, Xinchao Wang

Recent success of deep learning is largely attributed to the sheer amount of data used for training deep neural networks. Despite the unprecedented success, the massive data, unfortunately, significantly increases the burden on storage and transmission and further gives rise to a cumbersome model training process.

Dataset Condensation

Slimmable Dataset Condensation

no code implementations CVPR 2023 Songhua Liu, Jingwen Ye, Runpeng Yu, Xinchao Wang

In this paper, we explore the problem of slimmable dataset condensation, to extract a smaller synthetic dataset given only previous condensation results.

Dataset Condensation

Few-Shot Dataset Distillation via Translative Pre-Training

1 code implementation ICCV 2023 Songhua Liu, Xinchao Wang

We pre-train the translator on some large datasets like ImageNet so that it requires only a limited number of adaptation steps on the target dataset.

Diffusion Probabilistic Model Made Slim

no code implementations CVPR 2023 Xingyi Yang, Daquan Zhou, Jiashi Feng, Xinchao Wang

Despite the recent visually-pleasing results achieved, the massive computational cost has been a long-standing flaw for diffusion probabilistic models (DPMs), which, in turn, greatly limits their applications on resource-limited platforms.

Image Generation Unconditional Image Generation

AvatarGen: A 3D Generative Model for Animatable Human Avatars

1 code implementation 26 Nov 2022 Jianfeng Zhang, Zihang Jiang, Dingdong Yang, Hongyi Xu, Yichun Shi, Guoxian Song, Zhongcong Xu, Xinchao Wang, Jiashi Feng

Specifically, we decompose the generative 3D human synthesis into pose-guided mapping and canonical representation with predefined human pose and shape, such that the canonical representation can be explicitly driven to different poses and shapes with the guidance of a 3D parametric human model SMPL.

Task Residual for Tuning Vision-Language Models

1 code implementation CVPR 2023 Tao Yu, Zhihe Lu, Xin Jin, Zhibo Chen, Xinchao Wang

Large-scale vision-language models (VLMs) pre-trained on billion-level data have learned general visual representations and broad visual concepts.

Transfer Learning

Dataset Factorization for Condensation

1 code implementation NIPS 2022 Songhua Liu, Kai Wang, Xingyi Yang, Jingwen Ye, Xinchao Wang

In this paper, we study dataset distillation (DD) from a novel perspective and introduce a dataset factorization approach, termed HaBa, which is a plug-and-play strategy portable to any existing DD baseline.

Hallucination Informativeness

Dataset Distillation via Factorization

3 code implementations 30 Oct 2022 Songhua Liu, Kai Wang, Xingyi Yang, Jingwen Ye, Xinchao Wang

In this paper, we study dataset distillation (DD) from a novel perspective and introduce a dataset factorization approach, termed HaBa, which is a plug-and-play strategy portable to any existing DD baseline.

Hallucination Informativeness

MetaFormer Baselines for Vision

7 code implementations 24 Oct 2022 Weihao Yu, Chenyang Si, Pan Zhou, Mi Luo, Yichen Zhou, Jiashi Feng, Shuicheng Yan, Xinchao Wang

By simply applying depthwise separable convolutions as token mixer in the bottom stages and vanilla self-attention in the top stages, the resulting model CAFormer sets a new record on ImageNet-1K: it achieves an accuracy of 85.5% at 224x224 resolution, under normal supervised training without external data or distillation.

Ranked #2 on Domain Generalization on ImageNet-C (using extra training data)

Domain Generalization Image Classification

Deep Model Reassembly

1 code implementation 24 Oct 2022 Xingyi Yang, Daquan Zhou, Songhua Liu, Jingwen Ye, Xinchao Wang

Given a collection of heterogeneous models pre-trained from distinct sources and with diverse architectures, the goal of DeRy, as its name implies, is to first dissect each model into distinctive building blocks, and then selectively reassemble the derived blocks to produce customized networks under both the hardware resource and performance constraints.

Transfer Learning

Reachability-Aware Laplacian Representation in Reinforcement Learning

no code implementations 24 Oct 2022 Kaixin Wang, Kuangqi Zhou, Jiashi Feng, Bryan Hooi, Xinchao Wang

In Reinforcement Learning (RL), Laplacian Representation (LapRep) is a task-agnostic state representation that encodes the geometry of the environment.

reinforcement-learning Reinforcement Learning (RL)

Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning

1 code implementation 17 Oct 2022 Dongze Lian, Daquan Zhou, Jiashi Feng, Xinchao Wang

With the proposed SSF, our model obtains 2.46% (90.72% vs. 88.54%) and 11.48% (73.10% vs. 65.57%) performance improvement on FGVC and VTAB-1k in terms of Top-1 accuracy compared to the full fine-tuning but only fine-tuning about 0.3M parameters.

Image Classification

Training Spiking Neural Networks with Local Tandem Learning

1 code implementation 10 Oct 2022 Qu Yang, Jibin Wu, Malu Zhang, Yansong Chua, Xinchao Wang, Haizhou Li

The LTL rule follows the teacher-student learning approach by mimicking the intermediate feature representations of a pre-trained ANN.

Attention Diversification for Domain Generalization

1 code implementation 9 Oct 2022 Rang Meng, Xianfeng Li, WeiJie Chen, Shicai Yang, Jie Song, Xinchao Wang, Lei Zhang, Mingli Song, Di Xie, ShiLiang Pu

Under this guidance, a novel Attention Diversification framework is proposed, in which Intra-Model and Inter-Model Attention Diversification Regularization are collaborated to reassign appropriate attention to diverse task-related features.

Domain Generalization

Bottom-Up 2D Pose Estimation via Dual Anatomical Centers for Small-Scale Persons

no code implementations 25 Aug 2022 Yu Cheng, Yihao Ai, Bo Wang, Xinchao Wang, Robby T. Tan

In multi-person 2D pose estimation, the bottom-up methods simultaneously predict poses for all persons, and unlike the top-down methods, do not rely on human detection.

2D Pose Estimation Human Detection +1

AvatarGen: a 3D Generative Model for Animatable Human Avatars

1 code implementation 1 Aug 2022 Jianfeng Zhang, Zihang Jiang, Dingdong Yang, Hongyi Xu, Yichun Shi, Guoxian Song, Zhongcong Xu, Xinchao Wang, Jiashi Feng

Unsupervised generation of clothed virtual humans with various appearance and animatable poses is important for creating 3D human avatars and other AR/VR applications.

3D Human Reconstruction

Federated Selective Aggregation for Knowledge Amalgamation

1 code implementation 27 Jul 2022 Donglin Xie, Ruonan Yu, Gongfan Fang, Jie Song, Zunlei Feng, Xinchao Wang, Li Sun, Mingli Song

The goal of FedSA is to train a student model for a new task with the help of several decentralized teachers, whose pre-training tasks and data are different and agnostic.

Learning Graph Neural Networks for Image Style Transfer

no code implementations 24 Jul 2022 Yongcheng Jing, Yining Mao, Yiding Yang, Yibing Zhan, Mingli Song, Xinchao Wang, DaCheng Tao

To this end, we develop an elaborated GNN model with content and style local patches as the graph vertices.

Image Stylization

Learning with Recoverable Forgetting

1 code implementation 17 Jul 2022 Jingwen Ye, Yifang Fu, Jie Song, Xingyi Yang, Songhua Liu, Xin Jin, Mingli Song, Xinchao Wang

Life-long learning aims at learning a sequence of tasks without forgetting the previously acquired knowledge.

General Knowledge Transfer Learning

DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation

1 code implementation 13 Jul 2022 Songhua Liu, Jingwen Ye, Sucheng Ren, Xinchao Wang

Prior approaches, despite the promising results, have relied on either estimating dense attention to compute per-point matching, which is limited to only coarse scales due to the quadratic memory cost, or fixing the number of correspondences to achieve linear complexity, which lacks flexibility.

Face Generation Style Transfer

Factorizing Knowledge in Neural Networks

1 code implementation 4 Jul 2022 Xingyi Yang, Jingwen Ye, Xinchao Wang

The core idea of KF lies in the modularization and assemblability of knowledge: given a pretrained network model as input, KF aims to decompose it into several factor networks, each of which handles only a dedicated task and maintains task-specific knowledge factorized from the source network.

Disentanglement Transfer Learning

Slimmable Domain Adaptation

1 code implementation CVPR 2022 Rang Meng, WeiJie Chen, Shicai Yang, Jie Song, Luojun Lin, Di Xie, ShiLiang Pu, Xinchao Wang, Mingli Song, Yueting Zhuang

In this paper, we introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank, from which models of different capacities can be sampled to accommodate different accuracy-efficiency trade-offs.

Domain Generalization Unsupervised Domain Adaptation

Learning Domain Adaptive Object Detection with Probabilistic Teacher

2 code implementations 13 Jun 2022 Meilin Chen, WeiJie Chen, Shicai Yang, Jie Song, Xinchao Wang, Lei Zhang, Yunfeng Yan, Donglian Qi, Yueting Zhuang, Di Xie, ShiLiang Pu

In addition, we conduct anchor adaptation in parallel with localization adaptation, since anchor can be regarded as a learnable parameter.

Object object-detection +1

Inception Transformer

3 code implementations 25 May 2022 Chenyang Si, Weihao Yu, Pan Zhou, Yichen Zhou, Xinchao Wang, Shuicheng Yan

Recent studies show that Transformer has strong capability of building long-range dependencies, yet is incompetent in capturing high frequencies that predominantly convey local information.

Image Classification

Tyger: Task-Type-Generic Active Learning for Molecular Property Prediction

no code implementations 23 May 2022 Kuangqi Zhou, Kaixin Wang, Jiashi Feng, Jian Tang, Tingyang Xu, Xinchao Wang

However, existing best deep AL methods are mostly developed for a single type of learning task (e.g., single-label classification), and hence may not perform well in molecular property prediction that involves various task types.

Active Learning Drug Discovery +3

Prompting to Distill: Boosting Data-Free Knowledge Distillation via Reinforced Prompt

no code implementations 16 May 2022 Xinyin Ma, Xinchao Wang, Gongfan Fang, Yongliang Shen, Weiming Lu

Data-free knowledge distillation (DFKD) conducts knowledge distillation by eliminating the dependence on the original training data, and has recently achieved impressive results in accelerating pre-trained language models.

Data-free Knowledge Distillation

M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database

1 code implementation ACL 2022 Jinming Zhao, Tenggan Zhang, Jingwen Hu, Yuchen Liu, Qin Jin, Xinchao Wang, Haizhou Li

In this work, we propose a Multi-modal Multi-scene Multi-label Emotional Dialogue dataset, M3ED, which contains 990 dyadic emotional dialogues from 56 different TV series, a total of 9,082 turns and 24,449 utterances.

Cultural Vocal Bursts Intensity Prediction Emotion Recognition

Reliable Label Correction is a Good Booster When Learning with Extremely Noisy Labels

1 code implementation 30 Apr 2022 Kai Wang, Xiangyu Peng, Shuo Yang, Jianfei Yang, Zheng Zhu, Xinchao Wang, Yang You

This paradigm, however, is prone to significant degeneration under heavy label noise, as the number of clean samples is too small for conventional methods to behave well.

Learning with noisy labels

PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision

1 code implementation CVPR 2022 Kehong Gong, Bingbing Li, Jianfeng Zhang, Tao Wang, Jing Huang, Michael Bi Mi, Jiashi Feng, Xinchao Wang

Existing self-supervised 3D human pose estimation schemes have largely relied on weak supervisions like consistency loss to guide the learning, which, inevitably, leads to inferior results in real-world scenarios with unseen poses.

3D Human Pose Estimation Hallucination

Point2Seq: Detecting 3D Objects as Sequences

1 code implementation CVPR 2022 Yujing Xue, Jiageng Mao, Minzhe Niu, Hang Xu, Michael Bi Mi, Wei Zhang, Xiaogang Wang, Xinchao Wang

We further propose a lightweight scene-to-sequence decoder that can auto-regressively generate words conditioned on features from a 3D scene as well as cues from the preceding words.

3D Object Detection Decoder +2

CAFE: Learning to Condense Dataset by Aligning Features

2 code implementations CVPR 2022 Kai Wang, Bo Zhao, Xiangyu Peng, Zheng Zhu, Shuo Yang, Shuo Wang, Guan Huang, Hakan Bilen, Xinchao Wang, Yang You

Dataset condensation aims at reducing the network training effort through condensing a cumbersome training set into a compact synthetic one.

Dataset Condensation

Geometric Structure Preserving Warp for Natural Image Stitching

1 code implementation CVPR 2022 Peng Du, Jifeng Ning, Jiguang Cui, Shaoli Huang, Xinchao Wang, Jiaxin Wang

Further, an optimized GES energy term is presented to reasonably determine the weights of the sampling points on the geometric structure, and the term is added into the Global Similarity Prior (GSP) stitching model called GES-GSP to achieve a smooth transition between local alignment and geometric structure preservation.

Edge Detection Image Stitching

PONet: Robust 3D Human Pose Estimation via Learning Orientations Only

no code implementations 21 Dec 2021 Jue Wang, Shaoli Huang, Xinchao Wang, DaCheng Tao

Conventional 3D human pose estimation relies on first detecting 2D body keypoints and then solving the 2D to 3D correspondence problem. Despite the promising results, this learning paradigm is highly dependent on the quality of the 2D keypoint detector, which is inevitably fragile to occlusions and out-of-image absences. In this paper, we propose a novel Pose Orientation Net (PONet) that is able to robustly estimate 3D pose by learning orientations only, hence bypassing the error-prone keypoint detector in the absence of image evidence.

3D Human Pose Estimation

Up to 100× Faster Data-free Knowledge Distillation

2 code implementations 12 Dec 2021 Gongfan Fang, Kanya Mo, Xinchao Wang, Jie Song, Shitao Bei, Haofei Zhang, Mingli Song

At the heart of our approach is a novel strategy to reuse the shared common features in training data so as to synthesize different data instances.

Data-free Knowledge Distillation

Safe Distillation Box

1 code implementation 5 Dec 2021 Jingwen Ye, Yining Mao, Jie Song, Xinchao Wang, Cheng Jin, Mingli Song

In other words, all users may employ a model in SDB for inference, but only authorized users get access to KD from the model.

Knowledge Distillation

Shunted Self-Attention via Multi-Scale Token Aggregation

1 code implementation CVPR 2022 Sucheng Ren, Daquan Zhou, Shengfeng He, Jiashi Feng, Xinchao Wang

This novel merging scheme enables the self-attention to learn relationships between objects with different sizes and simultaneously reduces the token numbers and the computational cost.

MetaFormer Is Actually What You Need for Vision

14 code implementations CVPR 2022 Weihao Yu, Mi Luo, Pan Zhou, Chenyang Si, Yichen Zhou, Xinchao Wang, Jiashi Feng, Shuicheng Yan

Based on this observation, we hypothesize that the general architecture of the Transformers, instead of the specific token mixer module, is more essential to the model's performance.

Image Classification Object Detection +1

Meta Clustering Learning for Large-scale Unsupervised Person Re-identification

no code implementations 19 Nov 2021 Xin Jin, Tianyu He, Xu Shen, Tongliang Liu, Xinchao Wang, Jianqiang Huang, Zhibo Chen, Xian-Sheng Hua

Unsupervised Person Re-identification (U-ReID) with pseudo labeling, based on modern clustering algorithms, has recently reached performance competitive with fully-supervised ReID methods.

Clustering Unsupervised Person Re-Identification

MEmoBERT: Pre-training Model with Prompt-based Learning for Multimodal Emotion Recognition

no code implementations 27 Oct 2021 Jinming Zhao, Ruichen Li, Qin Jin, Xinchao Wang, Haizhou Li

Research on multimodal emotion recognition is hindered by the lack of labelled corpora of sufficient scale and diversity, owing to the high annotation cost and label ambiguity.

Emotion Classification Multimodal Emotion Recognition +1

Unleash the Potential of Adaptation Models via Dynamic Domain Labels

no code implementations 29 Sep 2021 Xin Jin, Tianyu He, Xu Shen, Songhua Wu, Tongliang Liu, Xinchao Wang, Jianqiang Huang, Zhibo Chen, Xian-Sheng Hua

In this paper, we propose an embarrassingly simple yet highly effective adversarial domain adaptation (ADA) method for training models for alignment.

Domain Adaptation Memorization

How Well Does Self-Supervised Pre-Training Perform with Streaming ImageNet?

no code implementations NeurIPS Workshop ImageNet_PPF 2021 Dapeng Hu, Shipeng Yan, Qizhengqiu Lu, Lanqing Hong, Hailin Hu, Yifan Zhang, Zhenguo Li, Xinchao Wang, Jiashi Feng

Prior works on self-supervised pre-training focus on the joint training scenario, where massive unlabeled data are assumed to be given as input all at once, and only then is a learner trained.

Self-Supervised Learning

Meta-Aggregator: Learning to Aggregate for 1-bit Graph Neural Networks

no code implementations ICCV 2021 Yongcheng Jing, Yiding Yang, Xinchao Wang, Mingli Song, DaCheng Tao

In this paper, we study a novel meta aggregation scheme towards binarizing graph neural networks (GNNs).

Structure-Aware Feature Generation for Zero-Shot Learning

no code implementations 16 Aug 2021 Lianbo Zhang, Shaoli Huang, Xinchao Wang, Wei Liu, DaCheng Tao

In this paper, we introduce a novel structure-aware feature generation scheme, termed SA-GAN, to explicitly account for the topological structure in learning both the latent space and the generative networks.

Attribute Generative Adversarial Network +1

Visual Boundary Knowledge Translation for Foreground Segmentation

1 code implementation 1 Aug 2021 Zunlei Feng, Lechao Cheng, Xinchao Wang, Xiang Wang, Yajie Liu, Xiangtong Du, Mingli Song

To this end, we propose a Translation Segmentation Network (Trans-Net), which comprises a segmentation network and two boundary discriminators.

Foreground Segmentation Image Segmentation +3

Edge-competing Pathological Liver Vessel Segmentation with Limited Labels

1 code implementation 1 Aug 2021 Zunlei Feng, Zhonghua Wang, Xinchao Wang, Xiuming Zhang, Lechao Cheng, Jie Lei, Yuexuan Wang, Mingli Song

The diagnosis of MVI requires discovering the vessels that contain hepatocellular carcinoma cells and counting the number of such cells in each vessel, a process that depends heavily on the doctor's experience and is largely subjective and time-consuming.

Segmentation whole slide images

Boundary Knowledge Translation based Reference Semantic Segmentation

no code implementations 1 Aug 2021 Lechao Cheng, Zunlei Feng, Xinchao Wang, Ya Jie Liu, Jie Lei, Mingli Song

In this paper, we introduce a novel Reference semantic segmentation Network (Ref-Net) to conduct visual boundary knowledge translation.

Segmentation Semantic Segmentation +1

Turning Frequency to Resolution: Video Super-Resolution via Event Cameras

no code implementations CVPR 2021 Yongcheng Jing, Yiding Yang, Xinchao Wang, Mingli Song, DaCheng Tao

To this end, we propose an Event-based VSR framework (E-VSR), of which the key component is an asynchronous interpolation (EAI) module that reconstructs a high-frequency (HF) video stream with uniform and tiny pixel displacements between neighboring frames from an event stream.

Video Super-Resolution

Tree-Like Decision Distillation

no code implementations CVPR 2021 Jie Song, Haofei Zhang, Xinchao Wang, Mengqi Xue, Ying Chen, Li Sun, DaCheng Tao, Mingli Song

Knowledge distillation pursues a diminutive yet well-behaved student network by harnessing the knowledge learned by a cumbersome teacher model.

Decision Making Knowledge Distillation

Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking

no code implementations CVPR 2021 Yiding Yang, Zhou Ren, Haoxiang Li, Chunluan Zhou, Xinchao Wang, Gang Hua

In this paper, we propose a novel online approach to learning the pose dynamics, which are independent of pose detections in the current frame, and hence may serve as a robust estimation even in challenging scenarios including occlusion.

Graph Neural Network Multi-Person Pose Estimation +2

Contrastive Model Inversion for Data-Free Knowledge Distillation

3 code implementations 18 May 2021 Gongfan Fang, Jie Song, Xinchao Wang, Chengchao Shen, Xingen Wang, Mingli Song

In this paper, we propose Contrastive Model Inversion (CMI), where the data diversity is explicitly modeled as an optimizable objective, to alleviate the mode collapse issue.

Contrastive Learning Data-free Knowledge Distillation

KDExplainer: A Task-oriented Attention Model for Explaining Knowledge Distillation

1 code implementation 10 May 2021 Mengqi Xue, Jie Song, Xinchao Wang, Ying Chen, Xingen Wang, Mingli Song

Knowledge distillation (KD) has recently emerged as an efficacious scheme for learning compact deep neural networks (DNNs).

Knowledge Distillation Multi-class Classification

How Well Does Self-Supervised Pre-Training Perform with Streaming Data?

no code implementations ICLR 2022 Dapeng Hu, Shipeng Yan, Qizhengqiu Lu, Lanqing Hong, Hailin Hu, Yifan Zhang, Zhenguo Li, Xinchao Wang, Jiashi Feng

Prior works on self-supervised pre-training focus on the joint training scenario, where massive unlabeled data are assumed to be given as input all at once, and only then is a learner trained.

Representation Learning Self-Supervised Learning

Online Multiple Object Tracking with Cross-Task Synergy

1 code implementation CVPR 2021 Song Guo, Jingya Wang, Xinchao Wang, DaCheng Tao

On the other hand, such reliable embeddings can boost identity-awareness through memory aggregation, hence strengthening attention modules and suppressing drift.

Multiple Object Tracking Object +1

Training Generative Adversarial Networks in One Stage

1 code implementation CVPR 2021 Chengchao Shen, Youtan Yin, Xinchao Wang, Xubin Li, Jie Song, Mingli Song

Based on the adversarial losses of the generator and discriminator, we categorize GANs into two classes, Symmetric GANs and Asymmetric GANs, and introduce a novel gradient decomposition method to unify the two, allowing us to train both classes in one stage and hence alleviate the training effort.

Data-free Knowledge Distillation Image Generation

SPAGAN: Shortest Path Graph Attention Network

1 code implementation 10 Jan 2021 Yiding Yang, Xinchao Wang, Mingli Song, Junsong Yuan, DaCheng Tao

SPAGAN therefore allows for a more informative and intact exploration of the graph structure and, further, a more effective aggregation of information from distant neighbors into the center node, as compared to node-based GCN methods.

Graph Attention

Stochastic Partial Swap: Enhanced Model Generalization and Interpretability for Fine-Grained Recognition

1 code implementation ICCV 2021 Shaoli Huang, Xinchao Wang, DaCheng Tao

Learning mid-level representation for fine-grained recognition is easily dominated by a limited number of highly discriminative patterns, degrading its robustness and generalization capability.

Material Recognition Scene Recognition

Self-Born Wiring for Neural Trees

no code implementations ICCV 2021 Ying Chen, Feng Mao, Jie Song, Xinchao Wang, Huiqiong Wang, Mingli Song

Neural trees aim at integrating deep neural networks and decision trees so as to bring the best of the two worlds, including representation learning from the former and faster inference from the latter.

Representation Learning

Overcoming Catastrophic Forgetting in Graph Neural Networks

1 code implementation 10 Dec 2020 Huihui Liu, Yiding Yang, Xinchao Wang

Catastrophic forgetting refers to the tendency of a neural network to "forget" previously learned knowledge upon learning new tasks.

Continual Learning

SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data

2 code implementations 9 Dec 2020 Shaoli Huang, Xinchao Wang, DaCheng Tao

As the main discriminative information of a fine-grained image usually resides in subtle regions, methods along this line are prone to heavy label noise in fine-grained recognition.

Fine-Grained Image Classification Semantic Composition +1

Progressive Network Grafting for Few-Shot Knowledge Distillation

2 code implementations 9 Dec 2020 Chengchao Shen, Xinchao Wang, Youtan Yin, Jie Song, Sihui Luo, Mingli Song

In this paper, we investigate the practical few-shot knowledge distillation scenario, where we assume only a few samples without human annotations are available for each category.

Knowledge Distillation Model Compression +1

One-sample Guided Object Representation Disassembling

no code implementations NeurIPS 2020 Zunlei Feng, Yongming He, Xinchao Wang, Xin Gao, Jie Lei, Cheng Jin, Mingli Song

In this paper, we introduce the One-sample Guided Object Representation Disassembling (One-GORD) method, which only requires one annotated sample for each object category to learn disassembled object representation from unannotated images.

Data Augmentation Image Classification +1

Learning Propagation Rules for Attribution Map Generation

no code implementations ECCV 2020 Yiding Yang, Jiayan Qiu, Mingli Song, DaCheng Tao, Xinchao Wang

Prior gradient-based attribution-map methods rely on handcrafted propagation rules for the non-linear/activation layers during the backward pass, so as to produce gradients of the input and then the attribution map.

Factorizable Graph Convolutional Networks

1 code implementation NeurIPS 2020 Yiding Yang, Zunlei Feng, Mingli Song, Xinchao Wang

In this paper, we introduce a novel graph convolutional network (GCN), termed factorizable graph convolutional network (FactorGCN), that explicitly disentangles such intertwined relations encoded in a graph.

Graph Classification Graph Regression +1

Tracking-by-Counting: Using Network Flows on Crowd Density Maps for Tracking Multiple Targets

no code implementations 18 Jul 2020 Weihong Ren, Xinchao Wang, Jiandong Tian, Yandong Tang, Antoni B. Chan

State-of-the-art multi-object tracking (MOT) methods follow the tracking-by-detection paradigm, where object trajectories are obtained by associating per-frame outputs of object detectors.

Cell Tracking Multi-Object Tracking +1

Impression Space from Deep Template Network

no code implementations 10 Jul 2020 Gongfan Fang, Xinchao Wang, Haofei Zhang, Jie Song, Mingli Song

This network is referred to as the Template Network because its filters will be used as templates to reconstruct images from the impression.

Image Generation Translation

Disassembling Object Representations without Labels

no code implementations 3 Apr 2020 Zunlei Feng, Xinchao Wang, Yongming He, Yike Yuan, Xin Gao, Mingli Song

In this paper, we study a new representation-learning task, which we term disassembling object representations.

General Classification Generative Adversarial Network +3

Learning Oracle Attention for High-fidelity Face Completion

no code implementations CVPR 2020 Tong Zhou, Changxing Ding, Shaowen Lin, Xinchao Wang, DaCheng Tao

While recent works adopted the attention mechanism to learn the contextual relations among elements of the face, they have largely overlooked the disastrous impacts of inaccurate attention scores; in addition, they fail to pay sufficient attention to key facial components, the completion results of which largely determine the authenticity of a face image.

Facial Inpainting Vocal Bursts Intensity Prediction

Distilling Knowledge from Graph Convolutional Networks

1 code implementation CVPR 2020 Yiding Yang, Jiayan Qiu, Mingli Song, DaCheng Tao, Xinchao Wang

To enable the knowledge transfer from the teacher GCN to the student, we propose a local structure preserving module that explicitly accounts for the topological semantics of the teacher.

Knowledge Distillation Transfer Learning

DEPARA: Deep Attribution Graph for Deep Knowledge Transferability

1 code implementation CVPR 2020 Jie Song, Yixin Chen, Jingwen Ye, Xinchao Wang, Chengchao Shen, Feng Mao, Mingli Song

In this paper, we propose the DEeP Attribution gRAph (DEPARA) to investigate the transferability of knowledge learned from PR-DNNs.

Model Selection Transfer Learning

Data-Free Adversarial Distillation

3 code implementations 23 Dec 2019 Gongfan Fang, Jie Song, Chengchao Shen, Xinchao Wang, Da Chen, Mingli Song

Knowledge Distillation (KD) has made remarkable progress in the last few years and has become a popular paradigm for model compression and knowledge transfer.

Knowledge Distillation Model Compression +2

Hearing Lips: Improving Lip Reading by Distilling Speech Recognizers

1 code implementation 26 Nov 2019 Ya Zhao, Rui Xu, Xinchao Wang, Peng Hou, Haihong Tang, Mingli Song

In this paper, we propose a new method, termed Lip by Speech (LIBS), whose goal is to strengthen lip reading by learning from speech recognizers.

Knowledge Distillation Lipreading +3

Dynamic Instance Normalization for Arbitrary Style Transfer

no code implementations 16 Nov 2019 Yongcheng Jing, Xiao Liu, Yukang Ding, Xinchao Wang, Errui Ding, Mingli Song, Shilei Wen

Prior normalization methods rely on affine transformations to produce arbitrary image style transfers, of which the parameters are computed in a pre-defined way.

Style Transfer

Deep Model Transferability from Attribution Maps

2 code implementations NeurIPS 2019 Jie Song, Yixin Chen, Xinchao Wang, Chengchao Shen, Mingli Song

Exploring the transferability between heterogeneous tasks sheds light on their intrinsic interconnections, and consequently enables knowledge transfer from one task to another so as to reduce the training effort of the latter.

Transfer Learning

Customizing Student Networks From Heterogeneous Teachers via Adaptive Knowledge Amalgamation

2 code implementations ICCV 2019 Chengchao Shen, Mengqi Xue, Xinchao Wang, Jie Song, Li Sun, Mingli Song

To this end, we introduce a dual-step strategy that first extracts the task-specific knowledge from the heterogeneous teachers sharing the same sub-task, and then amalgamates the extracted knowledge to build the student network.

Knowledge Amalgamation from Heterogeneous Networks by Common Feature Learning

2 code implementations 24 Jun 2019 Sihui Luo, Xinchao Wang, Gongfan Fang, Yao Hu, Dapeng Tao, Mingli Song

An increasing number of well-trained deep networks have been released online by researchers and developers, enabling the community to reuse them in a plug-and-play way without accessing the training annotations.

One-pass Multi-task Networks with Cross-task Guided Attention for Brain Tumor Segmentation

1 code implementation 5 Jun 2019 Chenhong Zhou, Changxing Ding, Xinchao Wang, Zhentai Lu, DaCheng Tao

The model cascade (MC) strategy significantly alleviates the class imbalance issue via running a set of individual deep models for coarse-to-fine segmentation.

Brain Tumor Segmentation Image Segmentation +2

Amalgamating Filtered Knowledge: Learning Task-customized Student from Multi-task Teachers

1 code implementation 28 May 2019 Jingwen Ye, Xinchao Wang, Yixin Ji, Kairi Ou, Mingli Song

Many well-trained Convolutional Neural Network (CNN) models have now been released online by developers for the sake of effortless reproduction.

Not All Parts Are Created Equal: 3D Pose Estimation by Modelling Bi-directional Dependencies of Body Parts

no code implementations 20 May 2019 Jue Wang, Shaoli Huang, Xinchao Wang, DaCheng Tao

We model parts with higher DOFs, like the elbows, as dependent components of the corresponding parts with lower DOFs, like the torso, whose 3D locations can be more reliably estimated.

3D Pose Estimation Attribute

Amalgamating Knowledge towards Comprehensive Classification

1 code implementation 7 Nov 2018 Chengchao Shen, Xinchao Wang, Jie Song, Li Sun, Mingli Song

We propose in this paper to study a new model-reusing task, which we term knowledge amalgamation.

Classification General Classification

Geometry-Aware Scene Text Detection With Instance Transformation Network

no code implementations CVPR 2018 Fangfang Wang, Liming Zhao, Xi Li, Xinchao Wang, DaCheng Tao

Localizing text in the wild is challenging when the targets have a complicated geometric layout, such as random orientation or large aspect ratio.

General Classification Multi-Task Learning +5

Dual Swap Disentangling

1 code implementation NeurIPS 2018 Zunlei Feng, Xinchao Wang, Chenglong Ke, An-Xiang Zeng, DaCheng Tao, Mingli Song

To achieve disentangling using the labeled pairs, we follow an "encoding-swap-decoding" process, where we first swap the parts of their encodings corresponding to the shared attribute and then decode the obtained hybrid codes to reconstruct the original input pairs.


Anchor-based Nearest Class Mean Loss for Convolutional Neural Networks

no code implementations 22 Apr 2018 Fusheng Hao, Jun Cheng, Lei Wang, Xinchao Wang, Jianzhong Cao, Xiping Hu, Dapeng Tao

Discriminative features are obtained by constraining the deep CNNs to map training samples as closely as possible to the corresponding anchors.

Image Classification

Horizontal Pyramid Matching for Person Re-identification

1 code implementation 14 Apr 2018 Yang Fu, Yunchao Wei, Yuqian Zhou, Honghui Shi, Gao Huang, Xinchao Wang, Zhiqiang Yao, Thomas Huang

Despite remarkable recent progress, person re-identification (Re-ID) approaches still suffer from failure cases in which the discriminative body parts are missing.

Person Re-Identification

Deep Motion Boundary Detection

no code implementations 13 Apr 2018 Xiaoqing Yin, Xiyang Dai, Xinchao Wang, Maojun Zhang, DaCheng Tao, Larry Davis

In this paper, we propose the first dedicated end-to-end deep learning approach for motion boundary detection, which we term MoBoNet.

Boundary Detection Optical Flow Estimation

Improving Object Detection from Scratch via Gated Feature Reuse

2 code implementations 4 Dec 2017 Zhiqiang Shen, Honghui Shi, Jiahui Yu, Hai Phan, Rogerio Feris, Liangliang Cao, Ding Liu, Xinchao Wang, Thomas Huang, Marios Savvides

In this paper, we present a simple and parameter-efficient drop-in module for one-stage object detectors like SSD when learning from scratch (i.e., without pre-trained models).

Object object-detection +1

Non-Markovian Globally Consistent Multi-Object Tracking

no code implementations ICCV 2017 Andrii Maksai, Xinchao Wang, Francois Fleuret, Pascal Fua

Many state-of-the-art approaches to multi-object tracking rely on detecting objects in each frame independently, grouping detections into short but reliable trajectory segments, and then further grouping them into full trajectories.

Multi-Object Tracking Object

On Compressing Deep Models by Low Rank and Sparse Decomposition

no code implementations CVPR 2017 Xiyu Yu, Tongliang Liu, Xinchao Wang, DaCheng Tao

Deep compression refers to removing the redundancy of parameters and feature maps for deep learning models.

Globally Consistent Multi-People Tracking using Motion Patterns

1 code implementation 2 Dec 2016 Andrii Maksai, Xinchao Wang, Francois Fleuret, Pascal Fua

Many state-of-the-art approaches to people tracking rely on detecting them in each frame independently, grouping detections into short but reliable trajectory segments, and then further grouping them into full trajectories.

Do We Need Binary Features for 3D Reconstruction?

no code implementations 14 Feb 2016 Bin Fan, Qingqun Kong, Wei Sui, Zhiheng Wang, Xinchao Wang, Shiming Xiang, Chunhong Pan, Pascal Fua

Binary features have become increasingly popular in the past few years due to their low memory footprints and the efficient computation of Hamming distance between binary descriptors.

3D Reconstruction

Predicting People's 3D Poses from Short Sequences

no code implementations 30 Apr 2015 Bugra Tekin, Xiaolu Sun, Xinchao Wang, Vincent Lepetit, Pascal Fua

We propose an efficient approach to exploiting motion information from consecutive frames of a video sequence to recover the 3D pose of people.

Globally Optimal Cell Tracking using Integer Programming

no code implementations 22 Jan 2015 Engin Türetken, Xinchao Wang, Carlos Becker, Carsten Haubold, Pascal Fua

We propose a novel approach to automatically tracking cell populations in time-lapse images.

Cell Tracking

Multiple human pose estimation with temporally consistent 3d pictorial structures

no code implementations 6 Sep 2014 Vasileios Belagiannis, Xinchao Wang, Bernt Schiele, Pascal Fua, Slobodan Ilic, Nassir Navab

To address these challenges, we propose a temporally consistent 3D Pictorial Structures model (3DPS) for multiple human pose estimation from multiple camera views.

3D Multi-Person Pose Estimation 3D Pose Estimation
