Search Results for author: Jiashi Feng

Found 308 papers, 126 papers with code

ConvBERT: Improving BERT with Span-based Dynamic Convolution

7 code implementations • NeurIPS 2020 • Zi-Hang Jiang, Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

The novel convolution heads, together with the remaining self-attention heads, form a new mixed attention block that is more efficient at both global and local context learning.

Natural Language Understanding

124,527

MetaFormer Is Actually What You Need for Vision

14 code implementations • CVPR 2022 • Weihao Yu, Mi Luo, Pan Zhou, Chenyang Si, Yichen Zhou, Xinchao Wang, Jiashi Feng, Shuicheng Yan

Based on this observation, we hypothesize that the general architecture of the Transformers, instead of the specific token mixer module, is more essential to the model's performance.

Ranked #9 on Semantic Segmentation on DensePASS

Image Classification Object Detection +1

124,527

Dual Path Networks

19 code implementations • NeurIPS 2017 • Yunpeng Chen, Jianan Li, Huaxin Xiao, Xiaojie Jin, Shuicheng Yan, Jiashi Feng

In this work, we present a simple, highly efficient and modularized Dual Path Network (DPN) for image classification which presents a new topology of connection paths internally.

Image Classification

29,671

VOLO: Vision Outlooker for Visual Recognition

7 code implementations • 24 Jun 2021 • Li Yuan, Qibin Hou, Zihang Jiang, Jiashi Feng, Shuicheng Yan

Though recently the prevailing vision transformers (ViTs) have shown great potential of self-attention based models in ImageNet classification, their performance is still inferior to that of the latest SOTA CNNs if no extra data are provided.

Ranked #1 on Image Classification on VizWiz-Classification

Domain Generalization Image Classification +1

29,671

MetaFormer Baselines for Vision

7 code implementations • 24 Oct 2022 • Weihao Yu, Chenyang Si, Pan Zhou, Mi Luo, Yichen Zhou, Jiashi Feng, Shuicheng Yan, Xinchao Wang

By simply applying depthwise separable convolutions as token mixer in the bottom stages and vanilla self-attention in the top stages, the resulting model CAFormer sets a new record on ImageNet-1K: it achieves an accuracy of 85.5% at 224x224 resolution, under normal supervised training without external data or distillation.

Ranked #2 on Domain Generalization on ImageNet-C (using extra training data)

Domain Generalization Image Classification

29,671

MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

2 code implementations • 27 Nov 2023 • Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Hanshu Yan, Jia-Wei Liu, Chenxu Zhang, Jiashi Feng, Mike Zheng Shou

Existing animation works typically employ the frame-warping technique to animate the reference image towards the target motion.

Image Animation

9,778

Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

3 code implementations • 19 Jan 2024 • Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao

To this end, we scale up the dataset by designing a data engine to collect and automatically annotate large-scale unlabeled data (~62M), which significantly enlarges the data coverage and thus is able to reduce the generalization error.

Ranked #2 on Monocular Depth Estimation on NYU-Depth V2 (using extra training data)

Data Augmentation Monocular Depth Estimation +1

5,555

Understanding Humans in Crowded Scenes: Deep Nested Adversarial Learning and A New Benchmark for Multi-Human Parsing

2 code implementations • 10 Apr 2018 • Jian Zhao, Jianshu Li, Yu Cheng, Li Zhou, Terence Sim, Shuicheng Yan, Jiashi Feng

Despite the noticeable progress in perceptual tasks like detection, instance segmentation and human parsing, computers still perform unsatisfactorily on visually understanding humans in crowded scenes, such as group behavior analysis, person re-identification and autonomous driving.

Ranked #1 on Multi-Human Parsing on PASCAL-Part

Autonomous Driving Clustering +6

4,966

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

13 code implementations • ICCV 2021 • Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis EH Tay, Jiashi Feng, Shuicheng Yan

To overcome such limitations, we propose a new Tokens-To-Token Vision Transformer (T2T-ViT), which incorporates 1) a layer-wise Tokens-to-Token (T2T) transformation to progressively structure the image into tokens by recursively aggregating neighboring Tokens into one Token (Tokens-to-Token), such that local structure represented by surrounding tokens can be modeled and token length can be reduced; 2) an efficient backbone with a deep-narrow structure for vision transformer motivated by CNN architecture design after empirical study.

Ranked #400 on Image Classification on ImageNet

Image Classification Language Modelling

3,137
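
The token-length reduction described in this entry can be illustrated with plain sliding-window arithmetic. The sketch below is only an illustration: the 224-pixel input and the (kernel, stride, padding) triples are assumptions, not values stated in the listing, and the function names are hypothetical.

```python
def unfold_side(side, kernel, stride, padding):
    """Number of sliding-window positions along one side of a square grid."""
    return (side + 2 * padding - kernel) // stride + 1


def token_counts(image_side, stages):
    """Token count after each aggregation stage, given (kernel, stride, padding) triples."""
    counts = []
    side = image_side
    for kernel, stride, padding in stages:
        side = unfold_side(side, kernel, stride, padding)
        counts.append(side * side)  # one token per window position
    return counts


# Assumed stage settings for illustration only; the listing does not give them.
ASSUMED_STAGES = [(7, 4, 2), (3, 2, 1), (3, 2, 1)]

print(token_counts(224, ASSUMED_STAGES))  # [3136, 784, 196]
```

Under these assumed settings, each stage merges overlapping neighborhoods into single tokens, so the sequence shrinks from 3136 to 196 tokens before the backbone.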

Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution

28 code implementations • ICCV 2019 • Yunpeng Chen, Haoqi Fan, Bing Xu, Zhicheng Yan, Yannis Kalantidis, Marcus Rohrbach, Shuicheng Yan, Jiashi Feng

Similarly, the output feature maps of a convolution layer can also be seen as a mixture of information at different frequencies.

Ranked #147 on Action Classification on Kinetics-400

Action Classification Image Classification +1

2,917

Revisiting Knowledge Distillation via Label Smoothing Regularization

2 code implementations • CVPR 2020 • Li Yuan, Francis E. H. Tay, Guilin Li, Tao Wang, Jiashi Feng

Without any extra computation cost, Tf-KD achieves up to 0.65% improvement on ImageNet over well-established baseline models, which is superior to label smoothing regularization.

Self-Knowledge Distillation

1,260

Rethinking Bottleneck Structure for Efficient Mobile Network Design

4 code implementations • ECCV 2020 • Zhou Daquan, Qibin Hou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

In this paper, we rethink the necessity of such design changes and find it may bring risks of information loss and gradient confusion.

General Classification Neural Architecture Search +2

945

Coordinate Attention for Efficient Mobile Network Design

2 code implementations • CVPR 2021 • Qibin Hou, Daquan Zhou, Jiashi Feng

Recent studies on mobile network design have demonstrated the remarkable effectiveness of channel attention (e.g., the Squeeze-and-Excitation attention) for lifting model performance, but they generally neglect the positional information, which is important for generating spatially selective attention maps.

object-detection Object Detection +1

945

Decoupling Representation and Classifier for Long-Tailed Recognition

4 code implementations • ICLR 2020 • Bingyi Kang, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, Yannis Kalantidis

The long-tail distribution of the visual world poses great challenges for deep learning based classification models on how to handle the class imbalance problem.

Ranked #3 on Long-tail learning with class descriptors on CUB-LT

Classification General Classification +3

917

PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer

1 code implementation • CVPR 2020 • Wentao Jiang, Si Liu, Chen Gao, Jie Cao, Ran He, Jiashi Feng, Shuicheng Yan

In this paper, we address the makeup transfer task, which aims to transfer the makeup from a reference image to a source image.

697

PSGAN++: Robust Detail-Preserving Makeup Transfer and Removal

1 code implementation • 26 May 2021 • Si Liu, Wentao Jiang, Chen Gao, Ran He, Jiashi Feng, Bo Li, Shuicheng Yan

In this paper, we address the makeup transfer and removal tasks simultaneously, which aim to transfer the makeup from a reference image to a source image and remove the makeup from the with-makeup image respectively.

Style Transfer

697

Multiple-Human Parsing in the Wild

2 code implementations • 19 May 2017 • Jianshu Li, Jian Zhao, Yunchao Wei, Congyan Lang, Yidong Li, Terence Sim, Shuicheng Yan, Jiashi Feng

To address the multi-human parsing problem, we introduce a new multi-human parsing (MHP) dataset and a novel multi-human parsing model named MH-Parser.

Ranked #3 on Multi-Human Parsing on MHP v1.0

Multi-Human Parsing

645

A Simple Pooling-Based Design for Real-Time Salient Object Detection

5 code implementations • CVPR 2019 • Jiang-Jiang Liu, Qibin Hou, Ming-Ming Cheng, Jiashi Feng, Jianmin Jiang

We further design a feature aggregation module (FAM) to make the coarse-level semantic information well fused with the fine-level features from the top-down pathway.

Ranked #1 on RGB Salient Object Detection on SOD

object-detection RGB Salient Object Detection +1

618

Few-shot Object Detection via Feature Reweighting

4 code implementations • ICCV 2019 • Bingyi Kang, Zhuang Liu, Xin Wang, Fisher Yu, Jiashi Feng, Trevor Darrell

The feature learner extracts meta features that are generalizable to detect novel object classes, using training data from base classes with sufficient samples.

Ranked #21 on Few-Shot Object Detection on MS-COCO (30-shot)

Few-Shot Learning Few-Shot Object Detection +3

519

BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs

1 code implementation • 17 Jul 2023 • Yang Zhao, Zhijie Lin, Daquan Zhou, Zilong Huang, Jiashi Feng, Bingyi Kang

Our experiments show that BuboGPT achieves impressive multi-modality understanding and visual grounding abilities during interaction with humans.

Instruction Following Sentence +1

468

3D Face Reconstruction from A Single Image Assisted by 2D Face Images in the Wild

2 code implementations • 22 Mar 2019 • Xiaoguang Tu, Jian Zhao, Zi-Hang Jiang, Yao Luo, Mei Xie, Yang Zhao, Linxiao He, Zheng Ma, Jiashi Feng

3D face reconstruction from a single 2D image is a challenging problem with broad applications.

Ranked #7 on Face Alignment on AFLW2000-3D

3D Face Reconstruction Face Alignment +2

463

Understanding The Robustness in Vision Transformers

2 code implementations • 26 Apr 2022 • Daquan Zhou, Zhiding Yu, Enze Xie, Chaowei Xiao, Anima Anandkumar, Jiashi Feng, Jose M. Alvarez

Our study is motivated by the intriguing properties of the emerging visual grouping in Vision Transformers, which indicates that self-attention may promote robustness through improved mid-level representations.

Ranked #4 on Domain Generalization on ImageNet-R (using extra training data)

Domain Generalization Image Classification +3

458

Deep Long-Tailed Learning: A Survey

1 code implementation • 9 Oct 2021 • Yifan Zhang, Bingyi Kang, Bryan Hooi, Shuicheng Yan, Jiashi Feng

Deep long-tailed learning, one of the most challenging problems in visual recognition, aims to train well-performing deep models from a large number of images that follow a long-tailed class distribution.

418

All Tokens Matter: Token Labeling for Training Better Vision Transformers

6 code implementations • NeurIPS 2021 • Zihang Jiang, Qibin Hou, Li Yuan, Daquan Zhou, Yujun Shi, Xiaojie Jin, Anran Wang, Jiashi Feng

In this paper, we present token labeling -- a new training objective for training high-performance vision transformers (ViTs).

Ranked #3 on Efficient ViTs on ImageNet-1K (With LV-ViT-S)

Efficient ViTs General Classification +1

417

Distilling Object Detectors with Fine-grained Feature Imitation

3 code implementations • CVPR 2019 • Tao Wang, Li Yuan, Xiaopeng Zhang, Jiashi Feng

To address the challenge of distilling knowledge in detection models, we propose a fine-grained feature imitation method exploiting the cross-location discrepancy of feature response.

Knowledge Distillation Object +2

412

Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation

2 code implementations • ICML 2020 • Jian Liang, Dapeng Hu, Jiashi Feng

Unsupervised domain adaptation (UDA) aims to leverage the knowledge learned from a labeled source dataset to solve similar tasks in a new unlabeled domain.

Ranked #1 on Source-Free Domain Adaptation on VisDA-2017

Partial Domain Adaptation Representation Learning +3

409

DINE: Domain Adaptation from Single and Multiple Black-box Predictors

3 code implementations • CVPR 2022 • Jian Liang, Dapeng Hu, Jiashi Feng, Ran He

To ease the burden of labeling, unsupervised domain adaptation (UDA) aims to transfer knowledge in previous and related labeled datasets (sources) to a new unlabeled dataset (target).

Transductive Learning Unsupervised Domain Adaptation

409

Magic-Me: Identity-Specific Video Customized Diffusion

1 code implementation • 14 Feb 2024 • Ze Ma, Daquan Zhou, Chun-Hsiao Yeh, Xue-She Wang, Xiuyu Li, Huanrui Yang, Zhen Dong, Kurt Keutzer, Jiashi Feng

To achieve this, we propose three novel components that are essential for high-quality identity preservation and stable video generation: 1) a noise initialization method with 3D Gaussian Noise Prior for better inter-frame stability; 2) an ID module based on extended Textual Inversion trained with the cropped identity to disentangle the ID information from the background; 3) Face VCD and Tiled VCD modules to reinforce faces and upscale the video to higher resolution while preserving the identity's features.

Text-to-Image Generation Video Generation

390

Strip Pooling: Rethinking Spatial Pooling for Scene Parsing

2 code implementations • CVPR 2020 • Qibin Hou, Li Zhang, Ming-Ming Cheng, Jiashi Feng

Spatial pooling has been proven highly effective in capturing long-range contextual information for pixel-wise prediction tasks, such as scene parsing.

Ranked #32 on Semantic Segmentation on Cityscapes test

Scene Parsing Semantic Segmentation

381

No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data

1 code implementation • NeurIPS 2021 • Mi Luo, Fei Chen, Dapeng Hu, Yifan Zhang, Jian Liang, Jiashi Feng

Motivated by the above findings, we propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.

Classifier calibration Federated Learning

380
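
As a rough illustration of sampling virtual representations from a class-wise Gaussian model, the sketch below draws labeled feature vectors from one Gaussian per class. It is not the authors' implementation: the diagonal covariance, the statistics in `stats`, and the function name are all assumptions made for the example.

```python
import random


def sample_virtual_representations(class_stats, per_class, seed=0):
    """Draw labeled feature vectors from one Gaussian per class.

    class_stats maps label -> (mean vector, per-dimension standard deviation).
    A diagonal covariance is assumed here purely to keep the sketch short.
    """
    rng = random.Random(seed)
    samples = []
    for label, (mean, std) in class_stats.items():
        for _ in range(per_class):
            features = [rng.gauss(m, s) for m, s in zip(mean, std)]
            samples.append((features, label))
    return samples


# Hypothetical per-class feature statistics.
stats = {0: ([0.0, 1.0], [0.1, 0.1]), 1: ([3.0, -1.0], [0.2, 0.2])}
virtual = sample_virtual_representations(stats, per_class=3)
print(len(virtual))  # 6
```

The sampled pairs would then serve as training data for re-fitting only the classifier, which is the calibration step the entry describes.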

Look Across Elapse: Disentangled Representation Learning and Photorealistic Cross-Age Face Synthesis for Age-Invariant Face Recognition

1 code implementation • 2 Sep 2018 • Jian Zhao, Yu Cheng, Yi Cheng, Yang Yang, Haochong Lan, Fang Zhao, Lin Xiong, Yan Xu, Jianshu Li, Sugiri Pranata, ShengMei Shen, Junliang Xing, Hengzhu Liu, Shuicheng Yan, Jiashi Feng

Benchmarking our model on one of the most popular unconstrained face recognition datasets IJB-C additionally verifies the promising generalizability of AIM in recognizing faces in the wild.

Ranked #1 on Age-Invariant Face Recognition on MORPH Album2

Age-Invariant Face Recognition Benchmarking +4

361

PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation

1 code implementation • CVPR 2021 • Kehong Gong, Jianfeng Zhang, Jiashi Feng

To address this problem, we present PoseAug, a new auto-augmentation framework that learns to augment the available training poses towards a greater diversity and thus improve generalization of the trained 2D-to-3D pose estimator.

Ranked #1 on Monocular 3D Human Pose Estimation on Human3.6M (Use Video Sequence metric)

Data Augmentation Monocular 3D Human Pose Estimation +1

357

Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax

2 code implementations • CVPR 2020 • Yu Li, Tao Wang, Bingyi Kang, Sheng Tang, Chunfeng Wang, Jintao Li, Jiashi Feng

Solving long-tail large vocabulary object detection with deep learning based models is a challenging and demanding task, which is however under-explored. In this work, we provide the first systematic analysis of the underperformance of state-of-the-art models when faced with long-tail distributions.

Image Classification Instance Segmentation +5

350

Graph-Based Global Reasoning Networks

9 code implementations • CVPR 2019 • Yunpeng Chen, Marcus Rohrbach, Zhicheng Yan, Shuicheng Yan, Jiashi Feng, Yannis Kalantidis

In this work, we propose a new approach for reasoning globally in which a set of features are globally aggregated over the coordinate space and then projected to an interaction space where relational reasoning can be efficiently computed.

Action Classification Action Recognition +4

334

Correlation Alignment for Unsupervised Domain Adaptation

4 code implementations • 6 Dec 2016 • Baochen Sun, Jiashi Feng, Kate Saenko

In contrast to subspace manifold methods, it aligns the original feature distributions of the source and target domains, rather than the bases of lower-dimensional subspaces.

Ranked #8 on Domain Adaptation on Office-Caltech

Unsupervised Domain Adaptation

330

Domain Adaptation with Auxiliary Target Domain-Oriented Classifier

2 code implementations • CVPR 2021 • Jian Liang, Dapeng Hu, Jiashi Feng

ATDOC alleviates the classifier bias by introducing an auxiliary classifier for target data only, to improve the quality of pseudo labels.

Domain Adaptation Transfer Learning

319

Direct Multi-view Multi-person 3D Pose Estimation

2 code implementations • NeurIPS 2021 • Tao Wang, Jianfeng Zhang, Yujun Cai, Shuicheng Yan, Jiashi Feng

Instead of estimating 3D joint locations from costly volumetric representation or reconstructing the per-person 3D pose from multiple detected 2D poses as in previous methods, MvP directly regresses the multi-person 3D poses in a clean and efficient way, without relying on intermediate tasks.

Ranked #3 on 3D Multi-Person Pose Estimation on Panoptic (using extra training data)

3D Multi-Person Pose Estimation 3D Pose Estimation

310

PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision

1 code implementation • CVPR 2022 • Kehong Gong, Bingbing Li, Jianfeng Zhang, Tao Wang, Jing Huang, Michael Bi Mi, Jiashi Feng, Xinchao Wang

Existing self-supervised 3D human pose estimation schemes have largely relied on weak supervisions like consistency loss to guide the learning, which, inevitably, leads to inferior results in real-world scenarios with unseen poses.

Ranked #37 on 3D Human Pose Estimation on MPI-INF-3DHP

3D Human Pose Estimation Hallucination

304

PANet: Few-Shot Image Semantic Segmentation with Prototype Alignment

5 code implementations • ICCV 2019 • Kaixin Wang, Jun Hao Liew, Yingtian Zou, Daquan Zhou, Jiashi Feng

In this paper, we tackle the challenging few-shot segmentation problem from a metric learning perspective and present PANet, a novel prototype alignment network to better utilize the information of the support set.

Ranked #70 on Few-Shot Semantic Segmentation on COCO-20i (5-shot)

Few-Shot Semantic Segmentation Metric Learning +2

303

AvatarGen: a 3D Generative Model for Animatable Human Avatars

1 code implementation • 1 Aug 2022 • Jianfeng Zhang, Zihang Jiang, Dingdong Yang, Hongyi Xu, Yichun Shi, Guoxian Song, Zhongcong Xu, Xinchao Wang, Jiashi Feng

Unsupervised generation of clothed virtual humans with varied appearances and animatable poses is important for creating 3D human avatars and other AR/VR applications.

3D Human Reconstruction

241

AvatarGen: A 3D Generative Model for Animatable Human Avatars

1 code implementation • 26 Nov 2022 • Jianfeng Zhang, Zihang Jiang, Dingdong Yang, Hongyi Xu, Yichun Shi, Guoxian Song, Zhongcong Xu, Xinchao Wang, Jiashi Feng

Specifically, we decompose the generative 3D human synthesis into pose-guided mapping and canonical representation with predefined human pose and shape, such that the canonical representation can be explicitly driven to different poses and shapes with the guidance of a 3D parametric human model SMPL.

241

Voxel Transformer for 3D Object Detection

1 code implementation • ICCV 2021 • Jiageng Mao, Yujing Xue, Minzhe Niu, Haoyue Bai, Jiashi Feng, Xiaodan Liang, Hang Xu, Chunjing Xu

We present Voxel Transformer (VoTr), a novel and effective voxel-based Transformer backbone for 3D object detection from point clouds.

Ranked #3 on 3D Object Detection on waymo vehicle (L1 mAP metric)

3D Object Detection Computational Efficiency +3

234

Dataset Quantization

1 code implementation • ICCV 2023 • Daquan Zhou, Kai Wang, Jianyang Gu, Xiangyu Peng, Dongze Lian, Yifan Zhang, Yang You, Jiashi Feng

Extensive experiments demonstrate that DQ is able to generate condensed small datasets for training unseen network architectures with state-of-the-art compression ratios for lossless model training.

object-detection Object Detection +2

233

Central Similarity Quantization for Efficient Image and Video Retrieval

1 code implementation • CVPR 2020 • Li Yuan, Tao Wang, Xiaopeng Zhang, Francis EH Tay, Zequn Jie, Wei Liu, Jiashi Feng

In this work, we propose a new global similarity metric, termed central similarity, with which the hash codes of similar data pairs are encouraged to approach a common center and those for dissimilar pairs to converge to different centers, to improve hash learning efficiency and retrieval accuracy.

Quantization Retrieval +1

227

PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection

1 code implementation • CVPR 2020 • Yue Liao, Si Liu, Fei Wang, Yanjie Chen, Chen Qian, Jiashi Feng

Human and object points are the center of the detection boxes, and the interaction point is the midpoint of the human and object points.

Ranked #25 on Human-Object Interaction Detection on V-COCO

Human-Object Interaction Detection Object +2

213
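
The point definitions in this entry reduce to simple coordinate arithmetic. A minimal sketch follows, assuming boxes are given as (x1, y1, x2, y2) corner tuples; the box values and function names are hypothetical.

```python
def box_center(box):
    """Center of an axis-aligned box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def interaction_point(human_box, object_box):
    """Midpoint between the human point and the object point."""
    hx, hy = box_center(human_box)
    ox, oy = box_center(object_box)
    return ((hx + ox) / 2, (hy + oy) / 2)


# Hypothetical detection boxes.
human = (10, 20, 50, 120)   # center (30.0, 70.0)
obj = (60, 80, 100, 100)    # center (80.0, 90.0)

print(interaction_point(human, obj))  # (55.0, 80.0)
```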

Dynamic Feature Fusion for Semantic Edge Detection

1 code implementation • 25 Feb 2019 • Yuan Hu, Yunpeng Chen, Xiang Li, Jiashi Feng

In this work, we propose a novel dynamic feature fusion strategy that assigns different fusion weights for different input images and locations adaptively.

Edge Detection

212

Shunted Self-Attention via Multi-Scale Token Aggregation

1 code implementation • CVPR 2022 • Sucheng Ren, Daquan Zhou, Shengfeng He, Jiashi Feng, Xinchao Wang

This novel merging scheme enables the self-attention to learn relationships between objects with different sizes and simultaneously reduces the token numbers and the computational cost.

203

Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition

4 code implementations • 23 Jun 2021 • Qibin Hou, Zihang Jiang, Li Yuan, Ming-Ming Cheng, Shuicheng Yan, Jiashi Feng

By realizing the importance of the positional information carried by 2D feature representations, unlike recent MLP-like models that encode the spatial information along the flattened spatial dimensions, Vision Permutator separately encodes the feature representations along the height and width dimensions with linear projections.

184

Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning

1 code implementation • 17 Oct 2022 • Dongze Lian, Daquan Zhou, Jiashi Feng, Xinchao Wang

With the proposed SSF, our model obtains 2.46% (90.72% vs. 88.54%) and 11.48% (73.10% vs. 65.57%) performance improvements on FGVC and VTAB-1k in terms of Top-1 accuracy compared to full fine-tuning, while fine-tuning only about 0.3M parameters.

Image Classification

151
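
The improvement figures quoted in this entry match relative gains over the baseline accuracy rather than absolute percentage-point differences. The short check below reproduces them from the accuracies in the snippet; the helper name is hypothetical.

```python
def relative_gain(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100


fgvc = relative_gain(90.72, 88.54)   # absolute gap is 2.18 points
vtab = relative_gain(73.10, 65.57)   # absolute gap is 7.53 points

print(f"FGVC: {fgvc:.2f}%  VTAB-1k: {vtab:.2f}%")  # FGVC: 2.46%  VTAB-1k: 11.48%
```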

Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search

2 code implementations • CVPR 2019 • Xin Li, Yiming Zhou, Zheng Pan, Jiashi Feng

It prunes the architecture search space with a partial order assumption to automatically search for the architectures with the best speed and accuracy trade-off.

Neural Architecture Search

149

Self-Supervised Aggregation of Diverse Experts for Test-Agnostic Long-Tailed Recognition

2 code implementations • 20 Jul 2021 • Yifan Zhang, Bryan Hooi, Lanqing Hong, Jiashi Feng

Existing long-tailed recognition methods, aiming to train class-balanced models from long-tailed data, generally assume the models would be evaluated on the uniform test class distribution.

Ranked #7 on Long-tail Learning on iNaturalist 2018

Image Classification Long-tail Learning

136

DeepViT: Towards Deeper Vision Transformer

5 code implementations • 22 Mar 2021 • Daquan Zhou, Bingyi Kang, Xiaojie Jin, Linjie Yang, Xiaochen Lian, Zihang Jiang, Qibin Hou, Jiashi Feng

In this paper, we show that, unlike convolutional neural networks (CNNs) that can be improved by stacking more convolutional layers, the performance of ViTs saturates fast when scaled to be deeper.

Ranked #423 on Image Classification on ImageNet

Image Classification Representation Learning

133

PnP-DETR: Towards Efficient Visual Analysis with Transformers

1 code implementation • ICCV 2021 • Tao Wang, Li Yuan, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

Recently, DETR pioneered the solution of vision tasks with transformers; it directly translates the image feature map into the object detection result.

object-detection Object Detection +1

129

Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition

1 code implementation • 22 Nov 2022 • Qibin Hou, Cheng-Ze Lu, Ming-Ming Cheng, Jiashi Feng

This paper does not attempt to design a state-of-the-art method for visual recognition but investigates a more efficient way to make use of convolutions to encode spatial features.

object-detection Object Detection +1

128

Single-Stage Multi-Person Pose Machines

1 code implementation • ICCV 2019 • Xuecheng Nie, Jianfeng Zhang, Shuicheng Yan, Jiashi Feng

Based on SPR, we develop the SPM model that can directly predict structured poses for multiple persons in a single stage, and thus offer a more compact pipeline and attractive efficiency advantage over two-stage methods.

Ranked #3 on Keypoint Detection on MPII Multi-Person

3D Pose Estimation Keypoint Detection +1

127

Natural Language Object Retrieval

1 code implementation • CVPR 2016 • Ronghang Hu, Huazhe Xu, Marcus Rohrbach, Jiashi Feng, Kate Saenko, Trevor Darrell

In this paper, we address the task of natural language object retrieval, to localize a target object within a given image based on a natural language query of the object.

Ranked #12 on Referring Expression Comprehension on Talk2Car

Image Captioning Image Retrieval +4

112

Tensor Robust Principal Component Analysis with A New Tensor Nuclear Norm

1 code implementation • 10 Apr 2018 • Canyi Lu, Jiashi Feng, Yudong Chen, Wei Liu, Zhouchen Lin, Shuicheng Yan

Equipped with the new tensor nuclear norm, we then solve the TRPCA problem by solving a convex program and provide the theoretical guarantee for the exact recovery.

111

DrMAD: Distilling Reverse-Mode Automatic Differentiation for Optimizing Hyperparameters of Deep Neural Networks

1 code implementation • 5 Jan 2016 • Jie Fu, Hongyin Luo, Jiashi Feng, Kian Hsiang Low, Tat-Seng Chua

The performance of deep neural networks is well-known to be sensitive to the setting of their hyperparameters.

108

Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer

2 code implementations • 14 Dec 2020 • Jian Liang, Dapeng Hu, Yunbo Wang, Ran He, Jiashi Feng

Furthermore, we propose a new labeling transfer strategy, which separates the target data into two splits based on the confidence of predictions (labeling information), and then employ semi-supervised learning to improve the accuracy of less-confident predictions in the target domain.

Classification General Classification +3

106
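
The confidence-based split described in this entry could look roughly like the sketch below. The fixed threshold and the function name are assumptions for illustration; the snippet does not say how the two splits are actually chosen.

```python
def split_by_confidence(probabilities, threshold):
    """Split samples into confident (index, pseudo-label) pairs and less-confident indices.

    `probabilities` holds one class-probability list per target sample.
    The fixed `threshold` is an assumed splitting rule, not taken from the source.
    """
    confident, less_confident = [], []
    for index, probs in enumerate(probabilities):
        confidence = max(probs)
        if confidence >= threshold:
            confident.append((index, probs.index(confidence)))
        else:
            less_confident.append(index)
    return confident, less_confident


# Hypothetical model outputs on four target samples.
outputs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.80, 0.10],
    [0.34, 0.33, 0.33],
]
labeled, unlabeled = split_by_confidence(outputs, threshold=0.7)
print(labeled)    # [(0, 0), (2, 1)]
print(unlabeled)  # [1, 3]
```

In a semi-supervised setup, the confident split would play the role of labeled data and the remaining indices the role of unlabeled data.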

Refiner: Refining Self-attention for Vision Transformers

1 code implementation • 7 Jun 2021 • Daquan Zhou, Yujun Shi, Bingyi Kang, Weihao Yu, Zihang Jiang, Yuan Li, Xiaojie Jin, Qibin Hou, Jiashi Feng

Vision Transformers (ViTs) have shown competitive accuracy in image classification tasks compared with CNNs.

Ranked #172 on Image Classification on ImageNet

Image Classification

106

Classification Calibration for Long-tail Instance Segmentation

1 code implementation • 29 Oct 2019 • Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Jun Hao Liew, Sheng Tang, Steven Hoi, Jiashi Feng

In this report, we investigate the performance drop phenomenon of state-of-the-art two-stage instance segmentation models when processing extreme long-tail training data based on the LVIS [5] dataset, and find a major cause is the inaccurate classification of object proposals.

Classification General Classification +3

100

The Devil is in Classification: A Simple Framework for Long-tail Object Detection and Instance Segmentation

1 code implementation • ECCV 2020 • Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Junhao Liew, Sheng Tang, Steven Hoi, Jiashi Feng

Specifically, we systematically investigate the performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset, and unveil that a major cause is the inaccurate classification of object proposals.

General Classification Instance Segmentation +4

100

Body Meshes as Points

1 code implementation • CVPR 2021 • Jianfeng Zhang, Dongdong Yu, Jun Hao Liew, Xuecheng Nie, Jiashi Feng

In this work, we present a single-stage model, Body Meshes as Points (BMP), to simplify the pipeline and lift both efficiency and performance.

Ranked #9 on 3D Multi-Person Pose Estimation on MuPoTS-3D

3D Human Shape Estimation 3D Multi-Person Pose Estimation +1

Generative Partition Networks for Multi-Person Pose Estimation

1 code implementation • 21 May 2017 • Xuecheng Nie, Jiashi Feng, Junliang Xing, Shuicheng Yan

This paper proposes a new Generative Partition Network (GPN) to address the challenging multi-person pose estimation problem.

Ranked #1 on Multi-Person Pose Estimation on WAF (AP metric)

Human Detection Keypoint Detection +1

Deep Joint Rain Detection and Removal from a Single Image

2 code implementations • CVPR 2017 • Wenhan Yang, Robby T. Tan, Jiashi Feng, Jiaying Liu, Zongming Guo, Shuicheng Yan

Based on the first model, we develop a multi-task deep learning architecture that learns the binary rain streak map, the appearance of rain streaks, and the clean background, which is our ultimate output.

Rain Removal

Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring

1 code implementation • CVPR 2023 • Ruyang Liu, Jingjia Huang, Ge Li, Jiashi Feng, Xinglong Wu, Thomas H. Li

In this paper, based on the CLIP model, we revisit temporal modeling in the context of image-to-video knowledge transferring, which is the key point for extending image-text pretrained models to the video domain.

Ranked #7 on Video Retrieval on MSR-VTT-1kA (using extra training data)

Representation Learning Retrieval +3

Recovering the Unbiased Scene Graphs from the Biased Ones

1 code implementation • 5 Jul 2021 • Meng-Jiun Chiou, Henghui Ding, Hanshu Yan, Changhu Wang, Roger Zimmermann, Jiashi Feng

Given input images, scene graph generation (SGG) aims to produce comprehensive, graphical representations describing visual relationships among salient objects.

Ranked #2 on Unbiased Scene Graph Generation on Visual Genome

Missing Labels Scene Graph Classification +4

ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning

1 code implementation • ICLR 2020 • Weihao Yu, Zi-Hang Jiang, Yanfei Dong, Jiashi Feng

Empirical results show that state-of-the-art models have an outstanding ability to capture biases contained in the dataset with high accuracy on the EASY set.

Ranked #1 on Logical Reasoning Question Answering on ReClor

Logical Reasoning Logical Reasoning Question Answering +2

Paper
Code

Harnessing Diffusion Models for Visual Perception with Meta Prompts

1 code implementation • 22 Dec 2023 • Qiang Wan, Zilong Huang, Bingyi Kang, Jiashi Feng, Li Zhang

Our key insight is to introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.

Ranked #2 on Semantic Segmentation on Cityscapes test (using extra training data)

Monocular Depth Estimation Pose Estimation +1

Paper
Code

Efficient Sharpness-aware Minimization for Improved Training of Neural Networks

1 code implementation • ICLR 2022 • Jiawei Du, Hanshu Yan, Jiashi Feng, Joey Tianyi Zhou, Liangli Zhen, Rick Siow Mong Goh, Vincent Y. F. Tan

Recently, the relation between the sharpness of the loss landscape and the generalization error has been established by Foret et al. (2020), in which the Sharpness Aware Minimizer (SAM) was proposed to mitigate the degradation of the generalization.

Paper
Code

Expanding Small-Scale Datasets with Guided Imagination

1 code implementation • NeurIPS 2023 • Yifan Zhang, Daquan Zhou, Bryan Hooi, Kai Wang, Jiashi Feng

Specifically, GIF conducts data imagination by optimizing the latent features of the seed data in the semantically meaningful space of the prior model, resulting in the creation of photo-realistic images with new content.

Paper
Code

Contrastive Masked Autoencoders are Stronger Vision Learners

1 code implementation • 27 Jul 2022 • Zhicheng Huang, Xiaojie Jin, Chengze Lu, Qibin Hou, Ming-Ming Cheng, Dongmei Fu, Xiaohui Shen, Jiashi Feng

The momentum encoder, fed with the full images, enhances the feature discriminability via contrastive learning with its online counterpart.

Contrastive Learning Image Classification +3

Paper
Code

ST-HOI: A Spatial-Temporal Baseline for Human-Object Interaction Detection in Videos

1 code implementation • 25 May 2021 • Meng-Jiun Chiou, Chun-Yu Liao, Li-Wei Wang, Roger Zimmermann, Jiashi Feng

Detecting human-object interactions (HOI) is an important step toward a comprehensive visual understanding of machines.

Ranked #3 on Human-Object Interaction Anticipation on VidHOI

Action Detection Human-Object Interaction Anticipation +2

Paper
Code

Query-efficient Meta Attack to Deep Neural Networks

1 code implementation • ICLR 2020 • Jiawei Du, Hu Zhang, Joey Tianyi Zhou, Yi Yang, Jiashi Feng

Black-box attack methods aim to infer suitable attack patterns to targeted DNN models by only using output feedback of the models and the corresponding input queries.

Adversarial Attack Meta-Learning

Paper
Code

The Alzheimer's Disease Prediction Of Longitudinal Evolution (TADPOLE) Challenge: Results after 1 Year Follow-up

4 code implementations • 9 Feb 2020 • Razvan V. Marinescu, Neil P. Oxtoby, Alexandra L. Young, Esther E. Bron, Arthur W. Toga, Michael W. Weiner, Frederik Barkhof, Nick C. Fox, Arman Eshaghi, Tina Toni, Marcin Salaterski, Veronika Lunina, Manon Ansart, Stanley Durrleman, Pascal Lu, Samuel Iddi, Dan Li, Wesley K. Thompson, Michael C. Donohue, Aviv Nahon, Yarden Levy, Dan Halbersberg, Mariya Cohen, Huiling Liao, Tengfei Li, Kaixian Yu, Hongtu Zhu, Jose G. Tamez-Pena, Aya Ismail, Timothy Wood, Hector Corrada Bravo, Minh Nguyen, Nanbo Sun, Jiashi Feng, B. T. Thomas Yeo, Gang Chen, Ke Qi, Shiyang Chen, Deqiang Qiu, Ionut Buciuman, Alex Kelner, Raluca Pop, Denisa Rimocea, Mostafa M. Ghazi, Mads Nielsen, Sebastien Ourselin, Lauge Sorensen, Vikram Venkatraghavan, Keli Liu, Christina Rabe, Paul Manser, Steven M. Hill, James Howlett, Zhiyue Huang, Steven Kiddle, Sach Mukherjee, Anais Rouanet, Bernd Taschler, Brian D. M. Tom, Simon R. White, Noel Faux, Suman Sedai, Javier de Velasco Oriol, Edgar E. V. Clemente, Karol Estrada, Leon Aksman, Andre Altmann, Cynthia M. Stonnington, Yalin Wang, Jianfeng Wu, Vivek Devadas, Clementine Fourrier, Lars Lau Raket, Aristeidis Sotiras, Guray Erus, Jimit Doshi, Christos Davatzikos, Jacob Vogel, Andrew Doyle, Angela Tam, Alex Diaz-Papkovich, Emmanuel Jammeh, Igor Koval, Paul Moore, Terry J. Lyons, John Gallacher, Jussi Tohka, Robert Ciszek, Bruno Jedynak, Kruti Pandya, Murat Bilgel, William Engels, Joseph Cole, Polina Golland, Stefan Klein, Daniel C. Alexander

TADPOLE's unique results suggest that current prediction algorithms provide sufficient accuracy to exploit biomarkers related to clinical diagnosis and ventricle volume, for cohort refinement in clinical trials for Alzheimer's disease.

Alzheimer's Disease Detection Disease Prediction

Paper
Code

A Balanced and Uncertainty-aware Approach for Partial Domain Adaptation

1 code implementation • ECCV 2020 • Jian Liang, Yunbo Wang, Dapeng Hu, Ran He, Jiashi Feng

On one hand, negative transfer results in misclassification of target samples to the classes only present in the source domain.

Ranked #2 on Partial Domain Adaptation on ImageNet-Caltech

Partial Domain Adaptation Unsupervised Domain Adaptation

Paper
Code

Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors

1 code implementation • 28 May 2022 • Jianfei Yang, Xiangyu Peng, Kai Wang, Zheng Zhu, Jiashi Feng, Lihua Xie, Yang You

Domain Adaptation of Black-box Predictors (DABP) aims to learn a model on an unlabeled target domain supervised by a black-box predictor trained on a source domain.

Domain Adaptation Knowledge Distillation

Paper
Code

Clover: Towards A Unified Video-Language Alignment and Fusion Model

1 code implementation • CVPR 2023 • Jingjia Huang, Yinan Li, Jiashi Feng, Xinglong Wu, Xiaoshuai Sun, Rongrong Ji

We then introduce Clover, a Correlated Video-Language pre-training method, towards a universal Video-Language model for solving multiple video understanding tasks with neither performance nor efficiency compromise.

Ranked #1 on Video Question Answering on LSMDC-FiB

Language Modelling Question Answering +10

Paper
Code

Continual Learning via Bit-Level Information Preserving

1 code implementation • CVPR 2021 • Yujun Shi, Li Yuan, Yunpeng Chen, Jiashi Feng

Continual learning tackles the setting of learning different tasks sequentially.

Continual Learning Quantization

Paper
Code

COSA: Concatenated Sample Pretrained Vision-Language Foundation Model

1 code implementation • 15 Jun 2023 • Sihan Chen, Xingjian He, Handong Li, Xiaojie Jin, Jiashi Feng, Jing Liu

Due to the limited scale and quality of video-text training corpus, most vision-language foundation models employ image-text datasets for pretraining and primarily focus on modeling visually semantic representations while disregarding temporal semantic representations and correlations.

Ranked #1 on TGIF-Frame on TGIF-QA (using extra training data)

Question Answering Retrieval +6

Paper
Code

Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning

1 code implementation • CVPR 2022 • Yujun Shi, Kuangqi Zhou, Jian Liang, Zihang Jiang, Jiashi Feng, Philip Torr, Song Bai, Vincent Y. F. Tan

Specifically, we experimentally show that directly encouraging CIL Learner at the initial phase to output similar representations as the model jointly trained on all classes can greatly boost the CIL performance.

Class Incremental Learning Incremental Learning

Paper
Code

TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision

1 code implementation • CVPR 2023 • Jiacheng Wei, Hao Wang, Jiashi Feng, Guosheng Lin, Kim-Hui Yap

We conduct extensive experiments to analyze each of our proposed components and show the efficacy of our framework in generating high-fidelity 3D textured and text-relevant shapes.

Paper
Code

Learning Detection with Diverse Proposals

1 code implementation • CVPR 2017 • Samaneh Azadi, Jiashi Feng, Trevor Darrell

To predict a set of diverse and informative proposals with enriched representations, this paper introduces a differentiable Determinantal Point Process (DPP) layer that is able to augment the object detection architectures.

Object object-detection +1

Paper
Code

A^2-Nets: Double Attention Networks

2 code implementations • NeurIPS 2018 • Yunpeng Chen, Yannis Kalantidis, Jianshu Li, Shuicheng Yan, Jiashi Feng

Learning to capture long-range relations is fundamental to image/video recognition.

Action Classification Action Recognition +2

Paper
Code

Deep Future Gaze: Gaze Anticipation on Egocentric Videos Using Adversarial Networks

1 code implementation • CVPR 2017 • Mengmi Zhang, Keng Teck Ma, Joo Hwee Lim, Qi Zhao, Jiashi Feng

Through competition with discriminator, the generator progressively improves quality of the future frames and thus anticipates future gaze better.

Gaze Prediction

Paper
Code

Improving Generalization in Reinforcement Learning with Mixture Regularization

2 code implementations • NeurIPS 2020 • Kaixin Wang, Bingyi Kang, Jie Shao, Jiashi Feng

Deep reinforcement learning (RL) agents trained in a limited set of environments tend to suffer overfitting and fail to generalize to unseen testing environments.

Data Augmentation reinforcement-learning +1

Paper
Code

Few-shot Classification via Adaptive Attention

1 code implementation • 6 Aug 2020 • Zi-Hang Jiang, Bingyi Kang, Kuangqi Zhou, Jiashi Feng

To be specific, we devise a simple and efficient meta-reweighting strategy to adapt the sample representations and generate soft attention to refine the representation such that the relevant features from the query and support samples can be extracted for a better few-shot classification.

Classification Few-Shot Learning +1

Paper
Code

Understanding and Resolving Performance Degradation in Graph Convolutional Networks

2 code implementations • 12 Jun 2020 • Kuangqi Zhou, Yanfei Dong, Kaixin Wang, Wee Sun Lee, Bryan Hooi, Huan Xu, Jiashi Feng

In this work, we study performance degradation of GCNs by experimentally examining how stacking only TRANs or PROPs works.

Paper
Code

MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration

1 code implementation • 14 Nov 2023 • Lin Xu, Zhiyuan Hu, Daquan Zhou, Hongyu Ren, Zhen Dong, Kurt Keutzer, See Kiong Ng, Jiashi Feng

Large Language Models (LLMs) have marked a significant advancement in the field of natural language processing, demonstrating exceptional capabilities in reasoning, tool usage, and memory.

Benchmarking Language Modelling +1

Paper
Code

STC: A Simple to Complex Framework for Weakly-supervised Semantic Segmentation

1 code implementation • 10 Sep 2015 • Yunchao Wei, Xiaodan Liang, Yunpeng Chen, Xiaohui Shen, Ming-Ming Cheng, Jiashi Feng, Yao Zhao, Shuicheng Yan

Then, a better network called Enhanced-DCNN is learned with supervision from the predicted segmentation masks of simple images based on the Initial-DCNN as well as the image-level annotations.

object-detection RGB Salient Object Detection +4

Paper
Code

Sharpness-Aware Training for Free

1 code implementation • 27 May 2022 • Jiawei Du, Daquan Zhou, Jiashi Feng, Vincent Y. F. Tan, Joey Tianyi Zhou

Intuitively, SAF achieves this by avoiding sudden drops in the loss in the sharp local minima throughout the trajectory of the updates of the weights.

Paper
Code

MagicMix: Semantic Mixing with Diffusion Models

2 code implementations • 28 Oct 2022 • Jun Hao Liew, Hanshu Yan, Daquan Zhou, Jiashi Feng

Unlike style transfer, where an image is stylized according to the reference style without changing the image content, semantic blending mixes two different concepts in a semantic manner to synthesize a novel concept while preserving the spatial layout and geometry.

Denoising Style Transfer

Paper
Code

Return of Frustratingly Easy Domain Adaptation

1 code implementation • 17 Nov 2015 • Baochen Sun, Jiashi Feng, Kate Saenko

Unlike human learning, machine learning often fails to handle changes between training (source) and test (target) input distributions.

Ranked #4 on Domain Adaptation on Synth Digits-to-SVHN

BIG-bench Machine Learning Unsupervised Domain Adaptation

Paper
Code

Exact Low Tubal Rank Tensor Recovery from Gaussian Measurements

1 code implementation • 7 Jun 2018 • Canyi Lu, Jiashi Feng, Zhouchen Lin, Shuicheng Yan

Specifically, we show that by solving a TNN minimization problem, the underlying tensor of size $n_1\times n_2\times n_3$ with tubal rank $r$ can be exactly recovered when the given number of Gaussian measurements is $O(r(n_1+n_2-r)n_3)$.

Paper
Code

Video Recognition in Portrait Mode

1 code implementation • 21 Dec 2023 • Mingfei Han, Linjie Yang, Xiaojie Jin, Jiashi Feng, Xiaojun Chang, Heng Wang

While existing datasets mainly comprise landscape mode videos, our paper seeks to introduce portrait mode videos to the research community and highlight the unique challenges associated with this video format.

Data Augmentation Video Recognition

Paper
Code

Generalizing Few-Shot NAS with Gradient Matching

1 code implementation • ICLR 2022 • Shoukang Hu, Ruochen Wang, Lanqing Hong, Zhenguo Li, Cho-Jui Hsieh, Jiashi Feng

Efficient performance estimation of architectures drawn from large search spaces is essential to Neural Architecture Search.

Neural Architecture Search

Paper
Code

AggMask: Exploring locally aggregated learning of mask representations for instance segmentation

1 code implementation • 1 Jan 2021 • Tao Wang, Jun Hao Liew, Yu Li, Yunpeng Chen, Jiashi Feng

Recently proposed one-stage instance segmentation models (e.g., SOLO) learn to directly predict location-specific object mask with fully-convolutional networks.

Instance Segmentation Segmentation +1

Paper
Code

CIFS: Improving Adversarial Robustness of CNNs via Channel-wise Importance-based Feature Selection

2 code implementations • 10 Feb 2021 • Hanshu Yan, Jingfeng Zhang, Gang Niu, Jiashi Feng, Vincent Y. F. Tan, Masashi Sugiyama

By comparing non-robust (normally trained) and robustified (adversarially trained) models, we observe that adversarial training (AT) robustifies CNNs by aligning the channel-wise activations of adversarial data with those of their natural counterparts.

Adversarial Robustness feature selection

Paper
Code

Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning

1 code implementation • NeurIPS 2021 • Yifan Zhang, Bryan Hooi, Dapeng Hu, Jian Liang, Jiashi Feng

In this paper, we investigate whether applying contrastive learning to fine-tuning would bring further benefits, and analytically find that optimizing the contrastive loss benefits both discriminative representation learning and model optimization during fine-tuning.

Contrastive Learning Image Classification +4

Paper
Code

SODAR: Segmenting Objects by Dynamically Aggregating Neighboring Mask Representations

1 code implementation • 15 Feb 2022 • Tao Wang, Jun Hao Liew, Yu Li, Yunpeng Chen, Jiashi Feng

Unlike the original per grid cell object masks, SODAR is implicitly supervised to learn mask representations that encode geometric structure of nearby objects and complement adjacent representations with context.

Instance Segmentation Object +1

Paper
Code

Hierarchical Neural Architecture Search via Operator Clustering

1 code implementation • 26 Sep 2019 • Guilin Li, Xing Zhang, Zitong Wang, Matthias Tan, Jiashi Feng, Zhenguo Li, Tong Zhang

Recently, the efficiency of automatic neural architecture design has been significantly improved by gradient-based search methods such as DARTS.

Clustering Neural Architecture Search

Paper
Code

LV-BERT: Exploiting Layer Variety for BERT

1 code implementation • Findings (ACL) 2021 • Weihao Yu, Zihang Jiang, Fei Chen, Qibin Hou, Jiashi Feng

In this paper, beyond this stereotyped layer pattern, we aim to improve pre-trained models by exploiting layer variety from two aspects: the layer type set and the layer order.

Paper
Code

Visual Relationship Detection with Visual-Linguistic Knowledge from Multimodal Representations

1 code implementation • 10 Sep 2020 • Meng-Jiun Chiou, Roger Zimmermann, Jiashi Feng

Visual relationship detection aims to reason over relationships among salient objects in images, which has drawn increasing attention over the past few years.

Object object-detection +4

Paper
Code

Towards Better Laplacian Representation in Reinforcement Learning with Generalized Graph Drawing

1 code implementation • 12 Jul 2021 • Kaixin Wang, Kuangqi Zhou, Qixin Zhang, Jie Shao, Bryan Hooi, Jiashi Feng

It enables learning high-quality Laplacian representations that faithfully approximate the ground truth.

Continuous Control reinforcement-learning +1

Paper
Code

A Simple Loss Function for Improving the Convergence and Accuracy of Visual Question Answering Models

3 code implementations • 2 Aug 2017 • Ilija Ilievski, Jiashi Feng

On the other hand, very little focus has been put on the models' loss function, arguably one of the most important aspects of training deep learning models.

Question Answering Visual Question Answering

Paper
Code

ManiCLIP: Multi-Attribute Face Manipulation from Text

1 code implementation • 2 Oct 2022 • Hao Wang, Guosheng Lin, Ana García del Molino, Anran Wang, Jiashi Feng, Zhiqi Shen

In this paper we present a novel multi-attribute face manipulation method based on textual descriptions.

Attribute Text-based Image Editing

Paper
Code

PVRED: A Position-Velocity Recurrent Encoder-Decoder for Human Motion Prediction

1 code implementation • 15 Jun 2019 • Hongsong Wang, Jian Dong, Bin Cheng, Jiashi Feng

We therefore propose a novel Position-Velocity Recurrent Encoder-Decoder (PVRED) for human motion prediction, which makes full use of pose velocities and temporal positional information.

Human motion prediction motion prediction +1

Paper
Code

Adversarial Complementary Learning for Weakly Supervised Object Localization

2 code implementations • CVPR 2018 • Xiaolin Zhang, Yunchao Wei, Jiashi Feng, Yi Yang, Thomas Huang

With such an adversarial learning, the two parallel-classifiers are forced to leverage complementary object regions for classification and can finally generate integral object localization together.

Ranked #2 on Weakly-Supervised Object Localization on ILSVRC 2016

General Classification Object +1

Paper
Code

On Robustness of Neural Ordinary Differential Equations

2 code implementations • ICLR 2020 • Hanshu Yan, Jiawei Du, Vincent Y. F. Tan, Jiashi Feng

We then provide an insightful understanding of this phenomenon by exploiting a certain desirable property of the flow of a continuous-time ODE, namely that integral curves are non-intersecting.

Adversarial Attack

Paper
Code

Global Knowledge Calibration for Fast Open-Vocabulary Segmentation

1 code implementation • ICCV 2023 • Kunyang Han, Yong Liu, Jun Hao Liew, Henghui Ding, Yunchao Wei, Jiajun Liu, Yitong Wang, Yansong Tang, Yujiu Yang, Jiashi Feng, Yao Zhao

Recent advancements in pre-trained vision-language models, such as CLIP, have enabled the segmentation of arbitrary concepts solely from textual inputs, a process commonly referred to as open-vocabulary semantic segmentation (OVS).

Knowledge Distillation Open Vocabulary Semantic Segmentation +4

Paper
Code

A Good Practice Towards Top Performance of Face Recognition: Transferred Deep Feature Fusion

1 code implementation • 3 Apr 2017 • Lin Xiong, Jayashree Karlekar, Jian Zhao, Yi Cheng, Yan Xu, Jiashi Feng, Sugiri Pranata, ShengMei Shen

In this paper, we propose a unified learning framework named Transferred Deep Feature Fusion (TDFF) targeting at the new IARPA Janus Benchmark A (IJB-A) face recognition dataset released by NIST face challenge.

Face Recognition Transfer Learning

Paper
Code

Egocentric Spatial Memory

1 code implementation • 31 Jul 2018 • Mengmi Zhang, Keng Teck Ma, Shih-Cheng Yen, Joo Hwee Lim, Qi Zhao, Jiashi Feng

Egocentric spatial memory (ESM) defines a memory system with encoding, storing, recognizing and recalling the spatial information about the environment from an egocentric perspective.

Feature Engineering

Paper
Code

Variational Prototype Replays for Continual Learning

1 code implementation • 23 May 2019 • Mengmi Zhang, Tao Wang, Joo Hwee Lim, Gabriel Kreiman, Jiashi Feng

In each classification task, our method learns a set of variational prototypes with their means and variances, where embedding of the samples from the same class can be represented in a prototypical distribution and class-representative prototypes are separated apart.

Continual Learning General Classification +2

Paper
Code

Learning Generalizable and Identity-Discriminative Representations for Face Anti-Spoofing

1 code implementation • 17 Jan 2019 • Xiaoguang Tu, Jian Zhao, Mei Xie, Guodong Du, Hengsheng Zhang, Jianshu Li, Zheng Ma, Jiashi Feng

Face anti-spoofing (a.k.a. presentation attack detection) has drawn growing attention due to the high-security demand in face authentication systems.

Ranked #2 on Face Anti-Spoofing on MSU-MFSD

Domain Adaptation Face Anti-Spoofing +1

Paper
Code

Adaptive ROI Generation for Video Object Segmentation Using Reinforcement Learning

1 code implementation • 27 Sep 2019 • Mingjie Sun, Jimin Xiao, Eng Gee Lim, Yanchu Xie, Jiashi Feng

In this paper, we aim to tackle the task of semi-supervised video object segmentation across a sequence of frames where only the ground-truth segmentation of the first frame is provided.

reinforcement-learning Reinforcement Learning (RL) +4

Paper
Code

AutoSpace: Neural Architecture Search with Less Human Interference

1 code implementation • ICCV 2021 • Daquan Zhou, Xiaojie Jin, Xiaochen Lian, Linjie Yang, Yujing Xue, Qibin Hou, Jiashi Feng

Current neural architecture search (NAS) algorithms still require expert knowledge and effort to design a search space for network construction.

Neural Architecture Search

Paper
Code

Deep Learning with S-shaped Rectified Linear Activation Units

1 code implementation • 22 Dec 2015 • Xiaojie Jin, Chunyan Xu, Jiashi Feng, Yunchao Wei, Junjun Xiong, Shuicheng Yan

Rectified linear activation units are important components for state-of-the-art deep convolutional networks.

Paper
Code

Video-based Person Re-identification with Accumulative Motion Context

1 code implementation • 13 Jun 2017 • Hao Liu, Zequn Jie, Karlekar Jayashree, Meibin Qi, Jianguo Jiang, Shuicheng Yan, Jiashi Feng

Video based person re-identification plays a central role in realistic security and video surveillance.

Video-Based Person Re-Identification

Paper
Code

Residual Distillation: Towards Portable Deep Neural Networks without Shortcuts

1 code implementation • NeurIPS 2020 • Guilin Li, Junlei Zhang, Yunhe Wang, Chuanjian Liu, Matthias Tan, Yunfeng Lin, Wei Zhang, Jiashi Feng, Tong Zhang

In particular, we propose a novel joint-training framework to train plain CNN by leveraging the gradients of the ResNet counterpart.

Paper
Code

Towards Understanding Why Lookahead Generalizes Better Than SGD and Beyond

1 code implementation • NeurIPS 2021 • Pan Zhou, Hanshu Yan, Xiaotong Yuan, Jiashi Feng, Shuicheng Yan

Specifically, we prove that lookahead using SGD as its inner-loop optimizer can better balance the optimization error and generalization error to achieve smaller excess risk error than vanilla SGD on (strongly) convex problems and nonconvex problems with Polyak-Łojasiewicz condition which has been observed/proved in neural networks.

Paper
Code

RAIN: A Simple Approach for Robust and Accurate Image Classification Networks

1 code implementation • 24 Apr 2020 • Jiawei Du, Hanshu Yan, Vincent Y. F. Tan, Joey Tianyi Zhou, Rick Siow Mong Goh, Jiashi Feng

However, similar to existing preprocessing-based methods, the randomized process will degrade the prediction accuracy.

Adversarial Defense General Classification +2

Paper
Code

Class Prototype-based Cleaner for Label Noise Learning

1 code implementation • 21 Dec 2022 • Jingjia Huang, Yuanqi Chen, Jiashi Feng, Xinglong Wu

Semi-supervised learning based methods are current SOTA solutions to the noisy-label learning problem, which rely on learning an unsupervised label cleaner first to divide the training samples into a labeled set for clean data and an unlabeled set for noise data.

Ranked #3 on Image Classification on Clothing1M (using extra training data)

Image Classification

Paper
Code

Understanding Generalization and Optimization Performance of Deep CNNs

no code implementations • ICML 2018 • Pan Zhou, Jiashi Feng

Besides, we prove that for an arbitrary gradient descent algorithm, the computed approximate stationary point by minimizing empirical risk is also an approximate stationary point to the population risk.

Paper
Add Code

Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach

no code implementations • CVPR 2017 • Yunchao Wei, Jiashi Feng, Xiaodan Liang, Ming-Ming Cheng, Yao Zhao, Shuicheng Yan

We investigate a principled way to progressively mine discriminative object regions using classification networks to address the weakly-supervised semantic segmentation problems.

Classification General Classification +4

Paper
Add Code

Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi- Supervised Semantic Segmentation

no code implementations • CVPR 2018 • Yunchao Wei, Huaxin Xiao, Honghui Shi, Zequn Jie, Jiashi Feng, Thomas S. Huang

It can produce dense and reliable object localization maps and effectively benefit both weakly- and semi- supervised semantic segmentation.

Object Object Localization +3

Paper
Add Code

Tensor Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Tensors via Convex Optimization

no code implementations • CVPR 2016 • Canyi Lu, Jiashi Feng, Yudong Chen, Wei Liu, Zhouchen Lin, Shuicheng Yan

In this work, we prove that under certain suitable assumptions, we can recover both the low-rank and the sparse components exactly by simply solving a convex program whose objective is a weighted combination of the tensor nuclear norm and the $\ell_1$-norm, i.e., $\min_{{\mathcal{L}},\ {\mathcal{E}}} \ \|{{\mathcal{L}}}\|_*+\lambda\|{{\mathcal{E}}}\|_1, \ \text{s.t.}

Image Denoising

Paper
Add Code

Subspace Clustering by Block Diagonal Representation

no code implementations • 23 May 2018 • Canyi Lu, Jiashi Feng, Zhouchen Lin, Tao Mei, Shuicheng Yan

Second, we observe that many existing methods approximate the block diagonal representation matrix by using different structure priors, e.g., sparsity and low-rankness, which are indirect.

Clustering

Paper
Add Code

WSNet: Compact and Efficient Networks Through Weight Sampling

no code implementations • ICML 2018 • Xiaojie Jin, Yingzhen Yang, Ning Xu, Jianchao Yang, Nebojsa Jojic, Jiashi Feng, Shuicheng Yan

We present a new approach and a novel architecture, termed WSNet, for learning compact and efficient deep neural networks.

Audio Classification General Classification +1

Paper
Add Code

Learning Markov Clustering Networks for Scene Text Detection

no code implementations • CVPR 2018 • Zichuan Liu, Guosheng Lin, Sheng Yang, Jiashi Feng, Weisi Lin, Wang Ling Goh

MCN predicts instance-level bounding boxes by firstly converting an image into a Stochastic Flow Graph (SFG) and then performing Markov Clustering on this graph.

Clustering Scene Text Detection +1

Paper
Add Code

Learning Pixel-wise Labeling from the Internet without Human Interaction

no code implementations • 19 May 2018 • Yun Liu, Yujun Shi, Jia-Wang Bian, Le Zhang, Ming-Ming Cheng, Jiashi Feng

Collecting sufficient annotated data is very expensive in many applications, especially for pixel-level prediction tasks such as semantic segmentation.

Segmentation Semantic Segmentation

Paper
Add Code

Transferable Semi-supervised Semantic Segmentation

no code implementations • 18 Nov 2017 • Huaxin Xiao, Yunchao Wei, Yu Liu, Maojun Zhang, Jiashi Feng

The performance of deep learning based semantic segmentation models heavily depends on sufficient data with careful annotations.

Segmentation Semi-Supervised Semantic Segmentation

Paper
Add Code

Zigzag Learning for Weakly Supervised Object Detection

no code implementations • CVPR 2018 • Xiaopeng Zhang, Jiashi Feng, Hongkai Xiong, Qi Tian

Unlike them, we propose a zigzag learning strategy to simultaneously discover reliable object instances and prevent the model from overfitting initial seeds.

Ranked #16 on Weakly Supervised Object Detection on PASCAL VOC 2012 test

Object object-detection +1

Paper
Add Code

Left-Right Comparative Recurrent Model for Stereo Matching

no code implementations • CVPR 2018 • Zequn Jie, Pengfei Wang, Yonggen Ling, Bo Zhao, Yunchao Wei, Jiashi Feng, Wei Liu

Left-right consistency check is an effective way to enhance the disparity estimation by referring to the information from the opposite view.

Disparity Estimation Stereo Disparity Estimation +2

Paper
Add Code

Multi-View Image Generation from a Single-View

no code implementations • 17 Apr 2017 • Bo Zhao, Xiao Wu, Zhi-Qi Cheng, Hao Liu, Zequn Jie, Jiashi Feng

This paper addresses a challenging problem -- how to generate multi-view cloth images from only a single view input.

Image Generation Variational Inference

Paper
Add Code

Stochastic Primal-Dual Proximal ExtraGradient Descent for Compositely Regularized Optimization

no code implementations • 20 Aug 2017 • Tianyi Lin, Linbo Qiao, Teng Zhang, Jiashi Feng, Bofeng Zhang

This optimization model abstracts a number of important applications in artificial intelligence and machine learning, such as fused Lasso, fused logistic regression, and a class of graph-guided regularized minimization.

regression

Paper
Add Code

Cross-domain Human Parsing via Adversarial Feature and Label Adaptation

no code implementations • 4 Jan 2018 • Si Liu, Yao Sun, Defa Zhu, Guanghui Ren, Yu Chen, Jiashi Feng, Jizhong Han

Our proposed model explicitly learns a feature compensation network, which is specialized for mitigating the cross-domain differences.

Human Parsing

Paper
Add Code

Weaving Multi-scale Context for Single Shot Detector

no code implementations • 8 Dec 2017 • Yunpeng Chen, Jianshu Li, Bin Zhou, Jiashi Feng, Shuicheng Yan

For 320x320 input of batch size = 8, WeaveNet reaches 79.5% mAP on PASCAL VOC 2007 test in 101 fps with only 4 fps extra cost, and further improves to 79.7% mAP with more iterations.

object-detection Object Detection

Paper
Add Code

Nonconvex Sparse Spectral Clustering by Alternating Direction Method of Multipliers and Its Convergence Analysis

no code implementations • 8 Dec 2017 • Canyi Lu, Jiashi Feng, Zhouchen Lin, Shuicheng Yan

Experimental analysis on several real data sets verifies the effectiveness of our method.

Clustering

Paper
Add Code

Personalized and Occupational-aware Age Progression by Generative Adversarial Networks

no code implementations • 26 Nov 2017 • Siyu Zhou, Weiqiang Zhao, Jiashi Feng, Hanjiang Lai, Yan Pan, Jian Yin, Shuicheng Yan

Second, we propose a new occupational-aware adversarial face aging network, which learns human aging process under different occupations.

Human Aging

Paper
Add Code

HashGAN: Attention-aware Deep Adversarial Hashing for Cross Modal Retrieval

no code implementations • 26 Nov 2017 • Xi Zhang, Siyu Zhou, Jiashi Feng, Hanjiang Lai, Bo Li, Yan Pan, Jian Yin, Shuicheng Yan

The proposed new adversarial network, HashGAN, consists of three building blocks: 1) the feature learning module to obtain feature representations, 2) the generative attention module to generate an attention mask, which is used to obtain the attended (foreground) and the unattended (background) feature representations, 3) the discriminative hash coding module to learn hash functions that preserve the similarities between different modalities.

Cross-Modal Retrieval Retrieval

Paper
Add Code

Integrated Face Analytics Networks through Cross-Dataset Hybrid Training

no code implementations • 16 Nov 2017 • Jianshu Li, Shengtao Xiao, Fang Zhao, Jian Zhao, Jianan Li, Jiashi Feng, Shuicheng Yan, Terence Sim

Specifically, iFAN achieves an overall F-score of 91.15% on the Helen dataset for face parsing, a normalized mean error of 5.81% on the MTFL dataset for facial landmark localization and an accuracy of 45.73% on the BNU dataset for emotion recognition with a single model.

Face Alignment Face Parsing +1

Paper
Add Code

Predicting Scene Parsing and Motion Dynamics in the Future

no code implementations • NeurIPS 2017 • Xiaojie Jin, Huaxin Xiao, Xiaohui Shen, Jimei Yang, Zhe Lin, Yunpeng Chen, Zequn Jie, Jiashi Feng, Shuicheng Yan

The ability to predict the future is important for intelligent systems, e.g., autonomous vehicles and robots, to plan early and make decisions accordingly.

Autonomous Vehicles motion prediction +2

Paper
Add Code

Ensemble Robustness and Generalization of Stochastic Deep Learning Algorithms

no code implementations • ICLR 2018 • Tom Zahavy, Bingyi Kang, Alex Sivak, Jiashi Feng, Huan Xu, Shie Mannor

As most deep learning algorithms are stochastic (e.g., Stochastic Gradient Descent, Dropout, and Bayes-by-backprop), we revisit the robustness arguments of Xu & Mannor, and introduce a new approach, ensemble robustness, that concerns the robustness of a population of hypotheses.

Paper
Add Code

Deep Sparse Subspace Clustering

no code implementations • 25 Sep 2017 • Xi Peng, Jiashi Feng, Shijie Xiao, Jiwen Lu, Zhang Yi, Shuicheng Yan

In this paper, we present a deep extension of Sparse Subspace Clustering, termed Deep Sparse Subspace Clustering (DSSC).

Clustering valid

Paper
Add Code

Discriminative Similarity for Clustering and Semi-Supervised Learning

no code implementations • 5 Sep 2017 • Yingzhen Yang, Feng Liang, Nebojsa Jojic, Shuicheng Yan, Jiashi Feng, Thomas S. Huang

By generalization analysis via Rademacher complexity, the generalization error bound for the kernel classifier learned from hypothetical labeling is expressed as the sum of pairwise similarity between the data from different classes, parameterized by the weights of the kernel classifier.

Clustering

Paper
Add Code

On the Suboptimality of Proximal Gradient Descent for $\ell^{0}$ Sparse Approximation

no code implementations • 5 Sep 2017 • Yingzhen Yang, Jiashi Feng, Nebojsa Jojic, Jianchao Yang, Thomas S. Huang

We study the proximal gradient descent (PGD) method for $\ell^{0}$ sparse approximation problem as well as its accelerated optimization with randomized algorithms in this paper.

Compressive Sensing Dimensionality Reduction

Paper
Add Code

Self-explanatory Deep Salient Object Detection

no code implementations • 18 Aug 2017 • Huaxin Xiao, Jiashi Feng, Yunchao Wei, Maojun Zhang

Through visualizing the differences, we can interpret the capability of different deep neural networks based saliency detection models and demonstrate that our proposed model indeed uses more reasonable structure for salient object detection.

Object object-detection +3

Paper
Add Code

Training Group Orthogonal Neural Networks with Privileged Information

no code implementations • 24 Jan 2017 • Yunpeng Chen, Xiaojie Jin, Jiashi Feng, Shuicheng Yan

Learning rich and diverse representations is critical for the performance of deep convolutional neural networks (CNNs).

Image Classification Image Segmentation +1

Paper
Add Code

Learning with Rethinking: Recurrently Improving Convolutional Neural Networks through Feedback

no code implementations • 15 Aug 2017 • Xin Li, Zequn Jie, Jiashi Feng, Changsong Liu, Shuicheng Yan

However, most of the existing CNN models only learn features through a feedforward structure and no feedback information from top to bottom layers is exploited to enable the networks to refine themselves.

Paper
Add Code

FoveaNet: Perspective-aware Urban Scene Parsing

no code implementations • ICCV 2017 • Xin Li, Zequn Jie, Wei Wang, Changsong Liu, Jimei Yang, Xiaohui Shen, Zhe Lin, Qiang Chen, Shuicheng Yan, Jiashi Feng

Thus, they suffer from heterogeneous object scales caused by perspective projection of cameras on actual scenes and inevitably encounter parsing failures on distant objects as well as other boundary and recognition errors.

Scene Parsing

Paper
Add Code

The Landscape of Deep Learning Algorithms

no code implementations • 19 May 2017 • Pan Zhou, Jiashi Feng

For an $l$-layer linear neural network, we prove its empirical risk uniformly converges to its population risk at the rate of $\mathcal{O}(r^{2l}\sqrt{d\log(l)}/\sqrt{n})$ with training sample size of $n$, the total weight dimension of $d$ and the magnitude bound $r$ of weight of each layer.

Generalization Bounds

Paper
Add Code

Accelerated Randomized Mirror Descent Algorithms For Composite Non-strongly Convex Optimization

no code implementations • 23 May 2016 • Le Thi Khanh Hien, Cuong V. Nguyen, Huan Xu, Can-Yi Lu, Jiashi Feng

Avoiding this device, we propose an accelerated randomized mirror descent method for solving this problem without the strongly convex assumption.

Paper
Add Code

Neural Person Search Machines

no code implementations • ICCV 2017 • Hao Liu, Jiashi Feng, Zequn Jie, Karlekar Jayashree, Bo Zhao, Meibin Qi, Jianguo Jiang, Shuicheng Yan

We investigate the problem of person search in the wild in this work.

Ranked #4 on Person Re-Identification on CUHK-SYSU

Person Search

Paper
Add Code

Perceptual Generative Adversarial Networks for Small Object Detection

no code implementations • CVPR 2017 • Jianan Li, Xiaodan Liang, Yunchao Wei, Tingfa Xu, Jiashi Feng, Shuicheng Yan

In this work, we address the small object detection problem by developing a single architecture that internally lifts representations of small objects to "super-resolved" ones, achieving similar characteristics as large objects and thus more discriminative for detection.

Generative Adversarial Network Object +2

Paper
Add Code

Video-based Person Re-identification with Accumulative Motion Context

no code implementations • 1 Jan 2017 • Hao Liu, Zequn Jie, Karlekar Jayashree, Meibin Qi, Jianguo Jiang, Shuicheng Yan, Jiashi Feng

Video based person re-identification plays a central role in realistic security and video surveillance.

Video-Based Person Re-Identification

Paper
Add Code

A Unified Framework for Stochastic Matrix Factorization via Variance Reduction

no code implementations • 19 May 2017 • Renbo Zhao, William B. Haskell, Jiashi Feng

We propose a unified framework to speed up the existing stochastic matrix factorization (SMF) algorithms via variance reduction.

Paper
Add Code

Diversified Visual Attention Networks for Fine-Grained Object Classification

no code implementations • 28 Jun 2016 • Bo Zhao, Xiao Wu, Jiashi Feng, Qiang Peng, Shuicheng Yan

Fine-grained object classification is a challenging task due to the subtle inter-class difference and large intra-class variation.

Classification General Classification +1

Paper
Add Code

IAN: The Individual Aggregation Network for Person Search

no code implementations • 16 May 2017 • Jimin Xiao, Yanchun Xie, Tammam Tillo, Kai-Zhu Huang, Yunchao Wei, Jiashi Feng

In addition, to relieve the negative effect caused by varying visual appearances of the same individual, IAN introduces a novel center loss that can increase the intra-class compactness of feature representations.

object-detection Object Detection +1

Paper
Add Code

Deep Self-Taught Learning for Weakly Supervised Object Localization

no code implementations • CVPR 2017 • Zequn Jie, Yunchao Wei, Xiaojie Jin, Jiashi Feng, Wei Liu

To overcome this issue, we propose a deep self-taught learning approach, which makes the detector learn the object-level features reliable for acquiring tight positive samples and afterwards re-train itself based on them.

Ranked #20 on Weakly Supervised Object Detection on PASCAL VOC 2012 test

Object Weakly Supervised Object Detection +1

Paper
Add Code

End-to-End Comparative Attention Networks for Person Re-identification

no code implementations • 14 Jun 2016 • Hao Liu, Jiashi Feng, Meibin Qi, Jianguo Jiang, Shuicheng Yan

The CAN model is able to learn which parts of images are relevant for discerning persons and automatically integrates information from different parts to determine whether a pair of images belongs to the same person.

Person Re-Identification

Paper
Add Code

On Fundamental Limits of Robust Learning

no code implementations • 30 Mar 2017 • Jiashi Feng

We consider the problems of robust PAC learning from distributed and streaming data, which may contain malicious errors and outliers, and analyze their fundamental complexity questions.

PAC learning

Paper
Add Code

Interpretable Structure-Evolving LSTM

no code implementations • CVPR 2017 • Xiaodan Liang, Liang Lin, Xiaohui Shen, Jiashi Feng, Shuicheng Yan, Eric P. Xing

Instead of learning LSTM models over the pre-fixed structures, we propose to further learn the intermediate interpretable multi-level graph structures in a progressive and stochastic way from data during the LSTM network optimization.

Small Data Image Classification

Paper
Add Code

Tree-Structured Reinforcement Learning for Sequential Object Localization

no code implementations • NeurIPS 2016 • Zequn Jie, Xiaodan Liang, Jiashi Feng, Xiaojie Jin, Wen Feng Lu, Shuicheng Yan

Therefore, Tree-RL can better cover different objects with various scales which is quite appealing in the context of object proposal.

Object Object Localization +2

Paper
Add Code

Efficient Hyperparameter Optimization of Deep Learning Algorithms Using Deterministic RBF Surrogates

1 code implementation • 28 Jul 2016 • Ilija Ilievski, Taimoor Akhtar, Jiashi Feng, Christine Annette Shoemaker

Those methods adopt probabilistic surrogate models like Gaussian processes to approximate and minimize the validation error function of hyperparameter values.

Bayesian Optimization Gaussian Processes +2

Paper
Code

Outlier Robust Online Learning

no code implementations • 1 Jan 2017 • Jiashi Feng, Huan Xu, Shie Mannor

We consider the problem of learning from noisy data in practical settings where the size of data is too large to store on a single machine.

Paper
Add Code

Robust LSTM-Autoencoders for Face De-Occlusion in the Wild

no code implementations • 27 Dec 2016 • Fang Zhao, Jiashi Feng, Jian Zhao, Wenhan Yang, Shuicheng Yan

The first one, named multi-scale spatial LSTM encoder, reads facial patches of various scales sequentially to output a latent representation, and occlusion-robustness is achieved owing to the fact that the influence of occlusion is only upon some of the patches.

Face Recognition

Paper
Add Code

Video Scene Parsing with Predictive Feature Learning

no code implementations • ICCV 2017 • Xiaojie Jin, Xin Li, Huaxin Xiao, Xiaohui Shen, Zhe Lin, Jimei Yang, Yunpeng Chen, Jian Dong, Luoqi Liu, Zequn Jie, Jiashi Feng, Shuicheng Yan

In this way, the network can effectively learn to capture video dynamics and temporal context, which are critical clues for video scene parsing, without requiring extra manual annotations.

Representation Learning Scene Parsing

Paper
Add Code

Multi-Path Feedback Recurrent Neural Network for Scene Parsing

no code implementations • 27 Aug 2016 • Xiaojie Jin, Yunpeng Chen, Jiashi Feng, Zequn Jie, Shuicheng Yan

In this paper, we consider the scene parsing problem and propose a novel Multi-Path Feedback recurrent neural network (MPF-RNN) for parsing scene images.

Scene Parsing

Paper
Add Code

Deep Recurrent Regression for Facial Landmark Detection

no code implementations • 30 Oct 2015 • Hanjiang Lai, Shengtao Xiao, Yan Pan, Zhen Cui, Jiashi Feng, Chunyan Xu, Jian Yin, Shuicheng Yan

We propose a novel end-to-end deep architecture for face landmark detection, based on a deep convolutional and deconvolutional network followed by carefully designed recurrent network structures.

Facial Landmark Detection regression

Paper
Add Code

Multi-stage Object Detection with Group Recursive Learning

no code implementations • 18 Aug 2016 • Jianan Li, Xiaodan Liang, Jianshu Li, Tingfa Xu, Jiashi Feng, Shuicheng Yan

Most existing detection pipelines treat object proposals independently and predict bounding box locations and classification scores over them separately.

Object object-detection +4

Paper
Add Code

Hyperparameter Transfer Learning through Surrogate Alignment for Efficient Deep Neural Network Training

no code implementations • 31 Jul 2016 • Ilija Ilievski, Jiashi Feng

Recently, several optimization methods have been successfully applied to the hyperparameter optimization of deep neural networks (DNNs).

Hyperparameter Optimization Transfer Learning

Paper
Add Code

Scale-aware Pixel-wise Object Proposal Networks

no code implementations • 19 Jan 2016 • Zequn Jie, Xiaodan Liang, Jiashi Feng, Wen Feng Lu, Eng Hock Francis Tay, Shuicheng Yan

In particular, in order to improve the localization accuracy, a fully convolutional network is employed which predicts locations of object proposals for each pixel.

Object object-detection +2

Paper
Add Code

Collaborative Layer-wise Discriminative Learning in Deep Neural Networks

no code implementations • 19 Jul 2016 • Xiaojie Jin, Yunpeng Chen, Jian Dong, Jiashi Feng, Shuicheng Yan

In this paper, we propose a layer-wise discriminative learning method to enhance the discriminative capability of a deep network by allowing its layers to work collaboratively for classification.

Classification General Classification +1

Paper
Add Code

Training Skinny Deep Neural Networks with Iterative Hard Thresholding Methods

no code implementations • 19 Jul 2016 • Xiaojie Jin, Xiao-Tong Yuan, Jiashi Feng, Shuicheng Yan

In this paper, we propose an iterative hard thresholding (IHT) approach to train Skinny Deep Neural Networks (SDNNs).

Object Recognition

Paper
Add Code

Deep Edge Guided Recurrent Residual Learning for Image Super-Resolution

no code implementations • 29 Apr 2016 • Wenhan Yang, Jiashi Feng, Jianchao Yang, Fang Zhao, Jiaying Liu, Zongming Guo, Shuicheng Yan

To address this essentially ill-posed problem, we introduce a Deep Edge Guided REcurrent rEsidual (DEGREE) network to progressively recover the high-frequency details.

Image Super-Resolution

Paper
Add Code

Scale-aware Fast R-CNN for Pedestrian Detection

no code implementations • 28 Oct 2015 • Jianan Li, Xiaodan Liang, ShengMei Shen, Tingfa Xu, Jiashi Feng, Shuicheng Yan

Taking pedestrian detection as an example, we illustrate how we can leverage this philosophy to develop a Scale-Aware Fast R-CNN (SAF R-CNN) framework.

Ranked #23 on Pedestrian Detection on Caltech

Pedestrian Detection Philosophy

Paper
Add Code

A Focused Dynamic Attention Model for Visual Question Answering

no code implementations • 6 Apr 2016 • Ilija Ilievski, Shuicheng Yan, Jiashi Feng

Solving VQA problems requires techniques from both computer vision for understanding the visual contents of a presented image or video, as well as the ones from natural language processing for understanding semantics of the question and generating the answers.

Ranked #8 on Visual Question Answering (VQA) on COCO Visual Question Answering (VQA) real images 1.0 multiple choice

Question Answering Visual Question Answering

Paper
Add Code

Attentive Contexts for Object Detection

no code implementations • 24 Mar 2016 • Jianan Li, Yunchao Wei, Xiaodan Liang, Jian Dong, Tingfa Xu, Jiashi Feng, Shuicheng Yan

We provide preliminary answers to these questions through developing a novel Attention to Context Convolution Neural Network (AC-CNN) based object detection model.

Object object-detection +1

Paper
Add Code

Semantic Object Parsing with Graph LSTM

no code implementations • 23 Mar 2016 • Xiaodan Liang, Xiaohui Shen, Jiashi Feng, Liang Lin, Shuicheng Yan

By taking the semantic object parsing task as an exemplar application scenario, we propose the Graph Long Short-Term Memory (Graph LSTM) network, which is the generalization of LSTM from sequential data or multi-dimensional data to general graph-structured data.

Object Superpixels

Paper
Add Code

Auxiliary Image Regularization for Deep CNNs with Noisy Labels

no code implementations • 22 Nov 2015 • Samaneh Azadi, Jiashi Feng, Stefanie Jegelka, Trevor Darrell

Precisely-labeled data sets with sufficient amount of samples are very important for training deep convolutional neural networks (CNNs).

Image Classification

Paper
Add Code

Learning with $\ell^{0}$-Graph: $\ell^{0}$-Induced Sparse Subspace Clustering

no code implementations • 28 Oct 2015 • Yingzhen Yang, Jiashi Feng, Jianchao Yang, Thomas S. Huang

Sparse subspace clustering methods, such as Sparse Subspace Clustering (SSC) \cite{ElhamifarV13} and $\ell^{1}$-graph \cite{YanW09, ChengYYFH10}, are effective in partitioning the data that lie in a union of subspaces.

Clustering

Paper
Add Code

Reversible Recursive Instance-level Object Segmentation

no code implementations • CVPR 2016 • Xiaodan Liang, Yunchao Wei, Xiaohui Shen, Zequn Jie, Jiashi Feng, Liang Lin, Shuicheng Yan

By being reversible, the proposal refinement sub-network adaptively determines an optimal number of refinement iterations required for each proposal during both training and testing.

Denoising Object +2

Paper
Add Code

Semantic Object Parsing with Local-Global Long Short-Term Memory

no code implementations • CVPR 2016 • Xiaodan Liang, Xiaohui Shen, Donglai Xiang, Jiashi Feng, Liang Lin, Shuicheng Yan

The long chains of sequential computation by stacked LG-LSTM layers also enable each pixel to sense a much larger region for inference benefiting from the memorization of previous dependencies in all positions along all dimensions.

Memorization Position

Paper
Add Code

Sense Beyond Expressions: Cuteness

no code implementations • 17 Aug 2015 • Kang Wang, Tam V. Nguyen, Jiashi Feng, Jose Sepulveda

With the development of Internet culture, cuteness has become a popular concept.

Cultural Vocal Bursts Intensity Prediction

Paper
Add Code

Modality-dependent Cross-media Retrieval

no code implementations • 22 Jun 2015 • Yunchao Wei, Yao Zhao, Zhenfeng Zhu, Shikui Wei, Yanhui Xiao, Jiashi Feng, Shuicheng Yan

Specifically, by jointly optimizing the correlation between images and text and the linear regression from one modal space (image or text) to the semantic space, two couples of mappings are learned to project images and text from their original feature spaces into two common latent subspaces (one for I2T and the other for T2I).

Retrieval

Paper
Add Code

Distributed Robust Learning

no code implementations • 21 Sep 2014 • Jiashi Feng, Huan Xu, Shie Mannor

We propose a framework for distributed robust statistical learning on big contaminated data.

Paper
Add Code

Correlation Adaptive Subspace Segmentation by Trace Lasso

no code implementations • 18 Jan 2015 • Canyi Lu, Jiashi Feng, Zhouchen Lin, Shuicheng Yan

In this work, we argue that both sparsity and the grouping effect are important for subspace segmentation.

Clustering Segmentation

Paper
Add Code

TS2C: Tight Box Mining with Surrounding Segmentation Context for Weakly Supervised Object Detection

no code implementations • ECCV 2018 • Yunchao Wei, Zhiqiang Shen, Bowen Cheng, Honghui Shi, JinJun Xiong, Jiashi Feng, Thomas Huang

This work provides a simple approach to discover tight object bounding boxes with only image-level supervision, called Tight box mining with Surrounding Segmentation Context (TS2C).

Multiple Instance Learning Object +4

Paper
Add Code

Object Relation Detection Based on One-shot Learning

no code implementations • 16 Jul 2018 • Li Zhou, Jian Zhao, Jianshu Li, Li Yuan, Jiashi Feng

Detecting the relations among objects, such as "cat on sofa" and "person ride horse", is a crucial task in image understanding, and beneficial to bridging the semantic gap between images and natural language.

Object One-Shot Learning +1

Paper
Add Code

Multi-Fiber Networks for Video Recognition

no code implementations • ECCV 2018 • Yunpeng Chen, Yannis Kalantidis, Jianshu Li, Shuicheng Yan, Jiashi Feng

In this paper, we aim to reduce the computational cost of spatio-temporal deep neural networks, making them run as fast as their 2D counterparts while preserving state-of-the-art accuracy on video recognition benchmarks.

Ranked #36 on Action Recognition on UCF101 (using extra training data)

Action Classification Action Recognition +1

Paper
Add Code

$A^2$-Nets: Double Attention Networks

no code implementations • 27 Oct 2018 • Yunpeng Chen, Yannis Kalantidis, Jianshu Li, Shuicheng Yan, Jiashi Feng

Learning to capture long-range relations is fundamental to image/video recognition.

Ranked #35 on Action Recognition on UCF101

3D Absolute Human Pose Estimation Action Classification +3

Paper
Add Code

New Insight into Hybrid Stochastic Gradient Descent: Beyond With-Replacement Sampling and Convexity

no code implementations • NeurIPS 2018 • Pan Zhou, Xiao-Tong Yuan, Jiashi Feng

In this paper, we affirmatively answer this open question by showing that under WoRS and for both convex and non-convex problems, it is still possible for HSGD (with constant step-size) to match full gradient descent in rate of convergence, while maintaining comparable sample-size-independent incremental first-order oracle complexity to stochastic gradient descent.

Open-Ended Question Answering

Paper
Add Code

Efficient Stochastic Gradient Hard Thresholding

no code implementations • NeurIPS 2018 • Pan Zhou, Xiao-Tong Yuan, Jiashi Feng

To address these deficiencies, we propose an efficient hybrid stochastic gradient hard thresholding (HSG-HT) method that can be provably shown to have sample-size-independent gradient evaluation and hard thresholding complexity bounds.

Computational Efficiency

Paper
Add Code

Dual-Agent GANs for Photorealistic and Identity Preserving Profile Face Synthesis

no code implementations • NeurIPS 2017 • Jian Zhao, Lin Xiong, Karlekar Jayashree, Jianshu Li, Fang Zhao, Zhecan Wang, Sugiri Pranata, Shengmei Shen, Shuicheng Yan, Jiashi Feng

In particular, we employ an off-the-shelf 3D face model as a simulator to generate profile face images with varying poses.

Ranked #1 on Face Verification on IJB-A

Face Generation Face Model +4

Paper
Add Code

Multimodal Learning and Reasoning for Visual Question Answering

no code implementations • NeurIPS 2017 • Ilija Ilievski, Jiashi Feng

In this work we introduce a modular neural network model that learns a multimodal and multifaceted representation of the image and the question.

Question Answering Representation Learning +1

Paper
Add Code

Robust Logistic Regression and Classification

no code implementations • NeurIPS 2014 • Jiashi Feng, Huan Xu, Shie Mannor, Shuicheng Yan

We consider logistic regression with arbitrary outliers in the covariate matrix.

Binary Classification Classification +2

Paper
Add Code

Online Robust PCA via Stochastic Optimization

no code implementations • NeurIPS 2013 • Jiashi Feng, Huan Xu, Shuicheng Yan

Robust PCA methods are typically based on batch optimization and have to load all the samples into memory.

Stochastic Optimization

Paper
Add Code

Online PCA for Contaminated Data

no code implementations • NeurIPS 2013 • Jiashi Feng, Huan Xu, Shie Mannor, Shuicheng Yan

We consider the online Principal Component Analysis (PCA) for contaminated samples (containing outliers) which are revealed sequentially to the Principal Components (PCs) estimator.

Paper
Add Code

MoNet: Deep Motion Exploitation for Video Object Segmentation

no code implementations • CVPR 2018 • Huaxin Xiao, Jiashi Feng, Guosheng Lin, Yu Liu, Maojun Zhang

In this paper, we propose a novel MoNet model to deeply exploit motion cues for boosting video object segmentation performance from two aspects, i.e., frame representation learning and segmentation refinement.

Object Optical Flow Estimation +5

Paper
Add Code

Deep Adversarial Subspace Clustering

no code implementations • CVPR 2018 • Pan Zhou, Yunqing Hou, Jiashi Feng

To solve this issue, we propose a novel deep adversarial subspace clustering (DASC) model, which learns more favorable sample representations by deep learning for subspace clustering, and more importantly introduces adversarial learning to supervise sample representation learning and subspace clustering.

Ranked #2 on Image Clustering on coil-40

Clustering Image Clustering +1

Paper
Add Code
