Search Results for author: Alan Yuille

Found 197 papers, 101 papers with code

InstMove: Instance Motion for Object-centric Video Segmentation

1 code implementation 14 Mar 2023 Qihao Liu, Junfeng Wu, Yi Jiang, Xiang Bai, Alan Yuille, Song Bai

A common solution is to use optical flow to provide motion information, but essentially it only considers pixel-level motion, which still relies on appearance similarity and hence is often inaccurate under occlusion and fast movement.

Optical Flow Estimation Video Segmentation +1

PoseExaminer: Automated Testing of Out-of-Distribution Robustness in Human Pose and Shape Estimation

1 code implementation 13 Mar 2023 Qihao Liu, Adam Kortylewski, Alan Yuille

We introduce a learning-based testing method, termed PoseExaminer, that automatically diagnoses HPS algorithms by searching over the parameter space of human pose images to find the failure modes.

Multi-agent Reinforcement Learning

Benchmarking Robustness in Neural Radiance Fields

no code implementations 10 Jan 2023 Chen Wang, Angtian Wang, Junbo Li, Alan Yuille, Cihang Xie

We find that NeRF-based models are significantly degraded in the presence of corruption, and are more sensitive to a different set of corruptions than image recognition models.

Benchmarking Camera Calibration +2

Learning Road Scene-level Representations via Semantic Region Prediction

no code implementations 2 Jan 2023 Zihao Xiao, Alan Yuille, Yi-Ting Chen

In this work, we tackle two vital tasks in automated driving systems, i.e., driver intent prediction and risk object identification from egocentric images.

Unleashing the Power of Visual Prompting At the Pixel Level

1 code implementation 20 Dec 2022 Junyang Wu, Xianhang Li, Chen Wei, Huiyu Wang, Alan Yuille, Yuyin Zhou, Cihang Xie

This paper presents a simple and effective visual prompting method for adapting pre-trained models to downstream recognition tasks.

AsyInst: Asymmetric Affinity with DepthGrad and Color for Box-Supervised Instance Segmentation

no code implementations 7 Dec 2022 Siwei Yang, Longlong Jing, Junfei Xiao, Hang Zhao, Alan Yuille, Yingwei Li

Through systematic analysis, we found that the commonly used pairwise affinity loss has two limitations: (1) it works with color affinity but leads to inferior performance with other modalities such as depth gradient, and (2) the original affinity loss does not prevent trivial predictions as intended but actually accelerates this process due to the affinity loss term being symmetric.

Box-supervised Instance Segmentation Semantic Segmentation +1

Localization vs. Semantics: How Can Language Benefit Visual Representation Learning?

no code implementations 1 Dec 2022 Zhuowan Li, Cihang Xie, Benjamin Van Durme, Alan Yuille

In this work, we investigate how language can help with visual representation learning from a probing perspective.

Representation Learning

LUMix: Improving Mixup by Better Modelling Label Uncertainty

no code implementations 29 Nov 2022 Shuyang Sun, Jie-Neng Chen, Ruifei He, Alan Yuille, Philip Torr, Song Bai

LUMix is simple, as it can be implemented in just a few lines of code, and can be universally applied to any deep network, e.g., CNNs and Vision Transformers, with minimal computational cost.

Data Augmentation

SMAUG: Sparse Masked Autoencoder for Efficient Video-Language Pre-training

no code implementations 21 Nov 2022 Yuanze Lin, Chen Wei, Huiyu Wang, Alan Yuille, Cihang Xie

Coupling all these designs allows our method to enjoy both competitive performance on text-to-video retrieval and video question answering tasks, and pre-training costs that are lower by 1.9X or more.

Question Answering Retrieval +3

Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification

1 code implementation 23 Oct 2022 Junfei Xiao, Yutong Bai, Alan Yuille, Zongwei Zhou

We hope that this study can direct future research on the application of Transformers to a larger variety of medical imaging tasks.

Transfer Learning

1st Place Solution of The Robust Vision Challenge 2022 Semantic Segmentation Track

1 code implementation 23 Oct 2022 Junfei Xiao, Zhichao Xu, Shiyi Lan, Zhiding Yu, Alan Yuille, Anima Anandkumar

The model is trained on a composite dataset consisting of images from 9 datasets (ADE20K, Cityscapes, Mapillary Vistas, ScanNet, VIPER, WildDash 2, IDD, BDD, and COCO) with a simple dataset balancing strategy.

Semantic Segmentation

Context-Enhanced Stereo Transformer

1 code implementation 21 Oct 2022 Weiyu Guo, Zhaoshuo Li, Yongkui Yang, Zheng Wang, Russell H. Taylor, Mathias Unberath, Alan Yuille, Yingwei Li

We construct our stereo depth estimation model, Context Enhanced Stereo Transformer (CSTR), by plugging CEP into the state-of-the-art stereo depth estimation method Stereo Transformer.

Stereo Depth Estimation Stereo Matching

Masked Autoencoders Enable Efficient Knowledge Distillers

1 code implementation 25 Aug 2022 Yutong Bai, Zeyu Wang, Junfei Xiao, Chen Wei, Huiyu Wang, Alan Yuille, Yuyin Zhou, Cihang Xie

For example, by distilling the knowledge from an MAE pre-trained ViT-L into a ViT-B, our method achieves 84.0% ImageNet top-1 accuracy, outperforming the baseline of directly distilling a fine-tuned ViT-L by 1.2%.

Knowledge Distillation

Explicit Occlusion Reasoning for Multi-person 3D Human Pose Estimation

no code implementations 29 Jul 2022 Qihao Liu, Yi Zhang, Song Bai, Alan Yuille

Inspired by the remarkable ability of humans to infer occluded joints from visible cues, we develop a method to explicitly model this process that significantly improves bottom-up multi-person human pose estimation with or without occlusions.

3D Human Pose Estimation Data Augmentation

In Defense of Online Models for Video Instance Segmentation

1 code implementation 21 Jul 2022 Junfeng Wu, Qihao Liu, Yi Jiang, Song Bai, Alan Yuille, Xiang Bai

In recent years, video instance segmentation (VIS) has been largely advanced by offline models, while online models gradually attracted less attention possibly due to their inferior performance.

Ranked #2 on Video Instance Segmentation on YouTube-VIS validation (using extra training data)

Association Contrastive Learning +5

k-means Mask Transformer

1 code implementation 8 Jul 2022 Qihang Yu, Huiyu Wang, Siyuan Qiao, Maxwell Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

However, we observe that most existing transformer-based vision models simply borrow the idea from NLP, neglecting the crucial difference between languages and images, particularly the extremely large sequence length of spatially flattened pixel features.

Object Detection Panoptic Segmentation

Unsupervised Domain Adaptation through Shape Modeling for Medical Image Segmentation

1 code implementation 6 Jul 2022 Yuan YAO, Fengze Liu, Zongwei Zhou, Yan Wang, Wei Shen, Alan Yuille, Yongyi Lu

Previous methods proposed Variational Autoencoder (VAE) based models to learn the distribution of shape for a particular organ and used it to automatically evaluate the quality of a segmentation prediction by fitting it into the learned shape distribution.

Image Segmentation Pancreas Segmentation +2

A Simple Data Mixing Prior for Improving Self-Supervised Learning

1 code implementation CVPR 2022 Sucheng Ren, Huiyu Wang, Zhengqi Gao, Shengfeng He, Alan Yuille, Yuyin Zhou, Cihang Xie

More notably, our SDMP is the first method that successfully leverages data mixing to improve (rather than hurt) the performance of Vision Transformers in the self-supervised setting.

Representation Learning Self-Supervised Learning

VoGE: A Differentiable Volume Renderer using Gaussian Ellipsoids for Analysis-by-Synthesis

1 code implementation 30 May 2022 Angtian Wang, Peng Wang, Jian Sun, Adam Kortylewski, Alan Yuille

The Gaussian reconstruction kernels have been proposed by Westover (1990) and studied by the computer graphics community back in the 90s, which gives an alternative representation of object 3D geometry from meshes and point clouds.

Pose Estimation

In Defense of Image Pre-Training for Spatiotemporal Recognition

1 code implementation 3 May 2022 Xianhang Li, Huiyu Wang, Chen Wei, Jieru Mei, Alan Yuille, Yuyin Zhou, Cihang Xie

Inspired by this observation, we hypothesize that the key to effectively leveraging image pre-training lies in the decomposition of learning spatial and temporal features, and revisiting image pre-training as the appearance prior to initializing 3D kernels.

STS Video Recognition

Fast AdvProp

1 code implementation ICLR 2022 Jieru Mei, Yucheng Han, Yutong Bai, Yixiao Zhang, Yingwei Li, Xianhang Li, Alan Yuille, Cihang Xie

Specifically, our modifications in Fast AdvProp are guided by the hypothesis that disentangled learning with adversarial examples is the key for performance improvements, while other training recipes (e.g., paired clean and adversarial training samples, multi-step adversarial attackers) could be largely simplified.

Data Augmentation object-detection +1

CP2: Copy-Paste Contrastive Pretraining for Semantic Segmentation

1 code implementation 22 Mar 2022 Feng Wang, Huiyu Wang, Chen Wei, Alan Yuille, Wei Shen

Recent advances in self-supervised contrastive learning yield good image-level representation, which favors classification tasks but usually neglects pixel-level detailed information, leading to unsatisfactory transfer performance to dense prediction tasks such as semantic segmentation.

Contrastive Learning Representation Learning +1

DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection

1 code implementation CVPR 2022 Yingwei Li, Adams Wei Yu, Tianjian Meng, Ben Caine, Jiquan Ngiam, Daiyi Peng, Junyang Shen, Bo Wu, Yifeng Lu, Denny Zhou, Quoc V. Le, Alan Yuille, Mingxing Tan

In this paper, we propose two novel techniques: InverseAug that inverses geometric-related augmentations, e.g., rotation, to enable accurate geometric alignment between lidar points and image pixels, and LearnableAlign that leverages cross-attention to dynamically capture the correlations between image and lidar features during fusion.

3D Object Detection Autonomous Driving +2

Lite Vision Transformer with Enhanced Self-Attention

1 code implementation CVPR 2022 Chenglin Yang, Yilin Wang, Jianming Zhang, He Zhang, Zijun Wei, Zhe Lin, Alan Yuille

We propose Lite Vision Transformer (LVT), a novel light-weight transformer network with two enhanced self-attention mechanisms to improve the model performances for mobile deployment.

Panoptic Segmentation

MT-TransUNet: Mediating Multi-Task Tokens in Transformers for Skin Lesion Segmentation and Classification

1 code implementation 3 Dec 2021 Jingye Chen, Jieneng Chen, Zongwei Zhou, Bin Li, Alan Yuille, Yongyi Lu

However, these approaches formulated skin cancer diagnosis as a simple classification task, dismissing the potential benefit from lesion segmentation.

Classification Lesion Classification +2

OOD-CV: A Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images

no code implementations 29 Nov 2021 Bingchen Zhao, Shaozuo Yu, Wufei Ma, Mingxin Yu, Shenxiao Mei, Angtian Wang, Ju He, Alan Yuille, Adam Kortylewski

One reason is that existing robustness benchmarks are limited, as they either rely on synthetic data or ignore the effects of individual nuisance factors.

3D Pose Estimation Benchmarking +4

Learning from Temporal Gradient for Semi-supervised Action Recognition

1 code implementation CVPR 2022 Junfei Xiao, Longlong Jing, Lin Zhang, Ju He, Qi She, Zongwei Zhou, Alan Yuille, Yingwei Li

Our method achieves the state-of-the-art performance on three video action recognition benchmarks (i.e., Kinetics-400, UCF-101, and HMDB-51) under several typical semi-supervised settings (i.e., different ratios of labeled data).

Action Recognition Temporal Action Localization

TransMix: Attend to Mix for Vision Transformers

2 code implementations CVPR 2022 Jie-Neng Chen, Shuyang Sun, Ju He, Philip Torr, Alan Yuille, Song Bai

The confidence of the label will be larger if the corresponding input image is weighted higher by the attention map.

Instance Segmentation object-detection +2

Searching for TrioNet: Combining Convolution with Local and Global Self-Attention

no code implementations 15 Nov 2021 Huaijin Pi, Huiyu Wang, Yingwei Li, Zizhang Li, Alan Yuille

In order to effectively search in this huge architecture space, we propose Hierarchical Sampling for better training of the supernet.

Neural Architecture Search

Occluded Video Instance Segmentation: Dataset and ICCV 2021 Challenge

no code implementations 15 Nov 2021 Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H. S. Torr, Song Bai

To promote the development of occlusion understanding, we collect a large-scale dataset called OVIS for video instance segmentation in the occluded scenario.

Association Instance Segmentation +4

Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose

1 code implementation NeurIPS 2021 Angtian Wang, Shenxiao Mei, Alan Yuille, Adam Kortylewski

The model is initialized from a few labelled images and is subsequently used to synthesize feature representations of unseen 3D views.

3D Pose Estimation Few-Shot Learning

Image BERT Pre-training with Online Tokenizer

no code implementations ICLR 2022 Jinghao Zhou, Chen Wei, Huiyu Wang, Wei Shen, Cihang Xie, Alan Yuille, Tao Kong

The success of language Transformers is primarily attributed to the pretext task of masked language modeling (MLM), where texts are first tokenized into semantically meaningful pieces.

Image Classification Instance Segmentation +5

RobustART: Benchmarking Robustness on Architecture Design and Training Techniques

1 code implementation 11 Sep 2021 Shiyu Tang, Ruihao Gong, Yan Wang, Aishan Liu, Jiakai Wang, Xinyun Chen, Fengwei Yu, Xianglong Liu, Dawn Song, Alan Yuille, Philip H. S. Torr, DaCheng Tao

Thus, we propose RobustART, the first comprehensive Robustness investigation benchmark on ImageNet regarding ARchitecture design (49 human-designed off-the-shelf architectures and 1200+ networks from neural architecture search) and Training techniques (10+ techniques, e.g., data augmentation) towards diverse noises (adversarial, natural, and system noises).

Adversarial Robustness Benchmarking +2

Progressive Stage-wise Learning for Unsupervised Feature Representation Enhancement

no code implementations CVPR 2021 Zefan Li, Chenxi Liu, Alan Yuille, Bingbing Ni, Wenjun Zhang, Wen Gao

For a given unsupervised task, we design multilevel tasks and define different learning stages for the deep network.

Simulated Adversarial Testing of Face Recognition Models

no code implementations CVPR 2022 Nataniel Ruiz, Adam Kortylewski, Weichao Qiu, Cihang Xie, Sarah Adel Bargal, Alan Yuille, Stan Sclaroff

In this work, we propose a framework for learning how to test machine learning algorithms using simulators in an adversarial manner in order to find weaknesses in the model before deploying it in critical scenarios.

BIG-bench Machine Learning Face Recognition

Glance-and-Gaze Vision Transformer

1 code implementation NeurIPS 2021 Qihang Yu, Yingda Xia, Yutong Bai, Yongyi Lu, Alan Yuille, Wei Shen

It is motivated by the Glance and Gaze behavior of human beings when recognizing objects in natural scenes, with the ability to efficiently model both long-range dependencies and local context.

Rethinking Re-Sampling in Imbalanced Semi-Supervised Learning

1 code implementation 1 Jun 2021 Ju He, Adam Kortylewski, Shaokang Yang, Shuai Liu, Cheng Yang, Changhu Wang, Alan Yuille

In particular, we decouple the training of the representation and the classifier, and systematically investigate the effects of different data re-sampling techniques when training the whole network including a classifier as well as fine-tuning the feature extractor only.

Visual analogy: Deep learning versus compositional models

no code implementations 14 May 2021 Nicholas Ichien, Qing Liu, Shuhao Fu, Keith J. Holyoak, Alan Yuille, Hongjing Lu

We compared human performance to that of two recent deep learning models (Siamese Network and Relation Network) directly trained to solve these analogy problems, as well as to that of a compositional model that assesses relational similarity between part-based representations.

Visual Analogies

Auto-FedAvg: Learnable Federated Averaging for Multi-Institutional Medical Image Segmentation

no code implementations 20 Apr 2021 Yingda Xia, Dong Yang, Wenqi Li, Andriy Myronenko, Daguang Xu, Hirofumi Obinata, Hitoshi Mori, Peng An, Stephanie Harmon, Evrim Turkbey, Baris Turkbey, Bradford Wood, Francesca Patella, Elvira Stellato, Gianpaolo Carrafiello, Anna Ierardi, Alan Yuille, Holger Roth

In this work, we design a new data-driven approach, namely Auto-FedAvg, where aggregation weights are dynamically adjusted, depending on data distributions across data silos and the current training progress of the models.

Federated Learning Image Segmentation +3

Self-Supervised Pillar Motion Learning for Autonomous Driving

1 code implementation CVPR 2021 Chenxu Luo, Xiaodong Yang, Alan Yuille

Autonomous driving can benefit from motion behavior comprehension when interacting with diverse traffic participants in highly dynamic environments.

Autonomous Driving Motion Estimation

A-SDF: Learning Disentangled Signed Distance Functions for Articulated Shape Representation

1 code implementation ICCV 2021 Jiteng Mu, Weichao Qiu, Adam Kortylewski, Alan Yuille, Nuno Vasconcelos, Xiaolong Wang

To deal with the large shape variance, we introduce Articulated Signed Distance Functions (A-SDF) to represent articulated shapes with a disentangled latent space, where we have separate codes for encoding shape and articulation.

CateNorm: Categorical Normalization for Robust Medical Image Segmentation

1 code implementation 29 Mar 2021 Junfei Xiao, Lequan Yu, Zongwei Zhou, Yutong Bai, Lei Xing, Alan Yuille, Yuyin Zhou

We propose a new normalization strategy, named categorical normalization (CateNorm), to normalize the activations according to categorical statistics.

Image Segmentation Medical Image Segmentation +1

Weakly Supervised Instance Segmentation for Videos with Temporal Mask Consistency

no code implementations CVPR 2021 Qing Liu, Vignesh Ramanathan, Dhruv Mahajan, Alan Yuille, Zhenheng Yang

However, existing approaches which rely only on image-level class labels predominantly suffer from errors due to (a) partial segmentation of objects and (b) missing object predictions.

Instance Segmentation Semantic Segmentation +1

Understanding Catastrophic Forgetting and Remembering in Continual Learning with Optimal Relevance Mapping

1 code implementation 22 Feb 2021 Prakhar Kaushik, Alex Gain, Adam Kortylewski, Alan Yuille

Additionally, current approaches that deal with forgetting ignore the problem of catastrophic remembering, i.e., the worsening ability to discriminate between data from different tasks.

Continual Learning

Occluded Video Instance Segmentation: A Benchmark

1 code implementation 2 Feb 2021 Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H. S. Torr, Song Bai

On the OVIS dataset, the highest AP achieved by state-of-the-art algorithms is only 16.3, which reveals that we are still at a nascent stage for understanding objects, instances, and videos in a real-world scenario.

Association Instance Segmentation +3

NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation

1 code implementation ICLR 2021 Angtian Wang, Adam Kortylewski, Alan Yuille

Using differentiable rendering we estimate the 3D object pose by minimizing the reconstruction error between NeMo and the feature representation of the target image.

3D Pose Estimation Contrastive Learning

CORL: Compositional Representation Learning for Few-Shot Classification

no code implementations 28 Jan 2021 Ju He, Adam Kortylewski, Alan Yuille

In particular, during meta-learning, we train a knowledge base that consists of a dictionary of component representations and a dictionary of component activation maps that encode common spatial activation patterns of components.

Classification Few-Shot Image Classification +3

Meticulous Object Segmentation

1 code implementation 13 Dec 2020 Chenglin Yang, Yilin Wang, Jianming Zhang, He Zhang, Zhe Lin, Alan Yuille

To evaluate segmentation quality near object boundaries, we propose the Meticulosity Quality (MQ) score considering both the mask coverage and boundary precision.

Image Segmentation Semantic Segmentation

Mask Guided Matting via Progressive Refinement Network

1 code implementation CVPR 2021 Qihang Yu, Jianming Zhang, He Zhang, Yilin Wang, Zhe Lin, Ning Xu, Yutong Bai, Alan Yuille

We propose Mask Guided (MG) Matting, a robust matting framework that takes a general coarse mask as guidance.

Image Matting

ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation

1 code implementation CVPR 2021 Siyuan Qiao, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

We name this joint task as Depth-aware Video Panoptic Segmentation, and propose a new evaluation metric along with two derived datasets for it, which will be made available to the public.

Depth-aware Video Panoptic Segmentation Monocular Depth Estimation +2

MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers

2 code implementations CVPR 2021 Huiyu Wang, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

As a result, MaX-DeepLab shows a significant 7.1% PQ gain in the box-free regime on the challenging COCO dataset, closing the gap between box-based and box-free methods for the first time.

Panoptic Segmentation

Unsupervised Part Discovery via Feature Alignment

no code implementations 1 Dec 2020 Mengqi Guo, Yutong Bai, Zhishuai Zhang, Adam Kortylewski, Alan Yuille

Specifically, given a training image, we find a set of similar images that show instances of the same object category in the same pose, through an affine alignment of their corresponding feature maps.

Object Recognition

Robustness Out of the Box: Compositional Representations Naturally Defend Against Black-Box Patch Attacks

no code implementations 1 Dec 2020 Christian Cosgrove, Adam Kortylewski, Chenglin Yang, Alan Yuille

Second, we find that compositional deep networks, which have part-based representations that lead to innate robustness to natural occlusion, are robust to patch attacks on PASCAL3D+ and the German Traffic Sign Recognition Benchmark, without adversarial training.

Traffic Sign Recognition

Batch Normalization with Enhanced Linear Transformation

1 code implementation 28 Nov 2020 Yuhui Xu, Lingxi Xie, Cihang Xie, Jieru Mei, Siyuan Qiao, Wei Shen, Hongkai Xiong, Alan Yuille

Batch normalization (BN) is a fundamental unit in modern deep networks, in which a linear transformation module was designed for improving BN's flexibility of fitting complex data distributions.

Amodal Segmentation through Out-of-Task and Out-of-Distribution Generalization with a Bayesian Model

1 code implementation CVPR 2022 Yihong Sun, Adam Kortylewski, Alan Yuille

Moreover, by leveraging an outlier process, Bayesian models can further generalize out-of-distribution to segment partially occluded objects and to predict their amodal object boundaries.

Amodal Instance Segmentation Out-of-Distribution Generalization +1

Shape-Texture Debiased Neural Network Training

1 code implementation ICLR 2021 Yingwei Li, Qihang Yu, Mingxing Tan, Jieru Mei, Peng Tang, Wei Shen, Alan Yuille, Cihang Xie

To prevent models from exclusively attending on a single cue in representation learning, we augment training data with images with conflicting shape and texture information (e.g., an image of chimpanzee shape but with lemon texture) and, most importantly, provide the corresponding supervisions from shape and texture simultaneously.

Adversarial Robustness Data Augmentation +2

CO2: Consistent Contrast for Unsupervised Visual Representation Learning

no code implementations ICLR 2021 Chen Wei, Huiyu Wang, Wei Shen, Alan Yuille

Regarding the similarity of the query crop to each crop from other images as "unlabeled", the consistency term takes the corresponding similarity of a positive crop as a pseudo label, and encourages consistency between these two similarities.

Contrastive Learning Image Classification +5

Lymph Node Gross Tumor Volume Detection and Segmentation via Distance-based Gating using 3D CT/PET Imaging in Radiotherapy

no code implementations 27 Aug 2020 Zhuotun Zhu, Dakai Jin, Ke Yan, Tsung-Ying Ho, Xianghua Ye, Dazhou Guo, Chun-Hung Chao, Jing Xiao, Alan Yuille, Le Lu

Finding, identifying and segmenting suspicious cancer metastasized lymph nodes from 3D multi-modality imaging is a clinical task of paramount importance.

ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation

1 code implementation 12 Aug 2020 Hanwen Cao, Yongyi Lu, Cewu Lu, Bo Pang, Gongshen Liu, Alan Yuille

In this paper, we further improve spatio-temporal point cloud feature learning with a flexible module called ASAP considering both attention and structure information across frames, which we find as two important factors for successful segmentation in dynamic point clouds.

Probabilistic Multi-modal Trajectory Prediction with Lane Attention for Autonomous Vehicles

no code implementations 6 Jul 2020 Chenxu Luo, Lin Sun, Dariush Dabiri, Alan Yuille

As for vehicles, their trajectories are significantly influenced by the lane geometry and how to effectively use the lane information is of active interest.

Autonomous Vehicles Motion Forecasting +1

Uncertainty-aware multi-view co-training for semi-supervised medical image segmentation and domain adaptation

no code implementations 28 Jun 2020 Yingda Xia, Dong Yang, Zhiding Yu, Fengze Liu, Jinzheng Cai, Lequan Yu, Zhuotun Zhu, Daguang Xu, Alan Yuille, Holger Roth

Experiments on the NIH pancreas segmentation dataset and a multi-organ segmentation dataset show state-of-the-art performance of the proposed framework on semi-supervised medical image segmentation.

Image Segmentation Organ Segmentation +5

Compositional Convolutional Neural Networks: A Robust and Interpretable Model for Object Recognition under Occlusion

no code implementations 28 Jun 2020 Adam Kortylewski, Qing Liu, Angtian Wang, Yihong Sun, Alan Yuille

The structure of the compositional model enables CompositionalNets to decompose images into objects and context, as well as to further decompose object representations in terms of individual parts and the objects' pose.

Image Classification object-detection +2

Smooth Adversarial Training

1 code implementation 25 Jun 2020 Cihang Xie, Mingxing Tan, Boqing Gong, Alan Yuille, Quoc V. Le

SAT also works well with larger networks: it helps EfficientNet-L1 to achieve 82.2% accuracy and 58.6% robustness on ImageNet, outperforming the previous state-of-the-art defense by 9.5% for accuracy and 11.6% for robustness.

Adversarial Defense Adversarial Robustness

Detecting Scatteredly-Distributed, Small, and Critically Important Objects in 3D Oncology Imaging via Decision Stratification

no code implementations 27 May 2020 Zhuotun Zhu, Ke Yan, Dakai Jin, Jinzheng Cai, Tsung-Ying Ho, Adam P. Harrison, Dazhou Guo, Chun-Hung Chao, Xianghua Ye, Jing Xiao, Alan Yuille, Le Lu

We focus on the detection and segmentation of oncology-significant (or suspicious cancer metastasized) lymph nodes (OSLNs), which has not been studied before as a computational task.

JSSR: A Joint Synthesis, Segmentation, and Registration System for 3D Multi-Modal Image Alignment of Large-scale Pathological CT Scans

no code implementations ECCV 2020 Fengze Liu, Jingzheng Cai, Yuankai Huo, Chi-Tung Cheng, Ashwin Raju, Dakai Jin, Jing Xiao, Alan Yuille, Le Lu, Chien-Hung Liao, Adam P. Harrison

We extensively evaluate our JSSR system on a large-scale medical image dataset containing 1,485 patient CT imaging studies of four different phases (i.e., 5,940 3D CT scans with pathological livers) on the registration, segmentation and synthesis tasks.

Image Registration Multi-Task Learning +1

Robust Object Detection under Occlusion with Context-Aware CompositionalNets

no code implementations CVPR 2020 Angtian Wang, Yihong Sun, Adam Kortylewski, Alan Yuille

In this work, we propose to overcome two limitations of CompositionalNets which will enable them to detect partially occluded objects: 1) CompositionalNets, as well as other DCNN architectures, do not explicitly separate the representation of the context from the object itself.

object-detection Robust Object Detection

Domain Adaptive Relational Reasoning for 3D Multi-Organ Segmentation

no code implementations 18 May 2020 Shuhao Fu, Yongyi Lu, Yan Wang, Yuyin Zhou, Wei Shen, Elliot Fishman, Alan Yuille

In this paper, we present a novel unsupervised domain adaptation (UDA) method, named Domain Adaptive Relational Reasoning (DARR), to generalize 3D multi-organ segmentation models to medical data collected from different scanners and/or protocols (domains).

Organ Segmentation Relational Reasoning +2

Organ at Risk Segmentation for Head and Neck Cancer using Stratified Learning and Neural Architecture Search

no code implementations CVPR 2020 Dazhou Guo, Dakai Jin, Zhuotun Zhu, Tsung-Ying Ho, Adam P. Harrison, Chun-Hung Chao, Jing Xiao, Alan Yuille, Chien-Yu Lin, Le Lu

This is the goal of our work, where we introduce stratified organ at risk segmentation (SOARS), an approach that stratifies OARs into anchor, mid-level, and small & hard (S&H) categories.

Anatomy Neural Architecture Search

Context-Aware Group Captioning via Self-Attention and Contrastive Features

no code implementations CVPR 2020 Zhuowan Li, Quan Tran, Long Mai, Zhe Lin, Alan Yuille

In this paper, we introduce a new task, context-aware group captioning, which aims to describe a group of target images in the context of another group of related reference images.

Image Captioning

Neural Architecture Search for Lightweight Non-Local Networks

2 code implementations CVPR 2020 Yingwei Li, Xiaojie Jin, Jieru Mei, Xiaochen Lian, Linjie Yang, Cihang Xie, Qihang Yu, Yuyin Zhou, Song Bai, Alan Yuille

However, it has been rarely explored to embed the NL blocks in mobile neural networks, mainly due to the following challenges: 1) NL blocks generally have heavy computation cost which makes it difficult to be applied in applications where computational resources are limited, and 2) it is an open problem to discover an optimal configuration to embed NL blocks into mobile neural networks.

Image Classification Neural Architecture Search

Are Labels Necessary for Neural Architecture Search?

2 code implementations ECCV 2020 Chenxi Liu, Piotr Dollár, Kaiming He, Ross Girshick, Alan Yuille, Saining Xie

Existing neural network architectures in computer vision -- whether designed by humans or by machines -- were typically found using both images and their associated labels.

Neural Architecture Search

Synthesize then Compare: Detecting Failures and Anomalies for Semantic Segmentation

1 code implementation ECCV 2020 Yingda Xia, Yi Zhang, Fengze Liu, Wei Shen, Alan Yuille

The ability to detect failures and anomalies is a fundamental requirement for building reliable systems for computer vision applications, especially safety-critical applications of semantic segmentation, such as autonomous driving and medical image analysis.

Ranked #5 on Anomaly Detection on Road Anomaly (using extra training data)

Anomaly Detection Autonomous Driving +2

Compositional Convolutional Neural Networks: A Deep Architecture with Innate Robustness to Partial Occlusion

1 code implementation CVPR 2020 Adam Kortylewski, Ju He, Qing Liu, Alan Yuille

Inspired by the success of compositional models at classifying partially occluded objects, we propose to integrate compositional models and DCNNs into a unified deep model with innate robustness to partial occlusion.

General Classification

When Radiology Report Generation Meets Knowledge Graph

no code implementations 19 Feb 2020 Yixiao Zhang, Xiaosong Wang, Ziyue Xu, Qihang Yu, Alan Yuille, Daguang Xu

In addition, we proposed a new evaluation metric for radiology image reporting with the assistance of the same composed graph.

Graph Embedding Image Captioning

AtomNAS: Fine-Grained End-to-End Neural Architecture Search

1 code implementation ICLR 2020 Jieru Mei, Yingwei Li, Xiaochen Lian, Xiaojie Jin, Linjie Yang, Alan Yuille, Jianchao Yang

We propose a fine-grained search space comprised of atomic blocks, a minimal search unit that is much smaller than the ones used in recent NAS algorithms.

Neural Architecture Search

Learning from Synthetic Animals

2 code implementations CVPR 2020 Jiteng Mu, Weichao Qiu, Gregory Hager, Alan Yuille

Despite great success in human parsing, progress for parsing other deformable articulated objects, like animals, is still limited by the lack of labeled data.

Domain Adaptation Human Parsing +1

DASZL: Dynamic Action Signatures for Zero-shot Learning

no code implementations 8 Dec 2019 Tae Soo Kim, Jonathan D. Jones, Michael Peven, Zihao Xiao, Jin Bai, Yi Zhang, Weichao Qiu, Alan Yuille, Gregory D. Hager

There are many realistic applications of activity recognition where the set of potential activity descriptions is combinatorially large.

Action Detection Activity Detection +3

RSA: Randomized Simulation as Augmentation for Robust Human Action Recognition

no code implementations 3 Dec 2019 Yi Zhang, Xinyue Wei, Weichao Qiu, Zihao Xiao, Gregory D. Hager, Alan Yuille

In this paper, we propose the Randomized Simulation as Augmentation (RSA) framework which augments real-world training data with synthetic data to improve the robustness of action recognition networks.

Action Recognition Temporal Action Localization

Identifying Model Weakness with Adversarial Examiner

no code implementations 25 Nov 2019 Michelle Shu, Chenxi Liu, Weichao Qiu, Alan Yuille

Different from the existing strategy to always give the same (distribution of) test data, the adversarial examiner will dynamically select the next test data to hand out based on the testing history so far, with the goal being to undermine the model's performance.

Autonomous Driving
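
The adaptive selection described in the Adversarial Examiner entry above can be illustrated with a toy loop. This is only a sketch under stated assumptions: it uses a simple epsilon-greedy rule as a stand-in and is not the paper's method. The names `adversarial_examiner`, `toy_model`, and the three test conditions are hypothetical.

```python
import random
from collections import defaultdict


def adversarial_examiner(model_is_correct, conditions, rounds=200, epsilon=0.1, seed=0):
    """Choose test conditions adaptively, favouring those where the model fails most.

    Epsilon-greedy stand-in (an assumption, not the paper's algorithm).
    """
    rng = random.Random(seed)
    trials = defaultdict(int)    # how often each condition was tested
    failures = defaultdict(int)  # how often the model failed on it
    history = []
    for _ in range(rounds):
        untested = [c for c in conditions if trials[c] == 0]
        if untested:
            choice = untested[0]              # try every condition once first
        elif rng.random() < epsilon:
            choice = rng.choice(conditions)   # occasional exploration
        else:
            # exploit: the condition with the highest observed failure rate
            choice = max(conditions, key=lambda c: failures[c] / trials[c])
        ok = model_is_correct(choice)
        trials[choice] += 1
        failures[choice] += 0 if ok else 1
        history.append((choice, ok))
    return history, dict(trials), dict(failures)


def toy_model(condition):
    # Hypothetical model: always wrong under "occluded", always right otherwise.
    return condition != "occluded"


history, trials, failures = adversarial_examiner(toy_model, ["clear", "dim", "occluded"])
```

Because the next test depends on the history so far, most of the testing budget ends up on the condition that hurts the model, unlike a fixed test distribution.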

Deeply Shape-guided Cascade for Instance Segmentation

1 code implementation CVPR 2021 Hao Ding, Siyuan Qiao, Alan Yuille, Wei Shen

The key to a successful cascade architecture for precise instance segmentation is to fully leverage the relationship between bounding box detection and mask segmentation across multiple stages.

Instance Segmentation Region Proposal +1

Adversarial Examples Improve Image Recognition

6 code implementations CVPR 2020 Cihang Xie, Mingxing Tan, Boqing Gong, Jiang Wang, Alan Yuille, Quoc V. Le

We show that AdvProp improves a wide range of models on various image recognition tasks and performs better when the models are bigger.

Image Classification

Rethinking Normalization and Elimination Singularity in Neural Networks

1 code implementation 21 Nov 2019 Siyuan Qiao, Huiyu Wang, Chenxi Liu, Wei Shen, Alan Yuille

To address this issue, we propose BatchChannel Normalization (BCN), which uses batch knowledge to avoid the elimination singularities in the training of channel-normalized models.

Image Classification Instance Segmentation +3

Localizing Occluders with Compositional Convolutional Networks

no code implementations 18 Nov 2019 Adam Kortylewski, Qing Liu, Huiyu Wang, Zhishuai Zhang, Alan Yuille

Our experimental results demonstrate that the proposed extensions increase the model's performance at localizing occluders as well as at classifying partially occluded objects.

Grouped Spatial-Temporal Aggregation for Efficient Action Recognition

1 code implementation ICCV 2019 Chenxu Luo, Alan Yuille

This decomposition is more parameter-efficient and enables us to quantitatively analyze the contributions of spatial and temporal features in different layers.

Action Recognition

TDAPNet: Prototype Network with Recurrent Top-Down Attention for Robust Object Classification under Partial Occlusion

no code implementations 9 Sep 2019 Mingqing Xiao, Adam Kortylewski, Ruihai Wu, Siyuan Qiao, Wei Shen, Alan Yuille

Despite the great success of deep convolutional neural networks in object classification, they suffer from a severe drop in generalization performance under occlusion due to the inconsistency between training and testing data.

General Classification Object Recognition

Hyper-Pairing Network for Multi-Phase Pancreatic Ductal Adenocarcinoma Segmentation

no code implementations 3 Sep 2019 Yuyin Zhou, Yingwei Li, Zhishuai Zhang, Yan Wang, Angtian Wang, Elliot Fishman, Alan Yuille, Seyoun Park

Pancreatic ductal adenocarcinoma (PDAC) is one of the most lethal cancers with an overall five-year survival rate of 8%.

Deep Differentiable Random Forests for Age Estimation

no code implementations 23 Jul 2019 Wei Shen, Yilu Guo, Yan Wang, Kai Zhao, Bo Wang, Alan Yuille

Both of them connect split nodes to the top layer of convolutional neural networks (CNNs) and deal with inhomogeneous data by jointly learning input-dependent data partitions at the split nodes and age distributions at the leaf nodes.

Age Estimation regression

Multi-Scale Attentional Network for Multi-Focal Segmentation of Active Bleed after Pelvic Fractures

no code implementations 23 Jun 2019 Yuyin Zhou, David Dreizin, Yingwei Li, Zhishuai Zhang, Yan Wang, Alan Yuille

Trauma is the worldwide leading cause of death and disability in those younger than 45 years, and pelvic fractures are a major source of morbidity and mortality.

Intriguing properties of adversarial training at scale

no code implementations ICLR 2020 Cihang Xie, Alan Yuille

This two-domain hypothesis may explain the issue of BN when training with a mixture of clean and adversarial images, as estimating normalization statistics of this mixture distribution is challenging.

Adversarial Robustness

V-NAS: Neural Architecture Search for Volumetric Medical Image Segmentation

no code implementations 6 Jun 2019 Zhuotun Zhu, Chenxi Liu, Dong Yang, Alan Yuille, Daguang Xu

Deep learning algorithms, in particular 2D and 3D fully convolutional neural networks (FCNs), have rapidly become the mainstream methodology for volumetric medical image segmentation.

Image Segmentation Neural Architecture Search +2

Combining Compositional Models and Deep Networks For Robust Object Classification under Occlusion

no code implementations 28 May 2019 Adam Kortylewski, Qing Liu, Huiyu Wang, Zhishuai Zhang, Alan Yuille

In this work, we combine DCNNs and compositional object models to retain the best of both approaches: a discriminative model that is robust to partial occlusion and mask attacks.

General Classification Image Classification

Robustness of Object Recognition under Extreme Occlusion in Humans and Computational Models

1 code implementation 11 May 2019 Hongru Zhu, Peng Tang, Jeongho Park, Soojin Park, Alan Yuille

We test both humans and the above-mentioned computational models in a challenging task of object recognition under extreme occlusion, where target objects are heavily occluded by irrelevant real objects in real backgrounds.

Object Recognition

Structured Prediction using cGANs with Fusion Discriminator

no code implementations ICLR 2019 Faisal Mahmood, Wenhao Xu, Nicholas J. Durr, Jeremiah W. Johnson, Alan Yuille

We propose the fusion discriminator, a single unified framework for incorporating conditional information into a generative adversarial network (GAN) for a variety of distinct structured prediction tasks, including image synthesis, semantic segmentation, and depth estimation.

Depth Estimation Image Generation +2

Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval

1 code implementation ICCV 2019 Qing Liu, Lingxi Xie, Huiyu Wang, Alan Yuille

Sketch-based image retrieval (SBIR) is widely recognized as an important vision problem which implies a wide range of real-world applications.

Domain Adaptation Retrieval +2

An Alarm System For Segmentation Algorithm Based On Shape Model

no code implementations ICLR 2019 Fengze Liu, Yingda Xia, Dong Yang, Alan Yuille, Daguang Xu

Motivated by this, in this paper, we learn a feature space using the shape information, which is a strong prior shared among different datasets and robust to the appearance variation of input data. The shape feature is captured using a Variational Auto-Encoder (VAE) network that is trained with only the ground truth masks.

CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions

3 code implementations CVPR 2019 Runtao Liu, Chenxi Liu, Yutong Bai, Alan Yuille

Yet there has been evidence that current benchmark datasets suffer from bias, and current state-of-the-art models cannot be easily evaluated on their intermediate reasoning process.

Image Segmentation object-detection +9

ELASTIC: Improving CNNs with Dynamic Scaling Policies

1 code implementation CVPR 2019 Huiyu Wang, Aniruddha Kembhavi, Ali Farhadi, Alan Yuille, Mohammad Rastegari

We formulate the scaling policy as a non-linear function inside the network's structure that (a) is learned from data, (b) is instance specific, (c) does not add extra computation, and (d) can be applied on any network architecture.

General Classification Multi-Label Classification +1

Learning Transferable Adversarial Examples via Ghost Networks

1 code implementation 9 Dec 2018 Yingwei Li, Song Bai, Yuyin Zhou, Cihang Xie, Zhishuai Zhang, Alan Yuille

The critical principle of ghost networks is to apply feature-level perturbations to an existing model to potentially create a huge set of diverse models.

Adversarial Attack

Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization

1 code implementation CVPR 2019 Siyuan Qiao, Zhe Lin, Jianming Zhang, Alan Yuille

By simply replacing standard optimizers with Neural Rejuvenation, we are able to improve the performances of neural networks by a very large margin while using similar training efforts and maintaining their original resource usages.

Network Pruning Neural Architecture Search

3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training

no code implementations 29 Nov 2018 Yingda Xia, Fengze Liu, Dong Yang, Jinzheng Cai, Lequan Yu, Zhuotun Zhu, Daguang Xu, Alan Yuille, Holger Roth

Meanwhile, a fully-supervised method based on our approach achieved state-of-the-art performances on both the LiTS liver tumor segmentation and the Medical Segmentation Decathlon (MSD) challenge, demonstrating the robustness and value of our framework, even when fully supervised training is feasible.

Image Segmentation Medical Image Segmentation +2

Robust Face Detection via Learning Small Faces on Hard Images

1 code implementation 28 Nov 2018 Zhishuai Zhang, Wei Shen, Siyuan Qiao, Yan Wang, Bo Wang, Alan Yuille

In this paper, we propose that the robustness of a face detector against hard faces can be improved by learning small faces on hard images.

Face Detection

Semantic Part Detection via Matching: Learning to Generalize to Novel Viewpoints from Limited Training Data

1 code implementation ICCV 2019 Yutong Bai, Qing Liu, Lingxi Xie, Weichao Qiu, Yan Zheng, Alan Yuille

In particular, this enables images in the training dataset to be matched to a virtual 3D model of the object (for simplicity, we assume that the object viewpoint can be estimated by standard techniques).

Semantic Part Detection

OriNet: A Fully Convolutional Network for 3D Human Pose Estimation

1 code implementation 12 Nov 2018 Chenxu Luo, Xiao Chu, Alan Yuille

We use limb orientations as a new way to represent 3D poses and bind the orientation together with the bounding box of each limb region to better associate images and predictions.

3D Human Pose Estimation
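
The limb-orientation representation mentioned in the OriNet entry above can be illustrated as follows. This is a minimal sketch, assuming each limb is encoded as the unit vector from a parent joint to a child joint; the 4-joint skeleton `LIMBS` and both function names are hypothetical and not taken from the paper.

```python
import math

# Hypothetical 4-joint chain: (parent, child) index pairs define the limbs.
# Limbs are listed so that every parent appears before its children.
LIMBS = [(0, 1), (1, 2), (2, 3)]


def limb_orientations(joints, limbs=LIMBS):
    """Return one unit vector per limb, pointing from parent joint to child joint."""
    orientations = []
    for parent, child in limbs:
        diff = [c - p for p, c in zip(joints[parent], joints[child])]
        length = math.sqrt(sum(d * d for d in diff))
        if length == 0.0:
            raise ValueError("zero-length limb has no orientation")
        orientations.append(tuple(d / length for d in diff))
    return orientations


def joints_from_orientations(root, orientations, lengths, limbs=LIMBS):
    """Rebuild joint positions from a root position, unit orientations, and limb lengths."""
    joints = {limbs[0][0]: tuple(root)}
    for (parent, child), direction, length in zip(limbs, orientations, lengths):
        joints[child] = tuple(p + length * d for p, d in zip(joints[parent], direction))
    return [joints[i] for i in sorted(joints)]


pose = [(0, 0, 0), (0, 2, 0), (3, 2, 0), (3, 2, 4)]
orientations = limb_orientations(pose)
rebuilt = joints_from_orientations(pose[0], orientations, lengths=[2, 3, 4])
```

Given known limb lengths and a root position, the orientations are enough to recover the joint coordinates, which is why they can serve as an alternative pose representation.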

Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding

1 code implementation 14 Oct 2018 Chenxu Luo, Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia, Alan Yuille

Performance on the five tasks of depth estimation, optical flow estimation, odometry, moving object segmentation and scene flow estimation shows that our approach outperforms other SoTA methods.

Depth Estimation Optical Flow Estimation +2

Weakly Supervised Region Proposal Network and Object Detection

no code implementations ECCV 2018 Peng Tang, Xinggang Wang, Angtian Wang, Yongluan Yan, Wenyu Liu, Junzhou Huang, Alan Yuille

The Convolutional Neural Network (CNN) based region proposal generation method (i.e., region proposal network), trained using bounding box annotations, is an essential component in modern fully supervised object detectors.

object-detection Region Proposal +1

Rethinking Monocular Depth Estimation with Adversarial Training

no code implementations 22 Aug 2018 Richard Chen, Faisal Mahmood, Alan Yuille, Nicholas J. Durr

Most existing approaches treat depth estimation as a regression problem with a local pixel-wise loss function.

Monocular Depth Estimation

PCL: Proposal Cluster Learning for Weakly Supervised Object Detection

4 code implementations 9 Jul 2018 Peng Tang, Xinggang Wang, Song Bai, Wei Shen, Xiang Bai, Wenyu Liu, Alan Yuille

The iterative instance classifier refinement is implemented online using multiple streams in convolutional neural networks, where the first is an MIL network and the others are for instance classifier refinement supervised by the preceding one.

Multiple Instance Learning object-detection +2

Resisting Large Data Variations via Introspective Transformation Network

no code implementations 16 May 2018 Yunhan Zhao, Ye Tian, Charless Fowlkes, Wei Shen, Alan Yuille

Experimental results verify that our approach significantly improves the ability of deep networks to resist large variations between training and testing data and achieves classification accuracy improvements on several benchmark datasets, including MNIST, affNIST, SVHN, CIFAR-10 and miniImageNet.

Data Augmentation Few-Shot Learning

SampleAhead: Online Classifier-Sampler Communication for Learning from Synthesized Data

no code implementations 1 Apr 2018 Qi Chen, Weichao Qiu, Yi Zhang, Lingxi Xie, Alan Yuille

But this raises an important problem in active vision: given an infinite data space, how to effectively sample a finite subset to train a visual classifier?

Classification General Classification

Adversarial Attacks and Defences Competition

1 code implementation 31 Mar 2018 Alexey Kurakin, Ian Goodfellow, Samy Bengio, Yinpeng Dong, Fangzhou Liao, Ming Liang, Tianyu Pang, Jun Zhu, Xiaolin Hu, Cihang Xie, Jian-Yu Wang, Zhishuai Zhang, Zhou Ren, Alan Yuille, Sangxia Huang, Yao Zhao, Yuzhe Zhao, Zhonglin Han, Junjiajia Long, Yerkebulan Berdibekov, Takuya Akiba, Seiya Tokui, Motoki Abe

To accelerate research on adversarial examples and robustness of machine learning classifiers, Google Brain organized a NIPS 2017 competition that encouraged researchers to develop new methods to generate adversarial examples as well as to develop new ways to defend against them.

BIG-bench Machine Learning

Scene Graph Parsing as Dependency Parsing

1 code implementation NAACL 2018 Yu-Siang Wang, Chenxi Liu, Xiaohui Zeng, Alan Yuille

The scene graphs generated by our learned neural dependency parser achieve an F-score similarity of 49.67% to ground truth graphs on our evaluation set, surpassing the best previous approaches by 5%.

Dependency Parsing Image Retrieval +2
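
The F-score similarity quoted in the scene graph parsing entry above can be illustrated with a generic computation. This is a sketch under assumptions: each graph is flattened into a set of tuples (objects, attributes, relations) and only exact matches count. The matching rules behind the reported 49.67% are not given in this listing, and the function name `f_score` and the example tuples are hypothetical.

```python
def f_score(predicted, reference):
    """Harmonic mean of precision and recall over two collections of graph tuples."""
    predicted, reference = set(predicted), set(reference)
    matched = len(predicted & reference)
    if matched == 0:
        return 0.0
    precision = matched / len(predicted)  # share of predicted tuples that are correct
    recall = matched / len(reference)     # share of reference tuples that were found
    return 2 * precision * recall / (precision + recall)


predicted_graph = {("man",), ("man", "riding", "horse"), ("horse", "brown")}
reference_graph = {("man",), ("horse",), ("man", "riding", "horse"), ("horse", "white")}
score = f_score(predicted_graph, reference_graph)
```

Here two tuples match, so precision is 2/3, recall is 2/4, and the F-score is 4/7.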

Improving Transferability of Adversarial Examples with Input Diversity

1 code implementation CVPR 2019 Cihang Xie, Zhishuai Zhang, Yuyin Zhou, Song Bai, Jian-Yu Wang, Zhou Ren, Alan Yuille

We hope that our proposed attack strategy can serve as a strong benchmark baseline for evaluating the robustness of networks to adversaries and the effectiveness of different defense methods in the future.

Adversarial Attack Image Classification

Deep Co-Training for Semi-Supervised Image Recognition

1 code implementation ECCV 2018 Siyuan Qiao, Wei Shen, Zhishuai Zhang, Bo Wang, Alan Yuille

We present Deep Co-Training, a deep learning based method inspired by the Co-Training framework.

Unleashing the Potential of CNNs for Interpretable Few-Shot Learning

no code implementations ICLR 2018 Boyang Deng, Qing Liu, Siyuan Qiao, Alan Yuille

Our models are based on the idea of encoding objects in terms of visual concepts, which are interpretable visual cues represented by the feature vectors within CNNs.

Few-Shot Learning

Progressive Neural Architecture Search

13 code implementations ECCV 2018 Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, Kevin Murphy

We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms.

Ranked #1 on Neural Architecture Search on ImageNet (Top-1 metric)

General Classification Image Classification +2