Search Results for author: Alan Yuille

Found 159 papers, 73 papers with code

TransMix: Attend to Mix for Vision Transformers

1 code implementation 18 Nov 2021 Jie-Neng Chen, Shuyang Sun, Ju He, Philip Torr, Alan Yuille, Song Bai

The confidence of the label will be larger if the corresponding input image is weighted higher by the attention map.

Instance Segmentation Object Detection +1
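
The attention-weighted labelling described in the TransMix snippet above can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes a CutMix-style binary patch mask and a per-patch attention vector that sums to 1, and the function names and example values are hypothetical.

```python
def attention_mix_weight(attention, mask):
    """Return the label weight for image A.

    attention: per-patch attention weights (assumed non-negative, summing to 1).
    mask: per-patch 0/1 flags, 1 where the mixed input shows a patch from image A.
    """
    if len(attention) != len(mask):
        raise ValueError("attention and mask must have the same length")
    return sum(a * m for a, m in zip(attention, mask))


def mix_labels(label_a, label_b, lam):
    """Blend two label vectors, giving weight lam to label_a."""
    return [lam * ya + (1.0 - lam) * yb for ya, yb in zip(label_a, label_b)]


# Four patches; image A supplies the first two, which receive most of the attention.
attention = [0.4, 0.3, 0.2, 0.1]
mask = [1, 1, 0, 0]
lam = attention_mix_weight(attention, mask)
mixed = mix_labels([1.0, 0.0], [0.0, 1.0], lam)
```

In this toy case image A covers half of the patches, so an area-based rule would assign its label a weight of 0.5, while the attention-based weight is higher because the attended patches come from image A.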

Searching for TrioNet: Combining Convolution with Local and Global Self-Attention

1 code implementation 15 Nov 2021 Huaijin Pi, Huiyu Wang, Yingwei Li, Zizhang Li, Alan Yuille

In order to effectively search in this huge architecture space, we propose Hierarchical Sampling for better training of the supernet.

Neural Architecture Search

iBOT: Image BERT Pre-Training with Online Tokenizer

no code implementations 15 Nov 2021 Jinghao Zhou, Chen Wei, Huiyu Wang, Wei Shen, Cihang Xie, Alan Yuille, Tao Kong

We present a self-supervised framework iBOT that can perform masked prediction with an online tokenizer.

Ranked #1 on Self-Supervised Image Classification on ImageNet (using extra training data)

Fine-tuning Instance Segmentation +4

Occluded Video Instance Segmentation: Dataset and ICCV 2021 Challenge

no code implementations 15 Nov 2021 Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H. S. Torr, Song Bai

To promote the development of occlusion understanding, we collect a large-scale dataset called OVIS for video instance segmentation in the occluded scenario.

Instance Segmentation Object Recognition +3

Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose

1 code implementation NeurIPS 2021 Angtian Wang, Shenxiao Mei, Alan Yuille, Adam Kortylewski

The model is initialized from a few labelled images and is subsequently used to synthesize feature representations of unseen 3D views.

3D Pose Estimation Few-Shot Learning

RobustART: Benchmarking Robustness on Architecture Design and Training Techniques

1 code implementation 11 Sep 2021 Shiyu Tang, Ruihao Gong, Yan Wang, Aishan Liu, Jiakai Wang, Xinyun Chen, Fengwei Yu, Xianglong Liu, Dawn Song, Alan Yuille, Philip H. S. Torr, DaCheng Tao

Thus, we propose RobustART, the first comprehensive Robustness investigation benchmark on ImageNet (including open-source toolkit, pre-trained model zoo, datasets, and analyses) regarding ARchitecture design (44 human-designed off-the-shelf architectures and 1200+ networks from neural architecture search) and Training techniques (10+ general techniques, e.g., data augmentation) towards diverse noises (adversarial, natural, and system noises).

Adversarial Robustness Data Augmentation +1

Progressive Stage-wise Learning for Unsupervised Feature Representation Enhancement

no code implementations CVPR 2021 Zefan Li, Chenxi Liu, Alan Yuille, Bingbing Ni, Wenjun Zhang, Wen Gao

For a given unsupervised task, we design multilevel tasks and define different learning stages for the deep network.

Simulated Adversarial Testing of Face Recognition Models

no code implementations 8 Jun 2021 Nataniel Ruiz, Adam Kortylewski, Weichao Qiu, Cihang Xie, Sarah Adel Bargal, Alan Yuille, Stan Sclaroff

In this work, we propose a framework for learning how to test machine learning algorithms using simulators in an adversarial manner in order to find weaknesses in the model before deploying it in critical scenarios.

Face Recognition

Glance-and-Gaze Vision Transformer

1 code implementation NeurIPS 2021 Qihang Yu, Yingda Xia, Yutong Bai, Yongyi Lu, Alan Yuille, Wei Shen

It is motivated by the Glance and Gaze behavior of human beings when recognizing objects in natural scenes, with the ability to efficiently model both long-range dependencies and local context.

Rethinking Re-Sampling in Imbalanced Semi-Supervised Learning

1 code implementation 1 Jun 2021 Ju He, Adam Kortylewski, Shaokang Yang, Shuai Liu, Cheng Yang, Changhu Wang, Alan Yuille

In particular, we decouple the training of the representation and the classifier, and systematically investigate the effects of different data re-sampling techniques when training the whole network including a classifier as well as fine-tuning the feature extractor only.


Visual analogy: Deep learning versus compositional models

no code implementations 14 May 2021 Nicholas Ichien, Qing Liu, Shuhao Fu, Keith J. Holyoak, Alan Yuille, Hongjing Lu

We compared human performance to that of two recent deep learning models (Siamese Network and Relation Network) directly trained to solve these analogy problems, as well as to that of a compositional model that assesses relational similarity between part-based representations.

Auto-FedAvg: Learnable Federated Averaging for Multi-Institutional Medical Image Segmentation

no code implementations 20 Apr 2021 Yingda Xia, Dong Yang, Wenqi Li, Andriy Myronenko, Daguang Xu, Hirofumi Obinata, Hitoshi Mori, Peng An, Stephanie Harmon, Evrim Turkbey, Baris Turkbey, Bradford Wood, Francesca Patella, Elvira Stellato, Gianpaolo Carrafiello, Anna Ierardi, Alan Yuille, Holger Roth

In this work, we design a new data-driven approach, namely Auto-FedAvg, where aggregation weights are dynamically adjusted, depending on data distributions across data silos and the current training progress of the models.

Federated Learning Lesion Segmentation +1
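
The idea of aggregating client models with adjustable weights, as in the Auto-FedAvg snippet above, can be sketched as a weighted average. This is an illustration under stated assumptions, not the paper's method: the snippet does not say how the weights are parameterised or learned, so the softmax over per-client scores and all names below are hypothetical, and the step that updates the scores is left out.

```python
import math


def softmax(logits):
    """Turn real-valued scores into positive weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(client_params, logits):
    """Weighted average of per-client parameter vectors.

    client_params: list of equal-length parameter lists, one per data silo.
    logits: one score per client (assumed parameterisation, not from the source).
    """
    weights = softmax(logits)
    size = len(client_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, client_params))
        for i in range(size)
    ]


clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
uniform = aggregate(clients, [0.0, 0.0, 0.0])  # equal scores: plain averaging
skewed = aggregate(clients, [0.0, 0.0, 5.0])   # third client dominates
```

With equal scores the result reduces to an unweighted average of the client parameters; raising one client's score pulls the aggregate towards that client's model.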

Self-Supervised Pillar Motion Learning for Autonomous Driving

1 code implementation CVPR 2021 Chenxu Luo, Xiaodong Yang, Alan Yuille

Autonomous driving can benefit from motion behavior comprehension when interacting with diverse traffic participants in highly dynamic environments.

Autonomous Driving Fine-tuning +1

A-SDF: Learning Disentangled Signed Distance Functions for Articulated Shape Representation

1 code implementation ICCV 2021 Jiteng Mu, Weichao Qiu, Adam Kortylewski, Alan Yuille, Nuno Vasconcelos, Xiaolong Wang

To deal with the large shape variance, we introduce Articulated Signed Distance Functions (A-SDF) to represent articulated shapes with a disentangled latent space, where we have separate codes for encoding shape and articulation.

DualNorm-UNet: Incorporating Global and Local Statistics for Robust Medical Image Segmentation

1 code implementation 29 Mar 2021 Junfei Xiao, Lequan Yu, Lei Xing, Alan Yuille, Yuyin Zhou

However, BN only calculates the global statistics at the batch level, and applies the same affine transformation uniformly across all spatial coordinates, which would suppress the image contrast of different semantic structures.

Affine Transformation Medical Image Segmentation
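
The behaviour of batch normalization (BN) that the DualNorm-UNet snippet above refers to can be made concrete with a small sketch for a single channel. It shows standard BN in training mode only, not the DualNorm method; the function name and toy values are illustrative.

```python
def batch_norm_channel(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise one channel of a batch.

    batch: list of images, each a list of rows of floats (a single channel).
    The mean and variance are pooled over every image and pixel, and the
    same gamma/beta are applied at every spatial position.
    """
    values = [v for image in batch for row in image for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = gamma / (var + eps) ** 0.5
    return [[[scale * (v - mean) + beta for v in row] for row in image]
            for image in batch]


batch = [[[0.0, 2.0]], [[4.0, 6.0]]]  # two 1x2 single-channel images
normalised = batch_norm_channel(batch)
```

Because one mean, one variance, and one affine pair are shared by all pixels in the channel, every spatial location is rescaled identically regardless of which structure it belongs to.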

CGPart: A Part Segmentation Dataset Based on 3D Computer Graphics Models

1 code implementation 25 Mar 2021 Qing Liu, Adam Kortylewski, Zhishuai Zhang, Zizhang Li, Mengqi Guo, Qihao Liu, Xiaoding Yuan, Jiteng Mu, Weichao Qiu, Alan Yuille

In this paper, we introduce CGPart, a comprehensive part segmentation dataset that provides detailed annotations on 3D CAD models, synthetic images, and real test images.

Geometric Matching Transfer Learning +1

Weakly Supervised Instance Segmentation for Videos with Temporal Mask Consistency

no code implementations CVPR 2021 Qing Liu, Vignesh Ramanathan, Dhruv Mahajan, Alan Yuille, Zhenheng Yang

However, existing approaches which rely only on image-level class labels predominantly suffer from errors due to (a) partial segmentation of objects and (b) missing object predictions.

Instance Segmentation Semantic Segmentation +1

Understanding Catastrophic Forgetting and Remembering in Continual Learning with Optimal Relevance Mapping

1 code implementation 22 Feb 2021 Prakhar Kaushik, Alex Gain, Adam Kortylewski, Alan Yuille

Additionally, current approaches that deal with forgetting ignore the problem of catastrophic remembering, i.e., the worsening ability to discriminate between data from different tasks.

Continual Learning

Occluded Video Instance Segmentation: A Benchmark

1 code implementation 2 Feb 2021 Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H. S. Torr, Song Bai

On the OVIS dataset, the highest AP achieved by state-of-the-art algorithms is only 16.3, which reveals that we are still at a nascent stage for understanding objects, instances, and videos in a real-world scenario.

Instance Segmentation Semantic Segmentation +2

NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation

1 code implementation ICLR 2021 Angtian Wang, Adam Kortylewski, Alan Yuille

Using differentiable rendering we estimate the 3D object pose by minimizing the reconstruction error between NeMo and the feature representation of the target image.

3D Pose Estimation Contrastive Learning

COMPAS: Representation Learning with Compositional Part Sharing for Few-Shot Classification

no code implementations 28 Jan 2021 Ju He, Adam Kortylewski, Alan Yuille

In particular, during meta-learning, we train a knowledge base that consists of a dictionary of part representations and a dictionary of part activation maps that encode common spatial activation patterns of parts.

Few-Shot Image Classification General Classification +2

Meticulous Object Segmentation

1 code implementation 13 Dec 2020 Chenglin Yang, Yilin Wang, Jianming Zhang, He Zhang, Zhe Lin, Alan Yuille

To evaluate segmentation quality near object boundaries, we propose the Meticulosity Quality (MQ) score considering both the mask coverage and boundary precision.

Semantic Segmentation

Mask Guided Matting via Progressive Refinement Network

1 code implementation CVPR 2021 Qihang Yu, Jianming Zhang, He Zhang, Yilin Wang, Zhe Lin, Ning Xu, Yutong Bai, Alan Yuille

We propose Mask Guided (MG) Matting, a robust matting framework that takes a general coarse mask as guidance.

ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation

1 code implementation CVPR 2021 Siyuan Qiao, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

We name this joint task as Depth-aware Video Panoptic Segmentation, and propose a new evaluation metric along with two derived datasets for it, which will be made available to the public.

Monocular Depth Estimation Panoptic Segmentation

MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers

2 code implementations CVPR 2021 Huiyu Wang, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

As a result, MaX-DeepLab shows a significant 7.1% PQ gain in the box-free regime on the challenging COCO dataset, closing the gap between box-based and box-free methods for the first time.

Panoptic Segmentation

Unsupervised Part Discovery via Feature Alignment

no code implementations 1 Dec 2020 Mengqi Guo, Yutong Bai, Zhishuai Zhang, Adam Kortylewski, Alan Yuille

Specifically, given a training image, we find a set of similar images that show instances of the same object category in the same pose, through an affine alignment of their corresponding feature maps.

Object Recognition

Robustness Out of the Box: Compositional Representations Naturally Defend Against Black-Box Patch Attacks

no code implementations 1 Dec 2020 Christian Cosgrove, Adam Kortylewski, Chenglin Yang, Alan Yuille

Second, we find that compositional deep networks, which have part-based representations that lead to innate robustness to natural occlusion, are robust to patch attacks on PASCAL3D+ and the German Traffic Sign Recognition Benchmark, without adversarial training.

Traffic Sign Recognition

Batch Normalization with Enhanced Linear Transformation

1 code implementation 28 Nov 2020 Yuhui Xu, Lingxi Xie, Cihang Xie, Jieru Mei, Siyuan Qiao, Wei Shen, Hongkai Xiong, Alan Yuille

Batch normalization (BN) is a fundamental unit in modern deep networks, in which a linear transformation module was designed for improving BN's flexibility of fitting complex data distributions.

Weakly-Supervised Amodal Instance Segmentation with Compositional Priors

no code implementations 25 Oct 2020 Yihong Sun, Adam Kortylewski, Alan Yuille

In particular, we extend CompositionalNets to perform three new vision tasks from bounding box supervision only: 1) Learning compositional shape priors of objects in varying 3D poses from modal bounding box supervision; 2) Predicting instance segmentation by integrating the compositional shape priors into the part-voting mechanism in the CompositionalNets; 3) Predicting amodal completion for both the bounding box and the instance segmentation mask by implementing compositional feature alignment in CompositionalNets.

Amodal Instance Segmentation Semantic Segmentation

Shape-Texture Debiased Neural Network Training

1 code implementation ICLR 2021 Yingwei Li, Qihang Yu, Mingxing Tan, Jieru Mei, Peng Tang, Wei Shen, Alan Yuille, Cihang Xie

To prevent models from exclusively attending to a single cue in representation learning, we augment training data with images with conflicting shape and texture information (e.g., an image of chimpanzee shape but with lemon texture) and, most importantly, provide the corresponding supervisions from shape and texture simultaneously.

Adversarial Robustness Data Augmentation +2

CO2: Consistent Contrast for Unsupervised Visual Representation Learning

no code implementations ICLR 2021 Chen Wei, Huiyu Wang, Wei Shen, Alan Yuille

Regarding the similarity of the query crop to each crop from other images as "unlabeled", the consistency term takes the corresponding similarity of a positive crop as a pseudo label, and encourages consistency between these two similarities.

Contrastive Learning Image Classification +3
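
The consistency term described in the CO2 snippet above can be sketched as a divergence between two similarity distributions over crops from other images. This is an illustration under stated assumptions, not the authors' implementation: the softmax with a temperature, the symmetric KL divergence, and all names and values below are choices made for the sketch.

```python
import math


def softmax(scores, temperature=1.0):
    """Convert similarity scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def consistency_loss(query_sims, positive_sims, temperature=0.1):
    """Penalise disagreement between the query's and the positive crop's
    similarities to the same set of crops from other images.

    The positive crop's distribution plays the role of the pseudo label.
    Similarities are assumed to lie in [-1, 1] (e.g. cosine similarity).
    """
    q_dist = softmax(query_sims, temperature)
    p_dist = softmax(positive_sims, temperature)
    return 0.5 * (kl(p_dist, q_dist) + kl(q_dist, p_dist))


agree = consistency_loss([0.2, 0.5, -0.1], [0.2, 0.5, -0.1])
disagree = consistency_loss([0.2, 0.5, -0.1], [0.5, 0.2, -0.1])
```

The loss is zero when the query and the positive crop rank the other images identically, and grows as their similarity patterns diverge.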

CoKe: Localized Contrastive Learning for Robust Keypoint Detection

no code implementations 29 Sep 2020 Yutong Bai, Angtian Wang, Adam Kortylewski, Alan Yuille

In this work, we take a step back and ask: Can we simply learn a local keypoint representation from the output of a standard backbone architecture?

Contrastive Learning Keypoint Detection +1

Lymph Node Gross Tumor Volume Detection in Oncology Imaging via Relationship Learning Using Graph Neural Network

no code implementations 29 Aug 2020 Chun-Hung Chao, Zhuotun Zhu, Dazhou Guo, Ke Yan, Tsung-Ying Ho, Jinzheng Cai, Adam P. Harrison, Xianghua Ye, Jing Xiao, Alan Yuille, Min Sun, Le Lu, Dakai Jin

Specifically, we first utilize a 3D convolutional neural network with ROI-pooling to extract the GTV$_{LN}$'s instance-wise appearance features.

Lymph Node Gross Tumor Volume Detection and Segmentation via Distance-based Gating using 3D CT/PET Imaging in Radiotherapy

no code implementations 27 Aug 2020 Zhuotun Zhu, Dakai Jin, Ke Yan, Tsung-Ying Ho, Xianghua Ye, Dazhou Guo, Chun-Hung Chao, Jing Xiao, Alan Yuille, Le Lu

Finding, identifying and segmenting suspicious cancer metastasized lymph nodes from 3D multi-modality imaging is a clinical task of paramount importance.

ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation

1 code implementation 12 Aug 2020 Hanwen Cao, Yongyi Lu, Cewu Lu, Bo Pang, Gongshen Liu, Alan Yuille

In this paper, we further improve spatio-temporal point cloud feature learning with a flexible module called ASAP considering both attention and structure information across frames, which we find as two important factors for successful segmentation in dynamic point clouds.

Probabilistic Multi-modal Trajectory Prediction with Lane Attention for Autonomous Vehicles

no code implementations 6 Jul 2020 Chenxu Luo, Lin Sun, Dariush Dabiri, Alan Yuille

As for vehicles, their trajectories are significantly influenced by the lane geometry and how to effectively use the lane information is of active interest.

Autonomous Vehicles Motion Forecasting +1

Uncertainty-aware multi-view co-training for semi-supervised medical image segmentation and domain adaptation

no code implementations 28 Jun 2020 Yingda Xia, Dong Yang, Zhiding Yu, Fengze Liu, Jinzheng Cai, Lequan Yu, Zhuotun Zhu, Daguang Xu, Alan Yuille, Holger Roth

Experiments on the NIH pancreas segmentation dataset and a multi-organ segmentation dataset show state-of-the-art performance of the proposed framework on semi-supervised medical image segmentation.

Pancreas Segmentation Unsupervised Domain Adaptation +1

Compositional Convolutional Neural Networks: A Robust and Interpretable Model for Object Recognition under Occlusion

no code implementations 28 Jun 2020 Adam Kortylewski, Qing Liu, Angtian Wang, Yihong Sun, Alan Yuille

The structure of the compositional model enables CompositionalNets to decompose images into objects and context, as well as to further decompose object representations in terms of individual parts and the objects' pose.

Image Classification Object Detection +1

Smooth Adversarial Training

1 code implementation 25 Jun 2020 Cihang Xie, Mingxing Tan, Boqing Gong, Alan Yuille, Quoc V. Le

SAT also works well with larger networks: it helps EfficientNet-L1 to achieve 82.2% accuracy and 58.6% robustness on ImageNet, outperforming the previous state-of-the-art defense by 9.5% for accuracy and 11.6% for robustness.

Adversarial Defense Adversarial Robustness

Detecting Scatteredly-Distributed, Small, and Critically Important Objects in 3D Oncology Imaging via Decision Stratification

no code implementations 27 May 2020 Zhuotun Zhu, Ke Yan, Dakai Jin, Jinzheng Cai, Tsung-Ying Ho, Adam P. Harrison, Dazhou Guo, Chun-Hung Chao, Xianghua Ye, Jing Xiao, Alan Yuille, Le Lu

We focus on the detection and segmentation of oncology-significant (or suspicious cancer metastasized) lymph nodes (OSLNs), which has not been studied before as a computational task.

JSSR: A Joint Synthesis, Segmentation, and Registration System for 3D Multi-Modal Image Alignment of Large-scale Pathological CT Scans

no code implementations ECCV 2020 Fengze Liu, Jinzheng Cai, Yuankai Huo, Chi-Tung Cheng, Ashwin Raju, Dakai Jin, Jing Xiao, Alan Yuille, Le Lu, Chien-Hung Liao, Adam P. Harrison

We extensively evaluate our JSSR system on a large-scale medical image dataset containing 1,485 patient CT imaging studies of four different phases (i.e., 5,940 3D CT scans with pathological livers) on the registration, segmentation and synthesis tasks.

Image Registration Multi-Task Learning +1

Robust Object Detection under Occlusion with Context-Aware CompositionalNets

no code implementations CVPR 2020 Angtian Wang, Yihong Sun, Adam Kortylewski, Alan Yuille

In this work, we propose to overcome two limitations of CompositionalNets which will enable them to detect partially occluded objects: 1) CompositionalNets, as well as other DCNN architectures, do not explicitly separate the representation of the context from the object itself.

Robust Object Detection

Domain Adaptive Relational Reasoning for 3D Multi-Organ Segmentation

no code implementations 18 May 2020 Shuhao Fu, Yongyi Lu, Yan Wang, Yuyin Zhou, Wei Shen, Elliot Fishman, Alan Yuille

In this paper, we present a novel unsupervised domain adaptation (UDA) method, named Domain Adaptive Relational Reasoning (DARR), to generalize 3D multi-organ segmentation models to medical data collected from different scanners and/or protocols (domains).

Relational Reasoning Super-Resolution +1

Organ at Risk Segmentation for Head and Neck Cancer using Stratified Learning and Neural Architecture Search

no code implementations CVPR 2020 Dazhou Guo, Dakai Jin, Zhuotun Zhu, Tsung-Ying Ho, Adam P. Harrison, Chun-Hung Chao, Jing Xiao, Alan Yuille, Chien-Yu Lin, Le Lu

This is the goal of our work, where we introduce stratified organ at risk segmentation (SOARS), an approach that stratifies OARs into anchor, mid-level, and small & hard (S&H) categories.

Neural Architecture Search

Context-Aware Group Captioning via Self-Attention and Contrastive Features

no code implementations CVPR 2020 Zhuowan Li, Quan Tran, Long Mai, Zhe Lin, Alan Yuille

In this paper, we introduce a new task, context-aware group captioning, which aims to describe a group of target images in the context of another group of related reference images.

Image Captioning

Neural Architecture Search for Lightweight Non-Local Networks

2 code implementations CVPR 2020 Yingwei Li, Xiaojie Jin, Jieru Mei, Xiaochen Lian, Linjie Yang, Cihang Xie, Qihang Yu, Yuyin Zhou, Song Bai, Alan Yuille

However, it has been rarely explored to embed the NL blocks in mobile neural networks, mainly due to the following challenges: 1) NL blocks generally have a heavy computation cost, which makes them difficult to apply where computational resources are limited, and 2) it is an open problem to discover an optimal configuration to embed NL blocks into mobile neural networks.

Image Classification Neural Architecture Search

Are Labels Necessary for Neural Architecture Search?

2 code implementations ECCV 2020 Chenxi Liu, Piotr Dollár, Kaiming He, Ross Girshick, Alan Yuille, Saining Xie

Existing neural network architectures in computer vision -- whether designed by humans or by machines -- were typically found using both images and their associated labels.

Neural Architecture Search

Synthesize then Compare: Detecting Failures and Anomalies for Semantic Segmentation

1 code implementation ECCV 2020 Yingda Xia, Yi Zhang, Fengze Liu, Wei Shen, Alan Yuille

The ability to detect failures and anomalies is a fundamental requirement for building reliable systems for computer vision applications, especially safety-critical applications of semantic segmentation, such as autonomous driving and medical image analysis.

Anomaly Detection Autonomous Driving +2

Compositional Convolutional Neural Networks: A Deep Architecture with Innate Robustness to Partial Occlusion

1 code implementation CVPR 2020 Adam Kortylewski, Ju He, Qing Liu, Alan Yuille

Inspired by the success of compositional models at classifying partially occluded objects, we propose to integrate compositional models and DCNNs into a unified deep model with innate robustness to partial occlusion.

General Classification

When Radiology Report Generation Meets Knowledge Graph

no code implementations 19 Feb 2020 Yixiao Zhang, Xiaosong Wang, Ziyue Xu, Qihang Yu, Alan Yuille, Daguang Xu

In addition, we proposed a new evaluation metric for radiology image reporting with the assistance of the same composed graph.

Graph Embedding Image Captioning

AtomNAS: Fine-Grained End-to-End Neural Architecture Search

1 code implementation ICLR 2020 Jieru Mei, Yingwei Li, Xiaochen Lian, Xiaojie Jin, Linjie Yang, Alan Yuille, Jianchao Yang

We propose a fine-grained search space comprised of atomic blocks, a minimal search unit that is much smaller than the ones used in recent NAS algorithms.

Neural Architecture Search

Learning from Synthetic Animals

2 code implementations CVPR 2020 Jiteng Mu, Weichao Qiu, Gregory Hager, Alan Yuille

Despite great success in human parsing, progress for parsing other deformable articulated objects, like animals, is still limited by the lack of labeled data.

Domain Adaptation Human Parsing +1

DASZL: Dynamic Action Signatures for Zero-shot Learning

no code implementations 8 Dec 2019 Tae Soo Kim, Jonathan D. Jones, Michael Peven, Zihao Xiao, Jin Bai, Yi Zhang, Weichao Qiu, Alan Yuille, Gregory D. Hager

There are many realistic applications of activity recognition where the set of potential activity descriptions is combinatorially large.

Action Detection Activity Detection +3

RSA: Randomized Simulation as Augmentation for Robust Human Action Recognition

no code implementations 3 Dec 2019 Yi Zhang, Xinyue Wei, Weichao Qiu, Zihao Xiao, Gregory D. Hager, Alan Yuille

In this paper, we propose the Randomized Simulation as Augmentation (RSA) framework which augments real-world training data with synthetic data to improve the robustness of action recognition networks.

Action Recognition

Deeply Shape-guided Cascade for Instance Segmentation

1 code implementation CVPR 2021 Hao Ding, Siyuan Qiao, Alan Yuille, Wei Shen

The key to a successful cascade architecture for precise instance segmentation is to fully leverage the relationship between bounding box detection and mask segmentation across multiple stages.

Instance Segmentation Region Proposal +1

Identifying Model Weakness with Adversarial Examiner

no code implementations 25 Nov 2019 Michelle Shu, Chenxi Liu, Weichao Qiu, Alan Yuille

Different from the existing strategy to always give the same (distribution of) test data, the adversarial examiner will dynamically select the next test data to hand out based on the testing history so far, with the goal being to undermine the model's performance.

Autonomous Driving Object Classification

Adversarial Examples Improve Image Recognition

6 code implementations CVPR 2020 Cihang Xie, Mingxing Tan, Boqing Gong, Jiang Wang, Alan Yuille, Quoc V. Le

We show that AdvProp improves a wide range of models on various image recognition tasks and performs better when the models are bigger.

Image Classification

Rethinking Normalization and Elimination Singularity in Neural Networks

1 code implementation 21 Nov 2019 Siyuan Qiao, Huiyu Wang, Chenxi Liu, Wei Shen, Alan Yuille

To address this issue, we propose BatchChannel Normalization (BCN), which uses batch knowledge to avoid the elimination singularities in the training of channel-normalized models.

Image Classification Instance Segmentation +2

Localizing Occluders with Compositional Convolutional Networks

no code implementations 18 Nov 2019 Adam Kortylewski, Qing Liu, Huiyu Wang, Zhishuai Zhang, Alan Yuille

Our experimental results demonstrate that the proposed extensions increase the model's performance at localizing occluders as well as at classifying partially occluded objects.

Grouped Spatial-Temporal Aggregation for Efficient Action Recognition

1 code implementation ICCV 2019 Chenxu Luo, Alan Yuille

This decomposition is more parameter-efficient and enables us to quantitatively analyze the contributions of spatial and temporal features in different layers.

Action Recognition

TDAPNet: Prototype Network with Recurrent Top-Down Attention for Robust Object Classification under Partial Occlusion

no code implementations 9 Sep 2019 Mingqing Xiao, Adam Kortylewski, Ruihai Wu, Siyuan Qiao, Wei Shen, Alan Yuille

Despite the great success of deep convolutional neural networks in object classification, they suffer from a severe drop in generalization performance under occlusion due to the inconsistency between training and testing data.

General Classification Object Classification +1

Hyper-Pairing Network for Multi-Phase Pancreatic Ductal Adenocarcinoma Segmentation

no code implementations 3 Sep 2019 Yuyin Zhou, Yingwei Li, Zhishuai Zhang, Yan Wang, Angtian Wang, Elliot Fishman, Alan Yuille, Seyoun Park

Pancreatic ductal adenocarcinoma (PDAC) is one of the most lethal cancers with an overall five-year survival rate of 8%.

Deep Differentiable Random Forests for Age Estimation

no code implementations 23 Jul 2019 Wei Shen, Yilu Guo, Yan Wang, Kai Zhao, Bo Wang, Alan Yuille

Both of them connect split nodes to the top layer of convolutional neural networks (CNNs) and deal with inhomogeneous data by jointly learning input-dependent data partitions at the split nodes and age distributions at the leaf nodes.

Age Estimation

Multi-Scale Attentional Network for Multi-Focal Segmentation of Active Bleed after Pelvic Fractures

no code implementations 23 Jun 2019 Yuyin Zhou, David Dreizin, Yingwei Li, Zhishuai Zhang, Yan Wang, Alan Yuille

Trauma is the worldwide leading cause of death and disability in those younger than 45 years, and pelvic fractures are a major source of morbidity and mortality.

Intriguing properties of adversarial training at scale

no code implementations ICLR 2020 Cihang Xie, Alan Yuille

This two-domain hypothesis may explain the issue of BN when training with a mixture of clean and adversarial images, as estimating normalization statistics of this mixture distribution is challenging.

Adversarial Robustness

V-NAS: Neural Architecture Search for Volumetric Medical Image Segmentation

no code implementations 6 Jun 2019 Zhuotun Zhu, Chenxi Liu, Dong Yang, Alan Yuille, Daguang Xu

Deep learning algorithms, in particular 2D and 3D fully convolutional neural networks (FCNs), have rapidly become the mainstream methodology for volumetric medical image segmentation.

Neural Architecture Search Volumetric Medical Image Segmentation

Combining Compositional Models and Deep Networks For Robust Object Classification under Occlusion

no code implementations 28 May 2019 Adam Kortylewski, Qing Liu, Huiyu Wang, Zhishuai Zhang, Alan Yuille

In this work, we combine DCNNs and compositional object models to retain the best of both approaches: a discriminative model that is robust to partial occlusion and mask attacks.

General Classification Image Classification +1

Robustness of Object Recognition under Extreme Occlusion in Humans and Computational Models

1 code implementation 11 May 2019 Hongru Zhu, Peng Tang, Jeongho Park, Soojin Park, Alan Yuille

We test both humans and the above-mentioned computational models in a challenging task of object recognition under extreme occlusion, where target objects are heavily occluded by irrelevant real objects in real backgrounds.

Object Recognition

Structured Prediction using cGANs with Fusion Discriminator

no code implementations ICLR 2019 Faisal Mahmood, Wenhao Xu, Nicholas J. Durr, Jeremiah W. Johnson, Alan Yuille

We propose the fusion discriminator, a single unified framework for incorporating conditional information into a generative adversarial network (GAN) for a variety of distinct structured prediction tasks, including image synthesis, semantic segmentation, and depth estimation.

Depth Estimation Image Generation +2

Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval

1 code implementation ICCV 2019 Qing Liu, Lingxi Xie, Huiyu Wang, Alan Yuille

Sketch-based image retrieval (SBIR) is widely recognized as an important vision problem which implies a wide range of real-world applications.

Domain Adaptation Sketch-Based Image Retrieval +1

An Alarm System For Segmentation Algorithm Based On Shape Model

no code implementations ICLR 2019 Fengze Liu, Yingda Xia, Dong Yang, Alan Yuille, Daguang Xu

Motivated by this, in this paper, we learn a feature space using the shape information, which is a strong prior shared among different datasets and robust to the appearance variation of input data. The shape feature is captured using a Variational Auto-Encoder (VAE) network that is trained with only the ground truth masks.

CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions

3 code implementations CVPR 2019 Runtao Liu, Chenxi Liu, Yutong Bai, Alan Yuille

Yet there has been evidence that current benchmark datasets suffer from bias, and current state-of-the-art models cannot be easily evaluated on their intermediate reasoning process.

Object Detection Question Answering +5

ELASTIC: Improving CNNs with Dynamic Scaling Policies

1 code implementation CVPR 2019 Huiyu Wang, Aniruddha Kembhavi, Ali Farhadi, Alan Yuille, Mohammad Rastegari

We formulate the scaling policy as a non-linear function inside the network's structure that (a) is learned from data, (b) is instance specific, (c) does not add extra computation, and (d) can be applied on any network architecture.

General Classification Multi-Label Classification +1

Learning Transferable Adversarial Examples via Ghost Networks

1 code implementation 9 Dec 2018 Yingwei Li, Song Bai, Yuyin Zhou, Cihang Xie, Zhishuai Zhang, Alan Yuille

The critical principle of ghost networks is to apply feature-level perturbations to an existing model to potentially create a huge set of diverse models.

Adversarial Attack

Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization

1 code implementation CVPR 2019 Siyuan Qiao, Zhe Lin, Jianming Zhang, Alan Yuille

By simply replacing standard optimizers with Neural Rejuvenation, we are able to improve the performances of neural networks by a very large margin while using similar training efforts and maintaining their original resource usages.

Network Pruning Neural Architecture Search

3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training

no code implementations 29 Nov 2018 Yingda Xia, Fengze Liu, Dong Yang, Jinzheng Cai, Lequan Yu, Zhuotun Zhu, Daguang Xu, Alan Yuille, Holger Roth

Meanwhile, a fully-supervised method based on our approach achieved state-of-the-art performances on both the LiTS liver tumor segmentation and the Medical Segmentation Decathlon (MSD) challenge, demonstrating the robustness and value of our framework, even when fully supervised training is feasible.

Medical Image Segmentation Tumor Segmentation

Robust Face Detection via Learning Small Faces on Hard Images

1 code implementation 28 Nov 2018 Zhishuai Zhang, Wei Shen, Siyuan Qiao, Yan Wang, Bo Wang, Alan Yuille

In this paper, we propose that the robustness of a face detector against hard faces can be improved by learning small faces on hard images.

Face Detection

Semantic Part Detection via Matching: Learning to Generalize to Novel Viewpoints from Limited Training Data

1 code implementation ICCV 2019 Yutong Bai, Qing Liu, Lingxi Xie, Weichao Qiu, Yan Zheng, Alan Yuille

In particular, this enables images in the training dataset to be matched to a virtual 3D model of the object (for simplicity, we assume that the object viewpoint can be estimated by standard techniques).

Semantic Part Detection

OriNet: A Fully Convolutional Network for 3D Human Pose Estimation

1 code implementation 12 Nov 2018 Chenxu Luo, Xiao Chu, Alan Yuille

We use limb orientations as a new way to represent 3D poses and bind the orientation together with the bounding box of each limb region to better associate images and predictions.

3D Human Pose Estimation

Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding

1 code implementation 14 Oct 2018 Chenxu Luo, Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia, Alan Yuille

Performance on the five tasks of depth estimation, optical flow estimation, odometry, moving object segmentation and scene flow estimation shows that our approach outperforms other SoTA methods.

Depth Estimation Optical Flow Estimation +2

Weakly Supervised Region Proposal Network and Object Detection

no code implementations ECCV 2018 Peng Tang, Xinggang Wang, Angtian Wang, Yongluan Yan, Wenyu Liu, Junzhou Huang, Alan Yuille

The Convolutional Neural Network (CNN) based region proposal generation method (i.e., region proposal network), trained using bounding box annotations, is an essential component in modern fully supervised object detectors.

Region Proposal Weakly Supervised Object Detection

Rethinking Monocular Depth Estimation with Adversarial Training

no code implementations 22 Aug 2018 Richard Chen, Faisal Mahmood, Alan Yuille, Nicholas J. Durr

Most existing approaches treat depth estimation as a regression problem with a local pixel-wise loss function.

Monocular Depth Estimation

PCL: Proposal Cluster Learning for Weakly Supervised Object Detection

3 code implementations 9 Jul 2018 Peng Tang, Xinggang Wang, Song Bai, Wei Shen, Xiang Bai, Wenyu Liu, Alan Yuille

The iterative instance classifier refinement is implemented online using multiple streams in convolutional neural networks, where the first is an MIL network and the others are for instance classifier refinement supervised by the preceding one.

Multiple Instance Learning Object Recognition +1

Resisting Large Data Variations via Introspective Transformation Network

no code implementations 16 May 2018 Yunhan Zhao, Ye Tian, Charless Fowlkes, Wei Shen, Alan Yuille

Experimental results verify that our approach significantly improves the ability of deep networks to resist large variations between training and testing data and achieves classification accuracy improvements on several benchmark datasets, including MNIST, affNIST, SVHN, CIFAR-10 and miniImageNet.

Data Augmentation Few-Shot Learning

SampleAhead: Online Classifier-Sampler Communication for Learning from Synthesized Data

no code implementations 1 Apr 2018 Qi Chen, Weichao Qiu, Yi Zhang, Lingxi Xie, Alan Yuille

But this raises an important problem in active vision: given an infinite data space, how to effectively sample a finite subset to train a visual classifier?

Classification General Classification

Adversarial Attacks and Defences Competition

1 code implementation 31 Mar 2018 Alexey Kurakin, Ian Goodfellow, Samy Bengio, Yinpeng Dong, Fangzhou Liao, Ming Liang, Tianyu Pang, Jun Zhu, Xiaolin Hu, Cihang Xie, Jian-Yu Wang, Zhishuai Zhang, Zhou Ren, Alan Yuille, Sangxia Huang, Yao Zhao, Yuzhe Zhao, Zhonglin Han, Junjiajia Long, Yerkebulan Berdibekov, Takuya Akiba, Seiya Tokui, Motoki Abe

To accelerate research on adversarial examples and robustness of machine learning classifiers, Google Brain organized a NIPS 2017 competition that encouraged researchers to develop new methods to generate adversarial examples as well as to develop new ways to defend against them.

Scene Graph Parsing as Dependency Parsing

1 code implementation NAACL 2018 Yu-Siang Wang, Chenxi Liu, Xiaohui Zeng, Alan Yuille

The scene graphs generated by our learned neural dependency parser achieve an F-score similarity of 49.67% to ground truth graphs on our evaluation set, surpassing best previous approaches by 5%.

Dependency Parsing Image Retrieval +1

Improving Transferability of Adversarial Examples with Input Diversity

1 code implementation CVPR 2019 Cihang Xie, Zhishuai Zhang, Yuyin Zhou, Song Bai, Jian-Yu Wang, Zhou Ren, Alan Yuille

We hope that our proposed attack strategy can serve as a strong benchmark baseline for evaluating the robustness of networks to adversaries and the effectiveness of different defense methods in the future.

Adversarial Attack Image Classification

Deep Co-Training for Semi-Supervised Image Recognition

1 code implementation ECCV 2018 Siyuan Qiao, Wei Shen, Zhishuai Zhang, Bo Wang, Alan Yuille

We present Deep Co-Training, a deep learning based method inspired by the Co-Training framework.

Unleashing the Potential of CNNs for Interpretable Few-Shot Learning

no code implementations ICLR 2018 Boyang Deng, Qing Liu, Siyuan Qiao, Alan Yuille

Our models are based on the idea of encoding objects in terms of visual concepts, which are interpretable visual cues represented by the feature vectors within CNNs.

Few-Shot Learning

Deep Regression Forests for Age Estimation

2 code implementations CVPR 2018 Wei Shen, Yilu Guo, Yan Wang, Kai Zhao, Bo Wang, Alan Yuille

Age estimation from facial images is typically cast as a nonlinear regression problem.

Age Estimation

Progressive Neural Architecture Search

10 code implementations ECCV 2018 Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, Kevin Murphy

We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms.

General Classification Image Classification +1

Gradually Updated Neural Networks for Large-Scale Image Recognition

no code implementations ICML 2018 Siyuan Qiao, Zhishuai Zhang, Wei Shen, Bo Wang, Alan Yuille

Our method introduces computation orderings to the channels within convolutional layers or blocks, based on which we gradually compute the outputs in a channel-wise manner.

Few-shot Learning by Exploiting Visual Concepts within CNNs

no code implementations 22 Nov 2017 Boyang Deng, Qing Liu, Siyuan Qiao, Alan Yuille

In this work, we address these limitations of CNNs by developing novel, flexible, and interpretable models for few-shot learning.

Few-Shot Learning

Visual Concepts and Compositional Voting

no code implementations 13 Nov 2017 Jianyu Wang, Zhishuai Zhang, Cihang Xie, Yuyin Zhou, Vittal Premachandran, Jun Zhu, Lingxi Xie, Alan Yuille

We use clustering algorithms to study the population activities of the features and extract a set of visual concepts which we show are visually tight and correspond to semantic parts of vehicles.

Semantic Part Detection

Joint Multi-Person Pose Estimation and Semantic Part Segmentation

no code implementations CVPR 2017 Fangting Xia, Peng Wang, Xianjie Chen, Alan Yuille

To refine part segments, the refined pose and the original part potential are integrated through a Part FCN, where the skeleton feature from pose serves as additional regularization cues for part segments.

Human Detection Multi-Person Pose Estimation

Detecting Semantic Parts on Partially Occluded Objects

no code implementations 25 Jul 2017 Jianyu Wang, Cihang Xie, Zhishuai Zhang, Jun Zhu, Lingxi Xie, Alan Yuille

Our approach detects semantic parts by accumulating the confidence of local visual cues.

Semantic Part Detection

ScaleNet: Guiding Object Proposal Generation in Supermarkets and Beyond

no code implementations ICCV 2017 Siyuan Qiao, Wei Shen, Weichao Qiu, Chenxi Liu, Alan Yuille

We argue that estimation of object scales in images is helpful for generating object proposals, especially for supermarket images where object scales are usually within a small range.

Object Proposal Generation

Transfer of View-manifold Learning to Similarity Perception of Novel Objects

no code implementations 31 Mar 2017 Xingyu Lin, Hao Wang, Zhihao Li, Yimeng Zhang, Alan Yuille, Tai Sing Lee

We develop a model of perceptual similarity judgment based on re-training a deep convolution neural network (DCNN) that learns to associate different views of each 3D object to capture the notion of object persistence and continuity in our visual experience.

Metric Learning

Multi-stage Multi-recursive-input Fully Convolutional Networks for Neuronal Boundary Detection

no code implementations ICCV 2017 Wei Shen, Bin Wang, Yuan Jiang, Yan Wang, Alan Yuille

This design is biologically plausible, as it resembles the way a human visual system compares different possible segmentation solutions to address the ambiguous boundary issue.

Boundary Detection Electron Microscopy

Adversarial Examples for Semantic Segmentation and Object Detection

2 code implementations ICCV 2017 Cihang Xie, Jian-Yu Wang, Zhishuai Zhang, Yuyin Zhou, Lingxi Xie, Alan Yuille

Our observation is that both segmentation and detection are based on classifying multiple targets on an image (e.g., the basic target is a pixel or a receptive field in segmentation, and an object proposal in detection), which inspires us to optimize a loss function over a set of pixels/proposals for generating adversarial perturbations.

Adversarial Attack Object Detection +1

Recurrent Multimodal Interaction for Referring Image Segmentation

1 code implementation ICCV 2017 Chenxi Liu, Zhe Lin, Xiaohui Shen, Jimei Yang, Xin Lu, Alan Yuille

In this paper we are interested in the problem of image segmentation given natural language descriptions, i.e., referring expressions.

Semantic Segmentation

SORT: Second-Order Response Transform for Visual Recognition

no code implementations ICCV 2017 Yan Wang, Lingxi Xie, Chenxi Liu, Ya Zhang, Wenjun Zhang, Alan Yuille

In this paper, we reveal the importance and benefits of introducing second-order operations into deep neural networks.

Genetic CNN

1 code implementation ICCV 2017 Lingxi Xie, Alan Yuille

The deep Convolutional Neural Network (CNN) is the state-of-the-art solution for large-scale visual recognition.

Object Recognition

Deep Collaborative Learning for Visual Recognition

no code implementations 3 Mar 2017 Yan Wang, Lingxi Xie, Ya Zhang, Wenjun Zhang, Alan Yuille

We formulate the function of a convolutional layer as learning a large visual vocabulary, and propose an alternative way, namely Deep Collaborative Learning (DCL), to reduce the computational complexity.

General Classification Image Classification

Label Distribution Learning Forests

no code implementations NeurIPS 2017 Wei Shen, Kai Zhao, Yilu Guo, Alan Yuille

This paper presents label distribution learning forests (LDLFs) - a novel label distribution learning algorithm based on differentiable decision trees, which have several advantages: 1) Decision trees have the potential to model any general form of label distributions by a mixture of leaf node predictions.

Representation Learning

MAT: A Multimodal Attentive Translator for Image Captioning

no code implementations 18 Feb 2017 Chang Liu, Fuchun Sun, Changhu Wang, Feng Wang, Alan Yuille

In this way, the sequential representation of an image can be naturally translated to a sequence of words, as the target sequence of the RNN model.

Image Captioning Machine Translation +1

UnrealStereo: Controlling Hazardous Factors to Analyze Stereo Vision

no code implementations 14 Dec 2016 Yi Zhang, Weichao Qiu, Qi Chen, Xiaolin Hu, Alan Yuille

We generate a large synthetic image dataset with automatically computed hazardous regions and analyze algorithms on these regions.

Image Generation

Symmetric Non-Rigid Structure from Motion for Category-Specific Object Structure Estimation

no code implementations 22 Sep 2016 Yuan Gao, Alan Yuille

This paper addresses the estimation of 3D structures of symmetric objects from multiple images of the same object category, e.g., different cars, seen from various viewpoints.

Structure from Motion

DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images

1 code implementation 13 Sep 2016 Wei Shen, Kai Zhao, Yuan Jiang, Yan Wang, Xiang Bai, Alan Yuille

By observing the relationship between the receptive field sizes of the different layers in the network and the skeleton scales they can capture, we introduce two scale-associated side outputs to each stage of the network.

Multi-Task Learning Object Detection +1

UnrealCV: Connecting Computer Vision to Unreal Engine

1 code implementation 5 Sep 2016 Weichao Qiu, Alan Yuille

Computer graphics can not only generate synthetic images and ground truth but it also offers the possibility of constructing virtual worlds in which: (i) an agent can perceive, navigate, and take actions guided by AI algorithms, (ii) properties of the worlds can be modified (e.g., material and reflectance), (iii) physical simulations can be performed, and (iv) algorithms can be learnt and evaluated.

Physical Simulations

Geometric Neural Phrase Pooling: Modeling the Spatial Co-occurrence of Neurons

no code implementations 21 Jul 2016 Lingxi Xie, Qi Tian, John Flynn, Jingdong Wang, Alan Yuille

For this, we consider the neurons in the hidden layer as neural words, and construct a set of geometric neural phrases on top of them.

Image Classification

Attention Correctness in Neural Image Captioning

no code implementations 31 May 2016 Chenxi Liu, Junhua Mao, Fei Sha, Alan Yuille

Attention mechanisms have recently been introduced in deep learning for various tasks in natural language processing and computer vision.

Image Captioning

InterActive: Inter-Layer Activeness Propagation

no code implementations CVPR 2016 Lingxi Xie, Liang Zheng, Jingdong Wang, Alan Yuille, Qi Tian

An increasing number of computer vision tasks can be tackled with deep features, which are the intermediate outputs of a pre-trained Convolutional Neural Network.

General Classification

Multi-Instance Visual-Semantic Embedding

no code implementations 22 Dec 2015 Zhou Ren, Hailin Jin, Zhe Lin, Chen Fang, Alan Yuille

Visual-semantic embedding models have been recently proposed and shown to be effective for image classification and zero-shot learning, by mapping images into a continuous semantic label space.

General Classification Image Classification +1

Ground-truth dataset and baseline evaluations for image base-detail separation algorithms

no code implementations 21 Nov 2015 Xuan Dong, Boyan Bonev, Weixin Li, Weichao Qiu, Xianjie Chen, Alan Yuille

Base-detail separation is a fundamental computer vision problem consisting of modeling a smooth base layer with the coarse structures, and a detail layer containing the texture-like structures.

Unsupervised learning of object semantic parts from internal states of CNNs by population encoding

1 code implementation 21 Nov 2015 Jianyu Wang, Zhishuai Zhang, Cihang Xie, Vittal Premachandran, Alan Yuille

We address the key question of how object part representations can be found from the internal states of CNNs that are trained for high-level tasks, such as object classification.

Keypoint Detection Object Classification

Fidelity-Naturalness Evaluation of Single Image Super Resolution

no code implementations 21 Nov 2015 Xuan Dong, Yu Zhu, Weixin Li, Lingxi Xie, Alex Wong, Alan Yuille

In this paper, we proposed to use both fidelity (the difference with original images) and naturalness (human visual perception of super resolved images) for evaluation.

Image Quality Assessment Image Super-Resolution

DOC: Deep OCclusion Estimation From a Single Image

no code implementations 20 Nov 2015 Peng Wang, Alan Yuille

In this paper we propose a deep network architecture, called DOC, which acts on a single image, detects object boundaries and estimates the border ownership (i.e., which side of the boundary is foreground and which is background).

Occlusion Estimation

Generation and Comprehension of Unambiguous Object Descriptions

1 code implementation CVPR 2016 Junhua Mao, Jonathan Huang, Alexander Toshev, Oana Camburu, Alan Yuille, Kevin Murphy

We propose a method that can generate an unambiguous description (known as a referring expression) of a specific object or region in an image, and which can also comprehend or interpret such an expression to infer which object is being described.

Image Captioning

Pose-Guided Human Parsing with Deep Learned Features

no code implementations 17 Aug 2015 Fangting Xia, Jun Zhu, Peng Wang, Alan Yuille

Parsing the human body into semantic regions is crucial to human-centric analysis.

Human Parsing

Joint Object and Part Segmentation using Deep Learned Potentials

no code implementations ICCV 2015 Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, Alan Yuille

Segmenting semantic objects from images and parsing them into their respective semantic parts are fundamental steps towards detailed object understanding in computer vision.

Semantic Segmentation

Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images

1 code implementation ICCV 2015 Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Zhiheng Huang, Alan Yuille

In particular, we propose a transposed weight sharing scheme, which not only improves performance on image captioning, but also makes the model more suitable for the novel concept learning task.

Image Captioning

Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

2 code implementations 20 Dec 2014 Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Zhiheng Huang, Alan Yuille

In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions.

Image Captioning

Semantic Part Segmentation using Compositional Model combining Shape and Appearance

no code implementations CVPR 2015 Jianyu Wang, Alan Yuille

This is more challenging than standard object detection, object segmentation and pose estimation tasks because semantic parts of animals often have similar appearance and highly varying shapes.

Object Detection Pose Estimation +1

Parsing Occluded People by Flexible Compositions

no code implementations CVPR 2015 Xianjie Chen, Alan Yuille

We model humans using a graphical model which has a tree structure building on recent work [32, 6] and exploit the connectivity prior that, even in presence of occlusion, the visible nodes form a connected subtree of the graphical model.

Articulated Pose Estimation by a Graphical Model with Image Dependent Pairwise Relations

no code implementations NeurIPS 2014 Xianjie Chen, Alan Yuille

More precisely, we specify a graphical model for human pose which exploits the fact that local image measurements can be used both to detect parts (or joints) and also to predict the spatial relationships between them (Image Dependent Pairwise Relations).

Pose Estimation

Human-Machine CRFs for Identifying Bottlenecks in Holistic Scene Understanding

no code implementations 16 Jun 2014 Roozbeh Mottaghi, Sanja Fidler, Alan Yuille, Raquel Urtasun, Devi Parikh

Recent trends in image understanding have pushed for holistic scene understanding models that jointly reason about various tasks such as object detection, scene recognition, shape analysis, contextual reasoning, and local appearance based classifiers.

Object Detection Scene Recognition +2

Parsing Semantic Parts of Cars Using Graphical Models and Segment Appearance Consistency

no code implementations 9 Jun 2014 Wenhao Lu, Xiaochen Lian, Alan Yuille

A novel mixture of graphical models is proposed, which dynamically couples the landmarks to a hierarchy of segments.

Bottom-Up Segmentation for Top-Down Detection

no code implementations CVPR 2013 Sanja Fidler, Roozbeh Mottaghi, Alan Yuille, Raquel Urtasun

When employing the parts, we outperform the original DPM [14] in 19 out of 20 classes, achieving an improvement of 8% AP.

Object Detection Semantic Segmentation

Boundary Detection Benchmarking: Beyond F-Measures

no code implementations CVPR 2013 Xiaodi Hou, Alan Yuille, Christof Koch

For an ill-posed problem like boundary detection, human labeled datasets play a critical role.

Boundary Detection
