Search Results for author: Hartwig Adam

Found 48 papers, 37 papers with code

VideoPrism: A Foundational Visual Encoder for Video Understanding

no code implementations • 20 Feb 2024 • Long Zhao, Nitesh B. Gundavarapu, Liangzhe Yuan, Hao Zhou, Shen Yan, Jennifer J. Sun, Luke Friedman, Rui Qian, Tobias Weyand, Yue Zhao, Rachel Hornung, Florian Schroff, Ming-Hsuan Yang, David A. Ross, Huisheng Wang, Hartwig Adam, Mikhail Sirotenko, Ting Liu, Boqing Gong

We introduce VideoPrism, a general-purpose video encoder that tackles diverse video understanding tasks with a single frozen model.

Question Answering Video Question Answering +1

Distilling Vision-Language Models on Millions of Videos

no code implementations • 11 Jan 2024 • Yue Zhao, Long Zhao, Xingyi Zhou, Jialin Wu, Chun-Te Chu, Hui Miao, Florian Schroff, Hartwig Adam, Ting Liu, Boqing Gong, Philipp Krähenbühl, Liangzhe Yuan

Our best model outperforms state-of-the-art methods on MSR-VTT zero-shot text-to-video retrieval by 6%.

Language Modelling Retrieval +2

VideoPoet: A Large Language Model for Zero-Shot Video Generation

no code implementations • 21 Dec 2023 • Dan Kondratyuk, Lijun Yu, Xiuye Gu, José Lezama, Jonathan Huang, Grant Schindler, Rachel Hornung, Vighnesh Birodkar, Jimmy Yan, Ming-Chang Chiu, Krishna Somandepalli, Hassan Akbari, Yair Alon, Yong Cheng, Josh Dillon, Agrim Gupta, Meera Hahn, Anja Hauth, David Hendon, Alonso Martinez, David Minnen, Mikhail Sirotenko, Kihyuk Sohn, Xuan Yang, Hartwig Adam, Ming-Hsuan Yang, Irfan Essa, Huisheng Wang, David A. Ross, Bryan Seybold, Lu Jiang

We present VideoPoet, a language model capable of synthesizing high-quality video, with matching audio, from a large variety of conditioning signals.

Ranked #3 on Text-to-Video Generation on MSR-VTT

Language Modelling Large Language Model +2

PolyMaX: General Dense Prediction with Mask Transformer

1 code implementation • 9 Nov 2023 • Xuan Yang, Liangzhe Yuan, Kimberly Wilber, Astuti Sharma, Xiuye Gu, Siyuan Qiao, Stephanie Debats, Huisheng Wang, Hartwig Adam, Mikhail Sirotenko, Liang-Chieh Chen

Despite this shift, methods based on the per-pixel prediction paradigm still dominate the benchmarks on the other dense prediction tasks that require continuous outputs, such as depth estimation and surface normal prediction.

Ranked #1 on Surface Normals Estimation on NYU Depth v2

Monocular Depth Estimation Semantic Segmentation +2

985

SANPO: A Scene Understanding, Accessibility, Navigation, Pathfinding, Obstacle Avoidance Dataset

no code implementations • 21 Sep 2023 • Sagar M. Waghmare, Kimberly Wilber, Dave Hawkey, Xuan Yang, Matthew Wilson, Stephanie Debats, Cattalyya Nuengsigkapian, Astuti Sharma, Lars Pandikow, Huisheng Wang, Hartwig Adam, Mikhail Sirotenko

All synthetic sessions and a subset of real sessions have temporally consistent dense panoptic segmentation labels.

Depth Estimation Domain Adaptation +5

VideoGLUE: Video General Understanding Evaluation of Foundation Models

1 code implementation • 6 Jul 2023 • Liangzhe Yuan, Nitesh Bharadwaj Gundavarapu, Long Zhao, Hao Zhou, Yin Cui, Lu Jiang, Xuan Yang, Menglin Jia, Tobias Weyand, Luke Friedman, Mikhail Sirotenko, Huisheng Wang, Florian Schroff, Hartwig Adam, Ming-Hsuan Yang, Ting Liu, Boqing Gong

We evaluate existing foundation models' video understanding capabilities using a carefully designed experiment protocol consisting of three hallmark tasks (action recognition, temporal localization, and spatiotemporal localization), eight datasets well received by the community, and four adaptation methods tailoring a foundation model (FM) for a downstream task.

Action Recognition Temporal Localization +1

76,579

Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception

no code implementations • NeurIPS 2023 • Hassan Akbari, Dan Kondratyuk, Yin Cui, Rachel Hornung, Huisheng Wang, Hartwig Adam

We conduct extensive empirical studies and reveal the following key insights: 1) Performing gradient descent updates by alternating on diverse modalities, loss functions, and tasks, with varying input resolutions, efficiently improves the model.

Ranked #1 on Zero-Shot Action Recognition on Kinetics (using extra training data)

Classification Image Classification +7

Unified Visual Relationship Detection with Vision and Language Models

1 code implementation • ICCV 2023 • Long Zhao, Liangzhe Yuan, Boqing Gong, Yin Cui, Florian Schroff, Ming-Hsuan Yang, Hartwig Adam, Ting Liu

To address this challenge, we propose UniVRD, a novel bottom-up method for Unified Visual Relationship Detection by leveraging vision and language models (VLMs).

Human-Object Interaction Detection Relationship Detection +2

2,988

Improving Zero-shot Generalization and Robustness of Multi-modal Models

1 code implementation • CVPR 2023 • Yunhao Ge, Jie Ren, Andrew Gallagher, Yuxiao Wang, Ming-Hsuan Yang, Hartwig Adam, Laurent Itti, Balaji Lakshminarayanan, Jiaping Zhao

We also show that our method improves across ImageNet shifted datasets, four other datasets, and other model architectures such as LiT.

Image Classification Zero-shot Generalization

MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models

2 code implementations • 4 Oct 2022 • Chenglin Yang, Siyuan Qiao, Qihang Yu, Xiaoding Yuan, Yukun Zhu, Alan Yuille, Hartwig Adam, Liang-Chieh Chen

The tiny-MOAT family is also benchmarked on downstream tasks, serving as a baseline for the community.

Ranked #1 on Object Detection on MS COCO

Image Classification Instance Segmentation +2

985

Exploring Fine-Grained Audiovisual Categorization with the SSW60 Dataset

1 code implementation • 21 Jul 2022 • Grant Van Horn, Rui Qian, Kimberly Wilber, Hartwig Adam, Oisin Mac Aodha, Serge Belongie

We thoroughly benchmark audiovisual classification performance and modality fusion experiments through the use of state-of-the-art transformer methods.

Fine-Grained Visual Categorization Video Classification

kMaX-DeepLab: k-means Mask Transformer

2 code implementations • 8 Jul 2022 • Qihang Yu, Huiyu Wang, Siyuan Qiao, Maxwell Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

However, we observe that most existing transformer-based vision models simply borrow the idea from NLP, neglecting the crucial difference between languages and images, particularly the extremely large sequence length of spatially flattened pixel features.

Ranked #2 on Panoptic Segmentation on COCO test-dev

Clustering Object Detection +1

985

CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation

2 code implementations • CVPR 2022 • Qihang Yu, Huiyu Wang, Dahun Kim, Siyuan Qiao, Maxwell Collins, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

We propose Clustering Mask Transformer (CMT-DeepLab), a transformer-based framework for panoptic segmentation designed around clustering.

Ranked #6 on Panoptic Segmentation on COCO test-dev

Clustering Panoptic Segmentation +1

TubeFormer-DeepLab: Video Mask Transformer

no code implementations • CVPR 2022 • Dahun Kim, Jun Xie, Huiyu Wang, Siyuan Qiao, Qihang Yu, Hong-Seok Kim, Hartwig Adam, In So Kweon, Liang-Chieh Chen

We present TubeFormer-DeepLab, the first attempt to tackle multiple core video segmentation tasks in a unified manner.

Panoptic Segmentation Segmentation +2

Adaptive Transformers for Robust Few-shot Cross-domain Face Anti-spoofing

no code implementations • 23 Mar 2022 • Hsin-Ping Huang, Deqing Sun, Yaojie Liu, Wen-Sheng Chu, Taihong Xiao, Jinwei Yuan, Hartwig Adam, Ming-Hsuan Yang

While recent face anti-spoofing methods perform well under the intra-domain setups, an effective approach needs to account for much larger appearance variations of images acquired in complex scenes with different sensors for robust performance.

Face Anti-Spoofing

Surrogate Gap Minimization Improves Sharpness-Aware Training

1 code implementation • ICLR 2022 • Juntang Zhuang, Boqing Gong, Liangzhe Yuan, Yin Cui, Hartwig Adam, Nicha Dvornek, Sekhar Tatikonda, James Duncan, Ting Liu

Instead, we define a surrogate gap, a measure equivalent to the dominant eigenvalue of the Hessian at a local minimum when the radius of the neighborhood (to derive the perturbed loss) is small.

9,224

Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision

1 code implementation • CVPR 2022 • Liangzhe Yuan, Rui Qian, Yin Cui, Boqing Gong, Florian Schroff, Ming-Hsuan Yang, Hartwig Adam, Ting Liu

Modern self-supervised learning algorithms typically enforce persistency of instance representations across views.

Action Recognition Contrastive Learning +4

76,579

Exploring Temporal Granularity in Self-Supervised Video Representation Learning

no code implementations • 8 Dec 2021 • Rui Qian, Yeqing Li, Liangzhe Yuan, Boqing Gong, Ting Liu, Matthew Brown, Serge Belongie, Ming-Hsuan Yang, Hartwig Adam, Yin Cui

The training objective consists of two parts: a fine-grained temporal learning objective to maximize the similarity between corresponding temporal embeddings in the short clip and the long clip, and a persistent temporal learning objective to pull together global embeddings of the two clips.

Representation Learning Self-Supervised Learning

DeepLab2: A TensorFlow Library for Deep Labeling

4 code implementations • 17 Jun 2021 • Mark Weber, Huiyu Wang, Siyuan Qiao, Jun Xie, Maxwell D. Collins, Yukun Zhu, Liangzhe Yuan, Dahun Kim, Qihang Yu, Daniel Cremers, Laura Leal-Taixe, Alan L. Yuille, Florian Schroff, Hartwig Adam, Liang-Chieh Chen

DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a state-of-the-art and easy-to-use TensorFlow codebase for general dense pixel prediction problems in computer vision.

985

2.5D Visual Relationship Detection

1 code implementation • 26 Apr 2021 • Yu-Chuan Su, Soravit Changpinyo, Xiangning Chen, Sathish Thoppay, Cho-Jui Hsieh, Lior Shapira, Radu Soricut, Hartwig Adam, Matthew Brown, Ming-Hsuan Yang, Boqing Gong

To enable progress on this task, we create a new dataset consisting of 220K human-annotated 2.5D relationships among 512K objects from 11K images.

Benchmarking Depth Estimation +2

STEP: Segmenting and Tracking Every Pixel

1 code implementation • 23 Feb 2021 • Mark Weber, Jun Xie, Maxwell Collins, Yukun Zhu, Paul Voigtlaender, Hartwig Adam, Bradley Green, Andreas Geiger, Bastian Leibe, Daniel Cremers, Aljoša Ošep, Laura Leal-Taixé, Liang-Chieh Chen

The task of assigning semantic classes and track identities to every pixel in a video is called video panoptic segmentation.

Segmentation Video Panoptic Segmentation

985

ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation

1 code implementation • CVPR 2021 • Siyuan Qiao, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

We name this joint task Depth-aware Video Panoptic Segmentation, and propose a new evaluation metric along with two derived datasets for it, which will be made available to the public.

Ranked #1 on Video Panoptic Segmentation on Cityscapes-VPS (using extra training data)

Depth-aware Video Panoptic Segmentation Monocular Depth Estimation +2

212

Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization

1 code implementation • CVPR 2021 • Long Zhao, Yuxiao Wang, Jiaping Zhao, Liangzhe Yuan, Jennifer J. Sun, Florian Schroff, Hartwig Adam, Xi Peng, Dimitris Metaxas, Ting Liu

To evaluate the power of the learned representations, in addition to the conventional fully-supervised action recognition settings, we introduce a novel task called single-shot cross-view action recognition.

Action Recognition Contrastive Learning +1

32,745

MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers

3 code implementations • CVPR 2021 • Huiyu Wang, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

As a result, MaX-DeepLab shows a significant 7.1% PQ gain in the box-free regime on the challenging COCO dataset, closing the gap between box-based and box-free methods for the first time.

Ranked #12 on Panoptic Segmentation on COCO test-dev

Panoptic Segmentation

985

View-Invariant, Occlusion-Robust Probabilistic Embedding for Human Pose

2 code implementations • 23 Oct 2020 • Ting Liu, Jennifer J. Sun, Long Zhao, Jiaping Zhao, Liangzhe Yuan, Yuxiao Wang, Liang-Chieh Chen, Florian Schroff, Hartwig Adam

Recognition of human poses and actions is crucial for autonomous systems to interact smoothly with people.

3D Pose Estimation Action Recognition +2

32,745

Naive-Student: Leveraging Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation

1 code implementation • ECCV 2020 • Liang-Chieh Chen, Raphael Gontijo Lopes, Bowen Cheng, Maxwell D. Collins, Ekin D. Cubuk, Barret Zoph, Hartwig Adam, Jonathon Shlens

We view this work as a notable step towards building a simple procedure to harness unlabeled video sequences and extra images to surpass state-of-the-art performance on core computer vision tasks.

Image Segmentation Optical Flow Estimation +4

76,579

Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset

5 code implementations • ECCV 2020 • Menglin Jia, Mengyun Shi, Mikhail Sirotenko, Yin Cui, Claire Cardie, Bharath Hariharan, Hartwig Adam, Serge Belongie

In this work we explore the task of instance segmentation with attribute localization, which unifies instance segmentation (detect and segment each object instance) and fine-grained visual attribute categorization (recognize one or multiple attributes).

Attribute Fine-Grained Visual Categorization +5

5,176

Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation

5 code implementations • ECCV 2020 • Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

In this paper, we attempt to remove this constraint by factorizing 2D self-attention into two 1D self-attentions.

Ranked #4 on Panoptic Segmentation on Cityscapes val (using extra training data)

Image Classification Panoptic Segmentation +1

1,140

EEV: A Large-Scale Dataset for Studying Evoked Expressions from Video

1 code implementation • 15 Jan 2020 • Jennifer J. Sun, Ting Liu, Alan S. Cowen, Florian Schroff, Hartwig Adam, Gautam Prasad

The ability to predict evoked affect from a video, before viewers watch the video, can help in content creation and video recommendation.

Recommendation Systems Transfer Learning +1

View-Invariant Probabilistic Embedding for Human Pose

2 code implementations • ECCV 2020 • Jennifer J. Sun, Jiaping Zhao, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, Ting Liu

Depictions of similar human body configurations can vary with changing viewpoints.

Ranked #1 on Pose Retrieval on MPI-INF-3DHP

Action Recognition Pose Retrieval +2

32,745

Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation

9 code implementations • CVPR 2020 • Bowen Cheng, Maxwell D. Collins, Yukun Zhu, Ting Liu, Thomas S. Huang, Hartwig Adam, Liang-Chieh Chen

In this work, we introduce Panoptic-DeepLab, a simple, strong, and fast system for panoptic segmentation, aiming to establish a solid baseline for bottom-up methods that can achieve comparable performance of two-stage methods while yielding fast inference speed.

Ranked #6 on Panoptic Segmentation on Cityscapes test (using extra training data)

Instance Segmentation Panoptic Segmentation +1

76,582

Panoptic-DeepLab

2 code implementations • 10 Oct 2019 • Bowen Cheng, Maxwell D. Collins, Yukun Zhu, Ting Liu, Thomas S. Huang, Hartwig Adam, Liang-Chieh Chen

The semantic segmentation branch is the same as the typical design of any semantic segmentation model (e.g., DeepLab), while the instance segmentation branch is class-agnostic, involving a simple instance center regression.

Instance Segmentation Panoptic Segmentation +2

28,640

The iMaterialist Fashion Attribute Dataset

1 code implementation • 13 Jun 2019 • Sheng Guo, Weilin Huang, Xiao Zhang, Prasanna Srikhanta, Yin Cui, Yuan Li, Matthew R. Scott, Hartwig Adam, Serge Belongie

The dataset is constructed from over one million fashion images with a label space that includes 8 groups of 228 fine-grained attributes in total.

Attribute General Classification +2

Geo-Aware Networks for Fine-Grained Recognition

1 code implementation • 4 Jun 2019 • Grace Chu, Brian Potetz, Weijun Wang, Andrew Howard, Yang Song, Fernando Brucher, Thomas Leung, Hartwig Adam

By leveraging geolocation information we improve top-1 accuracy in iNaturalist from 70.1% to 79.0% for a strong baseline image-only model.

Fine-Grained Image Classification General Classification

Searching for MobileNetV3

61 code implementations • ICCV 2019 • Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, Quoc V. Le, Hartwig Adam

We achieve new state-of-the-art results for mobile classification, detection and segmentation.

Ranked #8 on Dichotomous Image Segmentation on DIS-TE1

Classification Dichotomous Image Segmentation +5

76,579

FEELVOS: Fast End-to-End Embedding Learning for Video Object Segmentation

3 code implementations • CVPR 2019 • Paul Voigtlaender, Yuning Chai, Florian Schroff, Hartwig Adam, Bastian Leibe, Liang-Chieh Chen

Many of the recent successful methods for video object segmentation (VOS) are overly complicated, heavily rely on fine-tuning on the first frame, and/or are slow, and are hence of limited practical use.

Ranked #1 on Semi-Supervised Video Object Segmentation on YouTube

Object Segmentation +3

76,579

Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation

12 code implementations • CVPR 2019 • Chenxi Liu, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, Wei Hua, Alan Yuille, Li Fei-Fei

Therefore, we propose to search the network level structure in addition to the cell level structure, which forms a hierarchical architecture search space.

Ranked #7 on Semantic Segmentation on PASCAL VOC 2012 val

Image Classification Image Segmentation +3

76,579

Searching for Efficient Multi-Scale Architectures for Dense Image Prediction

1 code implementation • NeurIPS 2018 • Liang-Chieh Chen, Maxwell D. Collins, Yukun Zhu, George Papandreou, Barret Zoph, Florian Schroff, Hartwig Adam, Jonathon Shlens

Recent progress has demonstrated that such meta-learning methods may exceed scalable human-invented architectures on image classification tasks.

Ranked #1 on Human Part Segmentation on PASCAL-Person-Part

Image Classification Image Segmentation +5

76,579

NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications

4 code implementations • ECCV 2018 • Tien-Ju Yang, Andrew Howard, Bo Chen, Xiao Zhang, Alec Go, Mark Sandler, Vivienne Sze, Hartwig Adam

This work proposes an algorithm, called NetAdapt, that automatically adapts a pre-trained deep neural network to a mobile platform given a resource budget.

Image Classification

900

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

76 code implementations • ECCV 2018 • Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, Hartwig Adam

The former networks are able to encode multi-scale contextual information by probing the incoming features with filters or pooling operations at multiple rates and multiple effective fields-of-view, while the latter networks can capture sharper object boundaries by gradually recovering the spatial information.

Ranked #1 on Semantic Segmentation on PASCAL VOC 2012 test (using extra training data)

Image Classification Image Segmentation +2

76,579

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

22 code implementations • CVPR 2018 • Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, Dmitry Kalenichenko

The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes.

General Classification Quantization

76,583

MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features

no code implementations • CVPR 2018 • Liang-Chieh Chen, Alexander Hermans, George Papandreou, Florian Schroff, Peng Wang, Hartwig Adam

Within each region of interest, MaskLab performs foreground/background segmentation by combining semantic and direction prediction.

Ranked #85 on Instance Segmentation on COCO test-dev (using extra training data)

Instance Segmentation Object +4

InclusiveFaceNet: Improving Face Attribute Detection with Race and Gender Diversity

1 code implementation • 1 Dec 2017 • Hee Jung Ryu, Hartwig Adam, Margaret Mitchell

We demonstrate an approach to face attribute detection that retains or improves attribute detection accuracy across gender and race subgroups by learning demographic information prior to learning the attribute detection task.

Attribute

The iNaturalist Species Classification and Detection Dataset

19 code implementations • CVPR 2018 • Grant Van Horn, Oisin Mac Aodha, Yang Song, Yin Cui, Chen Sun, Alex Shepard, Hartwig Adam, Pietro Perona, Serge Belongie

Existing image classification datasets used in computer vision tend to have a uniform distribution of images across object categories.

Ranked #8 on Image Classification on iNaturalist

General Classification Image Classification

76,579

Learning Unified Embedding for Apparel Recognition

no code implementations • 19 Jul 2017 • Yang Song, Yuan Li, Bo Wu, Chao-Yeh Chen, Xiao Zhang, Hartwig Adam

To ease the training difficulty, a novel learning scheme is proposed by using the output from specialized models as learning targets so that L2 loss can be used instead of triplet loss.

Retrieval

BranchOut: Regularization for Online Ensemble Tracking With Convolutional Neural Networks

no code implementations • CVPR 2017 • Bohyung Han, Jack Sim, Hartwig Adam

We propose an extremely simple but effective regularization technique of convolutional neural networks (CNNs), referred to as BranchOut, for online ensemble tracking.

Visual Tracking

Rethinking Atrous Convolution for Semantic Image Segmentation

75 code implementations • 17 Jun 2017 • Liang-Chieh Chen, George Papandreou, Florian Schroff, Hartwig Adam

To handle the problem of segmenting objects at multiple scales, we design modules which employ atrous convolution in cascade or in parallel to capture multi-scale context by adopting multiple atrous rates.

Ranked #3 on Semantic Segmentation on PASCAL VOC 2012 test (using extra training data)

Dichotomous Image Segmentation Image Segmentation +3

76,579

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

153 code implementations • 17 Apr 2017 • Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam

We present a class of efficient models called MobileNets for mobile and embedded vision applications.

Ranked #227 on Object Detection on COCO test-dev

General Classification Image Classification +1

182,328
