Search Results for author: Pavlo Molchanov

Found 46 papers, 20 papers with code

Online Detection and Classification of Dynamic Hand Gestures With Recurrent 3D Convolutional Neural Network

no code implementations • CVPR 2016 • Pavlo Molchanov, Xiaodong Yang, Shalini Gupta, Kihwan Kim, Stephen Tyree, Jan Kautz

Automatic detection and classification of dynamic hand gestures in real-world systems intended for human computer interaction is challenging as: 1) there is a large diversity in how people perform gestures, making detection and classification difficult; 2) the system must work online in order to avoid noticeable lag between performing a gesture and its classification; in fact, a negative lag (classification before the gesture is finished) is desirable, as feedback to the user can then be truly instantaneous.

Classification General Classification +1

Pruning Convolutional Neural Networks for Resource Efficient Inference

9 code implementations • 19 Nov 2016 • Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, Jan Kautz

We propose a new criterion based on Taylor expansion that approximates the change in the cost function induced by pruning network parameters.

Transfer Learning

140
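
The snippet above only says that the criterion uses a Taylor expansion to approximate the change in cost caused by pruning. As a rough sketch, assuming a first-order expansion in which a feature map's importance is the absolute value of its averaged activation-times-gradient product, the ranking could look like the following. The function names and the nested-list data layout are hypothetical and chosen for illustration; they are not taken from the paper's code.

```python
def taylor_importance(activations, gradients):
    """Score each channel by |mean(activation * gradient)|, averaged over examples.

    Both arguments are lists (one item per example) of lists (one item per
    channel) of flattened spatial values. This layout is an assumption.
    """
    n_examples = len(activations)
    n_channels = len(activations[0])
    scores = [0.0] * n_channels
    for acts, grads in zip(activations, gradients):
        for c in range(n_channels):
            # First-order estimate of the cost change if channel c were zeroed.
            first_order = sum(a * g for a, g in zip(acts[c], grads[c])) / len(acts[c])
            scores[c] += abs(first_order) / n_examples
    return scores


def prune_order(scores):
    """Channel indices sorted from least to most important."""
    return sorted(range(len(scores)), key=lambda c: scores[c])


# Toy data: one example, three channels, two spatial positions each.
acts = [[[1.0, 2.0], [0.0, 0.0], [0.5, 0.5]]]
grads = [[[0.1, -0.3], [0.9, 0.9], [0.2, 0.2]]]
scores = taylor_importance(acts, grads)
order = prune_order(scores)
```

In this toy case the all-zero channel scores zero and is the first pruning candidate, even though its gradients are the largest, because the estimate depends on the product of activation and gradient rather than on either alone.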

A Lightweight Approach for On-the-Fly Reflectance Estimation

no code implementations • ICCV 2017 • Kihwan Kim, Jinwei Gu, Stephen Tyree, Pavlo Molchanov, Matthias Nießner, Jan Kautz

In addition, we have created a large synthetic dataset, SynBRDF, which comprises a total of $500$K RGBD images rendered with a physically-based ray tracer under a variety of natural illumination, covering $5000$ materials and $5000$ shapes.

Color Constancy

Improving Landmark Localization with Semi-Supervised Learning

no code implementations • CVPR 2018 • Sina Honari, Pavlo Molchanov, Stephen Tyree, Pascal Vincent, Christopher Pal, Jan Kautz

First, we propose the framework of sequential multitasking and explore it here through an architecture for landmark localization where training with class labels acts as an auxiliary signal to guide the landmark localization on unlabeled data.

Ranked #41 on Face Alignment on 300W

Face Alignment Small Data Image Classification

Budget-Aware Activity Detection with A Recurrent Policy Network

no code implementations • 30 Nov 2017 • Behrooz Mahasseni, Xiaodong Yang, Pavlo Molchanov, Jan Kautz

In this paper, we address the challenging problem of efficient temporal activity detection in untrimmed long videos.

Action Detection Activity Detection

Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals

1 code implementation • CVPR 2018 • Shanxin Yuan, Guillermo Garcia-Hernando, Bjorn Stenger, Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee, Pavlo Molchanov, Jan Kautz, Sina Honari, Liuhao Ge, Junsong Yuan, Xinghao Chen, Guijin Wang, Fan Yang, Kai Akiyama, Yang Wu, Qingfu Wan, Meysam Madadi, Sergio Escalera, Shile Li, Dongheui Lee, Iason Oikonomidis, Antonis Argyros, Tae-Kyun Kim

Official Torch7 implementation of "V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map", CVPR 2018

Ranked #5 on Hand Pose Estimation on HANDS 2017

3D Hand Pose Estimation 3D Pose Estimation

373

Hand Pose Estimation via Latent 2.5D Heatmap Regression

no code implementations • ECCV 2018 • Umar Iqbal, Pavlo Molchanov, Thomas Breuel, Juergen Gall, Jan Kautz

Estimating the 3D pose of a hand is an essential part of human-computer interaction.

3D Hand Pose Estimation regression

IamNN: Iterative and Adaptive Mobile Neural Network for Efficient Image Classification

no code implementations • 26 Apr 2018 • Sam Leroux, Pavlo Molchanov, Pieter Simoens, Bart Dhoedt, Thomas Breuel, Jan Kautz

Deep residual networks (ResNets) made a recent breakthrough in deep learning.

General Classification Image Classification

Making Convolutional Networks Recurrent for Visual Sequence Learning

no code implementations • CVPR 2018 • Xiaodong Yang, Pavlo Molchanov, Jan Kautz

Recurrent neural networks (RNNs) have emerged as a powerful model for a broad range of machine learning problems that involve sequential data.

Action Recognition Face Alignment +6

Towards annotation-efficient segmentation via image-to-image translation

no code implementations • 2 Apr 2019 • Eugene Vorontsov, Pavlo Molchanov, Christopher Beckham, Jan Kautz, Samuel Kadoury

Specifically, we propose a semi-supervised framework that employs unpaired image-to-image translation between two domains, presence vs. absence of cancer, as the unsupervised objective.

Brain Tumor Segmentation Image-to-Image Translation +3

SCOPS: Self-Supervised Co-Part Segmentation

1 code implementation • CVPR 2019 • Wei-Chih Hung, Varun Jampani, Sifei Liu, Pavlo Molchanov, Ming-Hsuan Yang, Jan Kautz

Parts provide a good intermediate representation of objects that is robust with respect to the camera, pose and appearance variations.

Ranked #4 on Unsupervised Keypoint Estimation on CUB

Object Segmentation +4

218

Few-Shot Adaptive Gaze Estimation

1 code implementation • ICCV 2019 • Seonwook Park, Shalini De Mello, Pavlo Molchanov, Umar Iqbal, Otmar Hilliges, Jan Kautz

Inter-personal anatomical differences limit the accuracy of person-independent gaze estimation networks.

Ranked #1 on Gaze Estimation on MPII Gaze (using extra training data)

Gaze Estimation Meta-Learning

305

Importance Estimation for Neural Network Pruning

3 code implementations • CVPR 2019 • Pavlo Molchanov, Arun Mallya, Stephen Tyree, Iuri Frosio, Jan Kautz

On ResNet-101, we achieve a 40% FLOPS reduction by removing 30% of the parameters, with a loss of 0.02% in the top-1 accuracy on ImageNet.

Network Pruning

303

Dreaming to Distill: Data-free Knowledge Transfer via DeepInversion

2 code implementations • CVPR 2020 • Hongxu Yin, Pavlo Molchanov, Zhizhong Li, Jose M. Alvarez, Arun Mallya, Derek Hoiem, Niraj K. Jha, Jan Kautz

We introduce DeepInversion, a new method for synthesizing images from the image distribution used to train a deep neural network.

Continual Learning Network Pruning +1

474

HarDNN: Feature Map Vulnerability Evaluation in CNNs

no code implementations • 22 Feb 2020 • Abdulrahman Mahmoud, Siva Kumar Sastry Hari, Christopher W. Fletcher, Sarita V. Adve, Charbel Sakr, Naresh Shanbhag, Pavlo Molchanov, Michael B. Sullivan, Timothy Tsai, Stephen W. Keckler

As Convolutional Neural Networks (CNNs) are increasingly being employed in safety-critical applications, it is important that they behave reliably in the face of hardware errors.

Decision Making

Weakly-Supervised 3D Human Pose Learning via Multi-view Images in the Wild

no code implementations • CVPR 2020 • Umar Iqbal, Pavlo Molchanov, Jan Kautz

One major challenge for monocular 3D human pose estimation in-the-wild is the acquisition of training data that contains unconstrained images annotated with accurate 3D poses.

Ranked #1 on Weakly-supervised 3D Human Pose Estimation on MPI-INF-3DHP

Monocular 3D Human Pose Estimation Weakly-supervised 3D Human Pose Estimation +1

Weakly Supervised 3D Hand Pose Estimation via Biomechanical Constraints

no code implementations • ECCV 2020 • Adrian Spurr, Umar Iqbal, Pavlo Molchanov, Otmar Hilliges, Jan Kautz

Estimating 3D hand pose from 2D images is a difficult, inverse problem due to the inherent scale and depth ambiguities.

Ranked #10 on 3D Hand Pose Estimation on DexYCB

3D Hand Pose Estimation Open-Ended Question Answering +1

Measuring Generalisation to Unseen Viewpoints, Articulations, Shapes and Objects for 3D Hand Pose Estimation under Hand-Object Interaction

no code implementations • ECCV 2020 • Anil Armagan, Guillermo Garcia-Hernando, Seungryul Baek, Shreyas Hampali, Mahdi Rad, Zhaohui Zhang, Shipeng Xie, Mingxiu Chen, Boshen Zhang, Fu Xiong, Yang Xiao, Zhiguo Cao, Junsong Yuan, Pengfei Ren, Weiting Huang, Haifeng Sun, Marek Hrúz, Jakub Kanis, Zdeněk Krňoul, Qingfu Wan, Shile Li, Linlin Yang, Dongheui Lee, Angela Yao, Weiguo Zhou, Sijia Mei, Yun-hui Liu, Adrian Spurr, Umar Iqbal, Pavlo Molchanov, Philippe Weinzaepfel, Romain Brégier, Grégory Rogez, Vincent Lepetit, Tae-Kyun Kim

To address these issues, we designed a public challenge (HANDS'19) to evaluate the abilities of current 3D hand pose estimators (HPEs) to interpolate and extrapolate the poses of a training set.

3D Hand Pose Estimation

DexYCB: A Benchmark for Capturing Hand Grasping of Objects

2 code implementations • CVPR 2021 • Yu-Wei Chao, Wei Yang, Yu Xiang, Pavlo Molchanov, Ankur Handa, Jonathan Tremblay, Yashraj S. Narang, Karl Van Wyk, Umar Iqbal, Stan Birchfield, Jan Kautz, Dieter Fox

We introduce DexYCB, a new dataset for capturing hand grasping of objects.

3D Hand Pose Estimation 6D Pose Estimation using RGB +2

139

See through Gradients: Image Batch Recovery via GradInversion

2 code implementations • CVPR 2021 • Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov

In this work, we introduce GradInversion, using which input images from a larger batch (8 - 48 images) can also be recovered for large networks such as ResNets (50 layers), on complex datasets such as ImageNet (1000 classes, 224x224 px).

Federated Learning Inference Attack +1

311

KAMA: 3D Keypoint Aware Body Mesh Articulation

no code implementations • 27 Apr 2021 • Umar Iqbal, Kevin Xie, Yunrong Guo, Jan Kautz, Pavlo Molchanov

We present KAMA, a 3D Keypoint Aware Mesh Articulation approach that allows us to estimate a human body mesh from the positions of 3D body keypoints.

Ranked #48 on 3D Human Pose Estimation on 3DPW

3D Human Pose Estimation 3D Human Shape Estimation +1

Adversarial Motion Modelling helps Semi-supervised Hand Pose Estimation

no code implementations • 10 Jun 2021 • Adrian Spurr, Pavlo Molchanov, Umar Iqbal, Jan Kautz, Otmar Hilliges

Hand pose estimation is difficult due to different environmental conditions, object- and self-occlusion as well as diversity in hand shape and appearance.

Hand Pose Estimation valid

Optimal Quantization Using Scaled Codebook

no code implementations • CVPR 2021 • Yerlan Idelbayev, Pavlo Molchanov, Maying Shen, Hongxu Yin, Miguel A. Carreira-Perpinan, Jose M. Alvarez

We study the problem of quantizing N sorted, scalar datapoints with a fixed codebook containing K entries that are allowed to be rescaled.

Quantization

LANA: Latency Aware Network Acceleration

no code implementations • 12 Jul 2021 • Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat

We analyze three popular network architectures: EfficientNetV1, EfficientNetV2 and ResNeST, and achieve accuracy improvement for all models (up to $3.0\%$) when compressing larger models to the latency level of smaller models.

Neural Architecture Search Quantization

Privacy Vulnerability of Split Computing to Data-Free Model Inversion Attacks

no code implementations • 13 Jul 2021 • Xin Dong, Hongxu Yin, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov, H. T. Kung

Prior works usually assume that SC offers privacy benefits as only intermediate features, instead of private data, are shared from devices to the cloud.

Hardware-Aware Network Transformation

no code implementations • 29 Sep 2021 • Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat

In the second phase, it solves the combinatorial selection of efficient operations using a novel constrained integer linear optimization approach.

Neural Architecture Search

Global Vision Transformer Pruning with Hessian-Aware Saliency

1 code implementation • CVPR 2023 • Huanrui Yang, Hongxu Yin, Maying Shen, Pavlo Molchanov, Hai Li, Jan Kautz

This work aims at challenging the common design philosophy of the Vision Transformer (ViT) model with uniform dimension across all the stacked blocks in a model stage, where we redistribute the parameters both across transformer blocks and between different structures within the block via the first systematic attempt on global structural pruning.

Efficient ViTs Philosophy

HALP: Hardware-Aware Latency Pruning

1 code implementation • 20 Oct 2021 • Maying Shen, Hongxu Yin, Pavlo Molchanov, Lei Mao, Jianna Liu, Jose M. Alvarez

We propose Hardware-Aware Latency Pruning (HALP) that formulates structural pruning as a global resource allocation optimization problem, aiming at maximizing the accuracy while constraining latency under a predefined budget.

When to Prune? A Policy towards Early Structural Pruning

no code implementations • CVPR 2022 • Maying Shen, Pavlo Molchanov, Hongxu Yin, Jose M. Alvarez

Through extensive experiments on ImageNet, we show that EPI empowers a quick tracking of early training epochs suitable for pruning, offering the same efficacy as an otherwise "oracle" grid-search that scans through epochs and requires orders of magnitude more compute.

Network Pruning

GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras

1 code implementation • CVPR 2022 • Ye Yuan, Umar Iqbal, Pavlo Molchanov, Kris Kitani, Jan Kautz

Since the joint reconstruction of human motions and camera poses is underconstrained, we propose a global trajectory predictor that generates global human trajectories based on local body movements.

Ranked #1 on Global 3D Human Pose Estimation on EMDB

Global 3D Human Pose Estimation Human Mesh Recovery

335

AdaViT: Adaptive Tokens for Efficient Vision Transformer

1 code implementation • CVPR 2022 • Hongxu Yin, Arash Vahdat, Jose Alvarez, Arun Mallya, Jan Kautz, Pavlo Molchanov

A-ViT achieves this by automatically reducing the number of tokens in vision transformers that are processed in the network as inference proceeds.

Ranked #34 on Efficient ViTs on ImageNet-1K (with DeiT-S)

Efficient ViTs Token Reduction

131

Do Gradient Inversion Attacks Make Federated Learning Unsafe?

no code implementations • 14 Feb 2022 • Ali Hatamizadeh, Hongxu Yin, Pavlo Molchanov, Andriy Myronenko, Wenqi Li, Prerna Dogra, Andrew Feng, Mona G. Flores, Jan Kautz, Daguang Xu, Holger R. Roth

Federated learning (FL) allows the collaborative training of AI models without needing to share raw data.

Federated Learning Privacy Preserving

GradViT: Gradient Inversion of Vision Transformers

no code implementations • CVPR 2022 • Ali Hatamizadeh, Hongxu Yin, Holger Roth, Wenqi Li, Jan Kautz, Daguang Xu, Pavlo Molchanov

In this work we demonstrate the vulnerability of vision transformers (ViTs) to gradient-based inversion attacks.

Scheduling

DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars

no code implementations • 29 Mar 2022 • Amit Raj, Umar Iqbal, Koki Nagano, Sameh Khamis, Pavlo Molchanov, James Hays, Jan Kautz

In this work, we present DRaCoN, a framework for learning full-body volumetric avatars which exploits the advantages of both the 2D and 3D neural rendering techniques.

Neural Rendering

Global Context Vision Transformers

8 code implementations • 20 Jun 2022 • Ali Hatamizadeh, Hongxu Yin, Greg Heinrich, Jan Kautz, Pavlo Molchanov

Pre-trained GC ViT backbones in downstream tasks of object detection, instance segmentation, and semantic segmentation using MS COCO and ADE20K datasets outperform prior work consistently.

Ranked #132 on Semantic Segmentation on ADE20K

Image Classification Inductive Bias +4

29,671

Structural Pruning via Latency-Saliency Knapsack

1 code implementation • 13 Oct 2022 • Maying Shen, Hongxu Yin, Pavlo Molchanov, Lei Mao, Jianna Liu, Jose M. Alvarez

RANA: Relightable Articulated Neural Avatars

no code implementations • ICCV 2023 • Umar Iqbal, Akin Caliskan, Koki Nagano, Sameh Khamis, Pavlo Molchanov, Jan Kautz

We propose RANA, a relightable and articulated neural avatar for the photorealistic synthesis of humans under arbitrary viewpoints, body poses, and lighting.

Disentanglement Image Generation

Recurrence without Recurrence: Stable Video Landmark Detection with Deep Equilibrium Models

1 code implementation • CVPR 2023 • Paul Micaelli, Arash Vahdat, Hongxu Yin, Jan Kautz, Pavlo Molchanov

Our Landmark DEQ (LDEQ) achieves state-of-the-art performance on the challenging WFLW facial landmark dataset, reaching $3.92$ NME with fewer parameters and a training memory cost of $\mathcal{O}(1)$ in the number of recurrent modules.

Ranked #2 on Face Alignment on WFLW

Face Alignment

FasterViT: Fast Vision Transformers with Hierarchical Attention

2 code implementations • 9 Jun 2023 • Ali Hatamizadeh, Greg Heinrich, Hongxu Yin, Andrew Tao, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov

At a high level, global self-attentions enable the efficient cross-window communication at lower costs.

object-detection Object Detection +1

662

Heterogeneous Continual Learning

no code implementations • CVPR 2023 • Divyam Madaan, Hongxu Yin, Wonmin Byeon, Jan Kautz, Pavlo Molchanov

We propose a novel framework and a solution to tackle the continual learning (CL) problem with changing network architectures.

Continual Learning Knowledge Distillation +1

Adaptive Sharpness-Aware Pruning for Robust Sparse Networks

no code implementations • 25 Jun 2023 • Anna Bair, Hongxu Yin, Maying Shen, Pavlo Molchanov, Jose Alvarez

Robustness and compactness are two essential attributes of deep learning models that are deployed in the real world.

Image Classification object-detection +2

PACE: Human and Camera Motion Estimation from in-the-wild Videos

no code implementations • 20 Oct 2023 • Muhammed Kocabas, Ye Yuan, Pavlo Molchanov, Yunrong Guo, Michael J. Black, Otmar Hilliges, Jan Kautz, Umar Iqbal

This design combines the strengths of SLAM and motion priors, which leads to significant improvements in human and camera motion estimation.

Motion Estimation

AM-RADIO: Agglomerative Vision Foundation Model -- Reduce All Domains Into One

1 code implementation • 10 Dec 2023 • Mike Ranzinger, Greg Heinrich, Jan Kautz, Pavlo Molchanov

A handful of visual foundation models (VFMs) have recently emerged as the backbones for numerous downstream tasks.

Benchmarking object-detection +2

VILA: On Pre-training for Visual Language Models

2 code implementations • 12 Dec 2023 • Ji Lin, Hongxu Yin, Wei Ping, Yao Lu, Pavlo Molchanov, Andrew Tao, Huizi Mao, Jan Kautz, Mohammad Shoeybi, Song Han

Visual language models (VLMs) rapidly progressed with the recent success of large language models.

Ranked #21 on Visual Question Answering on MM-Vet

In-Context Learning Language Modelling +2

1,766

DoRA: Weight-Decomposed Low-Rank Adaptation

4 code implementations • 14 Feb 2024 • Shih-Yang Liu, Chien-Yi Wang, Hongxu Yin, Pavlo Molchanov, Yu-Chiang Frank Wang, Kwang-Ting Cheng, Min-Hung Chen

By employing DoRA, we enhance both the learning capacity and training stability of LoRA while avoiding any additional inference overhead.

259
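
The snippet above states that DoRA adds no inference overhead but does not give the formula. As a minimal sketch, assume the adapted weight takes the form W' = m * (W0 + BA) / ||W0 + BA||, with the norm taken per column and m a learned per-column magnitude. Under that assumption, the adapted weight can be collapsed into a single dense matrix before deployment. All function and variable names here are hypothetical, and the code uses plain lists rather than any particular framework.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def column_norms(w):
    """L2 norm of every column of w."""
    return [math.sqrt(sum(row[j] ** 2 for row in w)) for j in range(len(w[0]))]


def dora_merge(w0, b, a, magnitude):
    """Collapse magnitude * (w0 + b @ a) / column norm into one dense matrix.

    Assumes no column of (w0 + b @ a) is all zeros.
    """
    delta = matmul(b, a)  # low-rank directional update
    v = [[w0[i][j] + delta[i][j] for j in range(len(w0[0]))]
         for i in range(len(w0))]
    norms = column_norms(v)
    return [[magnitude[j] * v[i][j] / norms[j] for j in range(len(v[0]))]
            for i in range(len(v))]


w0 = [[3.0, 0.0], [4.0, 2.0]]      # toy pretrained weight
magnitude = column_norms(w0)        # magnitudes initialised from w0 (assumption)
a = [[1.0, 1.0]]                    # rank-1 factor A
b_zero = [[0.0], [0.0]]             # B starts at zero, so the update is zero
merged_at_init = dora_merge(w0, b_zero, a, magnitude)
b_trained = [[1.0], [0.0]]          # stand-in for a trained B
merged = dora_merge(w0, b_trained, a, magnitude)
```

With B at zero the merged matrix equals the pretrained weight. After B changes, each column keeps the norm set by its magnitude while its direction moves. Because the result is an ordinary matrix of the original shape, a layer using it costs the same at inference as the unadapted layer.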

LITA: Language Instructed Temporal-Localization Assistant

1 code implementation • 27 Mar 2024 • De-An Huang, Shijia Liao, Subhashree Radhakrishnan, Hongxu Yin, Pavlo Molchanov, Zhiding Yu, Jan Kautz

In addition to leveraging existing video datasets with timestamps, we propose a new task, Reasoning Temporal Localization (RTL), along with the dataset, ActivityNet-RTL, for learning and evaluating this task.

Ranked #4 on Video-based Generative Performance Benchmarking on VideoInstruct

Instruction Following Temporal Localization +2

102
