Search Results for author: Zhenheng Yang

Found 16 papers, 7 papers with code

TALL: Temporal Activity Localization via Language Query

12 code implementations • ICCV 2017 • Jiyang Gao, Chen Sun, Zhenheng Yang, Ram Nevatia

For evaluation, we adopt the TaCoS dataset and build a new dataset for this task, called Charades-STA, on top of Charades by adding sentence temporal annotations.

Natural Language Queries regression +2

334

Joint Unsupervised Learning of Optical Flow and Depth by Watching Stereo Videos

1 code implementation • 8 Oct 2018 • Yang Wang, Zhenheng Yang, Peng Wang, Yi Yang, Chenxu Luo, Wei Xu

Then the whole scene is decomposed into moving foreground and static background by comparing the estimated optical flow and rigid flow derived from the depth and ego-motion.

Motion Estimation Optical Flow Estimation

128

TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals

1 code implementation • ICCV 2017 • Jiyang Gao, Zhenheng Yang, Chen Sun, Kan Chen, Ram Nevatia

Temporal Action Proposal (TAP) generation is an important problem, as fast and accurate extraction of semantically important (e.g., human actions) segments from untrimmed videos is an important step for large-scale video analysis.

Ranked #8 on Action Recognition on THUMOS’14

regression Temporal Action Localization

LEGO: Learning Edge with Geometry all at Once by Watching Videos

1 code implementation • CVPR 2018 • Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia

In our framework, the predicted depths, normals and edges are forced to be consistent all the time.

SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization

1 code implementation • ECCV 2020 • Xuefeng Hu, Zhihan Zhang, Zhenye Jiang, Syomantak Chaudhuri, Zhenheng Yang, Ram Nevatia

Techniques for manipulating images are advancing rapidly; while these are helpful for many useful tasks, they also pose a threat to society with their ability to create believable misinformation.

Ranked #5 on Image Manipulation Localization on Columbia

Image Manipulation Image Manipulation Detection +3

Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding

1 code implementation • 14 Oct 2018 • Chenxu Luo, Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia, Alan Yuille

Performance on the five tasks of depth estimation, optical flow estimation, odometry, moving object segmentation and scene flow estimation shows that our approach outperforms other SoTA methods.

Ranked #1 on Scene Flow Estimation on KITTI 2015 Scene Flow Training

Depth Estimation Optical Flow Estimation +2

RED: Reinforced Encoder-Decoder Networks for Action Anticipation

1 code implementation • 16 Jul 2017 • Jiyang Gao, Zhenheng Yang, Ram Nevatia

RED takes multiple history representations as input and learns to anticipate a sequence of future representations.

Ranked #5 on Action Anticipation on EPIC-KITCHENS-55 (Unseen test set (S2))

Action Anticipation

Occlusion Aware Unsupervised Learning of Optical Flow

no code implementations • CVPR 2018 • Yang Wang, Yi Yang, Zhenheng Yang, Liang Zhao, Peng Wang, Wei Xu

Especially on the KITTI dataset, where abundant unlabeled samples exist, our unsupervised method outperforms its counterpart trained with supervised learning.

Optical Flow Estimation

Unsupervised Learning of Geometry with Edge-aware Depth-Normal Consistency

no code implementations • 10 Nov 2017 • Zhenheng Yang, Peng Wang, Wei Xu, Liang Zhao, Ramakant Nevatia

Learning to reconstruct depths in a single image by watching unlabeled videos via a deep convolutional network (DCN) has attracted significant attention in recent years.

Depth Estimation

Spatio-Temporal Action Detection with Cascade Proposal and Location Anticipation

no code implementations • 31 Jul 2017 • Zhenheng Yang, Jiyang Gao, Ram Nevatia

In this work, we address the problem of spatio-temporal action detection in temporally untrimmed videos.

Action Detection Region Proposal

Cascaded Boundary Regression for Temporal Action Detection

no code implementations • 2 May 2017 • Jiyang Gao, Zhenheng Yang, Ram Nevatia

CBR uses temporal coordinate regression to refine the temporal boundaries of the sliding windows.

Ranked #6 on Temporal Action Localization on THUMOS’14 (mAP IOU@0.1 metric)

Action Detection regression

A Multi-Scale Cascade Fully Convolutional Network Face Detector

no code implementations • 12 Sep 2016 • Zhenheng Yang, Ram Nevatia

The number of proposals is decreased after each level, and the areas of regions are decreased to more precisely fit the face.

Face Detection

Every Pixel Counts: Unsupervised Geometry Learning with Holistic 3D Motion Understanding

no code implementations • 27 Jun 2018 • Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia

The four types of information, i.e., 2D flow, camera pose, segment mask and depth maps, are integrated into a differentiable holistic 3D motion parser (HMP), where per-pixel 3D motion for rigid background and moving objects is recovered.

Ranked #2 on Scene Flow Estimation on KITTI 2015 Scene Flow Training

Depth And Camera Motion Optical Flow Estimation +1

Activity Driven Weakly Supervised Object Detection

no code implementations • CVPR 2019 • Zhenheng Yang, Dhruv Mahajan, Deepti Ghadiyaram, Ram Nevatia, Vignesh Ramanathan

Weakly supervised object detection aims at reducing the amount of supervision required to train detection models.

Ranked #1 on Weakly Supervised Object Detection on Charades

Action Classification Object +2

SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization

no code implementations • 1 Sep 2020 • Xuefeng Hu, Zhihan Zhang, Zhenye Jiang, Syomantak Chaudhuri, Zhenheng Yang, Ram Nevatia

We present a novel framework, Spatial Pyramid Attention Network (SPAN) for detection and localization of multiple types of image manipulations.

Position

Weakly Supervised Instance Segmentation for Videos with Temporal Mask Consistency

no code implementations • CVPR 2021 • Qing Liu, Vignesh Ramanathan, Dhruv Mahajan, Alan Yuille, Zhenheng Yang

However, existing approaches which rely only on image-level class labels predominantly suffer from errors due to (a) partial segmentation of objects and (b) missing object predictions.

Instance Segmentation Relation Network +3
