Search Results for author: Zhenheng Yang

Found 16 papers, 7 papers with code

TALL: Temporal Activity Localization via Language Query

12 code implementations ICCV 2017 Jiyang Gao, Chen Sun, Zhenheng Yang, Ram Nevatia

For evaluation, we adopt TaCoS dataset, and build a new dataset for this task on top of Charades by adding sentence temporal annotations, called Charades-STA.

Natural Language Queries regression +2

Joint Unsupervised Learning of Optical Flow and Depth by Watching Stereo Videos

1 code implementation 8 Oct 2018 Yang Wang, Zhenheng Yang, Peng Wang, Yi Yang, Chenxu Luo, Wei Xu

Then the whole scene is decomposed into moving foreground and static background by comparing the estimated optical flow and rigid flow derived from the depth and ego-motion.

Motion Estimation Optical Flow Estimation

TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals

1 code implementation ICCV 2017 Jiyang Gao, Zhenheng Yang, Chen Sun, Kan Chen, Ram Nevatia

Temporal Action Proposal (TAP) generation is an important problem, as fast and accurate extraction of semantically important (e.g. human actions) segments from untrimmed videos is an important step for large-scale video analysis.

regression Temporal Action Localization

LEGO: Learning Edge with Geometry all at Once by Watching Videos

1 code implementation CVPR 2018 Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia

In our framework, the predicted depths, normals and edges are forced to be consistent all the time.

SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization

1 code implementation ECCV 2020 Xuefeng Hu, Zhihan Zhang, Zhenye Jiang, Syomantak Chaudhuri, Zhenheng Yang, Ram Nevatia

Techniques for manipulating images are advancing rapidly; while these are helpful for many useful tasks, they also pose a threat to society with their ability to create believable misinformation.

Image Manipulation Image Manipulation Detection +3

Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding

1 code implementation 14 Oct 2018 Chenxu Luo, Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia, Alan Yuille

Performance on the five tasks of depth estimation, optical flow estimation, odometry, moving object segmentation and scene flow estimation shows that our approach outperforms other SoTA methods.

Depth Estimation Optical Flow Estimation +2

RED: Reinforced Encoder-Decoder Networks for Action Anticipation

1 code implementation 16 Jul 2017 Jiyang Gao, Zhenheng Yang, Ram Nevatia

RED takes multiple history representations as input and learns to anticipate a sequence of future representations.

Action Anticipation

Occlusion Aware Unsupervised Learning of Optical Flow

no code implementations CVPR 2018 Yang Wang, Yi Yang, Zhenheng Yang, Liang Zhao, Peng Wang, Wei Xu

Especially on KITTI dataset where abundant unlabeled samples exist, our unsupervised method outperforms its counterpart trained with supervised learning.

Optical Flow Estimation

Unsupervised Learning of Geometry with Edge-aware Depth-Normal Consistency

no code implementations 10 Nov 2017 Zhenheng Yang, Peng Wang, Wei Xu, Liang Zhao, Ramakant Nevatia

Learning to reconstruct depths in a single image by watching unlabeled videos via deep convolutional network (DCN) is attracting significant attention in recent years.

Depth Estimation

Spatio-Temporal Action Detection with Cascade Proposal and Location Anticipation

no code implementations 31 Jul 2017 Zhenheng Yang, Jiyang Gao, Ram Nevatia

In this work, we address the problem of spatio-temporal action detection in temporally untrimmed videos.

Action Detection Region Proposal

Cascaded Boundary Regression for Temporal Action Detection

no code implementations 2 May 2017 Jiyang Gao, Zhenheng Yang, Ram Nevatia

CBR uses temporal coordinate regression to refine the temporal boundaries of the sliding windows.

Ranked #6 on Temporal Action Localization on THUMOS’14 (mAP IOU@0.1 metric)

Action Detection regression

A Multi-Scale Cascade Fully Convolutional Network Face Detector

no code implementations 12 Sep 2016 Zhenheng Yang, Ram Nevatia

The number of proposals is decreased after each level, and the areas of regions are decreased to more precisely fit the face.

Face Detection

Every Pixel Counts: Unsupervised Geometry Learning with Holistic 3D Motion Understanding

no code implementations 27 Jun 2018 Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia

The four types of information, i.e. 2D flow, camera pose, segment mask and depth maps, are integrated into a differentiable holistic 3D motion parser (HMP), where per-pixel 3D motion for rigid background and moving objects are recovered.

Depth And Camera Motion Optical Flow Estimation +1

SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization

no code implementations 1 Sep 2020 Xuefeng Hu, Zhihan Zhang, Zhenye Jiang, Syomantak Chaudhuri, Zhenheng Yang, Ram Nevatia

We present a novel framework, Spatial Pyramid Attention Network (SPAN) for detection and localization of multiple types of image manipulations.

Position

Weakly Supervised Instance Segmentation for Videos with Temporal Mask Consistency

no code implementations CVPR 2021 Qing Liu, Vignesh Ramanathan, Dhruv Mahajan, Alan Yuille, Zhenheng Yang

However, existing approaches which rely only on image-level class labels predominantly suffer from errors due to (a) partial segmentation of objects and (b) missing object predictions.

Instance Segmentation Relation Network +3
