no code implementations • CVPR 2021 • Lu Mi, Hang Zhao, Charlie Nash, Xiaohan Jin, Jiyang Gao, Chen Sun, Cordelia Schmid, Nir Shavit, Yuning Chai, Dragomir Anguelov
To address this issue, we introduce a new and challenging task: HD map generation.
4 code implementations • 19 Aug 2020 • Hang Zhao, Jiyang Gao, Tian Lan, Chen Sun, Benjamin Sapp, Balakrishnan Varadarajan, Yue Shen, Yi Shen, Yuning Chai, Cordelia Schmid, Cong-Cong Li, Dragomir Anguelov
Our key insight is that for prediction within a moderate time horizon, the future modes can be effectively captured by a set of target states.
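Read concretely, this suggests a two-stage pipeline: score a discrete set of candidate targets, then decode one trajectory per selected target. Below is a minimal PyTorch sketch of the target-scoring stage; the module name, sizes, and the use of raw (x, y) candidates are our illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class TargetScorer(nn.Module):
    """Scores a set of candidate target states (x, y) given an
    agent-context embedding. Illustrative sketch, not the paper's code."""
    def __init__(self, context_dim=128, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(context_dim + 2, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, context, targets):
        # context: (B, context_dim); targets: (B, N, 2) candidate end states
        ctx = context.unsqueeze(1).expand(-1, targets.size(1), -1)
        logits = self.mlp(torch.cat([ctx, targets], dim=-1)).squeeze(-1)
        return logits.softmax(dim=-1)  # one mode probability per target

# usage: keep the top-k targets, then decode one trajectory per chosen target
scorer = TargetScorer()
probs = scorer(torch.randn(4, 128), torch.randn(4, 64, 2))
topk = probs.topk(k=6, dim=-1).indices  # (4, 6) indices of likely modes
```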
no code implementations • CVPR 2020 • Zhishuai Zhang, Jiyang Gao, Junhua Mao, Yukai Liu, Dragomir Anguelov, Cong-Cong Li
For the Waymo Open Dataset, we achieve a bird's-eye-view (BEV) detection AP of 80.73 and a trajectory prediction average displacement error (ADE) of 33.67 cm for pedestrians, establishing the state of the art for both tasks.
3 code implementations • CVPR 2020 • Jiyang Gao, Chen Sun, Hang Zhao, Yi Shen, Dragomir Anguelov, Cong-Cong Li, Cordelia Schmid
Behavior prediction in dynamic, multi-agent systems is an important problem in the context of self-driving cars, due to the complex representations and interactions of road components, including moving agents (e.g., pedestrians and vehicles) and road context information (e.g., lanes, traffic lights).
no code implementations • 17 Apr 2020 • Chuanzi He, Haidong Zhu, Jiyang Gao, Kan Chen, Ram Nevatia
The task of referring relationships is to localize subject and object entities in an image satisfying a relationship query, which is given in the form of <subject, predicate, object>.
no code implementations • 15 Oct 2019 • Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Tom Ouyang, James Guo, Jiquan Ngiam, Vijay Vasudevan
In this paper, we aim to synergize the bird's-eye view and the perspective view and propose a novel end-to-end multi-view fusion (MVF) algorithm, which can effectively learn to utilize the complementary information from both.
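The fusion step can be pictured as gathering, for every LiDAR point, its feature from each view's feature map and combining them. The sketch below assumes precomputed integer pixel coordinates per point in each view; the function and argument names are hypothetical.

```python
import torch

def fuse_point_features(bev_feat, persp_feat, bev_idx, persp_idx):
    """Gather each LiDAR point's feature from both views and fuse them.
    Sketch under simplified indexing assumptions: *_feat are (C, H, W)
    view feature maps; *_idx are (N, 2) integer pixel coordinates of
    the N points projected into each view."""
    f_bev = bev_feat[:, bev_idx[:, 0], bev_idx[:, 1]]          # (C, N)
    f_persp = persp_feat[:, persp_idx[:, 0], persp_idx[:, 1]]  # (C, N)
    return torch.cat([f_bev, f_persp], dim=0).t()              # (N, 2C)
```

In a full pipeline these fused per-point features would then be pooled back into a grid for the detection head; that step is omitted here.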
no code implementations • ICCV 2019 • Jiyang Gao, Jiang Wang, Shengyang Dai, Li-Jia Li, Ram Nevatia
Compared with standard Faster R-CNN, it has three highlights: (1) an ensemble of two classification heads and a distillation head to avoid overfitting on noisy labels and to improve mining precision; (2) masking the negative-sample loss in the box predictor to avoid the harm of false-negative labels; and (3) training the box-regression head only on seed annotations to eliminate the harm of inaccurate boundaries in mined bounding boxes.
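The second and third highlights amount to two small changes in the detection losses. A hedged sketch, with our own helper names and shapes (not the authors' code):

```python
import torch
import torch.nn.functional as F

def classification_loss(logits, labels, is_seed):
    """Mask the background (negative) loss for mined boxes, since an
    unmatched mined box may still cover a true object (false negative).
    Hypothetical helper; shapes: logits (N, K+1), labels (N,), is_seed (N,)."""
    per_box = F.cross_entropy(logits, labels, reduction="none")
    keep = is_seed | (labels > 0)    # drop negatives that come from mined boxes
    return (per_box * keep.float()).sum() / keep.float().sum().clamp(min=1)

def regression_loss(pred, target, is_seed):
    """Train the box-regression head only on seed (human) annotations,
    so inaccurate mined boundaries cannot corrupt it."""
    if not is_seed.any():
        return pred.sum() * 0.0      # keep the graph, contribute nothing
    return F.smooth_l1_loss(pred[is_seed], target[is_seed])
```

The design choice in both cases is the same: never let an unreliable mined label push the model in a direction a trusted seed label would contradict.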
3 code implementations • 21 Nov 2018 • Runzhou Ge, Jiyang Gao, Kan Chen, Ram Nevatia
Previous methods address the problem by taking features from video sliding windows and language queries and learning a subspace to encode their correlation; this ignores rich semantic cues about the activities in both videos and queries.
1 code implementation • ECCV 2018 • Jiyang Gao, Kan Chen, Ram Nevatia
Temporal action proposal generation is an important task: akin to object proposals, temporal action proposals are intended to capture "clips", or temporal intervals, in videos that are likely to contain an action.
Ranked #10 on Temporal Action Proposal Generation on ActivityNet-1.3
8 code implementations • 5 May 2018 • Jiyang Gao, Ram Nevatia
Although many methods for temporal modeling have been proposed, it is hard to compare them directly, because the choice of feature extractor and loss function also has a large impact on the final performance.
no code implementations • CVPR 2018 • Jiyang Gao, Runzhou Ge, Kan Chen, Ram Nevatia
Specifically, there are three salient aspects: (1) a co-memory attention mechanism that utilizes cues from both motion and appearance to generate attention; (2) a temporal conv-deconv network to generate multi-level contextual facts; (3) a dynamic fact ensemble method to construct temporal representation dynamically for different questions.
Ranked #28 on Visual Question Answering (VQA) on MSRVTT-QA
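A simplified reading of the co-memory attention in point (1): attention over one modality's features is generated from a query that mixes the other modality's summary with the question. Everything below (class name, single-query form, dimensions) is an illustrative assumption, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CoAttention(nn.Module):
    """Attend over appearance features using a query built from BOTH the
    current motion summary and the question embedding; the symmetric
    direction (motion attended via appearance) would mirror this."""
    def __init__(self, dim=256):
        super().__init__()
        self.query = nn.Linear(2 * dim, dim)

    def forward(self, appearance, motion_summary, question):
        # appearance: (T, dim); motion_summary, question: (dim,)
        q = self.query(torch.cat([motion_summary, question]))  # (dim,)
        weights = (appearance @ q).softmax(dim=0)              # (T,)
        return weights @ appearance                            # attended (dim,)

attn = CoAttention()
out = attn(torch.randn(8, 256), torch.randn(256), torch.randn(256))
```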
no code implementations • CVPR 2018 • Kan Chen, Jiyang Gao, Ram Nevatia
In this paper, we explore the consistency contained in both visual and language modalities, and leverage complementary external knowledge to facilitate weakly supervised grounding.
no code implementations • 21 Nov 2017 • Jiyang Gao, Zijian Guo, Zhen Li, Ram Nevatia
To address these challenges, we propose a Knowledge Concentration method, which effectively transfers the knowledge from dozens of specialists (multiple teacher networks) into one single model (one student network) to classify 100K object categories.
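Mechanically, this is multi-teacher distillation in which each specialist supervises only its own slice of the student's 100K-way output. A minimal sketch, assuming each teacher's label set maps to a contiguous block of student logits (the function name and `slices` argument are ours):

```python
import torch
import torch.nn.functional as F

def concentration_loss(student_logits, teacher_logits_list, slices, T=2.0):
    """Distill several specialist teachers into one student. Each teacher
    covers a slice of the full label space; `slices` maps teacher i to the
    student logit columns it owns (e.g., slice(0, 1000)). Hedged sketch."""
    loss = 0.0
    for t_logits, sl in zip(teacher_logits_list, slices):
        s = student_logits[:, sl] / T          # student view of this slice
        t = (t_logits / T).softmax(dim=-1)     # soft targets from the teacher
        loss = loss + F.kl_div(s.log_softmax(dim=-1), t, reduction="batchmean")
    return loss * T * T                        # usual temperature scaling
```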
no code implementations • 31 Jul 2017 • Zhenheng Yang, Jiyang Gao, Ram Nevatia
In this work, we address the problem of spatio-temporal action detection in temporally untrimmed videos.
1 code implementation • 16 Jul 2017 • Jiyang Gao, Zhenheng Yang, Ram Nevatia
RED takes multiple history representations as input and learns to anticipate a sequence of future representations.
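A minimal encoder-decoder sketch of that idea: encode the history, then roll the decoder forward, feeding each predicted representation back in as the next input. The architecture choices here (GRU, sizes, seeding with the last observed frame) are our assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class RED(nn.Module):
    """Reads a history of frame representations and regresses a sequence
    of future representations; a minimal sketch of the idea."""
    def __init__(self, dim=512, steps=5):
        super().__init__()
        self.steps = steps
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRUCell(dim, dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, history):                # history: (B, T, dim)
        _, h = self.encoder(history)           # h: (1, B, dim)
        h, inp = h.squeeze(0), history[:, -1]  # seed decoder with last frame
        futures = []
        for _ in range(self.steps):            # anticipate step by step
            h = self.decoder(inp, h)
            inp = self.out(h)                  # predicted next representation
            futures.append(inp)
        return torch.stack(futures, dim=1)     # (B, steps, dim)
```

A classifier applied to each anticipated representation can then predict actions before the corresponding frames are observed.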
12 code implementations • ICCV 2017 • Jiyang Gao, Chen Sun, Zhenheng Yang, Ram Nevatia
For evaluation, we adopt the TACoS dataset and build a new dataset for this task, called Charades-STA, on top of Charades by adding sentence temporal annotations.
no code implementations • 2 May 2017 • Jiyang Gao, Zhenheng Yang, Ram Nevatia
CBR uses temporal coordinate regression to refine the temporal boundaries of the sliding windows.
Ranked #6 on Temporal Action Localization on THUMOS’14 (mAP IOU@0.1 metric)
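The boundary regression above can be pictured as a small head that predicts start/end offsets for a window's features, applied in a cascade: the refined window is re-featurized and regressed again. The sketch below fakes the feature step with random tensors to stay self-contained; all names and sizes are illustrative.

```python
import torch
import torch.nn as nn

class BoundaryRegressor(nn.Module):
    """Predicts (start, end) offsets for a window and applies them.
    Sketch only; feature extraction and offset units are omitted."""
    def __init__(self, feat_dim=500):
        super().__init__()
        self.head = nn.Linear(feat_dim, 2)   # offsets for start and end

    def forward(self, feats, windows):
        # feats: (N, feat_dim) per window; windows: (N, 2) start/end seconds
        return windows + self.head(feats)    # refined boundaries

reg = BoundaryRegressor()
win = torch.tensor([[10.0, 14.0]])
for _ in range(3):                           # cascaded refinement
    # in practice, features are re-extracted from the refined window;
    # random features here only keep the sketch runnable
    win = reg(torch.randn(1, 500), win)
```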
1 code implementation • ICCV 2017 • Jiyang Gao, Zhenheng Yang, Chen Sun, Kan Chen, Ram Nevatia
Temporal Action Proposal (TAP) generation is an important problem, as the fast and accurate extraction of semantically significant (e.g., human-action) segments from untrimmed videos is a key step in large-scale video analysis.
Ranked #8 on Action Recognition on THUMOS’14
no code implementations • 8 Sep 2016 • Jiyang Gao, Ram Nevatia
Moreover, the action categories in such datasets are pre-defined and their vocabularies are fixed.
no code implementations • 16 Apr 2016 • Jiyang Gao, Chen Sun, Ram Nevatia
It obtains candidate action concepts by extracting verb-object pairs from sentences and verifies their visualness with the associated images.
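The verb-object extraction step can be approximated with an off-the-shelf dependency parser. A rough sketch using spaCy (assuming the `en_core_web_sm` model is installed); the visualness check against the associated images would be a separate classifier, omitted here:

```python
import spacy  # requires: python -m spacy download en_core_web_sm

nlp = spacy.load("en_core_web_sm")

def verb_object_pairs(sentence):
    """Extract (verb, direct-object) pairs as candidate action concepts;
    a rough approximation of the mining step described above."""
    doc = nlp(sentence)
    return [(tok.head.lemma_, tok.lemma_)
            for tok in doc
            if tok.dep_ == "dobj" and tok.head.pos_ == "VERB"]

print(verb_object_pairs("A man is riding a horse and walking a dog."))
# e.g. [('ride', 'horse'), ('walk', 'dog')]
```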