INT: Towards Infinite-frames 3D Detection with An Efficient Framework

1 code implementation • 30 Sep 2022 • Jianyun Xu, Zhenwei Miao, Da Zhang, Hongyu Pan, Kaixuan Liu, Peihan Hao, Jun Zhu, Zhengyang Sun, Hongmin Li, Xin Zhan

By employing INT on CenterPoint, we obtain performance boosts of around 7% (Waymo) and 15% (nuScenes) with only 2~4 ms of latency overhead, and the method is currently SOTA on the Waymo 3D Detection leaderboard.

Ret3D: Rethinking Object Relations for Efficient 3D Object Detection in Driving Scenes

no code implementations • 18 Aug 2022 • Yu-Huan Wu, Da Zhang, Le Zhang, Xin Zhan, Dengxin Dai, Yun Liu, Ming-Ming Cheng

Current efficient LiDAR-based detection frameworks fall short in exploiting object relations, which are naturally present both spatially and temporally.

The Gene of Scientific Success

no code implementations • 17 Feb 2022 • Xiangjie Kong, Jun Zhang, Da Zhang, Yi Bu, Ying Ding, Feng Xia

Under this consideration, our paper presents and analyzes the causal factors that are crucial for scholars' academic success.

Estimating air quality co-benefits of energy transition using machine learning

no code implementations • 29 May 2021 • Da Zhang, Qingyi Wang, Shaojie Song, Simiao Chen, MingWei Li, Lu Shen, Siqi Zheng, Bofeng Cai, Shenhao Wang

Applications of the framework with Chinese data reveal highly heterogeneous health benefits of reducing fossil fuel use in different sectors and regions in China with a mean of \$34/tCO2 and a standard deviation of \$84/tCO2.

Big Networks: A Survey

no code implementations • 9 Aug 2020 • Hayat Dino Bedru, Shuo Yu, Xinru Xiao, Da Zhang, Liangtian Wan, He Guo, Feng Xia

This paper proposes a guideline framework that gives an insight into the major topics in the area of network science from the viewpoint of a big network.

METAL: Minimum Effort Temporal Activity Localization in Untrimmed Videos

no code implementations • CVPR 2020 • Da Zhang, Xiyang Dai, Yuan-Fang Wang

Existing Temporal Activity Localization (TAL) methods largely adopt strong supervision for model training, which requires (1) vast amounts of untrimmed videos for each activity category and (2) accurate segment-level boundary annotations (start time and end time) for every instance.

Reference Product Search

no code implementations • 11 Apr 2019 • Chu Wang, Lei Tang, Shujun Bian, Da Zhang, Zuohua Zhang, Yongning Wu

For a product of interest, we propose a search method to surface a set of reference products.

MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment

no code implementations • CVPR 2019 • Da Zhang, Xiyang Dai, Xin Wang, Yuan-Fang Wang, Larry S. Davis

In this paper, we present Moment Alignment Network (MAN), a novel framework that unifies the candidate moment encoding and temporal structural reasoning in a single-shot feed-forward network.

Learning to Compose Topic-Aware Mixture of Experts for Zero-Shot Video Captioning

no code implementations • 7 Nov 2018 • Xin Wang, Jiawei Wu, Da Zhang, Yu Su, William Yang Wang

Although promising results have been achieved in video captioning, existing models are limited to the fixed inventory of activities in the training corpus and do not generalize to open-vocabulary scenarios.

Dynamic Temporal Pyramid Network: A Closer Look at Multi-Scale Modeling for Activity Detection

no code implementations • 7 Aug 2018 • Da Zhang, Xiyang Dai, Yuan-Fang Wang

We further exploit the temporal context of activities by appropriately fusing multi-scale feature maps, and demonstrate that both local and global temporal contexts are important.

S3D: Single Shot multi-Span Detector via Fully 3D Convolutional Networks

1 code implementation • 21 Jul 2018 • Da Zhang, Xiyang Dai, Xin Wang, Yuan-Fang Wang

In this paper, we present a novel Single Shot multi-Span Detector for temporal activity detection in long, untrimmed videos using a simple end-to-end fully three-dimensional convolutional (Conv3D) network.

Deep Reinforcement Learning for Visual Object Tracking in Videos

no code implementations • 31 Jan 2017 • Da Zhang, Hamid Maei, Xin Wang, Yuan-Fang Wang

In this paper, we introduce a fully end-to-end approach for visual tracking in videos that learns to predict the bounding box locations of a target object at every frame.

Multimodal Transfer: A Hierarchical Deep Convolutional Neural Network for Fast Artistic Style Transfer

2 code implementations • CVPR 2017 • Xin Wang, Geoffrey Oxholm, Da Zhang, Yuan-Fang Wang

That is, our scheme can generate results that are visually pleasing and more similar to multiple desired artistic styles with color and texture cues at multiple scales.

