To improve the precision and stability of predictions, we propose several techniques, including parallel and cascade model-ensemble mechanisms and a sliding training method.
3D multi-object tracking (MOT) has witnessed numerous novel benchmarks and approaches in recent years, especially those under the "tracking-by-detection" paradigm.
Ranked #1 on 3D Multi-Object Tracking on Waymo Open Dataset
With MIG, A100 can be the most cost-efficient GPU ever for serving Deep Neural Networks (DNNs).
This report aims to compare two safety methods: control barrier function and Hamilton-Jacobi reachability analysis.
We propose SC-Depth, a monocular depth estimator that requires only unlabelled videos for training and enables scale-consistent prediction at inference time.
Ranked #27 on Monocular Depth Estimation on KITTI Eigen split
The code and protocols for our benchmark and algorithm are available at https://github.com/TuSimple/LiDAR_SOT/.
This paper considers the problem of safe autonomous navigation in unknown environments, relying on local obstacle sensing.
Systems and Control • Robotics
To the best of our knowledge, this is the first work to show that deep networks trained using unlabelled monocular videos can predict globally scale-consistent camera trajectories over a long video sequence.
Ranked #32 on Monocular Depth Estimation on KITTI Eigen split (using extra training data)
In this paper, we propose a novel network design mechanism for efficient embedded computing.
Ranked #4 on Face Verification on CFP-FP
2 code implementations • 16 Apr 2018 • Jason Dai, Yiheng Wang, Xin Qiu, Ding Ding, Yao Zhang, Yanzhang Wang, Xianyan Jia, Cherry Zhang, Yan Wan, Zhichao Li, Jiao Wang, Shengsheng Huang, Zhongyuan Wu, Yang Wang, Yuhao Yang, Bowen She, Dongjie Shi, Qi Lu, Kai Huang, Guoqiong Song
This paper presents BigDL (a distributed deep learning framework for Apache Spark), which has been used by a variety of users in the industry for building deep learning applications on production big data platforms.
This paper describes our solution for the video recognition task of the Google Cloud and YouTube-8M Video Understanding Challenge, which ranked 3rd.
We propose a dynamic computational time model to reduce the average processing time of the recurrent visual attention model (RAM).