1 code implementation • 7 Jun 2025 • Zhenxin Li, Wenhao Yao, Zi Wang, Xinglong Sun, Joshua Chen, Nadine Chang, Maying Shen, Zuxuan Wu, Shiyi Lan, Jose M. Alvarez
GTRS consists of three complementary innovations: (1) a diffusion-based trajectory generator that produces diverse fine-grained proposals; (2) a vocabulary generalization technique that trains a scorer on super-dense trajectory sets with dropout regularization, enabling robust inference on smaller subsets; and (3) a sensor augmentation strategy that enhances out-of-domain generalization while incorporating refinement training for critical trajectory discrimination.
no code implementations • 7 Jun 2025 • Wenhao Yao, Zhenxin Li, Shiyi Lan, Zi Wang, Xinglong Sun, Jose M. Alvarez, Zuxuan Wu
In complex driving environments, autonomous vehicles must navigate safely.
no code implementations • CVPR 2025 • Xinglong Sun, Barath Lakshmanan, Maying Shen, Shiyi Lan, Jingde Chen, Jose M. Alvarez
Current structural pruning methods face two significant limitations: (i) they often limit pruning to finer-grained levels like channels, making aggressive parameter reduction challenging, and (ii) they focus heavily on parameter and FLOP reduction, with existing latency-aware methods frequently relying on simplistic, suboptimal linear models that fail to generalize well to transformers, where multiple interacting dimensions impact latency.
2 code implementations • 18 Mar 2025 • Nvidia: Hassan Abu Alhaija, Jose Alvarez, Maciej Bala, Tiffany Cai, Tianshi Cao, Liz Cha, Joshua Chen, Mike Chen, Francesco Ferroni, Sanja Fidler, Dieter Fox, Yunhao Ge, Jinwei Gu, Ali Hassani, Michael Isaev, Pooya Jannaty, Shiyi Lan, Tobias Lasser, Huan Ling, Ming-Yu Liu, Xian Liu, Yifan Lu, Alice Luo, Qianli Ma, Hanzi Mao, Fabio Ramos, Xuanchi Ren, Tianchang Shen, Xinglong Sun, Shitao Tang, Ting-Chun Wang, Jay Wu, Jiashu Xu, Stella Xu, Kevin Xie, Yuchong Ye, Xiaodong Yang, Xiaohui Zeng, Yu Zeng
We introduce Cosmos-Transfer, a conditional world generation model that can generate world simulations based on multiple spatial control inputs of various modalities such as segmentation, depth, and edge.
no code implementations • 13 Mar 2025 • Xinglong Sun, Haijiang Sun, Shan Jiang, Jiacheng Wang, Jiasong Wang
Trackers based on lightweight neural networks have achieved great success in aerial remote sensing; most of them aggregate multi-stage deep features to improve tracking quality.
no code implementations • 5 Mar 2025 • Zi Wang, Shiyi Lan, Xinglong Sun, Nadine Chang, Zhenxin Li, Zhiding Yu, Jose M. Alvarez
In this paper, we propose SafeFusion, a training framework to learn from collision data.
no code implementations • 17 Jun 2024 • Xinglong Sun, Barath Lakshmanan, Maying Shen, Shiyi Lan, Jingde Chen, Jose Alvarez
We develop a latency modeling technique that accurately captures model-wide latency variations during pruning, which is crucial for achieving optimal latency-accuracy trade-offs at high pruning ratios.
no code implementations • 25 Mar 2024 • Xinglong Sun, Haijiang Sun, Shan Jiang, Jiacheng Wang, Xilai Wei, Zhonghe Hu
They are capable of fully capturing the category-related semantics for classification and the local spatial contexts for regression, respectively.
1 code implementation • 1 Jan 2024 • Xinglong Sun, Adam W. Harley, Leonidas J. Guibas
In the first stage, we use the pre-trained model to estimate motion in a video, and then select the subset of motion estimates which we can verify with cycle-consistency.
2 code implementations • 3 Aug 2023 • Xinglong Sun, Jean Ponce, Yu-Xiong Wang
Our study reveals that, different from prior work, deformable convolution needs to be applied on an estimated depth map with a relatively high density for better performance.
1 code implementation • 22 Jun 2023 • Xinglong Sun
On the DomainBed benchmark with the state-of-the-art MIRO, we can further boost its performance by 1 point simply by introducing 10% sparsity into the model.
1 code implementation • CVPR 2022 • Xinglong Sun, Ali Hassani, Zhangyang Wang, Gao Huang, Humphrey Shi
We analyzed the pruning masks generated with DiSparse and observed strikingly similar sparse network architectures identified by each task, even before training starts.
no code implementations • 30 Apr 2021 • Xinglong Sun, Guangliang Han, Lihong Guo, Tingfa Xu, Jianan Li, Peixun Liu
Offline Siamese networks have achieved very promising tracking performance, especially in accuracy and efficiency.
no code implementations • 12 Aug 2020 • Hang Yang, Xiaotian Wu, Xinglong Sun
The goal of blind image deblurring is to recover a sharp image from a single blurred input image with an unknown blur kernel.