Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Autonomous Driving	CARLA Leaderboard	TransFuser	Driving Score	16.93	#16
Autonomous Driving	CARLA Leaderboard	TransFuser	Route Completion	51.82	#14
Autonomous Driving	CARLA Leaderboard	TransFuser	Infraction Penalty	0.42	#18
Semantic Segmentation	KITTI-360	TransFuser (RGB-LiDAR)	mIoU	56.57	#7
Autonomous Driving	Town05 Long	Geometric Fusion	Route Completion (RC)	69.17	#1
Autonomous Driving	Town05 Long	TransFuser	Route Completion (RC)	56.36	#2
Autonomous Driving	Town05 Long	TransFuser	Driving Score (DS)	33.15	#1
Autonomous Driving	Town05 Short	Geometric Fusion	Route Completion (RC)	86.91	#1
Autonomous Driving	Town05 Short	TransFuser	Route Completion (RC)	78.41	#2
Autonomous Driving	Town05 Short	TransFuser	Driving Score (DS)	54.52	#1
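
The CARLA Leaderboard rows above report Driving Score, Route Completion, and Infraction Penalty as separate aggregates. On that leaderboard, the driving score is computed per route as route completion times infraction penalty and then averaged over routes, so the reported Driving Score (16.93) need not equal the product of the other two columns (51.82 × 0.42 ≈ 21.76). The following is a minimal sketch of that aggregation; the per-route numbers and the helper names `driving_score` and `mean` are hypothetical and are not taken from the paper or the leaderboard.

```python
def driving_score(routes):
    """Average, over routes, of route completion (percent) times infraction penalty (0..1)."""
    return sum(rc * penalty for rc, penalty in routes) / len(routes)


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Hypothetical per-route results: (route completion in percent, infraction penalty).
routes = [(100.0, 0.2), (20.0, 1.0), (60.0, 0.6)]

ds = driving_score(routes)
avg_rc = mean(rc for rc, _ in routes)
avg_penalty = mean(penalty for _, penalty in routes)

print(f"mean of per-route products: {ds:.2f}")                    # 25.33
print(f"product of column means:    {avg_rc * avg_penalty:.2f}")  # 36.00
```

In this toy case the route completed in full carries the heaviest penalty, which pulls the averaged driving score below the product of the column means.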

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/multi-modal-fusion-transformer-for-end-to-end/autonomous-driving-on-town05-long)](https://paperswithcode.com/sota/autonomous-driving-on-town05-long?p=multi-modal-fusion-transformer-for-end-to-end)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/multi-modal-fusion-transformer-for-end-to-end/autonomous-driving-on-town05-short)](https://paperswithcode.com/sota/autonomous-driving-on-town05-short?p=multi-modal-fusion-transformer-for-end-to-end)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/multi-modal-fusion-transformer-for-end-to-end/semantic-segmentation-on-kitti-360)](https://paperswithcode.com/sota/semantic-segmentation-on-kitti-360?p=multi-modal-fusion-transformer-for-end-to-end)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/multi-modal-fusion-transformer-for-end-to-end/autonomous-driving-on-carla-leaderboard)](https://paperswithcode.com/sota/autonomous-driving-on-carla-leaderboard?p=multi-modal-fusion-transformer-for-end-to-end)`

Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

CVPR 2021 · Aditya Prakash, Kashyap Chitta, Andreas Geiger

How should representations from complementary sensors be integrated for autonomous driving? Geometry-based sensor fusion has shown great promise for perception tasks such as object detection and motion forecasting. However, for the actual driving task, the global context of the 3D scene is key, e.g. a change in traffic light state can affect the behavior of a vehicle geometrically distant from that traffic light. Geometry alone may therefore be insufficient for effectively fusing representations in end-to-end driving models. In this work, we demonstrate that imitation learning policies based on existing sensor fusion methods under-perform in the presence of a high density of dynamic agents and complex scenarios, which require global contextual reasoning, such as handling traffic oncoming from multiple directions at uncontrolled intersections. Therefore, we propose TransFuser, a novel Multi-Modal Fusion Transformer, to integrate image and LiDAR representations using attention. We experimentally validate the efficacy of our approach in urban settings involving complex scenarios using the CARLA urban driving simulator. Our approach achieves state-of-the-art driving performance while reducing collisions by 76% compared to geometry-based fusion.
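
As a rough illustration of what fusing image and LiDAR representations "using attention" can look like, the sketch below concatenates tokens from both modalities, applies one round of single-head scaled dot-product self-attention, and adds the result back to each token. This is an assumption-laden toy and not the authors' architecture: the projections are identity maps rather than learned weights, the token values are made up, and the names `self_attention` and `fuse` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections); a real model would learn these projections.
    """
    dim = len(tokens[0])
    attended = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * value[j] for w, value in zip(weights, tokens)) for j in range(dim)]
        )
    return attended


def fuse(image_tokens, lidar_tokens):
    """Attend over the joint token set, then split back per modality."""
    tokens = image_tokens + lidar_tokens
    attended = self_attention(tokens)
    # Residual connection: each token keeps its own features plus global context.
    fused = [[t + a for t, a in zip(tok, att)] for tok, att in zip(tokens, attended)]
    n_image = len(image_tokens)
    return fused[:n_image], fused[n_image:]


# Made-up 4-dimensional tokens: two image tokens and three LiDAR tokens.
image_tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
lidar_tokens = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 1.0]]

fused_image, fused_lidar = fuse(image_tokens, lidar_tokens)
```

Because every token attends to every other token regardless of spatial position, the first image token ends up with a nonzero value in a dimension that only LiDAR tokens populated. That is the kind of global, non-geometric interaction the abstract argues for.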

Code

autonomousvision/transfuser official

976

Kin-Zhang/mmfn

Tasks

Autonomous Driving

Imitation Learning

Motion Forecasting

Object Detection

Semantic Segmentation

Sensor Fusion

Datasets

CARLA

KITTI-360

Results from the Paper

Ranked #1 on Autonomous Driving on Town05 Short

Methods

Absolute Position Encodings • Adam • BPE • CARLA • Dense Connections • Dropout • Entropy Regularization • Label Smoothing • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • PPO • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer
