Search Results for author: Mingxing Tan

Found 42 papers, 19 papers with code

LEF: Late-to-Early Temporal Fusion for LiDAR 3D Object Detection

no code implementations • 28 Sep 2023 • Tong He, Pei Sun, Zhaoqi Leng, Chenxi Liu, Dragomir Anguelov, Mingxing Tan

We propose a late-to-early recurrent feature fusion scheme for 3D object detection using temporal LiDAR point clouds.

Paper
Add Code

WOMD-LiDAR: Raw Sensor Dataset Benchmark for Motion Forecasting

no code implementations • 7 Apr 2023 • Kan Chen, Runzhou Ge, Hang Qiu, Rami Ai-Rfou, Charles R. Qi, Xuanyu Zhou, Zoey Yang, Scott Ettinger, Pei Sun, Zhaoqi Leng, Mustafa Baniodeh, Ivan Bogun, Weiyue Wang, Mingxing Tan, Dragomir Anguelov

To study the effect of these modular approaches, design new paradigms that mitigate these limitations, and accelerate the development of end-to-end motion forecasting models, we augment the Waymo Open Motion Dataset (WOMD) with large-scale, high-quality, diverse LiDAR data for the motion forecasting task.

Motion Forecasting

Paper
Add Code

PseudoAugment: Learning to Use Unlabeled Data for Data Augmentation in Point Clouds

no code implementations • 24 Oct 2022 • Zhaoqi Leng, Shuyang Cheng, Benjamin Caine, Weiyue Wang, Xiao Zhang, Jonathon Shlens, Mingxing Tan, Dragomir Anguelov

To alleviate the cost of hyperparameter tuning and iterative pseudo labeling, we develop a population-based data augmentation framework for 3D detection, named AutoPseudoAugment.

Data Augmentation Pseudo Label

Paper
Add Code

LidarAugment: Searching for Scalable 3D LiDAR Data Augmentations

no code implementations • 24 Oct 2022 • Zhaoqi Leng, Guowang Li, Chenxi Liu, Ekin Dogus Cubuk, Pei Sun, Tong He, Dragomir Anguelov, Mingxing Tan

Data augmentations are important in training high-performance 3D object detectors for point clouds.

3D Object Detection Data Augmentation +1

Paper
Add Code

SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds

no code implementations • 13 Oct 2022 • Pei Sun, Mingxing Tan, Weiyue Wang, Chenxi Liu, Fei Xia, Zhaoqi Leng, Dragomir Anguelov

3D object detection in point clouds is a core component for modern robotics and autonomous driving systems.

3D Object Detection Autonomous Driving +2

Paper
Add Code

LidarNAS: Unifying and Searching Neural Architectures for 3D Point Clouds

no code implementations • 10 Oct 2022 • Chenxi Liu, Zhaoqi Leng, Pei Sun, Shuyang Cheng, Charles R. Qi, Yin Zhou, Mingxing Tan, Dragomir Anguelov

Developing neural models that accurately understand objects in 3D point clouds is essential for the success of robotics and autonomous driving.

3D Object Detection Autonomous Driving +2

Paper
Add Code

PolyLoss: A Polynomial Expansion Perspective of Classification Loss Functions

8 code implementations • ICLR 2022 • Zhaoqi Leng, Mingxing Tan, Chenxi Liu, Ekin Dogus Cubuk, Xiaojie Shi, Shuyang Cheng, Dragomir Anguelov

Cross-entropy loss and focal loss are the most common choices when training deep neural networks for classification problems.

Ranked #99 on Image Classification on ImageNet (using extra training data)

3D Object Detection Image Classification +2

301

Paper
Code

Revisiting Multi-Scale Feature Fusion for Semantic Segmentation

no code implementations • 23 Mar 2022 • Tianjian Meng, Golnaz Ghiasi, Reza Mahjourian, Quoc V. Le, Mingxing Tan

It is commonly believed that high internal resolution combined with expensive operations (e. g. atrous convolutions) are necessary for accurate semantic segmentation, resulting in slow speed and large memory usage.

Segmentation Semantic Segmentation

Paper
Add Code

DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection

1 code implementation • CVPR 2022 • Yingwei Li, Adams Wei Yu, Tianjian Meng, Ben Caine, Jiquan Ngiam, Daiyi Peng, Junyang Shen, Bo Wu, Yifeng Lu, Denny Zhou, Quoc V. Le, Alan Yuille, Mingxing Tan

In this paper, we propose two novel techniques: InverseAug that inverses geometric-related augmentations, e. g., rotation, to enable accurate geometric alignment between lidar points and image pixels, and LearnableAlign that leverages cross-attention to dynamically capture the correlations between image and lidar features during fusion.

3D Object Detection Autonomous Driving +2

2,781

Paper
Code

Occupancy Flow Fields for Motion Forecasting in Autonomous Driving

no code implementations • 8 Mar 2022 • Reza Mahjourian, Jinkyu Kim, Yuning Chai, Mingxing Tan, Ben Sapp, Dragomir Anguelov

We propose Occupancy Flow Fields, a new representation for motion forecasting of multiple agents, an important task in autonomous driving.

Motion Estimation Motion Forecasting

Paper
Add Code

Combined Scaling for Zero-shot Transfer Learning

no code implementations • 19 Nov 2021 • Hieu Pham, Zihang Dai, Golnaz Ghiasi, Kenji Kawaguchi, Hanxiao Liu, Adams Wei Yu, Jiahui Yu, Yi-Ting Chen, Minh-Thang Luong, Yonghui Wu, Mingxing Tan, Quoc V. Le

Second, while increasing the dataset size and the model size has been the defacto method to improve the performance of deep learning models like BASIC, the effect of a large contrastive batch size on such contrastive-trained image-text models is not well-understood.

Ranked #3 on Zero-Shot Transfer Image Classification on ImageNet-Sketch

Classification Contrastive Learning +3

Paper
Add Code

CoAtNet: Marrying Convolution and Attention for All Data Sizes

14 code implementations • NeurIPS 2021 • Zihang Dai, Hanxiao Liu, Quoc V. Le, Mingxing Tan

Transformers have attracted increasing interests in computer vision, but they still fall behind state-of-the-art convolutional networks.

Ranked #1 on Image Classification on GasHisSDB

Image Classification Inductive Bias

29,680

Paper
Code

EfficientNetV2: Smaller Models and Faster Training

20 code implementations • 1 Apr 2021 • Mingxing Tan, Quoc V. Le

By pretraining on the same ImageNet21k, our EfficientNetV2 achieves 87. 3% top-1 accuracy on ImageNet ILSVRC2012, outperforming the recent ViT by 2. 0% accuracy while training 5x-11x faster using the same computing resources.

Ranked #3 on Image Classification on Stanford Cars

Classification Data Augmentation +2

29,680

Paper
Code

Robust and Accurate Object Detection via Adversarial Learning

1 code implementation • CVPR 2021 • Xiangning Chen, Cihang Xie, Mingxing Tan, Li Zhang, Cho-Jui Hsieh, Boqing Gong

Data augmentation has become a de facto component for training high-performance deep image classifiers, but its potential is under-explored for object detection.

Ranked #17 on Object Detection on COCO-O

AutoML Data Augmentation +3

6,153

Paper
Code

MoViNets: Mobile Video Networks for Efficient Video Recognition

3 code implementations • CVPR 2021 • Dan Kondratyuk, Liangzhe Yuan, Yandong Li, Li Zhang, Mingxing Tan, Matthew Brown, Boqing Gong

We present Mobile Video Networks (MoViNets), a family of computation and memory efficient video networks that can operate on streaming video for online inference.

Ranked #3 on Action Classification on Charades

Action Classification Action Recognition +4

76,579

Paper
Code

Rethinking Co-design of Neural Architectures and Hardware Accelerators

no code implementations • 17 Feb 2021 • Yanqi Zhou, Xuanyi Dong, Berkin Akin, Mingxing Tan, Daiyi Peng, Tianjian Meng, Amir Yazdanbakhsh, Da Huang, Ravi Narayanaswami, James Laudon

In our work, we target the optimization of hardware and software configurations on an industry-standard edge accelerator.

Neural Architecture Search

Paper
Add Code

Searching for Fast Model Families on Datacenter Accelerators

no code implementations • CVPR 2021 • Sheng Li, Mingxing Tan, Ruoming Pang, Andrew Li, Liqun Cheng, Quoc Le, Norman P. Jouppi

On top of our DC accelerator optimized neural architecture search space, we further propose a latency-aware compound scaling (LACS), the first multi-objective compound scaling method optimizing both accuracy and latency.

Neural Architecture Search

Paper
Add Code

Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention

6 code implementations • 7 Feb 2021 • Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, Glenn Fung, Yin Li, Vikas Singh

The scalability of Nystr\"{o}mformer enables application to longer sequences with thousands of tokens.

Ranked #13 on Semantic Textual Similarity on MRPC (F1 metric)

Natural Language Inference Question Answering +2

7,522

Paper
Code

PyGlove: Symbolic Programming for Automated Machine Learning

no code implementations • NeurIPS 2020 • Daiyi Peng, Xuanyi Dong, Esteban Real, Mingxing Tan, Yifeng Lu, Hanxiao Liu, Gabriel Bender, Adam Kraft, Chen Liang, Quoc V. Le

As a result, AutoML can be reformulated as an automated process of symbolic manipulation.

AutoML BIG-bench Machine Learning

Paper
Add Code

NAHAS: Neural Architecture and Hardware Accelerator Search

no code implementations • 1 Jan 2021 • Yanqi Zhou, Xuanyi Dong, Daiyi Peng, Ethan Zhu, Amir Yazdanbakhsh, Berkin Akin, Mingxing Tan, James Laudon

In this paper, we study the importance of co-designing neural architectures and hardware accelerators.

Neural Architecture Search

Paper
Add Code

Training EfficientNets at Supercomputer Scale: 83% ImageNet Top-1 Accuracy in One Hour

no code implementations • 30 Oct 2020 • Arissa Wongpanich, Hieu Pham, James Demmel, Mingxing Tan, Quoc Le, Yang You, Sameer Kumar

EfficientNets are a family of state-of-the-art image classification models based on efficiently scaled convolutional neural networks.

Image Classification Playing the Game of 2048

Paper
Add Code

Efficient Scale-Permuted Backbone with Learned Resource Distribution

no code implementations • ECCV 2020 • Xianzhi Du, Tsung-Yi Lin, Pengchong Jin, Yin Cui, Mingxing Tan, Quoc Le, Xiaodan Song

Furthermore, SpineNet is built with a uniform resource distribution over operations.

General Classification Image Classification +3

Paper
Add Code

Shape-Texture Debiased Neural Network Training

1 code implementation • ICLR 2021 • Yingwei Li, Qihang Yu, Mingxing Tan, Jieru Mei, Peng Tang, Wei Shen, Alan Yuille, Cihang Xie

To prevent models from exclusively attending on a single cue in representation learning, we augment training data with images with conflicting shape and texture information (eg, an image of chimpanzee shape but with lemon texture) and, most importantly, provide the corresponding supervisions from shape and texture simultaneously.

Ranked #598 on Image Classification on ImageNet

Adversarial Robustness Data Augmentation +2

108

Paper
Code

Go Wide, Then Narrow: Efficient Training of Deep Thin Networks

no code implementations • ICML 2020 • Denny Zhou, Mao Ye, Chen Chen, Tianjian Meng, Mingxing Tan, Xiaodan Song, Quoc Le, Qiang Liu, Dale Schuurmans

This is achieved by layerwise imitation, that is, forcing the thin network to mimic the intermediate outputs of the wide network from layer to layer.

Computational Efficiency Model Compression

Paper
Add Code

Smooth Adversarial Training

1 code implementation • 25 Jun 2020 • Cihang Xie, Mingxing Tan, Boqing Gong, Alan Yuille, Quoc V. Le

SAT also works well with larger networks: it helps EfficientNet-L1 to achieve 82. 2% accuracy and 58. 6% robustness on ImageNet, outperforming the previous state-of-the-art defense by 9. 5% for accuracy and 11. 6% for robustness.

Ranked #1 on Adversarial Defense on ImageNet (non-targeted PGD, max perturbation=4)

Adversarial Defense Adversarial Robustness

Paper
Code

AutoHAS: Efficient Hyperparameter and Architecture Search

no code implementations • 5 Jun 2020 • Xuanyi Dong, Mingxing Tan, Adams Wei Yu, Daiyi Peng, Bogdan Gabrys, Quoc V. Le

Efficient hyperparameter or architecture search methods have shown remarkable results, but each of them is only applicable to searching for either hyperparameters (HPs) or architectures.

Hyperparameter Optimization Neural Architecture Search +1

Paper
Add Code

When Ensembling Smaller Models is More Efficient than Single Large Models

no code implementations • 1 May 2020 • Dan Kondratyuk, Mingxing Tan, Matthew Brown, Boqing Gong

Ensembling is a simple and popular technique for boosting evaluation performance by training multiple models (e. g., with different initializations) and aggregating their predictions.

Paper
Add Code

MobileDets: Searching for Object Detection Architectures for Mobile Accelerators

4 code implementations • CVPR 2021 • Yunyang Xiong, Hanxiao Liu, Suyog Gupta, Berkin Akin, Gabriel Bender, Yongzhe Wang, Pieter-Jan Kindermans, Mingxing Tan, Vikas Singh, Bo Chen

By incorporating regular convolutions in the search space and directly optimizing the network architectures for object detection, we obtain a family of object detection models, MobileDets, that achieve state-of-the-art results across mobile accelerators.

Neural Architecture Search Object +2

76,579

Paper
Code

BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models

1 code implementation • ECCV 2020 • Jiahui Yu, Pengchong Jin, Hanxiao Liu, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Thomas Huang, Xiaodan Song, Ruoming Pang, Quoc Le

Without extra retraining or post-processing steps, we are able to train a single set of shared weights on ImageNet and use these weights to obtain child models whose sizes range from 200 to 1000 MFLOPs.

Ranked #30 on Neural Architecture Search on ImageNet

Neural Architecture Search

Paper
Code

SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization

13 code implementations • CVPR 2020 • Xianzhi Du, Tsung-Yi Lin, Pengchong Jin, Golnaz Ghiasi, Mingxing Tan, Yin Cui, Quoc V. Le, Xiaodan Song

We propose SpineNet, a backbone with scale-permuted intermediate features and cross-scale connections that is learned on an object detection task by Neural Architecture Search.

Ranked #9 on Image Classification on iNaturalist

General Classification Image Classification +5

73,120

Paper
Code

Adversarial Examples Improve Image Recognition

6 code implementations • CVPR 2020 • Cihang Xie, Mingxing Tan, Boqing Gong, Jiang Wang, Alan Yuille, Quoc V. Le

We show that AdvProp improves a wide range of models on various image recognition tasks and performs better when the models are bigger.

Ranked #4 on Domain Generalization on VizWiz-Classification

Domain Generalization Image Classification

29,680

Paper
Code

EfficientDet: Scalable and Efficient Object Detection

62 code implementations • CVPR 2020 • Mingxing Tan, Ruoming Pang, Quoc V. Le

Model efficiency has become increasingly important in computer vision.

Ranked #6 on Object Detection on COCO minival (APS metric)

AutoML Object +1

76,582

Paper
Code

Search to Distill: Pearls are Everywhere but not the Eyes

no code implementations • CVPR 2020 • Yu Liu, Xuhui Jia, Mingxing Tan, Raviteja Vemulapalli, Yukun Zhu, Bradley Green, Xiaogang Wang

Standard Knowledge Distillation (KD) approaches distill the knowledge of a cumbersome teacher model into the parameters of a student model with a pre-defined architecture.

Ensemble Learning Face Recognition +3

Paper
Add Code

Scaling Up Neural Architecture Search with Big Single-Stage Models

no code implementations • 25 Sep 2019 • Jiahui Yu, Pengchong Jin, Hanxiao Liu, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Thomas Huang, Xiaodan Song, Quoc Le

In this work, we propose BigNAS, an approach that simplifies this workflow and scales up neural architecture search to target a wide range of model sizes simultaneously.

Neural Architecture Search

Paper
Add Code

Evo-NAS: Evolutionary-Neural Hybrid Agent for Architecture Search

no code implementations • 25 Sep 2019 • Krzysztof Maziarz, Mingxing Tan, Andrey Khorlin, Kuang-Yu Samuel Chang, Andrea Gesmundo

We show that the Evo-NAS agent outperforms both neural and evolutionary agents when applied to architecture search for a suite of text and image classification benchmarks.

Evolutionary Algorithms Image Classification +2

Paper
Add Code

MixConv: Mixed Depthwise Convolutional Kernels

13 code implementations • 22 Jul 2019 • Mingxing Tan, Quoc V. Le

In this paper, we systematically study the impact of different kernel sizes, and observe that combining the benefits of multiple kernel sizes can lead to better accuracy and efficiency.

Ranked #733 on Image Classification on ImageNet

AutoML Image Classification +2

29,680

Paper
Code

AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures

2 code implementations • ICLR 2020 • Michael S. Ryoo, AJ Piergiovanni, Mingxing Tan, Anelia Angelova

Learning to represent videos is a very challenging task both algorithmically and computationally.

Ranked #1 on Multimodal Activity Recognition on Moments in Time Dataset

Action Classification Action Recognition +4

72,249

Paper
Code

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

133 code implementations • ICML 2019 • Mingxing Tan, Quoc V. Le

Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for better accuracy if more resources are available.

Ranked #1 on Medical Image Classification on NCT-CRC-HE-100K

Action Recognition Domain Generalization +4

29,680

Paper
Code

Searching for MobileNetV3

61 code implementations • ICCV 2019 • Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, Quoc V. Le, Hartwig Adam

We achieve new state of the art results for mobile classification, detection and segmentation.

Ranked #8 on Dichotomous Image Segmentation on DIS-TE1

Classification Dichotomous Image Segmentation +5

76,582

Paper
Code

Neural Architecture Search Over a Graph Search Space

no code implementations • 27 Dec 2018 • Stanisław Jastrzębski, Quentin de Laroussilhe, Mingxing Tan, Xiao Ma, Neil Houlsby, Andrea Gesmundo

However, the success of NAS depends on the definition of the search space.

Image Classification Neural Architecture Search

Paper
Add Code

Evolutionary-Neural Hybrid Agents for Architecture Search

no code implementations • 24 Nov 2018 • Krzysztof Maziarz, Mingxing Tan, Andrey Khorlin, Marin Georgiev, Andrea Gesmundo

We show that the Evo-NAS agent outperforms both neural and evolutionary agents when applied to architecture search for a suite of text and image classification benchmarks.

Evolutionary Algorithms General Classification +3

Paper
Add Code

MnasNet: Platform-Aware Neural Architecture Search for Mobile

28 code implementations • CVPR 2019 • Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Mark Sandler, Andrew Howard, Quoc V. Le

In this paper, we propose an automated mobile neural architecture search (MNAS) approach, which explicitly incorporate model latency into the main objective so that the search can identify a model that achieves a good trade-off between accuracy and latency.

Ranked #829 on Image Classification on ImageNet (using extra training data)

Image Classification Neural Architecture Search +2

29,680

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.