Search Results for author: Yangguang Li

Found 19 papers, 11 papers with code

Mask Hierarchical Features For Self-Supervised Learning

no code implementations • 1 Apr 2023 • Fenggang Liu, Yangguang Li, Feng Liang, Jilan Xu, Bin Huang, Jing Shao

We mask a portion of the patches in the representation space and then utilize the sparse visible patches to reconstruct a high-level semantic image representation.

object-detection Object Detection +1
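
As a rough illustration of masking in the representation space, the sketch below (simplified code of my own, not the authors' implementation) randomly splits a set of patch embeddings into masked and visible subsets; a reconstruction head would then be trained from the visible part alone:

```python
import numpy as np

def mask_patches(patch_embeddings, mask_ratio=0.75, rng=None):
    """Randomly mask a fraction of patch embeddings, keeping the visible rest.

    patch_embeddings: (num_patches, dim) array of patch representations.
    Returns (visible_embeddings, visible_idx, masked_idx).
    """
    rng = np.random.default_rng(rng)
    num_patches = patch_embeddings.shape[0]
    num_masked = int(num_patches * mask_ratio)
    perm = rng.permutation(num_patches)
    masked_idx = perm[:num_masked]
    visible_idx = perm[num_masked:]
    return patch_embeddings[visible_idx], visible_idx, masked_idx

# Example: 196 patches (a 14x14 grid) with 128-dim embeddings
patches = np.random.default_rng(0).normal(size=(196, 128))
visible, vis_idx, mask_idx = mask_patches(patches, mask_ratio=0.75, rng=0)
```

With a 75% mask ratio, only 49 of the 196 patch embeddings remain visible as input to the reconstruction target.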

Fast-BEV: A Fast and Strong Bird's-Eye View Perception Baseline

1 code implementation • 29 Jan 2023 • Yangguang Li, Bin Huang, Zeren Chen, Yufeng Cui, Feng Liang, Mingzhu Shen, Fenggang Liu, Enze Xie, Lu Sheng, Wanli Ouyang, Jing Shao

Our Fast-BEV consists of five parts; among them, we propose (1) a lightweight, deployment-friendly view transformation that quickly transfers 2D image features into 3D voxel space, (2) a multi-scale image encoder that leverages multi-scale information for better performance, and (3) an efficient BEV encoder specifically designed to speed up on-vehicle inference.

Data Augmentation
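
The 2D-to-3D view transformation in (1) can be illustrated with a generic pinhole projection: each voxel center is projected into the image and the 2D feature at that pixel is copied into the voxel. This is only a simplified sketch under standard camera conventions, not Fast-BEV's optimized implementation (which precomputes the projection as a lookup table):

```python
import numpy as np

def project_voxels_to_image(voxel_centers, cam_intrinsic, cam_extrinsic, feat_2d):
    """Fill a voxel grid by sampling 2D image features at projected voxel centers.

    voxel_centers: (N, 3) world-space voxel centers.
    cam_intrinsic: (3, 3) pinhole intrinsics; cam_extrinsic: (4, 4) world->camera.
    feat_2d: (H, W, C) image feature map.
    Returns (N, C) voxel features; voxels projecting outside the image get zeros.
    """
    N = voxel_centers.shape[0]
    H, W, C = feat_2d.shape
    homo = np.concatenate([voxel_centers, np.ones((N, 1))], axis=1)  # (N, 4)
    cam_pts = (cam_extrinsic @ homo.T)[:3]           # (3, N) camera coordinates
    pix = cam_intrinsic @ cam_pts                    # (3, N) homogeneous pixels
    z = pix[2]
    valid = z > 1e-5                                 # keep points in front of camera
    u = np.zeros(N, dtype=int)
    v = np.zeros(N, dtype=int)
    u[valid] = np.round(pix[0, valid] / z[valid]).astype(int)
    v[valid] = np.round(pix[1, valid] / z[valid]).astype(int)
    valid &= (u >= 0) & (u < W) & (v >= 0) & (v < H)
    out = np.zeros((N, C))
    out[valid] = feat_2d[v[valid], u[valid]]         # nearest-pixel feature lookup
    return out
```

Because the camera geometry is fixed, the (voxel, pixel) correspondence can be computed once offline, which is what makes this style of view transformation deployment-friendly.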

Fast-BEV: Towards Real-time On-vehicle Bird's-Eye View Perception

1 code implementation • 19 Jan 2023 • Bin Huang, Yangguang Li, Enze Xie, Feng Liang, Luya Wang, Mingzhu Shen, Fenggang Liu, Tianqi Wang, Ping Luo, Jing Shao

Recently, pure camera-based Bird's-Eye-View (BEV) perception has removed the need for expensive LiDAR sensors, making it a feasible solution for economical autonomous driving.

Autonomous Driving Data Augmentation

BEVBert: Topo-Metric Map Pre-training for Language-guided Navigation

no code implementations • 8 Dec 2022 • Dong An, Yuankai Qi, Yangguang Li, Yan Huang, Liang Wang, Tieniu Tan, Jing Shao

Inspired by the robotics community, we introduce hybrid topo-metric maps into VLN, where a topological map is used for long-term planning and a metric map for short-term reasoning.

Vision and Language Navigation
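
A hybrid topo-metric map can be sketched as a data structure that pairs a topological graph (used for long-term planning via graph search) with a metric grid (used for short-term local reasoning). The class below is a hypothetical illustration of the idea, not BEVBert's actual map:

```python
from collections import deque
import numpy as np

class TopoMetricMap:
    """Hybrid map: a topological graph of visited places for long-term
    planning, plus a local metric grid for short-term reasoning."""

    def __init__(self, grid_size=8):
        self.graph = {}                                  # node -> set of neighbors
        self.metric = np.zeros((grid_size, grid_size))   # local occupancy grid

    def add_edge(self, a, b):
        """Connect two places in the topological graph (undirected)."""
        self.graph.setdefault(a, set()).add(b)
        self.graph.setdefault(b, set()).add(a)

    def plan(self, start, goal):
        """Breadth-first search over the topological graph (long-term plan)."""
        frontier, parent = deque([start]), {start: None}
        while frontier:
            node = frontier.popleft()
            if node == goal:
                path = []
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return path[::-1]
            for nxt in self.graph.get(node, ()):
                if nxt not in parent:
                    parent[nxt] = node
                    frontier.append(nxt)
        return None
```

The division of labor mirrors the sentence above: the graph supports coarse route planning between distant places, while the grid holds fine-grained local layout.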

R$^2$F: A General Retrieval, Reading and Fusion Framework for Document-level Natural Language Inference

1 code implementation • 22 Oct 2022 • Hao Wang, Yixin Cao, Yangguang Li, Zhen Huang, Kun Wang, Jing Shao

Document-level natural language inference (DocNLI) is a new and challenging task in natural language processing that aims to judge the entailment relationship between a hypothesis document and a premise document.

Natural Language Inference Retrieval

A Mixture of Surprises for Unsupervised Reinforcement Learning

1 code implementation • 13 Oct 2022 • Andrew Zhao, Matthieu Gaetan Lin, Yangguang Li, Yong-Jin Liu, Gao Huang

However, both strategies rely on a strong assumption: the entropy of the environment's dynamics is either high or low.

reinforcement-learning Reinforcement Learning (RL) +1
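
The idea of combining surprise-maximizing and surprise-minimizing objectives can be sketched with a toy intrinsic-reward module. Here surprise is approximated as the negative log-likelihood of a state under a running Gaussian model of visited states, and `alpha` blends the two signs; this is a simplified stand-in for illustration, not the paper's actual algorithm:

```python
import numpy as np

class SurpriseMixture:
    """Blend surprise-maximizing and surprise-minimizing intrinsic rewards.

    Surprise is approximated as the negative log-density of a state under a
    running diagonal-Gaussian model of previously visited states.
    """

    def __init__(self, dim, alpha=0.5):
        self.alpha = alpha            # weight on the surprise-maximizing term
        self.mean = np.zeros(dim)
        self.var = np.ones(dim)
        self.count = 0

    def update(self, state):
        """Online update of the running mean/variance (Welford-style)."""
        self.count += 1
        delta = state - self.mean
        self.mean += delta / self.count
        self.var += (delta * (state - self.mean) - self.var) / self.count

    def reward(self, state):
        var = np.maximum(self.var, 1e-8)   # guard against collapsed variance
        # Surprise = negative log-likelihood under the Gaussian state model.
        nll = 0.5 * np.sum(np.log(2 * np.pi * var)
                           + (state - self.mean) ** 2 / var)
        # alpha=1 rewards surprising states; alpha=0 rewards familiar ones.
        return self.alpha * nll + (1 - self.alpha) * (-nll)
```

With `alpha=1` the agent is paid for visiting unfamiliar states (useful when dynamics entropy is low); with `alpha=0` it is paid for staying in familiar ones (useful when it is high).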

Neighbor Regularized Bayesian Optimization for Hyperparameter Optimization

no code implementations • 7 Oct 2022 • Lei Cui, Yangguang Li, Xin Lu, Dong An, Fenggang Liu

Bayesian Optimization (BO) is a common solution for searching for optimal hyperparameters based on sampled observations of a machine learning model.

Bayesian Optimization Hyperparameter Optimization
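
For context, a BO loop iterates two steps: fit a surrogate to past (hyperparameter, score) observations, then pick the next trial by optimizing an acquisition function over candidates. The toy loop below substitutes a distance-weighted surrogate and a simple exploration bonus for a real Gaussian-process posterior; it illustrates plain BO only, not the paper's neighbor regularization:

```python
import numpy as np

def bo_loop(objective, bounds, n_init=5, n_iter=20, beta=2.0, rng=0):
    """Minimal Bayesian-optimization-style loop over a 1D search space.

    Uses a distance-weighted average of past observations as a surrogate
    mean and the distance to the nearest observation as an exploration
    bonus, in place of a full Gaussian-process posterior.
    """
    rng = np.random.default_rng(rng)
    lo, hi = bounds
    X = list(rng.uniform(lo, hi, size=n_init))       # initial random design
    y = [objective(x) for x in X]
    for _ in range(n_iter):
        cand = rng.uniform(lo, hi, size=256)         # candidate pool
        Xa = np.array(X)
        d = np.abs(cand[:, None] - Xa[None, :]) + 1e-9
        w = 1.0 / d
        mu = (w * np.array(y)[None, :]).sum(1) / w.sum(1)  # surrogate mean
        bonus = d.min(1)                                    # exploration bonus
        x_next = cand[np.argmin(mu - beta * bonus)]  # acquisition (minimizing)
        X.append(x_next)
        y.append(objective(x_next))
    return X[int(np.argmin(y))], min(y)

best_x, best_y = bo_loop(lambda x: (x - 0.3) ** 2, bounds=(-1.0, 1.0))
```

Each iteration trades off exploiting the surrogate's predicted minimum against exploring regions far from all existing observations.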

Towards Accurate Binary Neural Networks via Modeling Contextual Dependencies

1 code implementation • 3 Sep 2022 • Xingrun Xing, Yangguang Li, Wei Li, Wenrui Ding, Yalong Jiang, Yufeng Wang, Jing Shao, Chunlei Liu, Xianglong Liu

Second, to improve the robustness of binary models with contextual dependencies, we compute the contextual dynamic embeddings to determine the binarization thresholds in general binary convolutional blocks.

Binarization Inductive Bias
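
The notion of an input-dependent binarization threshold can be illustrated with a toy rule where each channel's threshold is its own spatial mean, a simple stand-in for the learned contextual dynamic embeddings described above:

```python
import numpy as np

def binarize_with_dynamic_threshold(x):
    """Binarize activations with a per-channel, input-dependent threshold.

    x: (C, H, W) feature map. The threshold for each channel is its own
    spatial mean (a placeholder for a learned contextual embedding), so
    the binarization adapts to the input instead of using a fixed cut.
    """
    t = x.mean(axis=(1, 2), keepdims=True)   # (C, 1, 1) dynamic thresholds
    return np.where(x >= t, 1.0, -1.0)
```

A fixed threshold would discard any information about how activations are distributed within each input; a dynamic one preserves the relative structure per channel.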

Task-Balanced Distillation for Object Detection

no code implementations • 5 Aug 2022 • Ruining Tang, Zhenyu Liu, Yangguang Li, Yiguo Song, Hui Liu, Qide Wang, Jing Shao, Guifang Duan, Jianrong Tan

To alleviate this problem, a novel Task-decoupled Feature Distillation (TFD) is proposed by flexibly balancing the contributions of classification and regression tasks.

Classification Knowledge Distillation +3
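
The balancing idea can be sketched as a distillation loss that weights the classification-branch and regression-branch feature mismatches separately before combining them; the weighting scheme here is a hypothetical placeholder, not the paper's exact formulation:

```python
import numpy as np

def balanced_distillation_loss(f_student, f_teacher, w_cls, w_reg):
    """Combine per-task feature-distillation losses with explicit weights.

    f_student / f_teacher: dicts holding 'cls' and 'reg' feature arrays.
    w_cls / w_reg: non-negative task weights (normalized to sum to 1 here).
    """
    total_w = w_cls + w_reg
    w_cls, w_reg = w_cls / total_w, w_reg / total_w
    l_cls = np.mean((f_student['cls'] - f_teacher['cls']) ** 2)  # cls mismatch
    l_reg = np.mean((f_student['reg'] - f_teacher['reg']) ** 2)  # reg mismatch
    return w_cls * l_cls + w_reg * l_reg
```

Decoupling the two terms lets the weights be tuned (or learned) so that neither the classification nor the localization signal dominates the distillation.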

MVP: Robust Multi-View Practice for Driving Action Localization

no code implementations • 5 Jul 2022 • Jingjie Shang, Kunchang Li, Kaibin Tian, Haisheng Su, Yangguang Li

Due to the small data scale and unclear action boundary, the dataset presents a unique challenge to precisely localize all the different actions and classify their categories.

Action Localization

1st Place Solutions for RxR-Habitat Vision-and-Language Navigation Competition (CVPR 2022)

1 code implementation • 23 Jun 2022 • Dong An, Zun Wang, Yangguang Li, Yi Wang, Yicong Hong, Yan Huang, Liang Wang, Jing Shao

Our model consists of three modules: the candidate waypoints predictor (CWP), the history enhanced planner and the tryout controller.

Data Augmentation Vision and Language Navigation

SupMAE: Supervised Masked Autoencoders Are Efficient Vision Learners

2 code implementations • 28 May 2022 • Feng Liang, Yangguang Li, Diana Marculescu

The proposed Supervised MAE (SupMAE) only exploits a visible subset of image patches for classification, unlike the standard supervised pre-training where all image patches are used.

Representation Learning Transfer Learning
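
The core trick of classifying from only the visible patches can be sketched as follows: random masking, mean-pooling the kept patch embeddings, then a linear head. This is a toy stand-in (with a hypothetical linear classifier in place of the actual transformer encoder):

```python
import numpy as np

def classify_from_visible_patches(patches, W, b, mask_ratio=0.75, rng=0):
    """Classify an image from a random visible subset of its patch embeddings.

    patches: (num_patches, dim); W: (dim, num_classes); b: (num_classes,).
    Only the unmasked patches are mean-pooled and fed to a linear head,
    mimicking supervised training on the visible subset alone.
    """
    rng = np.random.default_rng(rng)
    n = patches.shape[0]
    keep = rng.permutation(n)[int(n * mask_ratio):]   # indices kept visible
    pooled = patches[keep].mean(axis=0)               # pool only visible patches
    logits = pooled @ W + b
    return int(np.argmax(logits)), keep
```

Because the encoder only ever sees the visible subset, the cost per step drops with the mask ratio, which is where the efficiency claim comes from.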

Democratizing Contrastive Language-Image Pre-training: A CLIP Benchmark of Data, Model, and Supervision

1 code implementation • 11 Mar 2022 • Yufeng Cui, Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao

This is because researchers do not adopt consistent training recipes and even use different data, hampering fair comparison between methods.

RePre: Improving Self-Supervised Vision Transformer with Reconstructive Pre-training

no code implementations • 18 Jan 2022 • Luya Wang, Feng Liang, Yangguang Li, Honggang Zhang, Wanli Ouyang, Jing Shao

Recently, self-supervised vision transformers have attracted unprecedented attention for their impressive representation learning ability.

Contrastive Learning Representation Learning

SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples

1 code implementation • 16 Jan 2022 • Hao Wang, Yangguang Li, Zhen Huang, Yong Dou, Lingpeng Kong, Jing Shao

To alleviate feature suppression, we propose contrastive learning for unsupervised sentence embedding with soft negative samples (SNCSE).

Contrastive Learning Data Augmentation +5
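
One way to use a soft negative (e.g. a negated paraphrase of the anchor) is a hinge on the similarity gap to the positive: the soft negative should be less similar than the positive, but is not pushed as far away as a random negative. The sketch below is a generic margin formulation assumed for illustration, not the paper's exact loss:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def soft_negative_margin_loss(anchor, positive, soft_negative, margin=0.2):
    """Hinge penalty when the soft negative's similarity to the anchor
    comes within `margin` of the positive's similarity."""
    s_pos = cosine(anchor, positive)
    s_soft = cosine(anchor, soft_negative)
    return max(0.0, s_soft - s_pos + margin)
```

The loss is zero once the positive leads the soft negative by at least the margin, so semantically close but logically opposite sentences are separated without being treated as hard negatives.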

INTERN: A New Learning Paradigm Towards General Vision

no code implementations • 16 Nov 2021 • Jing Shao, Siyu Chen, Yangguang Li, Kun Wang, Zhenfei Yin, Yinan He, Jianing Teng, Qinghong Sun, Mengya Gao, Jihao Liu, Gengshi Huang, Guanglu Song, Yichao Wu, Yuming Huang, Fenggang Liu, Huan Peng, Shuo Qin, Chengyu Wang, Yujie Wang, Conghui He, Ding Liang, Yu Liu, Fengwei Yu, Junjie Yan, Dahua Lin, Xiaogang Wang, Yu Qiao

Enormous waves of technological innovation over the past several years, marked by advances in AI technologies, are profoundly reshaping industry and society.

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

2 code implementations • ICLR 2022 • Yangguang Li, Feng Liang, Lichen Zhao, Yufeng Cui, Wanli Ouyang, Jing Shao, Fengwei Yu, Junjie Yan

Recently, large-scale Contrastive Language-Image Pre-training (CLIP) has attracted unprecedented attention for its impressive zero-shot recognition ability and excellent transferability to downstream tasks.

Zero-Shot Learning
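
CLIP-style pre-training optimizes a symmetric InfoNCE loss over the image-text similarity matrix of a batch, with matched pairs on the diagonal serving as positives for both directions. A minimal NumPy version:

```python
import numpy as np

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE over an image-text similarity matrix: matched
    pairs sit on the diagonal and are the positives for both directions."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature            # (B, B) similarity matrix

    def log_softmax(z):
        m = z.max(axis=1, keepdims=True)          # stabilize the exponent
        return z - m - np.log(np.exp(z - m).sum(axis=1, keepdims=True))

    diag = np.arange(logits.shape[0])
    loss_i2t = -log_softmax(logits)[diag, diag].mean()    # image -> text
    loss_t2i = -log_softmax(logits.T)[diag, diag].mean()  # text -> image
    return 0.5 * (loss_i2t + loss_t2i)
```

Methods such as DeCLIP keep this objective but add extra supervision signals on top of it to reduce the amount of paired data required.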
