Search Results for author: Yongtao Wang

Found 33 papers, 23 papers with code

HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras

1 code implementation 3 Apr 2024 Zhongyu Xia, Zhiwei Lin, Xinhao Wang, Yongtao Wang, Yun Xing, Shengxiang Qi, Nan Dong, Ming-Hsuan Yang

Three-dimensional perception from multi-view cameras is a crucial component in autonomous driving systems, which involves multiple tasks like 3D object detection and bird's-eye-view (BEV) semantic segmentation.

3D Object Detection Autonomous Driving +2

RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection

1 code implementation CVPR 2024 Zhiwei Lin, Zhe Liu, Zhongyu Xia, Xinhao Wang, Yongtao Wang, Shengxiang Qi, Yang Dong, Nan Dong, Le Zhang, Ce Zhu

In the dual-stream radar backbone, a point-based encoder and a transformer-based encoder are proposed to extract radar features, with an injection and extraction module to facilitate communication between the two encoders.

3D Object Detection (RoI) Autonomous Driving +3

Towards Fair and Comprehensive Comparisons for Image-Based 3D Object Detection

no code implementations ICCV 2023 Xinzhu Ma, Yongtao Wang, Yinmin Zhang, Zhiyi Xia, Yuan Meng, Zhihui Wang, Haojie Li, Wanli Ouyang

In this work, we build a modular-designed codebase, formulate strong training recipes, design an error diagnosis toolbox, and discuss current methods for image-based 3D object detection.

3D Object Detection Object +1

SAMPLING: Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image

no code implementations ICCV 2023 Xiaoyu Zhou, Zhiwei Lin, Xiaojun Shan, Yongtao Wang, Deqing Sun, Ming-Hsuan Yang

Recent novel view synthesis methods obtain promising results for relatively small scenes, e.g., indoor environments and scenes with a few objects, but tend to fail for unbounded outdoor scenes with a single image as input.

Novel View Synthesis

DUAW: Data-free Universal Adversarial Watermark against Stable Diffusion Customization

no code implementations 19 Aug 2023 Xiaoyu Ye, Hao Huang, Jiaqi An, Yongtao Wang

Stable Diffusion (SD) customization approaches enable users to personalize SD model outputs, greatly enhancing the flexibility and diversity of AI art.

Diversity Language Modelling +1

DynamicDet: A Unified Dynamic Architecture for Object Detection

1 code implementation CVPR 2023 ZhiHao Lin, Yongtao Wang, Jinhe Zhang, Xiaojie Chu

We also present a novel optimization strategy with an exiting criterion based on the detection losses for our dynamic detectors.

Computational Efficiency Object +1

A Simple and Generic Framework for Feature Distillation via Channel-wise Transformation

no code implementations 23 Mar 2023 Ziwei Liu, Yongtao Wang, Xiaojie Chu

Specifically, we propose a learnable nonlinear channel-wise transformation to align the features of the student and the teacher model.

Image Classification Instance Segmentation +5

BEV-MAE: Bird's Eye View Masked Autoencoders for Point Cloud Pre-training in Autonomous Driving Scenarios

1 code implementation 12 Dec 2022 Zhiwei Lin, Yongtao Wang, Shengxiang Qi, Nan Dong, Ming-Hsuan Yang

Based on the property of outdoor point clouds in autonomous driving scenarios, i.e., the point clouds of distant objects are more sparse, we propose point density prediction to enable the 3D encoder to learn location information, which is essential for object detection.

3D Object Detection Autonomous Driving +3

T-SEA: Transfer-based Self-Ensemble Attack on Object Detection

1 code implementation CVPR 2023 Hao Huang, Ziyan Chen, Huanran Chen, Yongtao Wang, Kevin Zhang

Then, we analogize patch optimization with regular model optimization, proposing a series of self-ensemble approaches on the input data, the attacked model, and the adversarial patch to efficiently make use of the limited information and prevent the patch from overfitting.

Adversarial Attack Model Optimization +2

Foreground Guidance and Multi-Layer Feature Fusion for Unsupervised Object Discovery with Transformers

1 code implementation 24 Oct 2022 Zhiwei Lin, Zengyu Yang, Yongtao Wang

Firstly, we present a foreground guidance strategy with an off-the-shelf UOD detector to highlight the foreground regions on the feature maps and then refine object locations in an iterative fashion.

Object object-detection +2

Differentiable Architecture Search with Random Features

no code implementations CVPR 2023 Xuanyang Zhang, Yonggang Li, Xiangyu Zhang, Yongtao Wang, Jian Sun

Differentiable architecture search (DARTS) has significantly promoted the development of NAS techniques because of its high search efficiency and effectiveness but suffers from performance collapse.

Neural Architecture Search

FlowNAS: Neural Architecture Search for Optical Flow Estimation

1 code implementation 4 Jul 2022 Zhiwei Lin, TingTing Liang, Taihong Xiao, Yongtao Wang, Zhi Tang, Ming-Hsuan Yang

To address this issue, we propose a neural architecture search method named FlowNAS to automatically find a better encoder architecture for the flow estimation task.

Image Classification Neural Architecture Search +1

IterVM: Iterative Vision Modeling Module for Scene Text Recognition

1 code implementation 6 Apr 2022 Xiaojie Chu, Yongtao Wang

By combining the proposed IterVM with an iterative language modeling module, we further propose a powerful scene text recognizer called IterNet.

Language Modelling Scene Text Recognition

Training Protocol Matters: Towards Accurate Scene Text Recognition via Training Protocol Searching

2 code implementations 13 Mar 2022 Xiaojie Chu, Yongtao Wang, Chunhua Shen, Jingdong Chen, Wei Chu

The development of scene text recognition (STR) in the era of deep learning has been mainly focused on novel architectures of STR models.

Scene Text Recognition

Continual Contrastive Learning for Image Classification

1 code implementation 5 Jul 2021 Zhiwei Lin, Yongtao Wang, Hongxiang Lin

In this paper, we make the first attempt to tackle the catastrophic forgetting problem in the mainstream self-supervised methods, i.e., contrastive learning methods.

Classification Continual Learning +5

CBNet: A Composite Backbone Network Architecture for Object Detection

4 code implementations 1 Jul 2021 TingTing Liang, Xiaojie Chu, Yudong Liu, Yongtao Wang, Zhi Tang, Wei Chu, Jingdong Chen, Haibin Ling

With multi-scale testing, we push the current best single model result to a new record of 60.1% box AP and 52.3% mask AP without using extra training data.

Instance Segmentation Object +2

CMUA-Watermark: A Cross-Model Universal Adversarial Watermark for Combating Deepfakes

1 code implementation 23 May 2021 Hao Huang, Yongtao Wang, Zhaoyu Chen, Yuze Zhang, Yuheng Li, Zhi Tang, Wei Chu, Jingdong Chen, Weisi Lin, Kai-Kuang Ma

Then, we design a two-level perturbation fusion strategy to alleviate the conflict between the adversarial watermarks generated by different facial images and models.

Adversarial Attack Face Swapping +1

RPATTACK: Refined Patch Attack on General Object Detectors

1 code implementation 23 Mar 2021 Hao Huang, Yongtao Wang, Zhaoyu Chen, Zhi Tang, Wenqiang Zhang, Kai-Kuang Ma

Firstly, we propose a patch selection and refining scheme to find the pixels which have the greatest importance for attack and remove the inconsequential perturbations gradually.

OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection

1 code implementation CVPR 2021 TingTing Liang, Yongtao Wang, Zhi Tang, Guosheng Hu, Haibin Ling

Encouraged by the success, we propose a novel One-Shot Path Aggregation Network Architecture Search (OPANAS) algorithm, which significantly improves both searching efficiency and detection accuracy.

Neural Architecture Search object-detection +1

Learning a Single Model with a Wide Range of Quality Factors for JPEG Image Artifacts Removal

1 code implementation 15 Sep 2020 Jianwei Li, Yongtao Wang, Haihua Xie, Kai-Kuang Ma

Our proposed network is a single model approach that can be trained for handling a wide range of quality factors while consistently delivering superior or comparable image artifacts removal performance.

Blocking JPEG Artifact Correction +1

GSTO: Gated Scale-Transfer Operation for Multi-Scale Feature Learning in Pixel Labeling

1 code implementation 27 May 2020 Zhuoying Wang, Yongtao Wang, Zhi Tang, Yangyan Li, Ying Chen, Haibin Ling, Weisi Lin

Existing CNN-based methods for pixel labeling heavily depend on multi-scale features to meet the requirements of both semantic comprehension and detail preservation.

Pose Estimation Semantic Segmentation

MixTConv: Mixed Temporal Convolutional Kernels for Efficient Action Recognition

no code implementations 19 Jan 2020 Kaiyu Shan, Yongtao Wang, Zhuoying Wang, TingTing Liang, Zhi Tang, Ying Chen, Yangyan Li

To efficiently extract spatiotemporal features of video for action recognition, most state-of-the-art methods integrate 1D temporal convolution into a conventional 2D CNN backbone.

Action Recognition

CBNet: A Novel Composite Backbone Network Architecture for Object Detection

6 code implementations 9 Sep 2019 Yudong Liu, Yongtao Wang, Siwei Wang, Ting-Ting Liang, Qijie Zhao, Zhi Tang, Haibin Ling

In existing CNN-based detectors, the backbone network is a very important component for basic feature extraction, and the performance of the detectors highly depends on it.

Instance Segmentation object-detection +2

M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network

12 code implementations 12 Nov 2018 Qijie Zhao, Tao Sheng, Yongtao Wang, Zhi Tang, Ying Chen, Ling Cai, Haibin Ling

Finally, we gather up the decoder layers with equivalent scales (sizes) to develop a feature pyramid for object detection, in which every feature map consists of the layers (features) from multiple levels.

Decoder Object +2

Deep Dual Pyramid Network for Barcode Segmentation using Barcode-30k Database

no code implementations 31 Jul 2018 Qijie Zhao, Feng Ni, Yang Song, Yongtao Wang, Zhi Tang

Specifically, a synthesizing method was proposed to generate well-annotated images containing barcode and QR code labels, which greatly reduces the annotation time.

Segmentation Semantic Segmentation

CFENet: An Accurate and Efficient Single-Shot Object Detector for Autonomous Driving

1 code implementation 26 Jun 2018 Qijie Zhao, Tao Sheng, Yongtao Wang, Feng Ni, Ling Cai

The ability to detect small objects and the speed of the object detector are very important for autonomous driving applications. In this paper, we propose an effective yet efficient one-stage detector, which took second place in the Road Object Detection competition of the CVPR 2018 Workshop of Autonomous Driving (WAD).

Autonomous Driving object-detection +1

Mutual Enhancement for Detection of Multiple Logos in Sports Videos

no code implementations ICCV 2017 Yuan Liao, Xiaoqing Lu, Chengcui Zhang, Yongtao Wang, Zhi Tang

Mutual enhancement is also included in our frame propagation mechanism that improves logo detection by utilizing the continuity of logos across frames.

object-detection Object Detection +1
