Search Results for author: Wenyu Liu

Found 116 papers, 77 papers with code

You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection

2 code implementations • NeurIPS 2021 • Yuxin Fang, Bencheng Liao, Xinggang Wang, Jiemin Fang, Jiyang Qi, Rui Wu, Jianwei Niu, Wenyu Liu

Can Transformer perform 2D object- and region-level recognition from a pure sequence-to-sequence perspective with minimal knowledge about the 2D spatial structure?

Ranked #30 on Object Detection on COCO-O

Object object-detection +1

124,527

Paper
Code

Deep High-Resolution Representation Learning for Visual Recognition

42 code implementations • 20 Aug 2019 • Jingdong Wang, Ke Sun, Tianheng Cheng, Borui Jiang, Chaorui Deng, Yang Zhao, Dong Liu, Yadong Mu, Mingkui Tan, Xinggang Wang, Wenyu Liu, Bin Xiao

High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection.

Ranked #1 on Object Detection on COCO test-dev (Hardware Burden metric)

Dichotomous Image Segmentation Face Alignment +7

27,708

Paper
Code

Instances as Queries

5 code implementations • ICCV 2021 • Yuxin Fang, Shusheng Yang, Xinggang Wang, Yu Li, Chen Fang, Ying Shan, Bin Feng, Wenyu Liu

The key insight of QueryInst is to leverage the intrinsic one-to-one correspondence in object queries across different stages, as well as one-to-one correspondence between mask RoI features and object queries in the same stage.

Ranked #13 on Object Detection on COCO-O (using extra training data)

Instance Segmentation Object +4

27,708

Paper
Code

ByteTrack: Multi-Object Tracking by Associating Every Detection Box

10 code implementations • arXiv 2021 • Yifu Zhang, Peize Sun, Yi Jiang, Dongdong Yu, Fucheng Weng, Zehuan Yuan, Ping Luo, Wenyu Liu, Xinggang Wang

ByteTrack also achieves state-of-the-art performance on MOT20, HiEve and BDD100K tracking benchmarks.

Ranked #1 on Multiple Object Tracking on BDD100K val

Multi-Object Tracking Multiple Object Tracking +1

12,032

Paper
Code

High-Resolution Representations for Labeling Pixels and Regions

39 code implementations • 9 Apr 2019 • Ke Sun, Yang Zhao, Borui Jiang, Tianheng Cheng, Bin Xiao, Dong Liu, Yadong Mu, Xinggang Wang, Wenyu Liu, Jingdong Wang

The proposed approach achieves superior results to existing single-model networks on COCO object detection.

Ranked #7 on Semantic Segmentation on LIP val

Face Alignment Facial Landmark Detection +5

12,029

Paper
Code

FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking

32 code implementations • 4 Apr 2020 • Yifu Zhang, Chunyu Wang, Xinggang Wang, Wen-Jun Zeng, Wenyu Liu

Formulating MOT as multi-task learning of object detection and re-ID in a single network is appealing since it allows joint optimization of the two tasks and enjoys high computation efficiency.

Ranked #1 on Multi-Object Tracking on 2DMOT15 (using extra training data)

Fairness Multi-Object Tracking +4

12,029

Paper
Code

CCNet: Criss-Cross Attention for Semantic Segmentation

4 code implementations • ICCV 2019 • Zilong Huang, Xinggang Wang, Yunchao Wei, Lichao Huang, Humphrey Shi, Wenyu Liu, Thomas S. Huang

Compared with the non-local block, the proposed recurrent criss-cross attention module requires 11x less GPU memory usage.

Ranked #7 on Semantic Segmentation on FoodSeg103 (using extra training data)

Computational Efficiency Human Parsing +8

7,370

Paper
Code

YOLO-World: Real-Time Open-Vocabulary Object Detection

1 code implementation • 30 Jan 2024 • Tianheng Cheng, Lin Song, Yixiao Ge, Wenyu Liu, Xinggang Wang, Ying Shan

The You Only Look Once (YOLO) series of detectors have established themselves as efficient and practical tools.

Instance Segmentation Language Modelling +4

3,296

Paper
Code

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

5 code implementations • 17 Jan 2024 • Lianghui Zhu, Bencheng Liao, Qian Zhang, Xinlong Wang, Wenyu Liu, Xinggang Wang

The results demonstrate that Vim is capable of overcoming the computation & memory constraints on performing Transformer-style understanding for high-resolution images and it has great potential to be the next-generation backbone for vision foundation models.

object-detection Object Detection +3

2,007

Paper
Code

YOLOP: You Only Look Once for Panoptic Driving Perception

5 code implementations • 25 Aug 2021 • Dong Wu, Manwen Liao, Weitian Zhang, Xinggang Wang, Xiang Bai, Wenqing Cheng, Wenyu Liu

A panoptic driving perception system is an essential part of autonomous driving.

Ranked #3 on Drivable Area Detection on BDD100K val

Autonomous Driving Drivable Area Detection +5

1,812

Paper
Code

4D Gaussian Splatting for Real-Time Dynamic Scene Rendering

1 code implementation • 12 Oct 2023 • Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, Xinggang Wang

Representing and rendering dynamic scenes has been an important but challenging task.

1,644

Paper
Code

MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction

1 code implementation • 30 Aug 2022 • Bencheng Liao, Shaoyu Chen, Xinggang Wang, Tianheng Cheng, Qian Zhang, Wenyu Liu, Chang Huang

High-definition (HD) maps provide abundant and precise environmental information about the driving scene, serving as a fundamental and indispensable component for planning in autonomous driving systems.

Ranked #7 on 3D Lane Detection on OpenLane-V2 val

3D Lane Detection Autonomous Driving

903

Paper
Code

VAD: Vectorized Scene Representation for Efficient Autonomous Driving

2 code implementations • ICCV 2023 • Bo Jiang, Shaoyu Chen, Qing Xu, Bencheng Liao, Jiajie Chen, Helong Zhou, Qian Zhang, Wenyu Liu, Chang Huang, Xinggang Wang

In this paper, we propose VAD, an end-to-end vectorized paradigm for autonomous driving, which models the driving scene as a fully vectorized representation.

Autonomous Driving Trajectory Planning

903

Paper
Code

VMA: Divide-and-Conquer Vectorized Map Annotation System for Large-Scale Driving Scene

2 code implementations • 19 Apr 2023 • Shaoyu Chen, Yunchi Zhang, Bencheng Liao, Jiafeng Xie, Tianheng Cheng, Wei Sui, Qian Zhang, Chang Huang, Wenyu Liu, Xinggang Wang

We design a divide-and-conquer annotation scheme to solve the spatial extensibility problem of HD map generation, and abstract map elements with a variety of geometric patterns as unified point sequence representation, which can be extended to most map elements in the driving scene.

Autonomous Driving

903

Paper
Code

MapTRv2: An End-to-End Framework for Online Vectorized HD Map Construction

1 code implementation • 10 Aug 2023 • Bencheng Liao, Shaoyu Chen, Yunchi Zhang, Bo Jiang, Qian Zhang, Wenyu Liu, Chang Huang, Xinggang Wang

We propose a unified permutation-equivalent modeling approach, i.e., modeling each map element as a point set with a group of equivalent permutations, which accurately describes the shape of the map element and stabilizes the learning process.

Autonomous Driving

903

Paper
Code

TextBoxes: A Fast Text Detector with a Single Deep Neural Network

3 code implementations • 21 Nov 2016 • Minghui Liao, Baoguang Shi, Xiang Bai, Xinggang Wang, Wenyu Liu

This paper presents an end-to-end trainable fast scene text detector, named TextBoxes, which detects scene text with both high accuracy and efficiency in a single network forward pass, involving no post-process except for a standard non-maximum suppression.

629

Paper
Code

ResizeMix: Mixing Data with Preserved Object Information and True Labels

1 code implementation • 21 Dec 2020 • Jie Qin, Jiemin Fang, Qian Zhang, Wenyu Liu, Xingang Wang, Xinggang Wang

Especially, CutMix uses a simple but effective method to improve the classifiers by randomly cropping a patch from one image and pasting it on another image.

Data Augmentation Image Classification +3

567

Paper
Code

Sparse Instance Activation for Real-Time Instance Segmentation

2 code implementations • CVPR 2022 • Tianheng Cheng, Xinggang Wang, Shaoyu Chen, Wenqiang Zhang, Qian Zhang, Chang Huang, Zhaoxiang Zhang, Wenyu Liu

In this paper, we propose a conceptually novel, efficient, and fully convolutional framework for real-time instance segmentation.

Ranked #8 on Real-time Instance Segmentation on MSCOCO

Object object-detection +4

561

Paper
Code

GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models

1 code implementation • 12 Oct 2023 • Taoran Yi, Jiemin Fang, Junjie Wang, Guanjun Wu, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Qi Tian, Xinggang Wang

In recent times, the generation of 3D assets from text prompts has shown impressive results.

Text to 3D

515

Paper
Code

Image-Adaptive YOLO for Object Detection in Adverse Weather Conditions

1 code implementation • 15 Dec 2021 • Wenyu Liu, Gaofeng Ren, Runsheng Yu, Shi Guo, Jianke Zhu, Lei Zhang

Though deep learning-based object detection methods have achieved promising results on the conventional datasets, it is still challenging to locate objects from the low-quality images captured in adverse weather conditions.

Image Enhancement object-detection +1

463

Paper
Code

Improving Nighttime Driving-Scene Segmentation via Dual Image-adaptive Learnable Filters

2 code implementations • 4 Jul 2022 • Wenyu Liu, Wentong Li, Jianke Zhu, Miaomiao Cui, Xuansong Xie, Lei Zhang

With DIAL-Filters, we design both unsupervised and supervised frameworks for nighttime driving-scene segmentation, which can be trained in an end-to-end manner.

Autonomous Driving Scene Segmentation +1

463

Paper
Code

Matte Anything: Interactive Natural Image Matting with Segment Anything Models

1 code implementation • 7 Jun 2023 • Jingfeng Yao, Xinggang Wang, Lang Ye, Wenyu Liu

In our work, we leverage vision foundation models to enhance the performance of natural image matting.

Image Matting

402

Paper
Code

Box2Mask: Box-supervised Instance Segmentation via Level-set Evolution

2 code implementations • 3 Dec 2022 • Wentong Li, Wenyu Liu, Jianke Zhu, Miaomiao Cui, Risheng Yu, Xiansheng Hua, Lei Zhang

In contrast to fully supervised methods using pixel-wise mask labels, box-supervised instance segmentation takes advantage of simple box annotations, which has recently attracted increasing research attention.

Ranked #1 on Box-supervised Instance Segmentation on PASCAL VOC 2012 val

Box-supervised Instance Segmentation Segmentation

401

Paper
Code

Tracking Instances as Queries

1 code implementation • 22 Jun 2021 • Shusheng Yang, Yuxin Fang, Xinggang Wang, Yu Li, Ying Shan, Bin Feng, Wenyu Liu

Recently, query-based deep networks have attracted considerable attention owing to their end-to-end pipeline and competitive results on several fundamental computer vision tasks, such as object detection, semantic segmentation, and instance segmentation.

Instance Segmentation object-detection +4

400

Paper
Code

Temporally Efficient Vision Transformer for Video Instance Segmentation

3 code implementations • CVPR 2022 • Shusheng Yang, Xinggang Wang, Yu Li, Yuxin Fang, Jiemin Fang, Wenyu Liu, Xun Zhao, Ying Shan

To effectively and efficiently model the crucial temporal information within a video clip, we propose a Temporally Efficient Vision Transformer (TeViT) for video instance segmentation (VIS).

Ranked #35 on Video Instance Segmentation on OVIS validation

Instance Segmentation Semantic Segmentation +1

400

Paper
Code

TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation

3 code implementations • CVPR 2022 • Wenqiang Zhang, Zilong Huang, Guozhong Luo, Tao Chen, Xinggang Wang, Wenyu Liu, Gang Yu, Chunhua Shen

Although vision transformers (ViTs) have achieved great success in computer vision, the heavy computational cost hampers their applications to dense prediction tasks such as semantic segmentation on mobile devices.

Segmentation Semantic Segmentation

372

Paper
Code

When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition

2 code implementations • 23 Jul 2022 • Bohan Li, Ye Yuan, Dingkang Liang, Xiao Liu, Zhilong Ji, Jinfeng Bai, Wenyu Liu, Xiang Bai

Recently, most handwritten mathematical expression recognition (HMER) methods adopt the encoder-decoder networks, which directly predict the markup sequences from formula images with the attention mechanism.

Optical Character Recognition (OCR)

342

Paper
Code

Fast Dynamic Radiance Fields with Time-Aware Neural Voxels

1 code implementation • 30 May 2022 • Jiemin Fang, Taoran Yi, Xinggang Wang, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Matthias Nießner, Qi Tian

A multi-distance interpolation method is proposed and applied on voxel features to model both small and large motions.

307

Paper
Code

Densely Connected Search Space for More Flexible Neural Architecture Search

1 code implementation • CVPR 2020 • Jiemin Fang, Yuzhu Sun, Qian Zhang, Yuan Li, Wenyu Liu, Xinggang Wang

We revisit the search space design in most previous NAS methods and find the number and widths of blocks are set manually.

Ranked #91 on Neural Architecture Search on ImageNet

Image Classification Neural Architecture Search

295

Paper
Code

Weakly-Supervised Semantic Segmentation Network With Deep Seeded Region Growing

1 code implementation • CVPR 2018 • Zilong Huang, Xinggang Wang, Jiasi Wang, Wenyu Liu, Jingdong Wang

Inspired by the traditional image segmentation method of seeded region growing, we propose to train a semantic segmentation network starting from the discriminative regions and progressively increase the pixel-level supervision using seeded region growing.

Ranked #38 on Weakly-Supervised Semantic Segmentation on COCO 2014 val (using extra training data)

Image Segmentation Segmentation +2

248

Paper
Code

Multiple Instance Detection Network with Online Instance Classifier Refinement

4 code implementations • CVPR 2017 • Peng Tang, Xinggang Wang, Xiang Bai, Wenyu Liu

We propose a novel online instance classifier refinement algorithm to integrate MIL and the instance classifier refinement procedure into a single deep network, and train the network end-to-end with only image-level supervision, i.e., without object location information.

Ranked #4 on Weakly Supervised Object Detection on ImageNet

Multiple Instance Learning Object +3

245

Paper
Code

PCL: Proposal Cluster Learning for Weakly Supervised Object Detection

4 code implementations • 9 Jul 2018 • Peng Tang, Xinggang Wang, Song Bai, Wei Shen, Xiang Bai, Wenyu Liu, Alan Yuille

The iterative instance classifier refinement is implemented online using multiple streams in convolutional neural networks, where the first is an MIL network and the others are for instance classifier refinement supervised by the preceding one.

Ranked #1 on Weakly Supervised Object Detection on ImageNet

Multiple Instance Learning Object +3

245

Paper
Code

Efficient and Robust 2D-to-BEV Representation Learning via Geometry-guided Kernel Transformer

1 code implementation • 9 Jun 2022 • Shaoyu Chen, Tianheng Cheng, Xinggang Wang, Wenming Meng, Qian Zhang, Wenyu Liu

GKT leverages the geometric priors to guide the transformer to focus on discriminative regions and unfolds kernel features to generate BEV representation.

Autonomous Driving Representation Learning

197

Paper
Code

Boundary-preserving Mask R-CNN

1 code implementation • ECCV 2020 • Tianheng Cheng, Xinggang Wang, Lichao Huang, Wenyu Liu

Besides, it is not surprising to observe that BMask R-CNN obtains more obvious improvement when the evaluation criterion requires better localization (e.g., AP$_{75}$) as shown in Fig. 1.

Instance Segmentation Object +1

184

Paper
Code

Box-supervised Instance Segmentation with Level Set Evolution

1 code implementation • 19 Jul 2022 • Wentong Li, Wenyu Liu, Jianke Zhu, Miaomiao Cui, Xiansheng Hua, Lei Zhang

A simple mask supervised SOLOv2 model is adapted to predict the instance-aware mask map as the level set for each instance.

Box-supervised Instance Segmentation Segmentation

181

Paper
Code

RPTQ: Reorder-based Post-training Quantization for Large Language Models

1 code implementation • 3 Apr 2023 • Zhihang Yuan, Lin Niu, Jiawei Liu, Wenyu Liu, Xinggang Wang, Yuzhang Shang, Guangyu Sun, Qiang Wu, Jiaxiang Wu, Bingzhe Wu

In this paper, we identify that the challenge in quantizing activations in LLMs arises from varying ranges across channels, rather than solely the presence of outliers.

Quantization

172

Paper
Code

FNA++: Fast Network Adaptation via Parameter Remapping and Architecture Search

2 code implementations • 21 Jun 2020 • Jiemin Fang, Yuzhu Sun, Qian Zhang, Kangjian Peng, Yuan Li, Wenyu Liu, Xinggang Wang

In this paper, we propose a Fast Network Adaptation (FNA++) method, which can adapt both the architecture and parameters of a seed network (e.g., an ImageNet pre-trained network) to become a network with different depths, widths, or kernel sizes via a parameter remapping technique, making it possible to use NAS for segmentation and detection tasks a lot more efficiently.

Image Classification Neural Architecture Search +5

164

Paper
Code

Hierarchical Aggregation for 3D Instance Segmentation

1 code implementation • ICCV 2021 • Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang

Instance segmentation on point clouds is a fundamental task in 3D scene perception.

Ranked #4 on 3D Instance Segmentation on S3DIS (mCov metric, using extra training data)

3D Instance Segmentation Clustering +2

163

Paper
Code

AlignSeg: Feature-Aligned Segmentation Networks

1 code implementation • 24 Feb 2020 • Zilong Huang, Yunchao Wei, Xinggang Wang, Wenyu Liu, Thomas S. Huang, Humphrey Shi

Aggregating features in terms of different convolutional blocks or contextual embeddings has been proven to be an effective way to strengthen feature representations for semantic segmentation.

Segmentation Semantic Segmentation

124

Paper
Code

Symphonize 3D Semantic Scene Completion with Contextual Instance Queries

1 code implementation • 27 Jun 2023 • Haoyi Jiang, Tianheng Cheng, Naiyu Gao, Haoyang Zhang, Tianwei Lin, Wenyu Liu, Xinggang Wang

3D Semantic Scene Completion (SSC) has emerged as a nascent and pivotal undertaking in autonomous driving, aiming to predict voxel occupancy within volumetric scenes.

Ranked #1 on 3D Semantic Scene Completion on KITTI-360

3D Semantic Scene Completion from a single RGB image Autonomous Driving

123

Paper
Code

Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation

1 code implementation • 2 Jul 2019 • Qiang Zhou, Zilong Huang, Lichao Huang, Yongchao Gong, Han Shen, Chang Huang, Wenyu Liu, Xinggang Wang

Video object segmentation (VOS) aims at pixel-level object tracking given only the annotations in the first frame.

Ranked #1 on Visual Object Tracking on YouTube-VOS 2018 (Jaccard (Seen) metric)

Object Object Tracking +4

117

Paper
Code

WeakTr: Exploring Plain Vision Transformer for Weakly-supervised Semantic Segmentation

1 code implementation • 3 Apr 2023 • Lianghui Zhu, Yingyue Li, Jiemin Fang, Yan Liu, Hao Xin, Wenyu Liu, Xinggang Wang

Thus a novel weight-based method is proposed to end-to-end estimate the importance of attention heads, while the self-attention maps are adaptively fused for high-quality CAM results that tend to have more complete objects.

Ranked #2 on Weakly-Supervised Semantic Segmentation on PASCAL VOC 2012 train

Weakly-supervised Learning Weakly supervised Semantic Segmentation +1

115

Paper
Code

Lane Graph as Path: Continuity-preserving Path-wise Modeling for Online Lane Graph Construction

1 code implementation • 15 Mar 2023 • Bencheng Liao, Shaoyu Chen, Bo Jiang, Tianheng Cheng, Qian Zhang, Wenyu Liu, Chang Huang, Xinggang Wang

We present a path-based online lane graph construction method, termed LaneGAP, which end-to-end learns the path and recovers the lane graph via a Path2Graph algorithm.

Autonomous Driving graph construction +1

110

Paper
Code

SparseTrack: Multi-Object Tracking by Performing Scene Decomposition based on Pseudo-Depth

2 code implementations • 8 Jun 2023 • Zelin Liu, Xinggang Wang, Cheng Wang, Wenyu Liu, Xiang Bai

By integrating the pseudo-depth method and the DCM strategy into the data association process, we propose a new tracker, called SparseTrack.

Ranked #1 on Multi-Object Tracking on MOT20 (using extra training data)

Depth Estimation Multi-Object Tracking +1

107

Paper
Code

Deep Learning-based Detection for COVID-19 from Chest CT using Weak Label

1 code implementation • medRxiv 2020 • Chuansheng Zheng, Xianbo Deng, Qing Fu, Qiang Zhou, Jiapei Feng, Hui Ma, Wenyu Liu, Xinggang Wang

Our weakly-supervised deep learning model can accurately predict the COVID-19 infectious probability in chest CT volumes without the need for annotating the lesions for training.

COVID-19 Diagnosis Specificity

106

Paper
Code

Crossover Learning for Fast Online Video Instance Segmentation

1 code implementation • ICCV 2021 • Shusheng Yang, Yuxin Fang, Xinggang Wang, Yu Li, Chen Fang, Ying Shan, Bin Feng, Wenyu Liu

For temporal information modeling in VIS, we present a novel crossover learning scheme that uses the instance feature in the current frame to pixel-wisely localize the same instance in other frames.

Ranked #34 on Video Instance Segmentation on OVIS validation

Instance Segmentation Semantic Segmentation +2

Paper
Code

MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens

3 code implementations • CVPR 2022 • Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian

Transformers have offered a new methodology of designing neural networks for visual recognition.

Image Classification object-detection +1

Paper
Code

Scene Text Retrieval via Joint Text Detection and Similarity Learning

1 code implementation • CVPR 2021 • Hao Wang, Xiang Bai, Mingkun Yang, Shenggao Zhu, Jing Wang, Wenyu Liu

Such a task is usually realized by matching a query text to the recognized words, outputted by an end-to-end scene text spotter.

Retrieval Scene Text Detection +3

Paper
Code

LiDAR2Map: In Defense of LiDAR-Based Semantic Map Construction Using Online Camera Distillation

1 code implementation • CVPR 2023 • Song Wang, Wentong Li, Wenyu Liu, Xiaolu Liu, Jianke Zhu

To mitigate the defects caused by lacking semantic cues in LiDAR data, we present an online Camera-to-LiDAR distillation scheme to facilitate the semantic learning from image to point cloud.

Autonomous Driving

Paper
Code

Polar Parametrization for Vision-based Surround-View 3D Detection

1 code implementation • 22 Jun 2022 • Shaoyu Chen, Xinggang Wang, Tianheng Cheng, Qian Zhang, Chang Huang, Wenyu Liu

Based on Polar Parametrization, we propose a surround-view 3D DEtection TRansformer, named PolarDETR.

Inductive Bias Position

Paper
Code

BoxTeacher: Exploring High-Quality Pseudo Labels for Weakly Supervised Instance Segmentation

1 code implementation • CVPR 2023 • Tianheng Cheng, Xinggang Wang, Shaoyu Chen, Qian Zhang, Wenyu Liu

Most existing methods for weakly supervised instance segmentation focus on designing heuristic losses with priors from bounding boxes.

Ranked #2 on Box-supervised Instance Segmentation on COCO test-dev

Box-supervised Instance Segmentation Segmentation +3

Paper
Code

Multi-Oriented Text Detection with Fully Convolutional Networks

1 code implementation • CVPR 2016 • Zheng Zhang, Chengquan Zhang, Wei Shen, Cong Yao, Wenyu Liu, Xiang Bai

In this paper, we propose a novel approach for text detection in natural images.

Ranked #40 on Scene Text Detection on ICDAR 2015

Scene Text Detection Text Detection

Paper
Code

Diversity Transfer Network for Few-Shot Learning

1 code implementation • 31 Dec 2019 • Mengting Chen, Yuxin Fang, Xinggang Wang, Heng Luo, Yifeng Geng, Xin-Yu Zhang, Chang Huang, Wenyu Liu, Bo Wang

The learning problem of the sample generation (i.e., diversity transfer) is solved via minimizing an effective meta-classification loss in a single-stage network, instead of the generative loss in previous works.

Few-Shot Learning

Paper
Code

Dynamic Class Queue for Large Scale Face Recognition In the Wild

1 code implementation • CVPR 2021 • Bi Li, Teng Xi, Gang Zhang, Haocheng Feng, Junyu Han, Jingtuo Liu, Errui Ding, Wenyu Liu

Since only a subset of classes is selected for each iteration, the computing requirement is reduced.

Ranked #3 on Face Recognition on AgeDB-30

Face Recognition Representation Learning

Paper
Code

AziNorm: Exploiting the Radial Symmetry of Point Cloud for Azimuth-Normalized 3D Perception

1 code implementation • CVPR 2022 • Shaoyu Chen, Xinggang Wang, Tianheng Cheng, Wenqiang Zhang, Qian Zhang, Chang Huang, Wenyu Liu

For segmentation, we integrate AziNorm into KPConv.

object-detection Object Detection +1

Paper
Code

Label-efficient Segmentation via Affinity Propagation

1 code implementation • NeurIPS 2023 • Wentong Li, Yuqian Yuan, Song Wang, Wenyu Liu, Dongqi Tang, Jian Liu, Jianke Zhu, Lei Zhang

In this work, we formulate the affinity modeling as an affinity propagation process, and propose local and global pairwise affinity terms to generate accurate soft pseudo labels.

Box-supervised Instance Segmentation Segmentation +2

Paper
Code

Vision-based Uneven BEV Representation Learning with Polar Rasterization and Surface Estimation

1 code implementation • 5 Jul 2022 • Zhi Liu, Shaoyu Chen, Xiaojie Guo, Xinggang Wang, Tianheng Cheng, Hongmei Zhu, Qian Zhang, Wenyu Liu, Yi Zhang

In this work, we propose PolarBEV for vision-based uneven BEV representation learning.

Instance Segmentation Representation Learning +2

Paper
Code

Featurized Query R-CNN

1 code implementation • 13 Jun 2022 • Wenqiang Zhang, Tianheng Cheng, Xinggang Wang, Shaoyu Chen, Qian Zhang, Wenyu Liu

The query mechanism introduced in the DETR method is changing the paradigm of object detection, and many recent query-based methods have obtained strong object detection performance.

Object object-detection +1

Paper
Code

PD-Quant: Post-Training Quantization based on Prediction Difference Metric

1 code implementation • CVPR 2023 • Jiawei Liu, Lin Niu, Zhihang Yuan, Dawei Yang, Xinggang Wang, Wenyu Liu

It determines the quantization parameters by using the information of differences between network prediction before and after quantization.

Neural Network Compression Quantization

Paper
Code

Graph Contrastive Learning for Skeleton-based Action Recognition

1 code implementation • 26 Jan 2023 • Xiaohu Huang, Hao Zhou, Jian Wang, Haocheng Feng, Junyu Han, Errui Ding, Jingdong Wang, Xinggang Wang, Wenyu Liu, Bin Feng

In this paper, we propose a graph contrastive learning framework for skeleton-based action recognition (SkeletonGCL) to explore the global context across all sequences.

Ranked #9 on Skeleton Based Action Recognition on NTU RGB+D

Action Recognition Contrastive Learning +2

Paper
Code

NeuSample: Neural Sample Field for Efficient View Synthesis

1 code implementation • 30 Nov 2021 • Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian

Neural radiance fields (NeRF) have shown great potential in representing 3D scenes and synthesizing novel views, but the computational overhead of NeRF at the inference stage is still heavy.

Paper
Code

Knowledge Mining with Scene Text for Fine-Grained Recognition

1 code implementation • CVPR 2022 • Hao Wang, Junchao Liao, Tianheng Cheng, Zewen Gao, Hao Liu, Bo Ren, Xiang Bai, Wenyu Liu

Recently, the semantics of scene text has been proven to be essential in fine-grained image classification.

Activity Recognition Classification +1

Paper
Code

MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning

1 code implementation • 13 Mar 2024 • Jialv Zou, Bencheng Liao, Qian Zhang, Wenyu Liu, Xinggang Wang

Learning robust and scalable visual representations from massive multi-view video data remains a challenge in computer vision and autonomous driving.

3D Object Detection Autonomous Driving +2

Paper
Code

Deep Patch Learning for Weakly Supervised Object Classification and Discovery

1 code implementation • 6 May 2017 • Peng Tang, Xinggang Wang, Zilong Huang, Xiang Bai, Wenyu Liu

Patch-level image representation is very important for object classification and detection, since it is robust to spatial transformation, scale variation, and cluttered background.

Classification General Classification +3

Paper
Code

A Simple Adaptive Unfolding Network for Hyperspectral Image Reconstruction

1 code implementation • 24 Jan 2023 • Junyu Wang, Shijie Wang, Wenyu Liu, Zengqiang Zheng, Xinggang Wang

We present a simple, efficient, and scalable unfolding network, SAUNet, to simplify the network design with an adaptive alternate optimization framework for hyperspectral image (HSI) reconstruction.

Image Reconstruction

Paper
Code

Query6DoF: Learning Sparse Queries as Implicit Shape Prior for Category-Level 6DoF Pose Estimation

1 code implementation • ICCV 2023 • Ruiqi Wang, Xinggang Wang, Te Li, Rong Yang, Minhong Wan, Wenyu Liu

Category-level 6DoF object pose estimation intends to estimate the rotation, translation, and size of unseen objects.

6D Pose Estimation Point cloud reconstruction

Paper
Code

EAT-NAS: Elastic Architecture Transfer for Accelerating Large-scale Neural Architecture Search

1 code implementation • 17 Jan 2019 • Jiemin Fang, Yukang Chen, Xinbang Zhang, Qian Zhang, Chang Huang, Gaofeng Meng, Wenyu Liu, Xinggang Wang

In our implementations, architectures are first searched on a small dataset, e.g., CIFAR-10.

Neural Architecture Search

Paper
Code

Deep multi-metric learning for text-independent speaker verification

1 code implementation • 17 Jul 2020 • Jiwei Xu, Xinggang Wang, Bin Feng, Wenyu Liu

Text-independent speaker verification is an important artificial intelligence problem that has a wide spectrum of applications, such as criminal investigation, payment certification, and interest-based customer services.

Metric Learning Text-Independent Speaker Verification

Paper
Code

WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition

1 code implementation • 22 Feb 2024 • Lianghui Zhu, Junwei Zhou, Yan Liu, Xin Hao, Wenyu Liu, Xinggang Wang

Weakly supervised visual recognition using inexact supervision is a critical yet challenging learning problem.

object-detection Segmentation +2

Paper
Code

ViTGaze: Gaze Following with Interaction Features in Vision Transformers

1 code implementation • 19 Mar 2024 • Yuehao Song, Xinggang Wang, Jingfeng Yao, Wenyu Liu, Jinglin Zhang, Xiangmin Xu

Our method achieves state-of-the-art (SOTA) performance among all single-modality methods (3.4% improvement on AUC, 5.1% improvement on AP) and very comparable performance against multi-modality methods with 59% fewer parameters.

Paper
Code

Circuit as Set of Points

1 code implementation • NeurIPS 2023 • Jialv Zou, Xinggang Wang, Jiahao Guo, Wenyu Liu, Qian Zhang, Chang Huang

In our work, we propose a novel perspective for circuit design by treating circuit components as point clouds and using Transformer-based point cloud perception methods to extract features from the circuit.

Paper
Code

EfficientPose: Efficient Human Pose Estimation with Neural Architecture Search

1 code implementation • 13 Dec 2020 • Wenqiang Zhang, Jiemin Fang, Xinggang Wang, Wenyu Liu

Human pose estimation from image and video is a vital task in many multimedia applications.

Image Classification Neural Architecture Search +1

Paper
Code

A Range-Null Space Decomposition Approach for Fast and Flexible Spectral Compressive Imaging

1 code implementation • 16 May 2023 • Junyu Wang, Shijie Wang, Ruijie Zhang, Zengqiang Zheng, Wenyu Liu, Xinggang Wang

We present RND-SCI, a novel framework for compressive hyperspectral image (HSI) reconstruction.

Paper
Code

Understanding Self-Supervised Pretraining with Part-Aware Representation Learning

1 code implementation • 27 Jan 2023 • Jie Zhu, Jiyang Qi, Mingyu Ding, Xiaokang Chen, Ping Luo, Xinggang Wang, Wenyu Liu, Leye Wang, Jingdong Wang

The study is mainly motivated by the observation that random views, used in contrastive learning, and random masked (visible) patches, used in masked image modeling, are often about object parts.

Contrastive Learning Object +1

Paper
Code

Condition-Adaptive Graph Convolution Learning for Skeleton-Based Gait Recognition

1 code implementation • 13 Aug 2023 • Xiaohu Huang, Xinggang Wang, Zhidianqiu Jin, Bo Yang, Botao He, Bin Feng, Wenyu Liu

Graph convolutional networks have been widely applied in skeleton-based gait recognition.

Gait Recognition

Paper
Code

Multi-scale Context-aware Network with Transformer for Gait Recognition

1 code implementation • ICCV 2021 • Duowang Zhu, Xiaohu Huang, Xinggang Wang, Bo Yang, Botao He, Wenyu Liu, Bin Feng

Although gait recognition has drawn increasing research attention recently, since the silhouette differences are quite subtle in spatial domain, temporal feature representation is crucial for gait recognition.

Ranked #1 on Gait Recognition on OUMVLP

Multiview Gait Recognition Relation

Paper
Code

Object Detection in Videos by High Quality Object Linking

no code implementations • 30 Jan 2018 • Peng Tang, Chunyu Wang, Xinggang Wang, Wenyu Liu, Wen-Jun Zeng, Jingdong Wang

In particular, our method improves results by 8.8% over the static image detector for fast moving objects.

General Classification Object +3

Paper
Add Code

Auto-Encoder Guided GAN for Chinese Calligraphy Synthesis

no code implementations • 27 Jun 2017 • Pengyuan Lyu, Xiang Bai, Cong Yao, Zhen Zhu, Tengteng Huang, Wenyu Liu

In this paper, we investigate the Chinese calligraphy synthesis problem: synthesizing Chinese calligraphy images with a specified style from a standard font (e.g.

Image-to-Image Translation Translation

Paper
Add Code

Point Linking Network for Object Detection

no code implementations • 12 Jun 2017 • Xinggang Wang, Kaibing Chen, Zilong Huang, Cong Yao, Wenyu Liu

Deep ConvNet-based object detectors, e.g., Faster R-CNN, YOLO and SSD, mainly focus on regressing the coordinates of bounding boxes.

Object object-detection +1

Paper
Add Code

Revisiting Multiple Instance Neural Networks

no code implementations • 8 Oct 2016 • Xinggang Wang, Yongluan Yan, Peng Tang, Xiang Bai, Wenyu Liu

We propose a new multiple instance neural network to learn bag representations, which is different from the existing multiple instance neural networks that focus on estimating instance label.

Multiple Instance Learning Weakly-supervised Learning

Paper
Add Code

Deep FisherNet for Object Classification

no code implementations • 31 Jul 2016 • Peng Tang, Xinggang Wang, Baoguang Shi, Xiang Bai, Wenyu Liu, Zhuowen Tu

Our proposed FisherNet combines convolutional neural network training and Fisher Vector encoding in a single end-to-end structure.

Classification Computational Efficiency +3

Paper
Add Code

Deep Regression for Face Alignment

no code implementations • 18 Sep 2014 • Baoguang Shi, Xiang Bai, Wenyu Liu, Jingdong Wang

In this paper, we present a deep regression approach for face alignment.

Face Alignment regression

Paper
Add Code

Learning to Update for Object Tracking with Recurrent Meta-learner

no code implementations • 19 Jun 2018 • Bi Li, Wenxuan Xie, Wen-Jun Zeng, Wenyu Liu

Generally, model update is formulated as an online learning problem where a target model is learned over the online training set.

Ranked #1 on Visual Tracking on OTB-100

Meta-Learning Visual Object Tracking +1

Paper
Add Code

DeepExposure: Learning to Expose Photos with Asynchronously Reinforced Adversarial Learning

no code implementations • NeurIPS 2018 • Runsheng Yu, Wenyu Liu, Yasen Zhang, Zhi Qu, Deli Zhao, Bo Zhang

Based on these sub-images, a local exposure for each sub-image is learned automatically and sequentially by a policy network, while the learning reward is designed globally to strike a balance among the overall exposures.

Paper
Add Code

Fusion with Diffusion for Robust Visual Tracking

no code implementations • NeurIPS 2012 • Yu Zhou, Xiang Bai, Wenyu Liu, Longin J. Latecki

A key feature of our approach is that the time complexity of the diffusion on the TPG is the same as the diffusion process on each of the original graphs. Moreover, it is not necessary to explicitly construct the TPG in our framework.

Clustering Visual Tracking

Paper
Add Code

Maximal Cliques that Satisfy Hard Constraints with Application to Deformable Object Model Learning

no code implementations • NeurIPS 2011 • Xinggang Wang, Xiang Bai, Xingwei Yang, Wenyu Liu, Longin J. Latecki

We propose a novel inference framework for finding maximal cliques in a weighted graph that satisfy hard constraints.

Object

Paper
Add Code

Weakly Supervised Region Proposal Network and Object Detection

no code implementations • ECCV 2018 • Peng Tang, Xinggang Wang, Angtian Wang, Yongluan Yan, Wenyu Liu, Junzhou Huang, Alan Yuille

The Convolutional Neural Network (CNN) based region proposal generation method (i.e., region proposal network), trained using bounding box annotations, is an essential component in modern fully supervised object detectors.

Object object-detection +2

Paper
Add Code

Mancs: A Multi-task Attentional Network with Curriculum Sampling for Person Re-identification

no code implementations • ECCV 2018 • Cheng Wang, Qian Zhang, Chang Huang, Wenyu Liu, Xinggang Wang

We propose a novel deep network called Mancs that solves the person re-identification problem from the following aspects: fully utilizing the attention mechanism for the person misalignment problem and properly sampling for the ranking loss to obtain more stable person representation.

Person Re-Identification

Paper
Add Code

Strokelets: A Learned Multi-Scale Representation for Scene Text Recognition

no code implementations • CVPR 2014 • Cong Yao, Xiang Bai, Baoguang Shi, Wenyu Liu

Driven by the wide range of applications, scene text detection and recognition have become active research topics in computer vision.

Scene Text Detection Scene Text Recognition +1

Paper
Add Code

All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting

no code implementations • 21 Nov 2019 • Hao Wang, Pu Lu, HUI ZHANG, Mingkun Yang, Xiang Bai, Yongchao Xu, Mengchao He, Yongpan Wang, Wenyu Liu

Recently, end-to-end text spotting that aims to detect and recognize text from cluttered images simultaneously has received particularly growing interest in computer vision.

Instance Segmentation Scene Text Detection +3

Paper
Add Code

Patch Aggregator for Scene Text Script Identification

no code implementations • 9 Dec 2019 • Changxu Cheng, Qiuhui Huang, Xiang Bai, Bin Feng, Wenyu Liu

Script identification in the wild is of great importance in a multi-lingual robust-reading system.

Clustering

Paper
Add Code

Fast Neural Network Adaptation via Parameter Remapping and Architecture Search

no code implementations • ICLR 2020 • Jiemin Fang, Yuzhu Sun, Kangjian Peng, Qian Zhang, Yuan Li, Wenyu Liu, Xinggang Wang

In our experiments, we conduct FNA on MobileNetV2 to obtain new networks for both segmentation and detection that clearly outperform existing networks designed both manually and by NAS.

Image Classification Neural Architecture Search +4

Paper
Add Code

Maximum Entropy Regularization and Chinese Text Recognition

no code implementations • 9 Jul 2020 • Changxu Cheng, Wuheng Xu, Xiang Bai, Bin Feng, Wenyu Liu

Chinese text recognition is more challenging than Latin text due to the large amount of fine-grained Chinese characters and the great imbalance over classes, which causes a serious overfitting problem.

Fine-Grained Image Classification

Paper
Add Code

Learning Global Structure Consistency for Robust Object Tracking

no code implementations • 26 Aug 2020 • Bi Li, Chengquan Zhang, Zhibin Hong, Xu Tang, Jingtuo Liu, Junyu Han, Errui Ding, Wenyu Liu

Unlike many existing trackers that focus on modeling only the target, in this work, we consider the transient variations of the whole scene.

Object Visual Object Tracking

Paper
Add Code

Learning to Focus: Cascaded Feature Matching Network for Few-shot Image Recognition

no code implementations • 13 Jan 2021 • Mengting Chen, Xinggang Wang, Heng Luo, Yifeng Geng, Wenyu Liu

By applying the proposed feature matching block in different layers of the few-shot recognition network, multi-scale information among the compared images can be incorporated into the final cascaded matching feature, which boosts the recognition performance further and generalizes better by learning on relationships.

Few-Shot Learning

Paper
Add Code

Half-Real Half-Fake Distillation for Class-Incremental Semantic Segmentation

no code implementations • 2 Apr 2021 • Zilong Huang, Wentian Hao, Xinggang Wang, Mingyuan Tao, Jianqiang Huang, Wenyu Liu, Xian-Sheng Hua

Despite their success for semantic segmentation, convolutional neural networks are ill-equipped for incremental learning, i.e., adapting the original segmentation model as new classes are available but the initial training data is not retained.

Class-Incremental Semantic Segmentation Incremental Learning +1

Paper
Add Code

Weakly-supervised Instance Segmentation via Class-agnostic Learning with Salient Images

no code implementations • CVPR 2021 • Xinggang Wang, Jiapei Feng, Bin Hu, Qi Ding, Longjin Ran, Xiaoxin Chen, Wenyu Liu

Humans have a strong class-agnostic object segmentation ability and can outline boundaries of unknown objects precisely, which motivates us to propose a box-supervised class-agnostic object segmentation (BoxCaseg) based solution for weakly-supervised instance segmentation.

Ranked #5 on Box-supervised Instance Segmentation on COCO test-dev (using extra training data)

Box-supervised Instance Segmentation Multi-Task Learning +5

Paper
Add Code

What Makes for Hierarchical Vision Transformer?

no code implementations • 5 Jul 2021 • Yuxin Fang, Xinggang Wang, Rui Wu, Wenyu Liu

Recent studies indicate that hierarchical Vision Transformer with a macro architecture of interleaved non-overlapped window-based self-attention & shifted-window operation is able to achieve state-of-the-art performance in various visual recognition tasks, and challenges the ubiquitous convolutional neural networks (CNNs) using densely slid kernels.

Instance Segmentation object-detection +3

Paper
Add Code

VoxelTrack: Multi-Person 3D Human Pose Estimation and Tracking in the Wild

no code implementations • 5 Aug 2021 • Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenyu Liu, Wenjun Zeng

We estimate 3D poses from the voxel representation by predicting whether each voxel contains a particular body joint.

Ranked #7 on 3D Multi-Person Pose Estimation on Panoptic (using extra training data)

3D Multi-Person Pose Estimation 3D Pose Estimation

Paper
Add Code

Decoupling Visual-Semantic Feature Learning for Robust Scene Text Recognition

no code implementations • 24 Nov 2021 • Changxu Cheng, Bohan Li, Qi Zheng, Yongpan Wang, Wenyu Liu

As a result, the learning of semantic features is prone to have a bias on the limited vocabulary of the training set, which is called vocabulary reliance.

Scene Text Recognition

Paper
Add Code

Deep Level Set for Box-supervised Instance Segmentation in Aerial Images

no code implementations • 7 Dec 2021 • Wentong Li, Yijie Chen, Wenyu Liu, Jianke Zhu

Instead of learning the pairwise affinity, the level set method with the carefully designed energy functions treats the object segmentation as curve evolution, which is able to accurately recover the object's boundaries and prevent the interference from the indistinguishable background and similar objects.

Box-supervised Instance Segmentation Segmentation +1

Paper
Add Code

Warped Convolutional Networks: Bridge Homography to sl(3) algebra by Group Convolution

no code implementations • 23 Jun 2022 • Xinrui Zhan, Yang Li, Wenyu Liu, Jianke Zhu

In this paper, we propose Warped Convolution Networks (WCN) to effectively learn and represent the homography by SL(3) group and sl(3) algebra with group convolution.

Homography Estimation Object Tracking

Paper
Add Code

Robust Multi-Object Tracking by Marginal Inference

no code implementations • 7 Aug 2022 • Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenjun Zeng, Wenyu Liu

To address the problem, we present an efficient approach to compute a marginal probability for each pair of objects in real time.

Multi-Object Tracking Object

Paper
Add Code

Perceive, Interact, Predict: Learning Dynamic and Static Clues for End-to-End Motion Prediction

no code implementations • 5 Dec 2022 • Bo Jiang, Shaoyu Chen, Xinggang Wang, Bencheng Liao, Tianheng Cheng, Jiajie Chen, Helong Zhou, Qian Zhang, Wenyu Liu, Chang Huang

Motion prediction is highly relevant to the perception of dynamic objects and static map elements in the scenarios of autonomous driving.

Autonomous Driving motion prediction +2

Paper
Add Code

Benchmarking the Reliability of Post-training Quantization: a Particular Focus on Worst-case Performance

no code implementations • 23 Mar 2023 • Zhihang Yuan, Jiawei Liu, Jiaxiang Wu, Dawei Yang, Qiang Wu, Guangyu Sun, Wenyu Liu, Xinggang Wang, Bingzhe Wu

Post-training quantization (PTQ) is a popular method for compressing deep neural networks (DNNs) without modifying their original architecture or training procedures.

Benchmarking Data Augmentation +1

Paper
Add Code

Generalizable Neural Voxels for Fast Human Radiance Fields

no code implementations • 27 Mar 2023 • Taoran Yi, Jiemin Fang, Xinggang Wang, Wenyu Liu

Rendering moving human bodies at free viewpoints only from a monocular video is quite a challenging problem.

Novel View Synthesis

Paper
Add Code

OpenInst: A Simple Query-Based Method for Open-World Instance Segmentation

no code implementations • 28 Mar 2023 • Cheng Wang, Guoli Wang, Qian Zhang, Peng Guo, Wenyu Liu, Xinggang Wang

Fortunately, we have identified two observations that help us achieve the best of both worlds: 1) query-based methods demonstrate superiority over dense proposal-based methods in open-world instance segmentation, and 2) learning localization cues is sufficient for open world instance segmentation.

Autonomous Driving Open-World Instance Segmentation +2

Paper
Add Code

MobileInst: Video Instance Segmentation on the Mobile

no code implementations • 30 Mar 2023 • Renhong Zhang, Tianheng Cheng, Shusheng Yang, Haoyi Jiang, Shuai Zhang, Jiancheng Lyu, Xin Li, Xiaowen Ying, Dashan Gao, Wenyu Liu, Xinggang Wang

To address those issues, we present MobileInst, a lightweight and mobile-friendly framework for video instance segmentation on mobile devices.

Instance Segmentation Segmentation +2

Paper
Add Code

TinyDet: Accurate Small Object Detection in Lightweight Generic Detectors

no code implementations • 7 Apr 2023 • Shaoyu Chen, Tianheng Cheng, Jiemin Fang, Qian Zhang, Yuan Li, Wenyu Liu, Xinggang Wang

Small object detection requires the detection head to scan a large number of positions on image feature maps, which is extremely hard for computation- and energy-efficient lightweight generic detectors.

object-detection Small Object Detection

Paper
Add Code

Improving Post-Training Quantization on Object Detection with Task Loss-Guided Lp Metric

no code implementations • 19 Apr 2023 • Lin Niu, Jiawei Liu, Zhihang Yuan, Dawei Yang, Xinggang Wang, Wenyu Liu

PTQ optimizes the quantization parameters by different metrics to minimize the perturbation of quantization.

Object object-detection +2

Paper
Add Code

GaitGS: Temporal Feature Learning in Granularity and Span Dimension for Gait Recognition

no code implementations • 31 May 2023 • Haijun Xiong, Yunze Deng, Xiaohu Huang, Xinggang Wang, Wenyu Liu, Bin Feng

In order to fully harness the potential of gait recognition, it is crucial to consider temporal features at various granularities and spans.

Gait Recognition

Paper
Add Code

TiAVox: Time-aware Attenuation Voxels for Sparse-view 4D DSA Reconstruction

no code implementations • 5 Sep 2023 • Zhenghong Zhou, Huangxuan Zhao, Jiemin Fang, Dongqiao Xiang, Lei Chen, Lingxia Wu, Feihong Wu, Wenyu Liu, Chuansheng Zheng, Xinggang Wang

Additionally, 2D and 3D DSA imaging results can be generated from the reconstructed 4D DSA images.

3D Reconstruction Novel View Synthesis

Paper
Add Code

Fast High Dynamic Range Radiance Fields for Dynamic Scenes

no code implementations • 11 Jan 2024 • Guanjun Wu, Taoran Yi, Jiemin Fang, Wenyu Liu, Xinggang Wang

To extend HDR NeRF methods to wider applications, we propose a dynamic HDR NeRF framework, named HDR-HexPlane, which can learn 3D scenes from dynamic 2D images captured with various exposures.

Paper
Add Code

VADv2: End-to-End Vectorized Autonomous Driving via Probabilistic Planning

no code implementations • 20 Feb 2024 • Shaoyu Chen, Bo Jiang, Hao Gao, Bencheng Liao, Qing Xu, Qian Zhang, Chang Huang, Wenyu Liu, Xinggang Wang

Learning a human-like driving policy from large-scale driving demonstrations is promising, but the uncertainty and non-deterministic nature of planning make it challenging.

Autonomous Driving

Paper
Add Code

TOGS: Gaussian Splatting with Temporal Opacity Offset for Real-Time 4D DSA Rendering

no code implementations • 28 Mar 2024 • Shuai Zhang, Huangxuan Zhao, Zhenghong Zhou, Guanjun Wu, Chuansheng Zheng, Xinggang Wang, Wenyu Liu

To overcome these limitations, we propose TOGS, a Gaussian splatting method with opacity offset over time, which can effectively improve the rendering quality and speed of 4D DSA.

Paper
Add Code

Not All Voxels Are Equal: Hardness-Aware Semantic Scene Completion with Self-Distillation

1 code implementation • 18 Apr 2024 • Song Wang, Jiawei Yu, Wentong Li, Wenyu Liu, Xiaolu Liu, Junbo Chen, Jianke Zhu

Furthermore, the voxels in the boundary region are more challenging to differentiate than those in the interior.

Paper
Code
