WidthFormer: Toward Efficient Transformer-based BEV View Transformation

1 code implementation 8 Jan 2024 Chenhongyi Yang, Tianwei Lin, Lichao Huang, Elliot J. Crowley

In this work, we present WidthFormer, a novel transformer-based Bird's-Eye-View (BEV) 3D detection method tailored for real-time autonomous-driving applications.

3D Object Detection Autonomous Driving +4

EDA: Evolving and Distinct Anchors for Multimodal Motion Prediction

1 code implementation 15 Dec 2023 Longzhong Lin, Xuewu Lin, Tianwei Lin, Lichao Huang, Rong Xiong, Yue Wang

Motion prediction is a crucial task in autonomous driving, and one of its major challenges lies in the multimodality of future behaviors.

Autonomous Driving motion prediction +1

Focus on Your Instruction: Fine-grained and Multi-instruction Image Editing by Attention Modulation

1 code implementation 15 Dec 2023 Qin Guo, Tianwei Lin

For the first objective, we identify the implicit grounding capability of IP2P from the cross-attention between instruction and image, then develop an effective mask extraction method.


Sparse4D v3: Advancing End-to-End 3D Detection and Tracking

1 code implementation 20 Nov 2023 Xuewu Lin, Zixiang Pei, Tianwei Lin, Lichao Huang, Zhizhong Su

We introduce two auxiliary training tasks (Temporal Instance Denoising and Quality Estimation) and propose decoupled attention to make structural improvements, leading to significant enhancements in detection performance.

Autonomous Driving Denoising

Symphonize 3D Semantic Scene Completion with Contextual Instance Queries

1 code implementation 27 Jun 2023 Haoyi Jiang, Tianheng Cheng, Naiyu Gao, Haoyang Zhang, Tianwei Lin, Wenyu Liu, Xinggang Wang

3D Semantic Scene Completion (SSC) has emerged as a nascent and pivotal undertaking in autonomous driving, aiming to predict voxel occupancy within volumetric scenes.

3D Semantic Scene Completion from a single RGB image Autonomous Driving

DynStatF: An Efficient Feature Fusion Strategy for LiDAR 3D Object Detection

no code implementations 24 May 2023 Yao Rong, Xiangyu Wei, Tianwei Lin, Yueyu Wang, Enkelejda Kasneci

In this work, we propose a novel feature fusion strategy, DynStaF (Dynamic-Static Fusion), which enhances the rich semantic information provided by the multi-frame (dynamic branch) with the accurate location information from the current single-frame (static branch).

3D Object Detection object-detection

Sparse4D v2: Recurrent Temporal Fusion with Sparse Model

1 code implementation 23 May 2023 Xuewu Lin, Tianwei Lin, Zixiang Pei, Lichao Huang, Zhizhong Su

Firstly, it reduces the computational complexity of temporal fusion from $O(T)$ to $O(1)$, resulting in significant improvements in inference speed and memory usage.

Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot Artistic Style Transfer

no code implementations CVPR 2023 Hao Tang, Songhua Liu, Tianwei Lin, Shaoli Huang, Fu Li, Dongliang He, Xinchao Wang

On the other hand, different from the vanilla version, we adopt a learnable scaling operation on content features before content-style feature interaction, which better preserves the original similarity between a pair of content features while ensuring the stylization quality.

Meta-Learning Style Transfer

DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation

1 code implementation CVPR 2023 Yueming Lyu, Tianwei Lin, Fu Li, Dongliang He, Jing Dong, Tieniu Tan

Our key idea is to investigate and identify a space, namely the delta image and text space, that has a well-aligned distribution between CLIP visual feature differences of two images and CLIP textual embedding differences of source and target texts.

Image Manipulation

Planning-oriented Autonomous Driving

1 code implementation CVPR 2023 Yihan Hu, Jiazhi Yang, Li Chen, Keyu Li, Chonghao Sima, Xizhou Zhu, Siqi Chai, Senyao Du, Tianwei Lin, Wenhai Wang, Lewei Lu, Xiaosong Jia, Qiang Liu, Jifeng Dai, Yu Qiao, Hongyang Li

Oriented at this, we revisit the key components within perception and prediction, and prioritize the tasks such that all these tasks contribute to planning.

Autonomous Driving Philosophy

AdaCM: Adaptive ColorMLP for Real-Time Universal Photo-realistic Style Transfer

no code implementations 3 Dec 2022 Tianwei Lin, Honglin Lin, Fu Li, Dongliang He, Wenhao Wu, Meiling Wang, Xin Li, Yong Liu

Then, in AdaCM, we adopt a CNN encoder to adaptively predict all parameters for the ColorMLP conditioned on each input content and style image pair.

4k Style Transfer

Towards Bidirectional Arbitrary Image Rescaling: Joint Optimization and Cycle Idempotence

no code implementations CVPR 2022 Zhihong Pan, Baopu Li, Dongliang He, Mingde Yao, Wenhao Wu, Tianwei Lin, Xin Li, Errui Ding

Deep learning-based single image super-resolution models have been widely studied, and superb results have been achieved in upscaling low-resolution images with a fixed scale factor and downscaling degradation kernel.

Image Super-Resolution

Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model

1 code implementation CVPR 2022 Zipeng Xu, Tianwei Lin, Hao Tang, Fu Li, Dongliang He, Nicu Sebe, Radu Timofte, Luc van Gool, Errui Ding

We propose a novel framework, i.e., Predict, Prevent, and Evaluate (PPE), for disentangled text-driven image manipulation that requires little manual annotation while being applicable to a wide variety of manipulations.

Image Manipulation Language Modelling

Background-Click Supervision for Temporal Action Localization

1 code implementation 24 Nov 2021 Le Yang, Junwei Han, Tao Zhao, Tianwei Lin, Dingwen Zhang, Jianxin Chen

Weakly supervised temporal action localization aims at learning the instance-level action pattern from the video-level labels, where a significant challenge is action-context confusion.

Position Weakly-supervised Temporal Action Localization +1

Paint Transformer: Feed Forward Neural Painting with Stroke Prediction

2 code implementations ICCV 2021 Songhua Liu, Tianwei Lin, Dongliang He, Fu Li, Ruifeng Deng, Xin Li, Errui Ding, Hao Wang

Neural painting refers to the procedure of producing a series of strokes for a given image and non-photo-realistically recreating it using neural networks.

Object Detection Reinforcement Learning (RL) +1

AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer

3 code implementations ICCV 2021 Songhua Liu, Tianwei Lin, Dongliang He, Fu Li, Meiling Wang, Xin Li, Zhengxing Sun, Qian Li, Errui Ding

Finally, the content features are normalized so that they demonstrate the same local feature statistics as the calculated per-point weighted style feature statistics.

Style Transfer Video Style Transfer

Learning Semantic Person Image Generation by Region-Adaptive Normalization

1 code implementation CVPR 2021 Zhengyao Lv, Xiaoming Li, Xin Li, Fu Li, Tianwei Lin, Dongliang He, WangMeng Zuo

In the first stage, we predict the target semantic parsing maps to eliminate the difficulties of pose transfer and further benefit the latter translation of per-region appearance style.

Pose Transfer Semantic Parsing +1

Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer

2 code implementations CVPR 2021 Tianwei Lin, Zhuoqi Ma, Fu Li, Dongliang He, Xin Li, Errui Ding, Nannan Wang, Jie Li, Xinbo Gao

Inspired by the common painting process of drawing a draft and revising the details, we introduce a novel feed-forward method named Laplacian Pyramid Network (LapStyle).

Style Transfer

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

3 code implementations 13 Dec 2020 Wenhao Wu, Dongliang He, Tianwei Lin, Fu Li, Chuang Gan, Errui Ding

Existing state-of-the-art methods have achieved excellent accuracy regardless of complexity, while efficient spatiotemporal modeling solutions are slightly inferior in performance.

Action Classification Action Recognition +2

Deep Concept-wise Temporal Convolutional Networks for Action Localization

2 code implementations 26 Aug 2019 Xin Li, Tianwei Lin, Xiao Liu, Chuang Gan, WangMeng Zuo, Chao Li, Xiang Long, Dongliang He, Fu Li, Shilei Wen

In this paper, we empirically find that stacking more conventional temporal convolution layers actually deteriorates action classification performance, possibly because all channels of the 1D feature map, which are generally highly abstract and can be regarded as latent concepts, are excessively recombined in temporal convolution.

Action Classification Action Localization

BMN: Boundary-Matching Network for Temporal Action Proposal Generation

15 code implementations ICCV 2019 Tianwei Lin, Xiao Liu, Xin Li, Errui Ding, Shilei Wen

To address these difficulties, we introduce the Boundary-Matching (BM) mechanism to evaluate confidence scores of densely distributed proposals, which denotes a proposal as a matching pair of starting and ending boundaries and combines all densely distributed BM pairs into the BM confidence map.

Action Detection Action Recognition +1

Cascaded Pyramid Mining Network for Weakly Supervised Temporal Action Localization

no code implementations 28 Oct 2018 Haisheng Su, Xu Zhao, Tianwei Lin

Weakly supervised temporal action localization, which aims at temporally locating action instances in untrimmed videos using only video-level class labels during training, is an important yet challenging problem in video analysis.

General Classification Video Classification +2

Discriminative Representation Combinations for Accurate Face Spoofing Detection

no code implementations 27 Aug 2018 Xiao Song, Xu Zhao, Liangji Fang, Tianwei Lin

Secondly, we utilize SSD, a deep learning framework for detection, to excavate context cues and conduct end-to-end face presentation attack detection.

Face Presentation Attack Detection

BSN: Boundary Sensitive Network for Temporal Action Proposal Generation

17 code implementations ECCV 2018 Tianwei Lin, Xu Zhao, Haisheng Su, Chongjing Wang, Ming Yang

Temporal action proposal generation is an important yet challenging problem, since temporal proposals with rich action content are indispensable for analysing real-world videos with long duration and a high proportion of irrelevant content.

Action Detection Temporal Action Proposal Generation

Face Spoofing Detection by Fusing Binocular Depth and Spatial Pyramid Coding Micro-Texture Features

no code implementations 13 Mar 2018 Xiao Song, Xu Zhao, Tianwei Lin

The second one is a high-level micro-texture based feature called Spatial Pyramid Coding Micro-Texture (SPMT) feature.

Single Shot Temporal Action Detection

2 code implementations 17 Oct 2017 Tianwei Lin, Xu Zhao, Zheng Shou

The main drawback of this framework is that the boundaries of action instance proposals have been fixed during the classification step.

Action Detection General Classification

Temporal Convolution Based Action Proposal: Submission to ActivityNet 2017

no code implementations 21 Jul 2017 Tianwei Lin, Xu Zhao, Zheng Shou

Our approach achieves state-of-the-art performance on both the temporal action proposal task and the temporal action localization task.

Action Classification General Classification +1

