Search Results for author: Dongfang Liu

Found 29 papers, 14 papers with code

BadPart: Unified Black-box Adversarial Patch Attacks against Pixel-wise Regression Tasks

no code implementations • 1 Apr 2024 • Zhiyuan Cheng, Zhaoyi Liu, Tengda Guo, Shiwei Feng, Dongfang Liu, Mingjie Tang, Xiangyu Zhang

Our attack prototype, named BadPart, is evaluated on both MDE and OFE tasks, utilizing a total of 7 models.

Adversarial Robustness Autonomous Driving +3

Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval

1 code implementation • 26 Mar 2024 • Jiamian Wang, Guohao Sun, Pichao Wang, Dongfang Liu, Sohail Dianat, Majid Rabbani, Raghuveer Rao, Zhiqiang Tao

Correspondingly, a single text embedding may be less expressive to capture the video embedding and empower the retrieval.

Multimodal Reasoning Retrieval +1

Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning?

no code implementations • 23 Jan 2024 • Cheng Han, Qifan Wang, Yiming Cui, Wenguan Wang, Lifu Huang, Siyuan Qi, Dongfang Liu

As the scale of vision models continues to grow, the emergence of Visual Prompt Tuning (VPT) as a parameter-efficient transfer learning technique has gained attention due to its superior performance compared to traditional full-finetuning.

Transfer Learning Visual Prompt Tuning

Image Translation as Diffusion Visual Programmers

no code implementations • 18 Jan 2024 • Cheng Han, James C. Liang, Qifan Wang, Majid Rabbani, Sohail Dianat, Raghuveer Rao, Ying Nian Wu, Dongfang Liu

We introduce the novel Diffusion Visual Programmer (DVP), a neuro-symbolic image translation framework.

Style Transfer Translation

Efficient Multimodal Semantic Segmentation via Dual-Prompt Learning

1 code implementation • 1 Dec 2023 • Shaohua Dong, Yunhe Feng, Qing Yang, Yan Huang, Dongfang Liu, Heng Fan

Existing approaches often fully fine-tune a dual-branch encoder-decoder framework with a complicated feature fusion strategy for achieving multimodal semantic segmentation, which is training-costly due to the massive parameter updates in feature extraction and fusion.

Ranked #2 on Semantic Segmentation on SUN-RGBD (using extra training data)

object-detection Object Detection +6

CML-MOTS: Collaborative Multi-task Learning for Multi-Object Tracking and Segmentation

no code implementations • 2 Nov 2023 • Yiming Cui, Cheng Han, Dongfang Liu

The advancement of computer vision has pushed visual analysis tasks from still images to the video domain.

Autonomous Driving Instance Segmentation +8

ClusterFormer: Clustering As A Universal Visual Learner

1 code implementation • 22 Sep 2023 • James C. Liang, Yiming Cui, Qifan Wang, Tong Geng, Wenguan Wang, Dongfang Liu

This paper presents CLUSTERFORMER, a universal vision model that is based on the CLUSTERing paradigm with TransFORMER.

Clustering Image Classification +7

E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning

1 code implementation • ICCV 2023 • Cheng Han, Qifan Wang, Yiming Cui, Zhiwen Cao, Wenguan Wang, Siyuan Qi, Dongfang Liu

Specifically, we introduce a set of learnable key-value prompts and visual prompts into self-attention and input layers, respectively, to improve the effectiveness of model fine-tuning.

Visual Prompt Tuning

CLUSTSEG: Clustering for Universal Segmentation

1 code implementation • 3 May 2023 • James Liang, Tianfei Zhou, Dongfang Liu, Wenguan Wang

We present CLUSTSEG, a general, transformer-based framework that tackles different image segmentation tasks (i.e., superpixel, semantic, instance, and panoptic) through a unified neural clustering scheme.

Instance Segmentation Panoptic Segmentation +3

Fusion is Not Enough: Single Modal Attacks on Fusion Models for 3D Object Detection

no code implementations • 28 Apr 2023 • Zhiyuan Cheng, Hongjun Choi, James Liang, Shiwei Feng, Guanhong Tao, Dongfang Liu, Michael Zuzak, Xiangyu Zhang

We argue that the weakest link of fusion models depends on their most vulnerable modality, and propose an attack framework that targets advanced camera-LiDAR fusion-based 3D object detection models through camera-only adversarial attacks.

3D Object Detection Autonomous Driving +2

TransFlow: Transformer as Flow Learner

no code implementations • CVPR 2023 • Yawen Lu, Qifan Wang, Siqi Ma, Tong Geng, Yingjie Victor Chen, Huaijin Chen, Dongfang Liu

Optical flow is an indispensable building block for various important computer vision tasks, including motion estimation, object tracking, and disparity measurement.

Motion Estimation object-detection +4

Exploiting Logic Locking for a Neural Trojan Attack on Machine Learning Accelerators

no code implementations • 12 Apr 2023 • Hongye Xu, Dongfang Liu, Cory Merkel, Michael Zuzak

If an incorrect secret key is used, a set of deterministic errors is produced in locked modules, restricting unauthorized use.

Adversarial Training of Self-supervised Monocular Depth Estimation against Physical-World Attacks

1 code implementation • 31 Jan 2023 • Zhiyuan Cheng, James Liang, Guanhong Tao, Dongfang Liu, Xiangyu Zhang

We improve adversarial robustness against physical-world attacks using L0-norm-bounded perturbation in training.

Adversarial Robustness Autonomous Driving +2

Learning Equivariant Segmentation with Instance-Unique Querying

1 code implementation • 3 Oct 2022 • Wenguan Wang, James Liang, Dongfang Liu

Prevalent state-of-the-art instance segmentation methods fall into a query-based scheme, in which instance masks are derived by querying the image feature using a set of instance-aware embeddings.

Instance Segmentation Semantic Segmentation

Visual Recognition with Deep Nearest Centroids

1 code implementation • 15 Sep 2022 • Wenguan Wang, Cheng Han, Tianfei Zhou, Dongfang Liu

We devise deep nearest centroids (DNC), a conceptually elegant yet surprisingly effective network for large-scale visual recognition, by revisiting Nearest Centroids, one of the most classic and simple classifiers.

Decision Making Image Classification +1

Towards Unbiased Label Distribution Learning for Facial Pose Estimation Using Anisotropic Spherical Gaussian

no code implementations • 19 Aug 2022 • Zhiwen Cao, Dongfang Liu, Qifan Wang, Yingjie Chen

In this paper, we propose an Anisotropic Spherical Gaussian (ASG)-based LDL approach for facial pose estimation.

Pose Estimation

Physical Attack on Monocular Depth Estimation with Optimal Adversarial Patches

1 code implementation • 11 Jul 2022 • Zhiyuan Cheng, James Liang, Hongjun Choi, Guanhong Tao, Zhiwen Cao, Dongfang Liu, Xiangyu Zhang

Experimental results show that our method can generate stealthy, effective, and robust adversarial patches for different target objects and models, achieving a mean depth estimation error of more than 6 meters and a 93% attack success rate (ASR) in object detection with a patch covering 1/9 of the vehicle's rear area.

3D Object Detection Autonomous Driving +3

GL-RG: Global-Local Representation Granularity for Video Captioning

1 code implementation • 22 May 2022 • Liqi Yan, Qifan Wang, Yiming Cui, Fuli Feng, Xiaojun Quan, Xiangyu Zhang, Dongfang Liu

Video captioning is a challenging task as it needs to accurately transform visual understanding into natural language description.

Caption Generation Descriptive +1

Deep Partial Multiplex Network Embedding

no code implementations • 5 Mar 2022 • Qifan Wang, Yi Fang, Anirudh Ravula, Ruining He, Bin Shen, Jingang Wang, Xiaojun Quan, Dongfang Liu

Network embedding is an effective technique to learn the low-dimensional representations of nodes in networks.

Link Prediction Network Embedding +1

WebFormer: The Web-page Transformer for Structure Information Extraction

no code implementations • 1 Feb 2022 • Qifan Wang, Yi Fang, Anirudh Ravula, Fuli Feng, Xiaojun Quan, Dongfang Liu

Structure information extraction refers to the task of extracting structured text fields from web pages, such as extracting a product offer from a shopping page including product title, description, brand and price.

Deep Attention document understanding +1

DG-Labeler and DGL-MOTS Dataset: Boost the Autonomous Driving Perception

no code implementations • 15 Oct 2021 • Yiming Cui, Zhiwen Cao, Yixin Xie, Xingyu Jiang, Feng Tao, Yingjie Chen, Lin Li, Dongfang Liu

The existing MOTS studies face two critical challenges: 1) the published datasets inadequately capture the real-world complexity for network training to address various driving settings; 2) the working pipeline annotation tool is under-studied in the literature to improve the quality of MOTS learning examples.

Autonomous Driving Multi-Object Tracking +1

TF-Blender: Temporal Feature Blender for Video Object Detection

1 code implementation • ICCV 2021 • Yiming Cui, Liqi Yan, Zhiwen Cao, Dongfang Liu

One of the popular solutions is to exploit the temporal information and enhance per-frame representation through aggregating features from neighboring frames.

Object object-detection +1

SG-Net: Spatial Granularity Network for One-Stage Video Instance Segmentation

1 code implementation • CVPR 2021 • Dongfang Liu, Yiming Cui, Wenbo Tan, Yingjie Chen

Video instance segmentation (VIS) is a new and critical task in computer vision.

Head Detection Instance Segmentation +2

Hierarchical Attention Fusion for Geo-Localization

1 code implementation • 18 Feb 2021 • Liqi Yan, Yiming Cui, Yingjie Chen, Dongfang Liu

We extract the hierarchical feature maps from a convolutional neural network (CNN) and organically fuse the extracted features for image representations.

Image Retrieval Retrieval

Semantic Aware Data Augmentation for Cell Nuclei Microscopical Images With Artificial Neural Networks

no code implementations • ICCV 2021 • Alireza Naghizadeh, Hongye Xu, Mohab Mohamed, Dimitris N. Metaxas, Dongfang Liu

The importance of this subject lies in the amount of training data that artificial neural networks need to accurately identify and segment objects in images and the infeasibility of acquiring a sufficient dataset within the biomedical field.

Data Augmentation object-detection +3

DenserNet: Weakly Supervised Visual Localization Using Multi-scale Feature Aggregation

1 code implementation • 4 Dec 2020 • Dongfang Liu, Yiming Cui, Liqi Yan, Christos Mousas, Baijian Yang, Yingjie Chen

In this work, we introduce a Denser Feature Network (DenserNet) for visual localization.

Image Retrieval Retrieval +1

A Vector-based Representation to Enhance Head Pose Estimation

no code implementations • 14 Oct 2020 • Zhiwen Cao, Zongcheng Chu, Dongfang Liu, Yingjie Chen

This paper proposes to use the three vectors in a rotation matrix as the representation in head pose estimation and develops a new neural network based on the characteristic of such representation.

Head Pose Estimation

Multimodal Aggregation Approach for Memory Vision-Voice Indoor Navigation with Meta-Learning

no code implementations • 1 Sep 2020 • Liqi Yan, Dongfang Liu, Yaoxian Song, Changbin Yu

Memory is important for the agent to avoid repeating certain tasks unnecessarily and to adapt adequately to new scenes; therefore, we make use of meta-learning.

Ranked #1 on Visual Navigation on AI2-THOR

Meta-Learning Visual Navigation

Visual Localization for Autonomous Driving: Mapping the Accurate Location in the City Maze

no code implementations • 13 Aug 2020 • Dongfang Liu, Yiming Cui, Xiaolei Guo, Wei Ding, Baijian Yang, Yingjie Chen

It is a common practice for vehicles to use GPS to acquire location information.

Autonomous Driving Visual Localization
