no code implementations • 1 Apr 2024 • Zhiyuan Cheng, Zhaoyi Liu, Tengda Guo, Shiwei Feng, Dongfang Liu, Mingjie Tang, Xiangyu Zhang
Our attack prototype, named BadPart, is evaluated on both MDE and OFE tasks, utilizing a total of 7 models.
1 code implementation • 26 Mar 2024 • Jiamian Wang, Guohao Sun, Pichao Wang, Dongfang Liu, Sohail Dianat, Majid Rabbani, Raghuveer Rao, Zhiqiang Tao
Correspondingly, a single text embedding may not be expressive enough to capture the video embedding and support retrieval.
no code implementations • 23 Jan 2024 • Cheng Han, Qifan Wang, Yiming Cui, Wenguan Wang, Lifu Huang, Siyuan Qi, Dongfang Liu
As the scale of vision models continues to grow, the emergence of Visual Prompt Tuning (VPT) as a parameter-efficient transfer learning technique has gained attention due to its superior performance compared to traditional full-finetuning.
no code implementations • 18 Jan 2024 • Cheng Han, James C. Liang, Qifan Wang, Majid Rabbani, Sohail Dianat, Raghuveer Rao, Ying Nian Wu, Dongfang Liu
We introduce the novel Diffusion Visual Programmer (DVP), a neuro-symbolic image translation framework.
1 code implementation • 1 Dec 2023 • Shaohua Dong, Yunhe Feng, Qing Yang, Yan Huang, Dongfang Liu, Heng Fan
Existing approaches often fully fine-tune a dual-branch encoder-decoder framework with a complicated feature fusion strategy for achieving multimodal semantic segmentation, which is training-costly due to the massive parameter updates in feature extraction and fusion.
Ranked #2 on Semantic Segmentation on SUN-RGBD (using extra training data)
no code implementations • 2 Nov 2023 • Yiming Cui, Cheng Han, Dongfang Liu
The advancement of computer vision has pushed visual analysis tasks from still images to the video domain.
1 code implementation • 22 Sep 2023 • James C. Liang, Yiming Cui, Qifan Wang, Tong Geng, Wenguan Wang, Dongfang Liu
This paper presents CLUSTERFORMER, a universal vision model that is based on the CLUSTERing paradigm with TransFORMER.
1 code implementation • ICCV 2023 • Cheng Han, Qifan Wang, Yiming Cui, Zhiwen Cao, Wenguan Wang, Siyuan Qi, Dongfang Liu
Specifically, we introduce a set of learnable key-value prompts and visual prompts into self-attention and input layers, respectively, to improve the effectiveness of model fine-tuning.
1 code implementation • 3 May 2023 • James Liang, Tianfei Zhou, Dongfang Liu, Wenguan Wang
We present CLUSTSEG, a general, transformer-based framework that tackles different image segmentation tasks (i.e., superpixel, semantic, instance, and panoptic) through a unified neural clustering scheme.
no code implementations • 28 Apr 2023 • Zhiyuan Cheng, Hongjun Choi, James Liang, Shiwei Feng, Guanhong Tao, Dongfang Liu, Michael Zuzak, Xiangyu Zhang
We argue that the weakest link of fusion models depends on their most vulnerable modality, and propose an attack framework that targets advanced camera-LiDAR fusion-based 3D object detection models through camera-only adversarial attacks.
no code implementations • CVPR 2023 • Yawen Lu, Qifan Wang, Siqi Ma, Tong Geng, Yingjie Victor Chen, Huaijin Chen, Dongfang Liu
Optical flow is an indispensable building block for various important computer vision tasks, including motion estimation, object tracking, and disparity measurement.
no code implementations • 12 Apr 2023 • Hongye Xu, Dongfang Liu, Cory Merkel, Michael Zuzak
If an incorrect secret key is used, a set of deterministic errors is produced in locked modules, restricting unauthorized use.
1 code implementation • 31 Jan 2023 • Zhiyuan Cheng, James Liang, Guanhong Tao, Dongfang Liu, Xiangyu Zhang
We improve adversarial robustness against physical-world attacks using L0-norm-bounded perturbation in training.
1 code implementation • 3 Oct 2022 • Wenguan Wang, James Liang, Dongfang Liu
Prevalent state-of-the-art instance segmentation methods fall into a query-based scheme, in which instance masks are derived by querying the image feature using a set of instance-aware embeddings.
1 code implementation • 15 Sep 2022 • Wenguan Wang, Cheng Han, Tianfei Zhou, Dongfang Liu
We devise deep nearest centroids (DNC), a conceptually elegant yet surprisingly effective network for large-scale visual recognition, by revisiting Nearest Centroids, one of the most classic and simple classifiers.
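For readers unfamiliar with the classic classifier this entry revisits, the following is a minimal sketch of plain Nearest Centroids in pure Python. It is not the DNC network from the paper; the function names `fit_centroids` and `predict` and the toy data are assumptions made for illustration only.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def fit_centroids(features, labels):
    """Return {label: mean feature vector} computed over the training set."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda y: dist(centroids[y], x))


# Toy data (hypothetical): two 2-D classes.
train_x = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
train_y = ["a", "a", "b", "b"]
centroids = fit_centroids(train_x, train_y)
```

Each class is summarized by a single mean vector, so prediction needs only one distance computation per class.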
no code implementations • 19 Aug 2022 • Zhiwen Cao, Dongfang Liu, Qifan Wang, Yingjie Chen
In this paper, we propose an Anisotropic Spherical Gaussian (ASG)-based LDL approach for facial pose estimation.
1 code implementation • 11 Jul 2022 • Zhiyuan Cheng, James Liang, Hongjun Choi, Guanhong Tao, Zhiwen Cao, Dongfang Liu, Xiangyu Zhang
Experimental results show that our method can generate stealthy, effective, and robust adversarial patches for different target objects and models, and achieves a mean depth estimation error of more than 6 meters and a 93% attack success rate (ASR) in object detection with a patch covering 1/9 of the vehicle's rear area.
1 code implementation • 22 May 2022 • Liqi Yan, Qifan Wang, Yiming Cui, Fuli Feng, Xiaojun Quan, Xiangyu Zhang, Dongfang Liu
Video captioning is a challenging task as it needs to accurately transform visual understanding into natural language description.
no code implementations • 5 Mar 2022 • Qifan Wang, Yi Fang, Anirudh Ravula, Ruining He, Bin Shen, Jingang Wang, Xiaojun Quan, Dongfang Liu
Network embedding is an effective technique to learn the low-dimensional representations of nodes in networks.
no code implementations • 1 Feb 2022 • Qifan Wang, Yi Fang, Anirudh Ravula, Fuli Feng, Xiaojun Quan, Dongfang Liu
Structure information extraction refers to the task of extracting structured text fields from web pages, such as extracting a product offer from a shopping page including product title, description, brand and price.
no code implementations • 15 Oct 2021 • Yiming Cui, Zhiwen Cao, Yixin Xie, Xingyu Jiang, Feng Tao, Yingjie Chen, Lin Li, Dongfang Liu
The existing MOTS studies face two critical challenges: 1) the published datasets inadequately capture the real-world complexity for network training to address various driving settings; 2) the working pipeline annotation tool is under-studied in the literature to improve the quality of MOTS learning examples.
1 code implementation • ICCV 2021 • Yiming Cui, Liqi Yan, Zhiwen Cao, Dongfang Liu
One of the popular solutions is to exploit the temporal information and enhance per-frame representation through aggregating features from neighboring frames.
1 code implementation • CVPR 2021 • Dongfang Liu, Yiming Cui, Wenbo Tan, Yingjie Chen
Video instance segmentation (VIS) is a new and critical task in computer vision.
1 code implementation • 18 Feb 2021 • Liqi Yan, Yiming Cui, Yingjie Chen, Dongfang Liu
We extract the hierarchical feature maps from a convolutional neural network (CNN) and organically fuse the extracted features for image representations.
no code implementations • ICCV 2021 • Alireza Naghizadeh, Hongye Xu, Mohab Mohamed, Dimitris N. Metaxas, Dongfang Liu
The importance of this subject is nested in the amount of training data that artificial neural networks need to accurately identify and segment objects in images and the infeasibility of acquiring a sufficient dataset within the biomedical field.
1 code implementation • 4 Dec 2020 • Dongfang Liu, Yiming Cui, Liqi Yan, Christos Mousas, Baijian Yang, Yingjie Chen
In this work, we introduce a Denser Feature Network (DenserNet) for visual localization.
no code implementations • 14 Oct 2020 • Zhiwen Cao, Zongcheng Chu, Dongfang Liu, Yingjie Chen
This paper proposes to use the three vectors in a rotation matrix as the representation in head pose estimation and develops a new neural network based on the characteristic of such representation.
no code implementations • 1 Sep 2020 • Liqi Yan, Dongfang Liu, Yaoxian Song, Changbin Yu
Memory is important for the agent to avoid repeating certain tasks unnecessarily and to adapt adequately to new scenes; we therefore make use of meta-learning.
Ranked #1 on Visual Navigation on AI2-THOR
no code implementations • 13 Aug 2020 • Dongfang Liu, Yiming Cui, Xiaolei Guo, Wei Ding, Baijian Yang, Yingjie Chen
It is a common practice for vehicles to use GPS to acquire location information.