no code implementations • 2 Sep 2024 • Yang Zhang, Rui Zhang, Xuecheng Nie, Haochen Li, Jikun Chen, Yifan Hao, Xin Zhang, Luoqi Liu, Ling Li
We found that attribute confusion occurs when a certain region of the latent features attends to multiple or incorrect prompt tokens.
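The paper's remedy is not reproduced here, but the observation itself can be pictured with a minimal sketch (shapes and names below are illustrative, not the paper's code): mapping each spatial position of a cross-attention map to its dominant prompt token exposes regions that attend to several or unrelated tokens.

```python
import torch

def dominant_token_map(cross_attn: torch.Tensor, h: int, w: int) -> torch.Tensor:
    """Assign each latent position to the prompt token it attends to most.

    cross_attn: (h*w, num_tokens) cross-attention weights from a text-to-image
    diffusion model (illustrative shape). Regions whose dominant-token indices
    mix several attribute tokens are candidates for the attribute confusion
    described above.
    """
    return cross_attn.argmax(dim=-1).reshape(h, w)   # (h, w) map of token indices

# toy usage: random weights standing in for a real attention map
attn = torch.rand(16 * 16, 8).softmax(dim=-1)
print(dominant_token_map(attn, 16, 16))
```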
no code implementations • 25 Aug 2024 • Minghao Liu, Le Zhang, Yingjie Tian, Xiaochao Qu, Luoqi Liu, Ting Liu
Recent advances in text-to-image diffusion models have demonstrated impressive capabilities in image quality.
no code implementations • 21 Aug 2024 • Chongkai Yu, Anqi Li, Xiaochao Qu, Luoqi Liu, Ting Liu
Experiments demonstrate the effectiveness and efficiency of our method in tackling complex cases with multiple interactions.
no code implementations • 20 Aug 2024 • Chen Liang, Qiang Guo, Xiaochao Qu, Luoqi Liu, Ting Liu
Video segmentation aims at partitioning video sequences into meaningful segments based on objects or regions of interest within frames.
2 code implementations • 24 Jun 2024 • Henghui Ding, Chang Liu, Yunchao Wei, Nikhila Ravi, Shuting He, Song Bai, Philip Torr, Deshui Miao, Xin Li, Zhenyu He, YaoWei Wang, Ming-Hsuan Yang, Zhensong Xu, Jiangtao Yao, Chengjing Wu, Ting Liu, Luoqi Liu, Xinyu Liu, Jing Zhang, Kexin Zhang, Yuting Yang, Licheng Jiao, Shuyuan Yang, Mingqi Gao, Jingnan Luo, Jinyu Yang, Jungong Han, Feng Zheng, Bin Cao, Yisi Zhang, Xuanxu Lin, Xingjian He, Bo Zhao, Jing Liu, Feiyu Pan, Hao Fang, Xiankai Lu
Moreover, we provide a new motion expression guided video segmentation dataset, MeViS, to study natural language-guided video understanding in complex environments.
no code implementations • 12 Jun 2024 • Zhensong Xu, Jiangtao Yao, Chengjing Wu, Ting Liu, Luoqi Liu
Our method ranked 2nd in the MOSE track of PVUW 2024, with a $\mathcal{J}$ of 0.8007, an $\mathcal{F}$ of 0.8683, and a $\mathcal{J}$\&$\mathcal{F}$ of 0.8345.
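For reference, $\mathcal{J}$\&$\mathcal{F}$ is the arithmetic mean of the region similarity $\mathcal{J}$ and the contour accuracy $\mathcal{F}$, which matches the reported scores: $(0.8007 + 0.8683)/2 = 0.8345$.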
no code implementations • 7 Jun 2024 • Chen Liang, Qiang Guo, Chongkai Yu, Chengjing Wu, Ting Liu, Luoqi Liu
MVC enforces consistency between predictions of masked frames, in which random patches are withheld.
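One plausible reading of the sentence above, sketched minimally below: predictions on a frame with random patches withheld are pushed toward predictions on the unmasked frame. The patch size, mask ratio, and KL-based loss are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def masked_consistency_loss(model, frame: torch.Tensor,
                            patch: int = 16, mask_ratio: float = 0.5) -> torch.Tensor:
    """Consistency between predictions on a frame and a randomly masked copy of it.

    frame: (B, C, H, W); `model` returns per-pixel class logits. Patch size,
    mask ratio, and the KL objective are illustrative choices.
    """
    B, _, H, W = frame.shape
    keep = (torch.rand(B, 1, H // patch, W // patch, device=frame.device) > mask_ratio).float()
    mask = F.interpolate(keep, size=(H, W), mode="nearest")       # 1 = visible patch
    masked_frame = frame * mask                                   # withhold random patches

    with torch.no_grad():
        target = model(frame).softmax(dim=1)                      # prediction on the full frame
    pred = model(masked_frame).log_softmax(dim=1)                 # prediction on the masked frame
    return F.kl_div(pred, target, reduction="batchmean")          # enforce consistency
```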
no code implementations • 6 Jun 2024 • Ruipu Wu, Jifei Che, Han Li, Chengjing Wu, Ting Liu, Luoqi Liu
Video panoptic segmentation extends panoptic segmentation from still images to video sequences.
no code implementations • CVPR 2024 • Runze He, Shaofei Huang, Xuecheng Nie, Tianrui Hui, Luoqi Liu, Jiao Dai, Jizhong Han, Guanbin Li, Si Liu
In this paper, we target the adaptive source-driven 3D scene editing task by proposing a CustomNeRF model that unifies a text description or a reference image as the editing prompt.
no code implementations • ICCV 2023 • Yan Fang, Feng Zhu, Bowen Cheng, Luoqi Liu, Yao Zhao, Yunchao Wei
This work shows that locating patch-wise noisy regions is a better way to deal with noise.
no code implementations • CVPR 2023 • Bonan Li, Yinhan Hu, Xuecheng Nie, Congying Han, Xiangjian Jiang, Tiande Guo, Luoqi Liu
Based on our exploration of the above three questions, we present the novel DropKey method, which takes the Key as the drop unit and adopts a decreasing schedule for the drop ratio, improving ViTs in a general way.
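A minimal sketch of the two ingredients named above. Implementing Key dropout as a random -inf mask on the attention logits before the softmax, and decaying the drop ratio linearly across layers, are assumed realizations of the idea, not the paper's reference implementation.

```python
import torch

def attention_with_dropkey(q, k, v, drop_ratio: float, training: bool = True):
    """Self-attention where randomly chosen Key entries are dropped before the softmax.

    q, k, v: (B, heads, N, d). Dropping is realized by setting random attention
    logits to -inf, so the softmax redistributes mass over the surviving Keys
    (an assumed implementation of the DropKey idea, not the official code).
    """
    logits = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)       # (B, heads, N, N)
    if training and drop_ratio > 0:
        drop = torch.rand_like(logits) < drop_ratio               # Bernoulli mask per logit
        logits = logits.masked_fill(drop, float("-inf"))
    return torch.softmax(logits, dim=-1) @ v

def decreasing_drop_ratio(layer: int, num_layers: int, base_ratio: float = 0.3):
    """Illustrative linearly decreasing schedule: shallow layers drop more Keys."""
    return base_ratio * (1.0 - layer / max(num_layers - 1, 1))
```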
no code implementations • 4 Oct 2022 • Xiangjian Jiang, Xuecheng Nie, Zitian Wang, Luoqi Liu, Si Liu
Existing methods for human mesh recovery mainly focus on single-view frameworks, but they often fail to produce accurate results due to the ill-posed setup.
2 code implementations • 24 Nov 2021 • David Junhao Zhang, Kunchang Li, Yali Wang, Yunpeng Chen, Shashwat Chandra, Yu Qiao, Luoqi Liu, Mike Zheng Shou
With such multi-dimension and multi-scale factorization, our MorphMLP block achieves a strong accuracy-computation trade-off.
Ranked #39 on Action Recognition on Something-Something V2 (using extra training data)
1 code implementation • CVPR 2020 • Shaofei Huang, Tianrui Hui, Si Liu, Guanbin Li, Yunchao Wei, Jizhong Han, Luoqi Liu, Bo Li
In addition to the CMPC module, we further leverage a simple yet effective TGFE module to integrate the reasoned multimodal features from different levels with the guidance of textual information.
no code implementations • ICCV 2017 • Shengtao Xiao, Jiashi Feng, Luoqi Liu, Xuecheng Nie, Wei Wang, Shuicheng Yan, Ashraf Kassim
To address these challenging issues, we introduce a novel recurrent 3D-2D dual learning model that alternately performs 2D-based 3D face model refinement and 3D-to-2D projection based 2D landmark refinement. This allows it to reliably reason about self-occluded landmarks, precisely capture subtle landmark displacements, and accurately detect landmarks even in the presence of extremely large poses.
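The alternation itself can be pictured with a short sketch; `refine_3d_from_2d` and `project_and_refine_2d` are hypothetical stand-ins for the paper's two recurrent refinement modules, and the number of iterations is illustrative.

```python
def recurrent_3d_2d_refinement(image, landmarks_2d, face_model_3d,
                               refine_3d_from_2d, project_and_refine_2d, steps: int = 4):
    """Alternately refine the 3D face model from the current 2D landmarks and refine
    the 2D landmarks from the projected 3D model, as described above.

    The two callables are hypothetical placeholders; `steps` is an assumed unroll length.
    """
    for _ in range(steps):
        face_model_3d = refine_3d_from_2d(image, landmarks_2d, face_model_3d)     # 2D -> 3D refinement
        landmarks_2d = project_and_refine_2d(image, face_model_3d, landmarks_2d)  # 3D -> 2D refinement
    return landmarks_2d, face_model_3d
```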
no code implementations • 22 Sep 2017 • Tam V. Nguyen, Luoqi Liu
Beautification of female facial images usually requires professional editing software, which is relatively difficult for ordinary users.
no code implementations • 23 May 2017 • Tam V. Nguyen, Luoqi Liu
Salient object detection has increasingly become a popular topic in cognitive and computational sciences, including computer vision and artificial intelligence research.
no code implementations • ICCV 2017 • Xiaojie Jin, Xin Li, Huaxin Xiao, Xiaohui Shen, Zhe Lin, Jimei Yang, Yunpeng Chen, Jian Dong, Luoqi Liu, Zequn Jie, Jiashi Feng, Shuicheng Yan
In this way, the network can effectively learn to capture video dynamics and temporal context, which are critical clues for video scene parsing, without requiring extra manual annotations.
no code implementations • 21 Aug 2016 • Jing Wang, Meng Wang, Pei-Pei Li, Luoqi Liu, Zhong-Qiu Zhao, Xuegang Hu, Xindong Wu
The problem assumes that features are generated individually, but there is group structure in the feature stream.
no code implementations • 24 Jul 2016 • Xiangyun Zhao, Xiaodan Liang, Luoqi Liu, Teng Li, Yugang Han, Nuno Vasconcelos, Shuicheng Yan
Objective functions for training of deep networks for face-related recognition tasks, such as facial expression recognition (FER), usually consider each sample independently.
Ranked #2 on Facial Expression Recognition (FER) on Oulu-CASIA
no code implementations • ICCV 2015 • Xiaodan Liang, Si Liu, Yunchao Wei, Luoqi Liu, Liang Lin, Shuicheng Yan
Then the concept detector can be fine-tuned based on these new instances.
no code implementations • ICCV 2015 • Xiangbo Shu, Jinhui Tang, Hanjiang Lai, Luoqi Liu, Shuicheng Yan
Second, it is challenging or even impossible to collect faces of all age groups for a particular subject, yet much easier and more practical to get face pairs from neighboring age groups.
no code implementations • CVPR 2015 • Si Liu, Xiaodan Liang, Luoqi Liu, Xiaohui Shen, Jianchao Yang, Changsheng Xu, Liang Lin, Xiaochun Cao, Shuicheng Yan
Under the classic K Nearest Neighbor (KNN)-based nonparametric framework, the parametric Matching Convolutional Neural Network (M-CNN) is proposed to predict the matching confidence and displacements of the best-matched region in the test image for a particular semantic region in one KNN image.
1 code implementation • 9 Mar 2015 • Xiaodan Liang, Si Liu, Xiaohui Shen, Jianchao Yang, Luoqi Liu, Jian Dong, Liang Lin, Shuicheng Yan
The first CNN uses max-pooling and is designed to predict the template coefficients for each label mask, while the second CNN omits max-pooling to preserve sensitivity to label mask position and accurately predict the active shape parameters.
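A minimal torch sketch of the described split; the layer widths, input size, number of labels, template-coefficient count, and shape-parameter count are illustrative assumptions, not the paper's architecture.

```python
import torch.nn as nn

class TemplateCoefficientNet(nn.Module):
    """First network: uses max-pooling; predicts template coefficients per label mask."""
    def __init__(self, num_labels: int = 18, num_templates: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Linear(64 * 16 * 16, num_labels * num_templates)  # assumes 64x64 inputs

    def forward(self, x):                        # x: (B, 3, 64, 64), illustrative size
        return self.head(self.features(x).flatten(1))

class ShapeParameterNet(nn.Module):
    """Second network: no max-pooling (strided convolutions instead, an assumed choice),
    so the flattened features keep positional information; predicts active shape parameters."""
    def __init__(self, num_labels: int = 18, params_per_label: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
        )
        self.head = nn.Linear(64 * 16 * 16, num_labels * params_per_label)

    def forward(self, x):                        # x: (B, 3, 64, 64), illustrative size
        return self.head(self.features(x).flatten(1))
```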