no code implementations • 18 Sep 2023 • Shaofei Huang, Han Li, Yuqing Wang, Hongji Zhu, Jiao Dai, Jizhong Han, Wenge Rong, Si Liu
Explicit object-level semantic correspondence between audio and visual modalities is established by gathering object information from visual features with predefined audio queries.
1 code implementation • 11 Sep 2023 • Bo Zhang, Xinyu Cai, Jiakang Yuan, Donglin Yang, Jianfei Guo, Xiangchao Yan, Renqiu Xia, Botian Shi, Min Dou, Tao Chen, Si Liu, Junchi Yan, Yu Qiao
Domain shifts such as sensor type changes and geographical situation variations are prevalent in Autonomous Driving (AD), which poses a challenge since an AD model relying on previous-domain knowledge can hardly be deployed directly to a new domain without additional costs.
no code implementations • 31 Aug 2023 • Si Liu, Chen Gao, Yuan Chen, Xingyu Peng, Xianghao Kong, Kun Wang, Runsheng Xu, Wentao Jiang, Hao Xiang, Jiaqi Ma, Miao Wang
Specifically, we analyze the performance changes of different methods under different bandwidths, providing a deep insight into the performance-bandwidth trade-off issue.
no code implementations • 20 Aug 2023 • Jinyu Chen, Wenguan Wang, Si Liu, Hongsheng Li, Yi Yang
CCPD transfers the fundamental, point-to-point wayfinding skill that is well trained on the large-scale PointGoal task to ORAN, so as to help ORAN to better master audio-visual navigation with far fewer training samples.
no code implementations • 5 Aug 2023 • Qiaosong Qi, Le Zhuo, Aixi Zhang, Yue Liao, Fei Fang, Si Liu, Shuicheng Yan
To address these limitations, we present a novel cascaded motion diffusion model, DiffDance, designed for high-resolution, long-form dance generation.
no code implementations • 29 Jun 2023 • Le Zhuo, Ruibin Yuan, Jiahao Pan, Yinghao Ma, Yizhi Li, Ge Zhang, Si Liu, Roger Dannenberg, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenhu Chen, Wei Xue, Yike Guo
We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic lyrics transcription method achieving state-of-the-art performance on various lyrics transcription datasets, even in challenging genres such as rock and metal.
no code implementations • 18 Jun 2023 • Ruibin Yuan, Yinghao Ma, Yizhi Li, Ge Zhang, Xingran Chen, Hanzhi Yin, Le Zhuo, Yiqi Liu, Jiawen Huang, Zeyue Tian, Binyue Deng, Ningzhi Wang, Chenghua Lin, Emmanouil Benetos, Anton Ragni, Norbert Gyenge, Roger Dannenberg, Wenhu Chen, Gus Xia, Wei Xue, Si Liu, Shi Wang, Ruibo Liu, Yike Guo, Jie Fu
This is evident in the limited work on deep music representations, the scarcity of large-scale datasets, and the absence of a universal and community-driven benchmark.
no code implementations • 26 May 2023 • Zhiyi Xue, Si Liu, Zhaodi Zhang, Yiting Wu, Min Zhang
In this paper, we study existing approaches and identify a dominant factor in defining tight approximation, namely the approximation domain of the activation function.
1 code implementation • CVPR 2023 • Jingqiu Zhou, Linjiang Huang, Liang Wang, Si Liu, Hongsheng Li
Besides, the generated pseudo-labels can be fluctuating and inaccurate at the early stage of training.
Pseudo Label
Weakly-supervised Temporal Action Localization
no code implementations • CVPR 2023 • Zongheng Tang, Yifan Sun, Si Liu, Yi Yang
Second, through our design, the object queries and the foreground query in the decoder share consensus on the class semantics, therefore making the strong and weak supervision mutually benefit each other for domain alignment.
no code implementations • 9 Apr 2023 • Yulu Gao, Chonghao Sima, Shaoshuai Shi, Shangzhe Di, Si Liu, Hongyang Li
With the prevalence of multimodal learning, camera-LiDAR fusion has gained popularity in 3D object detection.
1 code implementation • CVPR 2023 • Zhaodi Zhang, Zhiyi Xue, Yang Chen, Si Liu, Yueling Zhang, Jing Liu, Min Zhang
Via abstraction, all perturbed images are mapped into intervals before feeding into neural networks for training.
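The mapping from pixel values to intervals can be illustrated with a minimal sketch. It assumes pixel values normalized to [0, 1] and a fixed interval width; the function names and the uniform partition are hypothetical choices for illustration, not the authors' implementation.

```python
import math
from typing import List, Tuple

Interval = Tuple[float, float]


def abstract_pixel(value: float, width: float) -> Interval:
    """Map one pixel value in [0, 1] to the fixed-width interval containing it.

    The uniform partition of [0, 1] is an assumption made for this sketch.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("pixel value must lie in [0, 1]")
    if not 0.0 < width <= 1.0:
        raise ValueError("interval width must lie in (0, 1]")
    n_intervals = math.ceil(1.0 / width)
    # Clamp so that value == 1.0 falls into the last interval.
    index = min(int(value / width), n_intervals - 1)
    lower = index * width
    upper = min(lower + width, 1.0)
    return (lower, upper)


def abstract_image(pixels: List[float], width: float) -> List[Interval]:
    """Abstract every pixel of a flattened image into its interval."""
    return [abstract_pixel(p, width) for p in pixels]
```

Under this scheme, two slightly different pixel values such as 0.30 and 0.32 fall into the same interval when the width is 0.25, so a network trained on intervals sees them as the same input.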
1 code implementation • CVPR 2023 • Luting Wang, Yi Liu, Penghui Du, Zihan Ding, Yue Liao, Qiaosong Qi, Biaolong Chen, Si Liu
When extracting object knowledge from PVLMs, the former adaptively transforms object proposals and adopts object-aware mask attention to obtain precise and complete knowledge of objects.
Ranked #7 on Open Vocabulary Object Detection on MSCOCO
1 code implementation • 2 Mar 2023 • Rongyao Fang, Peng Gao, Aojun Zhou, Yingjie Cai, Si Liu, Jifeng Dai, Hongsheng Li
The first method is One-to-many Matching via Data Augmentation (denoted as DataAug-DETR).
1 code implementation • CVPR 2023 • Shaofei Huang, Zhenwei Shen, Zehao Huang, Zi-han Ding, Jiao Dai, Jizhong Han, Naiyan Wang, Si Liu
An attempt has been made to get rid of BEV and predict 3D lanes from FV representations directly, while it still underperforms other BEV-based methods given its lack of structured representation for 3D lanes.
Ranked #3 on 3D Lane Detection on Apollo Synthetic 3D Lane
no code implementations • 6 Jan 2023 • Zitian Wang, Zehao Huang, Jiahui Fu, Naiyan Wang, Si Liu
For the generated queries, we design a sparse cross attention module to force them to focus on the features of specific objects, which suppresses interference from noise.
1 code implementation • CVPR 2023 • Chen Gao, Xingyu Peng, Mi Yan, He Wang, Lirong Yang, Haibing Ren, Hongsheng Li, Si Liu
In this paper, we propose an Adaptive Zone-aware Hierarchical Planner (AZHP) that explicitly divides the navigation process into two heterogeneous phases, i.e., sub-goal setting via zone partition/selection (high-level action) and sub-goal executing (low-level action), for hierarchical planning.
1 code implementation • CVPR 2023 • Tianrui Hui, Zizheng Xun, Fengguang Peng, Junshi Huang, Xiaoming Wei, Xiaolin Wei, Jiao Dai, Jizhong Han, Si Liu
To alleviate these limitations, we propose a novel Template-Bridged Search region Interaction (TBSI) module which exploits templates as the medium to bridge the cross-modal interaction between RGB and TIR search regions by gathering and distributing target-relevant object and environment contexts.
1 code implementation • 2 Dec 2022 • Fangxun Shu, Biaolong Chen, Yue Liao, Shuwen Xiao, Wenyu Sun, Xiaobo Li, Yousong Zhu, Jinqiao Wang, Si Liu
Our MAC aims to reduce video representation's spatial and temporal redundancy in the VidLP model by a mask sampling mechanism to improve pre-training efficiency.
Ranked #31 on Video Retrieval on MSR-VTT-1kA (using extra training data)
1 code implementation • 29 Nov 2022 • Xinyu Cai, Wentao Jiang, Runsheng Xu, Wenquan Zhao, Jiaqi Ma, Si Liu, Yikang Li
Through simulating point cloud data in different LiDAR placements, we can evaluate the perception accuracy of these placements using multiple detection models.
1 code implementation • 22 Nov 2022 • Linjiang Huang, Kaixin Lu, Guanglu Song, Liang Wang, Si Liu, Yu Liu, Hongsheng Li
In this paper, we present a novel training scheme, namely Teach-DETR, to learn better DETR-based detectors from versatile teacher detectors.
no code implementations • 21 Nov 2022 • Jiaxu Tian, Dapeng Zhi, Si Liu, Peixin Wang, Guy Katz, Min Zhang
In this paper we propose a novel, tight and scalable reachability analysis approach for DRL systems.
1 code implementation • 21 Nov 2022 • Le Zhuo, Zhaokai Wang, Baisen Wang, Yue Liao, Chenxi Bao, Stanley Peng, Songhao Han, Aixi Zhang, Fei Fang, Si Liu
We believe our dataset, benchmark model, and evaluation metric will boost the development of video background music generation.
no code implementations • 21 Nov 2022 • Yiting Wu, Zhaodi Zhang, Zhiyi Xue, Si Liu, Min Zhang
We observe that existing approaches only rely on overestimated domains, while the corresponding tight approximation may not necessarily be tight on its actual domain.
no code implementations • 6 Oct 2022 • Yuanbin Wang, Leyan Zhu, Shaofei Huang, Tianrui Hui, Xiaojie Li, Fei Wang, Si Liu
To better bridge the domain gap between source domain (synthetic data) and target domain (real-world data), we also propose a Selective Feature Alignment (SFA) module which only aligns the features of consistent foreground area between the two domains, thus realizing inter-domain intra-modality adaptation.
no code implementations • 4 Oct 2022 • Xiangjian Jiang, Xuecheng Nie, Zitian Wang, Luoqi Liu, Si Liu
Existing methods for human mesh recovery mainly focus on single-view frameworks, but they often fail to produce accurate results due to the ill-posed setup.
2 code implementations • 12 Sep 2022 • Hongyang Li, Chonghao Sima, Jifeng Dai, Wenhai Wang, Lewei Lu, Huijie Wang, Jia Zeng, Zhiqi Li, Jiazhi Yang, Hanming Deng, Hao Tian, Enze Xie, Jiangwei Xie, Li Chen, Tianyu Li, Yang Li, Yulu Gao, Xiaosong Jia, Si Liu, Jianping Shi, Dahua Lin, Yu Qiao
As sensor configurations get more complex, integrating multi-source information from different sensors and representing features in a unified view become vitally important.
no code implementations • 21 Aug 2022 • Zhaodi Zhang, Yiting Wu, Si Liu, Jing Liu, Min Zhang
Considerable efforts have been devoted to finding the so-called tighter approximations to obtain more precise verification results.
1 code implementation • 16 Aug 2022 • Wentao Jiang, Sheng Jin, Wentao Liu, Chen Qian, Ping Luo, Si Liu
Human pose estimation aims to accurately estimate a wide variety of human poses.
1 code implementation • 11 Aug 2022 • Zihan Ding, Zi-han Ding, Tianrui Hui, Junshi Huang, Xiaoming Wei, Xiaolin Wei, Si Liu
To alleviate these drawbacks, we propose a one-stage end-to-end Pixel-Phrase Matching Network (PPMN), which directly matches each phrase to its corresponding pixels instead of region proposals and outputs panoptic segmentation by simple combination.
1 code implementation • 19 Jul 2022 • Yusheng Zhao, Jinyu Chen, Chen Gao, Wenguan Wang, Lirong Yang, Haibing Ren, Huaxia Xia, Si Liu
Vision-language navigation is the task of directing an embodied agent to navigate in 3D scenes with natural language instructions.
1 code implementation • 12 Jul 2022 • Luting Wang, Xiaojie Li, Yue Liao, Zeren Jiang, Jianlong Wu, Fei Wang, Chen Qian, Si Liu
We observe that the core difficulty for heterogeneous KD (hetero-KD) is the significant semantic gap between the backbone features of heterogeneous detectors due to the different optimization manners.
1 code implementation • CVPR 2022 • Zihan Ding, Tianrui Hui, Junshi Huang, Xiaoming Wei, Jizhong Han, Si Liu
Referring video object segmentation aims to predict foreground labels for objects referred by natural language expressions in videos.
Ranked #5 on Referring Video Object Segmentation on MeViS
1 code implementation • CVPR 2022 • Jinyu Chen, Chen Gao, Erli Meng, Qiong Zhang, Si Liu
However, the crucial navigation clues (i.e., object-level environment layout) for the embodied navigation task are discarded since the maintained vector is essentially unstructured.
1 code implementation • CVPR 2022 • Junyu Luo, Jiahui Fu, Xianghao Kong, Chen Gao, Haibing Ren, Hao Shen, Huaxia Xia, Si Liu
3D visual grounding aims to locate the referred target object in 3D point cloud scenes according to a free-form language description.
no code implementations • 30 Mar 2022 • Mingfei Chen, Yue Liao, Si Liu, Fei Wang, Jenq-Neng Hwang
RS takes previous detected results as references to aggregate the corresponding features from the combined features of the adjacent frames and makes a one-to-one track state prediction for each reference in parallel.
1 code implementation • CVPR 2022 • Yue Liao, Aixi Zhang, Miao Lu, Yongliang Wang, Xiaobo Li, Si Liu
In this paper, we reveal and address the disadvantages of conventional query-driven HOI detectors from two aspects.
Ranked #11 on Human-Object Interaction Detection on HICO-DET
1 code implementation • CVPR 2022 • Zitian Wang, Xuecheng Nie, Xiaochao Qu, Yunpeng Chen, Si Liu
In this paper, we present a novel Distribution-Aware Single-stage (DAS) model for tackling the challenging multi-person 3D pose estimation problem.
3D Multi-Person Pose Estimation (absolute)
3D Multi-Person Pose Estimation (root-relative)
no code implementations • 12 Oct 2021 • Jiahui Fu, Guanghui Ren, Yunpeng Chen, Si Liu
In contrast, 2D grid-based methods, such as PointPillar, can easily achieve a stable and efficient speed based on simple 2D convolution, but they struggle to reach competitive accuracy because of their coarse-grained point cloud representation.
1 code implementation • NeurIPS 2021 • Aixi Zhang, Yue Liao, Si Liu, Miao Lu, Yongliang Wang, Chen Gao, Xiaobo Li
To this end, we propose a novel one-stage framework that disentangles human-object detection and interaction classification in a cascade manner.
Ranked #7 on Human-Object Interaction Detection on V-COCO
no code implementations • 5 Aug 2021 • Dailan He, Yusheng Zhao, Junyu Luo, Tianrui Hui, Shaofei Huang, Aixi Zhang, Si Liu
Existing works usually adopt dynamic graph networks to indirectly model the intra/inter-modal interactions, making it difficult for the model to distinguish the referred object from distractors due to the monolithic representations of visual and linguistic contents.
1 code implementation • CVPR 2021 • Chen Gao, Jinyu Chen, Si Liu, Luting Wang, Qiong Zhang, Qi Wu
The Remote Embodied Referring Expression (REVERIE) is a recently raised task that requires an agent to navigate to and localise a referred remote object according to a high-level language instruction.
1 code implementation • 8 Jun 2021 • MingJie Sun, Jimin Xiao, Eng Gee Lim, Si Liu, John Y. Goulermas
In this paper, we tackle the weakly-supervised referring expression grounding task, which localizes a referent object in an image according to a query sentence, where the mapping between image regions and queries is not available during the training stage.
1 code implementation • 26 May 2021 • Si Liu, Wentao Jiang, Chen Gao, Ran He, Jiashi Feng, Bo Li, Shuicheng Yan
In this paper, we address the makeup transfer and removal tasks simultaneously, which aim to transfer the makeup from a reference image to a source image and remove the makeup from the with-makeup image respectively.
no code implementations • 24 May 2021 • Si Liu, Zitian Wang, Yulu Gao, Lejian Ren, Yue Liao, Guanghui Ren, Bo Li, Shuicheng Yan
For the above exemplar case, our HRS task produces results in the form of relation triplets <girl [left hand], hold, book> and exact segmentation masks of the book, with which the robot can easily accomplish the grabbing task.
1 code implementation • 15 May 2021 • Si Liu, Tianrui Hui, Shaofei Huang, Yunchao Wei, Bo Li, Guanbin Li
In this paper, we propose a Cross-Modal Progressive Comprehension (CMPC) scheme to effectively mimic human behaviors and implement it as a CMPC-I (Image) module and a CMPC-V (Video) module to improve referring image and video segmentation models.
Ranked #7 on Referring Expression Segmentation on J-HMDB
no code implementations • CVPR 2021 • Tianrui Hui, Shaofei Huang, Si Liu, Zihan Ding, Guanbin Li, Wenguan Wang, Jizhong Han, Fei Wang
Though 3D convolutions are amenable to recognizing which actor is performing the queried actions, they also inevitably introduce misaligned spatial information from adjacent frames, which confuses features of the target frame and yields inaccurate segmentation.
Ranked #8 on Referring Expression Segmentation on J-HMDB
1 code implementation • CVPR 2021 • Mingfei Chen, Yue Liao, Si Liu, ZhiYuan Chen, Fei Wang, Chen Qian
To attain this, we map a trainable interaction query set to an interaction prediction set with a transformer.
Ranked #27 on Human-Object Interaction Detection on HICO-DET (using extra training data)
1 code implementation • CVPR 2021 • Tianfei Zhou, Wenguan Wang, Si Liu, Yi Yang, Luc van Gool
To address the challenging task of instance-aware human part parsing, a new bottom-up regime is proposed to learn category-level human semantic segmentation as well as multi-person pose estimation in a joint and end-to-end manner.
no code implementations • CVPR 2021 • Xing Dai, Zeren Jiang, Zhao Wu, Yiping Bao, Zhicheng Wang, Si Liu, Erjin Zhou
In recent years, knowledge distillation has been proved to be an effective solution for model compression.
no code implementations • 20 Jan 2021 • Wentao Xie, Guanghui Ren, Si Liu
Considering the complexity of doing visual relation detection in videos, we decompose this task into three sub-tasks: object detection, trajectory proposal and relation prediction.
no code implementations • 11 Jan 2021 • Shaofei Huang, Si Liu, Tianrui Hui, Jizhong Han, Bo Li, Jiashi Feng, Shuicheng Yan
Our ORDNet is able to extract more comprehensive context information and adapt well to complex spatial variance in scene images.
no code implementations • ICCV 2021 • Wentao Jiang, Ning Xu, Jiayun Wang, Chen Gao, Jing Shi, Zhe Lin, Si Liu
Given the cycle, we propose several free augmentation strategies to help our model understand various editing requests given the imbalanced dataset.
1 code implementation • 7 Dec 2020 • Zhaokai Wang, Renda Bao, Qi Wu, Si Liu
Our CNMT consists of reading, reasoning, and generation modules, in which the Reading Module employs better OCR systems to enhance text reading ability and a confidence embedding to select the most noteworthy tokens.
1 code implementation • 10 Nov 2020 • Zongheng Tang, Yue Liao, Si Liu, Guanbin Li, Xiaojie Jin, Hongxu Jiang, Qian Yu, Dong Xu
HC-STVG is a video grounding task that requires both spatial (where) and temporal (when) localization.
1 code implementation • ECCV 2020 • Tianrui Hui, Si Liu, Shaofei Huang, Guanbin Li, Sansi Yu, Faxi Zhang, Jizhong Han
Referring image segmentation aims to predict the foreground mask of the object referred by a natural language sentence.
1 code implementation • CVPR 2020 • Shaofei Huang, Tianrui Hui, Si Liu, Guanbin Li, Yunchao Wei, Jizhong Han, Luoqi Liu, Bo Li
In addition to the CMPC module, we further leverage a simple yet effective TGFE module to integrate the reasoned multimodal features from different levels with the guidance of textual information.
Ranked #11 on Referring Expression Segmentation on RefCOCO testB
no code implementations • 2 Jun 2020 • Chen Gao, Si Liu, Ran He, Shuicheng Yan, Bo Li
The LGR module utilizes body skeleton knowledge to construct a layout graph that connects all relevant part features, where a graph reasoning mechanism is used to propagate information among part nodes to mine their relations.
1 code implementation • 18 Jan 2020 • Jie Wu, Guanbin Li, Si Liu, Liang Lin
Temporal language grounding in untrimmed videos is a newly raised task in video understanding.
1 code implementation • CVPR 2020 • Yue Liao, Si Liu, Fei Wang, Yanjie Chen, Chen Qian, Jiashi Feng
Human and object points are the center of the detection boxes, and the interaction point is the midpoint of the human and object points.
Ranked #24 on Human-Object Interaction Detection on V-COCO
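The point definitions in the entry above (box centers for the human and the object, and their midpoint as the interaction point) can be written out as a small sketch. The `(x_min, y_min, x_max, y_max)` box layout and the function names are assumptions made for illustration; the listing does not specify them.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]
Point = Tuple[float, float]


def box_center(box: Box) -> Point:
    """Return the center of a detection box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def interaction_point(human_box: Box, object_box: Box) -> Point:
    """Return the midpoint between the human point and the object point."""
    hx, hy = box_center(human_box)
    ox, oy = box_center(object_box)
    return ((hx + ox) / 2.0, (hy + oy) / 2.0)
```

For a human box `(0, 0, 2, 4)` and an object box `(4, 0, 8, 4)`, the human point is `(1, 2)`, the object point is `(6, 2)`, and the interaction point is `(3.5, 2.0)`.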
1 code implementation • CVPR 2020 • Chen Gao, Yunpeng Chen, Si Liu, Zhenxiong Tan, Shuicheng Yan
In this paper, we propose an AdversarialNAS method specially tailored for Generative Adversarial Networks (GANs) to search for a superior generative model on the task of unconditional image generation.
1 code implementation • 20 Nov 2019 • Guanglin Niu, Yongfei Zhang, Bo Li, Peng Cui, Si Liu, Jingyang Li, Xiaowei Zhang
Representation learning on a knowledge graph (KG) is to embed entities and relations of a KG into low-dimensional continuous vector spaces.
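As a rough illustration of what embedding entities and relations into a vector space means, the sketch below scores a triple with a translation-style distance in the spirit of TransE. This is a well-known generic scheme swapped in for illustration, not necessarily the method of the paper listed here; the function name and the toy vectors are hypothetical.

```python
import math
from typing import Sequence


def translation_distance(
    head: Sequence[float], relation: Sequence[float], tail: Sequence[float]
) -> float:
    """Euclidean distance between (head + relation) and tail.

    A small distance means the triple (head, relation, tail) is plausible
    under a translation-style embedding model.
    """
    if not (len(head) == len(relation) == len(tail)):
        raise ValueError("all embeddings must share the same dimension")
    return math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )
```

With toy 2-dimensional embeddings, a tail that equals head plus relation gets distance 0, while an unrelated tail gets a larger distance and would be ranked as less plausible.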
1 code implementation • ICCV 2019 • Guan'an Wang, Tianzhu Zhang, Jian Cheng, Si Liu, Yang Yang, Zeng-Guang Hou
First, it can exploit pixel alignment and feature alignment jointly.
Cross-Modality Person Re-identification
Person Re-Identification
no code implementations • 25 Sep 2019 • Defa Zhu, Si Liu, Wentao Jiang, Guanbin Li, Tianyi Wu, Guodong Guo
Visual relationship recognition models are limited in the ability to generalize from finite seen predicates to unseen ones.
1 code implementation • CVPR 2020 • Wentao Jiang, Si Liu, Chen Gao, Jie Cao, Ran He, Jiashi Feng, Shuicheng Yan
In this paper, we address the makeup transfer task, which aims to transfer the makeup from a reference image to a source image.
no code implementations • CVPR 2020 • Yue Liao, Si Liu, Guanbin Li, Fei Wang, Yanjie Chen, Chen Qian, Bo Li
RCCF reformulates the referring expression comprehension as a correlation filtering process.
no code implementations • 26 Jul 2019 • Defa Zhu, Si Liu, Wentao Jiang, Chen Gao, Tianyi Wu, Qiangchang Wang, Guodong Guo
To address this issue, we propose a method called Untraceable GAN, which has a novel source classifier to differentiate which domain an image is translated from, and determines whether the translated image still retains the characteristics of the source domain.
no code implementations • IEEE Transactions on Image Processing 2019 • Zhen Wei, Si Liu, Yao Sun, Hefei Ling
In this paper, we propose a design scheme for deep learning networks in the face parsing task with promising accuracy and real-time inference speed.
Ranked #6 on Face Parsing on CelebAMask-HQ
1 code implementation • ICML 2018 • Si Liu, Risheek Garrepalli, Thomas G. Dietterich, Alan Fern, Dan Hendrycks
Further, while there are algorithms for open category detection, there are few empirical results that directly report alien detection rates.
no code implementations • 10 May 2018 • Xiaobo Wang, Shifeng Zhang, Zhen Lei, Si Liu, Xiaojie Guo, Stan Z. Li
On the other hand, the learned classifier of softmax loss is weak.
no code implementations • 1 Feb 2018 • Si Liu, Yao Sun, Defa Zhu, Renda Bao, Wei Wang, Xiangbo Shu, Shuicheng Yan
The age discriminative network guides the synthesized face to fit the real conditional distribution.
no code implementations • 4 Jan 2018 • Si Liu, Yao Sun, Defa Zhu, Guanghui Ren, Yu Chen, Jiashi Feng, Jizhong Han
Our proposed model explicitly learns a feature compensation network, which is specialized for mitigating the cross-domain differences.
1 code implementation • 26 Jul 2017 • Bingke Zhu, Yingying Chen, Jinqiao Wang, Si Liu, Bo Zhang, Ming Tang
Finally, an automatic portrait animation system based on fast deep matting is built on mobile devices, which does not need any interaction and can realize real-time matting with 15 fps.
no code implementations • CVPR 2017 • Zhen Wei, Yao Sun, Jinqiao Wang, Hanjiang Lai, Si Liu
In this paper, we introduce a novel approach to regulate receptive field in deep image parsing network automatically.
no code implementations • CVPR 2017 • Si Liu, Changhu Wang, Ruihe Qian, Han Yu, Renda Bao
In this paper, we develop a Single frame Video Parsing (SVP) method which requires only one labeled frame per video in the training stage.
no code implementations • CVPR 2016 • Hua Zhang, Si Liu, Changqing Zhang, Wenqi Ren, Rui Wang, Xiaochun Cao
In this study, we present a weakly supervised approach that discovers the discriminative structures of sketch images, given pairs of sketch images and web images.
no code implementations • CVPR 2016 • Si Liu, Tianzhu Zhang, Xiaochun Cao, Changsheng Xu
In this paper, we propose a novel structural correlation filter (SCF) model for robust visual tracking.
no code implementations • 25 Apr 2016 • Si Liu, Xinyu Ou, Ruihe Qian, Wei Wang, Xiaochun Cao
In this paper, we propose a novel Deep Localized Makeup Transfer Network to automatically recommend the most suitable makeup for a female and synthesize the makeup on her face.
no code implementations • ICCV 2015 • Changqing Zhang, Huazhu Fu, Si Liu, Guangcan Liu, Xiaochun Cao
We introduce a low-rank tensor constraint to explore the complementary information from multiple views and, accordingly, establish a novel method called Low-rank Tensor constrained Multiview Subspace Clustering (LT-MSC).
no code implementations • ICCV 2015 • Xiaodan Liang, Si Liu, Yunchao Wei, Luoqi Liu, Liang Lin, Shuicheng Yan
Then the concept detector can be fine-tuned based on these new instances.
no code implementations • ICCV 2015 • Xiaodan Liang, Chunyan Xu, Xiaohui Shen, Jianchao Yang, Si Liu, Jinhui Tang, Liang Lin, Shuicheng Yan
In this work, we address the human parsing task with a novel Contextualized Convolutional Neural Network (Co-CNN) architecture, which well integrates the cross-layer context, global image-level context, within-super-pixel context and cross-super-pixel neighborhood context into a unified network.
no code implementations • CVPR 2015 • Xiaochun Cao, Changqing Zhang, Huazhu Fu, Si Liu, Hua Zhang
In this paper, we focus on how to boost the multi-view clustering by exploring the complementary information among multi-view features.
no code implementations • CVPR 2015 • Tianzhu Zhang, Si Liu, Changsheng Xu, Shuicheng Yan, Bernard Ghanem, Narendra Ahuja, Ming-Hsuan Yang
Sparse representation has been applied to visual tracking by using target templates to find the target candidate with minimal reconstruction error.
no code implementations • CVPR 2015 • Si Liu, Xiaodan Liang, Luoqi Liu, Xiaohui Shen, Jianchao Yang, Changsheng Xu, Liang Lin, Xiaochun Cao, Shuicheng Yan
Under the classic K Nearest Neighbor (KNN)-based nonparametric framework, the parametric Matching Convolutional Neural Network (M-CNN) is proposed to predict the matching confidence and displacements of the best matched region in the testing image for a particular semantic region in one KNN image.
1 code implementation • 9 Mar 2015 • Xiaodan Liang, Si Liu, Xiaohui Shen, Jianchao Yang, Luoqi Liu, Jian Dong, Liang Lin, Shuicheng Yan
The first CNN network is with max-pooling, and designed to predict the template coefficients for each label mask, while the second CNN network is without max-pooling to preserve sensitivity to label mask position and accurately predict the active shape parameters.
no code implementations • 11 Nov 2014 • Xiaodan Liang, Si Liu, Yunchao Wei, Luoqi Liu, Liang Lin, Shuicheng Yan
Then the concept detector can be fine-tuned based on these new instances.