no code implementations • ECCV 2020 • Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Aleš Leonardis, Wengang Zhou, Qi Tian
When smartphone cameras are used to take photos of digital screens, moiré patterns usually result, severely degrading photo quality.
no code implementations • 16 Mar 2023 • Xinyue Huo, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian
Currently, a popular UDA framework lies in self-training which endows the model with two-fold abilities: (i) learning reliable semantics from the labeled images in the source domain, and (ii) adapting to the target domain via generating pseudo labels on the unlabeled images.
1 code implementation • 16 Mar 2023 • Zhendong Wang, Jianmin Bao, Wengang Zhou, Weilun Wang, Hezhen Hu, Hong Chen, Houqiang Li
We find that existing detectors struggle to detect images generated by diffusion models, even if we include generated images from a specific diffusion model in their training data.
no code implementations • 10 Feb 2023 • Weichao Zhao, Hezhen Hu, Wengang Zhou, Jiaxin Shi, Houqiang Li
In this work, we are dedicated to leveraging the BERT pre-training success and modeling the domain-specific statistics to fertilize the sign language recognition (SLR) model.
1 code implementation • 21 Jan 2023 • Hao Feng, Wengang Zhou, Yufei Yin, Jiajun Deng, Qi Sun, Houqiang Li
In this work, we propose a novel deep network architecture, i.e., PolySnake, for contour-based instance segmentation.
Ranked #1 on Semantic Contour Prediction on Sbd val
no code implementations • 28 Nov 2022 • YiXuan Wang, Wengang Zhou, Jianmin Bao, Weilun Wang, Li Li, Houqiang Li
The key idea of our CLIP2GAN is to bridge the output feature embedding space of CLIP and the input latent space of StyleGAN, which is realized by introducing a mapping network.
no code implementations • 28 Nov 2022 • Hezhen Hu, Weilun Wang, Wengang Zhou, Houqiang Li
In this work, we are dedicated to a new task, i.e., hand-object interaction image generation, which aims to conditionally generate the hand-object image under the given hand, object and their interaction status.
1 code implementation • 22 Nov 2022 • Weilun Wang, Jianmin Bao, Wengang Zhou, Dongdong Chen, Dong Chen, Lu Yuan, Houqiang Li
We present SinDiffusion, leveraging denoising diffusion models to capture internal distribution of patches from a single natural image.
Ranked #1 on Image Generation on Places50
no code implementations • 31 Oct 2022 • Yudong Lu, Jian Zhao, Youpeng Zhao, Wengang Zhou, Houqiang Li
We compare it with 8 baseline AI programs which are based on heuristic rules and the results reveal the outstanding performance of DanZero.
no code implementations • Findings (EMNLP) 2021 • Yuechen Wang, Wengang Zhou, Houqiang Li
In this work, we propose a novel candidate-free framework: Fine-grained Semantic Alignment Network (FSAN), for weakly supervised TLG.
1 code implementation • 15 Oct 2022 • Yonghui Wang, Wengang Zhou, Zhenbo Lu, Houqiang Li
To this end, we propose UDoc-GAN, the first framework to address the problem of document illumination correction under the unpaired setting.
1 code implementation • 15 Oct 2022 • Hao Feng, Wengang Zhou, Jiajun Deng, Yuechen Wang, Houqiang Li
In document image rectification, there exist rich geometric constraints between the distorted image and the ground truth one.
1 code implementation • 26 Aug 2022 • Yunyao Mao, Wengang Zhou, Zhenbo Lu, Jiajun Deng, Houqiang Li
In this work, we formulate the cross-modal interaction as a bidirectional knowledge distillation problem.
no code implementations • 23 Aug 2022 • Lin Liu, Junfeng An, Jianzhuang Liu, Shanxin Yuan, Xiangyu Chen, Wengang Zhou, Houqiang Li, Yanfeng Wang, Qi Tian
Low-light video enhancement (LLVE) is an important yet challenging task with many applications such as photographing and autonomous driving.
1 code implementation • 14 Jul 2022 • Jinhua Zhu, Yingce Xia, Lijun Wu, Shufang Xie, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu
The model is pre-trained on three tasks: reconstruction of masked atoms and coordinates, 3D conformation generation conditioned on 2D graph, and 2D graph generation conditioned on 3D conformation.
1 code implementation • 30 Jun 2022 • Weilun Wang, Jianmin Bao, Wengang Zhou, Dongdong Chen, Dong Chen, Lu Yuan, Houqiang Li
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks compared with Generative Adversarial Nets (GANs).
1 code implementation • 14 Jun 2022 • Jiajun Deng, Zhengyuan Yang, Daqing Liu, Tianlang Chen, Wengang Zhou, Yanyong Zhang, Houqiang Li, Wanli Ouyang
For another, we devise Language Conditioned Vision Transformer that removes external fusion modules and reuses the uni-modal ViT for vision-language fusion at the intermediate layers.
1 code implementation • 8 Jun 2022 • Minrui Wang, Mingxiao Feng, Wengang Zhou, Houqiang Li
Utilizing MARL algorithms to coordinate multiple control units in the grid, which is able to handle rapid changes of power systems, has recently been widely studied for the active voltage control task.
Multi-agent Reinforcement Learning • reinforcement-learning
1 code implementation • 8 May 2022 • Qing Li, Wengang Zhou, Zhenbo Lu, Houqiang Li
Actor-critic Reinforcement Learning (RL) algorithms have achieved impressive performance in continuous control tasks.
no code implementations • 7 May 2022 • Zheng Chen, Jian Zhao, Mingyu Yang, Wengang Zhou, Houqiang Li
In this work, we are dedicated to multi-target active object tracking (AOT), where there are multiple targets as well as multiple cameras in the environment.
no code implementations • 5 May 2022 • Mingyu Yang, Jian Zhao, Xunhan Hu, Wengang Zhou, Jiangcheng Zhu, Houqiang Li
In this way, agents dealing with the same subtask share their learning of specific abilities and different subtasks correspond to different specific abilities.
Multi-agent Reinforcement Learning • reinforcement-learning
no code implementations • CVPR 2022 • Xinyue Huo, Lingxi Xie, Hengtong Hu, Wengang Zhou, Houqiang Li, Qi Tian
Unsupervised domain adaptation (UDA) is an important topic in the computer vision community.
1 code implementation • 6 Apr 2022 • Youpeng Zhao, Jian Zhao, Xunhan Hu, Wengang Zhou, Houqiang Li
Recent years have witnessed the great breakthrough of deep reinforcement learning (DRL) in various perfect and imperfect information games.
no code implementations • 21 Mar 2022 • Xiaodong Cun, Zhendong Wang, Chi-Man Pun, Jianzhuang Liu, Wengang Zhou, Xu Jia, Houqiang Li
Color constancy aims to restore the constant colors of a scene under different illuminants.
1 code implementation • 16 Mar 2022 • Jian Zhao, Xunhan Hu, Mingyu Yang, Wengang Zhou, Jiangcheng Zhu, Houqiang Li
In this way, CTDS balances the full utilization of global observation during training and the feasibility of decentralized execution for online inference.
Multi-agent Reinforcement Learning • reinforcement-learning
1 code implementation • 16 Mar 2022 • Jian Zhao, Youpeng Zhao, Weixun Wang, Mingyu Yang, Xunhan Hu, Wengang Zhou, Jianye Hao, Houqiang Li
To the best of our knowledge, this work is the first to study the unexpected crashes in the multi-agent system.
Multi-agent Reinforcement Learning • reinforcement-learning
no code implementations • 11 Mar 2022 • Lin Liu, Lingxi Xie, Xiaopeng Zhang, Shanxin Yuan, Xiangyu Chen, Wengang Zhou, Houqiang Li, Qi Tian
In this paper, we propose a novel approach that embeds a task-agnostic prior into a transformer.
no code implementations • 10 Mar 2022 • Longhui Wei, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian
Recently, masked image modeling (MIM) has become a promising direction for visual pre-training.
no code implementations • 22 Feb 2022 • Zeyu Fang, Jian Zhao, Mingyu Yang, Wengang Zhou, Zhenbo Lu, Houqiang Li
In our approach, we regard each camera as an agent and address AMOT with a multi-agent reinforcement learning solution.
1 code implementation • 21 Feb 2022 • Jian Zhao, Mingyu Yang, Youpeng Zhao, Xunhan Hu, Wengang Zhou, Jiangcheng Zhu, Houqiang Li
Specifically, we model both the individual Q-values and the global Q-value with categorical distributions.
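As a rough illustration of what modeling a Q-value with a categorical distribution can look like, the sketch below reads a scalar Q-value off a categorical distribution defined over evenly spaced support points ("atoms"). The function name `expected_q`, the evenly spaced support, and the `v_min`/`v_max` bounds are assumptions made for this example (a common parameterization in distributional reinforcement learning); the listing does not state the parameterization used in the paper.

```python
def expected_q(probs, v_min, v_max):
    """Expected value of a categorical return distribution.

    probs: probabilities over the atoms (assumed to sum to 1).
    v_min, v_max: hypothetical bounds of the evenly spaced support.
    """
    n_atoms = len(probs)
    if n_atoms < 2:
        raise ValueError("need at least two atoms")
    step = (v_max - v_min) / (n_atoms - 1)
    # Atom k sits at v_min + k * step, so the last atom equals v_max.
    atoms = [v_min + k * step for k in range(n_atoms)]
    # The scalar Q-value is the probability-weighted mean of the atoms.
    return sum(p * z for p, z in zip(probs, atoms))
```

For instance, a symmetric distribution over the atoms -1, 0 and 1 has an expected Q-value of zero.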
no code implementations • 9 Feb 2022 • Jian Zhao, Yue Zhang, Xunhan Hu, Weixun Wang, Wengang Zhou, Jianye Hao, Jiangcheng Zhu, Houqiang Li
In cooperative multi-agent systems, agents jointly take actions and receive a team reward instead of individual rewards.
1 code implementation • 3 Feb 2022 • Jinhua Zhu, Yingce Xia, Chang Liu, Lijun Wu, Shufang Xie, Yusong Wang, Tong Wang, Tao Qin, Wengang Zhou, Houqiang Li, Haiguang Liu, Tie-Yan Liu
Molecular conformation generation aims to generate three-dimensional coordinates of all the atoms in a molecule and is an important task in bioinformatics and pharmacology.
no code implementations • CVPR 2022 • Hui Wu, Min Wang, Wengang Zhou, Houqiang Li, Qi Tian
To this end, we propose a flexible contextual similarity distillation framework to enhance the small query model and keep its output feature compatible with that of the large gallery model, which is crucial for asymmetric retrieval.
1 code implementation • 12 Dec 2021 • Hui Wu, Min Wang, Wengang Zhou, Yang Hu, Houqiang Li
Next, a refinement block is introduced to enhance the visual tokens with self-attention and cross-attention.
no code implementations • 29 Oct 2021 • Yiheng Liu, Wengang Zhou, Qiaokang Xie, Houqiang Li
To this end, we propose to explore unsupervised person re-identification with both visual data and wireless positioning trajectories under weak scene labeling, in which we only need to know the locations of the cameras.
3 code implementations • 28 Oct 2021 • Hao Feng, Wengang Zhou, Jiajun Deng, Qi Tian, Houqiang Li
The iterative refinements make DocScanner converge to a robust and superior rectification performance, while the lightweight recurrent architecture ensures the running efficiency.
1 code implementation • NeurIPS 2021 • Jianbo Ouyang, Hui Wu, Min Wang, Wengang Zhou, Houqiang Li
Since our re-ranking model is not directly involved with the visual feature used in the initial retrieval, it is ready to be applied to retrieval result lists obtained from various retrieval algorithms.
1 code implementation • 25 Oct 2021 • Hao Feng, Yuechen Wang, Wengang Zhou, Jiajun Deng, Houqiang Li
Specifically, DocTr consists of a geometric unwarping transformer and an illumination correction transformer.
no code implementations • ICCV 2021 • Hezhen Hu, Weichao Zhao, Wengang Zhou, Yuechen Wang, Houqiang Li
To validate the effectiveness of our method on SLR, we perform extensive experiments on four public benchmark datasets, i.e., NMFs-CSL, SLR500, MSASL and WLASL.
Ranked #1 on Sign Language Recognition on WLASL100 (using extra training data)
no code implementations • 29 Sep 2021 • Mingxiao Feng, Guozi Liu, Li Zhao, Lei Song, Jiang Bian, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu
We consider the inventory management (IM) problem for a single store with a large number of SKUs (stock keeping units) in this paper, where we need to make replenishment decisions for each SKU to balance its supply and demand.
no code implementations • 25 Aug 2021 • Xiao Cui, Wengang Zhou, Yang Hu, Weilun Wang, Houqiang Li
The main idea is to disentangle the latent space of a pre-trained generation model and precisely control the face attributes of child images with clear semantics.
no code implementations • ICCV 2021 • Weilun Wang, Wengang Zhou, Jianmin Bao, Dong Chen, Houqiang Li
In this paper, we uncover that the negative examples play a critical role in the performance of contrastive learning for image translation.
1 code implementation • ICCV 2021 • Yunyao Mao, Ning Wang, Wengang Zhou, Houqiang Li
In this work, we propose to integrate transductive and inductive learning into a unified framework to exploit the complementarity between them for accurate and robust video object segmentation.
Semantic Segmentation • Semi-Supervised Video Object Segmentation
1 code implementation • 30 Jul 2021 • Jiajun Deng, Wengang Zhou, Yanyong Zhang, Houqiang Li
To this end, in this work, we regard point clouds as hollow-3D data and propose a new architecture, namely Hallucinated Hollow-3D R-CNN ($\text{H}^2$3D R-CNN), to address the problem of 3D object detection.
1 code implementation • 30 Jun 2021 • Yuechen Wang, Jiajun Deng, Wengang Zhou, Houqiang Li
To this end, we introduce a novel weakly supervised temporal adjacent network (WSTAN) for temporal language grounding.
no code implementations • CVPR 2021 • Xinyue Huo, Lingxi Xie, Jianzhong He, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian
Semi-supervised learning is a useful tool for image segmentation, mainly due to its ability to extract knowledge from unlabeled data to assist learning from labeled data.
no code implementations • CVPR 2021 • Hezhen Hu, Weilun Wang, Wengang Zhou, Weichao Zhao, Houqiang Li
Then, a transformation flow is calculated based on the correspondence of the source and target topology map.
no code implementations • 17 Jun 2021 • Jinhua Zhu, Yingce Xia, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu
After pre-training, we can use either the Transformer branch (this one is recommended according to empirical results), the GNN branch, or both for downstream tasks.
4 code implementations • CVPR 2022 • Zhendong Wang, Xiaodong Cun, Jianmin Bao, Wengang Zhou, Jianzhuang Liu, Houqiang Li
Powered by these two designs, Uformer enjoys a high capability for capturing both local and global dependencies for image restoration.
Ranked #1 on Deblurring on RSBlur
no code implementations • 1 Jun 2021 • Longhui Wei, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian
By simply pulling the different augmented views of each image together or other novel mechanisms, they can learn much unsupervised knowledge and significantly improve the transfer performance of pre-training models.
no code implementations • CVPR 2021 • Hao Zhou, Wengang Zhou, Weizhen Qi, Junfu Pu, Houqiang Li
Finally, the synthetic parallel data serves as a strong supplement for the end-to-end training of the encoder-decoder SLT framework.
Ranked #3 on Sign Language Translation on CSL-Daily
2 code implementations • ICCV 2021 • Jiajun Deng, Zhengyuan Yang, Tianlang Chen, Wengang Zhou, Houqiang Li
In this paper, we present a neat yet effective transformer-based framework for visual grounding, namely TransVG, to address the task of grounding a language query to the corresponding region on an image.
Ranked #12 on Referring Expression Comprehension on RefCOCO
1 code implementation • CVPR 2021 • Ning Wang, Wengang Zhou, Jie Wang, Houqiang Li
In video object tracking, there exist rich temporal contexts among successive frames, which have been largely overlooked in existing trackers.
Ranked #18 on Visual Object Tracking on LaSOT
1 code implementation • ICLR 2021 • Jinhua Zhu, Lijun Wu, Yingce Xia, Shufang Xie, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu
Based on this observation, in this work, we break the assumption of the fixed layer order in the Transformer and introduce instance-wise layer reordering into the model structure.
1 code implementation • ICCV 2021 • Hui Wu, Min Wang, Wengang Zhou, Houqiang Li
To this end, we propose a novel deep local feature learning architecture to simultaneously focus on multiple discriminative local patterns in an image.
4 code implementations • 31 Dec 2020 • Jiajun Deng, Shaoshuai Shi, Peiwei Li, Wengang Zhou, Yanyong Zhang, Houqiang Li
In this paper, we take a slightly different viewpoint -- we find that precise positioning of raw points is not essential for high performance 3D object detection and that the coarse voxel granularity can also offer sufficient detection accuracy.
Ranked #4 on 3D Object Detection on KITTI Cars Moderate val
1 code implementation • 9 Dec 2020 • Ning Wang, Wengang Zhou, Houqiang Li
It is worth mentioning that our method also surpasses the fully-supervised affinity representation (e.g., ResNet) and performs competitively against the recent fully-supervised algorithms designed for the specific tasks (e.g., VOT and VOS).
no code implementations • 27 Nov 2020 • Zhenxun Yuan, Xiao Song, Lei Bai, Wengang Zhou, Zhe Wang, Wanli Ouyang
As a special design of this transformer, the information encoded in the encoder is different from that in the decoder, i.e., the encoder encodes temporal-channel information of multiple frames while the decoder decodes the spatial-channel information for the current frame in a voxel-wise manner.
no code implementations • 19 Nov 2020 • Xinyue Huo, Lingxi Xie, Longhui Wei, Xiaopeng Zhang, Hao Li, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian
Contrastive learning has achieved great success in self-supervised visual representation learning, but existing approaches mostly ignored spatial information which is often crucial for visual representation.
no code implementations • 17 Nov 2020 • Longhui Wei, Lingxi Xie, Jianzhong He, Jianlong Chang, Xiaopeng Zhang, Wengang Zhou, Houqiang Li, Qi Tian
Recently, contrastive learning has largely advanced the progress of unsupervised visual representation learning.
1 code implementation • 15 Oct 2020 • Jinhua Zhu, Yingce Xia, Lijun Wu, Jiajun Deng, Wengang Zhou, Tao Qin, Houqiang Li
During inference, the CNN encoder and the policy network are used to take actions, and the Transformer module is discarded.
no code implementations • 11 Oct 2020 • Junfu Pu, Wengang Zhou, Hezhen Hu, Houqiang Li
Continuous sign language recognition (SLR) deals with unaligned video-text pairs and uses the word error rate (WER), i.e., edit distance, as the main evaluation metric.
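To make the metric concrete, here is a minimal sketch of WER computed as a word-level Levenshtein edit distance normalized by the reference length. The function name `word_error_rate` and the whitespace tokenization are assumptions for illustration; benchmark evaluation scripts may tokenize or normalize text differently.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, the value can exceed 1 when the hypothesis is much longer than the reference.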
no code implementations • 24 Aug 2020 • Hezhen Hu, Wengang Zhou, Junfu Pu, Houqiang Li
Sign language recognition (SLR) is a challenging problem, involving complex manual features, i.e., hand gestures, and fine-grained non-manual features (NMFs), i.e., facial expression, mouth shapes, etc.
1 code implementation • 10 Aug 2020 • Yiheng Liu, Wengang Zhou, Mao Xi, Sanjing Shen, Houqiang Li
Existing person re-identification methods rely on the visual sensor to capture the pedestrians.
1 code implementation • 22 Jul 2020 • Ning Wang, Wengang Zhou, Yibing Song, Chao Ma, Wei Liu, Houqiang Li
The advancement of visual tracking has continuously been brought by deep learning models.
no code implementations • 14 Jul 2020 • Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Ales Leonardis, Wengang Zhou, Qi Tian
When smartphone cameras are used to take photos of digital screens, moiré patterns usually result, severely degrading photo quality.
1 code implementation • 7 Jul 2020 • Jiajun Deng, Yingwei Pan, Ting Yao, Wengang Zhou, Houqiang Li, Tao Mei
Single shot detectors that are potentially faster and simpler than two-stage detectors tend to be more applicable to object detection in videos.
no code implementations • 18 Jun 2020 • Ning Wang, Wengang Zhou, Qi Tian, Houqiang Li
In the second stage, a discrete sampling based ridge regression is designed to double-check the remaining ambiguous hard samples, which serves as an alternative of fully-connected layers and benefits from the closed-form solver for efficient learning.
3 code implementations • ICLR 2020 • Jinhua Zhu, Yingce Xia, Lijun Wu, Di He, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu
While BERT is more commonly used for fine-tuning than as a contextual embedding in downstream language understanding tasks, in NMT, our preliminary exploration finds that using BERT as a contextual embedding works better than using it for fine-tuning.
no code implementations • 8 Feb 2020 • Hao Zhou, Wengang Zhou, Yun Zhou, Houqiang Li
Our STMC network consists of a spatial multi-cue (SMC) module and a temporal multi-cue (TMC) module.
Ranked #3 on Sign Language Recognition on RWTH-PHOENIX-Weather 2014 T (Word Error Rate (WER) metric)
1 code implementation • 25 Oct 2019 • Qiaokang Xie, Wengang Zhou, Guo-Jun Qi, Qi Tian, Houqiang Li
In our approach, we first collect tracklet data within each camera by automatic person detection and tracking.
no code implementations • 25 Oct 2019 • Yiheng Liu, Wengang Zhou, Jianzhuang Liu, Guo-Jun Qi, Qi Tian, Houqiang Li
By presenting a target attention loss, the pedestrian features extracted from the foreground branch become more insensitive to the backgrounds, which greatly reduces the negative impacts of changing backgrounds on matching the same person across different camera views.
2 code implementations • ICCV 2019 • Jiajun Deng, Yingwei Pan, Ting Yao, Wengang Zhou, Houqiang Li, Tao Mei
In this paper, we introduce a new design to capture the interactions across the objects in spatio-temporal context.
1 code implementation • 23 Jul 2019 • Ning Wang, Wengang Zhou, Yibing Song, Chao Ma, Houqiang Li
In the distillation process, we propose a fidelity loss to enable the student network to maintain the representation capability of the teacher network.
no code implementations • 28 May 2019 • Zhengguang Zhou, Wengang Zhou, Richang Hong, Houqiang Li
Pruning filters is an effective method for accelerating deep neural networks (DNNs), but most existing approaches prune filters on a pre-trained network directly, which limits the achievable acceleration.
no code implementations • 28 May 2019 • Zhengguang Zhou, Wengang Zhou, Xutao Lv, Xuan Huang, Xiaoyu Wang, Houqiang Li
Recent years have witnessed the great advance of deep learning in a variety of vision tasks.
1 code implementation • ACL 2019 • Jinhua Zhu, Fei Gao, Lijun Wu, Yingce Xia, Tao Qin, Wengang Zhou, Xue-Qi Cheng, Tie-Yan Liu
While data augmentation is an important trick to boost the accuracy of deep learning methods in computer vision tasks, its study in natural language tasks is still very limited.
1 code implementation • CVPR 2019 • Ning Wang, Yibing Song, Chao Ma, Wengang Zhou, Wei Liu, Houqiang Li
We propose an unsupervised visual tracking method in this paper.
no code implementations • 19 Feb 2019 • Chen Change Loy, Dahua Lin, Wanli Ouyang, Yuanjun Xiong, Shuo Yang, Qingqiu Huang, Dongzhan Zhou, Wei Xia, Quanquan Li, Ping Luo, Junjie Yan, Jian-Feng Wang, Zuoxin Li, Ye Yuan, Boxun Li, Shuai Shao, Gang Yu, Fangyun Wei, Xiang Ming, Dong Chen, Shifeng Zhang, Cheng Chi, Zhen Lei, Stan Z. Li, Hongkai Zhang, Bingpeng Ma, Hong Chang, Shiguang Shan, Xilin Chen, Wu Liu, Boyan Zhou, Huaxiong Li, Peng Cheng, Tao Mei, Artem Kukharenko, Artem Vasenin, Nikolay Sergievskiy, Hua Yang, Liangqi Li, Qiling Xu, Yuan Hong, Lin Chen, Mingjun Sun, Yirong Mao, Shiying Luo, Yongjun Li, Ruiping Wang, Qiaokang Xie, Ziyang Wu, Lei Lu, Yiheng Liu, Wengang Zhou
This paper presents a review of the 2018 WIDER Challenge on Face and Pedestrian.
1 code implementation • 26 Dec 2018 • Yiheng Liu, Zhenxun Yuan, Wengang Zhou, Houqiang Li
How to explore the abundant spatial-temporal information in video sequences is the key to solve this problem.
1 code implementation • ECCV 2018 • Yiding Liu, Siyu Yang, Bin Li, Wengang Zhou, Jizheng Xu, Houqiang Li, Yan Lu
We present an instance segmentation scheme based on pixel affinity information, which is the relationship of two pixels belonging to the same instance.
1 code implementation • CVPR 2018 • Ning Wang, Wengang Zhou, Qi Tian, Richang Hong, Meng Wang, Houqiang Li
By combining different types of features, our approach constructs multiple experts through Discriminative Correlation Filter (DCF) and each of them tracks the target independently.
no code implementations • 8 May 2018 • Yunfeng Wang, Wengang Zhou, Qilin Zhang, Xiaotian Zhu, Houqiang Li
Termed "Weighted Multi-Region Convolutional Neural Network" (WMR ConvNet), the proposed system is LSTM-free, and is based on 2D ConvNet that does not require the accumulation of video frames for 3D ConvNet filtering.
no code implementations • 8 May 2018 • Yunfeng Wang, Wengang Zhou, Qilin Zhang, Houqiang Li
Visual attributes in individual video frames, such as the presence of characteristic objects and scenes, offer substantial information for action recognition in videos.
no code implementations • 30 Jan 2018 • Jie Huang, Wengang Zhou, Qilin Zhang, Houqiang Li, Weiping Li
Worse still, isolated SLR methods typically require strenuous labeling of each word separately in a sentence, severely limiting the amount of attainable training data.
no code implementations • 19 Jun 2017 • Wengang Zhou, Houqiang Li, Qi Tian
The explosive increase and ubiquitous accessibility of visual data on the Web have led to the prosperity of research activity in image search or retrieval.
no code implementations • CVPR 2016 • Xiaopeng Zhang, Hongkai Xiong, Wengang Zhou, Weiyao Lin, Qi Tian
Recognizing fine-grained sub-categories such as birds and dogs is extremely challenging due to the highly localized and subtle differences in some specific parts.
no code implementations • CVPR 2015 • Peng Zhang, Wengang Zhou, Lei Wu, Houqiang Li
We propose to extract two types of features, one to measure the semantic obviousness of the image and the other to discover local characteristic.
Image Quality Estimation • No-Reference Image Quality Assessment
no code implementations • CVPR 2014 • Liang Zheng, Shengjin Wang, Wengang Zhou, Qi Tian
Albeit simple, Bayes merging can be well applied in various merging tasks, and consistently improves the baselines on multi-vocabulary merging.