no code implementations • RDSM (COLING) 2020 • Yu Qiao, Daniel Wiechmann, Elma Kerz
We demonstrate that our approach is promising, achieving results on these two datasets similar to those of the best-performing black-box models reported in the literature.
no code implementations • EACL (WASSA) 2021 • Elma Kerz, Yu Qiao, Daniel Wiechmann
The aim of the paper is twofold: (1) to automatically predict the ratings assigned by viewers to 14 categories available for TED talks in a multi-label classification task and (2) to determine what types of features drive classification accuracy for each of the categories.
1 code implementation • ECCV 2020 • Xiao Zhang, Rui Zhao, Yu Qiao, Hongsheng Li
To address this problem, this paper introduces a novel Radial Basis Function (RBF) distance to replace the commonly used inner products in the softmax loss function, so that losses can be adaptively assigned to regularize the intra-class and inter-class distances by reshaping their relative differences, thus creating more representative class prototypes to improve optimization.
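The core substitution described above can be sketched in a few lines; this is a minimal NumPy illustration, not the paper's exact formulation — the function names, the single scaling hyperparameter `gamma`, and the toy sizes are assumptions for illustration:

```python
import numpy as np

def rbf_logits(features, prototypes, gamma=0.1):
    """Replace inner-product logits with RBF-style distance logits.

    features:   (N, D) batch of embeddings
    prototypes: (C, D) one learnable prototype per class
    Returns (N, C) logits that grow as a feature nears a prototype.
    """
    # Squared Euclidean distance between every feature and every prototype.
    d2 = ((features[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
    # Negative scaled distance plays the role the inner product played.
    return -gamma * d2

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# A feature sitting exactly on prototype 0 should be assigned to class 0.
protos = np.eye(3)                      # 3 classes in 3-D (toy setup)
feats = np.array([[1.0, 0.0, 0.0]])
probs = softmax(rbf_logits(feats, protos, gamma=5.0))
```

Unlike inner products, these logits are bounded above by zero, which is what lets the loss reshape intra-class versus inter-class distances directly.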
1 code implementation • ECCV 2020 • Mingfei Han, Yali Wang, Xiaojun Chang, Yu Qiao
Recent studies have shown that aggregating context information from proposals in different frames can clearly enhance the performance of video object detection.
Ranked #1 on Video Object Detection on ImageNet VID
no code implementations • EACL (BEA) 2021 • Elma Kerz, Daniel Wiechmann, Yu Qiao, Emma Tseng, Marcus Ströbel
The key to the present paper is the combined use of what we refer to as ‘complexity contours’, a series of measurements of indices of L2 proficiency obtained by a computational tool that implements a sliding window technique, and recurrent neural network (RNN) classifiers that adequately capture the sequential information in those contours.
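The sliding-window idea behind the 'complexity contours' can be sketched as follows; this is a hedged toy illustration, not the authors' tool — the window size and the mean-word-length measure are placeholder assumptions (the actual indices of L2 proficiency are more sophisticated):

```python
def complexity_contour(tokens, window=5, measure=None):
    """Compute a 'complexity contour': a complexity index evaluated over
    a window that slides one token at a time across the text, yielding a
    sequence an RNN classifier can consume."""
    if measure is None:
        # Placeholder index: mean word length within the window.
        measure = lambda win: sum(len(t) for t in win) / len(win)
    return [measure(tokens[i:i + window])
            for i in range(len(tokens) - window + 1)]

tokens = "the model captures sequential information in complexity contours".split()
contour = complexity_contour(tokens, window=3)
```

The output is a per-position series rather than one document-level score, which is exactly the sequential information the RNN classifiers are meant to capture.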
1 code implementation • EMNLP (FEVER) 2021 • Justus Mattern, Yu Qiao, Elma Kerz, Daniel Wiechmann, Markus Strohmaier
As the world continues to fight the COVID-19 pandemic, it is simultaneously fighting an ‘infodemic’ – a flood of disinformation and spread of conspiracy theories leading to health threats and the division of society.
1 code implementation • 17 May 2022 • Zhe Chen, Yuchen Duan, Wenhai Wang, Junjun He, Tong Lu, Jifeng Dai, Yu Qiao
When fine-tuning on downstream tasks, a modality-specific adapter is used to introduce the data and tasks' prior information into the model, making it suitable for these tasks.
Ranked #1 on Semantic Segmentation on ADE20K val
no code implementations • 14 May 2022 • Yihao Liu, Hengyuan Zhao, Jinjin Gu, Yu Qiao, Chao Dong
However, research on the generalization ability of Super-Resolution (SR) networks is currently absent.
no code implementations • 12 May 2022 • Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Jinjin Gu, Yu Qiao, Chao Dong
One is the use of blueprint separable convolution (BSConv), which takes the place of the redundant convolution operation.
2 code implementations • 11 May 2022 • Yawei Li, Kai Zhang, Radu Timofte, Luc van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, Peiran Ren, Xuansong Xie, Xian-Sheng Hua, Yanbo Wang, Xiaozhong Ji, Chuming Lin, Donghao Luo, Ying Tai, Chengjie Wang, Zhizhong Zhang, Yuan Xie, Shen Cheng, Ziwei Luo, Lei Yu, Zhihong Wen, Qi Wu, Youwei Li, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Yuanfei Huang, Meiguang Jin, Hua Huang, Jing Liu, Xinjian Zhang, Yan Wang, Lingshun Long, Gen Li, Yuanfan Zhang, Zuowei Cao, Lei Sun, Panaetov Alexander, Yucong Wang, Minjie Cai, Li Wang, Lu Tian, Zheyuan Wang, Hongbing Ma, Jie Liu, Chao Chen, Yidong Cai, Jie Tang, Gangshan Wu, Weiran Wang, Shirui Huang, Honglei Lu, Huan Liu, Keyan Wang, Jun Chen, Shi Chen, Yuchun Miao, Zimo Huang, Lefei Zhang, Mustafa Ayazoğlu, Wei Xiong, Chengyi Xiong, Fei Wang, Hao Li, Ruimian Wen, Zhijing Yang, Wenbin Zou, Weixin Zheng, Tian Ye, Yuncheng Zhang, Xiangzhen Kong, Aditya Arora, Syed Waqas Zamir, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Dandan Gao, Dengwen Zhou, Qian Ning, Jingzhu Tang, Han Huang, YuFei Wang, Zhangheng Peng, Haobo Li, Wenxue Guan, Shenghua Gong, Xin Li, Jun Liu, Wanjun Wang, Dengwen Zhou, Kun Zeng, Hanjiang Lin, Xinyu Chen, Jinsheng Fang
The aim was to design a network for single image super-resolution that achieves improved efficiency, measured by several metrics including runtime, parameters, FLOPs, activations, and memory consumption, while at least maintaining a PSNR of 29.00 dB on the DIV2K validation set.
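The PSNR threshold entrants had to maintain is a standard fidelity metric; a minimal sketch of how it is computed (the helper name and the toy 4x4 image are illustrative, and challenge evaluation details such as crops or color space may differ):

```python
import numpy as np

def psnr(pred, ref, max_val=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((pred.astype(np.float64) - ref.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.zeros((4, 4), dtype=np.uint8)
pred = ref.copy()
pred[0, 0] = 16          # one pixel off by 16 -> MSE = 256 / 16 = 16
```

Because PSNR is logarithmic in the mean squared error, holding it at 29.00 dB while shrinking the network pins reconstruction quality while efficiency is optimized.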
1 code implementation • 8 May 2022 • Peng Gao, Teli Ma, Hongsheng Li, Ziyi Lin, Jifeng Dai, Yu Qiao
Masked auto-encoding for feature pretraining and multi-scale hybrid convolution-transformer architectures can further unleash the potential of ViT, leading to state-of-the-art performance on image classification, detection, and semantic segmentation.
1 code implementation • 3 May 2022 • Mengzhe He, Yali Wang, Jiaxi Wu, Yiru Wang, Hanqing Li, Bo Li, Weihao Gan, Wei Wu, Yu Qiao
It can adaptively enhance the source detector to perceive objects in a target image by leveraging target proposal contexts from iterative cross-attention.
no code implementations • WASSA (ACL) 2022 • Elma Kerz, Yu Qiao, Sourabh Zanwar, Daniel Wiechmann
Research at the intersection of personality psychology, computer science, and linguistics has recently focused increasingly on modeling and predicting personality from language use.
no code implementations • 5 Apr 2022 • Mingfei Han, David Junhao Zhang, Yali Wang, Rui Yan, Lina Yao, Xiaojun Chang, Yu Qiao
Learning spatial-temporal relation among multiple actors is crucial for group activity recognition.
1 code implementation • 3 Apr 2022 • Kexue Fu, Peng Gao, Shaolei Liu, Renrui Zhang, Yu Qiao, Manning Wang
We propose to use a dynamically updated momentum encoder as the tokenizer, which outputs a dynamic supervision signal that evolves along with the training process.
1 code implementation • 24 Mar 2022 • Renrui Zhang, Han Qiu, Tai Wang, Xuanzhuo Xu, Ziyu Guo, Yu Qiao, Peng Gao, Hongsheng Li
In this paper, we introduce a simple framework for Monocular DEtection with depth-aware TRansformer, named MonoDETR.
2 code implementations • 21 Mar 2022 • Li Chen, Chonghao Sima, Yang Li, Zehan Zheng, Jiajie Xu, Xiangwei Geng, Hongyang Li, Conghui He, Jianping Shi, Yu Qiao, Junchi Yan
Methods for 3D lane detection have been recently proposed to address the issue of inaccurate lane layouts in many autonomous driving scenarios (uphill/downhill, bump, etc.).
Ranked #1 on 3D Lane Detection on OpenLane
no code implementations • 16 Mar 2022 • Yinan He, Gengshi Huang, Siyu Chen, Jianing Teng, Wang Kun, Zhenfei Yin, Lu Sheng, Ziwei Liu, Yu Qiao, Jing Shao
2) Squeeze Stage: X-Learner condenses the model to a reasonable size and learns a universal and generalizable representation for transferring to various tasks.
no code implementations • ACL 2022 • Daniel Wiechmann, Yu Qiao, Elma Kerz, Justus Mattern
There is a growing interest in the combined use of NLP and machine learning methods to predict gaze patterns during naturalistic reading.
1 code implementation • 15 Mar 2022 • Yuanhan Zhang, Qinghong Sun, Yichun Zhou, Zexin He, Zhenfei Yin, Kun Wang, Lu Sheng, Yu Qiao, Jing Shao, Ziwei Liu
Specifically, we contribute Bamboo Dataset, a mega-scale and information-dense dataset for both classification and detection.
no code implementations • 9 Feb 2022 • Kexue Fu, Peng Gao, Renrui Zhang, Hongsheng Li, Yu Qiao, Manning Wang
In particular, we develop a variant of ViT for 3D point cloud feature extraction, which also achieves results comparable to existing backbones when combined with our framework. Visualization of the attention maps shows that our model does understand the point cloud by combining global shape information with multiple pieces of local structural information, which is consistent with the inspiration behind our representation learning method.
3 code implementations • 24 Jan 2022 • Kunchang Li, Yali Wang, Junhao Zhang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao
Different from typical transformer blocks, the relation aggregators in our UniFormer block are equipped with local and global token affinity in shallow and deep layers respectively, allowing it to tackle both redundancy and dependency for efficient and effective representation learning.
Ranked #72 on Image Classification on ImageNet
no code implementations • 20 Jan 2022 • Mingye Xu, Yali Wang, Zhipeng Zhou, Hongbin Xu, Yu Qiao
To fill this gap, we propose a generic Contour-Perturbed Reconstruction Network (CP-Net), which can effectively guide self-supervised reconstruction to learn semantic content in the point cloud, and thus promote discriminative power of point cloud representation.
1 code implementation • 12 Jan 2022 • Kunchang Li, Yali Wang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao
For Something-Something V1 and V2, our UniFormer achieves new state-of-the-art performances of 60.9% and 71.2% top-1 accuracy respectively.
2 code implementations • 22 Dec 2021 • Liang Pan, Tong Wu, Zhongang Cai, Ziwei Liu, Xumin Yu, Yongming Rao, Jiwen Lu, Jie Zhou, Mingye Xu, Xiaoyuan Luo, Kexue Fu, Peng Gao, Manning Wang, Yali Wang, Yu Qiao, Junsheng Zhou, Xin Wen, Peng Xiang, Yu-Shen Liu, Zhizhong Han, Yuanjie Yan, Junyi An, Lifa Zhu, Changwei Lin, Dongrui Liu, Xin Li, Francisco Gómez-Fernández, Qinlong Wang, Yang Yang
Based on the MVP dataset, this paper reports methods and results in the Multi-View Partial Point Cloud Challenge 2021 on Completion and Registration.
no code implementations • 22 Dec 2021 • Xiangtao Kong, Xina Liu, Jinjin Gu, Yu Qiao, Chao Dong
Dropout is designed to relieve the overfitting problem in high-level vision tasks but is rarely applied in low-level vision tasks, like image super-resolution (SR).
no code implementations • 11 Dec 2021 • Yu Qiao, Jincheng Zhu, Chengjiang Long, Zeyao Zhang, Yuxin Wang, Zhenjun Du, Xin Yang
Acquiring the most representative examples via active learning (AL) can benefit many data-dependent computer vision tasks by minimizing efforts of image-level or pixel-wise annotations.
2 code implementations • 4 Dec 2021 • Renrui Zhang, Ziyu Guo, Wei Zhang, Kunchang Li, Xupeng Miao, Bin Cui, Yu Qiao, Peng Gao, Hongsheng Li
On top of that, we design an inter-view adapter to better extract the global feature and adaptively fuse the few-shot knowledge learned from 3D into CLIP pre-trained in 2D.
1 code implementation • 29 Nov 2021 • Teli Ma, Shijie Geng, Mengmeng Wang, Jing Shao, Jiasen Lu, Hongsheng Li, Peng Gao, Yu Qiao
Recent advances in large-scale contrastive visual-language pretraining shed light on a new pathway for visual recognition.
Ranked #2 on Long-tail Learning on ImageNet-LT
1 code implementation • 26 Nov 2021 • Changyao Tian, Wenhai Wang, Xizhou Zhu, Xiaogang Wang, Jifeng Dai, Yu Qiao
Deep learning-based models encounter challenges when processing long-tailed data in the real world.
Ranked #1 on Long-tail Learning on ImageNet-LT (using extra training data)
no code implementations • 24 Nov 2021 • Zhuofan Zong, Kunchang Li, Guanglu Song, Yali Wang, Yu Qiao, Biao Leng, Yu Liu
Vision transformers (ViTs) have become popular architectures and have outperformed convolutional neural networks (CNNs) on various vision tasks.
1 code implementation • 24 Nov 2021 • David Junhao Zhang, Kunchang Li, Yunpeng Chen, Yali Wang, Shashwat Chandra, Yu Qiao, Luoqi Liu, Mike Zheng Shou
Self-attention has become an integral component of the recent network architectures, e.g., Transformer, that dominate major image and video benchmarks.
Ranked #11 on Action Recognition on Something-Something V2 (using extra training data)
no code implementations • 16 Nov 2021 • Jing Shao, Siyu Chen, Yangguang Li, Kun Wang, Zhenfei Yin, Yinan He, Jianing Teng, Qinghong Sun, Mengya Gao, Jihao Liu, Gengshi Huang, Guanglu Song, Yichao Wu, Yuming Huang, Fenggang Liu, Huan Peng, Shuo Qin, Chengyu Wang, Yujie Wang, Conghui He, Ding Liang, Yu Liu, Fengwei Yu, Junjie Yan, Dahua Lin, Xiaogang Wang, Yu Qiao
Enormous waves of technological innovations over the past several years, marked by the advances in AI technologies, are profoundly reshaping the industry and the society.
no code implementations • 13 Nov 2021 • Yu Qiao, Sourabh Zanwar, Rishab Bhattacharyya, Daniel Wiechmann, Wei Zhou, Elma Kerz, Ralf Schlüter
Key communicative competencies include the ability to maintain fluency in monologic speech and the ability to produce sophisticated language to argue a position convincingly.
1 code implementation • 6 Nov 2021 • Renrui Zhang, Rongyao Fang, Wei Zhang, Peng Gao, Kunchang Li, Jifeng Dai, Yu Qiao, Hongsheng Li
To further enhance CLIP's few-shot capability, CLIP-Adapter proposes to fine-tune a lightweight residual feature adapter, which significantly improves performance for few-shot classification.
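The residual-adapter idea can be sketched as a small bottleneck MLP blended back into the frozen feature; this is a hedged NumPy sketch, with the blending ratio `alpha`, the weight names, and the toy dimensions as illustrative assumptions rather than CLIP-Adapter's exact configuration:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def adapter(f, W_down, W_up, alpha=0.2):
    """Residual feature adapter: pass the frozen feature through a small
    bottleneck MLP, then blend the result with the original feature.
    alpha = 0 leaves the pretrained feature untouched."""
    transformed = relu(f @ W_down) @ W_up          # bottleneck MLP
    return alpha * transformed + (1.0 - alpha) * f  # residual blend

rng = np.random.default_rng(0)
f = rng.normal(size=(1, 8))               # a frozen feature (toy size)
W_down = rng.normal(size=(8, 2)) * 0.1    # down-projection (learnable)
W_up = rng.normal(size=(2, 8)) * 0.1      # up-projection (learnable)
adapted = adapter(f, W_down, W_up, alpha=0.2)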
no code implementations • 11 Oct 2021 • Yu Qiao, Sikai Chen, Majed Alinizzi, Miltos Alamaniotis, Samuel Labi
However, it is costly to measure IRI, and for this reason, certain road classes are excluded from IRI measurements at a network level.
1 code implementation • 9 Oct 2021 • Peng Gao, Shijie Geng, Renrui Zhang, Teli Ma, Rongyao Fang, Yongfeng Zhang, Hongsheng Li, Yu Qiao
Large-scale contrastive vision-language pre-training has shown significant progress in visual representation learning.
no code implementations • 9 Oct 2021 • Yihao Liu, Hengyuan Zhao, Kelvin C. K. Chan, Xintao Wang, Chen Change Loy, Yu Qiao, Chao Dong
We address this problem from a new perspective, by jointly considering colorization and temporal consistency in a unified framework.
2 code implementations • ICLR 2022 • Kunchang Li, Yali Wang, Gao Peng, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao
For Something-Something V1 and V2, our UniFormer achieves new state-of-the-art performances of 60.8% and 71.4% top-1 accuracy respectively.
Ranked #1 on Action Recognition on Something-Something V1
no code implementations • 29 Sep 2021 • Zhuofan Zong, Kunchang Li, Guanglu Song, Yali Wang, Yu Qiao, Biao Leng, Yu Liu
Specifically, we first design a novel Token Slimming Module (TSM), which can boost the inference efficiency of ViTs by dynamic token aggregation.
no code implementations • 26 Sep 2021 • Zijie Chen, Cheng Li, Junjun He, Jin Ye, Diping Song, Shanshan Wang, Lixu Gu, Yu Qiao
An essential step of RT planning is the accurate segmentation of various organs-at-risks (OARs) in HaN CT images.
no code implementations • 26 Sep 2021 • Junjun He, Jin Ye, Cheng Li, Diping Song, Wanli Chen, Shanshan Wang, Lixu Gu, Yu Qiao
Recent studies have witnessed the effectiveness of 3D convolutions on segmenting volumetric medical images.
no code implementations • 15 Sep 2021 • Junhao Zhang, Yali Wang, Zhipeng Zhou, Tianyu Luan, Zhe Wang, Yu Qiao
Graph Convolution Network (GCN) has been successfully used for 3D human pose estimation in videos.
1 code implementation • ICCV 2021 • Hongbin Xu, Zhipeng Zhou, Yali Wang, Wenxiong Kang, Baigui Sun, Hao Li, Yu Qiao
Specifically, the limitations can be categorized into two types: ambiguous supervision in the foreground and invalid supervision in the background.
1 code implementation • ICCV 2021 • Xiangyu Chen, Zhengwen Zhang, Jimmy S. Ren, Lynhoo Tian, Yu Qiao, Chao Dong
However, most available resources are still in standard dynamic range (SDR).
no code implementations • 1 Aug 2021 • Yihao Liu, Anran Liu, Jinjin Gu, Zhipeng Zhang, Wenhao Wu, Yu Qiao, Chao Dong
Super-resolution (SR) is a fundamental and representative task in the low-level vision area.
no code implementations • 27 Jul 2021 • Haisheng Su, Peiqin Zhuang, Yukun Li, Dongliang Wang, Weihao Gan, Wei Wu, Yu Qiao
This technical report presents an overview of our solution used in the submission to 2021 HACS Temporal Action Localization Challenge on both Supervised Learning Track and Weakly-Supervised Learning Track.
Transfer Learning • Weakly-Supervised Temporal Action Localization
no code implementations • 20 Jul 2021 • Wenlong Zhang, Yihao Liu, Chao Dong, Yu Qiao
To address the problem, we propose Super-Resolution Generative Adversarial Networks with Ranker (RankSRGAN) to optimize the generator in the direction of different perceptual metrics.
no code implementations • 7 Jul 2021 • Anran Liu, Yihao Liu, Jinjin Gu, Yu Qiao, Chao Dong
This paper serves as a systematic review on recent progress in blind image SR, and proposes a taxonomy to categorize existing methods into three different classes according to their ways of degradation modelling and the data used for solving the SR model.
2 code implementations • 5 Jul 2021 • Kaiyang Zhou, Yongxin Yang, Yu Qiao, Tao Xiang
In this work, we address domain generalization with MixStyle, a plug-and-play, parameter-free module that is simply inserted into shallow CNN layers and requires no modification to training objectives.
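The statistic-mixing operation MixStyle performs can be sketched as follows; a minimal NumPy version under the assumption (stated in the companion ICLR entry below in this listing) that per-instance channel mean and standard deviation act as a proxy for style — the permutation and mixing-weight handling here are simplified:

```python
import numpy as np

def mixstyle(x, perm, lam, eps=1e-6):
    """MixStyle sketch: mix per-instance channel statistics between
    pairs of samples, synthesizing 'new styles' without new parameters.
    x: (N, C, H, W) feature maps; perm: pairing permutation; lam in [0, 1]."""
    mu = x.mean(axis=(2, 3), keepdims=True)
    sig = x.std(axis=(2, 3), keepdims=True) + eps
    x_norm = (x - mu) / sig                      # strip instance style
    mu_mix = lam * mu + (1 - lam) * mu[perm]     # blend means
    sig_mix = lam * sig + (1 - lam) * sig[perm]  # blend std devs
    return x_norm * sig_mix + mu_mix             # re-style with mixed stats

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 3, 4, 4))
out = mixstyle(x, perm=np.array([1, 0]), lam=1.0)  # lam=1 keeps own style
```

Since the module only recombines statistics already present in the batch, it adds no parameters and leaves the training objective untouched, matching the plug-and-play claim.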
no code implementations • 28 Jun 2021 • Yuhao Liu, Jiake Xie, Yu Qiao, Yong Tang, Xin Yang
Image matting is an ill-posed problem that aims to estimate the opacity of foreground pixels in an image.
no code implementations • 16 Jun 2021 • Yu Qiao, Xuefeng Yin, Daniel Wiechmann, Elma Kerz
In this paper, we combined linguistic complexity and (dis)fluency features with pretrained language models for the task of Alzheimer's disease detection of the 2021 ADReSSo (Alzheimer's Dementia Recognition through Spontaneous Speech) challenge.
no code implementations • CVPR 2021 • Xiao Zhang, Yixiao Ge, Yu Qiao, Hongsheng Li
Unsupervised object re-identification aims to learn discriminative representations for object retrieval without any annotations.
no code implementations • 4 Jun 2021 • Peng Gao, Shijie Geng, Yu Qiao, Xiaogang Wang, Jifeng Dai, Hongsheng Li
In this paper, we propose novel Scalable Transformers, which naturally contain sub-Transformers of different scales and have shared parameters.
1 code implementation • ICLR 2021 • Kunchang Li, Xianhang Li, Yali Wang, Jun Wang, Yu Qiao
It can learn to exploit spatial, temporal and channel attention in a high-dimensional manner, to improve the cooperative power of all the feature dimensions in our CT-Module.
Ranked #6 on Action Recognition on Something-Something V1
no code implementations • 2 Jun 2021 • Haisheng Su, Jinyuan Feng, Dongliang Wang, Weihao Gan, Wei Wu, Yu Qiao
Specifically, SME aims to highlight the motion-sensitive area through local-global motion modeling, where saliency alignment and pyramidal feature differencing are conducted successively between neighboring frames to capture motion dynamics with less noise caused by misaligned backgrounds.
1 code implementation • 27 May 2021 • Xiangyu Chen, Yihao Liu, Zhengwen Zhang, Yu Qiao, Chao Dong
In this work, we propose a novel learning-based approach using a spatially dynamic encoder-decoder network, HDRUNet, to learn an end-to-end mapping for single image HDR reconstruction with denoising and dequantization.
no code implementations • 26 May 2021 • Shijie Yu, Feng Zhu, Dapeng Chen, Rui Zhao, Haobin Chen, Shixiang Tang, Jinguo Zhu, Yu Qiao
In UDCL, a universal expert supervises the learning of domain experts and continuously gathers knowledge from all domain experts.
no code implementations • 24 May 2021 • Yi Liu, LiMin Wang, Xiao Ma, Yali Wang, Yu Qiao
Second, the coarse action classes often lead to the ambiguous annotations of temporal boundaries, which are inappropriate for temporal action localization.
no code implementations • 16 May 2021 • Shijie Yu, Dapeng Chen, Rui Zhao, Haobin Chen, Yu Qiao
Person images captured by surveillance cameras are often occluded by various obstacles, which lead to defective feature representation and harm person re-identification (Re-ID) performance.
no code implementations • 7 May 2021 • Jinjin Gu, Haoming Cai, Chao Dong, Jimmy S. Ren, Yu Qiao, Shuhang Gu, Radu Timofte, Manri Cheon, SungJun Yoon, Byungyeon Kang, Junwoo Lee, Qing Zhang, Haiyang Guo, Yi Bin, Yuqing Hou, Hengliang Luo, Jingyu Guo, ZiRui Wang, Hai Wang, Wenming Yang, Qingyan Bai, Shuwei Shi, Weihao Xia, Mingdeng Cao, Jiahao Wang, Yifan Chen, Yujiu Yang, Yang Li, Tao Zhang, Longtao Feng, Yiting Liao, Junlin Li, William Thong, Jose Costa Pereira, Ales Leonardis, Steven McDonagh, Kele Xu, Lehan Yang, Hengxing Cai, Pengfei Sun, Seyed Mehdi Ayyoubzadeh, Ali Royat, Sid Ahmed Fezza, Dounia Hammou, Wassim Hamidouche, Sewoong Ahn, Gwangjin Yoon, Koki Tsubota, Hiroaki Akutsu, Kiyoharu Aizawa
This paper reports on the NTIRE 2021 challenge on perceptual image quality assessment (IQA), held in conjunction with the New Trends in Image Restoration and Enhancement (NTIRE) workshop at CVPR 2021.
no code implementations • 17 Apr 2021 • Yu Qiao, Wei Zhou, Elma Kerz, Ralf Schlüter
In recent years, automated approaches to assessing linguistic complexity in second language (L2) writing have made significant progress in gauging learner performance, predicting human ratings of the quality of learner productions, and benchmarking L2 development.
no code implementations • 13 Apr 2021 • Yihao Liu, Jingwen He, Xiangyu Chen, Zhengwen Zhang, Hengyuan Zhao, Chao Dong, Yu Qiao
Photo retouching aims at improving the aesthetic visual quality of images that suffer from photographic defects such as poor contrast, over/under exposure, and inharmonious saturation.
1 code implementation • 12 Apr 2021 • Hongbin Xu, Zhipeng Zhou, Yu Qiao, Wenxiong Kang, Qiuxia Wu
Recent studies have witnessed that self-supervised methods based on view synthesis obtain clear progress on multi-view stereo (MVS).
2 code implementations • CVPR 2021 • Zhi Hou, Baosheng Yu, Yu Qiao, Xiaojiang Peng, DaCheng Tao
The proposed method can thus be used to 1) improve the performance of HOI detection, especially for the HOIs with unseen objects; and 2) infer the affordances of novel objects.
3 code implementations • ICLR 2021 • Kaiyang Zhou, Yongxin Yang, Yu Qiao, Tao Xiang
Our method, termed MixStyle, is motivated by the observation that visual domain is closely related to image style (e.g., photo vs. sketch images).
Ranked #32 on Domain Generalization on PACS
no code implementations • 31 Mar 2021 • Xin Yang, Yu Qiao, Shaozhe Chen, Shengfeng He, BaoCai Yin, Qiang Zhang, Xiaopeng Wei, Rynson W. H. Lau
Image matting is an ill-posed problem that usually requires additional user input, such as trimaps or scribbles.
1 code implementation • CVPR 2021 • Zhiwu Qing, Haisheng Su, Weihao Gan, Dongliang Wang, Wei Wu, Xiang Wang, Yu Qiao, Junjie Yan, Changxin Gao, Nong Sang
In this paper, we propose Temporal Context Aggregation Network (TCANet) to generate high-quality action proposals through "local and global" temporal context aggregation and complementary as well as progressive boundary refinement.
Ranked #3 on Temporal Action Localization on ActivityNet-1.3
1 code implementation • 18 Mar 2021 • Mingye Xu, Zhipeng Zhou, Junhao Zhang, Yu Qiao
This paper investigates the indistinguishable points (difficult to predict label) in semantic segmentation for large-scale 3D point clouds.
no code implementations • 16 Mar 2021 • Tianyu Luan, Yali Wang, Junhao Zhang, Zhe Wang, Zhipeng Zhou, Yu Qiao
By coupling advanced 3D pose estimators and HMR in a serial or parallel manner, these two frameworks can effectively correct human mesh with guidance of a concise pose calibration module.
1 code implementation • CVPR 2021 • Zhi Hou, Baosheng Yu, Yu Qiao, Xiaojiang Peng, DaCheng Tao
With the proposed object fabricator, we are able to generate large-scale HOI samples for rare and unseen categories to alleviate the open long-tailed issues in HOI detection.
no code implementations • 8 Mar 2021 • Qing Li, Xiaojiang Peng, Yu Qiao, Qi Hao
The multi-label learning module leverages a memory feature bank and assigns each image with a multi-label vector based on the similarities between the image and feature bank.
3 code implementations • CVPR 2021 • Xiangtao Kong, Hengyuan Zhao, Yu Qiao, Chao Dong
On this basis, we propose a new solution pipeline -- ClassSR that combines classification and SR in a unified framework.
2 code implementations • 3 Mar 2021 • Kaiyang Zhou, Ziwei Liu, Yu Qiao, Tao Xiang, Chen Change Loy
Generalization to out-of-distribution (OOD) data is a capability natural to humans yet challenging for machines to reproduce.
no code implementations • 7 Jan 2021 • Yu Qiao, Yuhao Liu, Qiang Zhu, Xin Yang, Yuxin Wang, Qiang Zhang, Xiaopeng Wei
Image matting is a long-standing problem in computer graphics and vision, mostly identified as the accurate estimation of the foreground in input images.
1 code implementation • ICCV 2021 • Yuhao Liu, Jiake Xie, Xiao Shi, Yu Qiao, Yujie Huang, Yong Tang, Xin Yang
Given the nature of image matting, most research has focused on solutions for transition regions.
no code implementations • 27 Dec 2020 • Hengshun Zhou, Debin Meng, Yuanyuan Zhang, Xiaojiang Peng, Jun Du, Kai Wang, Yu Qiao
Audio-video based emotion recognition aims to classify a given video into basic emotions.
1 code implementation • 20 Dec 2020 • Mutian Xu, Junhao Zhang, Zhipeng Zhou, Mingye Xu, Xiaojuan Qi, Yu Qiao
GDANet introduces Geometry-Disentangle Module to dynamically disentangle point clouds into the contour and flat part of 3D objects, respectively denoted by sharp and gentle variation components.
Ranked #5 on 3D Part Segmentation on ShapeNet-Part
1 code implementation • ECCV 2020 • Jin Ye, Junjun He, Xiaojiang Peng, Wenhao Wu, Yu Qiao
To this end, we propose an Attention-Driven Dynamic Graph Convolutional Network (ADD-GCN) to dynamically generate a specific graph for each image.
1 code implementation • ECCV 2020 • Xiaojiang Peng, Kai Wang, Zhaoyang Zeng, Qing Li, Jianfei Yang, Yu Qiao
Specifically, this plug-and-play AFM first leverages a group-to-attend module to construct groups and assign attention weights for group-wise samples, and then uses a mixup module with the attention weights to interpolate massive noisy-suppressed samples.
1 code implementation • 2 Oct 2020 • Hengyuan Zhao, Xiangtao Kong, Jingwen He, Yu Qiao, Chao Dong
Pixel attention (PA) is similar to channel attention and spatial attention in formulation.
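The distinction from channel and spatial attention can be made concrete with a short sketch; a hedged NumPy version in which the 1x1 convolution is written as a per-pixel channel mix (the helper names and toy shapes are illustrative, not the paper's exact architecture):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pixel_attention(x, W):
    """Pixel attention sketch: a 1x1 convolution (W of shape (C, C),
    applied independently at every spatial position) plus a sigmoid
    yields a coefficient for every pixel AND channel, unlike channel
    attention (one weight per channel) or spatial attention (one weight
    per pixel)."""
    # 1x1 conv == matrix multiply over the channel axis at each pixel.
    pre = np.einsum('nchw,dc->ndhw', x, W)
    return x * sigmoid(pre)             # elementwise 3-D attention map

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4, 8, 8))
W = rng.normal(size=(4, 4)) * 0.1
out = pixel_attention(x, W)
```

Because the attention map has the same shape as the feature tensor, every activation is rescaled individually, at the cost of only a single 1x1 convolution.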
1 code implementation • ECCV 2020 • Jingwen He, Yihao Liu, Yu Qiao, Chao Dong
The base network acts like an MLP that processes each pixel independently and the condition network extracts the global features of the input image to generate a condition vector.
3 code implementations • 15 Sep 2020 • Kai Zhang, Martin Danelljan, Yawei Li, Radu Timofte, Jie Liu, Jie Tang, Gangshan Wu, Yu Zhu, Xiangyu He, Wenjie Xu, Chenghua Li, Cong Leng, Jian Cheng, Guangyang Wu, Wenyi Wang, Xiaohong Liu, Hengyuan Zhao, Xiangtao Kong, Jingwen He, Yu Qiao, Chao Dong, Maitreya Suin, Kuldeep Purohit, A. N. Rajagopalan, Xiaochuan Li, Zhiqiang Lang, Jiangtao Nie, Wei Wei, Lei Zhang, Abdul Muqeet, Jiwon Hwang, Subin Yang, JungHeum Kang, Sung-Ho Bae, Yongwoo Kim, Geun-Woo Jeon, Jun-Ho Choi, Jun-Hyuk Kim, Jong-Seok Lee, Steven Marty, Eric Marty, Dongliang Xiong, Siang Chen, Lin Zha, Jiande Jiang, Xinbo Gao, Wen Lu, Haicheng Wang, Vineeth Bhaskara, Alex Levinshtein, Stavros Tsogkas, Allan Jepson, Xiangzhen Kong, Tongtong Zhao, Shanshan Zhao, Hrishikesh P. S, Densen Puthussery, Jiji C. V, Nan Nan, Shuai Liu, Jie Cai, Zibo Meng, Jiaming Ding, Chiu Man Ho, Xuehui Wang, Qiong Yan, Yuzhi Zhao, Long Chen, Jiangtao Zhang, Xiaotong Luo, Liang Chen, Yanyun Qu, Long Sun, Wenhao Wang, Zhenbing Liu, Rushi Lan, Rao Muhammad Umer, Christian Micheloni
This paper reviews the AIM 2020 challenge on efficient single image super-resolution with focus on the proposed solutions and results.
no code implementations • 15 Sep 2020 • Haisheng Su, Jing Su, Dongliang Wang, Weihao Gan, Wei Wu, Mengmeng Wang, Junjie Yan, Yu Qiao
Second, the parameter frequency distribution is further adopted to guide the student network to learn the appearance modeling process from the teacher.
1 code implementation • 15 Sep 2020 • Haisheng Su, Weihao Gan, Wei Wu, Yu Qiao, Junjie Yan
In this paper, we present BSN++, a new framework which exploits complementary boundary regressor and relation modeling for temporal proposal generation.
2 code implementations • 10 Sep 2020 • Yihao Liu, Liangbin Xie, Li Si-Yao, Wenxiu Sun, Yu Qiao, Chao Dong
In this work, we further improve the performance of QVI from three facets and propose an enhanced quadratic video interpolation (EQVI) model.
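The quadratic motion model that QVI builds on can be sketched as follows; a hedged illustration that recovers per-pixel velocity and acceleration from two optical flows (the function names and single-pixel example are illustrative, and EQVI's actual enhancements go beyond this baseline):

```python
import numpy as np

def quadratic_displacement(flow_0_to_1, flow_0_to_m1, t):
    """Quadratic motion model: given the flows from frame 0 to frames 1
    and -1, fit x(t) = v0*t + 0.5*a*t^2 per pixel and evaluate it at an
    intermediate time t, allowing accelerated (non-linear) motion."""
    accel = flow_0_to_1 + flow_0_to_m1           # a  = x(1) + x(-1) - 2*x(0)
    veloc = 0.5 * (flow_0_to_1 - flow_0_to_m1)   # v0 = (x(1) - x(-1)) / 2
    return veloc * t + 0.5 * accel * t ** 2

# Sanity check on one pixel under constant acceleration.
f01 = np.array([2.0])     # displacement from frame 0 to frame 1
f0m1 = np.array([0.0])    # displacement from frame 0 to frame -1
half = quadratic_displacement(f01, f0m1, 0.5)
```

Linear interpolation would place this pixel at displacement 1.0 at t = 0.5; the quadratic model accounts for acceleration and places it at 0.75 instead.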
no code implementations • 1 Aug 2020 • Ruicheng Feng, Weipeng Guan, Yu Qiao, Chao Dong
Multi-scale techniques have achieved great success in a wide range of computer vision tasks.
4 code implementations • ECCV 2020 • Zhi Hou, Xiaojiang Peng, Yu Qiao, DaCheng Tao
The integration of decomposition and composition enables VCL to share object and verb features among different HOI samples and images, and to generate new interaction samples and new types of HOI, and thus largely alleviates the long-tail distribution problem and benefits low-shot or zero-shot HOI detection.
no code implementations • WS 2020 • Elma Kerz, Yu Qiao, Daniel Wiechmann, Marcus Ströbel
In this paper we employ a novel approach to advancing our understanding of the development of writing in English and German children across school grades using classification tasks.
1 code implementation • CVPR 2020 • Xianhang Li, Yali Wang, Zhipeng Zhou, Yu Qiao
Our SmallBig network outperforms a number of recent state-of-the-art approaches, in terms of accuracy and/or efficiency.
1 code implementation • CVPR 2020 • Yu Qiao, Yuhao Liu, Xin Yang, Dongsheng Zhou, Mingliang Xu, Qiang Zhang, Xiaopeng Wei
In this paper, we propose an end-to-end Hierarchical Attention Matting Network (HAttMatting), which can predict the better structure of alpha mattes from single RGB images without additional input.
Ranked #4 on Image Matting on P3M-10k
no code implementations • CVPR 2020 • Shijie Yu, Shihua Li, Dapeng Chen, Rui Zhao, Junjie Yan, Yu Qiao
To address the clothes changing person re-id problem, we construct a novel large-scale re-id benchmark named ClOthes ChAnging Person Set (COCAS), which provides multiple images of the same identity with different clothes.
no code implementations • LREC 2020 • Elma Kerz, Fabio Pruneri, Daniel Wiechmann, Yu Qiao, Marcus Ströbel
The purpose of this paper is twofold: [1] to introduce, to our knowledge, the largest available resource of keystroke logging (KSL) data generated by Etherpad (https://etherpad.org/), an open-source, web-based collaborative real-time editor, capturing the dynamics of second language (L2) production; and [2] to relate the behavioral data from KSL to indices of syntactic and lexical complexity of the produced texts, obtained from a tool that implements a sliding-window approach capturing the progression of complexity within a text.
no code implementations • ECCV 2020 • Jianbo Liu, Junjun He, Jimmy S. Ren, Yu Qiao, Hongsheng Li
Long-range contextual information is essential for achieving high-performance semantic segmentation.
1 code implementation • 16 Mar 2020 • Ze Yang, Yali Wang, Xianyu Chen, Jianzhuang Liu, Yu Qiao
Few-shot object detection is a challenging but realistic scenario, where only a few annotated training images are available for training detectors.
1 code implementation • 16 Mar 2020 • Kaiyang Zhou, Yongxin Yang, Yu Qiao, Tao Xiang
Each such classifier is an expert to its own domain and a non-expert to others.
no code implementations • 7 Mar 2020 • Wen Wang, Xiaojiang Peng, Yanzhou Su, Yu Qiao, Jian Cheng
Video action anticipation aims to predict future action categories from observed frames.
no code implementations • 26 Feb 2020 • Zhanzhan Cheng, Yunlu Xu, Mingjian Cheng, Yu Qiao, ShiLiang Pu, Yi Niu, Fei Wu
Recurrent neural networks (RNNs) have been widely studied in sequence learning tasks, while the mainstream models (e.g., LSTM and GRU) rely on the gating mechanism, which controls how information flows between hidden states.
2 code implementations • CVPR 2020 • Kai Wang, Xiaojiang Peng, Jianfei Yang, Shijian Lu, Yu Qiao
Annotating a qualitative large-scale facial expression dataset is extremely difficult due to the uncertainties caused by ambiguous facial expressions, low-quality facial images, and the subjectiveness of annotators.
1 code implementation • 24 Feb 2020 • Peiqin Zhuang, Yali Wang, Yu Qiao
These distinct gate vectors inherit mutual context on semantic differences, which allow API-Net to attentively capture contrastive clues by pairwise interaction between two images.
Ranked #3 on Fine-Grained Image Classification on Stanford Dogs
no code implementations • 12 Feb 2020 • Hao Chen, Yali Wang, Guoyou Wang, Xiang Bai, Yu Qiao
Inspired by this procedure of learning to detect, we propose a novel Progressive Object Transfer Detection (POTD) framework.
1 code implementation • 21 Jan 2020 • Wen Wang, Xiaojiang Peng, Yu Qiao, Jian Cheng
Online action detection (OAD) is a practical yet challenging task, which has attracted increasing attention in recent years.
no code implementations • 20 Jan 2020 • Yu Dong, Yihao Liu, He Zhang, Shifeng Chen, Yu Qiao
With the proposed Fusion-discriminator, which takes frequency information as additional priors, our model can generate more natural and realistic dehazed images with less color distortion and fewer artifacts.
no code implementations • 15 Jan 2020 • Jing Li, Jing Xu, Fangwei Zhong, Xiangyu Kong, Yu Qiao, Yizhou Wang
In the system, each camera is equipped with two controllers and a switcher: The vision-based controller tracks targets based on observed images.
1 code implementation • 23 Dec 2019 • Mingye Xu, Zhipeng Zhou, Yu Qiao
Specifically, GS-Net consists of Geometry Similarity Connection (GSC) modules which exploit Eigen-Graph to group distant points with similar and relevant geometric information, and aggregate features from nearest neighbors in both Euclidean space and Eigenvalue space.
Ranked #7 on 3D Point Cloud Classification on IntrA
1 code implementation • ECCV 2020 • Jingwen He, Chao Dong, Yu Qiao
To make a step forward, this paper presents a new problem setup, called multi-dimension (MD) modulation, which aims at modulating output effects across multiple degradation types and levels.
no code implementations • 28 Sep 2019 • Qing Li, Xiaojiang Peng, Yu Qiao, Qiang Peng
In this paper, instead of using a pre-defined graph which is inflexible and may be sub-optimal for multi-label classification, we propose the A-GCN, which leverages the popular Graph Convolutional Networks with an Adaptive label correlation graph to model label dependencies.
1 code implementation • ICCV 2019 • Wenlong Zhang, Yihao Liu, Chao Dong, Yu Qiao
To address the problem, we propose Super-Resolution Generative Adversarial Networks with Ranker (RankSRGAN) to optimize the generator in the direction of perceptual metrics.
Ranked #1 on Image Super-Resolution on PIRM-test
no code implementations • 26 Jul 2019 • Qing Li, Xiaojiang Peng, Liangliang Cao, Wenbin Du, Hao Xing, Yu Qiao
Instead of collecting product images by labor- and time-intensive image capturing, we take advantage of the web and download images from the reviews of several e-commerce websites, where the images are casually captured by consumers.
no code implementations • 8 Jul 2019 • Kai Wang, Jianfei Yang, Da Guo, Kaipeng Zhang, Xiaojiang Peng, Yu Qiao
Based on our winner solution last year, we mainly explore head features and body features with a bootstrap strategy and two novel loss functions in this paper.
1 code implementation • 29 Jun 2019 • Debin Meng, Xiaojiang Peng, Kai Wang, Yu Qiao
The feature embedding module is a deep Convolutional Neural Network (CNN) which embeds face images into feature vectors.
Ranked #2 on Facial Expression Recognition on CK+ (Accuracy (7 emotion) metric)
no code implementations • 11 Jun 2019 • Ruicheng Feng, Jinjin Gu, Yu Qiao, Chao Dong
Large deep networks have demonstrated competitive performance in single image super-resolution (SISR), with a huge volume of data involved.
1 code implementation • 10 May 2019 • Kai Wang, Xiaojiang Peng, Jianfei Yang, Debin Meng, Yu Qiao
Extensive experiments show that our RAN and region biased loss largely improve the performance of FER with occlusion and variant pose.
Ranked #1 on Facial Expression Recognition on SFEW
no code implementations • CVPR 2019 • Xiao Zhang, Rui Zhao, Junjie Yan, Mengya Gao, Yu Qiao, Xiaogang Wang, Hongsheng Li
Cosine-based softmax losses significantly improve the performance of deep face recognition networks.
2 code implementations • CVPR 2019 • Xiao Zhang, Rui Zhao, Yu Qiao, Xiaogang Wang, Hongsheng Li
Our results show that training deep neural networks with the AdaCos loss is stable and able to achieve high face recognition accuracy.
Ranked #5 on Face Verification on MegaFace
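The cosine-based softmax that AdaCos builds on can be sketched as follows. This is a simplified fixed-scale version with an assumed scale parameter s; AdaCos itself adapts the scale automatically during training.

```python
import numpy as np

def cosine_softmax_probs(features, class_weights, s=30.0):
    """Cosine-based softmax: logits are scaled cosine similarities
    between L2-normalized features and L2-normalized class weights,
    so classification depends only on angles, not magnitudes."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = class_weights / np.linalg.norm(class_weights, axis=1, keepdims=True)
    logits = s * (f @ w.T)                        # cosines scaled to [-s, s]
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
feats = rng.standard_normal((2, 8))      # 2 face embeddings, dim 8
weights = rng.standard_normal((3, 8))    # 3 identity prototypes
probs = cosine_softmax_probs(feats, weights)
print(probs.shape)  # → (2, 3)
```

Because both sides are normalized, rescaling a feature vector leaves the predicted distribution unchanged — the property that motivates tuning (or adapting) the scale s instead.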
1 code implementation • CVPR 2019 • Jingwen He, Chao Dong, Yu Qiao
In image restoration tasks, such as denoising and super-resolution, continual modulation of restoration levels is of great importance for real-world applications, yet most existing deep learning based image restoration methods fail to support it.
Ranked #1 on Color Image Denoising on CBSD68 sigma75
no code implementations • ECCV 2018 • Kaipeng Zhang, Zhanpeng Zhang, Chia-Wen Cheng, Winston H. Hsu, Yu Qiao, Wei Liu, Tong Zhang
Face hallucination is a generative task to super-resolve facial images with low resolution, while human perception of faces heavily relies on identity information.
no code implementations • 3 Oct 2018 • Andrey Ignatov, Radu Timofte, Thang Van Vu, Tung Minh Luu, Trung X. Pham, Cao Van Nguyen, Yongwoo Kim, Jae-Seok Choi, Munchurl Kim, Jie Huang, Jiewen Ran, Chen Xing, Xingguang Zhou, Pengfei Zhu, Mingrui Geng, Yawei Li, Eirikur Agustsson, Shuhang Gu, Luc van Gool, Etienne de Stoutz, Nikolay Kobyshev, Kehui Nie, Yan Zhao, Gen Li, Tong Tong, Qinquan Gao, Liu Hanwen, Pablo Navarrete Michelini, Zhu Dan, Hu Fengshuo, Zheng Hui, Xiumei Wang, Lirui Deng, Rang Meng, Jinghui Qin, Yukai Shi, Wushao Wen, Liang Lin, Ruicheng Feng, Shixiang Wu, Chao Dong, Yu Qiao, Subeesh Vasu, Nimisha Thekke Madam, Praveen Kandula, A. N. Rajagopalan, Jie Liu, Cheolkon Jung
This paper reviews the first challenge on efficient perceptual image enhancement with the focus on deploying deep learning models on smartphones.
no code implementations • ECCV 2018 • Dian Shao, Yu Xiong, Yue Zhao, Qingqiu Huang, Yu Qiao, Dahua Lin
The thriving of video sharing services brings new challenges to video retrieval, e.g., the rapid growth in video duration and content diversity.
35 code implementations • 1 Sep 2018 • Xintao Wang, Ke Yu, Shixiang Wu, Jinjin Gu, Yihao Liu, Chao Dong, Chen Change Loy, Yu Qiao, Xiaoou Tang
To further enhance the visual quality, we thoroughly study three key components of SRGAN - network architecture, adversarial loss and perceptual loss, and improve each of them to derive an Enhanced SRGAN (ESRGAN).
Ranked #2 on Image Super-Resolution on PIRM-test
no code implementations • 12 Jul 2018 • Wanli Chen, Yue Zhang, Junjun He, Yu Qiao, Yi-fan Chen, Hongjian Shi, Xiaoying Tang
To address the aforementioned three problems, we propose and validate a deeper network that can fit medical image datasets that are usually small in sample size.
no code implementations • CVPR 2018 • Yali Wang, Lei Zhou, Yu Qiao
To mimic this capacity, we propose a novel Hybrid Video Memory (HVM) machine, which can hallucinate temporal features of still images from video memory, in order to boost action recognition with few still images.
no code implementations • 22 May 2018 • Tao Yu, Yu Qiao, Huan Long
A variety of deep neural networks have been applied in medical image segmentation and achieve good performance.
no code implementations • 10 May 2018 • Xiaoyu Yue, Zhanghui Kuang, Zhaoyang Zhang, Zhenfang Chen, Pan He, Yu Qiao, Wei Zhang
Deep CNNs have achieved great success in text detection.
1 code implementation • ECCV 2018 • Yifan Xu, Tianqi Fan, Mingye Xu, Long Zeng, Yu Qiao
Deep neural networks have enjoyed remarkable success for various vision tasks; however, it remains challenging to apply CNNs to domains lacking a regular underlying structure, such as 3D point clouds.
Ranked #6 on 3D Part Segmentation on IntrA
2 code implementations • CVPR 2018 • Tong He, Zhi Tian, Weilin Huang, Chunhua Shen, Yu Qiao, Changming Sun
This allows the two tasks to work collaboratively by sharing convolutional features, which is critical to identify challenging text instances.
1 code implementation • 5 Mar 2018 • Hao Chen, Yali Wang, Guoyou Wang, Yu Qiao
Second, we introduce a novel regularized transfer learning framework for low-shot detection, where the transfer knowledge (TK) and background depression (BD) regularizations are proposed to leverage object knowledge respectively from source and target domains, in order to further enhance fine-tuning with a few target images.
Ranked #14 on Few-Shot Object Detection on MS-COCO (30-shot)
1 code implementation • 24 Jan 2018 • Zhe Wang, Xiaoyi Liu, Liangjian Chen, Li-Min Wang, Yu Qiao, Xiaohui Xie, Charless Fowlkes
Visual question answering (VQA) is of significant interest due to its potential to be a strong test of image understanding systems and to probe the connection between language and vision.
7 code implementations • CVPR 2018 • Xuebo Liu, Ding Liang, Shi Yan, Dagui Chen, Yu Qiao, Junjie Yan
Incidental scene text spotting is considered one of the most difficult and valuable challenges in the document analysis community.
Ranked #4 on Scene Text Detection on ICDAR 2015
6 code implementations • 29 Dec 2017 • Kaiyang Zhou, Yu Qiao, Tao Xiang
Video summarization aims to facilitate large-scale video browsing by producing short, concise summaries that are diverse and representative of original videos.
Ranked #5 on Unsupervised Video Summarization on TvSum
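One way the diversity of a selected summary can be scored is sketched below. This is a simplified diversity term only (the function name is mine); the paper's full reward also includes a representativeness term and is optimized with reinforcement learning.

```python
import numpy as np

def diversity_reward(features, selected):
    """Mean pairwise cosine *dissimilarity* among the features of
    selected frames: 1.0 when all chosen frames are mutually
    orthogonal, 0.0 when they are identical."""
    f = features[selected]
    f = f / np.linalg.norm(f, axis=1, keepdims=True)
    sim = f @ f.T                       # pairwise cosine similarities
    n = len(selected)
    if n < 2:
        return 0.0
    off_diag = sim.sum() - np.trace(sim)  # exclude self-similarity
    return 1.0 - off_diag / (n * (n - 1))

frames = np.eye(4)  # four mutually orthogonal "frame features"
print(diversity_reward(frames, [0, 1, 2]))  # → 1.0 (maximally diverse)
```

A summarizer trained to maximize such a reward is pushed toward frame subsets that cover visually distinct parts of the video.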
1 code implementation • 2017 IEEE International Conference on Computer Vision (ICCV) 2017 • Wenbin Du, Yali Wang, Yu Qiao
Firstly, unlike previous works on pose-related action recognition, our RPAN is an end-to-end recurrent network which can exploit important spatial-temporal evolutions of human pose to assist action recognition in a unified framework.
Ranked #5 on Skeleton Based Action Recognition on J-HMDB
no code implementations • ICCV 2017 • Kaipeng Zhang, Zhanpeng Zhang, Hao Wang, Zhifeng Li, Yu Qiao, Wei Liu
Deep Convolutional Neural Networks (CNNs) achieve substantial improvements in face detection in the wild.
no code implementations • ICCV 2017 • Xiao Zhang, Zhiyuan Fang, Yandong Wen, Zhifeng Li, Yu Qiao
Unlike these works, this paper investigates how long-tailed data impact the training of face CNNs and develops a novel loss function, called range loss, to effectively utilize the tailed data in the training process.
no code implementations • 7 Sep 2017 • Lei Xiang, Qian Wang, Xiyao Jin, Dong Nie, Yu Qiao, Dinggang Shen
After repeating this embedding procedure several times in the network, we can eventually synthesize a final CT image at the end of the DECNN.
1 code implementation • ICCV 2017 • Pan He, Weilin Huang, Tong He, Qile Zhu, Yu Qiao, Xiaolin Li
Our text detector achieves an F-measure of 77% on the ICDAR 2015 benchmark, advancing the state-of-the-art results in [18, 28].
Ranked #4 on Scene Text Detection on COCO-Text
8 code implementations • 8 May 2017 • Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc van Gool
Furthermore, based on the temporal segment networks, we won the video classification track at the ActivityNet challenge 2016 among 24 teams, which demonstrates the effectiveness of TSN and the proposed good practices.
Ranked #17 on Action Classification on Moments in Time (Top 5 Accuracy metric)
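The sparse sampling and segmental consensus at the heart of TSN can be sketched as follows. Snippet selection is simplified here to the middle frame of each segment (TSN samples randomly during training), and the consensus function shown is plain averaging, one of the options studied in the paper.

```python
import numpy as np

def sample_segments(num_frames, num_segments=3):
    """TSN-style sparse sampling: divide the video into equal-length
    segments and pick one frame (here, the middle one) from each,
    covering the whole video at low cost."""
    edges = np.linspace(0, num_frames, num_segments + 1, dtype=int)
    return [(lo + hi) // 2 for lo, hi in zip(edges[:-1], edges[1:])]

def segmental_consensus(snippet_scores):
    """Aggregate per-snippet class scores into one video-level
    prediction by averaging across segments."""
    return np.mean(snippet_scores, axis=0)

idx = sample_segments(num_frames=30, num_segments=3)
print(idx)  # → [5, 15, 25]
scores = np.array([[0.2, 0.8],   # snippet 1 class scores
                   [0.4, 0.6],   # snippet 2
                   [0.9, 0.1]])  # snippet 3
print(segmental_consensus(scores))  # → [0.5 0.5]
```

Because every segment contributes one snippet, long-range temporal structure is modeled without densely processing all frames.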
2 code implementations • 28 Nov 2016 • Xiao Zhang, Zhiyuan Fang, Yandong Wen, Zhifeng Li, Yu Qiao
Convolutional neural networks have achieved great improvement on face recognition in recent years because of their extraordinary ability to learn discriminative features of people with different identities.
2 code implementations • 4 Oct 2016 • Limin Wang, Sheng Guo, Weilin Huang, Yuanjun Xiong, Yu Qiao
Convolutional Neural Networks (CNNs) have made remarkable progress on scene recognition, partially due to recent large-scale scene datasets such as Places and Places2.
no code implementations • ECCV 2016 • Yandong Wen, Kaipeng Zhang, Zhifeng Li, Yu Qiao
In most of the available CNNs, the softmax loss function is used as the supervision signal to train the deep model.
17 code implementations • 12 Sep 2016 • Zhi Tian, Weilin Huang, Tong He, Pan He, Yu Qiao
We propose a novel Connectionist Text Proposal Network (CTPN) that accurately localizes text lines in natural images.
no code implementations • 1 Sep 2016 • Limin Wang, Zhe Wang, Yu Qiao, Luc van Gool
These newly designed transferring techniques exploit multi-task learning frameworks to incorporate extra knowledge from other networks and additional datasets into the training procedure of event CNNs.
1 code implementation • 1 Sep 2016 • Zhe Wang, Li-Min Wang, Yali Wang, Bo-Wen Zhang, Yu Qiao
In this paper, we propose a hybrid representation, which leverages the discriminative capacity of CNNs and the simplicity of descriptor encoding schema for image recognition, with a focus on scene recognition.
19 code implementations • 2 Aug 2016 • Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc van Gool
The other contribution is our study on a series of good practices in learning ConvNets on video data with the help of temporal segment network.
Ranked #3 on Multimodal Activity Recognition on EV-Action
1 code implementation • 2 Aug 2016 • Yuanjun Xiong, Li-Min Wang, Zhe Wang, Bo-Wen Zhang, Hang Song, Wei Li, Dahua Lin, Yu Qiao, Luc van Gool, Xiaoou Tang
This paper presents the method that underlies our submission to the untrimmed video classification task of ActivityNet Challenge 2016.
no code implementations • 21 Jun 2016 • Linjie Xing, Yu Qiao
The main contributions are: 1) we design and optimize a multi-stream structure for the writer identification task; 2) we introduce data augmentation learning to enhance the performance of DeepWriter; 3) we introduce a patch scanning strategy to handle text images of different lengths.
no code implementations • CVPR 2016 • Wangjiang Zhu, Jie Hu, Gang Sun, Xudong Cao, Yu Qiao
Training with a large proportion of irrelevant volumes will hurt performance.
no code implementations • CVPR 2016 • Yandong Wen, Zhifeng Li, Yu Qiao
In order to address this problem, we propose a novel deep face recognition framework to learn the age-invariant deep face features through a carefully designed CNN model.
Ranked #7 on Age-Invariant Face Recognition on CACDVS
1 code implementation • CVPR 2016 • Bowen Zhang, Li-Min Wang, Zhe Wang, Yu Qiao, Hanli Wang
The deep two-stream architecture exhibited excellent performance on video based action recognition.