1 code implementation • 29 Apr 2021 • Jiachen Li, Bowen Cheng, Rogerio Feris, JinJun Xiong, Thomas S. Huang, Wen-mei Hwu, Humphrey Shi
Current anchor-free object detectors are quite simple and effective yet lack accurate label assignment methods, which limits their potential in competing with classic anchor-based models that are supported by well-designed assignment methods based on the Intersection-over-Union (IoU) metric.
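For readers unfamiliar with the IoU metric mentioned above, the following is a minimal sketch of how it is commonly computed for two axis-aligned boxes. The `(x_min, y_min, x_max, y_max)` box layout and the function name `iou` are assumptions made for illustration; this is not the label assignment method proposed in the paper.

```python
from typing import Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0
```

Anchor-based detectors typically label an anchor as positive when its IoU with a ground-truth box exceeds a chosen threshold; the threshold value itself varies between models.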
1 code implementation • 7 Dec 2020 • Yang Fu, Linjie Yang, Ding Liu, Thomas S. Huang, Humphrey Shi
Video instance segmentation is a complex task in which we need to detect, segment, and track each object for any given video.
Ranked #34 on Video Instance Segmentation on YouTube-VIS validation
no code implementations • 28 Jun 2020 • Hanchao Yu, Xiao Chen, Humphrey Shi, Terrence Chen, Thomas S. Huang, Shanhui Sun
In this paper, we propose Motion Pyramid Networks, a novel deep learning-based approach for accurate and efficient cardiac motion estimation.
1 code implementation • NeurIPS 2020 • Yuchen Fan, Jiahui Yu, Yiqun Mei, Yulun Zhang, Yun Fu, Ding Liu, Thomas S. Huang
Inspired by the robustness and efficiency of sparse representation in sparse coding based image restoration models, we investigate the sparsity of neurons in deep networks.
3 code implementations • CVPR 2020 • Yiqun Mei, Yuchen Fan, Yuqian Zhou, Lichao Huang, Thomas S. Huang, Humphrey Shi
By combining the new CS-NL prior with local and in-scale non-local priors in a powerful recurrent fusion cell, we can find more cross-scale feature correlations within a single low-resolution (LR) image.
Ranked #7 on Image Super-Resolution on Manga109 - 3x upscaling
no code implementations • 3 May 2020 • Kai Zhang, Shuhang Gu, Radu Timofte, Taizhang Shang, Qiuju Dai, Shengchen Zhu, Tong Yang, Yandong Guo, Younghyun Jo, Sejong Yang, Seon Joo Kim, Lin Zha, Jiande Jiang, Xinbo Gao, Wen Lu, Jing Liu, Kwangjin Yoon, Taegyun Jeon, Kazutoshi Akita, Takeru Ooba, Norimichi Ukita, Zhipeng Luo, Yuehan Yao, Zhenyu Xu, Dongliang He, Wenhao Wu, Yukang Ding, Chao Li, Fu Li, Shilei Wen, Jianwei Li, Fuzhi Yang, Huan Yang, Jianlong Fu, Byung-Hoon Kim, JaeHyun Baek, Jong Chul Ye, Yuchen Fan, Thomas S. Huang, Junyeop Lee, Bokyeung Lee, Jungki Min, Gwantae Kim, Kanghyu Lee, Jaihyun Park, Mykola Mykhailych, Haoyu Zhong, Yukai Shi, Xiaojun Yang, Zhijing Yang, Liang Lin, Tongtong Zhao, Jinjia Peng, Huibing Wang, Zhi Jin, Jiahao Wu, Yifu Chen, Chenming Shang, Huanrong Zhang, Jeongki Min, Hrishikesh P. S, Densen Puthussery, Jiji C. V
This paper reviews the NTIRE 2020 challenge on perceptual extreme super-resolution with focus on proposed solutions and results.
2 code implementations • 28 Apr 2020 • Yiqun Mei, Yuchen Fan, Yulun Zhang, Jiahui Yu, Yuqian Zhou, Ding Liu, Yun Fu, Thomas S. Huang, Humphrey Shi
Self-similarity refers to the image prior widely used in image restoration algorithms that small but similar patterns tend to occur at different locations and scales.
1 code implementation • 21 Apr 2020 • Mang Tik Chiu, Xingqian Xu, Kai Wang, Jennifer Hobbs, Naira Hovakimyan, Thomas S. Huang, Honghui Shi, Yunchao Wei, Zilong Huang, Alexander Schwing, Robert Brunner, Ivan Dozier, Wyatt Dozier, Karen Ghandilyan, David Wilson, Hyunseong Park, Junhee Kim, Sungho Kim, Qinghui Liu, Michael C. Kampffmeyer, Robert Jenssen, Arnt B. Salberg, Alexandre Barbosa, Rodrigo Trevisan, Bingchen Zhao, Shaozuo Yu, Siwei Yang, Yin Wang, Hao Sheng, Xiao Chen, Jingyi Su, Ram Rajagopal, Andrew Ng, Van Thong Huynh, Soo-Hyung Kim, In-Seop Na, Ujjwal Baid, Shubham Innani, Prasad Dutande, Bhakti Baheti, Sanjay Talbar, Jianyu Tang
The first Agriculture-Vision Challenge aims to encourage research in developing novel and effective algorithms for agricultural pattern recognition from aerial images, especially for the semantic segmentation task associated with our challenge dataset.
no code implementations • 2 Apr 2020 • Zhonghao Wang, Yunchao Wei, Rogerio Feris, JinJun Xiong, Wen-mei Hwu, Thomas S. Huang, Humphrey Shi
A key challenge of this task is how to alleviate the data distribution discrepancy between the source and target domains, i.e., reducing domain shift.
no code implementations • 30 Mar 2020 • Jianbo Jiao, Linchao Bao, Yunchao Wei, Shengfeng He, Honghui Shi, Rynson Lau, Thomas S. Huang
This can be naturally generalized to span multiple scales with a Laplacian pyramid representation of the input data.
1 code implementation • CVPR 2020 • Zhonghao Wang, Mo Yu, Yunchao Wei, Rogerio Feris, JinJun Xiong, Wen-mei Hwu, Thomas S. Huang, Humphrey Shi
We consider the problem of unsupervised domain adaptation for semantic segmentation by easing the domain shift between the source domain (synthetic data) and the target domain (real data) in this work.
Ranked #8 on Semantic Segmentation on DensePASS
no code implementations • 15 Mar 2020 • Xingqian Xu, Mang Tik Chiu, Thomas S. Huang, Honghui Shi
Most of the modern instance segmentation approaches fall into two categories: region-based approaches in which object bounding boxes are detected first and later used in cropping and segmenting instances; and keypoint-based approaches in which individual instances are represented by a set of keypoints followed by a dense pixel clustering around those keypoints.
1 code implementation • 24 Feb 2020 • Zilong Huang, Yunchao Wei, Xinggang Wang, Wenyu Liu, Thomas S. Huang, Humphrey Shi
Aggregating features in terms of different convolutional blocks or contextual embeddings has been proven to be an effective way to strengthen feature representations for semantic segmentation.
2 code implementations • CVPR 2020 • Mang Tik Chiu, Xingqian Xu, Yunchao Wei, Zilong Huang, Alexander Schwing, Robert Brunner, Hrant Khachatrian, Hovnatan Karapetyan, Ivan Dozier, Greg Rose, David Wilson, Adrian Tudor, Naira Hovakimyan, Thomas S. Huang, Honghui Shi
To encourage research in computer vision for agriculture, we present Agriculture-Vision: a large-scale aerial farmland image dataset for semantic segmentation of agricultural patterns.
1 code implementation • 19 Dec 2019 • Yuchen Fan, Jiahui Yu, Ding Liu, Thomas S. Huang
In this paper, we show that properly modeling scale-invariance into neural networks can bring significant benefits to image restoration performance.
9 code implementations • CVPR 2020 • Bowen Cheng, Maxwell D. Collins, Yukun Zhu, Ting Liu, Thomas S. Huang, Hartwig Adam, Liang-Chieh Chen
In this work, we introduce Panoptic-DeepLab, a simple, strong, and fast system for panoptic segmentation, aiming to establish a solid baseline for bottom-up methods that can achieve performance comparable to that of two-stage methods while yielding fast inference speed.
Ranked #3 on Instance Segmentation on Cityscapes test (using extra training data)
2 code implementations • 17 Nov 2019 • Haichao Yu, Haoxiang Li, Honghui Shi, Thomas S. Huang, Gang Hua
When all layers are set to low-bits, we show that the model achieved accuracy comparable to dedicated models trained at the same precision.
2 code implementations • 10 Oct 2019 • Bowen Cheng, Maxwell D. Collins, Yukun Zhu, Ting Liu, Thomas S. Huang, Hartwig Adam, Liang-Chieh Chen
The semantic segmentation branch is the same as the typical design of any semantic segmentation model (e.g., DeepLab), while the instance segmentation branch is class-agnostic, involving a simple instance center regression.
no code implementations • 9 Sep 2019 • Zengming Shen, S. Kevin Zhou, Yi-fan Chen, Bogdan Georgescu, Xuqi Liu, Thomas S. Huang
Here we propose a self-inverse network learning approach for unpaired image-to-image translation.
no code implementations • 9 Sep 2019 • Zengming Shen, Yifan Chen, S. Kevin Zhou, Bogdan Georgescu, Xuqi Liu, Thomas S. Huang
A self-inverse network shares several distinct advantages: only one network instead of two, better generalization and more restricted parameter space.
19 code implementations • CVPR 2020 • Bowen Cheng, Bin Xiao, Jingdong Wang, Honghui Shi, Thomas S. Huang, Lei Zhang
HigherHRNet even surpasses all top-down methods on CrowdPose test (67.6% AP), suggesting its robustness in crowded scenes.
Ranked #2 on Pose Estimation on UAV-Human
no code implementations • ICLR 2020 • Yingzhen Yang, Jiahui Yu, Nebojsa Jojic, Jun Huan, Thomas S. Huang
FSNet has the same architecture as that of the baseline CNN to be compressed, and each convolution layer of FSNet has the same number of filters from FS as that of the baseline CNN in the forward process.
no code implementations • 3 Feb 2019 • Yingzhen Yang, Jiahui Yu, Xingjian Li, Jun Huan, Thomas S. Huang
In this paper, we investigate the role of Rademacher complexity in improving generalization of DNNs and propose a novel regularizer rooted in Local Rademacher Complexity (LRC).
4 code implementations • ICCV 2019 • Zilong Huang, Xinggang Wang, Yunchao Wei, Lichao Huang, Humphrey Shi, Wenyu Liu, Thomas S. Huang
Compared with the non-local block, the proposed recurrent criss-cross attention module requires 11x less GPU memory usage.
Ranked #7 on Semantic Segmentation on FoodSeg103 (using extra training data)
no code implementations • 23 Nov 2018 • Bowen Cheng, Yunchao Wei, Jiahui Yu, Shiyu Chang, JinJun Xiong, Wen-mei Hwu, Thomas S. Huang, Humphrey Shi
While training on samples drawn from an independent and identical distribution has been a de facto paradigm for optimizing image classification networks, humans learn new concepts in an easy-to-hard manner and progressively on selected examples.
1 code implementation • 6 Sep 2018 • Ding Liu, Bihan Wen, Jianbo Jiao, Xian-Ming Liu, Zhangyang Wang, Thomas S. Huang
Second, we propose a deep neural network solution that cascades two modules for image denoising and various high-level tasks, respectively, and use the joint loss for updating only the denoising network via back-propagation.
1 code implementation • NeurIPS 2018 • Ding Liu, Bihan Wen, Yuchen Fan, Chen Change Loy, Thomas S. Huang
The main contributions of this work are: (1) Unlike existing methods that measure self-similarity in an isolated manner, the proposed non-local module can be flexibly integrated into existing deep networks for end-to-end training to capture deep feature correlation between each location and its neighborhood.
Ranked #1 on Grayscale Image Denoising on Set12 sigma30
no code implementations • CVPR 2018 • Yunchao Wei, Huaxin Xiao, Honghui Shi, Zequn Jie, Jiashi Feng, Thomas S. Huang
Despite remarkable progress, weakly supervised segmentation methods are still inferior to their fully supervised counterparts.
no code implementations • CVPR 2018 • Yunchao Wei, Huaxin Xiao, Honghui Shi, Zequn Jie, Jiashi Feng, Thomas S. Huang
It can produce dense and reliable object localization maps and effectively benefit both weakly- and semi-supervised semantic segmentation.
1 code implementation • CVPR 2018 • Wei Han, Shiyu Chang, Ding Liu, Mo Yu, Michael Witbrock, Thomas S. Huang
Advances in image super-resolution (SR) have recently benefited significantly from rapid developments in deep neural networks.
Ranked #41 on Image Super-Resolution on BSD100 - 4x upscaling
28 code implementations • CVPR 2018 • Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang
Motivated by these observations, we propose a new deep generative model-based approach which can not only synthesize novel image structures but also explicitly utilize surrounding image features as references during network training to make better predictions.
no code implementations • 20 Dec 2017 • Ding Liu, Bowen Cheng, Zhangyang Wang, Haichao Zhang, Thomas S. Huang
Visual recognition under adverse conditions is a very important and challenging problem of high practical value, due to the ubiquitous existence of quality distortions during image acquisition, transmission, or storage.
2 code implementations • NeurIPS 2017 • Shiyu Chang, Yang Zhang, Wei Han, Mo Yu, Xiaoxiao Guo, Wei Tan, Xiaodong Cui, Michael Witbrock, Mark Hasegawa-Johnson, Thomas S. Huang
To provide a theory-based quantification of the architecture's advantages, we introduce a memory capacity measure, the mean recurrent length, which is more suitable for RNNs with long skip connections than existing measures.
Ranked #24 on Sequential Image Classification on Sequential MNIST
no code implementations • 10 Sep 2017 • Bowen Cheng, Zhangyang Wang, Zhaobin Zhang, Zhu Li, Ding Liu, Jianchao Yang, Shuai Huang, Thomas S. Huang
Emotion recognition from facial expressions is tremendously useful, especially when coupled with smart devices and wireless multimedia applications.
no code implementations • 5 Sep 2017 • Yingzhen Yang, Jiashi Feng, Nebojsa Jojic, Jianchao Yang, Thomas S. Huang
We study the proximal gradient descent (PGD) method for the $\ell^{0}$ sparse approximation problem as well as its accelerated optimization with randomized algorithms in this paper.
no code implementations • 5 Sep 2017 • Yingzhen Yang, Feng Liang, Nebojsa Jojic, Shuicheng Yan, Jiashi Feng, Thomas S. Huang
By generalization analysis via Rademacher complexity, the generalization error bound for the kernel classifier learned from hypothetical labeling is expressed as the sum of pairwise similarity between the data from different classes, parameterized by the weights of the kernel classifier.
2 code implementations • 14 Jun 2017 • Ding Liu, Bihan Wen, Xianming Liu, Zhangyang Wang, Thomas S. Huang
Conventionally, image denoising and high-level vision tasks are handled separately in computer vision.
1 code implementation • 20 Apr 2017 • Prajit Ramachandran, Tom Le Paine, Pooya Khorrami, Mohammad Babaeizadeh, Shiyu Chang, Yang Zhang, Mark A. Hasegawa-Johnson, Roy H. Campbell, Thomas S. Huang
In this work, we describe a method to speed up generation in convolutional autoregressive models.
no code implementations • 8 Dec 2016 • Xianming Liu, Amy Zhang, Tobias Tiecke, Andreas Gros, Thomas S. Huang
Learning from weakly-supervised data is one of the main challenges in machine learning and computer vision, especially for tasks such as image semantic segmentation where labeling is extremely expensive and subjective.
6 code implementations • 29 Nov 2016 • Tom Le Paine, Pooya Khorrami, Shiyu Chang, Yang Zhang, Prajit Ramachandran, Mark A. Hasegawa-Johnson, Thomas S. Huang
This paper presents an efficient implementation of the Wavenet generation process called Fast Wavenet.
no code implementations • 23 Aug 2016 • Zhangyang Wang, Thomas S. Huang
This paper emphasizes the significance of jointly exploiting the problem structure and the parameter structure, in the context of deep modeling.
no code implementations • 14 Aug 2016 • Zhangyang Wang, Shiyu Chang, Qing Ling, Shuai Huang, Xia Hu, Honghui Shi, Thomas S. Huang
With the agreement of my coauthors, I, Zhangyang Wang, would like to withdraw the manuscript "Stacked Approximated Regression Machine: A Simple Deep Learning Approach".
no code implementations • 21 Jul 2016 • Shiyu Chang, Yang Zhang, Jiliang Tang, Dawei Yin, Yi Chang, Mark A. Hasegawa-Johnson, Thomas S. Huang
The increasing popularity of real-world recommender systems produces data continuously and rapidly, and it becomes more realistic to study recommender systems under streaming scenarios.
no code implementations • CVPR 2016 • Zhangyang Wang, Ding Liu, Shiyu Chang, Qing Ling, Yingzhen Yang, Thomas S. Huang
In this paper, we design a Deep Dual-Domain (D3) based fast restoration model to remove artifacts of JPEG compressed images.
no code implementations • 6 Apr 2016 • Zhangyang Wang, Yingzhen Yang, Shiyu Chang, Qing Ling, Thomas S. Huang
We investigate the $\ell_\infty$-constrained representation which demonstrates robustness to quantization errors, utilizing the tool of deep learning.
1 code implementation • 26 Feb 2016 • Wei Han, Pooya Khorrami, Tom Le Paine, Prajit Ramachandran, Mohammad Babaeizadeh, Honghui Shi, Jianan Li, Shuicheng Yan, Thomas S. Huang
Video object detection is challenging because objects that are easily detected in one frame may be difficult to detect in another frame within the same clip.
1 code implementation • 24 Feb 2016 • Pooya Khorrami, Tom Le Paine, Kevin Brady, Charlie Dagli, Thomas S. Huang
In this work, we present a system that performs emotion recognition on video data using both CNNs and RNNs, and we also analyze how much each neural network component contributes to the system's overall performance.
no code implementations • 16 Jan 2016 • Zhangyang Wang, Shiyu Chang, Florin Dolcos, Diane Beck, Ding Liu, Thomas S. Huang
Image aesthetics assessment has been challenging due to its subjective nature.
no code implementations • CVPR 2016 • Zhangyang Wang, Shiyu Chang, Yingzhen Yang, Ding Liu, Thomas S. Huang
Visual recognition research often assumes a sufficient resolution of the region of interest (ROI).
no code implementations • 16 Jan 2016 • Zhangyang Wang, Ding Liu, Shiyu Chang, Qing Ling, Yingzhen Yang, Thomas S. Huang
In this paper, we design a Deep Dual-Domain ($\mathbf{D^3}$) based fast restoration model to remove artifacts of JPEG compressed images.
no code implementations • ICCV 2015 • Chunshui Cao, Xian-Ming Liu, Yi Yang, Yinan Yu, Jiang Wang, Zilei Wang, Yongzhen Huang, Liang Wang, Chang Huang, Wei Xu, Deva Ramanan, Thomas S. Huang
While feedforward deep convolutional neural networks (CNNs) have been a great success in computer vision, it is important to remember that the human visual cortex generally contains more feedback connections than forward connections.
no code implementations • 28 Oct 2015 • Yingzhen Yang, Jiashi Feng, Jianchao Yang, Thomas S. Huang
Sparse subspace clustering methods, such as Sparse Subspace Clustering (SSC) \cite{ElhamifarV13} and $\ell^{1}$-graph \cite{YanW09, ChengYYFH10}, are effective in partitioning the data that lie in a union of subspaces.
1 code implementation • 10 Oct 2015 • Pooya Khorrami, Tom Le Paine, Thomas S. Huang
Despite being the appearance-based classifier of choice in recent years, relatively few works have examined how much convolutional neural networks (CNNs) can improve performance on accepted expression recognition benchmarks and, more importantly, examine what it is they actually learn.
no code implementations • 1 Sep 2015 • Zhangyang Wang, Qing Ling, Thomas S. Huang
We study the $\ell_0$ sparse approximation problem with the tool of deep learning, by proposing Deep $\ell_0$ Encoders.
no code implementations • 1 Sep 2015 • Zhangyang Wang, Shiyu Chang, Jiayu Zhou, Meng Wang, Thomas S. Huang
In this paper, we propose to emulate the sparse coding-based clustering pipeline in the context of deep learning, leading to a carefully crafted deep model benefiting from both.
1 code implementation • 12 Jul 2015 • Zhangyang Wang, Jianchao Yang, Hailin Jin, Eli Shechtman, Aseem Agarwala, Jonathan Brandt, Thomas S. Huang
As font is one of the core design concepts, automatic font identification and similar font suggestion from an image or photo has been on the wish list of many designers.
Ranked #1 on Font Recognition on VFR-Wild
no code implementations • CVPR 2015 • Xian-Ming Liu, Rongrong Ji, Changhu Wang, Wei Liu, Bineng Zhong, Thomas S. Huang
A hierarchical shape parsing strategy is proposed to partition and organize image components into a hierarchical structure in the scale space.
no code implementations • 22 Apr 2015 • Zhangyang Wang, Yingzhen Yang, Zhaowen Wang, Shiyu Chang, Wei Han, Jianchao Yang, Thomas S. Huang
Deep learning has been successfully applied to image super resolution (SR).
no code implementations • 31 Mar 2015 • Zhangyang Wang, Jianchao Yang, Hailin Jin, Eli Shechtman, Aseem Agarwala, Jonathan Brandt, Thomas S. Huang
We address a challenging fine-grain classification problem: recognizing a font style from an image of text.
no code implementations • 12 Mar 2015 • Zhangyang Wang, Yingzhen Yang, Jianchao Yang, Thomas S. Huang
We study the complementary behaviors of external and internal examples in image restoration, and are motivated to formulate a composite dictionary design framework.
no code implementations • 3 Mar 2015 • Zhangyang Wang, Yingzhen Yang, Zhaowen Wang, Shiyu Chang, Jianchao Yang, Thomas S. Huang
Single image super-resolution (SR) aims to estimate a high-resolution (HR) image from a low-resolution (LR) input.
2 code implementations • 20 Dec 2014 • Tom Le Paine, Pooya Khorrami, Wei Han, Thomas S. Huang
We discover that unsupervised pre-training, as expected, helps when the ratio of unsupervised to supervised samples is high, and surprisingly, hurts when the ratio is low.
Ranked #93 on Image Classification on STL-10
no code implementations • 18 Dec 2014 • Zhangyang Wang, Jianchao Yang, Hailin Jin, Eli Shechtman, Aseem Agarwala, Jonathan Brandt, Thomas S. Huang
We present a domain adaptation framework to address a domain mismatch between synthetic training and real-world testing data.
no code implementations • NeurIPS 2014 • Yingzhen Yang, Feng Liang, Shuicheng Yan, Zhangyang Wang, Thomas S. Huang
Modeling the underlying data distribution by nonparametric kernel density estimation, the generalization error bounds for both unsupervised nonparametric classifiers are the sum of nonparametric pairwise similarity terms between the data points for the purpose of clustering.
no code implementations • CVPR 2013 • Zhen Li, Shiyu Chang, Feng Liang, Thomas S. Huang, Liangliang Cao, John R. Smith
This paper proposes to learn a decision function for verification that can be viewed as a joint model of a distance metric and a locally adaptive thresholding rule.
no code implementations • 2 Oct 2012 • Yingzhen Yang, Thomas S. Huang
Unsupervised classification methods learn a discriminative classifier from unlabeled data, which has been proven to be an effective way of simultaneously clustering the data and training a classifier from the data.
no code implementations • NeurIPS 2011 • Zhen Li, Huazhong Ning, Liangliang Cao, Tong Zhang, Yihong Gong, Thomas S. Huang
Traditional approaches relied on algorithmic constructions that are often data independent (such as Locality Sensitive Hashing) or weakly dependent (such as kd-trees, k-means trees).
no code implementations • IEEE Transactions on Pattern Analysis and Machine Intelligence 2011 • Deng Cai, Xiaofei He, Jiawei Han, Thomas S. Huang
In GNMF, an affinity graph is constructed to encode the geometrical information, and we seek a matrix factorization that respects the graph structure.
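As a rough illustration of what "respecting the graph structure" can mean, a graph-regularized factorization objective is commonly written as below. The symbols are assumptions introduced here, not taken from the listing: $X$ is the nonnegative data matrix with one sample per column, $U$ and $V$ are the nonnegative factors, $W$ is the affinity matrix of the graph, $D$ is its diagonal degree matrix, and $\lambda \ge 0$ weights the regularizer.

```latex
\min_{U \ge 0,\; V \ge 0} \;
  \lVert X - U V^{\top} \rVert_F^{2}
  \;+\; \lambda \, \operatorname{Tr}\!\left( V^{\top} L V \right),
\qquad L = D - W
```

The trace term is small when samples that are strongly connected in the affinity graph receive similar rows in $V$, which is how the graph's geometry enters the factorization.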
no code implementations • 21 Apr 2010 • Mithun Das Gupta, Thomas S. Huang
In this work, we investigate the relationship between Bregman distances and the regularized logistic regression model.