Search Results for author: Thomas Huang

Found 50 papers, 25 papers with code

Deep Image Matting

8 code implementations CVPR 2017 Ning Xu, Brian Price, Scott Cohen, Thomas Huang

We evaluate our algorithm on the image matting benchmark, our testing set, and a wide variety of real images.

Semantic Image Matting

AutoSlim: Towards One-Shot Architecture Search for Channel Numbers

10 code implementations ICLR 2020 Jiahui Yu, Thomas Huang

Notably, by setting optimized channel numbers, our AutoSlim-MobileNet-v2 at 305M FLOPs achieves 74.2% top-1 accuracy, 2.4% better than default MobileNet-v2 (301M FLOPs), and even 0.2% better than RL-searched MNasNet (317M FLOPs).

Neural Architecture Search

When AWGN-based Denoiser Meets Real Noises

2 code implementations 6 Apr 2019 Yuqian Zhou, Jianbo Jiao, Haibin Huang, Yang Wang, Jue Wang, Honghui Shi, Thomas Huang

In this paper, we propose a novel approach to boost the performance of a real image denoiser which is trained only with synthetic pixel-independent noise data dominated by AWGN.

Denoising

Slimmable Neural Networks

4 code implementations ICLR 2019 Jiahui Yu, Linjie Yang, Ning Xu, Jianchao Yang, Thomas Huang

Instead of training individual networks with different width configurations, we train a shared network with switchable batch normalization.

Instance Segmentation Keypoint Detection +3
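
The switchable batch normalization mentioned in the Slimmable Neural Networks entry above can be illustrated with a minimal sketch. It assumes that one shared network keeps a separate set of normalization statistics and affine parameters for each width configuration and selects the active set at run time. The class name, method names, and width multipliers are hypothetical and are not taken from the paper's code.

```python
import math


class SwitchableBatchNorm:
    """Hypothetical sketch: one set of BN parameters per width multiplier."""

    def __init__(self, max_channels, width_mults, eps=1e-5):
        self.eps = eps
        self.stats = {}
        for w in width_mults:
            channels = int(max_channels * w)
            # Each width keeps private statistics and affine parameters.
            self.stats[w] = {
                "mean": [0.0] * channels,
                "var": [1.0] * channels,
                "gamma": [1.0] * channels,
                "beta": [0.0] * channels,
            }
        self.active = max(width_mults)

    def switch(self, width_mult):
        """Select which set of parameters later calls will use."""
        if width_mult not in self.stats:
            raise KeyError(f"unknown width multiplier: {width_mult}")
        self.active = width_mult

    def __call__(self, x):
        """Normalize one per-channel activation vector (inference-style)."""
        s = self.stats[self.active]
        if len(x) != len(s["mean"]):
            raise ValueError("input channel count does not match active width")
        return [
            g * (v - m) / math.sqrt(var + self.eps) + b
            for v, m, var, g, b in zip(x, s["mean"], s["var"], s["gamma"], s["beta"])
        ]


bn = SwitchableBatchNorm(max_channels=4, width_mults=[0.5, 1.0])
bn.switch(0.5)                      # run the half-width sub-network
bn.stats[0.5]["mean"] = [1.0, 2.0]  # pretend these were learned at this width
half_width_out = bn([1.0, 4.0])
```

Keeping the statistics private to each width is the point of the sketch: the other layers would share weights across widths, while the normalization parameters would not.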

Universally Slimmable Networks and Improved Training Techniques

1 code implementation ICCV 2019 Jiahui Yu, Thomas Huang

We also evaluate the proposed US-Nets and improved training techniques on tasks of image super-resolution and deep reinforcement learning.

Image Super-Resolution

SkyNet: A Champion Model for DAC-SDC on Low Power Object Detection

1 code implementation 25 Jun 2019 Xiaofan Zhang, Cong Hao, Haoming Lu, Jiachen Li, Yuhong Li, Yuchen Fan, Kyle Rupnow, JinJun Xiong, Thomas Huang, Honghui Shi, Wen-mei Hwu, Deming Chen

Developing artificial intelligence (AI) at the edge is always challenging, since edge devices have limited computation capability and memory resources but need to meet demanding requirements, such as real-time processing, high throughput performance, and high inference accuracy.

object-detection Object Detection

Self-similarity Grouping: A Simple Unsupervised Cross Domain Adaptation Approach for Person Re-identification

1 code implementation ICCV 2019 Yang Fu, Yunchao Wei, Guanshuo Wang, Yuqian Zhou, Honghui Shi, Thomas Huang

Building on SSG, we further introduce a clustering-guided semi-supervised approach named SSG++ to conduct one-shot domain adaptation in an open-set setting (i.e., the number of independent identities from the target domain is unknown).

Clustering One-Shot Learning +2

Revisiting RCNN: On Awakening the Classification Power of Faster RCNN

7 code implementations ECCV 2018 Bowen Cheng, Yunchao Wei, Honghui Shi, Rogerio Feris, JinJun Xiong, Thomas Huang

Recent region-based object detectors are usually built with separate classification and localization branches on top of shared feature extraction networks.

Classification General Classification +1

SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation

1 code implementation 22 Oct 2018 Xiaolin Zhang, Yunchao Wei, Yi Yang, Thomas Huang

In this way, the possibilities embedded in the produced similarity maps can be adapted to guide the process of segmenting objects.

Few-Shot Semantic Segmentation Segmentation +1

Improving Object Detection from Scratch via Gated Feature Reuse

2 code implementations 4 Dec 2017 Zhiqiang Shen, Honghui Shi, Jiahui Yu, Hai Phan, Rogerio Feris, Liangliang Cao, Ding Liu, Xinchao Wang, Thomas Huang, Marios Savvides

In this paper, we present a simple and parameter-efficient drop-in module for one-stage object detectors like SSD when learning from scratch (i.e., without pre-trained models).

Object object-detection +1

BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models

1 code implementation ECCV 2020 Jiahui Yu, Pengchong Jin, Hanxiao Liu, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Thomas Huang, Xiaodan Song, Ruoming Pang, Quoc Le

Without extra retraining or post-processing steps, we are able to train a single set of shared weights on ImageNet and use these weights to obtain child models whose sizes range from 200 to 1000 MFLOPs.

Neural Architecture Search

Horizontal Pyramid Matching for Person Re-identification

1 code implementation 14 Apr 2018 Yang Fu, Yunchao Wei, Yuqian Zhou, Honghui Shi, Gao Huang, Xinchao Wang, Zhiqiang Yao, Thomas Huang

Despite the remarkable recent progress, person re-identification (Re-ID) approaches are still suffering from the failure cases where the discriminative body parts are missing.

Person Re-Identification

YouTube-VOS: Sequence-to-Sequence Video Object Segmentation

4 code implementations ECCV 2018 Ning Xu, Linjie Yang, Yuchen Fan, Jianchao Yang, Dingcheng Yue, Yuchen Liang, Brian Price, Scott Cohen, Thomas Huang

End-to-end sequential learning to explore spatial-temporal features for video segmentation is largely limited by the scale of available video segmentation datasets, i.e., even the largest video segmentation dataset only contains 90 short video clips.

Ranked #12 on Video Object Segmentation on YouTube-VOS 2018 (F-Measure (Unseen) metric)

Image Segmentation Object +7

Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction

1 code implementation ICLR 2021 Wonkwang Lee, Whie Jung, Han Zhang, Ting Chen, Jing Yu Koh, Thomas Huang, Hyungsuk Yoon, Honglak Lee, Seunghoon Hong

Despite the recent advances in the literature, existing approaches are limited to moderately short-term prediction (less than a few seconds), while extrapolating it to a longer future quickly leads to destruction in structure and content.

Translation Video Prediction

Adversarial Complementary Learning for Weakly Supervised Object Localization

2 code implementations CVPR 2018 Xiaolin Zhang, Yunchao Wei, Jiashi Feng, Yi Yang, Thomas Huang

With such adversarial learning, the two parallel classifiers are forced to leverage complementary object regions for classification and can finally generate integral object localization together.

General Classification Object +1

Unsupervised Representation Adversarial Learning Network: from Reconstruction to Generation

1 code implementation 19 Apr 2018 Yuqian Zhou, Kuangxiao Gu, Thomas Huang

The newly proposed RepGAN is tested on MNIST, fashionMNIST, CelebA, and SVHN datasets to perform unsupervised classification, generation and reconstruction tasks.

Clustering General Classification

Survey of Face Detection on Low-quality Images

no code implementations 19 Apr 2018 Yuqian Zhou, Ding Liu, Thomas Huang

However, previously proposed models are mostly trained and tested on good-quality images, which are not always available in practical applications like surveillance systems.

Face Detection Robust Design

Deep GrabCut for Object Selection

no code implementations 2 Jul 2017 Ning Xu, Brian Price, Scott Cohen, Jimei Yang, Thomas Huang

In this paper, we propose a novel segmentation approach that uses a rectangle as a soft constraint by transforming it into a Euclidean distance map.

Instance Segmentation Interactive Segmentation +3
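
As a rough illustration of the rectangle-to-distance-map step described in the Deep GrabCut entry above, the sketch below computes, for every pixel, the unsigned Euclidean distance to the boundary of a user-drawn rectangle. The function name and the inclusive pixel-coordinate convention are assumptions made for this example. The listing does not say whether the paper's map is signed, truncated, or normalized, so none of that is modeled here.

```python
import math


def rect_distance_map(height, width, top, left, bottom, right):
    """Unsigned Euclidean distance from each pixel to the rectangle boundary.

    The rectangle is given in inclusive pixel coordinates (hypothetical
    convention for this sketch).
    """
    dist = []
    for y in range(height):
        row = []
        for x in range(width):
            # Per-axis offset to the filled rectangle (0 when inside on that axis).
            dx = max(left - x, 0, x - right)
            dy = max(top - y, 0, y - bottom)
            if dx > 0 or dy > 0:
                # Outside: distance to the nearest point of the rectangle.
                d = math.hypot(dx, dy)
            else:
                # Inside or on the boundary: distance to the nearest edge.
                d = min(x - left, right - x, y - top, bottom - y)
            row.append(float(d))
        dist.append(row)
    return dist


# A 5x5 image with a rectangle covering rows 1..3 and columns 1..3.
distance_map = rect_distance_map(5, 5, top=1, left=1, bottom=3, right=3)
```

A map of this kind could be stacked with the image as an extra input channel, so that the rectangle informs the network without forcing the object to lie strictly inside it.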

Joint Intermodal and Intramodal Label Transfers for Extremely Rare or Unseen Classes

no code implementations 22 Mar 2017 Guo-Jun Qi, Wei Liu, Charu Aggarwal, Thomas Huang

One of our goals in this paper is to develop a model for revealing the functional relationships between text and image features so as to directly transfer intermodal and intramodal labels to annotate the images.

General Classification Image Classification +3

Learning a Mixture of Deep Networks for Single Image Super-Resolution

no code implementations 3 Jan 2017 Ding Liu, Zhaowen Wang, Nasser Nasrabadi, Thomas Huang

This paper proposes the method of learning a mixture of SR inference modules in a unified framework to tackle this problem.

Image Super-Resolution

UnitBox: An Advanced Object Detection Network

no code implementations 4 Aug 2016 Jiahui Yu, Yuning Jiang, Zhangyang Wang, Zhimin Cao, Thomas Huang

In present object detection systems, the deep convolutional neural networks (CNNs) are utilized to predict bounding boxes of object candidates, and have gained performance advantages over the traditional region proposal methods.

Face Detection Object +3

Deep Networks for Image Super-Resolution with Sparse Prior

no code implementations ICCV 2015 Zhaowen Wang, Ding Liu, Jianchao Yang, Wei Han, Thomas Huang

We show that a sparse coding model particularly designed for super-resolution can be incarnated as a neural network, and trained in a cascaded structure from end to end.

Image Restoration Image Super-Resolution

Scalable Similarity Learning using Large Margin Neighborhood Embedding

no code implementations 24 Apr 2014 Zhaowen Wang, Jianchao Yang, Zhe Lin, Jonathan Brandt, Shiyu Chang, Thomas Huang

In this paper, we present an image similarity learning method that can scale well in both the number of images and the dimensionality of image descriptors.

Metric Learning

GPU Asynchronous Stochastic Gradient Descent to Speed Up Neural Network Training

no code implementations 21 Dec 2013 Thomas Paine, Hailin Jin, Jianchao Yang, Zhe Lin, Thomas Huang

The ability to train large-scale neural networks has resulted in state-of-the-art performance in many areas of computer vision.

TS2C: Tight Box Mining with Surrounding Segmentation Context for Weakly Supervised Object Detection

no code implementations ECCV 2018 Yunchao Wei, Zhiqiang Shen, Bowen Cheng, Honghui Shi, JinJun Xiong, Jiashi Feng, Thomas Huang

This work provides a simple approach to discover tight object bounding boxes with only image-level supervision, called Tight box mining with Surrounding Segmentation Context (TS2C).

Multiple Instance Learning Object +4

STA: Spatial-Temporal Attention for Large-Scale Video-based Person Re-Identification

no code implementations 9 Nov 2018 Yang Fu, Xiaoyang Wang, Yunchao Wei, Thomas Huang

Thus, a more robust clip-level feature representation can be generated according to a weighted sum operation guided by the mined 2-D attention score matrix.

Large-Scale Person Re-Identification Video-Based Person Re-Identification
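
The weighted-sum pooling described in the STA entry above can be sketched as follows. The assumptions, none of which are confirmed by the listing, are that the 2-D attention score matrix is indexed by frame and spatial region, that scores are normalized across frames for each region, and that the pooled region features are concatenated into the clip-level feature. The function name is hypothetical.

```python
def clip_feature(features, scores):
    """Pool frame-level region features into one clip-level vector.

    features[t][k] is the feature vector (list of floats) of region k in frame t.
    scores[t][k] is a non-negative attention score for that frame and region.
    """
    num_frames = len(features)
    num_regions = len(features[0])
    dim = len(features[0][0])
    clip = []
    for k in range(num_regions):
        # Normalize the scores of region k across all frames (assumed scheme).
        total = sum(scores[t][k] for t in range(num_frames))
        pooled = [0.0] * dim
        for t in range(num_frames):
            weight = scores[t][k] / total
            for d in range(dim):
                pooled[d] += weight * features[t][k][d]
        # Concatenate the pooled region features.
        clip.extend(pooled)
    return clip


# Two frames, one region, 2-D features; the second frame gets 3x the weight.
example = clip_feature(
    features=[[[1.0, 0.0]], [[3.0, 2.0]]],
    scores=[[1.0], [3.0]],
)
```

With these inputs the normalized weights are 0.25 and 0.75, so frames with higher attention scores contribute more to the clip-level feature.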

BodyPrint: Pose Invariant 3D Shape Matching of Human Bodies

no code implementations ICCV 2015 Jiangping Wang, Kai Ma, Vivek Kumar Singh, Thomas Huang, Terrence Chen

3D human body shape matching has great potential in many real-world applications, especially with the recent advances in 3D range sensing technology.

Robust Video Super-Resolution With Learned Temporal Dynamics

no code implementations ICCV 2017 Ding Liu, Zhaowen Wang, Yuchen Fan, Xian-Ming Liu, Zhangyang Wang, Shiyu Chang, Thomas Huang

Second, we reduce the complexity of motion between neighboring frames using a spatial alignment network that is much more robust and efficient than competing alignment methods and can be jointly trained with the temporal adaptive network in an end-to-end manner.

Relation Video Super-Resolution

Towards Instance-level Image-to-Image Translation

no code implementations CVPR 2019 Zhiqiang Shen, Mingyang Huang, Jianping Shi, Xiangyang Xue, Thomas Huang

The proposed INIT exhibits three important advantages: (1) the instance-level objective loss can help learn a more accurate reconstruction and incorporate diverse attributes of objects; (2) the styles used for the target domain of local/global areas come from the corresponding spatial regions in the source domain, which is intuitively a more reasonable mapping; (3) the joint training process can benefit both fine and coarse granularity and incorporates instance information to improve the quality of global translation.

Attribute Image-to-Image Translation +3

High Frequency Residual Learning for Multi-Scale Image Classification

no code implementations 7 May 2019 Bowen Cheng, Rong Xiao, Jian-Feng Wang, Thomas Huang, Lei Zhang

We present a novel high frequency residual learning framework, which leads to a highly efficient multi-scale network (MSNet) architecture for mobile and embedded vision problems.

Classifier calibration General Classification +2

SPGNet: Semantic Prediction Guidance for Scene Parsing

no code implementations ICCV 2019 Bowen Cheng, Liang-Chieh Chen, Yunchao Wei, Yukun Zhu, Zilong Huang, JinJun Xiong, Thomas Huang, Wen-mei Hwu, Honghui Shi

The multi-scale context module refers to the operations to aggregate feature responses from a large spatial extent, while the single-stage encoder-decoder structure encodes the high-level semantic information in the encoder path and recovers the boundary information in the decoder path.

Pose Estimation Scene Parsing +2

FLNet: Landmark Driven Fetching and Learning Network for Faithful Talking Facial Animation Synthesis

no code implementations 21 Nov 2019 Kuangxiao Gu, Yuqian Zhou, Thomas Huang

In this paper, we present a landmark-driven two-stream network to generate faithful talking facial animation, in which more facial details are created, preserved, and transferred from multiple source images instead of a single one.

Face Generation

FOAL: Fast Online Adaptive Learning for Cardiac Motion Estimation

no code implementations CVPR 2020 Hanchao Yu, Shanhui Sun, Haichao Yu, Xiao Chen, Honghui Shi, Thomas Huang, Terrence Chen

In clinical deployment, however, they suffer dramatic performance drops due to mismatched distributions between training and testing datasets, commonly encountered in the clinical environment.

Anatomy Motion Estimation

Image Segmentation of Zona-Ablated Human Blastocysts

no code implementations 19 Aug 2020 Md Yousuf Harun, M. Arifur Rahman, Joshua Mellinger, Willy Chang, Thomas Huang, Brienne Walker, Kristen Hori, Aaron T. Ohta

Automating human preimplantation embryo grading offers the potential for higher success rates with in vitro fertilization (IVF) by providing new quantitative and objective measures of embryo quality.

Image Segmentation Semantic Segmentation

Inner Cell Mass and Trophectoderm Segmentation in Human Blastocyst Images using Deep Neural Network

no code implementations 19 Aug 2020 Md Yousuf Harun, Thomas Huang, Aaron T. Ohta

Embryo quality assessment based on morphological attributes is important for achieving higher pregnancy rates from in vitro fertilization (IVF).

Segmentation

Scaling Up Neural Architecture Search with Big Single-Stage Models

no code implementations 25 Sep 2019 Jiahui Yu, Pengchong Jin, Hanxiao Liu, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Thomas Huang, Xiaodan Song, Quoc Le

In this work, we propose BigNAS, an approach that simplifies this workflow and scales up neural architecture search to target a wide range of model sizes simultaneously.

Neural Architecture Search

YouTube-VOS: A Large-Scale Video Object Segmentation Benchmark

no code implementations 6 Sep 2018 Ning Xu, Linjie Yang, Yuchen Fan, Dingcheng Yue, Yuchen Liang, Jianchao Yang, Thomas Huang

End-to-end sequential learning to explore spatial-temporal features for video segmentation is largely limited by the scale of available video segmentation datasets, i.e., even the largest video segmentation dataset only contains 90 short video clips.

Image Segmentation Object +6

AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning

no code implementations 20 Feb 2024 Qiao Jin, Zhizheng Wang, Yifan Yang, Qingqing Zhu, Donald Wright, Thomas Huang, W John Wilbur, Zhe He, Andrew Taylor, Qingyu Chen, Zhiyong Lu

Clinical calculators play a vital role in healthcare by offering accurate evidence-based predictions for various purposes such as prognosis.
