Search Results for author: Hongyang Chao

Found 32 papers, 12 papers with code

TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance

1 code implementation • ICCV 2023 • Kan Wu, Houwen Peng, Zhenghong Zhou, Bin Xiao, Mengchen Liu, Lu Yuan, Hong Xuan, Michael Valenzuela, Xi Chen, Xinggang Wang, Hongyang Chao, Han Hu

In this paper, we propose a novel cross-modal distillation method, called TinyCLIP, for large-scale language-image pre-trained models.

1,563

Learning Profitable NFT Image Diffusions via Multiple Visual-Policy Guided Reinforcement Learning

no code implementations • 20 Jun 2023 • Huiguo He, Tianfu Wang, Huan Yang, Jianlong Fu, Nicholas Jing Yuan, Jian Yin, Hongyang Chao, Qi Zhang

The proposed framework consists of a large language model (LLM), a diffusion-based image generator, and a series of visual rewards by design.

Attribute • Image Generation • +3

Semantic-Conditional Diffusion Networks for Image Captioning

1 code implementation • CVPR 2023 • Jianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Jianlin Feng, Hongyang Chao, Tao Mei

The rich semantics are further regarded as semantic prior to trigger the learning of Diffusion Transformer, which produces the output sentence in a diffusion process.

Cross-Modal Retrieval • Image Captioning • +3

1,004

Out-of-Distribution Detection with Hilbert-Schmidt Independence Optimization

1 code implementation • 26 Sep 2022 • Jingyang Lin, Yu Wang, Qi Cai, Yingwei Pan, Ting Yao, Hongyang Chao, Tao Mei

Existing works attempt to solve the problem by explicitly imposing uncertainty on classifiers when OOD inputs are exposed to the classifier during training.

Outlier Detection • Out-of-Distribution Detection • +1

CORE-Text: Improving Scene Text Detection with Contrastive Relational Reasoning

1 code implementation • 14 Dec 2021 • Jingyang Lin, Yingwei Pan, Rongfeng Lai, Xuehang Yang, Hongyang Chao, Ting Yao

In this work, we quantitatively analyze the sub-text problem and present a simple yet effective design, COntrastive RElation (CORE) module, to mitigate that issue.

Relation • Relational Reasoning • +2

CoCo-BERT: Improving Video-Language Pre-training with Contrastive Cross-modal Matching and Denoising

no code implementations • 14 Dec 2021 • Jianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Hongyang Chao, Tao Mei

BERT-type structure has led to the revolution of vision-language pre-training and the achievement of state-of-the-art results on numerous vision-language downstream tasks.

Cross-Modal Retrieval • Denoising • +6

Searching the Search Space of Vision Transformer

2 code implementations • NeurIPS 2021 • Minghao Chen, Kan Wu, Bolin Ni, Houwen Peng, Bei Liu, Jianlong Fu, Hongyang Chao, Haibin Ling

Vision Transformer has shown great visual representation power in substantial vision tasks such as recognition and detection, and thus been attracting fast-growing efforts on manually designing more effective architectures.

Neural Architecture Search • object-detection • +4

1,562

Improving Visual Quality of Image Synthesis by A Token-based Generator with Transformers

no code implementations • NeurIPS 2021 • Yanhong Zeng, Huan Yang, Hongyang Chao, Jianbo Wang, Jianlong Fu

Given a sequence of style tokens, the TokenGAN is able to control the image synthesis by assigning the styles to the content tokens by attention mechanism with a Transformer.

Image Generation

Reference-based Defect Detection Network

no code implementations • 10 Aug 2021 • Zhaoyang Zeng, Bei Liu, Jianlong Fu, Hongyang Chao

To solve the partial visual confusion issue, we propose to leverage the carried context information of context reference, which is the concentric bigger box of each region proposal, to perform more accurate region classification and regression.

Defect Detection • object-detection • +2

A Low Rank Promoting Prior for Unsupervised Contrastive Learning

no code implementations • 5 Aug 2021 • Yu Wang, Jingyang Lin, Qi Cai, Yingwei Pan, Ting Yao, Hongyang Chao, Tao Mei

In this paper, we construct a novel probabilistic graphical model that effectively incorporates the low rank promoting prior into the framework of contrastive learning, referred to as LORAC.

Contrastive Learning • Image Classification • +5

Rethinking and Improving Relative Position Encoding for Vision Transformer

1 code implementation • ICCV 2021 • Kan Wu, Houwen Peng, Minghao Chen, Jianlong Fu, Hongyang Chao

We then propose new relative position encoding methods dedicated to 2D images, called image RPE (iRPE).

Ranked #140 on Object Detection on COCO minival

Image Classification • Object Detection • +1

1,562

3D Human Body Reshaping with Anthropometric Modeling

1 code implementation • 5 Apr 2021 • Yanhong Zeng, Jianlong Fu, Hongyang Chao

First, we calculate full-body anthropometric parameters from limited user inputs by imputation technique, and thus essential anthropometric parameters for 3D body reshaping can be obtained.

feature selection • Imputation • +1

333

Aggregated Contextual Transformations for High-Resolution Image Inpainting

2 code implementations • 3 Apr 2021 • Yanhong Zeng, Jianlong Fu, Hongyang Chao, Baining Guo

For improving texture synthesis, we enhance the discriminator of AOT-GAN by training it with a tailored mask-prediction task.

Ranked #9 on Image Inpainting on Places2

Image Inpainting • Texture Synthesis • +1

4,198

Learning Joint Spatial-Temporal Transformations for Video Inpainting

2 code implementations • ECCV 2020 • Yanhong Zeng, Jianlong Fu, Hongyang Chao

In this paper, we propose to learn a joint Spatial-Temporal Transformer Network (STTN) for video inpainting.

Ranked #5 on Seeing Beyond the Visible on KITTI360-EX

Seeing Beyond the Visible • Video Inpainting

435

Mining Domain Knowledge: Improved Framework towards Automatically Standardizing Anatomical Structure Nomenclature in Radiotherapy

no code implementations • 4 Dec 2019 • Qiming Yang, Hongyang Chao, Dan Nguyen, Steve Jiang

To solve these problems, we propose an automated structure nomenclature standardization framework, 3D Non-local Network with Voting (3DNNV).

Endowing Deep 3D Models with Rotation Invariance Based on Principal Component Analysis

no code implementations • 20 Oct 2019 • Zelin Xiao, Hongxin Lin, Renjie Li, Hongyang Chao, Shengyong Ding

Interestingly, the principal component analysis exactly provides an effective way to define such a frame, i.e., setting the principal components as the frame axes.

Object Retrieval

WSOD2: Learning Bottom-up and Top-down Objectness Distillation for Weakly-supervised Object Detection

no code implementations • ICCV 2019 • Zhaoyang Zeng, Bei Liu, Jianlong Fu, Hongyang Chao, Lei Zhang

We study on weakly-supervised object detection (WSOD) which plays a vital role in relieving human involvement from object-level annotations.

Ranked #6 on Weakly Supervised Object Detection on PASCAL VOC 2007

Object • object-detection • +2

WSOD^2: Learning Bottom-up and Top-down Objectness Distillation for Weakly-supervised Object Detection

1 code implementation • 11 Sep 2019 • Zhaoyang Zeng, Bei Liu, Jianlong Fu, Hongyang Chao, Lei Zhang

We study on weakly-supervised object detection (WSOD) which plays a vital role in relieving human involvement from object-level annotations.

Object • object-detection • +3

Deep Metric Learning with Density Adaptivity

no code implementations • 9 Sep 2019 • Yehao Li, Ting Yao, Yingwei Pan, Hongyang Chao, Tao Mei

The problem of distance metric learning is mostly considered from the perspective of learning an embedding space, where the distances between pairs of examples are in correspondence with a similarity metric.

Metric Learning

Justlookup: One Millisecond Deep Feature Extraction for Point Clouds By Lookup Tables

no code implementations • 14 Aug 2019 • Hongxin Lin, Zelin Xiao, Yang Tan, Hongyang Chao, Shengyong Ding

Deep models are capable of fitting complex high dimensional functions while usually yielding large computation load.

Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning

1 code implementation • 3 May 2019 • Jingwen Chen, Yingwei Pan, Yehao Li, Ting Yao, Hongyang Chao, Tao Mei

Moreover, the inherently recurrent dependency in RNN prevents parallelization within a sequence during training and therefore limits the computations.

Sentence • Video Captioning

Pointing Novel Objects in Image Captioning

no code implementations • CVPR 2019 • Yehao Li, Ting Yao, Yingwei Pan, Hongyang Chao, Tao Mei

Image captioning has received significant attention with remarkable improvements in recent advances.

Image Captioning • Object • +2

Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting

2 code implementations • CVPR 2019 • Yanhong Zeng, Jianlong Fu, Hongyang Chao, Baining Guo

As the missing content can be filled by attention transfer from deep to shallow in a pyramid fashion, both visual and semantic coherence for image inpainting can be ensured.

Image Inpainting • Vocal Bursts Intensity Prediction

349

Face Recognition from Sequential Sparse 3D Data via Deep Registration

no code implementations • 23 Oct 2018 • Yang Tan, Hongxin Lin, Zelin Xiao, Shengyong Ding, Hongyang Chao

However, such devices only provide sparse (limited speckles in structured light system) and noisy 3D data which cannot support face recognition directly.

Face Recognition

Image Blind Denoising With Generative Adversarial Network Based Noise Modeling

no code implementations • CVPR 2018 • Jingwen Chen, Jia-Wei Chen, Hongyang Chao, Ming Yang

In this paper, we consider a typical image blind denoising problem, which is to remove unknown noise from noisy images.

Denoising • Generative Adversarial Network

Jointly Localizing and Describing Events for Dense Video Captioning

no code implementations • CVPR 2018 • Yehao Li, Ting Yao, Yingwei Pan, Hongyang Chao, Tao Mei

A valid question is how to temporally localize and then describe events, which is known as "dense video captioning."

Attribute • Dense Video Captioning • +3

Deep Joint Face Hallucination and Recognition

no code implementations • 24 Nov 2016 • Junyu Wu, Shengyong Ding, Wei Xu, Hongyang Chao

However, we observe that directly feeding the hallucinated facial images into recognition models can even degrade the recognition performance despite the much better visualization quality.

Face Hallucination • Face Recognition • +1

Automatically Building Face Datasets of New Domains from Weakly Labeled Data with Pretrained Models

no code implementations • 24 Nov 2016 • Shengyong Ding, Junyu Wu, Wei Xu, Hongyang Chao

In this paper, we propose a method to automatically and incrementally construct datasets from massive weakly labeled data of the target domain which are readily available on the Internet under the help of a pretrained face model.

Face Model • Face Recognition

One-to-Many Network for Visually Pleasing Compression Artifacts Reduction

no code implementations • CVPR 2017 • Jun Guo, Hongyang Chao

We consider the compression artifacts reduction problem, where a compressed image is transformed into an artifact-free image.

Joint Multiview Segmentation and Localization of RGB-D Images Using Depth-Induced Silhouette Consistency

no code implementations • CVPR 2016 • Chi Zhang, Zhiwei Li, Rui Cai, Hongyang Chao, Yong Rui

In this paper, we propose an RGB-D camera localization approach which takes an effective geometry constraint, i.e., silhouette consistency, into consideration.

Camera Localization • Image Segmentation • +2