TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance

1 code implementation ICCV 2023 Kan Wu, Houwen Peng, Zhenghong Zhou, Bin Xiao, Mengchen Liu, Lu Yuan, Hong Xuan, Michael Valenzuela, Xi Chen, Xinggang Wang, Hongyang Chao, Han Hu

In this paper, we propose a novel cross-modal distillation method, called TinyCLIP, for large-scale language-image pre-trained models.

Semantic-Conditional Diffusion Networks for Image Captioning

1 code implementation CVPR 2023 Jianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Jianlin Feng, Hongyang Chao, Tao Mei

The rich semantics are further regarded as a semantic prior to trigger the learning of the Diffusion Transformer, which produces the output sentence in a diffusion process.

Out-of-Distribution Detection with Hilbert-Schmidt Independence Optimization

1 code implementation 26 Sep 2022 Jingyang Lin, Yu Wang, Qi Cai, Yingwei Pan, Ting Yao, Hongyang Chao, Tao Mei

Existing works attempt to solve the problem by explicitly imposing uncertainty on classifiers when OOD inputs are exposed to the classifier during training.

CORE-Text: Improving Scene Text Detection with Contrastive Relational Reasoning

1 code implementation 14 Dec 2021 Jingyang Lin, Yingwei Pan, Rongfeng Lai, Xuehang Yang, Hongyang Chao, Ting Yao

In this work, we quantitatively analyze the sub-text problem and present a simple yet effective design, the COntrastive RElation (CORE) module, to mitigate that issue.

CoCo-BERT: Improving Video-Language Pre-training with Contrastive Cross-modal Matching and Denoising

no code implementations 14 Dec 2021 Jianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Hongyang Chao, Tao Mei

The BERT-type structure has led to a revolution in vision-language pre-training and to state-of-the-art results on numerous vision-language downstream tasks.

Searching the Search Space of Vision Transformer

1 code implementation NeurIPS 2021 Minghao Chen, Kan Wu, Bolin Ni, Houwen Peng, Bei Liu, Jianlong Fu, Hongyang Chao, Haibin Ling

Vision Transformer has shown great visual representation power in substantial vision tasks such as recognition and detection, and has thus been attracting fast-growing efforts on manually designing more effective architectures.

Improving Visual Quality of Image Synthesis by A Token-based Generator with Transformers

no code implementations NeurIPS 2021 Yanhong Zeng, Huan Yang, Hongyang Chao, Jianbo Wang, Jianlong Fu

Given a sequence of style tokens, the TokenGAN is able to control the image synthesis by assigning the styles to the content tokens through an attention mechanism with a Transformer.

Reference-based Defect Detection Network

no code implementations 10 Aug 2021 Zhaoyang Zeng, Bei Liu, Jianlong Fu, Hongyang Chao

To solve the partial visual confusion issue, we propose to leverage the context information carried by a context reference, which is a larger box concentric with each region proposal, to perform more accurate region classification and regression.

A Low Rank Promoting Prior for Unsupervised Contrastive Learning

no code implementations 5 Aug 2021 Yu Wang, Jingyang Lin, Qi Cai, Yingwei Pan, Ting Yao, Hongyang Chao, Tao Mei

In this paper, we construct a novel probabilistic graphical model that effectively incorporates the low rank promoting prior into the framework of contrastive learning, referred to as LORAC.

3D Human Body Reshaping with Anthropometric Modeling

1 code implementation 5 Apr 2021 Yanhong Zeng, Jianlong Fu, Hongyang Chao

First, we calculate full-body anthropometric parameters from limited user inputs with an imputation technique, so that the essential anthropometric parameters for 3D body reshaping can be obtained.

Aggregated Contextual Transformations for High-Resolution Image Inpainting

2 code implementations 3 Apr 2021 Yanhong Zeng, Jianlong Fu, Hongyang Chao, Baining Guo

For improving texture synthesis, we enhance the discriminator of AOT-GAN by training it with a tailored mask-prediction task.

Mining Domain Knowledge: Improved Framework towards Automatically Standardizing Anatomical Structure Nomenclature in Radiotherapy

no code implementations 4 Dec 2019 Qiming Yang, Hongyang Chao, Dan Nguyen, Steve Jiang

To solve these problems, we propose an automated structure nomenclature standardization framework, 3D Non-local Network with Voting (3DNNV).


Endowing Deep 3D Models with Rotation Invariance Based on Principal Component Analysis

no code implementations 20 Oct 2019 Zelin Xiao, Hongxin Lin, Renjie Li, Hongyang Chao, Shengyong Ding

Interestingly, principal component analysis provides an effective way to define such a frame, i.e., setting the principal components as the frame axes.
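
The idea of using principal components as frame axes can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `pca_canonical_frame` and `random_rotation`, the synthetic point cloud, and the use of NumPy are all assumptions made for illustration. Because each principal axis is only defined up to sign, the check below compares absolute coordinates; how the paper resolves that ambiguity is not stated in the excerpt above.

```python
import numpy as np


def pca_canonical_frame(points):
    """Express an (N, 3) point cloud in the frame spanned by its principal axes."""
    centered = points - points.mean(axis=0)
    # Eigenvectors of the 3x3 covariance matrix are the principal axes.
    cov = centered.T @ centered / len(points)
    _, eigvecs = np.linalg.eigh(cov)
    # eigh returns eigenvalues in ascending order; reverse the columns so the
    # first frame axis is the direction of largest variance.
    axes = eigvecs[:, ::-1]
    return centered @ axes


def random_rotation(rng):
    """Hypothetical helper: draw a random proper rotation matrix via QR."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one column to make the determinant +1
    return q


rng = np.random.default_rng(0)
# Anisotropic synthetic cloud so the three principal axes are well separated.
cloud = rng.normal(size=(500, 3)) * np.array([3.0, 2.0, 1.0])
rotated = cloud @ random_rotation(rng).T

a = pca_canonical_frame(cloud)
b = pca_canonical_frame(rotated)
# Axes are defined only up to sign, so compare absolute coordinates.
print(np.allclose(np.abs(a), np.abs(b), atol=1e-8))
```

The sketch relies on the eigenvalues of the covariance matrix being distinct; for a nearly symmetric shape the axes become ill-defined and the canonical frame is unstable.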


WSOD^2: Learning Bottom-up and Top-down Objectness Distillation for Weakly-supervised Object Detection

1 code implementation 11 Sep 2019 Zhaoyang Zeng, Bei Liu, Jianlong Fu, Hongyang Chao, Lei Zhang

We study weakly-supervised object detection (WSOD), which plays a vital role in relieving humans of the burden of object-level annotation.

Deep Metric Learning with Density Adaptivity

no code implementations 9 Sep 2019 Yehao Li, Ting Yao, Yingwei Pan, Hongyang Chao, Tao Mei

The problem of distance metric learning is mostly considered from the perspective of learning an embedding space, where the distances between pairs of examples are in correspondence with a similarity metric.

Justlookup: One Millisecond Deep Feature Extraction for Point Clouds By Lookup Tables

no code implementations 14 Aug 2019 Hongxin Lin, Zelin Xiao, Yang Tan, Hongyang Chao, Shengyong Ding

Deep models are capable of fitting complex high-dimensional functions, but usually at the cost of a large computational load.

Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning

1 code implementation 3 May 2019 Jingwen Chen, Yingwei Pan, Yehao Li, Ting Yao, Hongyang Chao, Tao Mei

Moreover, the inherently recurrent dependency in RNN prevents parallelization within a sequence during training and therefore limits the computations.

Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting

2 code implementations CVPR 2019 Yanhong Zeng, Jianlong Fu, Hongyang Chao, Baining Guo

As the missing content can be filled by attention transfer from deep to shallow in a pyramid fashion, both visual and semantic coherence for image inpainting can be ensured.

Face Recognition from Sequential Sparse 3D Data via Deep Registration

no code implementations 23 Oct 2018 Yang Tan, Hongxin Lin, Zelin Xiao, Shengyong Ding, Hongyang Chao

However, such devices only provide sparse (limited speckles in a structured light system) and noisy 3D data, which cannot support face recognition directly.

Image Blind Denoising With Generative Adversarial Network Based Noise Modeling

no code implementations CVPR 2018 Jingwen Chen, Jia-Wei Chen, Hongyang Chao, Ming Yang

In this paper, we consider a typical image blind denoising problem, which is to remove unknown noise from noisy images.


Deep Joint Face Hallucination and Recognition

no code implementations 24 Nov 2016 Junyu Wu, Shengyong Ding, Wei Xu, Hongyang Chao

However, we observe that directly feeding the hallucinated facial images into recognition models can even degrade the recognition performance despite the much better visualization quality.

Automatically Building Face Datasets of New Domains from Weakly Labeled Data with Pretrained Models

no code implementations 24 Nov 2016 Shengyong Ding, Junyu Wu, Wei Xu, Hongyang Chao

In this paper, we propose a method to automatically and incrementally construct datasets from massive weakly labeled data of the target domain, which are readily available on the Internet, with the help of a pretrained face model.

One-to-Many Network for Visually Pleasing Compression Artifacts Reduction

no code implementations CVPR 2017 Jun Guo, Hongyang Chao

We consider the compression artifacts reduction problem, where a compressed image is transformed into an artifact-free image.

Joint Multiview Segmentation and Localization of RGB-D Images Using Depth-Induced Silhouette Consistency

no code implementations CVPR 2016 Chi Zhang, Zhiwei Li, Rui Cai, Hongyang Chao, Yong Rui

In this paper, we propose an RGB-D camera localization approach which takes an effective geometry constraint, i.e., silhouette consistency, into consideration.

Deep Feature Learning with Relative Distance Comparison for Person Re-identification

no code implementations 11 Dec 2015 Shengyong Ding, Liang Lin, Guangrun Wang, Hongyang Chao

Identifying the same individual across different scenes is an important yet difficult task in intelligent video surveillance.

