no code implementations • 31 Dec 2024 • Jianjie Luo, Jingwen Chen, Yehao Li, Yingwei Pan, Jianlin Feng, Hongyang Chao, Ting Yao
Additionally, to facilitate the model training with synthetic data, a novel CLIP-weighted cross-entropy loss is devised to prioritize the high-quality image-text pairs over the low-quality counterparts.
no code implementations • 28 Nov 2024 • Huiguo He, Qiuyue Wang, Yuan Zhou, Yuxuan Cai, Hongyang Chao, Jian Yin, Huan Yang
This ensures that subjects in the target image can better reference those in the reference image, thereby maintaining better consistency.
no code implementations • 17 Jul 2024 • Huiguo He, Huan Yang, Zixi Tuo, Yuan Zhou, Qiuyue Wang, Yuhang Zhang, Zeyu Liu, Wenhao Huang, Hongyang Chao, Jian Yin
DreamStory consists of (1) an LLM acting as a story director and (2) an innovative Multi-Subject consistent Diffusion model (MSD) for generating multiple consistent subjects across the images.
1 code implementation • ICCV 2023 • Kan Wu, Houwen Peng, Zhenghong Zhou, Bin Xiao, Mengchen Liu, Lu Yuan, Hong Xuan, Michael Valenzuela, Xi Chen, Xinggang Wang, Hongyang Chao, Han Hu
In this paper, we propose a novel cross-modal distillation method, called TinyCLIP, for large-scale language-image pre-trained models.
no code implementations • 20 Jun 2023 • Huiguo He, Tianfu Wang, Huan Yang, Jianlong Fu, Nicholas Jing Yuan, Jian Yin, Hongyang Chao, Qi Zhang
The proposed framework consists of a large language model (LLM), a diffusion-based image generator, and a series of visual rewards by design.
1 code implementation • CVPR 2023 • Jianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Jianlin Feng, Hongyang Chao, Tao Mei
The rich semantics are further regarded as a semantic prior to trigger the learning of the Diffusion Transformer, which produces the output sentence in a diffusion process.
1 code implementation • 26 Sep 2022 • Jingyang Lin, Yu Wang, Qi Cai, Yingwei Pan, Ting Yao, Hongyang Chao, Tao Mei
Existing works attempt to solve the problem by explicitly imposing uncertainty on classifiers when OOD inputs are exposed to the classifier during training.
1 code implementation • 14 Dec 2021 • Jingyang Lin, Yingwei Pan, Rongfeng Lai, Xuehang Yang, Hongyang Chao, Ting Yao
In this work, we quantitatively analyze the sub-text problem and present a simple yet effective design, the COntrastive RElation (CORE) module, to mitigate that issue.
no code implementations • 14 Dec 2021 • Jianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Hongyang Chao, Tao Mei
The BERT-type structure has led to a revolution in vision-language pre-training and to state-of-the-art results on numerous vision-language downstream tasks.
2 code implementations • NeurIPS 2021 • Minghao Chen, Kan Wu, Bolin Ni, Houwen Peng, Bei Liu, Jianlong Fu, Hongyang Chao, Haibin Ling
Vision Transformer has shown great visual representation power in substantial vision tasks such as recognition and detection, and has thus been attracting fast-growing efforts on manually designing more effective architectures.
no code implementations • NeurIPS 2021 • Yanhong Zeng, Huan Yang, Hongyang Chao, Jianbo Wang, Jianlong Fu
Given a sequence of style tokens, TokenGAN is able to control the image synthesis by assigning the styles to the content tokens via the attention mechanism of a Transformer.
no code implementations • 10 Aug 2021 • Zhaoyang Zeng, Bei Liu, Jianlong Fu, Hongyang Chao
To solve the partial visual confusion issue, we propose to leverage the carried context information of context reference, which is the concentric bigger box of each region proposal, to perform more accurate region classification and regression.
no code implementations • 5 Aug 2021 • Yu Wang, Jingyang Lin, Qi Cai, Yingwei Pan, Ting Yao, Hongyang Chao, Tao Mei
In this paper, we construct a novel probabilistic graphical model, referred to as LORAC, that effectively incorporates a low-rank-promoting prior into the framework of contrastive learning.
1 code implementation • ICCV 2021 • Kan Wu, Houwen Peng, Minghao Chen, Jianlong Fu, Hongyang Chao
We then propose new relative position encoding methods dedicated to 2D images, called image RPE (iRPE).
Ranked #153 on Object Detection on COCO minival
1 code implementation • 5 Apr 2021 • Yanhong Zeng, Jianlong Fu, Hongyang Chao
First, we calculate full-body anthropometric parameters from limited user inputs using an imputation technique, so that the essential anthropometric parameters for 3D body reshaping can be obtained.
2 code implementations • 3 Apr 2021 • Yanhong Zeng, Jianlong Fu, Hongyang Chao, Baining Guo
For improving texture synthesis, we enhance the discriminator of AOT-GAN by training it with a tailored mask-prediction task.
Ranked #11 on Image Inpainting on Places2
2 code implementations • ECCV 2020 • Yanhong Zeng, Jianlong Fu, Hongyang Chao
In this paper, we propose to learn a joint Spatial-Temporal Transformer Network (STTN) for video inpainting.
Ranked #5 on Seeing Beyond the Visible on KITTI360-EX
no code implementations • 4 Dec 2019 • Qiming Yang, Hongyang Chao, Dan Nguyen, Steve Jiang
To solve these problems, we propose an automated structure nomenclature standardization framework, 3D Non-local Network with Voting (3DNNV).
no code implementations • 20 Oct 2019 • Zelin Xiao, Hongxin Lin, Renjie Li, Hongyang Chao, Shengyong Ding
Interestingly, principal component analysis provides exactly such an effective way to define this frame, i.e., setting the principal components as the frame axes.
1 code implementation • 11 Sep 2019 • Zhaoyang Zeng, Bei Liu, Jianlong Fu, Hongyang Chao, Lei Zhang
We study weakly-supervised object detection (WSOD), which plays a vital role in relieving human involvement from object-level annotations.
no code implementations • ICCV 2019 • Zhaoyang Zeng, Bei Liu, Jianlong Fu, Hongyang Chao, Lei Zhang
We study weakly-supervised object detection (WSOD), which plays a vital role in relieving human involvement from object-level annotations.
Ranked #12 on Weakly Supervised Object Detection on PASCAL VOC 2007
no code implementations • 9 Sep 2019 • Yehao Li, Ting Yao, Yingwei Pan, Hongyang Chao, Tao Mei
The problem of distance metric learning is mostly considered from the perspective of learning an embedding space, where the distances between pairs of examples are in correspondence with a similarity metric.
no code implementations • 14 Aug 2019 • Hongxin Lin, Zelin Xiao, Yang Tan, Hongyang Chao, Shengyong Ding
Deep models are capable of fitting complex high-dimensional functions, but usually incur a heavy computational load.
1 code implementation • 3 May 2019 • Jingwen Chen, Yingwei Pan, Yehao Li, Ting Yao, Hongyang Chao, Tao Mei
Moreover, the inherently recurrent dependency in RNN prevents parallelization within a sequence during training and therefore limits the computations.
no code implementations • CVPR 2019 • Yehao Li, Ting Yao, Yingwei Pan, Hongyang Chao, Tao Mei
Image captioning has received significant attention with remarkable improvements in recent advances.
2 code implementations • CVPR 2019 • Yanhong Zeng, Jianlong Fu, Hongyang Chao, Baining Guo
As the missing content can be filled by attention transfer from deep to shallow in a pyramid fashion, both visual and semantic coherence for image inpainting can be ensured.
no code implementations • 23 Oct 2018 • Yang Tan, Hongxin Lin, Zelin Xiao, Shengyong Ding, Hongyang Chao
However, such devices only provide sparse (limited speckles in a structured light system) and noisy 3D data, which cannot support face recognition directly.
no code implementations • CVPR 2018 • Jingwen Chen, Jia-Wei Chen, Hongyang Chao, Ming Yang
In this paper, we consider a typical image blind denoising problem, which is to remove unknown noise from noisy images.
no code implementations • CVPR 2018 • Yehao Li, Ting Yao, Yingwei Pan, Hongyang Chao, Tao Mei
A valid question is how to temporally localize and then describe events, which is known as "dense video captioning."
no code implementations • 24 Nov 2016 • Junyu Wu, Shengyong Ding, Wei Xu, Hongyang Chao
However, we observe that directly feeding the hallucinated facial images into recognition models can even degrade the recognition performance despite the much better visualization quality.
no code implementations • 24 Nov 2016 • Shengyong Ding, Junyu Wu, Wei Xu, Hongyang Chao
In this paper, we propose a method to automatically and incrementally construct datasets from massive weakly labeled data of the target domain, which are readily available on the Internet, with the help of a pretrained face model.
no code implementations • CVPR 2017 • Jun Guo, Hongyang Chao
We consider the compression artifacts reduction problem, where a compressed image is transformed into an artifact-free image.
no code implementations • CVPR 2016 • Chi Zhang, Zhiwei Li, Rui Cai, Hongyang Chao, Yong Rui
In this paper, we propose an RGB-D camera localization approach which takes an effective geometry constraint, i.e., silhouette consistency, into consideration.
no code implementations • 11 Dec 2015 • Shengyong Ding, Liang Lin, Guangrun Wang, Hongyang Chao
Identifying the same individual across different scenes is an important yet difficult task in intelligent video surveillance.
Ranked #9 on Person Re-Identification on SYSU-30k (using extra training data)
no code implementations • ICCV 2015 • Chi Zhang, Zhiwei Li, Yanhua Cheng, Rui Cai, Hongyang Chao, Yong Rui
We present a novel global stereo model designed for view interpolation.