Search Results for author: Kang Zhao

Found 22 papers, 7 papers with code

Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization

no code implementations 19 Mar 2024 Haocheng Xi, Yuxiang Chen, Kang Zhao, Kaijun Zheng, Jianfei Chen, Jun Zhu

Moreover, for a standard transformer block, our method offers an end-to-end training speedup of 1.42x and a 1.49x memory reduction compared to the FP16 baseline.

Quantization

AE-NeRF: Audio Enhanced Neural Radiance Field for Few Shot Talking Head Synthesis

no code implementations 18 Dec 2023 Dongze Li, Kang Zhao, Wei Wang, Bo Peng, Yingya Zhang, Jing Dong, Tieniu Tan

Audio-driven talking head synthesis is a promising topic with wide applications in digital human, film making and virtual reality.

Talking Head Generation

I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models

3 code implementations 7 Nov 2023 Shiwei Zhang, Jiayu Wang, Yingya Zhang, Kang Zhao, Hangjie Yuan, Zhiwu Qin, Xiang Wang, Deli Zhao, Jingren Zhou

By this means, I2VGen-XL can simultaneously enhance the semantic accuracy, continuity of details and clarity of generated videos.

DeltaSpace: A Semantic-aligned Feature Space for Flexible Text-guided Image Editing

1 code implementation 12 Oct 2023 Yueming Lyu, Kang Zhao, Bo Peng, Yue Jiang, Yingya Zhang, Jing Dong

Based on DeltaSpace, we propose a novel framework called DeltaEdit, which maps the CLIP visual feature differences to the latent space directions of a generative model during the training phase, and predicts the latent space directions from the CLIP textual feature differences during the inference phase.

text-guided-image-editing

Freestyle 3D-Aware Portrait Synthesis Based on Compositional Generative Priors

no code implementations 27 Jun 2023 Tianxiang Ma, Kang Zhao, Jianxin Sun, Yingya Zhang, Jing Dong

Efficiently generating a freestyle 3D portrait with high quality and 3D-consistency is a promising yet challenging task.

UniMC: A Unified Framework for Long-Term Memory Conversation via Relevance Representation Learning

no code implementations 18 Jun 2023 Kang Zhao, Wei Liu, Jian Luan, Minglei Gao, Li Qian, Hanlin Teng, Bin Wang

In this paper, we propose a Unified framework for Long-term Memory Conversations (UniMC), which increases the connection between different stages by learning relevance representation.

Representation Learning Retrieval

Learning Residual Model of Model Predictive Control via Random Forests for Autonomous Driving

no code implementations 10 Apr 2023 Kang Zhao, Jianru Xue, Xiangning Meng, Gengxin Li, Mengsen Wu

One major issue in learning-based model predictive control (MPC) for autonomous driving is the contradiction between the system model's prediction accuracy and computation efficiency.

Autonomous Driving Model Predictive Control +1

RiDDLE: Reversible and Diversified De-identification with Latent Encryptor

1 code implementation CVPR 2023 Dongze Li, Wei Wang, Kang Zhao, Jing Dong, Tieniu Tan

This work presents RiDDLE, short for Reversible and Diversified De-identification with Latent Encryptor, to protect the identity information of people from being misused.

De-identification

Semi-MAE: Masked Autoencoders for Semi-supervised Vision Transformers

no code implementations 4 Jan 2023 Haojie Yu, Kang Zhao, Xiaoming Xu

To alleviate this issue, inspired by masked autoencoder (MAE), which is a data-efficient self-supervised learner, we propose Semi-MAE, a pure ViT-based SSL framework consisting of a parallel MAE branch to assist the visual representation learning and make the pseudo labels more accurate.

Representation Learning Semi-Supervised Image Classification

Consistent Representation Learning for Continual Relation Extraction

1 code implementation Findings (ACL) 2022 Kang Zhao, Hua Xu, Jiangong Yang, Kai Gao

Specifically, supervised contrastive learning based on a memory bank is first used to train each new task so that the model can effectively learn the relation representation.

Continual Relation Extraction Contrastive Learning +3

Communication Efficient SGD via Gradient Sampling With Bayes Prior

no code implementations CVPR 2021 Liuyihan Song, Kang Zhao, Pan Pan, Yu Liu, Yingya Zhang, Yinghui Xu, Rong Jin

Different from all of them, we regard large and small gradients selection as the exploitation and exploration of gradient information, respectively.

Image Classification object-detection +2

Visual Search at Alibaba

no code implementations 9 Feb 2021 Yanhao Zhang, Pan Pan, Yun Zheng, Kang Zhao, Yingya Zhang, Xiaofeng Ren, Rong Jin

We hope visual search at Alibaba becomes more widely incorporated into today's commercial applications.

Image Retrieval

Large-Scale Visual Search with Binary Distributed Graph at Alibaba

no code implementations 9 Feb 2021 Kang Zhao, Pan Pan, Yun Zheng, Yanhao Zhang, Changxu Wang, Yingya Zhang, Yinghui Xu, Rong Jin

For a deployed visual search system with several billions of online images in total, building a billion-scale offline graph in hours is essential, which is almost unachievable by most existing methods.

graph construction

Distribution Adaptive INT8 Quantization for Training CNNs

no code implementations 9 Feb 2021 Kang Zhao, Sida Huang, Pan Pan, Yinghan Li, Yingya Zhang, Zhenyu Gu, Yinghui Xu

Researches have demonstrated that low bit-width (e.g., INT8) quantization can be employed to accelerate the inference process.

Image Classification object-detection +3

Virtual ID Discovery from E-commerce Media at Alibaba: Exploiting Richness of User Click Behavior for Visual Search Relevance

no code implementations 9 Feb 2021 Yanhao Zhang, Pan Pan, Yun Zheng, Kang Zhao, Jianmin Wu, Yinghui Xu, Rong Jin

Benefiting from exploration of user click data, our networks are more effective to encode richer supervision and better distinguish real-shot images in terms of category and feature.

BOUNDARY REGULARIZED BUILDING FOOTPRINT EXTRACTION FROM SATELLITE IMAGES USING DEEP NEURAL NETWORKS

no code implementations arXiv 2020 Kang Zhao, Muhammad Kamran, Gunho Sohn

The proposed deep learning method consists of a two-stage object detection network to produce region of interest (RoI) features and a building boundary extraction network using graph models to learn geometric information of the polygon shapes.

Object object-detection +2

Boundary Regularized Building Footprint Extraction From Satellite Images Using Deep Neural Network

no code implementations 23 Jun 2020 Kang Zhao, Muhammad Kamran, Gunho Sohn

The proposed deep learning method consists of a two-stage object detection network to produce region of interest (RoI) features and a building boundary extraction network using graph models to learn geometric information of the polygon shapes.

Object object-detection +2

Early Predictions of Movie Success: the Who, What, and When of Profitability

2 code implementations 17 Jun 2015 Michael T. Lash, Kang Zhao

This paper proposes a decision support system to aid movie investment decisions at the early stage of movie productions.
