Search Results for author: Kunpeng Li

Found 30 papers, 16 papers with code

Transfer between Modalities with MetaQueries

no code implementations 8 Apr 2025 Xichen Pan, Satya Narayan Shukla, Aashu Singh, Zhuokai Zhao, Shlok Kumar Mishra, Jialiang Wang, Zhiyang Xu, Jiuhai Chen, Kunpeng Li, Felix Juefei-Xu, Ji Hou, Saining Xie

Unified multimodal models aim to integrate understanding (text output) and generation (pixel output), but aligning these different modalities within a single architecture often demands complex training recipes and careful data balancing.

Decoder Image Generation

MoCha: Towards Movie-Grade Talking Character Synthesis

no code implementations 30 Mar 2025 Cong Wei, Bo Sun, Haoyu Ma, Ji Hou, Felix Juefei-Xu, Zecheng He, Xiaoliang Dai, Luxin Zhang, Kunpeng Li, Tingbo Hou, Animesh Sinha, Peter Vajda, Wenhu Chen

We introduce Talking Characters, a more realistic task to generate talking character animations directly from speech and text.

Video Generation

Bayesian inference for dynamic spatial quantile models with interactive effects

no code implementations 2 Mar 2025 Tomohiro Ando, Jushan Bai, Kunpeng Li, Yong Song

The proposed model captures the dynamic structure of panel data, high-dimensional cross-sectional dependence, and allows for heterogeneous regression coefficients.

Bayesian Inference

Movie Gen: A Cast of Media Foundation Models

2 code implementations 17 Oct 2024 Adam Polyak, Amit Zohar, Andrew Brown, Andros Tjandra, Animesh Sinha, Ann Lee, Apoorv Vyas, Bowen Shi, Chih-Yao Ma, Ching-Yao Chuang, David Yan, Dhruv Choudhary, Dingkang Wang, Geet Sethi, Guan Pang, Haoyu Ma, Ishan Misra, Ji Hou, Jialiang Wang, Kiran Jagadeesh, Kunpeng Li, Luxin Zhang, Mannat Singh, Mary Williamson, Matt Le, Matthew Yu, Mitesh Kumar Singh, Peizhao Zhang, Peter Vajda, Quentin Duval, Rohit Girdhar, Roshan Sumbaly, Sai Saketh Rambhatla, Sam Tsai, Samaneh Azadi, Samyak Datta, Sanyuan Chen, Sean Bell, Sharadh Ramaswamy, Shelly Sheynin, Siddharth Bhattacharya, Simran Motwani, Tao Xu, Tianhe Li, Tingbo Hou, Wei-Ning Hsu, Xi Yin, Xiaoliang Dai, Yaniv Taigman, Yaqiao Luo, Yen-Cheng Liu, Yi-Chiao Wu, Yue Zhao, Yuval Kirstain, Zecheng He, Zijian He, Albert Pumarola, Ali Thabet, Artsiom Sanakoyeu, Arun Mallya, Baishan Guo, Boris Araya, Breena Kerr, Carleigh Wood, Ce Liu, Cen Peng, Dimitry Vengertsev, Edgar Schonfeld, Elliot Blanchard, Felix Juefei-Xu, Fraylie Nord, Jeff Liang, John Hoffman, Jonas Kohler, Kaolin Fire, Karthik Sivakumar, Lawrence Chen, Licheng Yu, Luya Gao, Markos Georgopoulos, Rashel Moritz, Sara K. Sampson, Shikai Li, Simone Parmeggiani, Steve Fine, Tara Fowler, Vladan Petrovic, Yuming Du

Our models set a new state-of-the-art on multiple tasks: text-to-video synthesis, video personalization, video editing, video-to-audio generation, and text-to-audio generation.

Audio Generation Video Editing +1

ControlRoom3D: Room Generation using Semantic Proxy Rooms

no code implementations CVPR 2024 Jonas Schult, Sam Tsai, Lukas Höllein, Bichen Wu, Jialiang Wang, Chih-Yao Ma, Kunpeng Li, Xiaofang Wang, Felix Wimbauer, Zijian He, Peizhao Zhang, Bastian Leibe, Peter Vajda, Ji Hou

Central to our approach is a user-defined 3D semantic proxy room that outlines a rough room layout based on semantic bounding boxes and a textual description of the overall room style.

A Close Look at Spatial Modeling: From Attention to Convolution

1 code implementation 23 Dec 2022 Xu Ma, Huan Wang, Can Qin, Kunpeng Li, Xingchen Zhao, Jie Fu, Yun Fu

Vision Transformers have shown great promise recently for many vision tasks due to the insightful architecture design and attention mechanism.

Instance Segmentation object-detection +2

S^2-Transformer for Mask-Aware Hyperspectral Image Reconstruction

1 code implementation 24 Sep 2022 Jiamian Wang, Kunpeng Li, Yulun Zhang, Xin Yuan, Zhiqiang Tao

Besides, CASSI entangles the spatial and spectral information into a 2D measurement, placing a barrier for information disentanglement and modeling.

Blocking Disentanglement +1

Billion-user Customer Lifetime Value Prediction: An Industrial-scale Solution from Kuaishou

no code implementations 29 Aug 2022 Kunpeng Li, Guangcui Shao, Naijun Yang, Xiao Fang, Yang Song

Customer Life Time Value (LTV) is the expected total revenue that a single user can bring to a business.

Value prediction

Dual Lottery Ticket Hypothesis

1 code implementation ICLR 2022 Yue Bai, Huan Wang, Zhiqiang Tao, Kunpeng Li, Yun Fu

In this work, we regard the winning ticket from LTH as the subnetwork which is in trainable condition and its performance as our benchmark, then go from a complementary direction to articulate the Dual Lottery Ticket Hypothesis (DLTH): Randomly selected subnetworks from a randomly initialized dense network can be transformed into a trainable condition and achieve admirable performance compared with LTH -- random tickets in a given lottery pool can be transformed into winning tickets.

Sign Language Recognition via Skeleton-Aware Multi-Model Ensemble

2 code implementations 12 Oct 2021 Songyao Jiang, Bin Sun, Lichen Wang, Yue Bai, Kunpeng Li, Yun Fu

Current Sign Language Recognition (SLR) methods usually extract features via deep neural networks and suffer overfitting due to limited and noisy data.

Action Recognition Sign Language Recognition +1

MR Image Super-Resolution With Squeeze and Excitation Reasoning Attention Network

no code implementations CVPR 2021 Yulun Zhang, Kai Li, Kunpeng Li, Yun Fu

They also fail to sense the entire space of the input, which is critical for high-quality MR image SR. To address those problems, we propose squeeze and excitation reasoning attention networks (SERAN) for accurate MR image SR. We propose to squeeze attention from global spatial information of the input and obtain global descriptors.

Image Super-Resolution

Skeleton Aware Multi-modal Sign Language Recognition

3 code implementations 16 Mar 2021 Songyao Jiang, Bin Sun, Lichen Wang, Yue Bai, Kunpeng Li, Yun Fu

Sign language is commonly used by deaf or speech impaired people to communicate but requires significant effort to master.

Ranked #2 on Sign Language Recognition on AUTSL (using extra training data)

Sign Language Recognition Skeleton Based Action Recognition

Learning from Weakly-labeled Web Videos via Exploring Sub-Concepts

no code implementations 11 Jan 2021 Kunpeng Li, Zizhao Zhang, Guanhang Wu, Xuehan Xiong, Chen-Yu Lee, Zhichao Lu, Yun Fu, Tomas Pfister

To address this issue, we introduce a new method for pre-training video action recognition models using queried web videos.

Action Recognition Pseudo Label +1

Exploring Sub-Pseudo Labels for Learning from Weakly-Labeled Web Videos

no code implementations 1 Jan 2021 Kunpeng Li, Zizhao Zhang, Guanhang Wu, Xuehan Xiong, Chen-Yu Lee, Yun Fu, Tomas Pfister

To address this issue, we introduce a new method for pre-training video action recognition models using queried web videos.

Action Recognition Pseudo Label +1

Recent Developments on Factor Models and its Applications in Econometric Learning

no code implementations 21 Sep 2020 Jianqing Fan, Kunpeng Li, Yuan Liao

This paper makes a selective survey on the recent development of the factor model and its application on statistical learnings.

Matrix Completion Survey

Screencast Tutorial Video Understanding

1 code implementation CVPR 2020 Kunpeng Li, Chen Fang, Zhaowen Wang, Seokhwan Kim, Hailin Jin, Yun Fu

It is very popular for both novice and experienced users to learn new skills, compared to other tutorial media such as text, because of the visual guidance and the ease of understanding.

Object Detection +4

Adversarial Feature Hallucination Networks for Few-Shot Learning

1 code implementation CVPR 2020 Kai Li, Yulun Zhang, Kunpeng Li, Yun Fu

The recent flourish of deep learning in various tasks is largely accredited to the rich and accessible labeled data.

Data Augmentation Diversity +2

Attention Bridging Network for Knowledge Transfer

no code implementations ICCV 2019 Kunpeng Li, Yulun Zhang, Kai Li, Yuanyuan Li, Yun Fu

With weights sharing and domain adversary training, this knowledge can be successfully transferred by regularizing the network's response to the same category in the target domain.

Domain Generalization Transfer Learning +2

Visual Semantic Reasoning for Image-Text Matching

2 code implementations ICCV 2019 Kunpeng Li, Yulun Zhang, Kai Li, Yuanyuan Li, Yun Fu

It outperforms the current best method by 6.8% relatively for image retrieval and 4.8% relatively for caption retrieval on MS-COCO (Recall@1 using 1K test set).

Image Retrieval Image-text matching +3

Residual Non-local Attention Networks for Image Restoration

2 code implementations ICLR 2019 Yulun Zhang, Kunpeng Li, Kai Li, Bineng Zhong, Yun Fu

To address this issue, we design local and non-local attention blocks to extract features that capture the long-range dependencies between pixels and pay more attention to the challenging parts.

Demosaicking Image Denoising +1

Support Neighbor Loss for Person Re-Identification

1 code implementation 18 Aug 2018 Kai Li, Zhengming Ding, Kunpeng Li, Yulun Zhang, Yun Fu

To ensure scalability and separability, a softmax-like function is formulated to push apart the positive and negative support sets.

Person Re-Identification

Tell Me Where to Look: Guided Attention Inference Network

2 code implementations CVPR 2018 Kunpeng Li, Ziyan Wu, Kuan-Chuan Peng, Jan Ernst, Yun Fu

Weakly supervised learning with only coarse labels can obtain visual explanations of deep neural network such as attention maps by back-propagating gradients.

Object Localization Semantic Segmentation +1
