Search Results for author: Baoyuan Wang

Found 45 papers, 18 papers with code

Subobject-level Image Tokenization

1 code implementation 22 Feb 2024 Delong Chen, Samuel Cahyawijaya, Jianfeng Liu, Baoyuan Wang, Pascale Fung

Transformer-based vision models typically tokenize images into fixed-size square patches as input units, which lacks the adaptability to image content and overlooks the inherent pixel grouping structure.

Attribute Language Modelling +1

From Good to Great: Improving Math Reasoning with Tool-Augmented Interleaf Prompting

no code implementations 18 Dec 2023 Nuo Chen, Hongguang Li, Baoyuan Wang, Jia Li

IMP-TIP follows the "From Good to Great" concept, collecting multiple potential solutions from both LLMs and their Tool-Augmented counterparts for the same math problem, and then selecting or re-generating the most accurate answer after cross-checking these solutions via tool-augmented interleaf prompting.

GSM8K Math +1

GSmoothFace: Generalized Smooth Talking Face Generation via Fine Grained 3D Face Guidance

no code implementations 12 Dec 2023 Haiming Zhang, Zhihao Yuan, Chaoda Zheng, Xu Yan, Baoyuan Wang, Guanbin Li, Song Wu, Shuguang Cui, Zhen Li

Our proposed GSmoothFace model mainly consists of the Audio to Expression Prediction (A2EP) module and the Target Adaptive Face Translation (TAFT) module.

Face Model Talking Face Generation

PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns

no code implementations 7 Dec 2023 Shuliang Ning, Duomin Wang, Yipeng Qin, Zirong Jin, Baoyuan Wang, Xiaoguang Han

Unlike prior arts constrained by specific input types, our method allows flexible specification of style (text or image) and texture (full garment, cropped sections, or texture patches) conditions.

Disentanglement Human Parsing +1

Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data

no code implementations 30 Nov 2023 Yu Deng, Duomin Wang, Xiaohang Ren, Xingyu Chen, Baoyuan Wang

The key is to first learn a part-wise 4D generative model from monocular images via adversarial learning, to synthesize multi-view images of diverse identities and full motions as training data; then leverage a transformer-based animatable triplane reconstructor to learn 4D head reconstruction using the synthetic data.

3D Reconstruction

AgentAvatar: Disentangling Planning, Driving and Rendering for Photorealistic Avatar Agents

no code implementations 29 Nov 2023 Duomin Wang, Bin Dai, Yu Deng, Baoyuan Wang

In this study, our goal is to create interactive avatar agents that can autonomously plan and animate nuanced facial movements realistically, from both visual and behavioral perspectives.

Neural Rendering

A Unified Framework for Multimodal, Multi-Part Human Motion Synthesis

no code implementations 28 Nov 2023 Zixiang Zhou, Yu Wan, Baoyuan Wang

The field has made significant progress in synthesizing realistic human motion driven by various modalities.

Motion Synthesis

HAVE-FUN: Human Avatar Reconstruction from Few-Shot Unconstrained Images

no code implementations 27 Nov 2023 Xihe Yang, Xingyu Chen, Shaohui Wang, Daiheng Gao, Xiaoguang Han, Baoyuan Wang

As for human avatar reconstruction, contemporary techniques commonly necessitate the acquisition of costly data and struggle to achieve satisfactory results from a small number of casual images.

DialCoT Meets PPO: Decomposing and Exploring Reasoning Paths in Smaller Language Models

1 code implementation 8 Oct 2023 Chengcheng Han, Xiaowei Du, Che Zhang, Yixin Lian, Xiang Li, Ming Gao, Baoyuan Wang

Chain-of-Thought (CoT) prompting has proven to be effective in enhancing the reasoning capabilities of Large Language Models (LLMs) with at least 100 billion parameters.

Arithmetic Reasoning

MDSC: Towards Evaluating the Style Consistency Between Music and Dance

1 code implementation 4 Sep 2023 Zixiang Zhou, Weiyuan Li, Baoyuan Wang

We found that directly measuring the embedding distance between motion and music is not an optimal solution.

Controlling Character Motions without Observable Driving Source

no code implementations 11 Aug 2023 Weiyuan Li, Bin Dai, Ziyi Zhou, Qi Yao, Baoyuan Wang

A high-level prior model can be easily injected on top to generate diverse sequences of unlimited length.

Reinforced Disentanglement for Face Swapping without Skip Connection

no code implementations ICCV 2023 Xiaohang Ren, Xingyu Chen, Pengfei Yao, Heung-Yeung Shum, Baoyuan Wang

The SOTA face swap models still suffer from the problem of either target identity (i.e., shape) being leaked or the target non-identity attributes (i.e., background, hair) failing to be fully preserved in the final results.

Disentanglement Face Swapping

Visual Instruction Tuning with Polite Flamingo

2 code implementations 3 Jul 2023 Delong Chen, Jianfeng Liu, Wenliang Dai, Baoyuan Wang

This side effect negatively impacts the model's ability to format responses appropriately -- for instance, its "politeness" -- due to the overly succinct and unformatted nature of raw annotations, resulting in reduced human preference.

Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification

1 code implementation 26 May 2023 Ke Ji, Yixin Lian, Jingsheng Gao, Baoyuan Wang

Due to the complex label hierarchy and intensive labeling cost in practice, hierarchical text classification (HTC) performs poorly, especially in low-resource or few-shot settings.

Contrastive Learning Few-shot HTC +2

ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers

3 code implementations 24 May 2023 Jingfeng Yao, Xinggang Wang, Shusheng Yang, Baoyuan Wang

Recently, plain vision Transformers (ViTs) have shown impressive performance on various computer vision tasks, thanks to their strong modeling capacity and large-scale pretraining.

Image Matting

FashionTex: Controllable Virtual Try-on with Text and Texture

1 code implementation 8 May 2023 Anran Lin, Nanxuan Zhao, Shuliang Ning, Yuda Qiu, Baoyuan Wang, Xiaoguang Han

Virtual try-on attracts increasing research attention as a promising way to enhance the user experience of online clothes shopping.

Virtual Try-on

An Effective Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds

1 code implementation 21 Mar 2023 Chaoda Zheng, Xu Yan, Haiming Zhang, Baoyuan Wang, Shenghui Cheng, Shuguang Cui, Zhen Li

Due to the motion-centric nature, our method shows its impressive generalizability with limited training labels and provides good differentiability for end-to-end cycle training.

3D Single Object Tracking Autonomous Driving +3

Mimic3D: Thriving 3D-Aware GANs via 3D-to-2D Imitation

no code implementations ICCV 2023 Xingyu Chen, Yu Deng, Baoyuan Wang

Improving the photorealism via CNN-based 2D super-resolution can break the strict 3D consistency, while keeping the 3D consistency by learning high-resolution 3D representations for direct rendering often compromises image quality.

Image Generation Representation Learning +1

Natural Response Generation for Chinese Reading Comprehension

1 code implementation 17 Feb 2023 Nuo Chen, Hongguang Li, Yinan Bao, Baoyuan Wang, Jia Li

To this end, we construct a new dataset called Penguin to promote the research of MRC, providing a training and test bed for natural response generation to real scenarios.

Chinese Reading Comprehension Machine Reading Comprehension +1

UDE: A Unified Driving Engine for Human Motion Generation

1 code implementation CVPR 2023 Zixiang Zhou, Baoyuan Wang

Generating controllable and editable human motion sequences is a key challenge in 3D Avatar generation.


Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis

1 code implementation CVPR 2023 Duomin Wang, Yu Deng, Zixin Yin, Heung-Yeung Shum, Baoyuan Wang

We present a novel one-shot talking head synthesis method that achieves disentangled and fine-grained control over lip motion, eye gaze and blink, head pose, and emotional expression.

Contrastive Learning Disentanglement

Learning Detailed Radiance Manifolds for High-Fidelity and 3D-Consistent Portrait Synthesis from Monocular Image

no code implementations CVPR 2023 Yu Deng, Baoyuan Wang, Heung-Yeung Shum

We introduce a novel detail manifolds reconstructor to learn 3D-consistent fine details on the radiance manifolds from monocular images, and combine them with the coarse radiance manifolds for high-fidelity reconstruction.

Image Generation Novel View Synthesis

Hand Avatar: Free-Pose Hand Animation and Rendering from Monocular Video

no code implementations CVPR 2023 Xingyu Chen, Baoyuan Wang, Heung-Yeung Shum

We present HandAvatar, a novel representation for hand animation and rendering, which can generate smoothly compositional geometry and self-occlusion-aware texture.


Local-Adaptive Face Recognition via Graph-based Meta-Clustering and Regularized Adaptation

no code implementations CVPR 2022 Wenbin Zhu, Chien-Yi Wang, Kuan-Lun Tseng, Shang-Hong Lai, Baoyuan Wang

Leveraging the environment-specific local data after the deployment of the initial global model, LaFR aims at getting optimal performance by training local-adapted models automatically and in an unsupervised manner, as opposed to fixing their initial global model.

Clustering Face Recognition

Privacy-preserving Online AutoML for Domain-Specific Face Detection

no code implementations CVPR 2022 Chenqian Yan, Yuge Zhang, Quanlu Zhang, Yaming Yang, Xinyang Jiang, Yuqing Yang, Baoyuan Wang

Thanks to HyperFD, each local task (client) is able to effectively leverage the learning "experience" of previous tasks without uploading raw images to the platform; meanwhile, the meta-feature extractor is continuously learned to better trade off the bias and variance.

AutoML Face Detection +1

CRFace: Confidence Ranker for Model-Agnostic Face Detection Refinement

no code implementations CVPR 2021 Noranart Vesdapunt, Baoyuan Wang

Our confidence ranker is model-agnostic, so we can augment the data by choosing the pairs from multiple face detectors during the training, and generalize to a wide range of face detectors during the testing.

Face Detection

JNR: Joint-based Neural Rig Representation for Compact 3D Face Modeling

no code implementations ECCV 2020 Noranart Vesdapunt, Mitch Rundle, HsiangTao Wu, Baoyuan Wang

In this paper, we introduce a novel approach to learn a 3D face model using a joint-based face rig and a neural skinning network.

3D Face Modelling Face Model

Personalized Face Modeling for Improved Face Reconstruction and Motion Retargeting

no code implementations ECCV 2020 Bindita Chaudhuri, Noranart Vesdapunt, Linda Shapiro, Baoyuan Wang

Traditional methods for image-based 3D face reconstruction and facial motion retargeting fit a 3D morphable model (3DMM) to the face, which has limited modeling capacity and fails to generalize well to in-the-wild data.

3D Face Reconstruction Face Model +1

Animating Face using Disentangled Audio Representations

no code implementations 2 Oct 2019 Gaurav Mittal, Baoyuan Wang

All previous methods for audio-driven talking head generation assume the input audio to be clean with a neutral tone.

Representation Learning Talking Head Generation

Joint Face Detection and Facial Motion Retargeting for Multiple Faces

no code implementations CVPR 2019 Bindita Chaudhuri, Noranart Vesdapunt, Baoyuan Wang

Facial motion retargeting is an important problem in both computer graphics and vision, which involves capturing the performance of a human face and transferring it to another 3D character.

3D Face Reconstruction Face Alignment +3

Real-time Burst Photo Selection Using a Light-Head Adversarial Network

no code implementations 20 Mar 2018 Baoyuan Wang, Noranart Vesdapunt, Utkarsh Sinha, Lei Zhang

The system is designed to run in the viewfinder mode and capture a burst sequence of frames before and after the shutter is pressed.


Understanding and Predicting The Attractiveness of Human Action Shot

no code implementations 2 Nov 2017 Bin Dai, Baoyuan Wang, Gang Hua

Selecting attractive photos from a human action shot sequence is quite challenging, because of the subjective nature of the "attractiveness", which is mainly a combined factor of human pose in action and the background.

Exposure: A White-Box Photo Post-Processing Framework

1 code implementation 27 Sep 2017 Yuanming Hu, Hao He, Chenxi Xu, Baoyuan Wang, Stephen Lin

Retouching can significantly elevate the visual appeal of photos, but many casual photographers lack the expertise to do this well.

FC4: Fully Convolutional Color Constancy With Confidence-Weighted Pooling

1 code implementation CVPR 2017 Yuanming Hu, Baoyuan Wang, Stephen Lin

However, the patch-based CNNs that exist for this problem are faced with the issue of estimation ambiguity, where a patch may contain insufficient information to establish a unique or even a limited possible range of illumination colors.

Color Constancy

Unsupervised Extraction of Video Highlights Via Robust Recurrent Auto-encoders

no code implementations ICCV 2015 Huan Yang, Baoyuan Wang, Stephen Lin, David Wipf, Minyi Guo, Baining Guo

With the growing popularity of short-form video sharing platforms such as Instagram and Vine, there has been an increasing need for techniques that automatically extract highlights from video.

Automatic Photo Adjustment Using Deep Neural Networks

1 code implementation 24 Dec 2014 Zhicheng Yan, Hao Zhang, Baoyuan Wang, Sylvain Paris, Yizhou Yu

Many photographic styles rely on subtle adjustments that depend on the image content and even its semantics.

Photo Retouching
