Search Results for author: Yuan Dong

Found 25 papers, 7 papers with code

Is Foreground Prototype Sufficient? Few-Shot Medical Image Segmentation with Background-Fused Prototype

no code implementations 4 Dec 2024 Song Tang, Chunxiao Zu, Wenxin Su, Yuan Dong, Mao Ye, Yan Gan, Xiatian Zhu

However, this paradigm is not applicable to medical images where the foreground and background share numerous visual features, necessitating a more detailed description for background.

Few-Shot Semantic Segmentation Image Segmentation +2

MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling

no code implementations 26 Sep 2024 Weihao Yuan, Weichao Shen, Yisheng He, Yuan Dong, Xiaodong Gu, Zilong Dong, Liefeng Bo, QiXing Huang

Motion generation from discrete quantization offers many advantages over continuous regression, but at the cost of inevitable approximation errors.

Motion Generation Quantization

Atlas Gaussians Diffusion for 3D Generation

no code implementations 23 Aug 2024 Haitao Yang, Yuan Dong, Hanwen Jiang, Dejia Xu, Georgios Pavlakos, QiXing Huang

Using the latent diffusion model has proven effective in developing novel 3D generation techniques.

3D Generation

An Optimization Framework to Enforce Multi-View Consistency for Texturing 3D Meshes

no code implementations 22 Mar 2024 Zhengyi Zhao, Chen Song, Xiaodong Gu, Yuan Dong, Qi Zuo, Weihao Yuan, Liefeng Bo, Zilong Dong, QiXing Huang

In particular, the third and fourth stages are iterated, with the cuts obtained in the fourth stage encouraging non-rigid alignment in the third stage to focus on regions close to the cuts.

VideoMV: Consistent Multi-View Generation Based on Large Video Generative Model

no code implementations 18 Mar 2024 Qi Zuo, Xiaodong Gu, Lingteng Qiu, Yuan Dong, Zhengyi Zhao, Weihao Yuan, Rui Peng, Siyu Zhu, Zilong Dong, Liefeng Bo, QiXing Huang

Images from video generative models are more suitable for multi-view generation because the underlying network architecture that generates them employs a temporal module to enforce frame consistency.

Denoising

URHand: Universal Relightable Hands

no code implementations CVPR 2024 Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo, Chen Cao, Stanislav Pidhorskyi, Tomas Simon, Rohan Joshi, Yuan Dong, Yichen Xu, Bernardo Pires, He Wen, Lucas Evans, Bo Peng, Julia Buffalini, Autumn Trimble, Kevyn McPhail, Melissa Schoeller, Shoou-I Yu, Javier Romero, Michael Zollhöfer, Yaser Sheikh, Ziwei Liu, Shunsuke Saito

To simplify the personalization process while retaining photorealism, we build a powerful universal relightable prior based on neural relighting from multi-view images of hands captured in a light stage with hundreds of identities.

GPLD3D: Latent Diffusion of 3D Shape Generative Models by Enforcing Geometric and Physical Priors

no code implementations CVPR 2024 Yuan Dong, Qi Zuo, Xiaodong Gu, Weihao Yuan, Zhengyi Zhao, Zilong Dong, Liefeng Bo, QiXing Huang

The key to our approach is a new diffusion procedure that combines the discrete empirical data distribution and a continuous distribution induced by the quality checker.

Denoising

Opara: Exploiting Operator Parallelism for Expediting DNN Inference on GPUs

1 code implementation 16 Dec 2023 Aodong Chen, Fei Xu, Li Han, Yuan Dong, Li Chen, Zhi Zhou, Fangming Liu

GPUs have become the de facto hardware devices for accelerating Deep Neural Network (DNN) inference workloads.

Scheduling

Word4Per: Zero-shot Composed Person Retrieval

1 code implementation 25 Nov 2023 Delong Liu, Haiwen Li, Zhicheng Zhao, Fei Su, Yuan Dong

Searching for a specific person has great social benefits and security value, and it often involves a combination of visual and textual information.

Ranked #1 on Zero-shot Composed Person Retrieval on ITCPR dataset (using extra training data)

Person Retrieval Retrieval +3

Ctrl-Room: Controllable Text-to-3D Room Meshes Generation with Layout Constraints

no code implementations 5 Oct 2023 Chuan Fang, Yuan Dong, Kunming Luo, Xiaotao Hu, Rakesh Shrestha, Ping Tan

Next, the Appearance Generation Stage employs a fine-tuned ControlNet to produce a vivid panoramic image of the room guided by the 3D scene layout and text prompt.

Scene Generation Text to 3D

Text-guided Image Restoration and Semantic Enhancement for Text-to-Image Person Retrieval

1 code implementation 18 Jul 2023 Delong Liu, Haiwen Li, Zhicheng Zhao, Yuan Dong, Nikolaos V. Boulgouris

To address this issue, we propose a novel TIPR framework to build fine-grained interactions and alignment between person images and the corresponding texts.

cross-modal alignment Data Augmentation +4

PanoContext-Former: Panoramic Total Scene Understanding with a Transformer

no code implementations CVPR 2024 Yuan Dong, Chuan Fang, Liefeng Bo, Zilong Dong, Ping Tan

A panoramic image enables a deeper understanding and more holistic perception of the $360^\circ$ surrounding environment, and can naturally encode richer scene context information than a standard perspective image.

3D Object Detection object-detection +1

PanoViT: Vision Transformer for Room Layout Estimation from a Single Panoramic Image

no code implementations 23 Dec 2022 Weichao Shen, Yuan Dong, Zonghao Chen, Zhengyi Zhao, Yang Gao, Zhu Liu

In this paper, we propose PanoViT, a panorama vision transformer to estimate the room layout from a single panoramic image.

Position Room Layout Estimation

Caption Feature Space Regularization for Audio Captioning

1 code implementation 18 Apr 2022 Yiming Zhang, Hong Yu, Ruoyi Du, Zhanyu Ma, Yuan Dong

To eliminate this negative effect, in this paper, we propose a two-stage framework for audio captioning: (i) in the first stage, via the contrastive learning, we construct a proxy feature space to reduce the distances between captions correlated to the same audio, and (ii) in the second stage, the proxy feature space is utilized as additional supervision to encourage the model to be optimized in the direction that benefits all the correlated captions.

Audio captioning Contrastive Learning +1

Modeling Clothing as a Separate Layer for an Animatable Human Avatar

no code implementations 28 Jun 2021 Donglai Xiang, Fabian Prada, Timur Bagautdinov, Weipeng Xu, Yuan Dong, He Wen, Jessica Hodgins, Chenglei Wu

To address these difficulties, we propose a method to build an animatable clothed body avatar with an explicit representation of the clothing on the upper body from multi-view captured videos.

Inverse Rendering

TLRM: Task-level Relation Module for GNN-based Few-Shot Learning

no code implementations 25 Jan 2021 Yurong Guo, Zhanyu Ma, Xiaoxu Li, Yuan Dong

We argue that this method of measuring the relation between samples models only the sample-to-sample relation, while neglecting the specificity of different tasks.

Few-Shot Learning Relation +1

Inverse Structural Design of Graphene/Boron Nitride Hybrids by Regressional GAN

1 code implementation 21 Aug 2019 Yuan Dong, Dawei Li, Chi Zhang, Chuhan Wu, Hong Wang, Ming Xin, Jianlin Cheng, Jian Lin

A significant novelty of the proposed RGAN is that it combines the supervised and regressional convolutional neural network (CNN) with the traditional unsupervised GAN, thus overcoming the common technical barrier in the traditional GANs, which cannot generate data associated with given continuous quantitative labels.

Computational Physics Materials Science Applied Physics

MSFD: Multi-Scale Receptive Field Face Detector

no code implementations 11 Mar 2019 Qiushan Guo, Yuan Dong, Yu Guo, Hongliang Bai

We simultaneously propose an anchor assignment strategy which can cover faces with a wide range of scales to improve the recall rate of small faces and rotated faces.

Multi-hierarchical Independent Correlation Filters for Visual Tracking

1 code implementation 26 Nov 2018 Shuai Bai, Zhiqun He, Ting-Bing Xu, Zheng Zhu, Yuan Dong, Hongliang Bai

For visual tracking, most traditional correlation filter (CF) based methods suffer from the bottleneck of feature redundancy and a lack of motion information.

Motion Estimation Visual Object Tracking +1

Deep Learning Bandgaps of Topologically Doped Graphene

no code implementations 28 Sep 2018 Yuan Dong, Chuhan Wu, Chi Zhang, Yingda Liu, Jianlin Cheng, Jian Lin

Moreover, given the ubiquitous existence of topologies in materials, this work will stimulate widespread interest in applying deep learning algorithms to the topological design of materials across atomic, nano-, meso-, and macro-scales.

Materials Science Computational Physics

BodyFusion: Real-Time Capture of Human Motion and Surface Geometry Using a Single Depth Camera

no code implementations ICCV 2017 Tao Yu, Kaiwen Guo, Feng Xu, Yuan Dong, Zhaoqi Su, Jianhui Zhao, Jianguo Li, Qionghai Dai, Yebin Liu

To reduce the ambiguities of the non-rigid deformation parameterization on the surface graph nodes, we take advantage of the internal articulated motion prior for human performance and contribute a skeleton-embedded surface fusion (SSF) method.

Surface Reconstruction
