Search Results for author: Di Xu

Found 29 papers, 4 papers with code

EVLM: An Efficient Vision-Language Model for Visual Understanding

no code implementations 19 Jul 2024 Kaibing Chen, Dong Shen, Hanwen Zhong, Huasong Zhong, Kui Xia, Di Xu, Wei Yuan, Yifei Hu, Bin Wen, Tianke Zhang, Changyi Liu, Dewen Fan, Huihui Xiao, JiaHong Wu, Fan Yang, Size Li, Di Zhang

However, when dealing with long sequences of visual signals or inputs such as videos, the self-attention mechanism of language models can lead to significant computational overhead.

Image Captioning Language Modeling +2

Topo4D: Topology-Preserving Gaussian Splatting for High-Fidelity 4D Head Capture

no code implementations 1 Jun 2024 Xuanchen Li, Yuhao Cheng, Xingyu Ren, Haozhe Jia, Di Xu, Wenhan Zhu, Yichao Yan

To simplify this process, we propose Topo4D, a novel framework for automatic geometry and texture generation, which optimizes densely aligned 4D heads and 8K texture maps directly from calibrated multi-view time-series images.

8k Face Reconstruction +2

Block-wise LoRA: Revisiting Fine-grained LoRA for Effective Personalization and Stylization in Text-to-Image Generation

no code implementations 12 Mar 2024 Likun Li, Haoqi Zeng, Changpeng Yang, Haozhe Jia, Di Xu

The objective of personalization and stylization in text-to-image is to instruct a pre-trained diffusion model to analyze new concepts introduced by users and incorporate them into expected styles.

parameter-efficient fine-tuning Text-to-Image Generation

3D-Aware Face Editing via Warping-Guided Latent Direction Learning

no code implementations CVPR 2024 Yuhao Cheng, Zhuo Chen, Xingyu Ren, Wenhan Zhu, Zhengqin Xu, Di Xu, Changpeng Yang, Yichao Yan

To address the problem of distortion caused by tri-plane warping, we train a warp-aware encoder to project the warped face onto a standardized latent space.

Attribute Facial Editing

DisControlFace: Adding Disentangled Control to Diffusion Autoencoder for One-shot Explicit Facial Image Editing

no code implementations 11 Dec 2023 Haozhe Jia, Yan Li, Hengfei Cui, Di Xu, Yuwang Wang, Tao Yu

We identify the key challenge as the exploration of disentangled conditional control between high-level semantics and explicit parameters (e.g., 3DMM) in the generation process, and accordingly propose a novel diffusion-based editing framework, named DisControlFace.

HHAvatar: Gaussian Head Avatar with Dynamic Hairs

1 code implementation CVPR 2024 Zhanfeng Liao, Yuelang Xu, Zhe Li, Qijing Li, Boyao Zhou, Ruifeng Bai, Di Xu, Hongwen Zhang, Yebin Liu

To address the problem of dynamic hair modeling, we introduce a hybrid head model into our avatar representation based on Gaussian Head Avatar, along with a training method that considers timing information and an occlusion perception module to model the non-rigid motion of hair.

2k

A Two-Step Framework for Multi-Material Decomposition of Dual Energy Computed Tomography from Projection Domain

no code implementations 31 Oct 2023 Di Xu, Qihui Lyu, Dan Ruan, Ke Sheng

Deep learning (DL) methods have shown promise in improving MMD performance, but typical approaches that conduct DL-MMD in the image domain fail to fully utilize projection information or, under an iterative setup, are computationally inefficient in both training and prediction.

Benchmarking Domain Adaptation +1

PARF: Primitive-Aware Radiance Fusion for Indoor Scene Novel View Synthesis

no code implementations ICCV 2023 Haiyang Ying, Baowei Jiang, Jinzhi Zhang, Di Xu, Tao Yu, Qionghai Dai, Lu Fang

This paper proposes a method for fast scene radiance field reconstruction with strong novel view synthesis performance and convenient scene editing functionality.

Novel View Synthesis Semantic Parsing

Learning Dynamic MRI Reconstruction with Convolutional Network Assisted Reconstruction Swin Transformer

no code implementations 19 Sep 2023 Di Xu, Hengjie Liu, Dan Ruan, Ke Sheng

Dynamic magnetic resonance imaging (DMRI) is an effective imaging tool for diagnosis tasks that require motion tracking of a certain anatomy.

Anatomy Computational Efficiency +3

Pink-Eggs Dataset V1: A Step Toward Invasive Species Management Using Deep Learning Embedded Solutions

no code implementations 16 May 2023 Di Xu, Yang Zhao, Xiang Hao, Xin Meng

We introduce a novel dataset consisting of images depicting pink eggs that have been identified as Pomacea canaliculata eggs, accompanied by corresponding bounding box annotations.

Management

LMPT: Prompt Tuning with Class-Specific Embedding Loss for Long-tailed Multi-Label Visual Recognition

1 code implementation 8 May 2023 Peng Xia, Di Xu, Ming Hu, Lie Ju, ZongYuan Ge

The long-tailed multi-label visual recognition (LTML) task is highly challenging due to label co-occurrence and imbalanced data distribution.

Ranked #1 on Long-tail Learning on COCO-MLT (using extra training data)

Long-tail Learning

A Survey on Deep Neural Network Partition over Cloud, Edge and End Devices

no code implementations 20 Apr 2023 Di Xu, Xiang He, Tonghua Su, Zhongjie Wang

This paper provides a comprehensive survey on the recent advances and challenges in DNN partition approaches over the cloud, edge, and end devices based on a detailed literature collection.

Edge-computing

Boosting Video-Text Retrieval with Explicit High-Level Semantics

no code implementations 8 Aug 2022 Haoran Wang, Di Xu, Dongliang He, Fu Li, Zhong Ji, Jungong Han, Errui Ding

Video-text retrieval (VTR) is an attractive yet challenging task for multi-modal understanding, which aims to search for relevant video (text) given a query (video).

Text Retrieval Video Captioning +2

SimViT: Exploring a Simple Vision Transformer with sliding windows

2 code implementations 24 Dec 2021 Gang Li, Di Xu, Xing Cheng, Lingyu Si, Changwen Zheng

Although vision Transformers have achieved excellent performance as backbone models in many vision tasks, most of them intend to capture global relations of all tokens in an image or a window, which disrupts the inherent spatial and local correlations between patches in 2D structure.

IllumiNet: Transferring Illumination from Planar Surfaces to Virtual Objects in Augmented Reality

no code implementations 12 Jul 2020 Di Xu, Zhen Li, Yanning Zhang, Qi Cao

This paper presents a learning-based illumination estimation method for virtual objects in real environments.

Using generative adversarial networks to synthesize artificial financial datasets

no code implementations 6 Feb 2020 Dmitry Efimov, Di Xu, Luyang Kong, Alexey Nefedov, Archana Anandakrishnan

Generative Adversarial Networks (GANs) have become very popular for generating realistic-looking images.

Benchmarking

LSTM-Assisted Evolutionary Self-Expressive Subspace Clustering

no code implementations 19 Oct 2019 Di Xu, Tianhang Long, Junbin Gao

Massive volumes of high-dimensional data that evolve over time are continuously collected by contemporary information processing systems, which raises the problem of organizing this data into clusters, i.e., achieving the purpose of dimensional deduction, while also learning its temporal evolution patterns.

Clustering

Sparse Least Squares Low Rank Kernel Machines

no code implementations 29 Jan 2019 Di Xu, Manjing Fang, Xia Hong, Junbin Gao

A general framework of least squares support vector machine with low rank kernels, referred to as LR-LSSVM, is introduced in this paper.

Computational Efficiency

Recovering Surface Details under General Unknown Illumination Using Shading and Coarse Multi-view Stereo

no code implementations CVPR 2014 Di Xu, Qi Duan, Jianming Zheng, Juyong Zhang, Jianfei Cai, Tat-Jen Cham

As a result, our approach is robust and stable, and is able to efficiently recover high-quality surface details even when starting with a coarse MVS.
