Search Results for author: Qingkun Su

Found 4 papers, 0 papers with code

MM-Diff: High-Fidelity Image Personalization via Multi-Modal Condition Integration

no code implementations · 22 Mar 2024 · Zhichao Wei, Qingkun Su, Long Qin, Weizhi Wang

CLS embeddings are used both to augment the text embeddings and, together with patch embeddings, to derive a small number of detail-rich subject embeddings; both sets of embeddings are efficiently integrated into the diffusion model through a well-designed multimodal cross-attention mechanism.

Image Generation
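The abstract describes injecting text and subject embeddings into the diffusion model through multimodal cross-attention. A minimal sketch of that idea, assuming the paper's usual setup (U-Net latent tokens as queries, concatenated text and subject embeddings as keys/values); the function name, shapes, and single-head formulation are illustrative, not the paper's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multimodal_cross_attention(latents, text_emb, subject_emb, d=64):
    """Single-head cross-attention over a joint text + subject context.

    latents:     (n_latent, d) query tokens from the diffusion backbone
    text_emb:    (n_text, d)   CLS-augmented text embeddings
    subject_emb: (n_subj, d)   detail-rich subject embeddings
    """
    # Concatenate both condition streams into one key/value context.
    context = np.concatenate([text_emb, subject_emb], axis=0)
    # Scaled dot-product attention: latents attend over the joint context.
    scores = latents @ context.T / np.sqrt(d)
    attn = softmax(scores, axis=-1)
    return attn @ context  # (n_latent, d) conditioned latent tokens
```

In practice the two streams would have separate learned key/value projections; this sketch omits them to keep the attention flow visible.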

Fine-grained Text-Video Retrieval with Frozen Image Encoders

no code implementations · 14 Jul 2023 · Zuozhuo Dai, Fangtao Shao, Qingkun Su, Zilong Dong, Siyu Zhu

In the second stage, we propose a novel decoupled video-text cross-attention module to capture fine-grained multimodal information in the spatial and temporal dimensions.

Retrieval · Video Retrieval
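The snippet mentions decoupling cross-attention into spatial and temporal dimensions. One plausible reading, sketched below under assumptions not taken from the paper (a frozen image encoder yields per-frame patch features, and a sentence embedding attends first over patches within each frame, then over the resulting frame summaries):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def decoupled_video_text_attention(video, text, d=64):
    """Spatial-then-temporal text-to-video cross-attention.

    video: (T, P, d) patch features for T frames from a frozen image encoder
    text:  (d,)      sentence embedding used as the query
    """
    # Spatial step: the text attends over the P patches of each frame.
    spatial_scores = video @ text / np.sqrt(d)                   # (T, P)
    spatial_attn = softmax(spatial_scores, axis=-1)
    frame_feats = (spatial_attn[..., None] * video).sum(axis=1)  # (T, d)
    # Temporal step: the text attends over the T frame summaries.
    temporal_scores = frame_feats @ text / np.sqrt(d)            # (T,)
    temporal_attn = softmax(temporal_scores, axis=-1)
    return (temporal_attn[:, None] * frame_feats).sum(axis=0)    # (d,)
```

Decoupling the two axes keeps the cost at O(T·P) score computations instead of attending jointly over all T·P patch tokens at once.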

MeshMVS: Multi-View Stereo Guided Mesh Reconstruction

no code implementations · 17 Oct 2020 · Rakesh Shrestha, Zhiwen Fan, Qingkun Su, Zuozhuo Dai, Siyu Zhu, Ping Tan

Deep learning based 3D shape generation methods generally utilize latent features extracted from color images to encode the semantics of objects and guide the shape generation process.

3D Shape Generation

Sketch-R2CNN: An Attentive Network for Vector Sketch Recognition

no code implementations · 20 Nov 2018 · Lei Li, Changqing Zou, Youyi Zheng, Qingkun Su, Hongbo Fu, Chiew-Lan Tai

To bridge the gap between these two spaces in neural networks, we propose a neural line rasterization module that converts the vector sketch, along with the attention estimated by the RNN, into a bitmap image, which is subsequently consumed by a CNN.

Sketch Recognition
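The abstract's core idea is rasterizing a vector sketch into a bitmap whose pixel intensities carry the RNN's per-point attention. A toy sketch of that conversion, assuming normalized point coordinates and one attention weight per point; the sampling scheme and function name are illustrative, not the paper's differentiable formulation:

```python
import numpy as np

def rasterize_polyline(points, attn, size=28, samples=8):
    """Rasterize an attention-weighted polyline into a bitmap.

    points: (N, 2) stroke points with x, y in [0, 1]
    attn:   (N,)   per-point attention weights from the RNN
    """
    img = np.zeros((size, size), dtype=np.float32)
    for i in range(len(points) - 1):
        # Sample along each segment, linearly interpolating
        # both position and attention weight.
        for t in np.linspace(0.0, 1.0, samples):
            x, y = (1 - t) * points[i] + t * points[i + 1]
            w = (1 - t) * attn[i] + t * attn[i + 1]
            r, c = int(y * (size - 1)), int(x * (size - 1))
            img[r, c] = max(img[r, c], w)  # keep the strongest attention per pixel
    return img
```

The resulting image feeds a standard CNN classifier; in the paper the rasterization is built as a neural module so gradients can flow back to the RNN's attention, which this hard-sampling toy version does not attempt.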
