Search Results for author: Luming Tang

Found 11 papers, 6 papers with code

Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model

no code implementations • 1 Aug 2024 • Benlin Liu, Yuhao Dong, Yiqin Wang, Zixian Ma, Yansong Tang, Luming Tang, Yongming Rao, Wei-Chiu Ma, Ranjay Krishna

Multimodal language models (MLLMs) are increasingly being applied in real-world environments, necessitating their ability to interpret 3D spaces and comprehend temporal dynamics.

EgoSchema, Language Modeling, +3

RealFill: Reference-Driven Generation for Authentic Image Completion

no code implementations • 28 Sep 2023 • Luming Tang, Nataniel Ruiz, Qinghao Chu, Yuanzhen Li, Aleksander Holynski, David E. Jacobs, Bharath Hariharan, Yael Pritch, Neal Wadhwa, Kfir Aberman, Michael Rubinstein

Once personalized, RealFill is able to complete a target image with visually compelling contents that are faithful to the original scene.

Emergent Correspondence from Image Diffusion

2 code implementations • NeurIPS 2023 • Luming Tang, Menglin Jia, Qianqian Wang, Cheng Perng Phoo, Bharath Hariharan

We propose a simple strategy to extract this implicit knowledge out of diffusion networks as image features, namely DIffusion FeaTures (DIFT), and use them to establish correspondences between real images.

Semantic correspondence

Magic3D: High-Resolution Text-to-3D Content Creation

1 code implementation • CVPR 2023 • Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, Tsung-Yi Lin

DreamFusion has recently demonstrated the utility of a pre-trained text-to-image diffusion model to optimize Neural Radiance Fields (NeRF), achieving remarkable text-to-3D synthesis results.

Text to 3D, Vocal Bursts Intensity Prediction

Diagnosing and Remedying Shot Sensitivity with Cosine Few-Shot Learners

no code implementations • 7 Jul 2022 • Davis Wertheimer, Luming Tang, Bharath Hariharan

Existing approaches generally assume that the shot number at test time is known in advance.

Novel Concepts

Visual Prompt Tuning

6 code implementations • 23 Mar 2022 • Menglin Jia, Luming Tang, Bor-Chun Chen, Claire Cardie, Serge Belongie, Bharath Hariharan, Ser-Nam Lim

The current modus operandi in adapting pre-trained models involves updating all the backbone parameters, i.e., full fine-tuning.

Image Classification, Long-tail Learning, +2

Revisiting Pose-Normalization for Fine-Grained Few-Shot Recognition

1 code implementation • CVPR 2020 • Luming Tang, Davis Wertheimer, Bharath Hariharan

Few-shot, fine-grained classification requires a model to learn subtle, fine-grained distinctions between different classes (e.g., birds) based on a few images alone.

Classification, General Classification

Multi-Entity Dependence Learning with Rich Context via Conditional Variational Auto-encoder

no code implementations • 17 Sep 2017 • Luming Tang, Yexiang Xue, Di Chen, Carla P. Gomes

Multi-Entity Dependence Learning (MEDL) explores conditional correlations among multiple entities.

Hierarchical Deep Recurrent Architecture for Video Understanding

1 code implementation • 11 Jul 2017 • Luming Tang, Boyang Deng, Haiyu Zhao, Shuai Yi

The proposed framework contains a hierarchical deep architecture, including a frame-level sequence modeling part and a video-level classification part.

Classification, General Classification, +2
