no code implementations • 1 Aug 2024 • Benlin Liu, Yuhao Dong, Yiqin Wang, Zixian Ma, Yansong Tang, Luming Tang, Yongming Rao, Wei-Chiu Ma, Ranjay Krishna
Multimodal language models (MLLMs) are increasingly being applied in real-world environments, necessitating their ability to interpret 3D spaces and comprehend temporal dynamics.
no code implementations • 28 Sep 2023 • Luming Tang, Nataniel Ruiz, Qinghao Chu, Yuanzhen Li, Aleksander Holynski, David E. Jacobs, Bharath Hariharan, Yael Pritch, Neal Wadhwa, Kfir Aberman, Michael Rubinstein
Once personalized, RealFill is able to complete a target image with visually compelling contents that are faithful to the original scene.
2 code implementations • NeurIPS 2023 • Luming Tang, Menglin Jia, Qianqian Wang, Cheng Perng Phoo, Bharath Hariharan
We propose a simple strategy to extract this implicit knowledge out of diffusion networks as image features, namely DIffusion FeaTures (DIFT), and use them to establish correspondences between real images.
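As a rough illustration of the correspondence step only, the sketch below matches one location in a source feature map to its nearest neighbor in a target feature map by cosine similarity. The tiny hand-written feature maps stand in for diffusion features, and the names `cosine` and `match_point` are hypothetical; this is not the DIFT code or its API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def match_point(feat_src, feat_tgt, src_rc):
    """Return the (row, col) in feat_tgt most similar to feat_src[src_rc]."""
    r, c = src_rc
    query = feat_src[r][c]
    best, best_rc = -2.0, None  # cosine similarity is never below -1
    for i, row in enumerate(feat_tgt):
        for j, vec in enumerate(row):
            s = cosine(query, vec)
            if s > best:
                best, best_rc = s, (i, j)
    return best_rc

# Toy 2x2 feature maps with 3-dim features (hypothetical stand-ins
# for per-pixel features extracted from a diffusion network).
feat_a = [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
          [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]]
feat_b = [[[0.0, 0.9, 0.1], [0.1, 0.0, 0.9]],
          [[0.7, 0.7, 0.0], [0.9, 0.1, 0.0]]]

match = match_point(feat_a, feat_b, (0, 0))
```

In this toy setup each source location is matched independently; a real pipeline would operate on dense feature tensors rather than nested lists.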
1 code implementation • CVPR 2023 • Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, Tsung-Yi Lin
DreamFusion has recently demonstrated the utility of a pre-trained text-to-image diffusion model to optimize Neural Radiance Fields (NeRF), achieving remarkable text-to-3D synthesis results.
Ranked #2 on Text to 3D on T$^3$Bench
no code implementations • 7 Jul 2022 • Davis Wertheimer, Luming Tang, Bharath Hariharan
Existing approaches generally assume that the shot number at test time is known in advance.
6 code implementations • 23 Mar 2022 • Menglin Jia, Luming Tang, Bor-Chun Chen, Claire Cardie, Serge Belongie, Bharath Hariharan, Ser-Nam Lim
The current modus operandi in adapting pre-trained models involves updating all the backbone parameters, i.e., full fine-tuning.
Ranked #2 on Prompt Engineering on ImageNet-21k
1 code implementation • CVPR 2021 • Davis Wertheimer, Luming Tang, Bharath Hariharan
In this paper we reformulate few-shot classification as a reconstruction problem in latent space.
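One way to picture classification by reconstruction is sketched below: the query feature is reconstructed from each class's support features, and the class with the smallest reconstruction error is predicted. The use of ridge regression, the regularizer `lam`, the helper names, and the toy features are all assumptions made for illustration; the listing above does not specify the paper's actual formulation.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def recon_error(query, support, lam=0.1):
    """Squared error of reconstructing query from one class's support features."""
    k = len(support)
    # Regularized Gram matrix (S S^T + lam I); lam > 0 keeps it invertible.
    gram = [[dot(support[i], support[j]) + (lam if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    w = solve(gram, [dot(s, query) for s in support])
    recon = [sum(w[i] * support[i][d] for i in range(k))
             for d in range(len(query))]
    return sum((q - r) ** 2 for q, r in zip(query, recon))

def classify(query, class_supports, lam=0.1):
    """Pick the class whose support features reconstruct the query best."""
    return min(class_supports,
               key=lambda c: recon_error(query, class_supports[c], lam))

# Toy latent features (hypothetical): two classes, two support vectors each.
supports = {
    "class_a": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "class_b": [[0.0, 0.0, 1.0], [0.0, 1.0, 1.0]],
}
pred = classify([0.9, 0.2, 0.0], supports)
```

Here the query lies almost entirely in the span of the `class_a` support vectors, so that class gives the lower reconstruction error.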
1 code implementation • CVPR 2020 • Luming Tang, Davis Wertheimer, Bharath Hariharan
Few-shot, fine-grained classification requires a model to learn subtle, fine-grained distinctions between different classes (e.g., birds) based on a few images alone.
no code implementations • ICCV 2017 • Zhongdao Wang, Luming Tang, Xihui Liu, Zhuliang Yao, Shuai Yi, Jing Shao, Junjie Yan, Shengjin Wang, Hongsheng Li, Xiaogang Wang
In our vehicle ReID framework, an orientation invariant feature embedding module and a spatial-temporal regularization module are proposed.
no code implementations • 17 Sep 2017 • Luming Tang, Yexiang Xue, Di Chen, Carla P. Gomes
Multi-Entity Dependence Learning (MEDL) explores conditional correlations among multiple entities.
1 code implementation • 11 Jul 2017 • Luming Tang, Boyang Deng, Haiyu Zhao, Shuai Yi
The proposed framework contains a hierarchical deep architecture, including the frame-level sequence modeling part and the video-level classification part.