1 code implementation • 2 Apr 2024 • Fangzhou Mu, Sicheng Mo, Yin Li
In this paper, we study the effect of cross-modal fusion on the scalability of video grounding models.
no code implementations • 26 Mar 2024 • Fangzhou Mu, Carter Sifferman, Sacha Jungerman, Yiquan Li, Mark Han, Michael Gleicher, Mohit Gupta, Yin Li
We present a method for reconstructing the 3D shape of arbitrary Lambertian objects from measurements captured by miniature, energy-efficient, low-cost single-photon cameras.
1 code implementation • 22 Feb 2024 • Zhuoyan Xu, Zhenmei Shi, Junyi Wei, Fangzhou Mu, Yin Li, Yingyu Liang
An emerging solution with recent success in vision and NLP involves finetuning a foundation model on a selection of relevant tasks, before its adaptation to a target task with limited labeled samples.
no code implementations • 12 Dec 2023 • Sicheng Mo, Fangzhou Mu, Kuan Heng Lin, Yanli Liu, Bochen Guan, Yin Li, Bolei Zhou
Recent approaches such as ControlNet offer users fine-grained spatial control over text-to-image (T2I) diffusion models.
1 code implementation • 7 Dec 2023 • Tiantian Wang, Xinxin Zuo, Fangzhou Mu, Jian Wang, Ming-Hsuan Yang
To overcome these limitations, we leverage Neural Radiance Fields (NeRFs) to represent videos, conducting stylization in the rendered feature space.
1 code implementation • 5 Jul 2023 • Lin Sui, Fangzhou Mu, Yin Li
This report describes our submission to the Ego4D Moment Queries Challenge 2023.
no code implementations • 25 May 2023 • Zhengyang Lou, Huan Xu, Fangzhou Mu, Yanli Liu, Xiaoyu Zhang, Liang Shang, Jiang Li, Bochen Guan, Yin Li, Yu Hen Hu
Using a modern game engine, our approach renders crisp clean images and their precise depth maps, based on which high-quality hazy images can be synthesized for training dehazing models.
1 code implementation • CVPR 2023 • Yuheng Li, Haotian Liu, Qingyang Wu, Fangzhou Mu, Jianwei Yang, Jianfeng Gao, Chunyuan Li, Yong Jae Lee
Large-scale text-to-image diffusion models have made remarkable advances.
Ranked #4 on Conditional Text-to-Image Synthesis on COCO-MIG
no code implementations • ICCV 2023 • Felipe Gutierrez-Barragan, Fangzhou Mu, Andrei Ardelean, Atul Ingle, Claudio Bruschini, Edoardo Charbon, Yin Li, Mohit Gupta, Andreas Velten
Single-photon 3D cameras can record the time-of-arrival of billions of photons per second with picosecond accuracy.
2 code implementations • 16 Nov 2022 • Fangzhou Mu, Sicheng Mo, Gillian Wang, Yin Li
This report describes our submission to the Ego4D Moment Queries Challenge 2022.
Ranked #1 on Temporal Action Localization on Ego4D MQ test
1 code implementation • 16 Nov 2022 • Sicheng Mo, Fangzhou Mu, Yin Li
This report describes Badgers@UW-Madison, our submission to the Ego4D Natural Language Queries (NLQ) Challenge.
no code implementations • 3 May 2022 • Fangzhou Mu, Sicheng Mo, Jiayong Peng, Xiaochun Liu, Ji Hyun Nam, Siddeshwar Raghavan, Andreas Velten, Yin Li
The computational approach to imaging around corners, or non-line-of-sight (NLOS) imaging, is becoming a reality thanks to major advances in imaging hardware and reconstruction algorithms.
no code implementations • CVPR 2022 • Ran Xu, Fangzhou Mu, Jayoung Lee, Preeti Mukherjee, Somali Chaterji, Saurabh Bagchi, Yin Li
In this paper, we ask, and answer, the wide-ranging question across all MBODFs: How to expose the right set of execution branches and then how to schedule the optimal one at inference time?
no code implementations • CVPR 2022 • Fangzhou Mu, Jian Wang, Yicheng Wu, Yin Li
Our key intuition is that style transfer and view synthesis have to be jointly modeled for this task.
no code implementations • 16 Sep 2021 • Jiayong Peng, Fangzhou Mu, Ji Hyun Nam, Siddeshwar Raghavan, Yin Li, Andreas Velten, Zhiwei Xiong
Non-line-of-sight (NLOS) imaging is based on capturing multi-bounce indirect reflections from hidden objects.
no code implementations • ICLR 2020 • Fangzhou Mu, Yingyu Liang, Yin Li
We address the challenging problem of deep representation learning: the efficient adaptation of a pre-trained deep network to different tasks.