no code implementations • 17 Mar 2025 • Lijie Fan, Luming Tang, Siyang Qin, Tianhong Li, Xuan Yang, Siyuan Qiao, Andreas Steiner, Chen Sun, Yuanzhen Li, Tao Zhu, Michael Rubinstein, Michalis Raptis, Deqing Sun, Radu Soricut
We present UniFluid, a unified autoregressive framework for joint visual generation and understanding leveraging continuous visual tokens.
1 code implementation • 24 Feb 2025 • Tianhong Li, Qinyi Sun, Lijie Fan, Kaiming He
In this paper, we introduce a new level of modularization by abstracting generative models into atomic generative modules.
no code implementations • 13 Feb 2025 • Yuhui Zhang, Yuchang Su, Chenyu Wang, Tianhong Li, Zoe Wefers, Jeffrey Nirschl, James Burgess, Daisy Ding, Alejandro Lozano, Emma Lundberg, Serena Yeung-Levy
Building a virtual cell capable of accurately simulating cellular behaviors in silico has long been a dream in computational biology.
1 code implementation • 17 Oct 2024 • Lijie Fan, Tianhong Li, Siyang Qin, Yuanzhen Li, Chen Sun, Michael Rubinstein, Deqing Sun, Kaiming He, Yonglong Tian
Models based on continuous tokens achieve significantly better visual quality than those using discrete tokens.
Ranked #9 on Text-to-Image Generation on GenEval
1 code implementation • 17 Jun 2024 • Tianhong Li, Yonglong Tian, He Li, Mingyang Deng, Kaiming He
In this work, we propose to model the per-token probability distribution using a diffusion procedure, which allows us to apply autoregressive models in a continuous-valued space.
Ranked #14 on Image Generation on ImageNet 512x512
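The excerpt above describes replacing the usual categorical softmax over discrete tokens with a per-token diffusion loss so the autoregressive model can operate on continuous-valued tokens. A minimal numpy sketch of that objective follows; the function names (`diffusion_loss`, `toy_eps_net`), the cosine noise schedule, and the linear stand-in for the denoising network are all illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def diffusion_loss(token, context, eps_net, n_steps=1000):
    """Denoising loss for one continuous token, conditioned on the
    autoregressive context vector: noise the token, then score how
    well eps_net recovers the injected noise."""
    t = rng.integers(1, n_steps + 1)                  # random diffusion step
    abar = np.cos(0.5 * np.pi * t / n_steps) ** 2     # toy cosine noise schedule
    eps = rng.standard_normal(token.shape)            # Gaussian noise
    noisy = np.sqrt(abar) * token + np.sqrt(1.0 - abar) * eps
    eps_hat = eps_net(noisy, context, t / n_steps)    # predict the injected noise
    return float(np.mean((eps - eps_hat) ** 2))       # simple MSE objective

# Hypothetical "denoiser": a random linear map standing in for a small MLP head.
D = 16                                                # token dimensionality
W = rng.standard_normal((D, 2 * D + 1)) * 0.01
def toy_eps_net(noisy, context, t_frac):
    inp = np.concatenate([noisy, context, [t_frac]])
    return W @ inp

loss = diffusion_loss(rng.standard_normal(D), rng.standard_normal(D), toy_eps_net)
```

At generation time the trained denoiser would be run in reverse to sample each next token in continuous space, sidestepping vector quantization entirely.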
1 code implementation • 6 Dec 2023 • Tianhong Li, Dina Katabi, Kaiming He
This gap can be attributed to the lack of semantic information provided by labels.
Ranked #1 on Unconditional Image Generation on ImageNet 256x256
no code implementations • 5 Oct 2023 • Tianhong Li, Sangnie Bhardwaj, Yonglong Tian, Han Zhang, Jarred Barber, Dina Katabi, Guillaume Lajoie, Huiwen Chang, Dilip Krishnan
We demonstrate image generation and captioning performance on par with state-of-the-art text-to-image and image-to-text models while using orders of magnitude less paired image-text data (only 3M pairs).
no code implementations • 23 May 2023 • Tianhong Li, Vibhaalakshmi Sivaraman, Pantea Karimi, Lijie Fan, Mohammad Alizadeh, Dina Katabi
Packet loss during video conferencing often results in poor quality and video freezing.
1 code implementation • CVPR 2023 • Tianhong Li, Huiwen Chang, Shlok Kumar Mishra, Han Zhang, Dina Katabi, Dilip Krishnan
In this work, we propose MAsked Generative Encoder (MAGE), the first framework to unify SOTA image generation and self-supervised representation learning.
Ranked #3 on Unconditional Image Generation on ImageNet 256x256
no code implementations • 6 Jul 2022 • Tianhong Li, Lijie Fan, Yuan Yuan, Dina Katabi
Thus, in this paper, we explore the feasibility of adapting RGB-based unsupervised representation learning to RF signals.
1 code implementation • CVPR 2022 • Tianhong Li, Peng Cao, Yuan Yuan, Lijie Fan, Yuzhe Yang, Rogerio Feris, Piotr Indyk, Dina Katabi
This forces all classes, including minority classes, to maintain a uniform distribution in the feature space, improves class boundaries, and provides better generalization even in the presence of long-tail data.
Ranked #24 on Long-tail Learning on CIFAR-10-LT (ρ=100)
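The excerpt above says the method forces every class, including minority classes, toward a uniform layout in feature space. One way to read that mechanically is: precompute one unit-norm target direction per class, spread as evenly as possible on the hypersphere, and pull each class's features toward its target. The numpy sketch below only illustrates that target-spreading step; the repulsion objective and all names are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def spread_targets(n_classes, dim, steps=200, lr=0.1, seed=0):
    """Spread one unit-norm target vector per class by repeatedly pushing
    each target away from the others, approximating a uniform layout on
    the hypersphere so no class is crowded out of feature space."""
    rng = np.random.default_rng(seed)
    T = rng.standard_normal((n_classes, dim))
    T /= np.linalg.norm(T, axis=1, keepdims=True)
    for _ in range(steps):
        sim = T @ T.T                        # pairwise cosine similarities
        np.fill_diagonal(sim, 0.0)
        grad = sim @ T                       # gradient of sum of squared similarities
        T -= lr * grad                       # push targets apart
        T /= np.linalg.norm(T, axis=1, keepdims=True)  # project back to the sphere
    return T

T = spread_targets(10, 64)
worst = float(np.max((T @ T.T) - np.eye(10)))  # largest off-diagonal similarity
```

With 10 classes in 64 dimensions the targets can become nearly orthogonal, so the worst pairwise similarity shrinks toward zero regardless of how many training samples each class has.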
no code implementations • 17 Dec 2020 • Tianhong Li, Lijie Fan, Yuan Yuan, Hao He, Yonglong Tian, Rogerio Feris, Piotr Indyk, Dina Katabi
However, contrastive learning is susceptible to feature suppression, i.e., it may discard important information relevant to the task of interest and instead learn irrelevant features.
no code implementations • ECCV 2020 • Lijie Fan, Tianhong Li, Yuan Yuan, Dina Katabi
This paper aims to caption daily life, i.e., to create a textual description of people's activities and interactions with objects in their homes.
no code implementations • CVPR 2020 • Lijie Fan, Tianhong Li, Rongyao Fang, Rumen Hristov, Yuan Yuan, Dina Katabi
RF signals traverse clothes and reflect off the human body; thus they can be used to extract more persistent human-identifying features like body size and shape.
no code implementations • ICCV 2019 • Tianhong Li, Lijie Fan, Ming-Min Zhao, Yingcheng Liu, Dina Katabi
Understanding people's actions and interactions typically depends on seeing them.
Ranked #1 on RF-based Pose Estimation on RF-MMD
1 code implementation • CVPR 2020 • Tianhong Li, Jianguo Li, Zhuang Liu, Chang-Shui Zhang
Deep neural network compression techniques such as pruning and weight tensor decomposition usually require fine-tuning to recover the prediction accuracy when the compression ratio is high.
no code implementations • 27 Sep 2018 • Tianhong Li, Jianguo Li, Zhuang Liu, ChangShui Zhang
Under the assumption that the "teacher" and "student" have the same feature-map sizes at each corresponding block, we add a $1\times 1$ conv-layer at the end of each block in the student-net and align the block-level outputs between "teacher" and "student" by estimating the parameters of the added layer from a limited number of samples.
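Because a $1\times 1$ convolution acts as the same linear map over channels at every spatial position, its parameters can be estimated in closed form from a handful of samples by least squares. The sketch below illustrates that alignment step under the excerpt's same-feature-map-size assumption; `fit_align_layer` and the toy shapes are illustrative, not the paper's exact implementation.

```python
import numpy as np

def fit_align_layer(student_feats, teacher_feats):
    """Estimate a 1x1 conv (a per-position linear map over channels) that
    aligns student block outputs with teacher block outputs, using plain
    least squares. Inputs are (N, C, H, W); the result has shape (C_t, C_s)."""
    n, c_s, h, w = student_feats.shape
    c_t = teacher_feats.shape[1]
    # A 1x1 conv is applied independently at each spatial position, so every
    # position of every sample becomes one row of the least-squares system.
    S = student_feats.transpose(0, 2, 3, 1).reshape(-1, c_s)   # (N*H*W, C_s)
    T = teacher_feats.transpose(0, 2, 3, 1).reshape(-1, c_t)   # (N*H*W, C_t)
    W, *_ = np.linalg.lstsq(S, T, rcond=None)                  # (C_s, C_t)
    return W.T                                                 # (C_t, C_s)

# Sanity check: when the teacher features truly are a linear map of the
# student features, a few samples suffice to recover that map.
rng = np.random.default_rng(0)
W_true = rng.standard_normal((32, 16))
s = rng.standard_normal((4, 16, 8, 8))                         # only 4 samples
t = np.einsum('ts,nshw->nthw', W_true, s)
W_est = fit_align_layer(s, t)
err = float(np.max(np.abs(W_est - W_true)))
```

Each spatial position contributes an equation, so even 4 samples of 8x8 maps give 256 rows for 16 unknowns per output channel, which is why no gradient-based fine-tuning is needed.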
no code implementations • SIGCOMM '18 Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication 2018 • Ming-Min Zhao, Yonglong Tian, Hang Zhao, Mohammad Abu Alsheikh, Tianhong Li, Rumen Hristov, Zachary Kabelac, Dina Katabi, Antonio Torralba
It maintains this accuracy even in the presence of multiple people, and in new environments that it has not seen in the training set.
no code implementations • CVPR 2018 • Ming-Min Zhao, Tianhong Li, Mohammad Abu Alsheikh, Yonglong Tian, Hang Zhao, Antonio Torralba, Dina Katabi
Yet, unlike vision-based pose estimation, the radio-based system can estimate 2D poses through walls despite never trained on such scenarios.
7 code implementations • ICLR 2018 • Gao Huang, Danlu Chen, Tianhong Li, Felix Wu, Laurens van der Maaten, Kilian Q. Weinberger
In this paper we investigate image classification with computational resource limits at test time.
General Classification • Handwritten Mathematical Expression Recognition
no code implementations • 18 Feb 2017 • Lunjia Hu, Ruihan Wu, Tianhong Li, Li-Wei Wang
The RTD of a concept class $\mathcal C \subseteq \{0, 1\}^n$, introduced by Zilles et al. (2011), is a combinatorial complexity measure characterized by the worst-case number of examples necessary to identify a concept in $\mathcal C$ according to the recursive teaching model.