Search Results for author: Yonglong Tian

Found 34 papers, 17 papers with code

Self-Correcting Self-Consuming Loops for Generative Model Training

1 code implementation 11 Feb 2024 Nate Gillman, Michael Freeman, Daksh Aggarwal, Chia-Hong Hsu, Calvin Luo, Yonglong Tian, Chen Sun

As synthetic data becomes higher quality and proliferates on the internet, machine learning models are increasingly trained on a mix of human- and machine-generated data.

Motion Synthesis Representation Learning

Denoising Vision Transformers

no code implementations 5 Jan 2024 Jiawei Yang, Katie Z Luo, Jiefeng Li, Kilian Q Weinberger, Yonglong Tian, Yue Wang

Our two-stage approach, termed Denoising Vision Transformers (DVT), does not require re-training existing pre-trained ViTs and is immediately applicable to any Transformer-based architecture.

Denoising

Learning Vision from Models Rivals Learning Vision from Data

1 code implementation 28 Dec 2023 Yonglong Tian, Lijie Fan, KaiFeng Chen, Dina Katabi, Dilip Krishnan, Phillip Isola

We introduce SynCLR, a novel approach for learning visual representations exclusively from synthetic images and synthetic captions, without any real data.

Contrastive Learning Image Captioning +3

Scaling Laws of Synthetic Images for Model Training ... for Now

1 code implementation 7 Dec 2023 Lijie Fan, KaiFeng Chen, Dilip Krishnan, Dina Katabi, Phillip Isola, Yonglong Tian

Our findings also suggest that scaling synthetic data can be particularly effective in scenarios such as: (1) when there is a limited supply of real images for a supervised problem (e.g., fewer than 0.5 million images in ImageNet), (2) when the evaluation dataset diverges significantly from the training data, indicating the out-of-distribution scenario, or (3) when synthetic data is used in conjunction with real images, as demonstrated in the training of CLIP models.

Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency

no code implementations 5 Oct 2023 Tianhong Li, Sangnie Bhardwaj, Yonglong Tian, Han Zhang, Jarred Barber, Dina Katabi, Guillaume Lajoie, Huiwen Chang, Dilip Krishnan

We demonstrate image generation and captioning performance on par with state-of-the-art text-to-image and image-to-text models with orders of magnitude fewer (only 3M) paired image-text data.

Restart Sampling for Improving Generative Processes

1 code implementation NeurIPS 2023 Yilun Xu, Mingyang Deng, Xiang Cheng, Yonglong Tian, Ziming Liu, Tommi Jaakkola

Restart not only outperforms the previous best SDE results, but also accelerates the sampling speed by 10-fold / 2-fold on CIFAR-10 / ImageNet $64 \times 64$.

Attribute

Improving CLIP Training with Language Rewrites

1 code implementation NeurIPS 2023 Lijie Fan, Dilip Krishnan, Phillip Isola, Dina Katabi, Yonglong Tian

During training, LaCLIP randomly selects either the original texts or the rewritten versions as text augmentations for each image.

In-Context Learning Sentence

Does Learning from Decentralized Non-IID Unlabeled Data Benefit from Self Supervision?

1 code implementation 20 Oct 2022 Lirui Wang, Kaiqing Zhang, Yunzhu Li, Yonglong Tian, Russ Tedrake

Decentralized learning has been advocated and widely deployed to make efficient use of distributed datasets, with an extensive focus on supervised learning (SL) problems.

Contrastive Learning Representation Learning +1

Self-supervision through Random Segments with Autoregressive Coding (RandSAC)

no code implementations 22 Mar 2022 Tianyu Hua, Yonglong Tian, Sucheng Ren, Michalis Raptis, Hang Zhao, Leonid Sigal

We illustrate that randomized serialization of the segments significantly improves performance and results in a distribution over spatially long (across-segment) and spatially short (within-segment) predictions, which are effective for feature learning.

Representation Learning Self-Supervised Learning

Co-advise: Cross Inductive Bias Distillation

no code implementations CVPR 2022 Sucheng Ren, Zhengqi Gao, Tianyu Hua, Zihui Xue, Yonglong Tian, Shengfeng He, Hang Zhao

Transformers have recently been adapted from the natural language processing community as a promising substitute for convolution-based neural networks in visual learning tasks.

Inductive Bias

Simple Distillation Baselines for Improving Small Self-supervised Models

1 code implementation 21 Jun 2021 Jindong Gu, Wei Liu, Yonglong Tian

While large self-supervised models have rivalled the performance of their supervised counterparts, small models still struggle.

Generative Models as a Data Source for Multiview Representation Learning

1 code implementation ICLR 2022 Ali Jahanian, Xavier Puig, Yonglong Tian, Phillip Isola

We investigate this question in the setting of learning general-purpose visual representations from a black-box generative model rather than directly from data.

Representation Learning

Divide and Contrast: Self-supervised Learning from Uncurated Data

no code implementations ICCV 2021 Yonglong Tian, Olivier J. Henaff, Aaron van den Oord

Self-supervised learning holds promise in leveraging large amounts of unlabeled data; however, much of its progress has thus far been limited to highly curated pre-training data such as ImageNet.

Clustering Contrastive Learning +2

Addressing Feature Suppression in Unsupervised Visual Representations

no code implementations 17 Dec 2020 Tianhong Li, Lijie Fan, Yuan Yuan, Hao He, Yonglong Tian, Rogerio Feris, Piotr Indyk, Dina Katabi

However, contrastive learning is susceptible to feature suppression, i.e., it may discard important information relevant to the task of interest, and learn irrelevant features.

Attribute Contrastive Learning +1

What Makes for Good Views for Contrastive Learning?

1 code implementation NeurIPS 2020 Yonglong Tian, Chen Sun, Ben Poole, Dilip Krishnan, Cordelia Schmid, Phillip Isola

Contrastive learning between multiple views of the data has recently achieved state-of-the-art performance in the field of self-supervised representation learning.

Contrastive Learning Data Augmentation +8

Supervised Contrastive Learning

23 code implementations NeurIPS 2020 Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, Dilip Krishnan

Contrastive learning applied to self-supervised representation learning has seen a resurgence in recent years, leading to state-of-the-art performance in the unsupervised training of deep image models.

Class Incremental Learning Contrastive Learning +4

Rethinking Few-Shot Image Classification: a Good Embedding Is All You Need?

3 code implementations ECCV 2020 Yonglong Tian, Yue Wang, Dilip Krishnan, Joshua B. Tenenbaum, Phillip Isola

The focus of recent meta-learning research has been on the development of learning algorithms that can quickly adapt to test time tasks with limited data and low computational cost.

Few-Shot Image Classification Few-Shot Learning +2

Training-Free Uncertainty Estimation for Dense Regression: Sensitivity as a Surrogate

no code implementations 28 Sep 2019 Lu Mi, Hao Wang, Yonglong Tian, Hao He, Nir Shavit

Uncertainty estimation is an essential step in evaluating the robustness of deep learning models in computer vision, especially when applied in risk-sensitive areas.

regression

Contrastive Multiview Coding

8 code implementations ECCV 2020 Yonglong Tian, Dilip Krishnan, Phillip Isola

We analyze key properties of the approach that make it work, finding that the contrastive loss outperforms a popular alternative based on cross-view prediction, and that the more views we learn from, the better the resulting representation captures underlying scene semantics.

Contrastive Learning Self-Supervised Action Recognition +1

ProbGAN: Towards Probabilistic GAN with Theoretical Guarantees

1 code implementation ICLR 2019 Hao He, Hao Wang, Guang-He Lee, Yonglong Tian

Probabilistic modelling is a principled framework to perform model aggregation, which has been a primary mechanism to combat mode collapse in the context of Generative Adversarial Networks (GAN).

Image Generation

Learning to Infer and Execute 3D Shape Programs

no code implementations ICLR 2019 Yonglong Tian, Andrew Luo, Xingyuan Sun, Kevin Ellis, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu

Human perception of 3D shapes goes beyond reconstructing them as a set of points or a composition of geometric primitives: we also effortlessly understand higher-level shape structure such as the repetition and reflective symmetry of object parts.

Representation Learning on Graphs with Jumping Knowledge Networks

4 code implementations ICML 2018 Keyulu Xu, Chengtao Li, Yonglong Tian, Tomohiro Sonobe, Ken-ichi Kawarabayashi, Stefanie Jegelka

Furthermore, combining the JK framework with models like Graph Convolutional Networks, GraphSAGE and Graph Attention Networks consistently improves those models' performance.

Graph Attention Node Classification +2

Through-Wall Human Pose Estimation Using Radio Signals

no code implementations CVPR 2018 Ming-Min Zhao, Tianhong Li, Mohammad Abu Alsheikh, Yonglong Tian, Hang Zhao, Antonio Torralba, Dina Katabi

Yet, unlike vision-based pose estimation, the radio-based system can estimate 2D poses through walls despite never being trained on such scenarios.

RF-based Pose Estimation

Deep Learning Strong Parts for Pedestrian Detection

no code implementations ICCV 2015 Yonglong Tian, Ping Luo, Xiaogang Wang, Xiaoou Tang

Third, each part detector in DeepParts is a strong detector that can detect a pedestrian by observing only a part of a proposal.

Occlusion Handling Pedestrian Detection

Pedestrian Detection aided by Deep Learning Semantic Tasks

no code implementations CVPR 2015 Yonglong Tian, Ping Luo, Xiaogang Wang, Xiaoou Tang

Rather than expensively annotating scene attributes, we transfer attribute information from existing scene segmentation datasets to the pedestrian dataset, by proposing a novel deep model to learn high-level features from multiple tasks and multiple data sources.

Pedestrian Detection Scene Segmentation

DeepID-Net: multi-stage and deformable deep convolutional neural networks for object detection

no code implementations 11 Sep 2014 Wanli Ouyang, Ping Luo, Xingyu Zeng, Shi Qiu, Yonglong Tian, Hongsheng Li, Shuo Yang, Zhe Wang, Yuanjun Xiong, Chen Qian, Zhenyao Zhu, Ruohui Wang, Chen-Change Loy, Xiaogang Wang, Xiaoou Tang

In the proposed new deep architecture, a new deformation constrained pooling (def-pooling) layer models the deformation of object parts with geometric constraint and penalty.

Object object-detection +1
