Search Results for author: Yutong He

Found 27 papers, 10 papers with code

Blind Inverse Problem Solving Made Easy by Text-to-Image Latent Diffusion

no code implementations 30 Nov 2024 Michail Dontas, Yutong He, Naoki Murata, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov

Blind inverse problems, where both the target data and forward operator are unknown, are crucial to many computer vision applications.

Image Restoration

Subspace Optimization for Large Language Models with Convergence Guarantees

1 code implementation 15 Oct 2024 Yutong He, Pengrui Li, Yipeng Hu, Chuyan Chen, Kun Yuan

Subspace optimization algorithms, with GaLore (Zhao et al., 2024) as a representative method, have gained popularity for pre-training or fine-tuning large language models (LLMs) due to their memory efficiency.

PianoBART: Symbolic Piano Music Generation and Understanding with Large-Scale Pre-Training

1 code implementation 26 Jun 2024 Xiao Liang, Zijian Zhao, Weichao Zeng, Yutong He, Fupeng He, Yiyi Wang, Chengying Gao

Learning musical structures and composition patterns is necessary for both music generation and understanding, but current methods do not make uniform use of learned features to generate and comprehend music simultaneously.

Music Generation

Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation

no code implementations 28 Mar 2024 Yutong He, Alexander Robey, Naoki Murata, Yiding Jiang, Joshua Nathaniel Williams, George J. Pappas, Hamed Hassani, Yuki Mitsufuji, Ruslan Salakhutdinov, J. Zico Kolter

Prompt engineering is effective for controlling the output of text-to-image (T2I) generative models, but it is also laborious due to the need for manually crafted prompts.

In-Context Learning Language Modeling +4

Weather Prediction with Diffusion Guided by Realistic Forecast Processes

no code implementations 6 Feb 2024 Zhanxiang Hua, Yutong He, Chengqian Ma, Alexandra Anderson-Frey

Weather forecasting remains a crucial yet challenging domain, where recently developed models based on deep learning (DL) have approached the performance of traditional numerical weather prediction (NWP) models.

Weather Forecasting

Manifold Preserving Guided Diffusion

no code implementations 28 Nov 2023 Yutong He, Naoki Murata, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Dongjun Kim, Wei-Hsiang Liao, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov, Stefano Ermon

Despite the recent advancements, conditional image generation still faces challenges of cost, generalizability, and the need for task-specific training.

Conditional Image Generation

Towards reporting bias in visual-language datasets: bimodal augmentation by decoupling object-attribute association

no code implementations 2 Oct 2023 Qiyu Wu, Mengjie Zhao, Yutong He, Lang Huang, Junya Ono, Hiromi Wakaki, Yuki Mitsufuji

In this paper, we focus on the wide existence of reporting bias in visual-language datasets, embodied as the object-attribute association, which can subsequentially degrade models trained on them.

Attribute Object

Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion

2 code implementations 1 Oct 2023 Dongjun Kim, Chieh-Hsin Lai, Wei-Hsiang Liao, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Yutong He, Yuki Mitsufuji, Stefano Ermon

Consistency Models (CM) (Song et al., 2023) accelerate score-based diffusion model sampling at the cost of sample quality but lack a natural way to trade-off quality for speed.

Ranked #1 on Image Generation on ImageNet 64x64 (NFE metric)

Denoising Image Generation

Sphere2Vec: A General-Purpose Location Representation Learning over a Spherical Surface for Large-Scale Geospatial Predictions

no code implementations 30 Jun 2023 Gengchen Mai, Yao Xuan, Wenyun Zuo, Yutong He, Jiaming Song, Stefano Ermon, Krzysztof Janowicz, Ni Lao

So when applied to large-scale real-world GPS coordinate datasets, which require distance metric learning on the spherical surface, both types of models can fail due to the map projection distortion problem (2D) and the spherical-to-Euclidean distance approximation error (3D).

Image Classification Metric Learning +2

Localized Text-to-Image Generation for Free via Cross Attention Control

no code implementations 26 Jun 2023 Yutong He, Ruslan Salakhutdinov, J. Zico Kolter

Despite the tremendous success in text-to-image generative models, localized text-to-image generation (that is, generating objects or features at specific locations in an image while maintaining a consistent overall generation) still requires either explicit training or substantial additional inference time.

Semantic Segmentation Text-to-Image Generation

Unbiased Compression Saves Communication in Distributed Optimization: When and How Much?

no code implementations NeurIPS 2023 Yutong He, Xinmeng Huang, Kun Yuan

Our results reveal that using independent unbiased compression can reduce the total communication cost by a factor of up to $\Theta(\sqrt{\min\{n, \kappa\}})$ when all local smoothness constants are constrained by a common upper bound, where $n$ is the number of workers and $\kappa$ is the condition number of the functions being minimized.

Distributed Optimization

Lower Bounds and Accelerated Algorithms in Distributed Stochastic Optimization with Communication Compression

no code implementations 12 May 2023 Yutong He, Xinmeng Huang, Yiming Chen, Wotao Yin, Kun Yuan

In this paper, we investigate the performance limit of distributed stochastic optimization algorithms employing communication compression.

Stochastic Optimization

CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations

2 code implementations 1 May 2023 Gengchen Mai, Ni Lao, Yutong He, Jiaming Song, Stefano Ermon

To directly leverage the abundant geospatial information associated with images in pre-training, fine-tuning, and inference stages, we present Contrastive Spatial Pre-Training (CSP), a self-supervised learning framework for geo-tagged images.

Contrastive Learning Image Classification +1

Passive Non-line-of-sight Imaging for Moving Targets with an Event Camera

no code implementations 27 Sep 2022 Conghe Wang, Yutong He, Xia Wang, Honghao Huang, Changda Yan, Xin Zhang, Hongwei Chen

Non-line-of-sight (NLOS) imaging is an emerging technique for detecting objects behind obstacles or around corners.

Tracking Urbanization in Developing Regions with Remote Sensing Spatial-Temporal Super-Resolution

no code implementations 4 Apr 2022 Yutong He, William Zhang, Chenlin Meng, Marshall Burke, David B. Lobell, Stefano Ermon

Automated tracking of urban development in areas where construction information is not available became possible with recent advancements in machine learning and remote sensing.

Image Super-Resolution Object Tracking +2

Sphere2Vec: Self-Supervised Location Representation Learning on Spherical Surfaces

no code implementations 29 Sep 2021 Gengchen Mai, Yao Xuan, Wenyun Zuo, Yutong He, Stefano Ermon, Jiaming Song, Krzysztof Janowicz, Ni Lao

Location encoding is valuable for a multitude of tasks where both the absolute positions and local contexts (image, text, and other types of metadata) of spatial objects are needed for accurate predictions.

Image Classification Representation Learning +1

SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations

1 code implementation ICLR 2022 Chenlin Meng, Yutong He, Yang Song, Jiaming Song, Jiajun Wu, Jun-Yan Zhu, Stefano Ermon

The key challenge is balancing faithfulness to the user input (e.g., hand-drawn colored strokes) and realism of the synthesized image.

Denoising Image Generation

Spatial-Temporal Super-Resolution of Satellite Imagery via Conditional Pixel Synthesis

1 code implementation NeurIPS 2021 Yutong He, Dingjie Wang, Nicholas Lai, William Zhang, Chenlin Meng, Marshall Burke, David B. Lobell, Stefano Ermon

High-resolution satellite imagery has proven useful for a broad range of tasks, including measurement of global human population, local economic livelihoods, and biodiversity, among many others.

Object Counting Super-Resolution

Can we observe the QCD phase transition-generated gravitational waves through pulsar timing arrays?

no code implementations24 Feb 2021 Axel Brandenburg, Emma Clarke, Yutong He, Tina Kahniashvili

This is further made possible by our findings of shallower spectra proportional to the square root of the frequency for nonhelical hydromagnetic turbulence.

Cosmology and Nongalactic Astrophysics High Energy Physics - Phenomenology Fluid Dynamics

Relic gravitational waves from the chiral magnetic effect

no code implementations20 Jan 2021 Axel Brandenburg, Yutong He, Tina Kahniashvili, Matthias Rheinhardt, Jennifer Schober

Here we treat magnetic field generation through the chiral magnetic effect (CME) as a generic mechanism and explore its dependence on the speed of generation (the product of magnetic diffusivity and characteristic wavenumber) and the speed characterizing the maximum magnetic field strength expected from the CME.

Cosmology and Nongalactic Astrophysics High Energy Astrophysical Phenomena General Relativity and Quantum Cosmology

Motion-Based Handwriting Recognition and Word Reconstruction

1 code implementation 15 Jan 2021 Junshen Kevin Chen, Wanze Xie, Yutong He

In this project, we leverage a trained single-letter classifier to predict the written word from a continuously written word sequence, by designing a word reconstruction pipeline consisting of a dynamic-programming algorithm and an auto-correction model.

Domain Adaptation Handwriting Recognition

Motion-Based Handwriting Recognition

1 code implementation 15 Jan 2021 Junshen Kevin Chen, Wanze Xie, Yutong He

We attempt to overcome the restriction of requiring a writing surface for handwriting recognition.

Data Augmentation Handwriting Recognition

H-divergence: A Decision-Theoretic Discrepancy Measure for Two Sample Tests

no code implementations 1 Jan 2021 Shengjia Zhao, Abhishek Sinha, Yutong He, Aidan Perreault, Jiaming Song, Stefano Ermon

Based on ideas from decision theory, we investigate a new class of discrepancies that are based on the optimal decision loss.

Vocal Bursts Valence Prediction

Fine-grained Image-to-Image Transformation towards Visual Recognition

no code implementations CVPR 2020 Wei Xiong, Yutong He, Yixuan Zhang, Wenhan Luo, Lin Ma, Jiebo Luo

In this paper, we aim at transforming an image with a fine-grained category to synthesize new images that preserve the identity of the input image, which can thereby benefit the subsequent fine-grained image recognition and few-shot learning tasks.

Few-Shot Learning Fine-Grained Image Recognition
