no code implementations • 7 Jul 2024 • Haozhe Zhao, Xiaojian Ma, Liang Chen, Shuzheng Si, Rujie Wu, Kaikai An, Peiyu Yu, Minjia Zhang, Qing Li, Baobao Chang
This paper presents UltraEdit, a large-scale (approximately 4 million editing samples), automatically generated dataset for instruction-based image editing.
1 code implementation • 29 May 2024 • Yasi Zhang, Peiyu Yu, Yaxuan Zhu, Yingshan Chang, Feng Gao, Ying Nian Wu, Oscar Leong
Generative models based on flow matching have attracted significant attention for their simplicity and superior performance in high-resolution image synthesis.
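For reference, the core flow-matching objective can be sketched in a few lines; the linear interpolation path and function names below are illustrative assumptions, not necessarily the exact formulation used in the paper.

```python
# Minimal sketch of a conditional flow-matching loss with a linear
# (rectified-flow-style) probability path; `v_theta(x_t, t)` is assumed
# to be any vector-field network.
import torch

def flow_matching_loss(v_theta, x1):
    """x1: batch of data samples with shape (B, ...)."""
    x0 = torch.randn_like(x1)                      # noise endpoint
    t = torch.rand(x1.shape[0], device=x1.device)
    t_b = t.view(-1, *([1] * (x1.dim() - 1)))      # broadcastable time
    xt = (1 - t_b) * x0 + t_b * x1                 # point on the linear path
    target = x1 - x0                               # target velocity field
    return ((v_theta(xt, t) - target) ** 2).mean()
```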
no code implementations • 27 May 2024 • Peiyu Yu, Dinghuai Zhang, Hengzhi He, Xiaojian Ma, Ruiyao Miao, Yifan Lu, Yasi Zhang, Deqian Kong, Ruiqi Gao, Jianwen Xie, Guang Cheng, Ying Nian Wu
To this end, we formulate a learnable energy-based latent space and propose a Noise-intensified Telescoping density-Ratio Estimation (NTRE) scheme for variational learning of an accurate latent space model without costly Markov Chain Monte Carlo.
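For background, telescoping density-ratio estimation decomposes a hard ratio between a data distribution and noise into a chain of easier ratios between adjacent noise levels. The sketch below follows the generic TRE idea (Rhodes et al., 2020); the noise "waymarks", classifier heads, and logistic loss are illustrative assumptions, not the paper's NTRE scheme.

```python
# Rough sketch of telescoping density-ratio estimation between clean latents
# and noise. All architectural choices here are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TelescopingRatioEstimator(nn.Module):
    def __init__(self, dim, num_levels=4, hidden=128):
        super().__init__()
        # noise mixing coefficients defining intermediate "waymark" distributions
        self.register_buffer("alphas", torch.linspace(0.0, 1.0, num_levels + 1))
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, 1))
            for _ in range(num_levels)
        )

    def waymark(self, z, eps, k):
        a = self.alphas[k]
        return torch.sqrt(1 - a ** 2) * z + a * eps   # interpolate toward noise

    def head_loss(self, k, z, eps):
        # logistic loss discriminating adjacent waymarks (level k vs. k + 1)
        pos = self.heads[k](self.waymark(z, eps, k))
        neg = self.heads[k](self.waymark(z, eps, k + 1))
        return (F.softplus(-pos) + F.softplus(neg)).mean()

    def log_ratio(self, z):
        # telescoping sum: log p_0(z)/p_K(z) = sum_k log p_k(z)/p_{k+1}(z)
        return sum(head(z).squeeze(-1) for head in self.heads)
```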
no code implementations • 10 Apr 2024 • Yasi Zhang, Peiyu Yu, Ying Nian Wu
Text-to-image diffusion models have shown great success in generating high-quality text-guided images.
1 code implementation • NeurIPS 2023 • Peiyu Yu, Yaxuan Zhu, Sirui Xie, Xiaojian Ma, Ruiqi Gao, Song-Chun Zhu, Ying Nian Wu
To remedy this sampling issue, in this paper we introduce a simple but effective diffusion-based amortization method for long-run MCMC sampling and develop a novel learning algorithm for the latent space EBM based on it.
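For context, the long-run MCMC that such amortization targets is typically Langevin dynamics in the latent space of an energy-based prior. A minimal sketch follows; the step size, step count, and Gaussian base prior are illustrative assumptions, not the paper's exact setup.

```python
# Langevin dynamics for sampling a latent EBM prior p(z) ∝ exp(-E(z)) N(z; 0, I).
# `energy` is assumed to be a scalar-valued network E_theta(z).
import torch

def langevin_sample_prior(energy, z_init, n_steps=100, step_size=0.1):
    z = z_init.clone().requires_grad_(True)
    for _ in range(n_steps):
        # negative log-density up to a constant: E_theta(z) + ||z||^2 / 2
        neg_logp = energy(z).sum() + 0.5 * (z ** 2).sum()
        grad, = torch.autograd.grad(neg_logp, z)
        with torch.no_grad():
            z = z - 0.5 * step_size ** 2 * grad + step_size * torch.randn_like(z)
        z.requires_grad_(True)
    return z.detach()
```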
no code implementations • 5 Oct 2023 • Yilue Qian, Peiyu Yu, Ying Nian Wu, Yao Su, Wei Wang, Lifeng Fan
In this paper, we propose an interpretable and generalizable visual planning framework consisting of i) a novel Substitution-based Concept Learner (SCL) that abstracts visual inputs into disentangled concept representations, ii) symbol abstraction and reasoning that performs task planning via the self-learned symbols, and iii) a Visual Causal Transition model (ViCT) that grounds visual causal transitions to semantically similar real-world actions.
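Illustratively, such a three-stage pipeline (concept abstraction, symbolic planning, visual/causal grounding) might be wired together as below; all class and method names are hypothetical placeholders rather than the authors' released interfaces.

```python
# Purely illustrative skeleton of the described visual planning pipeline.
class VisualPlanner:
    def __init__(self, concept_learner, symbolic_planner, causal_transition):
        self.concept_learner = concept_learner      # SCL-style concept abstraction
        self.symbolic_planner = symbolic_planner    # plans over self-learned symbols
        self.causal_transition = causal_transition  # ViCT-style grounding of each step

    def plan(self, start_image, goal_image):
        start = self.concept_learner(start_image)   # disentangled concept representation
        goal = self.concept_learner(goal_image)
        symbolic_plan = self.symbolic_planner(start, goal)
        # ground each abstract planning step to a real-world action
        return [self.causal_transition(step) for step in symbolic_plan]
```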
2 code implementations • 13 Jun 2022 • Peiyu Yu, Sirui Xie, Xiaojian Ma, Baoxiong Jia, Bo Pang, Ruiqi Gao, Yixin Zhu, Song-Chun Zhu, Ying Nian Wu
Latent space Energy-Based Models (EBMs), also known as energy-based priors, have drawn growing interest in generative modeling.
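Concretely, an energy-based prior tilts a simple base prior with a learnable energy function. The minimal sketch below shows the unnormalized log-density of such a prior; the small MLP energy head and the standard-normal base are standard choices assumed here, not the paper's specific architecture.

```python
# Minimal sketch of an energy-based prior: p_alpha(z) ∝ exp(f_alpha(z)) * N(z; 0, I).
import torch
import torch.nn as nn

class EnergyBasedPrior(nn.Module):
    def __init__(self, latent_dim, hidden=128):
        super().__init__()
        self.f_alpha = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.SiLU(), nn.Linear(hidden, 1)
        )

    def unnormalized_log_prob(self, z):
        # energy correction of the standard-normal base prior (up to log Z)
        base_log_prob = -0.5 * (z ** 2).sum(dim=-1)
        return self.f_alpha(z).squeeze(-1) + base_log_prob
```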
2 code implementations • NeurIPS 2021 • Peiyu Yu, Sirui Xie, Xiaojian Ma, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu
Foreground extraction can be viewed as a special case of generic image segmentation that focuses on identifying and disentangling objects from the background.
no code implementations • 22 Feb 2021 • Sirui Xie, Xiaojian Ma, Peiyu Yu, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu
Leveraging these concepts, they could understand the internal structure of this task, without seeing all of the problem instances.
no code implementations • 19 Dec 2019 • Peiyu Yu, Yongming Rao, Jiwen Lu, Jie Zhou
Humans are able to perform fast and accurate object pose estimation even under severe occlusion by exploiting learned object model priors from everyday life.