1 code implementation • 15 Apr 2025 • Xingjian Leng, Jaskirat Singh, Yunzhong Hou, Zhenchang Xing, Saining Xie, Liang Zheng
We show that while diffusion loss is ineffective, end-to-end training can be unlocked through the representation-alignment (REPA) loss -- allowing both VAE and diffusion model to be jointly tuned during the training process.
Ranked #2 on Image Generation on ImageNet 256x256
1 code implementation • 14 Apr 2025 • Xingjian Leng, Jaskirat Singh, Yunzhong Hou, Zhenchang Xing, Saining Xie, Liang Zheng
We show that while diffusion loss is ineffective, end-to-end training can be unlocked through the representation-alignment (REPA) loss -- allowing both VAE and diffusion model to be jointly tuned during the training process.
no code implementations • 9 Apr 2025 • Naman Jain, Jaskirat Singh, Manish Shetty, Liang Zheng, Koushik Sen, Ion Stoica
Improving open-source models on real-world SWE tasks (solving GitHub issues) faces two key challenges: 1) scalable curation of execution environments to train these models, and 2) optimal scaling of test-time compute.
no code implementations • 8 Apr 2025 • Jaskirat Singh, Junshen Kevin Chen, Jonas Kohler, Michael Cohen
In this work, we first analyze the reason for these limitations.
no code implementations • 8 Mar 2025 • Rishabh Gupta, Shivam Gupta, Jaskirat Singh, Sabre Kais
Short-term patterns in financial time series form the cornerstone of many algorithmic trading strategies, yet extracting these patterns reliably from noisy market data remains a formidable challenge.
no code implementations • 2 Dec 2024 • Jaskirat Singh, Lindsey Li, Weijia Shi, Ranjay Krishna, Yejin Choi, Pang Wei Koh, Michael F. Cohen, Stephen Gould, Liang Zheng, Luke Zettlemoyer
We introduce negative token merging (NegToMe), a simple but effective training-free approach which performs adversarial guidance through images by selectively pushing apart matching visual features between reference and generated images during the reverse diffusion process.
no code implementations • 1 Nov 2024 • Jaskirat Singh, Bram Adams, Ahmed E. Hassan
Among the non-hybrid operators, the Distilled operator is a better alternative in both mobile and edge tiers for lower latency performance at the cost of small to medium accuracy loss.
no code implementations • 4 Sep 2024 • Haiyu Wu, Jaskirat Singh, Sicong Tian, Liang Zheng, Kevin W. Bowyer
However, existing works 1) are typically limited in how many well-separated identities can be generated and 2) either neglect or use a separate editing model for attribute augmentation.
2 code implementations • 23 Jul 2024 • Xingyao Wang, Boxuan Li, Yufan Song, Frank F. Xu, Xiangru Tang, Mingchen Zhuge, Jiayi Pan, Yueqi Song, Bowen Li, Jaskirat Singh, Hoang H. Tran, Fuqiang Li, Ren Ma, Mingzhang Zheng, Bill Qian, Yanjun Shao, Niklas Muennighoff, Yizhe Zhang, Binyuan Hui, Junyang Lin, Robert Brennan, Hao Peng, Heng Ji, Graham Neubig
We introduce OpenHands (f.k.a. OpenDevin), a platform for the development of powerful and flexible AI agents that interact with the world in similar ways to those of a human developer: by writing code, interacting with a command line, and browsing the web.
1 code implementation • 30 May 2024 • Yuchi Liu, Jaskirat Singh, Gaowen Liu, Ali Payani, Liang Zheng
Specifically, we include a hierarchy of LLMs, first constructing a prompt with precise instructions and accurate wording in a hierarchical manner, and then using this prompt to generate the final answer to the user query.
1 code implementation • 25 Mar 2024 • Jaskirat Singh, Emad Fallahzadeh, Bram Adams, Ahmed E. Hassan
In this paper, we conduct inference experiments involving three deployment operators (i.e., Partitioning, Quantization, Early Exit), three deployment tiers (i.e., Mobile, Edge, Cloud), and their combinations on four widely used computer vision models to investigate the optimal strategies from the point of view of MLOps developers.
no code implementations • CVPR 2024 • Jaskirat Singh, Jianming Zhang, Qing Liu, Cameron Smith, Zhe Lin, Liang Zheng
To overcome these limitations, we introduce SmartMask, which allows any novice user to create detailed masks for precise object insertion.
1 code implementation • 12 Nov 2023 • Zhaoyuan Yang, Zhengyang Yu, Zhiwei Xu, Jaskirat Singh, Jing Zhang, Dylan Campbell, Peter Tu, Richard Hartley
We present a diffusion-based image morphing approach with perceptually-uniform sampling (IMPUS) that produces smooth, direct and realistic interpolations given an image pair.
no code implementations • NeurIPS 2023 • Jaskirat Singh, Liang Zheng
Furthermore, we find that the assertion-level alignment scores provide useful feedback, which can then be used in a simple iterative procedure to gradually increase the expression of different assertions in the final image outputs.
no code implementations • 6 Jul 2023 • Peter Tu, Zhaoyuan Yang, Richard Hartley, Zhiwei Xu, Jing Zhang, Yiwei Fu, Dylan Campbell, Jaskirat Singh, Tianyu Wang
This paper begins with a description of methods for estimating image probability density functions that reflect the observation that such data is usually constrained to lie in restricted regions of the high-dimensional image space: not every pattern of pixels is an image.
no code implementations • CVPR 2023 • Jaskirat Singh, Stephen Gould, Liang Zheng
The user scribbles control the color composition while the text prompt provides control over the overall image semantics.
no code implementations • 27 Sep 2022 • Kushagra Srivastava, Dhruv Patel, Aditya Kumar Jha, Mohhit Kumar Jha, Jaskirat Singh, Ravi Kiran Sarvadevabhatla, Pradeep Kumar Ramancharla, Harikumar Kandath, K. Madhava Krishna
Unmanned Aerial Vehicle (UAV) based remote sensing systems that incorporate computer vision have demonstrated potential for assisting building construction and disaster management, for example damage assessment during earthquakes.
1 code implementation • 17 Aug 2022 • Jaskirat Singh, Liang Zheng, Cameron Smith, Jose Echevarria
In particular, we propose a novel approach paint2pix, which learns to predict (and adapt) "what a user wants to draw" from rudimentary brushstroke inputs, by learning a mapping from the manifold of incomplete human paintings to their realistic renderings.
no code implementations • 16 Dec 2021 • Jaskirat Singh, Cameron Smith, Jose Echevarria, Liang Zheng
However, current research in this direction is often reliant on a progressive grid-based division strategy wherein the agent divides the overall image into successively finer grids, and then proceeds to paint each of them in parallel.
no code implementations • 14 Feb 2021 • Jaskirat Singh, Liang Zheng
However, we argue that the sample variance for a multi-scene environment is best minimized by treating each scene as a distinct MDP, and then learning a joint value function V(s, M) dependent on both state s and MDP M. We further demonstrate that the true joint value function for a multi-scene environment follows a multi-modal distribution which is not captured by traditional CNN/LSTM-based critic networks.
no code implementations • 25 Nov 2020 • Jaskirat Singh, Liang Zheng
Recently, Singh et al. [1] tried to address this by proposing a dynamic value estimation approach that models the true joint value function distribution as a Gaussian mixture model (GMM).
1 code implementation • CVPR 2021 • Jaskirat Singh, Liang Zheng
2) We also introduce invariance to the position and scale of the foreground object through a neural alignment model, which combines object localization and spatial transformer networks in an end-to-end manner, to zoom into a particular semantic instance.
Deep Reinforcement Learning
Model-based Reinforcement Learning
no code implementations • 25 May 2020 • Jaskirat Singh, Liang Zheng
Training deep reinforcement learning agents on environments with multiple levels / scenes / conditions from the same task has become essential for many applications aiming to achieve generalization and domain transfer from simulation to the real world.