Search Results for author: Anil Kag

Found 23 papers, 12 papers with code

H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models

no code implementations14 Apr 2025 Yushu Wu, Yanyu Li, Ivan Skorokhodov, Anil Kag, Willi Menapace, Sharath Girish, Aliaksandr Siarohin, Yanzhi Wang, Sergey Tulyakov

Our AE achieves an ultra-high compression ratio and real-time decoding speed on mobile while outperforming prior art in terms of reconstruction metrics by a large margin.

Denoising · Text-to-Video Generation +1

Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach

no code implementations5 Feb 2025 Yunuo Chen, Junli Cao, Anil Kag, Vidit Goel, Sergei Korolev, Chenfanfu Jiang, Sergey Tulyakov, Jian Ren

Furthermore, our model improves the overall quality of video generation by promoting the 3D consistency of moving objects and reducing abrupt changes in shape and motion.

Video Generation

Scalable Ranked Preference Optimization for Text-to-Image Generation

no code implementations23 Oct 2024 Shyamgopal Karthik, Huseyin Coskun, Zeynep Akata, Sergey Tulyakov, Jian Ren, Anil Kag

In this work, we investigate a scalable approach for collecting large-scale and fully synthetic datasets for DPO training.

Text-to-Image Generation

Lightweight Predictive 3D Gaussian Splats

1 code implementation27 Jun 2024 Junli Cao, Vidit Goel, Chaoyang Wang, Anil Kag, Ju Hu, Sergei Korolev, Chenfanfu Jiang, Sergey Tulyakov, Jian Ren

Our key observation is that nearby points in the scene can share similar representations.
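The observation that nearby points can share similar representations suggests storing a small codebook of shared attributes plus a per-point index rather than full per-point features. The sketch below illustrates this idea with a plain k-means codebook over synthetic point features; it is a hedged illustration of the general compression principle, not the paper's actual predictive method, and all names and sizes here are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy point cloud: 1000 points, each with an 8-dim appearance feature.
# Points in the same region get similar features (3 latent clusters).
centers = rng.normal(size=(3, 8))
features = np.repeat(centers, 334, axis=0)[:1000] + 0.05 * rng.normal(size=(1000, 8))

def kmeans(x, k, iters=20, seed=0):
    """Minimal Lloyd's k-means: returns a codebook and per-point code indices."""
    r = np.random.default_rng(seed)
    codebook = x[r.choice(len(x), k, replace=False)]
    for _ in range(iters):
        d = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        codes = d.argmin(1)
        for j in range(k):
            if (codes == j).any():
                codebook[j] = x[codes == j].mean(0)
    return codebook, codes

codebook, codes = kmeans(features, k=3)

# Storage drops from 1000*8 floats to a 3*8 codebook plus one index per point.
full_floats = features.size
compact_floats = codebook.size + len(codes)  # counting an index as one slot
```

Points mapped to the same code share one codebook row, so storage scales with the number of distinct representations rather than the number of points.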

BitsFusion: 1.99 bits Weight Quantization of Diffusion Model

1 code implementation6 Jun 2024 Yang Sui, Yanyu Li, Anil Kag, Yerlan Idelbayev, Junli Cao, Ju Hu, Dhritiman Sagar, Bo Yuan, Sergey Tulyakov, Jian Ren

Diffusion-based image generation models have achieved great success in recent years by showing the capability of synthesizing high-quality content.

Image Generation model +1

SF-V: Single Forward Video Generation Model

1 code implementation6 Jun 2024 Zhixing Zhang, Yanyu Li, Yushu Wu, Yanwu Xu, Anil Kag, Ivan Skorokhodov, Willi Menapace, Aliaksandr Siarohin, Junli Cao, Dimitris Metaxas, Sergey Tulyakov, Jian Ren

Diffusion-based video generation models have demonstrated remarkable success in obtaining high-fidelity videos through the iterative denoising process.

Denoising model +1

TextCraftor: Your Text Encoder Can be Image Quality Controller

no code implementations CVPR 2024 Yanyu Li, Xian Liu, Anil Kag, Ju Hu, Yerlan Idelbayev, Dhritiman Sagar, Yanzhi Wang, Sergey Tulyakov, Jian Ren

Our findings reveal that, instead of replacing the CLIP text encoder used in Stable Diffusion with other large language models, we can enhance it through our proposed fine-tuning approach, TextCraftor, leading to substantial improvements in quantitative benchmarks and human assessments.

Image Generation

Scaffolding a Student to Instill Knowledge

1 code implementation International Conference on Learning Representations 2023 Anil Kag, Durmus Alp Emre Acar, Aditya Gangrade, Venkatesh Saligrama

We propose a novel knowledge distillation (KD) method that selectively instills teacher knowledge into a student model, motivated by situations where the student's capacity is significantly smaller than the teacher's.

Knowledge Distillation
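For context, the standard KD objective that such methods build on combines a temperature-softened KL term against the teacher with an ordinary cross-entropy term on the labels. The sketch below is that classic Hinton-style baseline, not the paper's selective scaffolding variant; the function names and default hyperparameters are illustrative choices.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy."""
    # Soft term: match the teacher's temperature-softened distribution.
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = (p_t * (np.log(p_t) - np.log(p_s))).sum(-1).mean()
    # Hard term: ordinary cross-entropy on the ground-truth labels.
    ce = -np.log(softmax(student_logits)[np.arange(len(labels)), labels]).mean()
    return alpha * T**2 * kl + (1 - alpha) * ce
```

The T^2 factor keeps the soft-term gradient magnitude comparable across temperatures; a selective method like the paper's would additionally gate which examples receive the teacher signal.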

Condensing CNNs With Partial Differential Equations

1 code implementation CVPR 2022 Anil Kag, Venkatesh Saligrama

Convolutional neural networks (CNNs) rely on the depth of the architecture to obtain complex features.

Hybrid Cloud-Edge Networks for Efficient Inference

1 code implementation29 Sep 2021 Anil Kag, Igor Fedorov, Aditya Gangrade, Paul Whatmough, Venkatesh Saligrama

The first network is a low-capacity network that can be deployed on an edge device, whereas the second is a high-capacity network deployed in the cloud.
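The split described above implies a routing decision: run the cheap edge network everywhere and escalate only the inputs it is unsure about. The paper trains this arrangement; the sketch below substitutes a simple confidence threshold as the router, with made-up probabilities, purely to show the mechanism.

```python
import numpy as np

def route(edge_probs, threshold=0.8):
    """Send an input to the cloud model when the edge model's confidence
    (max softmax probability) falls below the threshold."""
    conf = edge_probs.max(axis=-1)
    return conf < threshold  # True = escalate to the cloud network

# Hypothetical edge-model class probabilities for 5 inputs.
edge_probs = np.array([
    [0.95, 0.05],   # confident -> stay on edge
    [0.55, 0.45],   # uncertain -> cloud
    [0.70, 0.30],   # uncertain -> cloud
    [0.90, 0.10],   # confident -> edge
    [0.50, 0.50],   # uncertain -> cloud
])
to_cloud = route(edge_probs)
print(to_cloud.tolist())  # [False, True, True, False, True]
```

Raising the threshold trades edge compute savings for cloud traffic; a learned router can make this trade-off per-input rather than with one global cutoff.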

Training Recurrent Neural Networks via Forward Propagation Through Time

1 code implementation International Conference on Machine Learning 2021 Anil Kag, Venkatesh Saligrama

BPTT updates RNN parameters for each instance by back-propagating the error through time over the entire sequence, which leads to poor trainability due to the well-known gradient explosion/decay phenomena.
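The explosion/decay phenomenon is easy to see in the simplest possible case: for a scalar linear RNN, the BPTT gradient through the hidden state is a power of the recurrent weight. This toy calculation (not from the paper) makes the pathology concrete.

```python
def hidden_grad_norm(w, T):
    """For a scalar linear RNN h_t = w * h_{t-1} + x_t, the BPTT gradient of
    h_T with respect to h_0 is w**T: it vanishes for |w| < 1 and explodes
    for |w| > 1 as the sequence length T grows."""
    return abs(w) ** T

print(hidden_grad_norm(0.9, 100))   # ~2.7e-5  (vanishing)
print(hidden_grad_norm(1.1, 100))   # ~1.4e+4  (exploding)
```

Forward-propagation-style updates sidestep this by not multiplying T Jacobians together in the first place.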

Time Adaptive Recurrent Neural Network

1 code implementation CVPR 2021 Anil Kag, Venkatesh Saligrama

We propose a learning method that dynamically modifies the time constants of the continuous-time counterpart of a vanilla RNN.
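The continuous-time view of a vanilla RNN evolves the hidden state toward a nonlinear target at a rate set by per-unit time constants. The sketch below is one Euler step of that ODE with fixed constants, purely to show where the time constants enter; the paper's contribution is adapting them dynamically, which this sketch does not do.

```python
import numpy as np

def ct_rnn_step(h, x, W, U, b, tau, dt=1.0):
    """One Euler step of a continuous-time vanilla RNN,
        dh/dt = (-h + tanh(W x + U h + b)) / tau,
    where tau holds per-unit time constants: large tau -> slow, smooth
    dynamics; tau == dt recovers a discrete vanilla RNN update."""
    target = np.tanh(W @ x + U @ h + b)
    return h + (dt / tau) * (target - h)
```

A unit with a large time constant integrates information over long horizons, while a small one reacts quickly; learning tau per unit and per step lets the network match the timescales in the data.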

Selective Classification via One-Sided Prediction

1 code implementation15 Oct 2020 Aditya Gangrade, Anil Kag, Venkatesh Saligrama

We propose a novel method for selective classification (SC), a problem which allows a classifier to abstain from predicting some instances, thus trading off accuracy against coverage (the fraction of instances predicted).

Classification · General Classification +2
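The coverage/accuracy trade-off described above can be demonstrated with the generic confidence-threshold baseline for selective classification (the paper's one-sided prediction method is a different, learned approach; this sketch with made-up probabilities only illustrates the metric).

```python
import numpy as np

def selective_predict(probs, labels, threshold):
    """Abstain when max probability < threshold; report coverage (fraction
    of instances predicted) and accuracy on the covered subset."""
    conf = probs.max(axis=1)
    pred = probs.argmax(axis=1)
    covered = conf >= threshold
    coverage = covered.mean()
    acc = (pred[covered] == labels[covered]).mean() if covered.any() else float("nan")
    return coverage, acc

# Hypothetical classifier outputs for 4 instances.
probs = np.array([
    [0.99, 0.01],  # confident, correct
    [0.95, 0.05],  # confident, correct
    [0.55, 0.45],  # uncertain, wrong
    [0.60, 0.40],  # uncertain, correct
])
labels = np.array([0, 0, 1, 0])
cov_all, acc_all = selective_predict(probs, labels, threshold=0.0)  # 1.0, 0.75
cov_sel, acc_sel = selective_predict(probs, labels, threshold=0.9)  # 0.5, 1.0
```

Abstaining on the two low-confidence instances halves coverage but lifts accuracy on the remainder, which is exactly the trade-off a selective classifier navigates.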

RNNs Incrementally Evolving on an Equilibrium Manifold: A Panacea for Vanishing and Exploding Gradients?

1 code implementation ICLR 2020 Anil Kag, Ziming Zhang, Venkatesh Saligrama

Recurrent neural networks (RNNs) are particularly well-suited for modeling long-term dependencies in sequential data, but are notoriously hard to train because the error backpropagated in time either vanishes or explodes at an exponential rate.

RNNs Evolving on an Equilibrium Manifold: A Panacea for Vanishing and Exploding Gradients?

no code implementations22 Aug 2019 Anil Kag, Ziming Zhang, Venkatesh Saligrama

Recurrent neural networks (RNNs) are particularly well-suited for modeling long-term dependencies in sequential data, but are notoriously hard to train because the error backpropagated in time either vanishes or explodes at an exponential rate.

Equilibrated Recurrent Neural Network: Neuronal Time-Delayed Self-Feedback Improves Accuracy and Stability

no code implementations2 Mar 2019 Ziming Zhang, Anil Kag, Alan Sullivan, Venkatesh Saligrama

We show that such self-feedback helps stabilize the hidden-state transitions, leading to fast convergence during training while efficiently learning discriminative latent features that yield state-of-the-art results on several benchmark datasets at test time.

Learning Compact Networks via Adaptive Network Regularization

no code implementations NIPS Workshop CDNNRIA 2018 Sivaramakrishnan Sankarapandian, Anil Kag, Rachel Manzelli, Brian Kulis

We describe a training strategy that grows the number of units during training, and show on several benchmark datasets that our model yields architectures that are smaller than those obtained when tuning the number of hidden units on a standard fixed architecture.
