Transfer learning of large-scale Text-to-Image (T2I) models has recently shown impressive potential for Novel View Synthesis (NVS) of diverse objects from a single image.
However, the transformers are prohibited from directly learning the long-term dependency in videos due to the quadratic complexity of self-attention, and inherently suffering from slow inference time and error propagation due to the autoregressive process.
Ranked #24 on Video Generation on UCF-101
Despite recent advances in implicit neural representations (INRs), it remains challenging for a coordinate-based multi-layer perceptron (MLP) of INRs to learn a common representation across data instances and generalize it for unseen instances.
After code stacks in the sequence are randomly masked, Contextual RQ-Transformer is trained to infill the masked code stacks based on the unmasked contexts of the image.
Ranked #1 on Text-to-Image Generation on Conceptual Captions
However, we postulate that previous VQ cannot shorten the code sequence and generate high-fidelity images together in terms of the rate-distortion trade-off.
Ranked #2 on Text-to-Image Generation on Conceptual Captions
Large-batch training has been essential in leveraging large-scale datasets and models in deep learning.
We design and implement a ready-to-use library in PyTorch for performing micro-batch pipeline parallelism with checkpointing proposed by GPipe (Huang et al., 2019).
Conditional generative adversarial networks (cGANs) have gained a considerable attention in recent years due to its class-wise controllability and superior quality for complex generation tasks.
In this paper, a neural architecture search (NAS) framework is proposed for 3D medical image segmentation, to automatically optimize a neural architecture from a large design space.
Data augmentation is an essential technique for improving generalization ability of deep learning models.
Ranked #3 on Image Classification on SVHN
We study the problem of community detection in a random hypergraph model which we call the stochastic block model for $k$-uniform hypergraphs ($k$-SBM).