Trained on massive amounts of publicly available data, large language models (LLMs) have demonstrated tremendous success across various fields.
Global medium-range weather forecasting is critical to decision-making across many social and economic domains.
Advancing the frontier of subquadratic architectures for Language Models (LMs) is crucial in the rapidly evolving field of natural language processing.
We present Scalable Interpolant Transformers (SiT), a family of generative models built on the backbone of Diffusion Transformers (DiT).
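As a rough sketch of the interpolant framework behind such models (the linear schedule below is one illustrative choice, not necessarily the paper's exact parameterization), a data sample x_* and Gaussian noise are connected by a time-dependent interpolant, and the network is trained to regress its velocity:

```latex
% Illustrative stochastic-interpolant parameterization (assumed, not verbatim from SiT):
% x_t moves from data x_* at t = 0 to pure noise at t = 1.
x_t = \alpha_t x_* + \sigma_t \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I),
\qquad \alpha_0 = \sigma_1 = 1, \quad \alpha_1 = \sigma_0 = 0
% e.g. the linear choice \alpha_t = 1 - t, \sigma_t = t; the model
% v_\theta(x_t, t) then regresses \dot{\alpha}_t x_* + \dot{\sigma}_t \epsilon.
```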
TorchCP is a Python toolbox for conformal prediction research on deep learning models.
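To make the underlying idea concrete, here is a minimal split conformal prediction sketch in plain PyTorch. It deliberately does not use TorchCP's own predictor classes, only the technique the toolbox implements; `model`, `x_cal`, and the other names in the usage comment are hypothetical.

```python
import torch

def calibrate(logits_cal: torch.Tensor, labels_cal: torch.Tensor,
              alpha: float = 0.1) -> float:
    """Split conformal calibration: return the score threshold q_hat.

    Nonconformity score: 1 - softmax probability of the true class.
    """
    probs = torch.softmax(logits_cal, dim=1)
    scores = 1.0 - probs[torch.arange(len(labels_cal)), labels_cal]
    n = len(scores)
    # Finite-sample-corrected (1 - alpha) quantile of calibration scores.
    level = min(1.0, (n + 1) * (1 - alpha) / n)
    return torch.quantile(scores, level, interpolation="higher").item()

def predict_set(logits_test: torch.Tensor, q_hat: float) -> list:
    """For each test point, return the set of labels whose score <= q_hat."""
    probs = torch.softmax(logits_test, dim=1)
    return [(1.0 - p <= q_hat).nonzero(as_tuple=True)[0].tolist() for p in probs]

# Usage with a trained classifier `model` (hypothetical):
# q_hat = calibrate(model(x_cal), y_cal, alpha=0.1)
# sets = predict_set(model(x_test), q_hat)
```

Under exchangeability of the calibration and test data, the returned sets contain the true label with probability at least 1 - alpha; this marginal coverage guarantee is what conformal prediction provides regardless of the underlying model.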
We demonstrate that continual pretraining of the full model on 1B-5B tokens of such data is an effective and affordable strategy for scaling the context length of language models to 128K.
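A minimal sketch of such a continual-pretraining loop in PyTorch, purely illustrative: `load_pretrained_lm` and `long_context_loader` are hypothetical stand-ins, and the key points are simply that every parameter stays trainable and that training stops after the stated token budget.

```python
import torch

SEQ_LEN = 131_072              # target context length (128K tokens)
TOKEN_BUDGET = 5_000_000_000   # upper end of the 1B-5B token range

model = load_pretrained_lm()   # hypothetical: any pretrained causal LM checkpoint
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # full model, nothing frozen

tokens_seen = 0
for batch in long_context_loader(seq_len=SEQ_LEN):  # hypothetical long-sequence loader
    # Standard next-token prediction loss over the full long-context window.
    loss = model(batch, labels=batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    tokens_seen += batch.numel()
    if tokens_seen >= TOKEN_BUDGET:
        break
```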
In this work, we propose MagicPose, a diffusion-based model for 2D human pose and facial expression retargeting.
The results showcase the potential of generative models that exploit temporal relations in video data.
In the field of text-to-image (T2I) generation, subject-driven content generation has made great progress by making the subject identity (ID) in generated images controllable.