We study continued training and supervised fine-tuning (SFT) of a language model (LM) to make effective use of long-context information.
Large language models (LLMs) exhibit impressive capabilities across a wide range of tasks, yet the choice of which model to use often involves a trade-off between performance and cost.
In a convergence of machine learning and biology, we reveal that diffusion models are evolutionary algorithms.
We present PhysGen, a novel image-to-video generation method that converts a single image and an input condition (e.g., a force and torque applied to an object in the image) into a realistic, physically plausible, and temporally consistent video.
Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools.
While large language models have made significant strides in code generation, the pass rate of the generated code is bottlenecked by subtle errors, often requiring human intervention to pass tests, especially for complex problems.
Agent-based modeling has been around for decades and has been applied widely across the social and natural sciences.
Personalized text-to-image generation methods, which generate customized images based on reference images, have garnered wide research interest.
Retrieval-augmented generation (RAG) has emerged as a promising solution for mitigating hallucinations of large language models (LLMs) with retrieved external knowledge.
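The retrieve-then-generate pattern that RAG names can be sketched in a few lines. The corpus, the word-overlap scoring, and the prompt format below are illustrative stand-ins, not the method of any particular paper; a real system would use a learned dense retriever and an actual LLM call.

```python
# Minimal sketch of retrieval-augmented generation (RAG):
# retrieve relevant documents, then prepend them as grounding context
# for the language model's prompt. Toy corpus and retriever for illustration.

CORPUS = [
    "The Eiffel Tower is located in Paris, France.",
    "Python was created by Guido van Rossum.",
    "RAG grounds LLM outputs in retrieved external knowledge.",
]

def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query (toy lexical retriever)."""
    q = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Prepend the retrieved context so the LM can ground its answer in it."""
    context = "\n".join(docs)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = retrieve("Where is the Eiffel Tower?", CORPUS)
prompt = build_prompt("Where is the Eiffel Tower?", docs)
```

The prompt, rather than the model's parametric memory alone, now carries the retrieved evidence, which is what mitigates hallucination in this setup.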
To create rich visualizations, data analysts often need to iterate back and forth between data processing and chart specification to achieve their goals.