Key to Fantasia3D is the disentangled modeling and learning of geometry and appearance.
Unlike the existing image-text similarity objective, which only categorizes matched pairs as similar and unmatched pairs as dissimilar, equivariance additionally requires the similarity to vary faithfully with semantic changes.
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters.
In such attacks, an adversary can prompt the LLM to produce malicious content or override the original instructions and the employed filtering schemes.
In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback.
We present SadTalker, which generates 3D motion coefficients (head pose, expression) of the 3DMM from audio and implicitly modulates a novel 3D-aware face renderer for talking head generation.
Finally, we use Diffusion Classifier to extract standard classifiers from class-conditional diffusion models trained on ImageNet.
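The core idea behind using a class-conditional diffusion model as a classifier is that the class whose conditioning best helps the model denoise an input is the most likely label. Below is a minimal numpy sketch of that scoring loop; the `denoisers` mapping, the noise schedule, and all names are illustrative assumptions, not the paper's actual code or API.

```python
import numpy as np

def diffusion_classify(x, denoisers, n_trials=32, rng=None):
    """Toy sketch of classification with a class-conditional diffusion model.

    For each candidate class, repeatedly noise the input, ask that class's
    conditional denoiser to predict the injected noise, and score the class
    by its mean squared prediction error; the lowest-error class wins.
    `denoisers` maps class label -> callable(x_t, t) -> predicted noise
    (a stand-in for one conditional model queried with different labels).
    """
    rng = rng or np.random.default_rng(0)
    errors = {}
    for label, eps_hat in denoisers.items():
        err = 0.0
        for _ in range(n_trials):
            t = rng.uniform(0.1, 0.9)                    # toy noise level
            eps = rng.standard_normal(x.shape)           # injected noise
            x_t = np.sqrt(1.0 - t) * x + np.sqrt(t) * eps
            err += np.mean((eps_hat(x_t, t) - eps) ** 2)
        errors[label] = err / n_trials
    return min(errors, key=errors.get), errors
```

In this toy form, a denoiser that can perfectly subtract its class prototype from the noised input recovers the injected noise exactly and gets error zero, so its class is selected.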
Text-to-3D generation has shown rapid progress recently with the advent of score distillation, a methodology that uses pretrained text-to-2D diffusion models to optimize a neural radiance field (NeRF) in the zero-shot setting.
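Score distillation works by rendering the 3D representation, noising the render, and using the pretrained 2D diffusion model's noise-prediction error as a gradient signal. The sketch below shows one such step in numpy; `eps_pred`, the weighting `w(t)`, and the single-sample schedule are simplifying assumptions standing in for a real text-conditioned diffusion model.

```python
import numpy as np

def sds_grad(x, eps_pred, prompt, rng=None):
    """One score-distillation (SDS) gradient step, as a minimal sketch.

    x        : current rendered image (here just an array, standing in
               for a differentiable render of the NeRF)
    eps_pred : callable(x_t, t, prompt) -> predicted noise, a stand-in
               for a pretrained text-to-2D diffusion model
    Returns  : w(t) * (eps_pred(x_t, t, prompt) - eps), the SDS gradient
               that is backpropagated into the 3D parameters.
    """
    rng = rng or np.random.default_rng(0)
    t = rng.uniform(0.02, 0.98)                  # random noise level
    alpha = 1.0 - t
    eps = rng.standard_normal(x.shape)           # injected noise
    x_t = np.sqrt(alpha) * x + np.sqrt(1.0 - alpha) * eps
    w = 1.0 - alpha                              # a common weighting choice
    return w * (eps_pred(x_t, t, prompt) - eps)
```

Descending along this gradient pulls the render toward images the diffusion prior considers likely under the prompt, without ever backpropagating through the diffusion model's sampling chain.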
We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model with 130 billion parameters.
To achieve full automation, we introduce a straightforward yet effective heuristic that enables the agent to pinpoint hallucination instances, avoid repetition in action sequences, and, in some environments, construct an internal memory map of the given environment.
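A heuristic of this kind can be as simple as checking each proposed action against what the agent has actually observed and against its own action history. The sketch below is an illustrative assumption of how such a check might look; the field names, return labels, and the `max_repeats` threshold are all hypothetical, not the paper's implementation.

```python
from collections import Counter

def check_action(action, observed_objects, history, max_repeats=3):
    """Toy self-check for an LLM agent's proposed action.

    action           : dict with a "verb" and an optional "target"
    observed_objects : set of object names the agent has actually seen
    history          : list of (verb, target) tuples already executed

    Flags a hallucination when the action references an unseen object,
    and a repetition when the same action was already issued
    `max_repeats` times; otherwise the action passes.
    """
    target = action.get("target")
    if target is not None and target not in observed_objects:
        return "hallucination"
    if Counter(history)[(action["verb"], target)] >= max_repeats:
        return "repetition"
    return "ok"
```

On a flagged action, the agent would re-prompt the model or fall back to exploration instead of executing the action, which is how such checks keep an automated loop from stalling.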