Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians

yuelangx/gaussian-head-avatar 5 Dec 2023

Creating high-fidelity 3D head avatars has long been a research hotspot, but it remains very challenging under lightweight sparse-view setups.

219
1.99 stars / hour

Efficient Large Language Models: A Survey

aiot-mlsys-lab/efficientllms 6 Dec 2023

Large Language Models (LLMs) have demonstrated remarkable capabilities in important tasks such as natural language understanding, language generation, and complex reasoning, and have the potential to make a substantial impact on our society.

Natural Language Understanding Text Generation

149
1.73 stars / hour

Aligning and Prompting Everything All at Once for Universal Visual Perception

shenyunhang/ape 4 Dec 2023

However, predominant paradigms, driven by casting instance-level tasks as an object-word alignment, bring heavy cross-modality interaction, which is not effective in prompting object detection and visual grounding.

Object Detection

213
1.69 stars / hour

PatchFusion: An End-to-End Tile-Based Framework for High-Resolution Monocular Metric Depth Estimation

zhyever/PatchFusion 4 Dec 2023

Single image depth estimation is a foundational task in computer vision and generative modeling.

Depth Estimation

279
1.46 stars / hour

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

sunzey/alphaclip 6 Dec 2023

Alpha-CLIP not only preserves the visual recognition ability of CLIP but also enables precise control over the emphasis of image contents.

98
1.34 stars / hour

OneLLM: One Framework to Align All Modalities with Language

csuhan/onellm 6 Dec 2023

In detail, we first train an image projection module to connect a vision encoder with an LLM.

Question Answering

153
1.34 stars / hour

AnimateZero: Video Diffusion Models are Zero-Shot Image Animators

vvictoryuki/animatezero 6 Dec 2023

For appearance control, we borrow intermediate latents and their features from the text-to-image (T2I) generation to ensure that the generated first frame matches the given generated image.

Image Animation Video Generation

96
1.16 stars / hour

Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models

shi-labs/smooth-diffusion 7 Dec 2023

Specifically, we introduce Step-wise Variation Regularization to enforce that the ratio between the variation of an arbitrary input latent and that of the output image is constant at any diffusion training step.

66
1.10 stars / hour

Can LLMs Follow Simple Rules?

normster/llm_rules 6 Nov 2023

As Large Language Models (LLMs) are deployed with increasing real-world responsibilities, it is important to be able to specify and constrain the behavior of these systems in a reliable manner.

139
1.00 stars / hour

Generative agent-based modeling with actions grounded in physical, social, or digital space using Concordia

google-deepmind/concordia 6 Dec 2023

Agent-based modeling has been around for decades and has been applied widely across the social and natural sciences.

Common Sense Reasoning

62
0.92 stars / hour