LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control

KwaiVGI/LivePortrait 3 Jul 2024

Instead of following mainstream diffusion-based methods, we explore and extend the potential of the implicit-keypoint-based framework, which effectively balances computational efficiency and controllability.

Computational Efficiency, Face Reenactment, +3

GRUtopia: Dream General Robots in a City at Scale

openrobotlab/grutopia 15 Jul 2024

Recent works have been exploring the scaling laws in the field of Embodied AI.

Language Modelling, Large Language Model

VISA: Reasoning Video Object Segmentation via Large Language Models

cilinyan/VISA 16 Jul 2024

In this paper, we introduce a new task, Reasoning Video Object Segmentation (ReasonVOS).

Decoder, Object, +6

Deep-TEMPEST: Using Deep Learning to Eavesdrop on HDMI from its Unintended Electromagnetic Emanations

emidan19/deep-tempest 12 Jul 2024

Eavesdropping systems designed for the analog case obtain unclear and difficult-to-read images when applied to digital video.

E5-V: Universal Embeddings with Multimodal Large Language Models

kongds/e5-v 17 Jul 2024

We propose a single-modality training approach for E5-V, where the model is trained exclusively on text pairs.

Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence

openbmb/ioa 9 Jul 2024

The rapid advancement of large language models (LLMs) has paved the way for the development of highly capable autonomous agents.

RouteLLM: Learning to Route LLMs with Preference Data

lm-sys/routellm 26 Jun 2024

Large language models (LLMs) exhibit impressive capabilities across a wide range of tasks, yet the choice of which model to use often involves a trade-off between performance and cost.

Data Augmentation, Transfer Learning

Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers

goombalab/hydra 13 Jul 2024

We identify a key axis of matrix parameterizations termed sequence alignment, which increases the flexibility and performance of matrix mixers, providing insights into the strong performance of Transformers and recent SSMs such as Mamba.

MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention

microsoft/MInference 2 Jul 2024

With the pattern and sparse indices, we perform efficient sparse attention calculations via our optimized GPU kernels to significantly reduce the latency in the pre-filling stage of long-context LLMs.

Language Modelling, Large Language Model

A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights

wentaol86/awesome-human-body-video-generation 11 Jul 2024

The goal of this survey is to offer the research community a clear and holistic view of the advancements in human video generation, highlighting the milestones achieved and the challenges that lie ahead.

Video Generation

