Layer-Condensed KV Cache for Efficient Inference of Large Language Models

whyNLP/LCKV 17 May 2024

In this paper, we propose a novel method that computes and caches the KVs of only a small number of layers, thus significantly reducing memory consumption and improving inference throughput.

Language Modelling

StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation

hvision-nku/storydiffusion 2 May 2024

This module converts the generated sequence of images into videos with smooth transitions and consistent subjects that are significantly more stable than the modules based on latent spaces only, especially in the context of long video generation.

motion prediction Story Generation

Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching

xingy038/dreamer-xl 18 May 2024

In this work, we propose a novel Trajectory Score Matching (TSM) method that aims to solve the pseudo ground truth inconsistency problem caused by the accumulated error in Interval Score Matching (ISM) when using the Denoising Diffusion Implicit Models (DDIM) inversion process.

3D Generation Denoising

AgentScope: A Flexible yet Robust Multi-Agent Platform

modelscope/agentscope 21 Feb 2024

With the rapid advancement of Large Language Models (LLMs), significant progress has been made in multi-agent applications.

Multi-agent Integration

The Platonic Representation Hypothesis

minyoungg/platonic-rep 13 May 2024

We argue that representations in AI models, particularly deep networks, are converging.

LocoMuJoCo: A Comprehensive Imitation Learning Benchmark for Locomotion

robfiras/loco-mujoco 4 Nov 2023

Imitation Learning (IL) holds great promise for enabling agile locomotion in embodied agents.

Benchmarking Imitation Learning

AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding

x-lance/anitalker 6 May 2024

The paper introduces AniTalker, an innovative framework designed to generate lifelike talking faces from a single portrait.

Metric Learning Self-Supervised Learning

Transcriptomics-guided Slide Representation Learning in Computational Pathology

mahmoodlab/tangle 19 May 2024

Across three independent test datasets consisting of 1,265 breast WSIs, 1,946 lung WSIs, and 4,584 liver WSIs, Tangle shows significantly better few-shot performance compared to supervised and SSL baselines.

Contrastive Learning Representation Learning

ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing

poloclub/clickdiffusion 5 Apr 2024

We demonstrate that by serializing both an image and a multi-modal instruction into a textual representation it is possible to leverage LLMs to perform precise transformations of the layout and appearance of an image.

Image Manipulation

WavCraft: Audio Editing and Generation with Large Language Models

jinhualiang/wavcraft 14 Mar 2024

We introduce WavCraft, a collective system that leverages large language models (LLMs) to connect diverse task-specific models for audio content creation and editing.

In-Context Learning

