SnapKV: LLM Knows What You are Looking for Before Generation

fasterdecoding/snapkv 22 Apr 2024

Specifically, SnapKV achieves a consistent decoding speed with a 3.6x increase in generation speed and an 8.2x enhancement in memory efficiency compared to baseline when processing inputs of 16K tokens.

16k

101
0.44 stars / hour

PromptBench: A Unified Library for Evaluation of Large Language Models

microsoft/promptbench 13 Dec 2023

The evaluation of large language models (LLMs) is crucial to assess their performance and mitigate potential security risks.

Prompt Engineering

2,064
0.44 stars / hour

TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models

jishengpeng/TextrolSpeech 28 Aug 2023

The dataset comprises 236,220 pairs of style prompt in natural text descriptions with five style factors and corresponding speech samples.

Language Modelling

47
0.43 stars / hour

BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion

tencentarc/brushnet 11 Mar 2024

Image inpainting, the process of restoring corrupted images, has seen significant advancements with the advent of diffusion models (DMs).

Image Inpainting

942
0.43 stars / hour

Generative Agents: Interactive Simulacra of Human Behavior

a16z-infra/ai-town 7 Apr 2023

Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools.

Language Modelling, Large Language Model

6,602
0.42 stars / hour

TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation

saidwivedi/TokenHMR 25 Apr 2024

We address the problem of regressing 3D human pose and shape from a single image, with a focus on 3D accuracy.

Human Mesh Recovery valid

55
0.41 stars / hour

EasySpider: A No-Code Visual System for Crawling the Web

NaiboWang/EasySpider ACM The Web Conference 2023

As such, web-crawling is an essential tool for both computational and non-computational scientists to conduct research.

Data Integration, Marketing

22,189
0.41 stars / hour

UFO: A UI-Focused Agent for Windows OS Interaction

microsoft/UFO 8 Feb 2024

We introduce UFO, an innovative UI-Focused agent to fulfill user requests tailored to applications on Windows OS, harnessing the capabilities of GPT-Vision.

Navigate

4,260
0.41 stars / hour

Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation

showlab/show-1 27 Sep 2023

In this paper, we are the first to propose a hybrid model, dubbed as Show-1, which marries pixel-based and latent-based VDMs for text-to-video generation.

Text-to-Video Generation Video Alignment +1

1,030
0.40 stars / hour

GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting

ku-cvlab/gaussiantalker 24 Apr 2024

A key insight is to encode the 3D Gaussian attributes into a shared implicit feature representation, where it is merged with audio features to manipulate each Gaussian attribute.

Attribute

65
0.39 stars / hour