WavCraft: Audio Editing and Generation with Natural Language Prompts

jinhualiang/wavcraft 14 Mar 2024

We introduce WavCraft, a collective system that leverages large language models (LLMs) to connect diverse task-specific models for audio content creation and editing.

In-Context Learning

98
0.38 stars / hour

FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design

usyd-fsalab/fp6_llm 25 Jan 2024

However, existing systems do not provide Tensor Core support for FP6 quantization and struggle to achieve practical performance improvements during LLM inference.

Llama, Quantization

72
0.38 stars / hour

RBF-PINN: Non-Fourier Positional Embedding in Physics-Informed Neural Networks

SimonZeng7108/RBF-PINN 13 Feb 2024

While many recent Physics-Informed Neural Networks (PINNs) variants have had considerable success in solving Partial Differential Equations, the empirical benefits of feature mapping drawn from the broader Neural Representations research have been largely overlooked.

37
0.38 stars / hour

SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing

modelscope/swift 18 Dec 2023

Image diffusion models have been utilized in various tasks, such as text-to-image generation and controllable image synthesis.

Text-to-Image Generation

1,271
0.37 stars / hour

ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning

unimodal4reasoning/chartvlm 19 Feb 2024

Many versatile Multi-modal Large Language Models (MLLMs) have emerged recently.

162
0.37 stars / hour

Painterly Image Harmonization by Learning from Painterly Objects

bcmi/ArtoPIH-Painterly-Image-Harmonization 15 Dec 2023

In particular, we learn a mapping from background style and object information to object style based on painterly objects in artistic paintings.

Image Harmonization, Object

21
0.36 stars / hour

ID-Animator: Zero-Shot Identity-Preserving Human Video Generation

id-animator/id-animator 23 Apr 2024

Based on this pipeline, a random face reference training method is further devised to precisely capture the ID-relevant embeddings from reference images, thus improving the fidelity and generalization capacity of our model for ID-specific video generation.

Attribute, Video Generation

107
0.36 stars / hour

Language Model Crossover: Variation through Few-Shot Prompting

carperai/openelm 23 Feb 2023

The promise of such language model crossover (which is simple to implement and can leverage many different open-source language models) is that it enables a simple mechanism to evolve semantically-rich text representations (with few domain-specific tweaks), and naturally benefits from current progress in language models.

In-Context Learning, Language Modelling

577
0.35 stars / hour

Efficient Multimodal Learning from Data-centric Perspective

baai-dcai/bunny 18 Feb 2024

Multimodal Large Language Models (MLLMs) have demonstrated notable capabilities in general visual understanding and reasoning tasks.

605
0.35 stars / hour

CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models

facebookresearch/purplellama 19 Apr 2024

We present CyberSecEval 2, a novel benchmark to quantify LLM security risks and capabilities.

GPT-4, Llama

1,892
0.34 stars / hour