Multi-Head RAG: Solving Multi-Aspect Problems with LLMs

spcl/mrag 7 Jun 2024

Retrieval Augmented Generation (RAG) enhances the abilities of Large Language Models (LLMs) by enabling the retrieval of documents into the LLM context to provide more accurate and relevant responses.

Benchmarking, Decoder

Mathematical Supplement for the $\texttt{gsplat}$ Library

nerfstudio-project/gsplat 4 Dec 2023

This report provides the mathematical details of the gsplat library, a modular toolbox for efficient differentiable Gaussian splatting, as proposed by Kerbl et al.

Fast Timing-Conditioned Latent Audio Diffusion

stability-ai/stable-audio-tools 7 Feb 2024

Generating long-form 44.1kHz stereo audio from text prompts can be computationally demanding.

Audio Generation

Less is More: Removing Text-regions Improves CLIP Training Efficiency and Robustness

apple/axlearn 8 May 2023

In this paper, we discuss two effective approaches to improve the efficiency and robustness of CLIP training: (1) augmenting the training dataset while maintaining the same number of optimization steps, and (2) filtering out samples that contain text regions in the image.

Adversarial Text Retrieval

Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training

apple/axlearn 23 May 2024

In this work, we revisit the settings by adopting step time as a more accurate measure of model complexity, and by determining the total compute budget under the Chinchilla compute-optimal settings.


Scaling and evaluating sparse autoencoders

openai/sparse_autoencoder 6 Jun 2024

Using these techniques, we find clean scaling laws with respect to autoencoder size and sparsity.

Language Modelling

Seed-TTS: A Family of High-Quality Versatile Speech Generation Models

BytedanceSpeech/seed-tts-eval 4 Jun 2024

Seed-TTS offers superior controllability over various speech attributes such as emotion and is capable of generating highly expressive and diverse speech for speakers in the wild.

In-Context Learning, Language Modelling

AgentGym: Evolving Large Language Model-based Agents across Diverse Environments

woooodyy/agentgym 6 Jun 2024

Building generalist agents that can handle diverse tasks and evolve themselves across different environments is a long-term goal in the AI community.

Language Modelling, Large Language Model

Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning

agent-husky/husky-v1 10 Jun 2024

Despite using 7B models, Husky matches or even exceeds frontier LMs such as GPT-4 on these tasks, showcasing the efficacy of our holistic approach in addressing complex reasoning problems.

Multi-hop Question Answering, Question Answering

Recurrent Context Compression: Efficiently Expanding the Context Window of LLM

WUHU-G/RCC_Transformer 10 Jun 2024

To extend the context length of Transformer-based large language models (LLMs) and improve comprehension capabilities, we often face limitations due to computational resources and bounded memory storage capacity.

Long-Context Understanding, Question Answering

