Retrieval-Augmented Generation for AI-Generated Content: A Survey

hymie122/rag-survey 29 Feb 2024

The development of Artificial Intelligence Generated Content (AIGC) has been facilitated by advancements in model algorithms, the increasing scale of foundation models, and the availability of ample high-quality datasets.

Information Retrieval Large Language Model

PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models

graphpku/pissa 3 Apr 2024

LoRA approximates the weight update ΔW through the product of two matrices: A, initialized with Gaussian noise, and B, initialized with zeros. PiSSA instead initializes A and B with the principal singular values and vectors of the original matrix W. As a result, PiSSA can better approximate the outcomes of full-parameter fine-tuning from the start, because it updates the essential parts of W while freezing the "noisy" parts.
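The contrast between the two initializations can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. It assumes the convention ΔW = AB with A of shape (m, r) and B of shape (r, n), and it assumes the singular values are split evenly between A and B via a square root. The function names `lora_init` and `pissa_init` and the toy matrix sizes are hypothetical.

```python
import numpy as np


def lora_init(W, r, rng):
    """LoRA-style start: A is Gaussian noise, B is zeros, so A @ B == 0."""
    m, n = W.shape
    A = rng.normal(size=(m, r))
    B = np.zeros((r, n))
    return A, B


def pissa_init(W, r):
    """PiSSA-style start (sketch): A and B hold the top-r singular triplets of W.

    Returns the trainable factors A, B and the residual W_res that would be frozen.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    sqrt_S = np.sqrt(S[:r])            # assumed even split of singular values
    A = U[:, :r] * sqrt_S              # shape (m, r): scale each column
    B = sqrt_S[:, None] * Vt[:r]       # shape (r, n): scale each row
    W_res = W - A @ B                  # remaining, lower-energy part of W
    return A, B, W_res


# Toy demonstration on a random matrix (sizes are arbitrary).
rng = np.random.default_rng(0)
W = rng.normal(size=(6, 4))
r = 2

A0, B0 = lora_init(W, r, rng)
A, B, W_res = pissa_init(W, r)

# Both schemes leave the effective weight unchanged at step zero:
# LoRA uses W + A0 @ B0 with A0 @ B0 == 0, PiSSA uses W_res + A @ B == W.
print(np.allclose(W + A0 @ B0, W), np.allclose(W_res + A @ B, W))
```

In both cases the model output is unchanged before training begins. The difference is which directions the trainable factors cover: random ones for the LoRA-style start, and the principal singular directions of W for the PiSSA-style start.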


High-Fidelity Audio Compression with Improved RVQGAN

descriptinc/descript-audio-codec NeurIPS 2023

Language models have been successfully used to model natural signals, such as images, speech, and music.

Audio Compression Audio Generation

A Light CNN for Deep Face Representation with Noisy Labels

AlfredXiangWu/LightCNN 9 Nov 2015

This paper presents a Light CNN framework to learn a compact embedding on the large-scale face data with massive noisy labels.

Face Identification Face Recognition

SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions

IDKiro/sdxs 25 Mar 2024

Recent advancements in diffusion models have positioned them at the forefront of image generation.

Image-to-Image Translation Text-to-Image Generation

PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models

3DAgentWorld/Toolkit-for-Prompt-Compression 26 Mar 2024

Prompt compression is an innovative method for efficiently condensing input prompts while preserving essential information.

Code Completion Few-Shot Learning

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

google-deepmind/recurrentgemma 29 Feb 2024

Recurrent neural networks (RNNs) have fast inference and scale efficiently on long sequences, but they are difficult to train and hard to scale.

Language Modelling

DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models

InternLM/InternLM-XComposer 22 Feb 2024

We present DualFocus, a novel framework for integrating macro and micro perspectives within multi-modal large language models (MLLMs) to enhance vision-language task performance.


ShareGPT4V: Improving Large Multi-Modal Models with Better Captions

InternLM/InternLM-XComposer 21 Nov 2023

In the realm of large multi-modal models (LMMs), efficient modality alignment is crucial yet often constrained by the scarcity of high-quality image-text data.

Descriptive visual instruction following

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

internlm/internlm-xcomposer 9 Apr 2024

The Large Vision-Language Model (LVLM) field has seen significant advancements, yet its progression has been hindered by challenges in comprehending fine-grained visual content due to limited resolution.

4k Language Modelling

