Scalable MatMul-free Language Modeling

ridgerchu/matmulfreellm 4 Jun 2024

Our experiments show that our proposed MatMul-free models achieve performance on par with state-of-the-art Transformers that require far more memory during inference, at scales up to at least 2.7B parameters.

Language Modelling

6.16 stars / hour
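The core idea behind a MatMul-free layer is that once weights are constrained to ternary values {-1, 0, +1}, a dense projection needs no multiplications at all: each output is just a sum of selected inputs minus another sum. The sketch below is an illustrative NumPy reconstruction of that principle, not the paper's implementation:

```python
import numpy as np

def ternary_linear(x, w_ternary):
    """Dense layer with ternary weights in {-1, 0, +1}.

    Because every weight is -1, 0, or +1, each output column is
    a sum of input columns minus another sum -- no multiplies.
    """
    out = np.zeros((x.shape[0], w_ternary.shape[1]))
    for j in range(w_ternary.shape[1]):
        plus = w_ternary[:, j] == 1    # inputs added
        minus = w_ternary[:, j] == -1  # inputs subtracted
        out[:, j] = x[:, plus].sum(axis=1) - x[:, minus].sum(axis=1)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8))
w = rng.integers(-1, 2, size=(8, 4))   # random ternary weight matrix
# The add/subtract formulation matches an ordinary matmul exactly.
assert np.allclose(ternary_linear(x, w), x @ w)
```

Real implementations quantize learned full-precision weights to this ternary form during training; the point here is only that inference then needs accumulation, not multiplication.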

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

foundationvision/llamagen 10 Jun 2024

(3) A text-conditional image generation model with 775M parameters, from two-stage training on LAION-COCO and high aesthetics quality images, demonstrating competitive performance of visual quality and text alignment.

Conditional Image Generation

5.17 stars / hour

"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models

verazuo/jailbreak_llms 7 Aug 2023

We hope that our study can help the research community and LLM vendors promote safer and better-regulated LLMs.

Community Detection

4.71 stars / hour

TextGrad: Automatic "Differentiation" via Text

zou-group/textgrad 11 Jun 2024

Without modifying the framework, TextGrad improves the zero-shot accuracy of GPT-4o in Google-Proof Question Answering from $51\%$ to $55\%$, yields $20\%$ relative performance gain in optimizing LeetCode-Hard coding problem solutions, improves prompts for reasoning, designs new druglike small molecules with desirable in silico binding, and designs radiation oncology treatment plans with high specificity.

Ranked #1 on GPQA

Question Answering · Specificity

3.26 stars / hour
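TextGrad's "differentiation" is metaphorical: an LLM critiques a candidate text (the "gradient"), and another LLM call applies that feedback (the "optimizer step"). The loop below is a schematic sketch of that pattern only; `critique` and `revise` are trivial rule-based stand-ins for the LLM calls, and none of this is the textgrad library's actual API:

```python
def critique(solution, objective):
    """Stand-in for an LLM judge that returns textual feedback
    (the "gradient"); here a single hard-coded rule."""
    return "add docstring" if '"""' not in solution else ""

def revise(solution, feedback):
    """Stand-in for an LLM that applies the feedback
    (the "optimizer step")."""
    if feedback == "add docstring":
        return solution.replace("def f():", 'def f():\n    """No-op."""')
    return solution

# Textual "gradient descent": critique, revise, stop when no feedback remains.
solution = "def f():\n    pass"
for _ in range(3):
    feedback = critique(solution, "write well-documented code")
    if not feedback:
        break
    solution = revise(solution, feedback)
```

In the real system, both roles are LLM calls and the "parameters" can be prompts, code, or even molecule descriptions, as the abstract above lists.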

Matching Anything by Segmenting Anything

siyuanliii/masa CVPR 2024

The robust association of the same objects across video frames in complex scenes is crucial for many applications, especially Multiple Object Tracking (MOT).

Domain Generalization · Multiple Object Tracking +2

2.17 stars / hour

X-LoRA: Mixture of Low-Rank Adapter Experts, a Flexible Framework for Large Language Models with Applications in Protein Mechanics and Molecular Design

ericlbuehler/ 11 Feb 2024

Starting with a set of pre-trained LoRA adapters, our gating strategy uses the hidden states to dynamically mix adapted layers, allowing the resulting X-LoRA model to draw upon different capabilities and create never-before-used deep layer-wise combinations to solve tasks.

graph construction · Knowledge Graphs +2

2.06 stars / hour
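The gating mechanism described above can be sketched numerically: the hidden state drives a softmax over adapters, and each adapter contributes its low-rank update weighted by its gate. All names and shapes below are illustrative assumptions, not X-LoRA's actual code:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def x_lora_layer(h, W, adapters, gate_W):
    """One layer mixing pre-trained LoRA adapters (illustrative).

    adapters: list of (A, B) low-rank pairs; the hidden state h
    drives a softmax gate that weights each adapter's delta.
    """
    gates = softmax(gate_W @ h)  # one scalar gate per adapter
    delta = sum(g * (B @ (A @ h)) for g, (A, B) in zip(gates, adapters))
    return W @ h + delta         # frozen base weight plus gated LoRA mix

rng = np.random.default_rng(1)
d, r, k = 6, 2, 3                # hidden dim, LoRA rank, number of adapters
h = rng.standard_normal(d)
W = rng.standard_normal((d, d))  # frozen base projection
adapters = [(rng.standard_normal((r, d)), rng.standard_normal((d, r)))
            for _ in range(k)]
gate_W = rng.standard_normal((k, d))
y = x_lora_layer(h, W, adapters, gate_W)
```

Because the gates are recomputed from the hidden states at each layer, different layers can select different adapter mixtures, which is the "deep layer-wise combinations" the abstract refers to.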

Less is More: Removing Text-regions Improves CLIP Training Efficiency and Robustness

apple/axlearn 8 May 2023

In this paper, we discuss two effective approaches to improve the efficiency and robustness of CLIP training: (1) augmenting the training dataset while maintaining the same number of optimization steps, and (2) filtering out samples that contain text regions in the image.

Adversarial Text · Retrieval

1.77 stars / hour

Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training

apple/axlearn 23 May 2024

In this work, we revisit the settings by adopting step time as a more accurate measure of model complexity, and by determining the total compute budget under the Chinchilla compute-optimal settings.


1.76 stars / hour

LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning

line/libritts-p 12 Jun 2024

We employ a hybrid approach to construct prompt annotations: (1) manual annotations that capture human perceptions of speaker characteristics and (2) synthetic annotations on speaking style.

1.50 stars / hour

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

opengvlab/omnicorpus 12 Jun 2024

In this paper, we introduce OmniCorpus, a 10 billion-scale image-text interleaved dataset.

In-Context Learning

1.42 stars / hour