Zero-Shot Tokenizer Transfer

bminixhofer/zett 13 May 2024

Finally, we show that a ZeTT hypernetwork trained for a base (L)LM can also be applied to fine-tuned variants without extra training.

XLM-R · 70 stars · 0.25 stars / hour

LLMs as Hackers: Autonomous Linux Privilege Escalation Attacks

ipa-lab/hackingBuddyGPT 17 Oct 2023

We explore the intersection of LLMs and penetration testing to gain insight into their capabilities and challenges in the context of privilege escalation.

In-Context Learning · 140 stars · 0.25 stars / hour

PuLID: Pure and Lightning ID Customization via Contrastive Alignment

tothebeginning/pulid 24 Apr 2024

We propose Pure and Lightning ID customization (PuLID), a novel tuning-free ID customization method for text-to-image generation.

Text-to-Image Generation · 749 stars · 0.25 stars / hour

From NeRFs to Gaussian Splats, and Back

grasp-lyrl/nerftogsandback 15 May 2024

For robotics applications with a limited number of (typically ego-centric) views, parametric representations such as neural radiance fields (NeRFs) generalize better than non-parametric ones such as Gaussian splatting (GS) to views that differ substantially from the training data. GS, however, can render much faster than NeRFs.

SSIM · 18 stars · 0.24 stars / hour

Improving Sample Quality of Diffusion Models Using Self-Attention Guidance

lllyasviel/fooocus ICCV 2023

Denoising diffusion models (DDMs) have attracted attention for their exceptional generation quality and diversity.

Denoising · Image Generation · 36,245 stars · 0.24 stars / hour

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

weixiongust/rlhf-reward-modeling 13 Apr 2023

Utilizing a reward model and a sufficient number of samples, our approach selects the high-quality samples, discards those that exhibit undesired behavior, and subsequently enhances the model by fine-tuning on these filtered samples.

Ethics · 164 stars · 0.22 stars / hour
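The reward-ranked filtering step described in the RAFT abstract can be sketched as follows. This is a minimal illustration, not the authors' code: `reward_fn` stands in for a learned reward model, and `keep_ratio` is an assumed hyperparameter name.

```python
def raft_filter(samples, reward_fn, keep_ratio=0.25):
    """Rank candidate samples by reward and keep the top fraction
    for subsequent supervised fine-tuning (sketch of RAFT's filtering)."""
    scored = sorted(samples, key=reward_fn, reverse=True)
    k = max(1, int(len(scored) * keep_ratio))
    return scored[:k]

# Toy example: response length stands in for a learned reward model.
candidates = ["bad", "okay answer", "a much more detailed answer", "hi"]
kept = raft_filter(candidates, reward_fn=len, keep_ratio=0.5)
print(kept)
```

In the actual method the kept samples would then be used as fine-tuning targets for the generative model, closing the sample–rank–filter–fine-tune loop.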

depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers

thuml/depyf 14 Mar 2024

PyTorch 2.x introduces a compiler designed to accelerate deep learning programs.

338 stars · 0.22 stars / hour

Libra: Building Decoupled Vision System on Large Language Models

yifanxu74/libra 16 May 2024

Specifically, we incorporate a routed visual expert with a cross-modal bridge module into a pretrained LLM to route the vision and language flows during attention computing to enable different attention patterns in inner-modal modeling and cross-modal interaction scenarios.

Language Modelling · Large Language Model · 11 stars · 0.22 stars / hour
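The attention routing described in the Libra abstract can be illustrated with a toy mask. This is a hedged sketch under assumed semantics, not the paper's implementation: tokens are tagged by modality, attention within a modality is always allowed (inner-modal route), and a cross-modal "bridge" optionally lets text tokens attend to vision tokens.

```python
def routed_attention_mask(modalities, allow_bridge=True):
    """Return mask[i][j] = True if token i may attend to token j.
    Inner-modal attention is always routed; the cross-modal bridge
    (an assumption of this sketch) lets text attend to vision."""
    n = len(modalities)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if modalities[i] == modalities[j]:
                mask[i][j] = True          # inner-modal route
            elif allow_bridge and modalities[i] == "text":
                mask[i][j] = True          # bridge: text -> vision
    return mask

tokens = ["vision", "vision", "text", "text"]
mask = routed_attention_mask(tokens)
```

Distinct masks like this let a single attention layer exhibit different patterns for inner-modal modeling versus cross-modal interaction.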

Unified Training of Universal Time Series Forecasting Transformers

SalesforceAIResearch/uni2ts 4 Feb 2024

Deep learning for time series forecasting has traditionally operated within a one-model-per-dataset framework, limiting its potential to leverage the game-changing impact of large pre-trained models.

Time Series · Time Series Forecasting · 478 stars · 0.21 stars / hour

A Survey on Vision Mamba: Models, Applications and Challenges

ruixxxx/awesome-vision-mamba-models 29 Apr 2024

To help keep pace with the rapid advancements in computer vision, this paper aims to provide a comprehensive review of visual Mamba approaches.

134 stars · 0.21 stars / hour