KAN: Kolmogorov-Arnold Networks

kindxiaoming/pykan 30 Apr 2024

Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs).

3,886
20.01 stars / hour

StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation

hvision-nku/storydiffusion 2 May 2024

This module converts the generated sequence of images into videos with smooth transitions and consistent subjects that are significantly more stable than the modules based on latent spaces only, especially in the context of long video generation.

motion prediction Story Generation +1

216
6.38 stars / hour

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

prometheus-eval/prometheus-eval 2 May 2024

Proprietary LMs such as GPT-4 are often employed to assess the quality of responses from various LMs.

Language Modelling

98
3.38 stars / hour

X-LoRA: Mixture of Low-Rank Adapter Experts, a Flexible Framework for Large Language Models with Applications in Protein Mechanics and Molecular Design

ericlbuehler/mistral.rs 11 Feb 2024

Starting with a set of pre-trained LoRA adapters, our gating strategy uses the hidden states to dynamically mix adapted layers, allowing the resulting X-LoRA model to draw upon different capabilities and create never-before-used deep layer-wise combinations to solve tasks.

graph construction Knowledge Graphs +3

1,354
1.83 stars / hour

Improving Diffusion Models for Virtual Try-on

yisol/IDM-VTON 8 Mar 2024

Finally, we present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.

Virtual Try-on

1,578
1.59 stars / hour

Lightplane: Highly-Scalable Components for Neural 3D Fields

facebookresearch/lightplane 30 Apr 2024

Contemporary 3D research, particularly in reconstruction and generation, heavily relies on 2D images for inputs or supervision.

3D Reconstruction

120
1.54 stars / hour

ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving

JackAILab/ConsistentID 25 Apr 2024

ConsistentID comprises two key components: a multimodal facial prompt generator that combines facial features, corresponding facial descriptions and the overall facial context to enhance precision in facial details, and an ID-preservation network optimized through the facial attention localization strategy, aimed at preserving ID consistency in facial regions.

379
1.41 stars / hour

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

opengvlab/internvl 25 Apr 2024

Compared to both open-source and proprietary models, InternVL 1. 5 shows competitive performance, achieving state-of-the-art results in 8 of 18 benchmarks.

4k Language Modelling +3

1,449
1.30 stars / hour

PuLID: Pure and Lightning ID Customization via Contrastive Alignment

tothebeginning/pulid 24 Apr 2024

We propose Pure and Lightning ID customization (PuLID), a novel tuning-free ID customization method for text-to-image generation.

Text-to-Image Generation

355
1.17 stars / hour

RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing

2471023025/ralm_survey 30 Apr 2024

Large Language Models (LLMs) have catalyzed significant advancements in Natural Language Processing (NLP), yet they encounter challenges such as hallucination and the need for domain-specific knowledge.

Computational Efficiency Hallucination +2

65
0.91 stars / hour