In this survey, we provide a comprehensive overview of the current research progress on LLMs within the context of CL.
Building on this evaluation method, we propose a new metric to assess personality generation capability.
Applying Reinforcement Learning (RL) to sequence generation models enables the direct optimization of long-term rewards (\textit{e.g.,} BLEU and human feedback), but typically requires large-scale sampling over the space of action sequences.
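As a concrete illustration of this setup, the sketch below shows a REINFORCE-style policy-gradient update in which a sequence-level reward (such as BLEU) only arrives after a full action sequence has been sampled. The model shapes and reward value are illustrative assumptions, not any particular paper's implementation.

```python
# Minimal sketch (illustrative, not a specific paper's method): REINFORCE-style
# policy gradient for sequence generation, where the reward (e.g., BLEU) is
# only available once the whole sequence has been sampled.
import torch
import torch.nn.functional as F

def reinforce_loss(logits, sampled_ids, reward):
    """logits: (T, V) per-step vocabulary logits; sampled_ids: (T,) sampled
    tokens; reward: scalar sequence-level reward (e.g., BLEU vs. a reference)."""
    log_probs = F.log_softmax(logits, dim=-1)
    seq_log_prob = log_probs.gather(1, sampled_ids.unsqueeze(1)).sum()
    # Ascend on reward * log p(sequence), i.e., descend on its negation.
    return -reward * seq_log_prob

# Toy usage: one sampled sequence of length 5 over a 10-token vocabulary.
logits = torch.randn(5, 10, requires_grad=True)
sampled = torch.multinomial(F.softmax(logits, dim=-1), 1).squeeze(1)
loss = reinforce_loss(logits, sampled, reward=0.42)  # reward would come from BLEU
loss.backward()
```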
In this work, we propose CharacterFactory, a framework that allows sampling new characters with consistent identities in the latent space of GANs for diffusion models.
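A heavily hedged sketch of the general recipe such a framework suggests (the generator architecture, dimensions, and injection mechanism below are assumptions for illustration only): a small GAN-style generator maps Gaussian noise into the word-embedding space of the diffusion model's text encoder, so each latent draw yields a new, reusable pseudo-identity embedding.

```python
# Hedged sketch of the idea (names and shapes are assumptions, not the paper's
# code): a GAN generator maps latent noise to a pseudo-identity word embedding
# in the diffusion model's text-embedding space.
import torch
import torch.nn as nn

EMB_DIM = 768  # assumed text-embedding width (e.g., CLIP in Stable Diffusion v1.x)

identity_generator = nn.Sequential(  # stand-in for a trained GAN generator
    nn.Linear(64, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, EMB_DIM),
)

z = torch.randn(1, 64)                       # one new character per latent draw
pseudo_identity_emb = identity_generator(z)  # (1, 768); would be injected in
                                             # place of a placeholder token's
                                             # embedding and reused across
                                             # prompts for identity consistency
```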
2) 3D Backbone: We present an asymmetric U-Net as a high-throughput backbone operating on multi-view images, which can be produced from text or a single-view image by leveraging multi-view diffusion models.
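To make the "asymmetric" design concrete, here is a minimal PyTorch sketch under assumed layer sizes (not the paper's architecture): the encoder downsamples three times while the decoder upsamples only once, so high-resolution multi-view inputs are processed into lower-resolution features at high throughput.

```python
# Illustrative sketch only (layer sizes are assumptions): an "asymmetric" U-Net
# whose decoder upsamples fewer times than the encoder downsamples.
import torch
import torch.nn as nn

class AsymmetricUNet(nn.Module):
    def __init__(self, in_ch=3, base=32):
        super().__init__()
        self.enc1 = nn.Conv2d(in_ch, base, 3, stride=2, padding=1)          # /2
        self.enc2 = nn.Conv2d(base, base * 2, 3, stride=2, padding=1)       # /4
        self.enc3 = nn.Conv2d(base * 2, base * 4, 3, stride=2, padding=1)   # /8
        # Only one upsampling stage: output stays at 1/4 input resolution.
        self.dec = nn.ConvTranspose2d(base * 4, base * 2, 4, stride=2, padding=1)
        self.skip = nn.Conv2d(base * 4, base * 2, 1)  # fuse encoder skip

    def forward(self, x):                      # x: (B*V, 3, H, W), V stacked views
        e1 = torch.relu(self.enc1(x))
        e2 = torch.relu(self.enc2(e1))
        e3 = torch.relu(self.enc3(e2))
        d = torch.relu(self.dec(e3))               # back up to 1/4 resolution
        return self.skip(torch.cat([d, e2], 1))    # (B*V, 64, H/4, W/4)

feats = AsymmetricUNet()(torch.randn(4, 3, 256, 256))  # e.g., 4 views
```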
For metric depth estimation, we show that the key to a zero-shot single-view model lies in resolving the metric ambiguity introduced by different camera models and in training on large-scale data.
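One common way to resolve this kind of metric ambiguity, sketched below as an assumption about the general recipe rather than this paper's exact transform, is to predict depth in a canonical camera with a fixed focal length and rescale by the real focal length:

```python
# Hedged sketch (an assumed general recipe, not this paper's exact transform):
# predict depth under a canonical camera, then rescale by the focal-length ratio.
F_CANONICAL = 1000.0  # assumed canonical focal length in pixels

def canonical_to_metric_depth(canonical_depth, focal_px):
    """Map depth predicted under the canonical camera to metric depth for an
    image taken with focal length `focal_px` (pinhole camera assumed)."""
    return canonical_depth * (focal_px / F_CANONICAL)

# e.g., a pixel predicted at 2.0 canonical units with a 500 px focal length
# maps to 2.0 * (500 / 1000) = 1.0 in metric units.
```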
Despite recent progress in long-context language models, it remains unclear how transformer-based models acquire the capability to retrieve relevant information from arbitrary locations within a long context.
To further advance semi-supervised generative and classification tasks, we propose a simple yet effective training strategy called dual pseudo training (DPT), built upon strong semi-supervised learners and diffusion models.
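The abstract does not spell out the loop, but a dual-pseudo-training-style procedure could plausibly look like the following sketch, where every function passed in is a hypothetical stand-in used only to make the stages concrete:

```python
# Sketch under stated assumptions: the classifier, diffusion model, and their
# train/sample functions are hypothetical stand-ins, not the paper's code.
def dual_pseudo_training(labeled, unlabeled,
                         train_classifier, train_diffusion, sample_diffusion):
    # Stage 1: a semi-supervised classifier pseudo-labels the unlabeled data.
    clf = train_classifier(labeled, unlabeled)
    pseudo_labeled = [(x, clf(x)) for x in unlabeled]
    # Stage 2: a conditional diffusion model is trained on real + pseudo labels.
    gen = train_diffusion(labeled + pseudo_labeled)
    # Stage 3: the classifier is retrained with diffusion-generated pseudo data.
    synthetic = sample_diffusion(gen, num_samples=len(unlabeled))
    return train_classifier(labeled + synthetic, unlabeled)
```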
Two significant challenges exist: the lack of an affordable and accessible teleoperation system suitable for a dual-arm setup with multifingered hands, and the scarcity of multifingered hand hardware equipped with touch sensing.
Set-of-Mark (SoM) Prompting unleashes the visual grounding capability of GPT-4V by enabling the model to associate visual objects with tags overlaid on the image.
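A minimal sketch of what inserting such tags can look like in practice (the region boxes and the downstream multimodal call are assumed inputs, not part of any official SoM code):

```python
# Minimal Set-of-Mark-style sketch: number each detected region directly on
# the image, then refer to regions by tag in the text prompt.
from PIL import Image, ImageDraw

def draw_marks(image: Image.Image, region_boxes):
    """Overlay numeric tags; region_boxes: list of (x0, y0, x1, y1) pixels."""
    draw = ImageDraw.Draw(image)
    for i, (x0, y0, x1, y1) in enumerate(region_boxes, start=1):
        draw.rectangle([x0, y0, x1, y1], outline="red", width=2)
        draw.text((x0 + 3, y0 + 3), str(i), fill="red")
    return image

img = draw_marks(Image.new("RGB", (640, 480), "white"), [(50, 50, 200, 200)])
prompt = "Using the numbered marks in the image, what is object [1]?"
# `img` and `prompt` would then be sent together to a multimodal model.
```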