GIVT: Generative Infinite-Vocabulary Transformers

google-research/big_vision 4 Dec 2023

We introduce generative infinite-vocabulary transformers (GIVT) which generate vector sequences with real-valued entries, instead of discrete tokens from a finite vocabulary.

Conditional Image Generation, Decoder, +2

1,675
0.27 stars / hour

LOHO: Latent Optimization of Hairstyles via Orthogonalization

dukebw/LOHO CVPR 2021

We propose Latent Optimization of Hairstyles via Orthogonalization (LOHO), an optimization-based approach using GAN inversion to infill missing hair structure details in latent space during hairstyle transfer.

SSIM

226
0.27 stars / hour

InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models

tencentarc/instantmesh 10 Apr 2024

We present InstantMesh, a feed-forward framework for instant 3D mesh generation from a single image, featuring state-of-the-art generation quality and significant training scalability.

Image to 3D

2,030
0.27 stars / hour

DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos

dreamscene4d/dreamscene4d 3 May 2024

We first decompose the video scene by using open-vocabulary mask trackers and an adapted image diffusion model to segment, track, and amodally complete the objects and background in the video.

Depth Estimation, Depth Prediction, +3

72
0.27 stars / hour

State-Free Inference of State-Space Models: The Transfer Function Approach

ruke1ire/RTF 10 May 2024

We approach designing a state-space model for deep learning applications through its dual representation, the transfer function, and uncover a highly efficient sequence parallel inference algorithm that is state-free: unlike other proposed algorithms, state-free inference does not incur any significant memory or computational cost with an increase in state size.

Language Modelling

24
0.26 stars / hour

FREB-TQA: A Fine-Grained Robustness Evaluation Benchmark for Table Question Answering

hiyouga/llama-factory 29 Apr 2024

To investigate these aspects, we create and publish a novel TQA evaluation benchmark in English.

Question Answering

22,198
0.26 stars / hour

ESRL: Efficient Sampling-based Reinforcement Learning for Sequence Generation

hiyouga/llama-efficient-tuning 4 Aug 2023

Applying Reinforcement Learning (RL) to sequence generation models enables the direct optimization of long-term rewards (e.g., BLEU and human feedback), but typically requires large-scale sampling over a space of action sequences.

Abstractive Text Summarization, Language Modelling, +5

22,205
0.26 stars / hour

SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing

modelscope/swift 18 Dec 2023

Image diffusion models have been utilized in various tasks, such as text-to-image generation and controllable image synthesis.

Decoder, Text-to-Image Generation

1,453
0.26 stars / hour

Make Your LLM Fully Utilize the Context

hsiehjackson/ruler 25 Apr 2024

While many contemporary large language models (LLMs) can process lengthy input, they still struggle to fully utilize information within the long context, known as the lost-in-the-middle challenge.

4k, Information Retrieval, +1

181
0.26 stars / hour

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

prometheus-eval/prometheus-eval 2 May 2024

Proprietary LMs such as GPT-4 are often employed to assess the quality of responses from various LMs.

Language Modelling

467
0.25 stars / hour