CodeTF: One-stop Transformer Library for State-of-the-art Code LLM

salesforce/codetf 31 May 2023

In this paper, we present CodeTF, an open-source Transformer-based library for state-of-the-art Code LLMs and code intelligence.

309
4.11 stars / hour

Let's Verify Step by Step

openai/prm800k Preprint 2023

We conduct our own investigation, finding that process supervision significantly outperforms outcome supervision for training models to solve problems from the challenging MATH dataset.

Ranked #1 on Math Word Problem Solving on MATH minival (using extra training data)

Active Learning, Math Word Problem Solving, +1

655
3.61 stars / hour

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

mit-han-lab/llm-awq 1 Jun 2023

Large language models (LLMs) have shown excellent performance on various tasks, but the astronomical model size raises the hardware barrier for serving (memory size) and slows down token generation (memory bandwidth).

Common Sense Reasoning, Language Modelling, +1

189
2.91 stars / hour

Gorilla: Large Language Model Connected with Massive APIs

ShishirPatil/gorilla 24 May 2023

Large Language Models (LLMs) have seen an impressive wave of advances recently, with models now excelling in a variety of tasks, such as mathematical reasoning and program synthesis.

Language Modelling, Mathematical Reasoning, +2

3,085
2.68 stars / hour

Large Language Models as Tool Makers

ctlllll/llm-toolmaker 26 May 2023

Our approach consists of two key phases: 1) tool making: an LLM acts as the tool maker that crafts tools for given tasks, where a tool is implemented as a Python utility function.

638
2.35 stars / hour

Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

facebookresearch/hiera 1 Jun 2023

Modern hierarchical vision transformers have added several vision-specific components in the pursuit of supervised classification performance.

Video Recognition

141
2.05 stars / hour

ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation

threestudio-project/threestudio 25 May 2023

In this work, we propose to model the 3D parameter as a random variable instead of a constant as in SDS and present variational score distillation (VSD), a principled particle-based variational framework to explain and address the aforementioned issues in text-to-3D generation.

Text to 3D

1,278
2.02 stars / hour

StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation

icoz69/styleavatar3d 30 May 2023

The recent advancements in image-text diffusion models have stimulated research interest in large-scale 3D generative models.

281
1.76 stars / hour

Humans in 4D: Reconstructing and Tracking Humans with Transformers

shubham-goel/4D-Humans 31 May 2023

To analyze video, we use 3D reconstructions from HMR 2.0 as input to a tracking system that operates in 3D.

Action Recognition, Human Mesh Recovery, +1

226
1.57 stars / hour

Generating Sequences With Recurrent Neural Networks

sjvasquez/handwriting-synthesis 4 Aug 2013

This paper shows how Long Short-term Memory recurrent neural networks can be used to generate complex sequences with long-range structure, simply by predicting one data point at a time.

Language Modelling, Text Generation

3,422
1.49 stars / hour