CodeTF: One-stop Transformer Library for State-of-the-art Code LLM

salesforce/codetf 31 May 2023

In this paper, we present CodeTF, an open-source Transformer-based library for state-of-the-art Code LLMs and code intelligence.

617
3.78 stars / hour

Let's Verify Step by Step

openai/prm800k Preprint 2023

We conduct our own investigation, finding that process supervision significantly outperforms outcome supervision for training models to solve problems from the challenging MATH dataset.

Ranked #1 on Math Word Problem Solving on MATH minival (using extra training data)

Active Learning Math Word Problem Solving +1

710
2.96 stars / hour

Gorilla: Large Language Model Connected with Massive APIs

ShishirPatil/gorilla 24 May 2023

Large Language Models (LLMs) have seen an impressive wave of advances recently, with models now excelling in a variety of tasks, such as mathematical reasoning and program synthesis.

Language Modelling Mathematical Reasoning +2

3,236
2.54 stars / hour

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

mit-han-lab/llm-awq 1 Jun 2023

Large language models (LLMs) have shown excellent performance on various tasks, but the astronomical model size raises the hardware barrier for serving (memory size) and slows down token generation (memory bandwidth).

Common Sense Reasoning Language Modelling +1

267
2.27 stars / hour

Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

facebookresearch/hiera 1 Jun 2023

Modern hierarchical vision transformers have added several vision-specific components in the pursuit of supervised classification performance.

Video Recognition

213
1.63 stars / hour

Large Language Models as Tool Makers

ctlllll/llm-toolmaker 26 May 2023

Our approach consists of two key phases: 1) tool making: an LLM acts as the tool maker that crafts tools for given tasks, where a tool is implemented as a Python utility function.

659
1.54 stars / hour

Humans in 4D: Reconstructing and Tracking Humans with Transformers

shubham-goel/4D-Humans 31 May 2023

To analyze video, we use 3D reconstructions from HMR 2.0 as input to a tracking system that operates in 3D.

Action Recognition Human Mesh Recovery +1

308
1.40 stars / hour

StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation

icoz69/styleavatar3d 30 May 2023

The recent advancements in image-text diffusion models have stimulated research interest in large-scale 3D generative models.

329
1.37 stars / hour

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

kyegomez/tree-of-thoughts 17 May 2023

Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference.

Decision Making Language Modelling

2,650
1.29 stars / hour

EasySpider: A No-Code Visual System for Crawling the Web

NaiboWang/EasySpider ACM The Web Conference 2023

As such, web-crawling is an essential tool for both computational and non-computational scientists to conduct research.

Data Integration Marketing

9,612
1.24 stars / hour