Most implemented papers

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

guidance-ai/guidance 28 Jan 2022

We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning.

GPT-4 Technical Report

openai/evals Preprint 2023

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs.

The Matrix Calculus You Need For Deep Learning

parrt/bookish 5 Feb 2018

This paper is an attempt to explain all the matrix calculus you need in order to understand the training of deep neural networks.

Full Page Handwriting Recognition via Image to Sequence Extraction

kingyiusuen/image-to-latex 11 Mar 2021

We present a Neural Network based Handwritten Text Recognition (HTR) model architecture that can be trained to recognize full pages of handwritten or printed text without image segmentation.

PaLM: Scaling Language Modeling with Pathways

lucidrains/CoCa-pytorch Google Research 2022

To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model (PaLM).

Mistral 7B

mistralai/mistral-src 10 Oct 2023

We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency.

Measuring Mathematical Problem Solving With the MATH Dataset

hendrycks/math 5 Mar 2021

To facilitate future research and increase accuracy on MATH, we also contribute a large auxiliary pretraining dataset which helps teach models the fundamentals of mathematics.

How is ChatGPT's behavior changing over time?

lchen001/llmdrift 18 Jul 2023

We find that the performance and behavior of both GPT-3.5 and GPT-4 can vary greatly over time.

Llemma: An Open Language Model For Mathematics

eleutherai/gpt-neox 16 Oct 2023

We present Llemma, a large language model for mathematics.

Enhancing the Transformer with Explicit Relational Encoding for Math Problem Solving

ischlag/TP-Transformer 15 Oct 2019

We incorporate Tensor-Product Representations within the Transformer in order to better support the explicit representation of relation structure.