Large Language Model
2003 papers with code • 3 benchmarks • 9 datasets
Benchmarks
These leaderboards are used to track progress in Large Language Model
Libraries
Use these libraries to find Large Language Model models and implementations.

Most implemented papers
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Evaluating large language model (LLM) based chat assistants is challenging due to their broad capabilities and the inadequacy of existing benchmarks in measuring human preferences.
CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis
To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open-source the training library JAXFORMER.
Generative Agents: Interactive Simulacra of Human Behavior
Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools.
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
Our work, for the first time, uncovers that properly aligning visual features with an advanced large language model yields numerous advanced multi-modal abilities demonstrated by GPT-4, such as detailed image description generation and website creation from hand-drawn drafts.
Efficient Memory Management for Large Language Model Serving with PagedAttention
On top of it, we build vLLM, an LLM serving system that achieves (1) near-zero waste in KV cache memory and (2) flexible sharing of KV cache within and across requests to further reduce memory usage.
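The "near-zero waste" claim comes from allocating KV-cache memory in small fixed-size blocks on demand, rather than reserving a full-length buffer per request. A minimal sketch of that block-table bookkeeping, with plain Python lists standing in for GPU memory (the class and names here are illustrative, not vLLM's actual API):

```python
BLOCK_SIZE = 4  # tokens per cache block (illustrative; real systems use e.g. 16)

class PagedKVCache:
    """Toy block-table allocator in the spirit of PagedAttention."""

    def __init__(self, num_blocks):
        self.free_blocks = list(range(num_blocks))
        self.block_tables = {}   # request id -> list of physical block ids
        self.lengths = {}        # request id -> tokens cached so far

    def append_token(self, req_id):
        # Allocate a new block only when the current one is full,
        # so at most BLOCK_SIZE - 1 slots are ever wasted per request.
        n = self.lengths.get(req_id, 0)
        if n % BLOCK_SIZE == 0:
            if not self.free_blocks:
                raise MemoryError("KV cache exhausted")
            self.block_tables.setdefault(req_id, []).append(self.free_blocks.pop())
        self.lengths[req_id] = n + 1

    def free(self, req_id):
        # Finished requests return their blocks to the shared pool.
        self.free_blocks.extend(self.block_tables.pop(req_id, []))
        self.lengths.pop(req_id, None)

cache = PagedKVCache(num_blocks=8)
for _ in range(5):            # 5 tokens need ceil(5/4) = 2 blocks
    cache.append_token("req-0")
```

Because block tables are per-request indirection, two requests can also point at the same physical blocks (e.g. a shared prompt prefix), which is the "flexible sharing" the paper refers to; copy-on-write handling is omitted here.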
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
In this work, we unify visual representation into the language feature space to advance the foundational LLM towards a unified LVLM.
Evolution of Heuristics: Towards Efficient Automatic Algorithm Design Using Large Language Model
EoH represents the ideas of heuristics in natural language, termed thoughts.
Muse: Text-To-Image Generation via Masked Generative Transformers
Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding.
Accelerating Large Language Model Decoding with Speculative Sampling
We present speculative sampling, an algorithm for accelerating transformer decoding by enabling the generation of multiple tokens from each transformer call.
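The core idea is that a cheap draft model proposes several tokens, and one call to the expensive target model verifies all of them at once, keeping the longest agreed prefix plus one corrected token. A toy greedy variant (the real algorithm samples and uses an acceptance ratio; `draft_next` and `target_next` below are hypothetical stand-in models, not a library API):

```python
def draft_next(seq):
    # Cheap draft model: guess the next token (toy rule).
    return (seq[-1] + 1) % 10

def target_next(seq):
    # Expensive target model: agrees with the draft except after token 7.
    return (seq[-1] + 1) % 10 if seq[-1] != 7 else 0

def speculate_step(seq, k=4):
    # 1. Draft k tokens autoregressively with the cheap model.
    proposal = list(seq)
    for _ in range(k):
        proposal.append(draft_next(proposal))
    # 2. The target scores every drafted position in what would be a
    #    single batched forward pass (a loop stands in for it here).
    accepted = list(seq)
    for i in range(len(seq), len(proposal)):
        t = target_next(proposal[:i])
        accepted.append(t)
        if t != proposal[i]:   # first disagreement: keep the correction, stop
            break
    return accepted
```

Starting from `[5]`, the draft proposes `6, 7, 8, 9`; the target accepts `6` and `7`, rejects `8` in favour of `0`, so one target call yields three tokens instead of one.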
Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
Furthermore, we propose a new technique called Self-Distill with Feedback to further improve the performance of the Baize models with feedback from ChatGPT.