Large Language Model
248 papers with code • 0 benchmarks • 0 datasets
Benchmarks
These leaderboards are used to track progress in Large Language Models.
Libraries
Use these libraries to find Large Language Model models and implementations.

Most implemented papers
CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis
To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open source the training library JAXFORMER.
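The released checkpoints are hosted on Hugging Face, so a completion can be sampled in a few lines. A minimal sketch, assuming the publicly hosted Salesforce/codegen-350M-mono checkpoint and illustrative decoding settings rather than the paper's evaluation setup:

    # Hedged sketch: single-turn code completion with a released CodeGen
    # checkpoint; the model id and decoding settings are illustrative.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
    model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

    prompt = "# return the n-th Fibonacci number\ndef fib(n):"
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=64)  # greedy decoding
    print(tokenizer.decode(output[0], skip_special_tokens=True))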
Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning
Recent pretrained language models extend from millions to billions of parameters.
Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
Furthermore, we propose a new technique called Self-Distill with Feedback to further improve the performance of the Baize models with feedback from ChatGPT.
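The abstract only names the technique; a speculative sketch of the loop it suggests, where every helper below is hypothetical rather than Baize's actual code:

    # Speculative sketch of self-distill with feedback: sample several
    # candidate responses, let a feedback model (e.g. ChatGPT) pick the
    # best one, and fine-tune on the selected pairs.
    # `generate`, `rank_with_feedback`, and `finetune` are hypothetical.
    def self_distill_with_feedback(model, prompts, rank_with_feedback,
                                   finetune, n_candidates=4):
        selected = []
        for prompt in prompts:
            candidates = [model.generate(prompt) for _ in range(n_candidates)]
            best = rank_with_feedback(prompt, candidates)  # external judge
            selected.append((prompt, best))
        return finetune(model, selected)  # distill on preferred responses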
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
To examine this phenomenon, we present MiniGPT-4, which aligns a frozen visual encoder with a frozen LLM, Vicuna, using just one projection layer.
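The alignment idea is compact enough to sketch: a single trainable linear layer maps frozen visual-encoder features into the frozen LLM's embedding space. The dimensions below are illustrative placeholders, not MiniGPT-4's exact sizes:

    import torch
    import torch.nn as nn

    vision_dim, llm_dim = 1408, 4096       # hypothetical feature sizes
    proj = nn.Linear(vision_dim, llm_dim)  # the only trainable module

    visual_feats = torch.randn(1, 32, vision_dim)  # frozen encoder output
    visual_tokens = proj(visual_feats)             # now LLM-embedding-sized
    # visual_tokens would be prepended to the text-token embeddings
    # before running the frozen LLM.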
Fast-R2D2: A Pretrained Recursive Neural Network based on Pruned CKY for Grammar Induction and Text Representation
We propose to use a top-down parser as a model-based pruning method, which also enables parallel encoding during inference.
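A schematic of pruned CKY under that idea: the top-down parser nominates a subset of spans, and the chart only composes those cells instead of all O(n^2) of them. `embed` and `compose` are placeholder scoring functions, not the paper's model:

    def pruned_cky(tokens, allowed_spans, embed, compose):
        # Chart entries are (representation, score) pairs; `compose`
        # merges two children into a parent pair. Cells outside the
        # pruner's `allowed_spans` are never composed.
        n = len(tokens)
        chart = {(i, i + 1): (embed(tokens[i]), 0.0) for i in range(n)}
        for length in range(2, n + 1):
            for i in range(n - length + 1):
                j = i + length
                if length < n and (i, j) not in allowed_spans:
                    continue  # pruned cell: skipped entirely
                splits = [compose(chart[(i, k)], chart[(k, j)])
                          for k in range(i + 1, j)
                          if (i, k) in chart and (k, j) in chart]
                if splits:
                    chart[(i, j)] = max(splits, key=lambda pair: pair[1])
        return chart.get((0, n))  # root representation, if derivable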
Elixir: Train a Large Language Model on a Small GPU Cluster
To reduce GPU memory usage, memory partitioning and memory offloading have been proposed.
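The latter idea, offloading, can be illustrated in a few lines: keep weights on the CPU and move each layer to the GPU only while it executes. This is a toy, hand-rolled sketch of the general technique, not Elixir's system:

    import torch
    import torch.nn as nn

    layers = nn.ModuleList([nn.Linear(1024, 1024) for _ in range(8)])  # on CPU

    def forward_with_offload(x, device="cuda"):
        x = x.to(device)
        for layer in layers:
            layer.to(device)   # fetch weights just in time
            x = layer(x)
            layer.to("cpu")    # offload immediately, freeing GPU memory
        return x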
Emergent Analogical Reasoning in Large Language Models
In human cognition, this capacity is closely tied to an ability to reason by analogy.
Muse: Text-To-Image Generation via Masked Generative Transformers
Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient because it uses discrete tokens and requires fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to its use of parallel decoding.
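Parallel decoding over discrete tokens can be sketched generically: predict every masked position in one forward pass, keep the most confident predictions, and re-mask the rest for the next pass. The linear unmasking schedule and the `model`/`mask_id` placeholders below are simplifying assumptions, not Muse's exact procedure:

    import torch

    def parallel_decode(model, seq_len, steps=8, mask_id=0):
        # Start fully masked and fill the sequence in a few passes.
        tokens = torch.full((1, seq_len), mask_id, dtype=torch.long)
        for step in range(steps):
            masked = tokens.eq(mask_id)
            if not masked.any():
                break
            logits = model(tokens)                   # (1, seq_len, vocab)
            conf, pred = logits.softmax(-1).max(-1)  # per-slot confidence
            conf = torch.where(masked, conf, torch.full_like(conf, -1.0))
            # Unmask a share of the remaining slots, most confident first.
            n_unmask = max(1, int(masked.sum()) // (steps - step))
            keep = conf.topk(n_unmask, dim=-1).indices
            tokens[0, keep[0]] = pred[0, keep[0]]
        return tokens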
PaLM-E: An Embodied Multimodal Language Model
Large language models excel at a wide range of complex tasks.
OpenICL: An Open-Source Framework for In-context Learning
However, implementing ICL is complex due to the diverse retrieval and inference methods involved, as well as the varying pre-processing requirements for different models, datasets, and tasks.
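The pipeline such a framework abstracts reduces to retrieve-then-infer. A minimal generic sketch, where the retriever, prompt template, and `generate` backend are placeholders rather than OpenICL's actual API:

    def icl_predict(test_input, train_set, retrieve, generate, k=4):
        # Pick k in-context demonstrations, format a prompt, query an LM.
        demos = retrieve(test_input, train_set, k)  # e.g. top-k similarity
        prompt = "".join(f"Q: {x}\nA: {y}\n\n" for x, y in demos)
        prompt += f"Q: {test_input}\nA:"
        return generate(prompt)                     # any completion backend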