Databricks Dolly 15k is a dataset of 15,000 high-quality, human-generated prompt/response pairs designed specifically for instruction tuning large language models. It was authored by more than 5,000 Databricks employees during March and April of 2023. The training records are natural and expressive, and they are designed to represent a wide range of behaviors, from brainstorming and content generation to information extraction and summarization.
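As a rough illustration of how such prompt/response records might be prepared for instruction tuning, the sketch below renders each record as a single training string and tallies records by behavior type. The field names (`instruction`, `context`, `response`, `category`) are assumed to match the published dataset, the two sample records are invented for this example, and the `### Instruction:` template and the `format_example` helper are hypothetical choices rather than a prescribed format.

```python
from collections import Counter

# Invented sample records for illustration only; the field names are
# assumed to match the published dataset's schema.
records = [
    {
        "instruction": "Suggest three names for a bakery.",
        "context": "",
        "response": "Golden Crust, Flour & Fire, The Rolling Pin.",
        "category": "brainstorming",
    },
    {
        "instruction": "Summarize the passage in one sentence.",
        "context": "Bread is made from flour, water, yeast and salt.",
        "response": "Bread needs only four basic ingredients.",
        "category": "summarization",
    },
]


def format_example(record):
    """Render one record as a single training string (hypothetical template)."""
    parts = ["### Instruction:\n" + record["instruction"]]
    # Only some records carry a reference passage; skip the section if empty.
    if record["context"]:
        parts.append("### Context:\n" + record["context"])
    parts.append("### Response:\n" + record["response"])
    return "\n\n".join(parts)


# Count how many records fall under each behavior category.
category_counts = Counter(record["category"] for record in records)

training_texts = [format_example(record) for record in records]
```

Leaving out the context section when it is empty keeps prompts for open-ended tasks such as brainstorming short, while extraction and summarization records keep their reference passage.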
Source: Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM