1 code implementation • 2 Jun 2022 • Jacob Portes, Davis Blalock, Cory Stephenson, Jonathan Frankle
Benchmarking the tradeoff between neural network accuracy and training time is computationally expensive.
no code implementations • 22 Nov 2023 • Aditi Jha, Sam Havens, Jeremey Dohmann, Alex Trott, Jacob Portes
We find that subsets of 1k-6k instruction-finetuning samples are sufficient to achieve good performance on both (1) traditional NLP benchmarks and (2) model-based evaluation.
1 code implementation • NeurIPS 2023 • Jacob Portes, Alex Trott, Sam Havens, Daniel King, Abhinav Venigalla, Moin Nadeem, Nikhil Sardana, Daya Khudia, Jonathan Frankle
Here we introduce MosaicBERT, a BERT-style encoder architecture and training recipe that is empirically optimized for fast pretraining.