no code implementations • 30 Jan 2024 • Philipp Singer, Pascal Pfeiffer, Yauhen Babakhin, Maximilian Jeblick, Nischay Dhankhar, Gabor Fodor, Sri Satish Ambati
We present H2O-Danube, a series of small 1.8B language models consisting of H2O-Danube-1.8B, trained on 1T tokens, and the incrementally improved H2O-Danube2-1.8B, trained on an additional 2T tokens.