9 Nov 2022 • Gil Sadeh, Zichen Wang, Jasleen Grewal, Huzefa Rangwala, Layne Price
In this paper, we propose a new peptide data augmentation scheme, where we train peptide language models on artificially constructed peptides that are small contiguous subsets of longer, wild-type proteins; we refer to the training peptides as "chopped proteins".
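As a rough illustration of what "chopping" could look like, the sketch below draws random contiguous windows from a longer sequence. This is not the authors' implementation: the function name `chop_protein`, the length bounds, the uniform sampling of window length and start position, and the toy sequence are all assumptions made here for illustration.

```python
import random


def chop_protein(sequence, min_len=5, max_len=30, n_samples=4, seed=None):
    """Sample contiguous sub-sequences ("chopped proteins") from one protein.

    Hypothetical helper: the length bounds and the uniform sampling are
    illustrative assumptions, not values taken from the paper.
    """
    if min_len < 1 or min_len > max_len:
        raise ValueError("require 1 <= min_len <= max_len")
    if len(sequence) < min_len:
        return []  # protein is too short to yield a peptide of the minimum length

    rng = random.Random(seed)  # local generator so results are reproducible
    longest = min(max_len, len(sequence))
    peptides = []
    for _ in range(n_samples):
        length = rng.randint(min_len, longest)            # inclusive bounds
        start = rng.randint(0, len(sequence) - length)    # window stays inside the protein
        peptides.append(sequence[start:start + length])
    return peptides


if __name__ == "__main__":
    # Toy amino-acid string used only as an example input.
    protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
    for peptide in chop_protein(protein, min_len=5, max_len=12, n_samples=6, seed=0):
        print(peptide)
```

Every returned string is a contiguous slice of the input, so it preserves the local residue order of the wild-type protein. How many windows to draw per protein and what length distribution to use are design choices that this sketch leaves open.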