no code implementations • 27 Nov 2023 • Anusha Sabbineni, Nikhil Anand, Maria Minakova
While data selection methods have been studied extensively in active learning, data pruning, and data augmentation settings, there is little evidence for the efficacy of these methods in industry-scale settings, particularly in low-resource languages.
no code implementations • 27 Nov 2023 • Nikhil Anand, Joshua Tan, Maria Minakova
Modern ML systems ingest data aggregated from diverse sources, such as synthetic data, human-annotated data, and live customer traffic.