no code implementations • EMNLP (NLLP) 2021 • Fernando Trias, Hongming Wang, Sylvain Jaume, Stratos Idreos
Older legal texts are often scanned and digitized via Optical Character Recognition (OCR), which results in numerous errors.
no code implementations • 16 Oct 2024 • Costin-Andrei Oncescu, Sanket Purandare, Stratos Idreos, Sham Kakade
While transformers have been at the core of most recent advancements in sequence generative models, their computational cost remains quadratic in sequence length.
1 code implementation • 9 Oct 2024 • Wanchao Liang, Tianyu Liu, Less Wright, Will Constable, Andrew Gu, Chien-chin Huang, Iris Zhang, Wei Feng, Howard Huang, Junjie Wang, Sanket Purandare, Gokul Nadathur, Stratos Idreos
By stacking training optimizations, we demonstrate accelerations of 65.08% with 1D parallelism at the 128-GPU scale (Llama 3.1 8B), an additional 12.59% with 2D parallelism at the 256-GPU scale (Llama 3.1 70B), and an additional 30% with 3D parallelism at the 512-GPU scale (Llama 3.1 405B) on NVIDIA H100 GPUs over optimized baselines.
2 code implementations • 30 Jun 2022 • Eric R. Knorr, Baptiste Lemaire, Andrew Lim, Siqiang Luo, Huanchen Zhang, Stratos Idreos, Michael Mitzenmacher
We introduce Proteus, a novel self-designing approximate range filter, which configures itself based on sampled data in order to optimize its false positive rate (FPR) for a given space requirement.
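As a rough illustration of what an approximate range filter does, the sketch below stores only fixed-length key prefixes and answers "might any key fall in this range?". It is not the Proteus design: the class name `PrefixRangeFilter`, the parameters `key_bits` and `prefix_bits`, and the use of a plain Python set in place of a compact probabilistic structure are all assumptions made for illustration. The prefix length stands in for the kind of knob that trades space against false positive rate.

```python
class PrefixRangeFilter:
    """Hypothetical prefix-based range filter over fixed-width integer keys."""

    def __init__(self, keys, key_bits=16, prefix_bits=8):
        # Keep only the top `prefix_bits` bits of each key (assumed knob:
        # shorter prefixes use less space but give more false positives).
        self.shift = key_bits - prefix_bits
        self.prefixes = {k >> self.shift for k in keys}

    def may_contain_range(self, low, high):
        """Return False only if no stored key lies in [low, high].

        True may be a false positive, because truncated prefixes cannot
        tell apart keys that share the same leading bits.
        """
        first = low >> self.shift
        last = high >> self.shift
        return any(p in self.prefixes for p in range(first, last + 1))


f = PrefixRangeFilter([0x1234, 0xABCD])
print(f.may_contain_range(0x1200, 0x12FF))  # True: 0x1234 is in range
print(f.may_contain_range(0x2000, 0x2FFF))  # False: range is definitely empty
print(f.may_contain_range(0x1200, 0x1210))  # True: a false positive
```

The filter never reports an empty result for a range that holds a stored key, which is the property that makes such filters safe to place in front of an expensive lookup.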
no code implementations • ICLR 2021 • Abdul Wasay, Stratos Idreos
We identify a critical part of this design space that is not well understood: how to decide between expanding a single network model and increasing the number of networks and using them together in an ensemble.
no code implementations • 11 Jul 2019 • Stratos Idreos, Niv Dayan, Wilson Qin, Mali Akmanalp, Sophie Hilgard, Andrew Ross, James Lennon, Varun Jain, Harshita Gupta, David Li, Zichen Zhu
The critical insight and potential long-term impact is that such unifying models 1) allow data structures that have so far been considered fundamentally different to be seen as views of the very same overall design space, and 2) reveal new data structure designs with performance properties that existing designs cannot achieve.
no code implementations • 12 Sep 2018 • Abdul Wasay, Brian Hentschel, Yuze Liao, Sanyuan Chen, Stratos Idreos
We propose MotherNets to enable higher accuracy and practical training cost for large and diverse neural network ensembles: a MotherNet captures the structural similarity across some or all members of a deep neural network ensemble, which allows us to share data movement and computation costs across these networks.
1 code implementation • 1 Feb 2012 • Felix Halim, Stratos Idreos, Panagiotis Karras, Roland H. C. Yap
Stochastic cracking also uses each query as a hint on how to reorganize data, but not blindly so; it gains resilience and avoids performance bottlenecks by deliberately applying certain arbitrary choices in its decision-making.
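The idea can be sketched in a few lines, under stated assumptions. The code below is a simplified illustration, not one of the paper's algorithms: the class `CrackedColumn`, its list-based partitioning, and the rule "add one random crack in the piece holding each query bound" are assumptions chosen for brevity. Each range query partitions the column around its bounds (the query acting as a hint), and the optional random crack is the arbitrary choice that keeps large unindexed pieces from persisting.

```python
import bisect
import random


class CrackedColumn:
    """Hypothetical in-memory column that is reorganized as queries arrive."""

    def __init__(self, values, seed=0):
        self.data = list(values)
        self.pivots = []     # sorted pivot values seen so far
        self.positions = []  # positions[i]: first index holding a value >= pivots[i]
        self.rng = random.Random(seed)

    def _piece(self, v):
        """Return the [lo, hi) index range of the piece that would hold v."""
        i = bisect.bisect_right(self.pivots, v)
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.positions) else len(self.data)
        return lo, hi

    def _crack(self, v):
        """Partition the piece around v; return the first index with value >= v."""
        i = bisect.bisect_left(self.pivots, v)
        if i < len(self.pivots) and self.pivots[i] == v:
            return self.positions[i]  # already cracked at this value
        lo, hi = self._piece(v)
        piece = self.data[lo:hi]
        left = [x for x in piece if x < v]
        right = [x for x in piece if x >= v]
        self.data[lo:hi] = left + right
        pos = lo + len(left)
        self.pivots.insert(i, v)
        self.positions.insert(i, pos)
        return pos

    def range_query(self, low, high, stochastic=True):
        """Return all values x with low <= x < high, cracking as a side effect."""
        if stochastic:
            for bound in (low, high):
                lo, hi = self._piece(bound)
                if hi - lo > 1:
                    # Arbitrary extra crack at a randomly chosen element.
                    self._crack(self.data[self.rng.randrange(lo, hi)])
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]


col = CrackedColumn([5, 1, 9, 3, 7, 2, 8], seed=1)
print(sorted(col.range_query(3, 8)))  # [3, 5, 7]
```

With `stochastic=False` the sketch falls back to purely query-driven cracking, which makes it easy to compare how the two variants split the column under a skewed or sequential workload.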