no code implementations • EMNLP 2020 • Steven Rennie, Etienne Marcheret, Neil Mallinar, David Nahamoo, Vaibhava Goel
Nevertheless, additional pre-training closer to the end-task, such as training on synthetic QA pairs, has been shown to improve performance.
no code implementations • 5 Oct 2022 • A. Michael Carrell, Neil Mallinar, James Lucas, Preetum Nakkiran
We propose a systematic way to study calibration error by decomposing it into (1) the calibration error on the training set and (2) the calibration generalization gap.
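A minimal sketch of this decomposition, assuming a model's top-class confidences and correctness indicators are already available as arrays; the binned ECE helper and the toy data below are illustrative assumptions, not the paper's code.

```python
import numpy as np

def ece(confidences, correct, n_bins=15):
    """Binned expected calibration error: weighted mean |accuracy - confidence| per bin."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    err = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            err += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return err

# Toy stand-ins for a classifier's confidences and correctness on train and test data.
rng = np.random.default_rng(0)
train_conf = rng.uniform(0.5, 1.0, 10_000)
train_correct = rng.uniform(size=10_000) < train_conf          # near-calibrated on train
test_conf = rng.uniform(0.5, 1.0, 10_000)
test_correct = rng.uniform(size=10_000) < test_conf - 0.1      # overconfident on test

train_ece = ece(train_conf, train_correct)
test_ece = ece(test_conf, test_correct)
print(f"train ECE {train_ece:.3f}, test ECE {test_ece:.3f}, calibration gap {test_ece - train_ece:.3f}")
```

Here the test calibration error is read as the train-set calibration error plus the calibration generalization gap, mirroring the decomposition in the abstract.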
no code implementations • 14 Jul 2022 • Neil Mallinar, James B. Simon, Amirhesam Abedsoltan, Parthe Pandit, Mikhail Belkin, Preetum Nakkiran
In this work we argue that, while benign overfitting has been instructive and fruitful to study, many real interpolating methods, such as neural networks, do not fit benignly: modest noise in the training set causes nonzero (but non-infinite) excess risk at test time, implying that these models are neither benign nor catastrophic but instead fall in an intermediate regime.
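An illustrative sketch of this intermediate ("tempered") regime, not the paper's experiments: a 1-nearest-neighbor classifier interpolates noisy training labels, and its clean-label test error settles at a nonzero level roughly proportional to the noise rate rather than going to zero (benign) or to chance (catastrophic). The data and noise rate are invented for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
noise = 0.2                                    # fraction of flipped training labels

X_train = rng.uniform(-1, 1, size=(20_000, 2))
y_clean = (X_train[:, 0] > 0).astype(int)      # simple noiseless target
flip = rng.uniform(size=len(y_clean)) < noise
y_noisy = np.where(flip, 1 - y_clean, y_clean)

X_test = rng.uniform(-1, 1, size=(5_000, 2))
y_test = (X_test[:, 0] > 0).astype(int)        # evaluate against clean labels

model = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_noisy)   # interpolates the noise
test_err = (model.predict(X_test) != y_test).mean()
print(f"noise rate {noise}: excess test risk ~ {test_err:.3f} (nonzero, but far from chance)")
```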
no code implementations • 4 Feb 2020 • Neil Mallinar, Abhishek Shah, Tin Kam Ho, Rajendra Ugrani, Ayush Gupta
Real-world text classification tasks often require many labeled training examples that are expensive to obtain.
no code implementations • 29 Jul 2019 • Tom Sercu, Neil Mallinar
We introduce Multi-Frame Cross-Entropy training (MFCE) for convolutional neural network acoustic models.
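A hedged sketch of the multi-frame cross-entropy idea as described in the abstract: one forward pass of a convolutional acoustic model emits label predictions for several consecutive frames, and the loss averages cross-entropy over those frames. The layer sizes, input features, and number of predicted frames below are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

N_CLASSES, N_FRAMES = 4000, 4                  # context-dependent states; frames predicted per pass

class MultiFrameAcousticModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, 1)),
        )
        # A single head emits logits for all N_FRAMES frames at once.
        self.head = nn.Linear(64, N_CLASSES * N_FRAMES)

    def forward(self, x):                       # x: (batch, 1, n_mels, context_frames)
        h = self.conv(x).flatten(1)
        return self.head(h).view(-1, N_FRAMES, N_CLASSES)

model = MultiFrameAcousticModel()
feats = torch.randn(8, 1, 40, 21)               # batch of 21-frame log-mel windows (toy data)
targets = torch.randint(N_CLASSES, (8, N_FRAMES))
logits = model(feats)
loss = nn.functional.cross_entropy(logits.reshape(-1, N_CLASSES), targets.reshape(-1))
loss.backward()
```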
no code implementations • 14 Dec 2018 • Neil Mallinar, Abhishek Shah, Rajendra Ugrani, Ayush Gupta, Manikandan Gurusankar, Tin Kam Ho, Q. Vera Liao, Yunfeng Zhang, Rachel K. E. Bellamy, Robert Yates, Chris Desmarais, Blake McGregor
We report on a user study that shows positive user feedback for this new approach to building conversational agents and demonstrates the effectiveness of using data programming for auto-labeling.
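For context, a toy sketch of the data-programming idea (not the system described in the paper): heuristic labeling functions vote on unlabeled utterances and a simple majority vote produces weak labels. The labeling functions and intent classes here are invented for illustration.

```python
import numpy as np

ABSTAIN, BILLING, SUPPORT = -1, 0, 1

def lf_billing_keywords(text):
    return BILLING if any(w in text.lower() for w in ("invoice", "charge", "refund")) else ABSTAIN

def lf_support_keywords(text):
    return SUPPORT if any(w in text.lower() for w in ("broken", "error", "crash")) else ABSTAIN

def lf_question_mark(text):
    return SUPPORT if text.strip().endswith("?") else ABSTAIN

LFS = [lf_billing_keywords, lf_support_keywords, lf_question_mark]

def weak_label(text):
    votes = [lf(text) for lf in LFS if lf(text) != ABSTAIN]
    return int(np.bincount(votes).argmax()) if votes else ABSTAIN

print(weak_label("I was charged twice, please refund me"))    # -> 0 (BILLING)
print(weak_label("The app keeps crashing, what do I do?"))     # -> 1 (SUPPORT)
```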
3 code implementations • ICLR 2019 • Chun-Fu Chen, Quanfu Fan, Neil Mallinar, Tom Sercu, Rogerio Feris
The proposed approach improves model efficiency and performance on both object recognition and speech recognition tasks, using popular architectures including ResNet and ResNeXt.
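A rough sketch of a multi-scale "big-little" block in the spirit of this approach: a heavier branch runs on a downsampled copy of the features while a lighter branch keeps full resolution, and the two are merged. The channel counts, downsampling factor, and merge rule below are assumptions for illustration, not the published architecture.

```python
import torch
import torch.nn as nn

class BigLittleBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # "Big" branch: more capacity, applied to a 2x-downsampled feature map (cheaper).
        self.big = nn.Sequential(
            nn.AvgPool2d(2),
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels), nn.ReLU(),
        )
        # "Little" branch: full resolution, but fewer channels (also cheap).
        self.little = nn.Sequential(
            nn.Conv2d(channels, channels // 4, 3, padding=1), nn.BatchNorm2d(channels // 4), nn.ReLU(),
            nn.Conv2d(channels // 4, channels, 1), nn.BatchNorm2d(channels),
        )
        self.up = nn.Upsample(scale_factor=2, mode="nearest")

    def forward(self, x):
        return torch.relu(self.up(self.big(x)) + self.little(x))

block = BigLittleBlock(64)
out = block(torch.randn(1, 64, 56, 56))         # -> (1, 64, 56, 56)
```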
no code implementations • 16 Jan 2018 • Neil Mallinar, Corbin Rosset
We examine Deep Canonically Correlated LSTMs as a way to learn nonlinear transformations of variable-length sequences and embed them into a correlated, fixed-dimensional space.
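A hedged sketch of the setup: two LSTM encoders map variable-length sequences from two views to fixed-dimensional embeddings that are trained to be correlated. For brevity this uses a simple per-dimension correlation objective as a stand-in for the full CCA loss; all sizes and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SeqEncoder(nn.Module):
    def __init__(self, in_dim, out_dim=32):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, out_dim, batch_first=True)

    def forward(self, x):                 # x: (batch, time, in_dim); `time` may vary per batch
        _, (h, _) = self.lstm(x)
        return h[-1]                      # final hidden state as a fixed-dimensional embedding

def correlation_loss(a, b, eps=1e-8):
    """Negative mean per-dimension Pearson correlation (simplified surrogate for CCA)."""
    a = (a - a.mean(0)) / (a.std(0) + eps)
    b = (b - b.mean(0)) / (b.std(0) + eps)
    return -(a * b).mean()

enc_x, enc_y = SeqEncoder(13), SeqEncoder(20)
view_x = torch.randn(16, 50, 13)          # e.g. 50 acoustic frames per example (toy data)
view_y = torch.randn(16, 12, 20)          # e.g. 12 tokens of an aligned sequence (toy data)
loss = correlation_loss(enc_x(view_x), enc_y(view_y))
loss.backward()
```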