no code implementations • 25 Dec 2023 • Tirth Patel, Fred Lu, Edward Raff, Charles Nicholas, Cynthia Matuszek, James Holt
Industry practitioners care about small improvements in malware detection accuracy because their models are deployed to hundreds of millions of machines, meaning a 0.1% change can cause an overwhelming number of false positives.
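A back-of-the-envelope sketch of the scale involved (the fleet size below is an assumed figure standing in for "hundreds of millions"):

```python
# Illustration only: a tiny shift in false-positive rate becomes a huge
# absolute number of alerts once multiplied by the deployed fleet size.
machines = 300_000_000        # assumed fleet size ("hundreds of millions")
fpr_change = 0.001            # a 0.1% change in false-positive rate
extra_false_positives = int(machines * fpr_change)
print(extra_false_positives)  # 300000 additional false alarms
```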
no code implementations • 25 Jul 2023 • Skyler Wu, Fred Lu, Edward Raff, James Holt
Convolutional layers have long served as the primary workhorse for image classification.
no code implementations • 27 Jun 2023 • Tyler LeBlond, Joseph Munoz, Fred Lu, Maya Fuchs, Elliott Zaresky-Williams, Edward Raff, Brian Testa
Differential privacy (DP) is the prevailing technique for protecting user data in machine learning models.
no code implementations • 24 Apr 2023 • Amol Khanna, Fred Lu, Edward Raff, Brian Testa
LASSO regularized logistic regression is particularly useful for its built-in feature selection, allowing coefficients to be removed from deployment and producing sparse solutions.
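A minimal sketch of where that sparsity comes from: many LASSO solvers apply a soft-thresholding (proximal) step, which sets small coefficients exactly to zero. This is a generic illustration of L1 sparsity, not the paper's specific method.

```python
def soft_threshold(w, lam):
    """Proximal operator of the L1 penalty: shrinks each weight toward
    zero and sets small ones exactly to zero -- the source of LASSO's
    built-in feature selection."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

weights = [0.8, -0.05, 0.3, 0.02, -1.1]
sparse = [soft_threshold(w, 0.1) for w in weights]
# coefficients shrunk to exactly 0 can be removed from deployment
```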
no code implementations • 18 Mar 2023 • Amol Khanna, Fred Lu, Edward Raff
Linear $L_1$-regularized models have remained one of the simplest and most effective tools in data analysis, especially in information retrieval problems where n-grams over text with TF-IDF or Okapi feature values are a strong and easy baseline.
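A toy sketch of the TF-IDF feature values such a baseline is built on (whitespace-tokenized unigrams only; a real baseline would use n-grams and smoothed IDF):

```python
import math
from collections import Counter

def tfidf(docs):
    """Toy TF-IDF: term frequency within each document, weighted by
    log inverse document frequency across the corpus."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({w: (c / len(toks)) * math.log(n / df[w])
                        for w, c in tf.items()})
    return vectors

docs = ["malware detection model", "malware feature model", "privacy audit"]
vecs = tfidf(docs)
# rare terms ("privacy") score higher than common ones ("malware")
```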
no code implementations • 15 Jan 2023 • Fred Lu, Edward Raff, James Holt
Subsampling algorithms are a natural approach to reduce data size before fitting models on massive datasets.
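The simplest instance is uniform subsampling, sketched below; the paper concerns more refined subsampling schemes, and this baseline is shown only to fix ideas.

```python
import random

def uniform_subsample(data, m, seed=0):
    """Draw m rows uniformly without replacement, so a model can be fit
    on the subsample instead of the full massive dataset."""
    rng = random.Random(seed)
    return rng.sample(data, m)

big = list(range(1_000_000))
small = uniform_subsample(big, 10_000)  # fit the model on `small`
```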
no code implementations • 16 Oct 2022 • Fred Lu, Joseph Munoz, Maya Fuchs, Tyler LeBlond, Elliott Zaresky-Williams, Edward Raff, Francis Ferraro, Brian Testa
We present a framework to statistically audit the privacy guarantee conferred by a differentially private machine learner in practice.
no code implementations • 9 Jun 2022 • Fred Lu, Edward Raff, Francis Ferraro
Many metric learning tasks, such as triplet learning, nearest neighbor retrieval, and visualization, are treated primarily as embedding tasks where the ultimate metric is some variant of the Euclidean distance (e.g., cosine or Mahalanobis), and the algorithm must learn to embed points into the pre-chosen space.
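A short sketch of why cosine and Mahalanobis count as "variants of the Euclidean distance": cosine distance is (half the) squared Euclidean distance between L2-normalized vectors, and Mahalanobis distance with M = LᵀL is plain Euclidean distance after the linear map L.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def cosine_distance(x, y):
    """1 - cos(x, y); equals ||x/|x| - y/|y|||^2 / 2, a Euclidean variant."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    dot = sum(a * b for a, b in zip(x, y))
    return 1.0 - dot / (nx * ny)

def mahalanobis_via_embedding(x, y, L):
    """Mahalanobis distance with M = L^T L is Euclidean distance after
    applying L -- i.e. 'embed with L, then use Euclidean'."""
    Lx = [sum(L[i][j] * x[j] for j in range(len(x))) for i in range(len(L))]
    Ly = [sum(L[i][j] * y[j] for j in range(len(y))) for i in range(len(L))]
    return euclidean(Lx, Ly)
```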
no code implementations • 18 Feb 2022 • Andre T. Nguyen, Fred Lu, Gary Lopez Munoz, Edward Raff, Charles Nicholas, James Holt
We explore the utility of information contained within a dropout-based Bayesian neural network (BNN) for the task of detecting out-of-distribution (OOD) data.
no code implementations • 14 Feb 2022 • Fred Lu, Francis Ferraro, Edward Raff
Our method, which we term continuously generalized ordinal logistic, significantly outperforms the standard ordinal logistic model over a thorough set of ordinal regression benchmark datasets.
1 code implementation • 29 Sep 2021 • Peyman H. Kassani, Fred Lu, Yann Le Guen, Zihuai He
The merits of the proposed method include: (1) flexible modelling of the non-linear effect of genetic variants to improve statistical power; (2) multiple knockoffs in the input layer to rigorously control false discovery rate; (3) hierarchical layers to substantially reduce the number of weight parameters and activations to improve computational efficiency; (4) de-randomized feature selection to stabilize identified signals.
1 code implementation • ICLR 2021 • Sharon Zhou, Eric Zelikman, Fred Lu, Andrew Y. Ng, Gunnar Carlsson, Stefano Ermon
Learning disentangled representations is regarded as a fundamental task for improving the generalization, robustness, and interpretability of generative models.