no code implementations • 1 Apr 2024 • Amol Khanna, Edward Raff, Nathan Inkawhich
Linear models are ubiquitous in data science, but are particularly prone to overfitting and data memorization in high dimensions.
no code implementations • 18 Jan 2024 • Anish Lakkapragada, Amol Khanna, Edward Raff, Nathan Inkawhich
As machine learning becomes increasingly prevalent in impactful decisions, recognizing when inference data is outside the model's expected input distribution is paramount for giving context to predictions.
Dimensionality Reduction • Out of Distribution (OOD) Detection
no code implementations • 24 Apr 2023 • Amol Khanna, Fred Lu, Edward Raff, Brian Testa
LASSO regularized logistic regression is particularly useful for its built-in feature selection, allowing coefficients to be removed from deployment and producing sparse solutions.
no code implementations • 18 Mar 2023 • Amol Khanna, Fred Lu, Edward Raff
Linear $L_1$-regularized models have remained one of the simplest and most effective tools in data analysis, especially in information retrieval problems where n-grams over text with TF-IDF or Okapi feature values are a strong and easy baseline.