1 code implementation • 8 Sep 2023 • Dyah Adila, Changho Shin, Linrong Cai, Frederic Sala
These insights are embedded and used to remove harmful components and boost useful ones in embeddings -- without any supervision.
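A minimal sketch of the general idea of editing embeddings along directions, assuming the "harmful" and "useful" directions are supplied from elsewhere (e.g. encoded descriptions of spurious and task-relevant concepts); this is illustrative, not the paper's exact procedure:

```python
import numpy as np

def edit_embeddings(embs, harmful_dir, useful_dir, boost=1.0):
    """Remove the component along a harmful direction and boost a useful one.

    embs: (n, d) array of embeddings; harmful_dir, useful_dir: (d,) vectors
    assumed to be obtained without supervision (e.g. from encoded text).
    """
    h = harmful_dir / np.linalg.norm(harmful_dir)
    u = useful_dir / np.linalg.norm(useful_dir)
    # Project out the harmful component from every embedding.
    embs = embs - (embs @ h)[:, None] * h
    # Amplify the component along the useful direction.
    embs = embs + boost * (embs @ u)[:, None] * u
    # Re-normalize so downstream cosine similarities stay comparable.
    return embs / np.linalg.norm(embs, axis=1, keepdims=True)
```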
1 code implementation • NeurIPS 2023 • Changho Shin, Sonia Cromp, Dyah Adila, Frederic Sala
Weak supervision enables efficient development of training sets by reducing the need for ground truth labels.
1 code implementation • 13 Mar 2023 • Zhenmei Shi, Yifei Ming, Ying Fan, Frederic Sala, Yingyu Liang
In this paper, we propose a simple and effective regularization method based on the nuclear norm of the learned features for domain generalization.
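A rough sketch of a nuclear-norm feature regularizer in PyTorch; the penalty weight `lam` and where the penalty enters the training loss are assumptions for illustration, not the paper's exact recipe:

```python
import torch

def nuclear_norm_penalty(features: torch.Tensor) -> torch.Tensor:
    """Nuclear norm (sum of singular values) of a (batch, dim) feature matrix."""
    return torch.linalg.matrix_norm(features, ord="nuc")

# Hypothetical use inside a training step (all names are placeholders):
#   feats = encoder(x)                                   # learned features
#   loss = task_loss(head(feats), y) + lam * nuclear_norm_penalty(feats)
```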
no code implementations • 20 Dec 2022 • Mayee F. Chen, Benjamin Nachman, Frederic Sala
An important class of techniques for resonant anomaly detection in high energy physics builds models that can distinguish between reference and target datasets, where only the latter has appreciable signal.
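A hedged sketch of that classifier-based idea: train a model to separate reference from target events and treat target-like scores as anomaly scores. The choice of `GradientBoostingClassifier` is an assumption for illustration:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def reference_vs_target_score(reference, target):
    """Score target events by how distinguishable they are from the reference.

    reference, target: (n, d) arrays of event features. Higher scores mean
    more target-like, hence more anomalous with respect to the reference.
    """
    X = np.vstack([reference, target])
    y = np.concatenate([np.zeros(len(reference)), np.ones(len(target))])
    clf = GradientBoostingClassifier().fit(X, y)
    return clf.predict_proba(target)[:, 1]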
1 code implementation • 24 Nov 2022 • Harit Vishwakarma, Nicholas Roberts, Frederic Sala
Weak supervision (WS) is a rich set of techniques that produce pseudolabels by aggregating easily obtained but potentially noisy label estimates from a variety of sources.
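The simplest such aggregator is a majority vote over sources that do not abstain; a minimal sketch is below (the papers listed here study learned label models that weight sources by estimated accuracy, which this does not do):

```python
import numpy as np

def majority_vote(votes, abstain=-1):
    """Aggregate noisy source outputs into pseudolabels by majority vote.

    votes: (n_points, n_sources) array of class ids, with `abstain`
    marking sources that do not vote on a point.
    """
    pseudolabels = []
    for row in votes:
        valid = row[row != abstain]
        if len(valid) == 0:
            pseudolabels.append(abstain)               # no source fired
        else:
            vals, counts = np.unique(valid, return_counts=True)
            pseudolabels.append(vals[np.argmax(counts)])
    return np.array(pseudolabels)
```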
no code implementations • 22 Nov 2022 • Harit Vishwakarma, Heguang Lin, Frederic Sala, Ramya Korlakai Vinayak
Given the long shelf-life and diverse usage of the resulting datasets, understanding when the data obtained by such auto-labeling systems can be relied on is crucial.
1 code implementation • 7 Oct 2022 • Renbo Tu, Nicholas Roberts, Vishak Prasad, Sibasis Nayak, Paarth Jain, Frederic Sala, Ganesh Ramakrishnan, Ameet Talwalkar, Willie Neiswanger, Colin White
The challenge that climate change poses to humanity has spurred a rapidly developing field of artificial intelligence research focused on climate change applications.
3 code implementations • 5 Oct 2022 • Simran Arora, Avanika Narayan, Mayee F. Chen, Laurel Orr, Neel Guha, Kush Bhatia, Ines Chami, Frederic Sala, Christopher Ré
Prompting is a brittle process wherein small modifications to the prompt can cause large variations in the model predictions, so significant effort is dedicated to painstakingly designing a "perfect prompt" for each task.
Ranked #3 on Question Answering on Story Cloze
no code implementations • 30 Aug 2022 • Nicholas Roberts, Xintong Li, Tzu-Heng Huang, Dyah Adila, Spencer Schoenberg, Cheng-Yu Liu, Lauren Pick, Haotian Ma, Aws Albarghouthi, Frederic Sala
While it has been used successfully in many domains, weak supervision's application scope is limited by the difficulty of constructing labeling functions for domains with complex or high-dimensional features.
1 code implementation • 24 Mar 2022 • Mayee F. Chen, Daniel Y. Fu, Dyah Adila, Michael Zhang, Frederic Sala, Kayvon Fatahalian, Christopher Ré
Despite the black-box nature of foundation models, we prove results characterizing how our approach improves performance and show that lift scales with the smoothness of label distributions in embedding space.
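One rough, assumed proxy for that smoothness is the rate at which nearby points in embedding space share labels; the formal smoothness quantity used in the paper's analysis may differ:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def label_smoothness(embeddings, labels, k=10):
    """Fraction of each point's k nearest neighbors (in embedding space)
    that share its label -- closer to 1 means a smoother label distribution."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(embeddings)
    _, idx = nn.kneighbors(embeddings)
    neighbor_labels = labels[idx[:, 1:]]       # drop each point itself
    return np.mean(neighbor_labels == labels[:, None])
```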
1 code implementation • 22 Mar 2022 • Benedikt Boecking, Nicholas Roberts, Willie Neiswanger, Stefano Ermon, Frederic Sala, Artur Dubrawski
The model outperforms baseline weak supervision label models on a number of multiclass image classification datasets, improves the quality of generated images, and further improves end-model performance through data augmentation with synthetic samples.
no code implementations • ICLR 2022 • Changho Shin, Winfred Li, Harit Vishwakarma, Nicholas Roberts, Frederic Sala
We apply this technique to important problems previously not tackled by WS frameworks, including learning to rank, regression, and learning in hyperbolic space.
1 code implementation • 12 Oct 2021 • Renbo Tu, Nicholas Roberts, Mikhail Khodak, Junhong Shen, Frederic Sala, Ameet Talwalkar
As a result, the performance of NAS approaches in more diverse areas is poorly understood.
1 code implementation • 3 Mar 2021 • Mayee F. Chen, Benjamin Cohen-Wang, Stephen Mussmann, Frederic Sala, Christopher Ré
We apply our decomposition framework to three scenarios -- well-specified, misspecified, and corrected models -- to 1) choose between labeled and unlabeled data and 2) learn from their combination.
no code implementations • ICLR 2021 • Sarah Hooper, Michael Wornow, Ying Hang Seah, Peter Kellman, Hui Xue, Frederic Sala, Curtis Langlotz, Christopher Ré
We propose a framework that fuses limited label learning and weak supervision for segmentation tasks, enabling users to train high-performing segmentation CNNs with very few hand-labeled training points.
1 code implementation • 26 Jun 2020 • Mayee F. Chen, Daniel Y. Fu, Frederic Sala, Sen Wu, Ravi Teja Mullapudi, Fait Poms, Kayvon Fatahalian, Christopher Ré
Our goal is to enable machine learning systems to be trained interactively.
3 code implementations • ACL 2020 • Ines Chami, Adva Wolf, Da-Cheng Juan, Frederic Sala, Sujith Ravi, Christopher Ré
However, existing hyperbolic embedding methods do not account for the rich logical patterns in KGs.
Ranked #5 on Link Prediction on YAGO3-10
no code implementations • 11 Apr 2020 • Zhaobin Kuang, Frederic Sala, Nimit Sohoni, Sen Wu, Aldo Córdova-Palomera, Jared Dunnmon, James Priest, Christopher Ré
To relax these assumptions, we propose Ivy, a new method to combine IV candidates that can handle correlated and invalid IV candidates in a robust manner.
1 code implementation • ICML 2020 • Daniel Y. Fu, Mayee F. Chen, Frederic Sala, Sarah M. Hooper, Kayvon Fatahalian, Christopher Ré
In this work, we show that, for a class of latent variable models highly applicable to weak supervision, we can find a closed-form solution to model parameters, obviating the need for iterative solutions like stochastic gradient descent (SGD).
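A minimal sketch of the triplet idea behind such closed-form solutions, for three conditionally independent ±1 labeling functions; sign recovery and abstentions are omitted, and this is an illustration rather than the paper's full estimator:

```python
import numpy as np

def triplet_accuracies(L):
    """Closed-form accuracy estimates for three ±1 labeling functions.

    If sources are conditionally independent given the label y in {-1, +1},
    then E[l_i l_j] = E[l_i y] E[l_j y], so pairwise agreement rates determine
    each source's accuracy up to sign -- no iterative optimization needed.
    L: (n_points, 3) array of ±1 votes.
    """
    assert L.shape[1] == 3
    m = lambda i, j: np.mean(L[:, i] * L[:, j])   # empirical agreement rates
    a = np.zeros(3)
    for i, j, k in [(0, 1, 2), (1, 0, 2), (2, 0, 1)]:
        a[i] = np.sqrt(abs(m(i, j) * m(i, k) / m(j, k)))
    return a                                       # estimates of |E[l_i y]|
```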
no code implementations • NeurIPS 2019 • Frederic Sala, Paroma Varma, Jason Fries, Daniel Y. Fu, Shiori Sagawa, Saelig Khattar, Ashwini Ramamoorthy, Ke Xiao, Kayvon Fatahalian, James Priest, Christopher Ré
Multi-resolution sources exacerbate this challenge due to complex correlations and sample complexity that scales with the length of the sequence.
no code implementations • ICLR 2019 • Albert Gu, Frederic Sala, Beliz Gunel, Christopher Ré
The quality of the representations achieved by embeddings is determined by how well the geometry of the embedding space matches the structure of the data.
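A small sketch of one standard way to quantify that match: the average multiplicative distortion between original (e.g. graph) distances and embedding distances. The specific fidelity measures analyzed in the paper may differ:

```python
import numpy as np

def average_distortion(graph_dist, emb_dist):
    """Average multiplicative distortion between two distance matrices.

    graph_dist, emb_dist: square matrices over the same node ordering;
    values closer to 0 mean the embedding geometry matches the data better.
    """
    iu = np.triu_indices_from(graph_dist, k=1)     # distinct pairs only
    return np.mean(np.abs(emb_dist[iu] - graph_dist[iu]) / graph_dist[iu])
```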
no code implementations • 14 Mar 2019 • Paroma Varma, Frederic Sala, Ann He, Alexander Ratner, Christopher Ré
Labeling training data is a key bottleneck in the modern machine learning pipeline.
1 code implementation • 5 Oct 2018 • Alexander Ratner, Braden Hancock, Jared Dunnmon, Frederic Sala, Shreyash Pandey, Christopher Ré
Snorkel MeTaL: A framework for training models with multi-task weak supervision
Ranked #1 on Semantic Textual Similarity on SentEval
2 code implementations • ICML 2018 • Christopher De Sa, Albert Gu, Christopher Ré, Frederic Sala
Given a tree, we give a combinatorial construction that embeds the tree in hyperbolic space with arbitrarily low distortion without using optimization.
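A 2D sketch of a Sarkar-style combinatorial construction in the Poincaré disk, where each node's children are spread at equal angles around it after reflecting it to the origin; the `scale` knob and the traversal details here are illustrative choices, not the paper's precision-controlled construction:

```python
import numpy as np

def combinatorial_tree_embed(children, root=0, scale=0.9):
    """Embed a tree in the Poincaré disk without optimization.

    children: dict mapping node -> list of child nodes.
    Returns a dict mapping node -> complex coordinate in the unit disk.
    """
    to_origin = lambda z, z0: (z - z0) / (1 - np.conj(z0) * z)    # Möbius map sending z0 -> 0
    from_origin = lambda w, z0: (w + z0) / (1 + np.conj(z0) * w)  # its inverse

    pos = {root: 0j}
    stack = [(root, None)]
    while stack:
        node, parent = stack.pop()
        kids = children.get(node, [])
        if not kids:
            continue
        # Work in a frame where `node` sits at the origin; keep children
        # away from the direction of the parent.
        p = to_origin(pos[parent], pos[node]) if parent is not None else None
        base = np.angle(p) if p is not None else 0.0
        for i, kid in enumerate(kids):
            theta = base + 2 * np.pi * (i + 1) / (len(kids) + 1)
            pos[kid] = from_origin(scale * np.exp(1j * theta), pos[node])
            stack.append((kid, node))
    return pos

# Example: a small tree rooted at 0.
# coords = combinatorial_tree_embed({0: [1, 2], 1: [3, 4]})
```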
no code implementations • 8 Mar 2017 • Frederic Sala, Shahroze Kabir, Guy Van den Broeck, Lara Dolecek
After being trained, classifiers must often operate on data that has been corrupted by noise.