1 code implementation • 16 Oct 2022 • Jungkyu Park, Jakub Chłędowski, Stanisław Jastrzębski, Jan Witowski, Yanqi Xu, Linda Du, Sushma Gaddam, Eric Kim, Alana Lewin, Ujas Parikh, Anastasia Plaunova, Sardius Chen, Alexandra Millet, James Park, Kristine Pysarenko, Shalin Patel, Julia Goldberg, Melanie Wegener, Linda Moy, Laura Heacock, Beatriu Reig, Krzysztof J. Geras
On a dataset collected at NYU Langone Health, including 85, 526 patients with full-field 2D mammography (FFDM), synthetic 2D mammography, and 3D mammography, 3D-GMIC achieves an AUC of 0. 831 (95% CI: 0. 769-0. 887) in classifying breasts with malignant findings using 3D mammography.
We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning.
Self-supervised learning holds promise to revolutionize molecule property prediction - a central task to drug discovery and many more industries - by enabling data efficient learning from scarce experimental data.
A natural idea to mitigate this problem is to bias the search process towards more easily synthesizable molecules using a proxy for synthetic accessibility.
By training neural networks on these adversarial examples, we can improve their generalisation in collider bias settings.
1 code implementation • 4 Aug 2020 • Farah E. Shamout, Yiqiu Shen, Nan Wu, Aakash Kaku, Jungkyu Park, Taro Makino, Stanisław Jastrzębski, Duo Wang, Ben Zhang, Siddhant Dogra, Meng Cao, Narges Razavian, David Kudlowitz, Lea Azour, William Moore, Yvonne W. Lui, Yindalon Aphinyanaphongs, Carlos Fernandez-Granda, Krzysztof J. Geras
In order to verify performance in a real clinical setting, we silently deployed a preliminary version of the deep neural network at New York University Langone Health during the first wave of the pandemic, which produced accurate predictions in real-time.
The central challenge in automated synthesis planning is to be able to generate and predict outcomes of a diverse set of chemical reactions.
Ranked #15 on Single-step retrosynthesis on USPTO-50k
Deep neural networks (DNNs) show promise in breast cancer screening, but their robustness to input perturbations must be better understood before they can be clinically implemented.
Designing a single neural network architecture that performs competitively across a range of molecule property prediction tasks remains largely an open challenge, and its solution may unlock a widespread use of deep learning in the drug discovery industry.
In breast cancer screening, radiologists make the diagnosis based on images that are taken from two angles.
Recent work has shown that using unlabeled data in semi-supervised learning is not always beneficial and can even hurt generalization, especially when there is a class mismatch between the unlabeled and labeled examples.
2 code implementations • 20 Mar 2019 • Nan Wu, Jason Phang, Jungkyu Park, Yiqiu Shen, Zhe Huang, Masha Zorin, Stanisław Jastrzębski, Thibault Févry, Joe Katsnelson, Eric Kim, Stacey Wolfson, Ujas Parikh, Sushma Gaddam, Leng Leng Young Lin, Kara Ho, Joshua D. Weinstein, Beatriu Reig, Yiming Gao, Hildegard Toth, Kristine Pysarenko, Alana Lewin, Jiyon Lee, Krystal Airola, Eralda Mema, Stephanie Chung, Esther Hwang, Naziya Samreen, S. Gene Kim, Laura Heacock, Linda Moy, Kyunghyun Cho, Krzysztof J. Geras
We present a deep convolutional neural network for breast cancer screening exam classification, trained and evaluated on over 200, 000 exams (over 1, 000, 000 images).
However, the success of NAS depends on the definition of the search space.
We propose that in ResNets this can be resolved based on our results, by ensuring the same level of dynamical isometry at initialization.
When studying the SGD dynamics in relation to the sharpest directions in this initial phase, we find that the SGD step is large compared to the curvature and commonly fails to minimize the loss along the sharpest directions.
The crucial new ingredient is the introduction of a new (Cramer-Wold) metric in the space of densities, which replaces the Wasserstein metric used in SWAE.
Commonsense knowledge bases such as ConceptNet represent knowledge in the form of relational triples.
In particular we find that the ratio of learning rate to batch size is a key determinant of SGD dynamics and of the width of the final minima, and that higher values of the ratio lead to wider minima and often better generalization.
In general, a Resnet block tends to concentrate representation learning behavior in the first few layers while higher layers perform iterative refinement of features.
2 code implementations • • Devansh Arpit, Stanisław Jastrzębski, Nicolas Ballas, David Krueger, Emmanuel Bengio, Maxinder S. Kanwal, Tegan Maharaj, Asja Fischer, Aaron Courville, Yoshua Bengio, Simon Lacoste-Julien
We examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness.
Words in natural language follow a Zipfian distribution whereby some words are frequent but most are rare.
Ranked #48 on Question Answering on SQuAD1.1 dev
This paper shows how one can directly apply natural language processing (NLP) methods to classification problems in cheminformatics.