Search Results for author: Yair Schiff

Found 16 papers, 5 papers with code

Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling

1 code implementation • 5 Mar 2024 • Yair Schiff, Chia-Hsiang Kao, Aaron Gokaslan, Tri Dao, Albert Gu, Volodymyr Kuleshov

Large-scale sequence modeling has sparked rapid advances that now extend into biology and genomics.

DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems

no code implementations • 6 Feb 2024 • Yair Schiff, Zhong Yi Wan, Jeffrey B. Parker, Stephan Hoyer, Volodymyr Kuleshov, Fei Sha, Leonardo Zepeda-Núñez

Learning dynamics from dissipative chaotic systems is notoriously difficult due to their inherent instability, as formalized by their positive Lyapunov exponents, which exponentially amplify errors in the learned dynamics.
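The exponential error amplification described in this snippet can be seen in any chaotic map; as a generic illustration (not code from the DySLIM paper), the logistic map at r = 4 has Lyapunov exponent ln 2, so two trajectories that start a billionth apart become macroscopically different within a few dozen steps:

```python
# Generic illustration (not from the DySLIM paper): sensitive dependence
# in a chaotic map. The logistic map x_{t+1} = 4 x_t (1 - x_t) has a
# positive Lyapunov exponent (ln 2 ~= 0.693), so nearby trajectories --
# and hence small errors in learned dynamics -- diverge exponentially.
import math

def logistic(x):
    return 4.0 * x * (1.0 - x)

def trajectory(x0, steps):
    xs = [x0]
    for _ in range(steps):
        xs.append(logistic(xs[-1]))
    return xs

# Two trajectories starting 1e-9 apart decorrelate after ~40 steps.
a = trajectory(0.2, 60)
b = trajectory(0.2 + 1e-9, 60)
max_gap = max(abs(x - y) for x, y in zip(a[40:], b[40:]))

# Estimate the Lyapunov exponent as the average log-derivative along a
# long trajectory; it converges to ln 2 for this map.
xs = trajectory(0.2, 20000)[1000:]
lyap = sum(math.log(abs(4.0 * (1.0 - 2.0 * x))) for x in xs) / len(xs)
```

A positive `lyap` is exactly the formal condition the abstract refers to: any per-step model error is multiplied by roughly `exp(lyap)` each iteration.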

InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models

no code implementations • 14 Jun 2023 • Yingheng Wang, Yair Schiff, Aaron Gokaslan, Weishen Pan, Fei Wang, Christopher De Sa, Volodymyr Kuleshov

While diffusion models excel at generating high-quality samples, their latent variables typically lack semantic meaning and are not suitable for representation learning.

Representation Learning

Semi-Autoregressive Energy Flows: Exploring Likelihood-Free Training of Normalizing Flows

no code implementations • 14 Jun 2022 • Phillip Si, Zeyi Chen, Subham Sekhar Sahoo, Yair Schiff, Volodymyr Kuleshov

Training normalizing flow generative models can be challenging due to the need to calculate computationally expensive determinants of Jacobians.

Two-sample testing
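The Jacobian log-determinant this snippet calls expensive enters through the change-of-variables formula; a minimal sketch (a toy diagonal affine flow, not the paper's likelihood-free energy flow) shows where the term appears:

```python
# Minimal sketch of the maximum-likelihood objective for a flow (generic,
# not the semi-autoregressive energy flow from the paper). For an
# invertible map z = f(x) with base density p_z,
#   log p_x(x) = log p_z(f(x)) + log|det J_f(x)|,
# and the log-det term is what becomes expensive for general Jacobians.
import numpy as np

def affine_flow_logpdf(x, shift, log_scale):
    """log p_x(x) for x = exp(log_scale) * z + shift, z ~ N(0, I)."""
    z = (x - shift) * np.exp(-log_scale)   # inverse map f(x)
    log_pz = -0.5 * (z**2 + np.log(2 * np.pi)).sum()
    log_det = -log_scale.sum()             # diagonal Jacobian: cheap here
    return log_pz + log_det

# Sanity check against the closed-form Gaussian N(shift, exp(2*log_scale)):
x = np.array([0.3, -1.2])
shift = np.array([1.0, -0.5])
log_scale = np.array([0.2, -0.1])
direct = sum(
    -0.5 * np.log(2 * np.pi) - s - 0.5 * ((xi - m) / np.exp(s)) ** 2
    for xi, m, s in zip(x, shift, log_scale)
)
```

For a diagonal map the log-det is a sum of scalars; for an unconstrained d-dimensional Jacobian it costs O(d^3) per sample, which is the bottleneck the paper's likelihood-free training sidesteps.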

Learning with Stochastic Orders

1 code implementation • 27 May 2022 • Carles Domingo-Enrich, Yair Schiff, Youssef Mroueh

Learning high-dimensional distributions is often done with explicit likelihood modeling or implicit modeling via minimizing integral probability metrics (IPMs).

Image Generation
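A standard instance of the IPMs mentioned in this snippet is the kernel MMD, obtained by taking the IPM's critic class to be the unit ball of an RKHS; as a generic illustration (not the stochastic-order machinery the paper develops):

```python
# Generic IPM illustration: the kernel MMD. An IPM is
#   sup_{f in F} E_p[f] - E_q[f];
# with F the unit ball of an RKHS this supremum has the closed-form
# kernel expression estimated below. (Standard example, not the
# Choquet/stochastic-order critics from the paper.)
import numpy as np

def rbf(a, b, sigma=1.0):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

def mmd2(x, y, sigma=1.0):
    """Biased estimator of MMD^2 between sample sets x and y."""
    return (rbf(x, x, sigma).mean()
            + rbf(y, y, sigma).mean()
            - 2 * rbf(x, y, sigma).mean())

rng = np.random.default_rng(0)
same = mmd2(rng.normal(0, 1, (200, 2)), rng.normal(0, 1, (200, 2)))
shifted = mmd2(rng.normal(0, 1, (200, 2)), rng.normal(2.0, 1, (200, 2)))
```

Minimizing such a distance between model samples and data samples is the "implicit modeling" route the snippet contrasts with explicit likelihoods.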

Semi-Parametric Inducing Point Networks and Neural Processes

2 code implementations • 24 May 2022 • Richa Rastogi, Yair Schiff, Alon Hacohen, Zhaozhi Li, Ian Lee, Yuntian Deng, Mert R. Sabuncu, Volodymyr Kuleshov

We introduce semi-parametric inducing point networks (SPIN), a general-purpose architecture that can query the training set at inference time in a compute-efficient manner.

Imputation, Meta-Learning
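The idea of querying the training set at inference time can be sketched with plain cross-attention over training examples (a hedged illustration only; SPIN's actual architecture compresses the training set through inducing points, which is where its compute efficiency comes from):

```python
# Hedged sketch of attending over a training set at inference time
# (illustrative only, not SPIN itself). Each query point forms attention
# weights over training inputs and returns a weighted average of their
# labels -- i.e. Nadaraya-Watson regression written as attention.
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(query_x, train_x, train_y, temp=0.1):
    """Predict labels for query_x by attending over (train_x, train_y)."""
    scores = -((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(-1)
    weights = softmax(scores / temp, axis=1)   # (n_query, n_train)
    return weights @ train_y                   # weighted label average

rng = np.random.default_rng(1)
train_x = rng.uniform(-3, 3, (500, 1))
train_y = np.sin(train_x).ravel()
pred = cross_attend(np.array([[0.0], [1.5]]), train_x, train_y)
```

The naive version above attends to all n training points per query; replacing `train_x` with a small learned set of inducing points is the kind of compression the paper's architecture builds in.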

Predicting Deep Neural Network Generalization with Perturbation Response Curves

no code implementations • NeurIPS 2021 • Yair Schiff, Brian Quanz, Payel Das, Pin-Yu Chen

Despite deep learning's many empirical successes, the recent Predicting Generalization in Deep Learning (PGDL) NeurIPS 2020 competition suggests that there is a need for more robust and efficient measures of network generalization.

Augmenting Molecular Deep Generative Models with Topological Data Analysis Representations

no code implementations • 8 Jun 2021 • Yair Schiff, Vijil Chenthamarakshan, Samuel Hoffman, Karthikeyan Natesan Ramamurthy, Payel Das

Deep generative models have emerged as a powerful tool for learning useful molecular representations and designing novel molecules with desired properties, with applications in drug discovery and material design.

Drug Discovery, Topological Data Analysis +1

Optimizing Functionals on the Space of Probabilities with Input Convex Neural Networks

no code implementations • 1 Jun 2021 • David Alvarez-Melis, Yair Schiff, Youssef Mroueh

Gradient flows are a powerful tool for optimizing functionals in general metric spaces, including the space of probabilities endowed with the Wasserstein metric.
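The simplest concrete case of the gradient flows this snippet describes is a toy one (not the paper's ICNN/JKO parameterization): for a potential-energy functional E(ρ) = ∫ V dρ, the Wasserstein gradient flow moves each particle by dx/dt = -∇V(x), i.e. plain gradient descent on samples:

```python
# Toy Wasserstein gradient flow (generic illustration, not the paper's
# input-convex-network scheme). For E(rho) = integral of V d(rho), the
# flow transports each particle along -grad V; discretized in time this
# is ordinary gradient descent applied to every sample.
import numpy as np

def grad_V(x):
    # V(x) = 0.5 * (x - 2)^2, minimized at x = 2
    return x - 2.0

rng = np.random.default_rng(0)
particles = rng.normal(0.0, 1.0, size=(1000,))  # samples of initial rho
dt = 0.1
for _ in range(100):
    particles = particles - dt * grad_V(particles)
```

The empirical distribution collapses onto the minimizer of V; for functionals with interaction or entropy terms the particle update gains coupling or diffusion terms, which is where the paper's convex parameterization earns its keep.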

Gi and Pal Scores: Deep Neural Network Generalization Statistics

no code implementations • 8 Apr 2021 • Yair Schiff, Brian Quanz, Payel Das, Pin-Yu Chen

The field of Deep Learning is rich with empirical evidence of human-like performance on a variety of regression, classification, and control tasks.

regression

Alleviating Noisy Data in Image Captioning with Cooperative Distillation

no code implementations • 21 Dec 2020 • Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff

Image captioning systems have made substantial progress, largely due to the availability of curated datasets like Microsoft COCO or Vizwiz that have accurate descriptions of their corresponding images.

Image Captioning

Image Captioning as an Assistive Technology: Lessons Learned from VizWiz 2020 Challenge

1 code implementation • 21 Dec 2020 • Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff, Richard A. Young, Brian Belgodere

Image captioning has recently demonstrated impressive progress, largely owing to the introduction of neural network algorithms trained on curated datasets like MS-COCO.

Image Captioning, Navigate

Tabular Transformers for Modeling Multivariate Time Series

1 code implementation • 3 Nov 2020 • Inkit Padhi, Yair Schiff, Igor Melnyk, Mattia Rigotti, Youssef Mroueh, Pierre Dognin, Jerret Ross, Ravi Nair, Erik Altman

The proposed approach yields two architectures for tabular time series: one for learning representations that is analogous to BERT and can be pre-trained end-to-end and used in downstream tasks, and one that is akin to GPT and can be used to generate realistic synthetic tabular sequences.

Fraud Detection, Synthetic Data Generation +2
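Before either the BERT-like or GPT-like model can run, tabular rows must become token sequences; a minimal sketch of one way to do that (hypothetical field-level vocabularies, not necessarily the paper's exact tokenization scheme):

```python
# Hypothetical sketch of field-level tokenization for tabular time
# series (not guaranteed to match the paper's scheme): each column gets
# its own vocabulary, and each row becomes one token id per field, so a
# sequence of rows flattens into a token sequence a transformer can read.
def build_vocabs(rows, columns):
    vocabs = {c: {} for c in columns}
    for row in rows:
        for c in columns:
            vocabs[c].setdefault(row[c], len(vocabs[c]))
    return vocabs

def encode_row(row, columns, vocabs):
    return [vocabs[c][row[c]] for c in columns]

columns = ["merchant", "amount_bucket", "hour"]
rows = [
    {"merchant": "grocery", "amount_bucket": "10-50", "hour": "09"},
    {"merchant": "fuel", "amount_bucket": "10-50", "hour": "18"},
]
vocabs = build_vocabs(rows, columns)
tokens = [encode_row(r, columns, vocabs) for r in rows]
```

A BERT-style variant would mask and reconstruct individual field tokens, while a GPT-style variant would predict the next row's tokens autoregressively, which is how such a model can emit synthetic tabular sequences.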

Characterizing the Latent Space of Molecular Deep Generative Models with Persistent Homology Metrics

no code implementations • NeurIPS Workshop TDA_and_Beyond 2020 • Yair Schiff, Vijil Chenthamarakshan, Karthikeyan Natesan Ramamurthy, Payel Das

In this work, we propose a method for measuring how well the latent space of deep generative models is able to encode structural and chemical features of molecular datasets by correlating latent space metrics with metrics from the field of topological data analysis (TDA).

Topological Data Analysis
