1 code implementation • 31 May 2023 • Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy
In this paper, we conduct a systematic empirical study comparing the length generalization performance of decoder-only Transformers with five position encoding approaches: Absolute Position Embedding (APE), T5's Relative PE, ALiBi, Rotary, and no positional encoding at all (NoPE).
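As a rough illustration of two of the compared schemes (a sketch under simplifying assumptions, not the paper's code; the helper names and the single fixed slope are placeholders), NoPE leaves the attention scores untouched, whereas ALiBi adds a bias proportional to the query-key distance:

    # Illustrative single-head attention scores with and without a positional bias.
    import numpy as np

    def attention_scores(q, k):                        # q, k: (seq_len, d_head)
        return q @ k.T / np.sqrt(q.shape[-1])          # NoPE: plain scaled dot products

    def alibi_bias(seq_len, slope=0.5):
        # ALiBi adds a head-specific linear penalty that grows with query-key distance.
        pos = np.arange(seq_len)
        return -slope * np.abs(pos[:, None] - pos[None, :])

    seq_len, d_head = 8, 16
    q, k = np.random.randn(seq_len, d_head), np.random.randn(seq_len, d_head)
    scores_nope = attention_scores(q, k)                        # no positional information
    scores_alibi = attention_scores(q, k) + alibi_bias(seq_len)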
no code implementations • 21 Apr 2023 • Brian Belgodere, Pierre Dognin, Adam Ivankay, Igor Melnyk, Youssef Mroueh, Aleksandra Mojsilovic, Jiri Navratil, Apoorva Nitsure, Inkit Padhi, Mattia Rigotti, Jerret Ross, Yair Schiff, Radhika Vedpathak, Richard A. Young
We present an auditing framework that offers a holistic assessment of synthetic datasets and AI models trained on them, centered around bias and discrimination prevention, fidelity to the real data, utility, robustness, and privacy preservation.
no code implementations • 13 Dec 2022 • Prasanna Sattigeri, Soumya Ghosh, Inkit Padhi, Pierre Dognin, Kush R. Varshney
Training points are dropped only in principle; in practice, the model does not need to be refit.
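The "dropped in principle, not refit in practice" idea is in the spirit of influence-function / infinitesimal-jackknife approximations. Below is a minimal, generic sketch (assuming a smooth loss, an invertible Hessian, and illustrative names, not the paper's implementation) of how the effect of removing one training point can be estimated from a single fit:

    import numpy as np

    def params_without_point_i(theta_hat, avg_loss_hessian, grad_loss_i, n):
        # First-order approximation to the leave-one-out solution:
        # theta_{-i} ~= theta_hat + (1/n) * H^{-1} grad_i, so no refitting is required.
        return theta_hat + np.linalg.solve(avg_loss_hessian, grad_loss_i) / n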
no code implementations • 5 Oct 2022 • Igor Melnyk, Vijil Chenthamarakshan, Pin-Yu Chen, Payel Das, Amit Dhurandhar, Inkit Padhi, Devleena Das
We introduce Reprogramming for Protein Sequence Infilling, a framework in which pretrained natural language models are repurposed, via reprogramming, to infill protein sequence templates as a method of novel protein generation.
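Very loosely, reprogramming keeps the pretrained language model frozen and trains only thin input and output mappings around it. The sketch below is a conceptual illustration rather than the paper's architecture: it assumes a Hugging Face-style encoder that accepts inputs_embeds, and the class name, adapter design, and vocabulary size are hypothetical.

    import torch.nn as nn

    class ReprogrammedInfiller(nn.Module):
        def __init__(self, frozen_lm, lm_dim, n_amino_acids=20):
            super().__init__()
            self.frozen_lm = frozen_lm.eval()
            for p in self.frozen_lm.parameters():
                p.requires_grad = False                       # only the adapters are trained
            self.in_adapter = nn.Embedding(n_amino_acids + 1, lm_dim)   # +1 for a [MASK] token
            self.out_head = nn.Linear(lm_dim, n_amino_acids)

        def forward(self, aa_tokens):                         # aa_tokens: (batch, length)
            h = self.frozen_lm(inputs_embeds=self.in_adapter(aa_tokens)).last_hidden_state
            return self.out_head(h)                           # amino-acid logits per position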
no code implementations • 13 Aug 2022 • Brian Belgodere, Vijil Chenthamarakshan, Payel Das, Pierre Dognin, Toby Kurien, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff, Richard A. Young
With the prospect of automating a number of chemical tasks with high fidelity, chemical language processing models are emerging at a rapid pace.
1 code implementation • 8 Jul 2022 • Matteo Manica, Jannis Born, Joris Cadow, Dimitrios Christofidellis, Ashish Dave, Dean Clarke, Yves Gaetan Nana Teukam, Giorgio Giannone, Samuel C. Hoffman, Matthew Buchan, Vijil Chenthamarakshan, Timothy Donovan, Hsiang Han Hsu, Federico Zipoli, Oliver Schilter, Akihiro Kishimoto, Lisa Hamada, Inkit Padhi, Karl Wehden, Lauren McHugh, Alexy Khrabrov, Payel Das, Seiji Takeda, John R. Smith
With the growing availability of data within various scientific domains, generative models hold enormous potential to accelerate scientific discovery.
1 code implementation • EMNLP 2021 • Pierre L. Dognin, Inkit Padhi, Igor Melnyk, Payel Das
Automatic construction of relevant Knowledge Bases (KBs) from text, and generation of semantically meaningful text from KBs are both long-standing goals in Machine Learning.
1 code implementation • 17 Jun 2021 • Jerret Ross, Brian Belgodere, Vijil Chenthamarakshan, Inkit Padhi, Youssef Mroueh, Payel Das
Models based on machine learning can enable accurate and fast molecular property predictions, which is of interest in drug discovery and material design.
1 code implementation • 21 Dec 2020 • Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff, Richard A. Young, Brian Belgodere
Image captioning has recently demonstrated impressive progress largely owing to the introduction of neural network algorithms trained on curated datasets like MS-COCO.
no code implementations • 21 Dec 2020 • Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff
Image captioning systems have made substantial progress, largely due to the availability of curated datasets like Microsoft COCO or VizWiz that have accurate descriptions of their corresponding images.
no code implementations • 8 Dec 2020 • Nishtha Madaan, Inkit Padhi, Naveen Panwar, Diptikalyan Saha
Aligned with this, we propose GYC, a framework to generate a set of counterfactual text samples, which are crucial for testing these ML systems.
1 code implementation • 3 Nov 2020 • Inkit Padhi, Yair Schiff, Igor Melnyk, Mattia Rigotti, Youssef Mroueh, Pierre Dognin, Jerret Ross, Ravi Nair, Erik Altman
This results in two architectures for tabular time series: one, analogous to BERT, that learns representations, can be pre-trained end-to-end, and can be used in downstream tasks; and one, akin to GPT, that can generate realistic synthetic tabular sequences.
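A minimal sketch of the field-level tokenization idea behind such tabular sequence models (the binning, vocabulary handling, and names here are illustrative, not the released code):

    import numpy as np

    def quantize(values, n_bins=10):
        # Turn a numeric column into discrete bin ids via empirical quantiles.
        edges = np.quantile(values, np.linspace(0, 1, n_bins + 1)[1:-1])
        return np.digitize(values, edges)

    def rows_to_tokens(rows, vocab):
        # Each row (e.g. one transaction) becomes a few field tokens; consecutive rows
        # are concatenated into one sequence for a BERT- or GPT-style model.
        tokens = []
        for row in rows:
            for field, value in row.items():
                tokens.append(vocab.setdefault((field, value), len(vocab)))
        return tokens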
no code implementations • EMNLP 2020 • Pierre L. Dognin, Igor Melnyk, Inkit Padhi, Cicero Nogueira dos Santos, Payel Das
In this work, we present a dual learning approach for unsupervised text-to-path and path-to-text transfers in Commonsense Knowledge Bases (KBs).
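One common way to realize such an unsupervised dual setup is back-translation, where each direction of the model produces pseudo-parallel data for the other. The sketch below is generic rather than the paper's training loop, and the generate/nll interface is hypothetical:

    def dual_learning_step(text_batch, path_batch, text2path, path2text, optimizer):
        # Each model translates real data into the other domain, and the partner model
        # is trained to reconstruct the original from that synthetic translation.
        pseudo_paths = text2path.generate(text_batch)     # text -> synthetic KB paths
        pseudo_texts = path2text.generate(path_batch)     # KB paths -> synthetic text
        loss = (path2text.nll(source=pseudo_paths, target=text_batch) +
                text2path.nll(source=pseudo_texts, target=path_batch))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()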
2 code implementations • 22 May 2020 • Payel Das, Tom Sercu, Kahini Wadhawan, Inkit Padhi, Sebastian Gehrmann, Flaviu Cipcigan, Vijil Chenthamarakshan, Hendrik Strobelt, Cicero dos Santos, Pin-Yu Chen, Yi Yan Yang, Jeremy Tan, James Hedrick, Jason Crain, Aleksandra Mojsilovic
De novo therapeutic design is challenged by a vast chemical repertoire and multiple constraints, e.g., high broad-spectrum potency and low toxicity.
no code implementations • ACL 2020 • Inkit Padhi, Pierre Dognin, Ke Bai, Cicero Nogueira dos Santos, Vijil Chenthamarakshan, Youssef Mroueh, Payel Das
Generative feature matching network (GFMN) is an approach for training implicit generative models for images by performing moment matching on features from pre-trained neural networks.
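The core idea can be sketched as a loss that matches feature statistics from a frozen, pretrained extractor instead of training a discriminator; this is an illustrative first-moment-only version, not the full GFMN objective:

    import torch

    def feature_matching_loss(frozen_extractor, real_images, fake_images):
        with torch.no_grad():
            real_feats = frozen_extractor(real_images)        # features of real data, no gradients
        fake_feats = frozen_extractor(fake_images)            # gradients flow back to the generator
        mean_gap = real_feats.mean(dim=0) - fake_feats.mean(dim=0)
        return (mean_gap ** 2).sum()                          # squared gap between first moments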
no code implementations • NeurIPS 2020 • Vijil Chenthamarakshan, Payel Das, Samuel C. Hoffman, Hendrik Strobelt, Inkit Padhi, Kar Wai Lim, Benjamin Hoover, Matteo Manica, Jannis Born, Teodoro Laino, Aleksandra Mojsilovic
CogMol also includes in silico screening for assessing toxicity of parent molecules and their metabolites with a multi-task toxicity classifier, synthetic feasibility with a chemical retrosynthesis predictor, and target structure binding with docking simulations.
1 code implementation • NeurIPS 2019 • Youssef Mroueh, Tom Sercu, Mattia Rigotti, Inkit Padhi, Cicero dos Santos
In the kernel version, we show that SIC can be cast as a convex optimization problem by introducing auxiliary variables, which play an important role in feature selection as they are normalized feature importance scores.
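The auxiliary-variable construction alluded to here is typically based on the standard variational identity below (a sketch, with a_j standing for a nonnegative per-feature quantity such as an expected squared gradient, not the paper's exact objective):

    \sqrt{a_j} \;=\; \min_{\eta_j > 0} \frac{1}{2}\left(\frac{a_j}{\eta_j} + \eta_j\right),
    \qquad \eta_j^{\star} = \sqrt{a_j},
    \qquad \widehat{\mathrm{importance}}_j = \frac{\eta_j^{\star}}{\sum_k \eta_k^{\star}}.

At the optimum each auxiliary variable tracks its per-feature term, so normalizing the eta_j gives feature importance scores, and the quadratic-over-linear form a_j / eta_j is what makes a convex reformulation possible.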
no code implementations • ICLR 2019 • Cicero Nogueira dos Santos, Inkit Padhi, Pierre Dognin, Youssef Mroueh
We propose a non-adversarial feature matching-based approach to train generative models.
no code implementations • ICCV 2019 • Cicero Nogueira dos Santos, Youssef Mroueh, Inkit Padhi, Pierre Dognin
Perceptual features (PFs) have been used with great success in tasks such as transfer learning, style transfer, and super-resolution.
no code implementations • ICLR Workshop DeepGenStruct 2019 • Tom Sercu, Sebastian Gehrmann, Hendrik Strobelt, Payel Das, Inkit Padhi, Cicero dos Santos, Kahini Wadhawan, Vijil Chenthamarakshan
We present the pipeline in an interactive visual tool to enable the exploration of the metrics, analysis of the learned latent space, and selection of the best model for a given task.
no code implementations • 17 Oct 2018 • Payel Das, Kahini Wadhawan, Oscar Chang, Tom Sercu, Cicero dos Santos, Matthew Riemer, Vijil Chenthamarakshan, Inkit Padhi, Aleksandra Mojsilovic
Our model learns a rich latent space of the biological peptide context by taking advantage of abundant, unlabeled peptide sequences.
no code implementations • ACL 2018 • Cicero Nogueira dos Santos, Igor Melnyk, Inkit Padhi
We introduce a new approach to tackle the problem of offensive language in online social media.
no code implementations • 26 Nov 2017 • Igor Melnyk, Cicero Nogueira dos Santos, Kahini Wadhawan, Inkit Padhi, Abhishek Kumar
Text attribute transfer using non-parallel data requires methods that can disentangle content from linguistic attributes.