Findings (EMNLP) 2021

It has been shown that training multi-task models with auxiliary tasks can improve the target task quality through cross-task transfer.

EMNLP 2020

We propose a new methodology to assign and learn embeddings for numbers.

DeeLIO (ACL) 2022

In this work, we investigate whether there are more effective strategies for judiciously selecting in-context examples (relative to random sampling) that better leverage GPT-3’s in-context learning capabilities. Inspired by the recent success of leveraging a retrieval module to augment neural networks, we propose to retrieve examples that are semantically-similar to a test query sample to formulate its corresponding prompt.
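
The retrieval step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy 2-d embeddings, the `retrieve_examples` and `build_prompt` helpers, and the prompt layout are all hypothetical, and a real system would obtain embeddings from a sentence encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_examples(query_vec, pool, k=2):
    """Return the k pool entries whose embeddings are closest to the query.

    `pool` is a list of (embedding, input_text, label) tuples; the embeddings
    are assumed to come from some sentence encoder (not shown here).
    """
    ranked = sorted(pool, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return ranked[:k]


def build_prompt(examples, query_text):
    """Concatenate retrieved examples and the test query (hypothetical layout)."""
    lines = [f"Input: {text}\nOutput: {label}" for _, text, label in examples]
    lines.append(f"Input: {query_text}\nOutput:")
    return "\n\n".join(lines)


# Toy 2-d embeddings, made up for illustration only.
pool = [
    ([1.0, 0.0], "The film was wonderful.", "positive"),
    ([0.9, 0.1], "A delightful movie.", "positive"),
    ([0.0, 1.0], "The stock fell sharply.", "negative"),
]
chosen = retrieve_examples([1.0, 0.05], pool, k=2)
prompt = build_prompt(chosen, "I loved this picture.")
```

Random sampling would instead pick `k` entries from `pool` uniformly; the only change here is the similarity-based ranking.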

EMNLP 2020

Legislator preferences are typically represented as measures of general ideology estimated from roll call votes on legislation, potentially masking important nuances in legislators' political attitudes.

27 May 2024

Contextual outcomes in the $m$th set of contextual data, $\textsf{C}_m$, are modeled in terms of latent function $f_m(x)\in\textsf{F}$, where $\textsf{F}$ is a functional class with $(C-1)$-dimensional vector output.

2 Dec 2023

Zero-shot learning (ZSL) is a promising approach to generalizing a model to categories unseen during training by leveraging class attributes, but challenges remain.

9 Mar 2023

Open world classification is a task in natural language processing with key practical relevance and impact.

23 Oct 2022

Weight pruning is among the most popular approaches for compressing deep convolutional neural networks.

17 Oct 2022

The model is fine-tuned by introducing a new regularization loss that separates the embeddings of IND and OOD data, which leads to significant gains on the OOD prediction task during testing.

20 Sep 2022

In recommendation systems, items are likely to be exposed to various users and we would like to learn about the familiarity of a new user with an existing item.

7 May 2022

Numbers are essential components of text, like any other word tokens, from which natural language processing (NLP) models are built and deployed.

17 Mar 2022

Commonsense question answering requires reasoning about everyday situations and causes and effects implicit in context.

25 Feb 2022

Understanding the effects of these system inputs on system outputs is crucial to have any meaningful model of a dynamical system.

24 Nov 2021

We introduce the challenging new task of explainable multiple abnormality classification in volumetric medical images, in which a model must indicate the regions used to predict each abnormality.

4 Nov 2021

Distributed learning has become an integral tool for scaling up machine learning and addressing the growing need for data privacy.

4 Nov 2021

In this work, we present a careful analysis of the thermodynamic variational objective (TVO), bridging the gap between existing variational objectives and shedding new insights to advance the field.

9 Jul 2021

We examine interval estimation of the effect of a treatment T on an outcome Y given the existence of an unobserved confounder U.

ICLR 2022

Though recent works have developed methods that can generate estimates (or imputations) of the missing entries in a dataset to facilitate downstream analysis, most depend on assumptions that may not align with real-world applications and could suffer from poor performance in subsequent tasks such as classification.

2 Jul 2021

InfoNCE-based contrastive representation learners, such as SimCLR, have been tremendously successful in recent years.

2 Jul 2021

Successful applications of InfoNCE and its variants have popularized the use of contrastive variational mutual information (MI) estimators in machine learning.

NAACL 2021

In many natural language processing applications, identifying predictive text can be as important as the predictions themselves.

27 Apr 2021

Federated learning has emerged as an important distributed learning paradigm, where a server aggregates a global model from many client-trained models while having no access to the client data.

2 Apr 2021

We consider machine-learning-based malignancy prediction and lesion identification from clinical dermatological images, which can be indistinctly acquired via smartphone or dermoscopy capture.

CVPR 2021

However, the growth in the number of additional parameters of many of these types of methods can be computationally expensive at larger scales, at times prohibitively so.

ICLR 2021

Voice style transfer, also called voice conversion, seeks to modify one speaker's voice to generate speech as if it came from another (target) speaker.

ICLR 2021

Pretrained text encoders, such as BERT, have been applied increasingly in various natural language processing (NLP) tasks, and have recently demonstrated significant performance gains.

NeurIPS 2021

Our approach is based on learning a set of global and task-specific parameters.

23 Feb 2021

Zero-shot learning (ZSL) has been shown to be a promising approach to generalizing a model to categories unseen during training by leveraging class attributes, but challenges still remain.

10 Feb 2021

While different data-driven deep learning models have been developed to mitigate the diagnosis of COVID-19, the data itself is still scarce due to patient privacy concerns.

17 Jan 2021

Inspired by the recent success of leveraging a retrieval module to augment large-scale neural network models, we propose to retrieve examples that are semantically-similar to a test sample to formulate its corresponding prompt.

2 Jan 2021

Flexibility design problems are a class of problems that appear in strategic decision-making across industries, where the objective is to design a (e.g., manufacturing) network that affords flexibility and adaptivity.

1 Jan 2021

There has been growing interest in representation learning for text data, based on theoretical arguments and empirical evidence.

CVPR 2021

The primary goal of knowledge distillation (KD) is to encapsulate the information of a model learned from a teacher network into a student network, with the latter being more compact than the former.

10 Dec 2020

Accordingly, given a set of graphs generated by an underlying graphon, we learn the corresponding step function as the Gromov-Wasserstein barycenter of the given graphs.

6 Dec 2020

Deep neural networks excel at comprehending complex visual signals, delivering performance on par with or even superior to that of human experts.

NeurIPS 2020

Synchronization is a key step in data-parallel distributed machine learning (ML).

NeurIPS 2020

As a step towards more flexible, scalable and accurate ITE estimation, we present a novel generative Bayesian estimation framework that integrates representation learning, adversarial matching and causal estimation.

NeurIPS 2020

Further, our approach does not require storing data samples from the old tasks, which is done by many replay based methods.

NeurIPS 2021

Dealing with severe class imbalance poses a major challenge for real-world applications, especially when the accurate classification and generalization of minority classes is of primary interest.

17 Nov 2020

Explanation methods facilitate the development of models that learn meaningful concepts and avoid exploiting spurious correlations.

Findings of the Association for Computational Linguistics 2020

Pretrained Language Models (PLMs) have improved the performance of natural language understanding in recent years.

Findings of the Association for Computational Linguistics 2020

In sequence-to-sequence models, classical optimal transport (OT) can be applied to semantically match generated sentences with target sentences.

ICLR 2021

Large-scale language models have recently demonstrated impressive empirical performance.

23 Oct 2020

A key to causal inference with observational data is achieving balance in predictive features associated with each treatment type.

15 Oct 2020

Due to the SAP test's innate difficulty and its high test-retest variability, we propose the RetiNerveNet, a deep convolutional recursive neural network for obtaining estimates of the SAP visual field.

15 Oct 2020

Causal inference, or counterfactual prediction, is central to decision making in healthcare, policy and social sciences.

EMNLP 2020

An extension is further proposed to improve the OT learning, based on the structural and contextual information of the text sequences.

2 Oct 2020

The data sources described earlier make two "domains": a hand-collected data domain of images with threats, and a real-world domain of images assumed without threats.

14 Aug 2020

Cross-domain alignment between image objects and text sequences is key to many visual-language tasks, and it poses a fundamental challenge to both computer vision and natural language processing.

13 Aug 2020

Experiments on MNIST, FashionMNIST, and CIFAR-10 demonstrate WAFFLe's significant improvement to local test performance and fairness while simultaneously providing an extra layer of security.

13 Jul 2020

Maximum likelihood (ML) and adversarial learning are two popular approaches for training generative models, and from many perspectives these techniques are complementary.

ICML 2020

In GOT, cross-domain alignment is formulated as a graph matching problem, by representing entities into a dynamically-constructed graph.

ICML 2020

In this paper, we propose a novel Contrastive Log-ratio Upper Bound (CLUB) of mutual information.

22 Jun 2020

Small and imbalanced datasets commonly seen in healthcare represent a challenge when training classifiers based on deep learning models.

16 Jun 2020

An unbiased low-variance gradient estimator, termed GO gradient, was proposed recently for expectation-based objectives $\mathbb{E}_{q_{\boldsymbol{\gamma}}(\boldsymbol{y})} [f(\boldsymbol{y})]$, where the random variable (RV) $\boldsymbol{y}$ may be drawn from a stochastic computation graph with continuous (non-reparameterizable) internal nodes and continuous/discrete leaves.

14 Jun 2020

Balanced representation learning methods have been applied successfully to counterfactual inference from observational data.

NeurIPS 2020

As a fundamental issue in lifelong learning, catastrophic forgetting is directly caused by inaccessible historical data; accordingly, if the data (information) were memorized perfectly, no forgetting should be expected.

12 Jun 2020

Control variates are a well-established tool to reduce the variance of Monte Carlo estimators.
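
As a worked illustration of the general idea (not of the method in this paper), the sketch below estimates $\mathbb{E}[e^{U}]$ for $U \sim \mathrm{Uniform}(0,1)$, using $U$ itself as a control variate with known mean $0.5$. The function name and the choice of integrand are assumptions made for this example; estimating the coefficient `c` from the same sample introduces a small bias that vanishes as `n` grows.

```python
import math
import random


def mc_with_control_variate(n, seed=0):
    """Estimate E[exp(U)], U ~ Uniform(0, 1), with and without a control variate.

    The control variate is U itself, whose mean 0.5 is known exactly.
    Returns (plain_estimate, cv_estimate, plain_var, cv_var), where the
    variances are per-sample variances of the two estimators' summands.
    """
    rng = random.Random(seed)
    us = [rng.random() for _ in range(n)]
    fs = [math.exp(u) for u in us]

    mean_f = sum(fs) / n
    mean_u = sum(us) / n

    # Near-optimal coefficient c = Cov(f, U) / Var(U), estimated from the sample.
    cov_fu = sum((f - mean_f) * (u - mean_u) for f, u in zip(fs, us)) / (n - 1)
    var_u = sum((u - mean_u) ** 2 for u in us) / (n - 1)
    c = cov_fu / var_u

    # Subtract the centred control variate from each sample.
    adjusted = [f - c * (u - 0.5) for f, u in zip(fs, us)]
    mean_adj = sum(adjusted) / n

    var_f = sum((f - mean_f) ** 2 for f in fs) / (n - 1)
    var_adj = sum((a - mean_adj) ** 2 for a in adjusted) / (n - 1)
    return mean_f, mean_adj, var_f, var_adj


plain, cv, var_plain, var_cv = mc_with_control_variate(10_000)
```

Both estimates target the exact value $e - 1$; because $e^{U}$ and $U$ are strongly correlated, the adjusted summands have much smaller variance than the plain ones.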

4 Jun 2020

Current neural-network-based classifiers are susceptible to adversarial examples.

4 Jun 2020

Traditional multi-view learning methods often rely on two assumptions: ($i$) the samples in different views are well-aligned, and ($ii$) their representations in latent space obey the same distribution.

ACL 2020

Learning disentangled representations of natural language is essential for many NLP tasks, e.g., conditional text generation, style transfer, and personalized dialogue systems.

8 May 2020

A modified Y-Net architecture based on the VGG11 encoder is used to simultaneously learn geometric orientation (similarity transform parameters) of the chest and segmentation of radiographic annotations.

ACL 2020

Auto-regressive text generation models usually focus on local fluency, and may cause inconsistent semantic meaning in long text generation.

4 May 2020

Text-based interactive recommendation provides richer user feedback and has demonstrated advantages over traditional interactive recommender systems.

ICLR 2020

We investigate new methods for training collaborative filtering models based on actor-critic reinforcement learning, to more directly maximize ranking-based objective functions.

NAACL 2021

In this paper, we investigate text generation in a hyperbolic latent space to learn continuous hierarchical representations.

NeurIPS 2020

We consider the blackbox transfer-based targeted adversarial attack threat model in the realm of deep neural network (DNN) image classifiers.

ICLR 2020

Almost all current adversarial attacks of CNN classifiers rely on information derived from the output layer of the network.

21 Apr 2020

Naively trained neural networks tend to experience catastrophic forgetting in sequential task settings, where data from previous tasks are unavailable.

6 Mar 2020

Recent research has proposed the lottery ticket hypothesis, suggesting that for a deep neural network, there exist trainable sub-networks performing equally or better than the original model with commensurate training steps.

29 Feb 2020

As a result, there is an unmet need in survival analysis for identifying subpopulations with distinct risk profiles, while jointly accounting for accurate individualized time-to-event predictions.

ICML 2020

Demonstrated by natural-image generation, we reveal that low-level filters (those close to observations) of both the generator and discriminator of pretrained GANs can be transferred to facilitate generation in a perceptually-distinct target domain with limited training data.

CVPR 2020

By training on a large amount of image-text-action triplets in a self-supervised learning manner, the pre-trained model provides generic representations of visual environments and language instructions.

12 Feb 2020

This model reached a classification performance of AUROC greater than 0.90 for 18 abnormalities, with an average AUROC of 0.773 for all 83 abnormalities, demonstrating the feasibility of learning from unfiltered whole volume CT data.

11 Feb 2020

These missing annotations can be problematic, as the standard cross-entropy loss employed to train object detection models treats classification as a positive-negative (PN) problem: unlabeled regions are implicitly assumed to be background.

ICML 2020

A new algorithmic framework is proposed for learning autoencoders of data distributions.

20 Jan 2020

Reinforcement learning (RL) has been widely studied for improving sequence-generation models.

13 Dec 2019

We show performance of our models on held-out evaluation sets, analyze several design parameters, and demonstrate the potential of such systems for automated detection of threats that can be found in airports.

NeurIPS 2019

Inference, estimation, sampling and likelihood evaluation are four primary goals of probabilistic modeling.

CVPR 2020

Neural networks are known to be vulnerable to carefully crafted adversarial examples, and these malicious samples often transfer, i.e., they remain adversarial even against other models.

20 Nov 2019

We propose a novel graph-driven generative model that unifies multiple heterogeneous learning tasks into the same framework.

10 Nov 2019

Attention-based models have shown significant improvement over traditional algorithms in several NLP tasks.

IJCNLP 2019

Generating high-quality paraphrases is a fundamental yet challenging natural language processing task.

ICLR 2020

To address this, we propose a learning framework that improves collaborative filtering with a synthetic feedback loop (CF-SFL) to simulate the user feedback.

20 Oct 2019

Specifically, we build a conditional generative model to generate features from seen-class attributes, and establish an optimal transport between the distribution of the generated features and that of the real features.

NeurIPS 2019

We investigate time-dependent data analysis from the perspective of recurrent kernel machines, from which models with hidden units and gated memory cells arise naturally.

5 Oct 2019

The Straight-Through (ST) estimator is a widely used technique for back-propagating gradients through discrete random variables.

5 Oct 2019

The relative importance of global versus local structure for the embeddings is learned automatically.

4 Oct 2019

Accordingly, the learned optimal transport reflects the correspondence between the event types of these two Hawkes processes.

NeurIPS 2019

This paper considers a novel variational formulation of network embeddings, with special focus on textual networks.

NeurIPS 2019

Model parallelism is required if a model is too large to fit in a single computing device.

11 Sep 2019

Recent unsupervised approaches to domain adaptation primarily focus on minimizing the gap between the source and the target domains through refining the feature generator, in order to learn a better alignment between the two domains.

24 Jun 2019

We propose a Leaked Motion Video Predictor (LMVP) to predict future frames by capturing the spatial and temporal dependencies from given inputs.

20 Jun 2019

Instead of learning a mixture model directly from a set of event sequences drawn from different Hawkes processes, the proposed method learns the target model iteratively, which generates "easy" sequences and uses them in an adversarial and self-paced manner.

ACL 2019

Vector representations of sentences, trained on massive text corpora, are widely used as generic sentence embeddings across a variety of NLP problems.

13 Jun 2019

The proposed method achieves clinically-interpretable embeddings of ICD codes, and outperforms state-of-the-art embedding methods in procedure recommendation.

10 Jun 2019

In this paper we investigate new methods for training collaborative filtering models based on actor-critic reinforcement learning, to directly optimize the non-differentiable quality metrics of interest.

ACL 2019

We present a syntax-infused variational autoencoder (SIVAE) that integrates sentences with their syntactic trees to improve the grammar of generated sentences.

5 Jun 2019

We tackle an unsupervised domain adaptation problem for which the domain discrepancy between labeled source and unlabeled target domains is large, due to many factors of inter and intra-domain variation.

ACL 2019

Constituting highly informative network embeddings is an important tool for network analysis.

NAACL 2019

We propose a topic-guided variational auto-encoder (TGVAE) model for text generation.

21 May 2019

We present a survival function estimator for probabilistic predictions in time-to-event models, based on a neural network model for draws from the distribution of event times, without explicit assumptions on the form of the distribution.

NeurIPS 2019

Using this concept, we extend our method to multi-graph partitioning and matching by learning a Gromov-Wasserstein barycenter graph for multiple observed graphs; the barycenter graph plays the role of the disconnected graph, and since it is learned, so is the clustering.

15 May 2019

Adversarial examples are carefully perturbed inputs for fooling machine learning models.

Proceedings of the 36th International Conference on Machine Learning 2019

Although we develop this framework for a particular type of SBM, namely the overlapping stochastic blockmodel, the proposed framework can be adapted readily for other types of SBMs.

ICLR 2019

In this paper, we propose a powerful second-order attack method that reduces the accuracy of the defense model by Madry et al. (2017).

26 Apr 2019

The lower bound further allows us to extend the proposed algorithm to simultaneously predict multiple bag and instance-level labels from a single output of a neural network.

29 Mar 2019

We consider preoperative prediction of thyroid cancer based on ultra-high-resolution whole-slide cytopathology images.

NAACL 2019

Variational autoencoders (VAEs) with an auto-regressive decoder have been applied for many natural language processing (NLP) tasks.

17 Mar 2019

We propose a topic-guided variational autoencoder (TGVAE) model for text generation.

19 Feb 2019

Thompson sampling (TS) is a class of algorithms for sequential decision-making, which requires maintaining a posterior distribution over a model.
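
For readers unfamiliar with the basic algorithm, here is a minimal sketch of Thompson sampling in the simplest setting where the posterior is available in closed form: a Bernoulli bandit with Beta priors. This conjugate setup, the function name, and the arm probabilities are assumptions chosen for illustration; they are not the model or the approximation scheme studied in the paper.

```python
import random


def thompson_bernoulli(true_probs, rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling and return pull counts per arm."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Beta(1, 1) prior on each arm's success probability.
    alpha = [1] * n_arms
    beta = [1] * n_arms
    pulls = [0] * n_arms

    for _ in range(rounds):
        # Draw one sample from each arm's posterior and act greedily on it.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulated environment: Bernoulli reward with the arm's true rate.
        reward = 1 if rng.random() < true_probs[arm] else 0

        # Conjugate posterior update for the pulled arm only.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


pulls = thompson_bernoulli([0.2, 0.5, 0.8], rounds=2000)
```

As the posteriors concentrate, the arm with the highest true success rate is sampled as the maximizer more and more often, so it accumulates most of the pulls.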

ACL 2019

Variational autoencoders (VAEs) have received much attention recently as an end-to-end architecture for text generation with latent variables.

ICLR 2019

Sequence-to-sequence models are commonly trained via maximum likelihood estimation (MLE).

17 Jan 2019

A novel Gromov-Wasserstein learning framework is proposed to jointly match (align) graphs and learn embedding vectors for the associated graph nodes.

ICLR 2019

Within many machine learning algorithms, a fundamental problem concerns efficient calculation of an unbiased gradient with respect to parameters $\boldsymbol{\gamma}$ for expectation-based objectives $\mathbb{E}_{q_{\boldsymbol{\gamma}}(\boldsymbol{y})} [f(\boldsymbol{y})]$.

3 Jan 2019

We investigate adversarial learning in the case when only an unnormalized form of the density can be accessed, rather than samples.

CVPR 2019

We therefore propose a new story-to-image-sequence generation model, StoryGAN, based on the sequential conditional GAN framework.

2 Dec 2018

The impact of softmax on the value function itself in reinforcement learning (RL) is often viewed as problematic because it leads to sub-optimal value (or Q) functions and interferes with the contraction properties of the Bellman operator.

ICLR 2019

We hypothesize that this is at least in part due to the evolution of the generator distribution and the catastrophic forgetting tendency of neural networks, which leads to the discriminator losing the ability to remember synthesized samples from previous instantiations of the generator.

2 Nov 2018

Sequence generation with reinforcement learning (RL) has received significant attention recently.

27 Sep 2018

Variational autoencoders (VAEs) have received much attention recently as an end-to-end architecture for text generation.

NeurIPS 2018

However, the discrete nature of text hinders the application of GAN to text-generation tasks.

NeurIPS 2018

When learning the topic model, we leverage a distilled underlying distance matrix to update the topic distributions and smoothly calculate the corresponding optimal transports.

NeurIPS 2019

The existence of adversarial data examples has drawn significant attention in the deep-learning community; such data are seemingly minimally perturbed relative to the original data, but lead to very different outputs from a deep-learning algorithm.

5 Sep 2018

Particle-optimization-based sampling (POS) is a recently developed effective sampling technique that interactively updates a set of particles.

5 Sep 2018

Health risks from cigarette smoking -- the leading cause of preventable death in the United States -- can be substantially reduced by quitting.

EMNLP 2018

Network embeddings, which learn low-dimensional representations for each vertex in a large-scale network, have received considerable attention in recent years.

ICML 2018

Policy optimization is a core component of reinforcement learning (RL), and most existing RL methods directly optimize parameters of a policy based on maximizing the expected total reward, or its surrogate.

4 Jul 2018

Particle-based variational inference methods (ParVIs) have gained attention in the Bayesian inference literature, for their capacity to yield flexible and accurate approximations.

ICML 2018

Distinct from most existing approaches, that only learn conditional distributions, the proposed model aims to learn a joint distribution of multiple random variables (domains).

ACL 2018

Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations.

NeurIPS 2018

Textual network embedding leverages rich text information associated with the network to learn low-dimensional vectorial representations of vertices.

ACL 2018

Semantic hashing has become a powerful paradigm for fast similarity search in many information retrieval systems.

ACL 2018

Word embeddings are effective intermediate representations for capturing semantic regularities between words, when learning the representations of text sequences.

ICML 2018

Modern health data science applications leverage abundant molecular and electronic health data, providing opportunities for machine learning to build statistical models to support clinical practice.

CVPR 2018

Low-rank signal modeling has been widely leveraged to capture non-local correlation in image processing applications.

13 Feb 2018

We consider the learning of multi-agent Hawkes processes, a model containing multiple Hawkes processes with shared endogenous impact functions and different exogenous intensities.

15 Jan 2018

Since diagnoses are typically correlated, a deep residual network is employed on top of the CNN encoder, to capture label (diagnosis) dependencies and incorporate information directly from the encoded sentence vector.

ICLR 2018

In this paper, we conduct an extensive comparative study between Simple Word Embeddings-based Models (SWEMs), with no compositional parameters, relative to employing word embeddings within RNN/CNN-based models.

30 Dec 2017

Learning probability distributions on the weights of neural networks (NNs) has recently proven beneficial in many applications.

28 Dec 2017

The TCNLM learns the global semantic coherence of a document via a neural topic model, and the probability of each learned latent topic is further used to build a Mixture-of-Experts (MoE) language model, where each expert (corresponding to one topic) is a recurrent neural network (RNN) that accounts for learning the local structure of a word sequence.

25 Dec 2017

Significant success has been realized recently in applying machine learning to real-world applications.

NeurIPS 2017

We consider the analysis of Electroencephalography (EEG) and Local Field Potential (LFP) datasets, which are “big” in terms of the size of recorded data but rarely have sufficient labels required to train complex models (e.g., conventional deep learning methods).

NeurIPS 2017

To facilitate understanding of network-level synchronization between brain regions, we introduce a novel model of multisite low-frequency neural recordings, such as local field potentials (LFPs) and electroencephalograms (EEGs).

NeurIPS 2017

We propose a scalable algorithm for model selection in sigmoid belief networks (SBNs), based on the factorized asymptotic Bayesian (FAB) framework.

15 Nov 2017

We present a deep generative model for learning to predict classes not seen at training time.

NeurIPS 2017

A new form of variational autoencoder (VAE) is developed, in which the joint distribution of data and codes is considered in two (symmetric) forms: ($i$) from observed data fed through the encoder to yield codes, and ($ii$) from latent codes drawn from a simple prior and propagated through the decoder to manifest data.

14 Oct 2017

The superposition of Hawkes processes is demonstrated to be beneficial for tightening the upper bound of excess risk under certain conditions, and we show the feasibility of the benefit in typical situations.

ICML 2018

A parametric point process model is developed, with modeling based on the assumption that sequential observations often share latent phenomena, while also possessing idiosyncratic effects.

EMNLP 2018

The role of the meta network is to abstract the contextual information of a sentence or document into a set of input-aware filters.

21 Sep 2017

A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives.

NeurIPS 2017

The generators are designed to learn the two-way conditional distributions between the two domains, while the discriminators implicitly define a ternary discriminative function, which is trained to distinguish real data pairs and two kinds of fake data pairs.

NeurIPS 2017

We present a probabilistic framework for nonlinearities, based on doubly truncated Gaussian distributions.

NeurIPS 2017

For matrix inversion in the second sub-problem, we learn a convolutional neural network to approximate the matrix inversion, i.e., the inverse mapping is learned by feeding the input through the learned forward network.

6 Sep 2017

A new form of the variational autoencoder (VAE) is proposed, based on the symmetric Kullback-Leibler divergence.

NeurIPS 2017

We investigate the non-identifiability issues associated with bidirectional adversarial training for joint distribution matching.

4 Sep 2017

However, there has been little theoretical analysis of the impact of minibatch size on the algorithm's convergence rate.

ICML 2018

Distinct from normalizing flows and GANs, CTFs can be adopted to achieve the above two goals in one framework, with theoretical guarantees.

NeurIPS 2017

Learning latent representations from long text sequences is an important first step in many natural language processing applications.

ICML 2017

Moreover, inference cost scales with the number of edges, which is attractive for massive but sparse networks.

ICML 2017

We propose a framework for generating realistic text via adversarial training.

ICML 2017

A framework is proposed to improve the sampling efficiency of stochastic gradient MCMC, based on Hamiltonian Monte Carlo.

NeurIPS 2017

A new method for learning variational autoencoders (VAEs) is developed, based on Stein variational gradient descent.

11 Jan 2017

During reconstruction and testing, we project the upper layer dictionary to the data level and only a single layer deconvolution is required.

8 Dec 2016

A multi-way factor analysis model is introduced for tensor-variate data of any order.

NeurIPS 2016

We then develop a supervised linear feature encoding method that is motivated by insights from linear value function approximation theory, as well as empirical successes from deep RL.