Search Results for author: Kyunghyun Cho

Found 231 papers, 111 papers with code

Log-Linear Reformulation of the Noisy Channel Model for Document-Level Neural Machine Translation

no code implementations EMNLP (spnlp) 2020 Sébastien Jean, Kyunghyun Cho

We seek to maximally use various data sources, such as parallel and monolingual data, to build an effective and efficient document-level translation system.

Language Modelling Machine Translation +1

Translating Hanja historical documents to understandable Korean and English

no code implementations20 May 2022 Juhee Son, Jiho Jin, Haneul Yoo, JinYeong Bak, Kyunghyun Cho, Alice Oh

We compare our method with two baselines: one is a recent model that simultaneously learns to restore and translate Hanja historical documents, and the other is a transformer trained only on newly translated corpora.

Machine Translation Translation

Multi-segment preserving sampling for deep manifold sampler

no code implementations9 May 2022 Daniel Berenberg, Jae Hyeon Lee, Simon Kelow, Ji Won Park, Andrew Watkins, Vladimir Gligorijević, Richard Bonneau, Stephen Ra, Kyunghyun Cho

We introduce an alternative approach to this guided sampling procedure, multi-segment preserving sampling, that enables the direct inclusion of domain-specific knowledge by designating preserved and non-preserved segments along the input sequence, thereby restricting variation to only select regions.

Language Modelling

Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks

no code implementations10 Feb 2022 Nan Wu, Stanisław Jastrzębski, Kyunghyun Cho, Krzysztof J. Geras

We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning.

Causal Scene BERT: Improving object detection by searching for challenging groups of data

no code implementations8 Feb 2022 Cinjon Resnick, Or Litany, Amlan Kar, Karsten Kreis, James Lucas, Kyunghyun Cho, Sanja Fidler

Our main contribution is a pseudo-automatic method to discover such groups in foresight by performing causal interventions on simulated scenes.

Autonomous Vehicles Object Detection

Generative multitask learning mitigates target-causing confounding

no code implementations8 Feb 2022 Taro Makino, Krzysztof Geras, Kyunghyun Cho

We propose a simple and scalable approach to causal representation learning for multitask learning.

Representation Learning

LINDA: Unsupervised Learning to Interpolate in Natural Language Processing

no code implementations28 Dec 2021 Yekyung Kim, Seohyeong Jeong, Kyunghyun Cho

Despite the success of mixup in data augmentation, its applicability to natural language processing (NLP) tasks has been limited due to the discrete and variable-length nature of natural languages.

Data Augmentation Text Classification

Amortized Noisy Channel Neural Machine Translation

no code implementations16 Dec 2021 Richard Yuanzhe Pang, He He, Kyunghyun Cho

For all three approaches, the generated translations fail to achieve rewards comparable to BSR, but the translation quality approximated by BLEU is similar to the quality of BSR-produced translations.

Imitation Learning Knowledge Distillation +3

Characterizing and addressing the issue of oversmoothing in neural autoregressive sequence modeling

1 code implementation16 Dec 2021 Ilia Kulikov, Maksim Eremeev, Kyunghyun Cho

From these observations, we conclude that the high degree of oversmoothing is the main reason behind the degenerate case of overly probable short sequences in a neural autoregressive model.

Machine Translation Translation

Causal Effect Variational Autoencoder with Uniform Treatment

no code implementations16 Nov 2021 Daniel Jiwoong Im, Kyunghyun Cho, Narges Razavian

A causal effect variational autoencoder (CEVAE) is trained to predict the outcome given observational treatment data, while a uniform treatment variational autoencoder (UTVAE) is trained with a uniform treatment distribution using importance sampling.

Causal Inference

DEEP: DEnoising Entity Pre-training for Neural Machine Translation

no code implementations ACL 2022 Junjie Hu, Hiroaki Hayashi, Kyunghyun Cho, Graham Neubig

It has been shown that machine translation models usually generate poor translations for named entities that are infrequent in the training corpus.

Denoising Multi-Task Learning +2

AlphaD3M: Machine Learning Pipeline Synthesis

no code implementations3 Nov 2021 Iddo Drori, Yamuna Krishnamurthy, Remi Rampin, Raoni de Paula Lourenco, Jorge Piazentin Ono, Kyunghyun Cho, Claudio Silva, Juliana Freire

We introduce AlphaD3M, an automatic machine learning (AutoML) system based on meta reinforcement learning using sequence models with self play.

AutoML Meta Reinforcement Learning +1

Monotonic Simultaneous Translation with Chunk-wise Reordering and Refinement

no code implementations WMT (EMNLP) 2021 Hyojung Han, Seokchan Ahn, Yoonjung Choi, Insoo Chung, Sangha Kim, Kyunghyun Cho

Recent work in simultaneous machine translation is often trained with conventional full-sentence translation corpora, leading to either excessive latency or the necessity to anticipate as-yet-unarrived words when dealing with a language pair whose word orders differ significantly.

Machine Translation Translation +1

Causal Scene BERT: Improving object detection by searching for challenging groups

no code implementations29 Sep 2021 Cinjon Resnick, Or Litany, Amlan Kar, Karsten Kreis, James Lucas, Kyunghyun Cho, Sanja Fidler

We verify that the prioritized groups found via intervention are challenging for the object detector and show that retraining with data collected from these groups helps inordinately compared to adding more IID data.

Autonomous Vehicles Object Detection

AAVAE: Augmentation-Augmented Variational Autoencoders

no code implementations29 Sep 2021 William Alejandro Falcon, Ananya Harsh Jha, Teddy Koker, Kyunghyun Cho

We empirically evaluate the proposed AAVAE on image classification, similar to how recent contrastive and non-contrastive learning algorithms have been evaluated.

Contrastive Learning Data Augmentation +2

Stereo Video Reconstruction Without Explicit Depth Maps for Endoscopic Surgery

no code implementations16 Sep 2021 Annika Brundyn, Jesse Swanson, Kyunghyun Cho, Doug Kondziolka, Eric Oermann

In the first reader study, a variant of the U-Net that takes as input multiple consecutive video frames and outputs the missing view performs best.

Video Reconstruction

An Empirical Study on Few-shot Knowledge Probing for Pretrained Language Models

1 code implementation6 Sep 2021 Tianxing He, Kyunghyun Cho, James Glass

Prompt-based knowledge probing for 1-hop relations has been used to measure how much world knowledge is stored in pretrained language models.

Pretrained Language Models

AASAE: Augmentation-Augmented Stochastic Autoencoders

1 code implementation26 Jul 2021 William Falcon, Ananya Harsh Jha, Teddy Koker, Kyunghyun Cho

We empirically evaluate the proposed AASAE on image classification, similar to how recent contrastive and non-contrastive learning algorithms have been evaluated.

Contrastive Learning Data Augmentation +2

Mode recovery in neural autoregressive sequence modeling

1 code implementation ACL (spnlp) 2021 Ilia Kulikov, Sean Welleck, Kyunghyun Cho

We propose to study these phenomena by investigating how the modes, or local maxima, of a distribution are maintained throughout the full learning chain of the ground-truth, empirical, learned and decoding-induced distributions, via the newly proposed mode recovery cost.

Comparing Test Sets with Item Response Theory

no code implementations ACL 2021 Clara Vania, Phu Mon Htut, William Huang, Dhara Mungra, Richard Yuanzhe Pang, Jason Phang, Haokun Liu, Kyunghyun Cho, Samuel R. Bowman

Recent years have seen numerous NLP datasets introduced to evaluate the performance of fine-tuned models on natural language understanding tasks.

Natural Language Understanding

True Few-Shot Learning with Language Models

1 code implementation NeurIPS 2021 Ethan Perez, Douwe Kiela, Kyunghyun Cho

Here, we evaluate the few-shot ability of LMs when such held-out examples are unavailable, a setting we call true few-shot learning.

Few-Shot Learning Model Selection +1

The Future is not One-dimensional: Complex Event Schema Induction by Graph Modeling for Event Prediction

1 code implementation EMNLP 2021 Manling Li, Sha Li, Zhenhailong Wang, Lifu Huang, Kyunghyun Cho, Heng Ji, Jiawei Han, Clare Voss

We introduce a new concept of Temporal Complex Event Schema: a graph-based schema representation that encompasses events, arguments, temporal connections and argument relations.

NaturalProofs: Mathematical Theorem Proving in Natural Language

1 code implementation24 Mar 2021 Sean Welleck, Jiacheng Liu, Ronan Le Bras, Hannaneh Hajishirzi, Yejin Choi, Kyunghyun Cho

Understanding and creating mathematics using natural mathematical language - the mixture of symbolic and natural language used by humans - is a challenging and important problem for driving progress in machine learning.

Automated Theorem Proving Domain Generalization +1

Online hyperparameter optimization by real-time recurrent learning

1 code implementation15 Feb 2021 Daniel Jiwoong Im, Cristina Savin, Kyunghyun Cho

Conventional hyperparameter optimization methods are computationally intensive and hard to generalize to scenarios that require dynamically adapting hyperparameters, such as life-long learning.

Hyperparameter Optimization online learning

Self-Supervised Equivariant Scene Synthesis from Video

no code implementations1 Feb 2021 Cinjon Resnick, Or Litany, Cosmas Heiß, Hugo Larochelle, Joan Bruna, Kyunghyun Cho

We propose a self-supervised framework to learn scene representations from video that are automatically delineated into background, characters, and their animations.

A Study on the Autoregressive and non-Autoregressive Multi-label Learning

no code implementations3 Dec 2020 Elham J. Barezi, Iacer Calixto, Kyunghyun Cho, Pascale Fung

These tasks are hard because the label space is usually (i) very large, e.g. thousands or millions of labels, (ii) very sparse, i.e. very few labels apply to each input document, and (iii) highly correlated, meaning that the existence of one label changes the likelihood of predicting all other labels.

Multi-Label Learning

Learned Equivariant Rendering without Transformation Supervision

no code implementations11 Nov 2020 Cinjon Resnick, Or Litany, Hugo Larochelle, Joan Bruna, Kyunghyun Cho

We propose a self-supervised framework to learn scene representations from video that are automatically delineated into objects and background.

Improving Conversational Question Answering Systems after Deployment using Feedback-Weighted Learning

1 code implementation COLING 2020 Jon Ander Campos, Kyunghyun Cho, Arantxa Otegi, Aitor Soroa, Gorka Azkune, Eneko Agirre

The interaction of conversational systems with users poses an exciting opportunity for improving them after deployment, but little evidence has been provided of its feasibility.

Conversational Question Answering Document Classification

Length-Adaptive Transformer: Train Once with Length Drop, Use Anytime with Search

1 code implementation ACL 2021 Gyuwan Kim, Kyunghyun Cho

We then conduct a multi-objective evolutionary search to find a length configuration that maximizes the accuracy and minimizes the efficiency metric under any given computational budget.

Classification Question Answering +1

Reducing false-positive biopsies with deep neural networks that utilize local and global information in screening mammograms

no code implementations19 Sep 2020 Nan Wu, Zhe Huang, Yiqiu Shen, Jungkyu Park, Jason Phang, Taro Makino, S. Gene Kim, Kyunghyun Cho, Laura Heacock, Linda Moy, Krzysztof J. Geras

Breast cancer is the most common cancer in women, and hundreds of thousands of unnecessary biopsies are done around the world at a tremendous cost.

Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes' Rule

no code implementations ICLR 2021 Shuhei Kurita, Kyunghyun Cho

Vision-and-language navigation (VLN) is a task in which an agent is embodied in a realistic 3D environment and follows an instruction to reach the goal node.

Language Modelling Vision and Language Navigation

Iterative Refinement in the Continuous Space for Non-Autoregressive Neural Machine Translation

1 code implementation EMNLP 2020 Jason Lee, Raphael Shu, Kyunghyun Cho

Given a continuous latent variable model for machine translation (Shu et al., 2020), we train an inference network to approximate the gradient of the marginal log probability of the target sentence, using only the latent variable as input.

Machine Translation +1

Evaluating representations by the complexity of learning low-loss predictors

1 code implementation15 Sep 2020 William F. Whitney, Min Jae Song, David Brandfonbrener, Jaan Altosaar, Kyunghyun Cho

We consider the problem of evaluating representations of data for use in solving a downstream task.

A Framework For Contrastive Self-Supervised Learning And Designing A New Approach

1 code implementation31 Aug 2020 William Falcon, Kyunghyun Cho

Contrastive self-supervised learning (CSL) is an approach to learn useful representations by solving a pretext task that selects and compares anchor, negative and positive (APN) features from an unlabeled dataset.

Data Augmentation Image Classification +1

AdapterHub: A Framework for Adapting Transformers

4 code implementations EMNLP 2020 Jonas Pfeiffer, Andreas Rücklé, Clifton Poth, Aishwarya Kamath, Ivan Vulić, Sebastian Ruder, Kyunghyun Cho, Iryna Gurevych

We propose AdapterHub, a framework that allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages.

Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset

1 code implementation EMNLP (sdp) 2020 Edwin Zhang, Nikhil Gupta, Raphael Tang, Xiao Han, Ronak Pradeep, Kuang Lu, Yue Zhang, Rodrigo Nogueira, Kyunghyun Cho, Hui Fang, Jimmy Lin

We present Covidex, a search engine that exploits the latest neural ranking models to provide information access to the COVID-19 Open Research Dataset curated by the Allen Institute for AI.

Compositionality and Capacity in Emergent Languages

no code implementations WS 2020 Abhinav Gupta, Cinjon Resnick, Jakob Foerster, Andrew Dai, Kyunghyun Cho

Our hypothesis is that there should be a specific range of model capacity and channel bandwidth that induces compositional structure in the resulting language and consequently encourages systematic generalization.

Systematic Generalization

Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset

no code implementations ACL 2020 Edwin Zhang, Nikhil Gupta, Rodrigo Nogueira, Kyunghyun Cho, Jimmy Lin

The Neural Covidex is a search engine that exploits the latest neural ranking architectures to provide information access to the COVID-19 Open Research Dataset (CORD-19) curated by the Allen Institute for AI.

Decision Making

MLE-guided parameter search for task loss minimization in neural sequence modeling

1 code implementation4 Jun 2020 Sean Welleck, Kyunghyun Cho

Typical approaches to directly optimizing the task loss such as policy gradient and minimum risk training are based around sampling in the sequence space to obtain candidate update directions that are scored based on the loss of a single sequence.

Machine Translation

AdapterFusion: Non-Destructive Task Composition for Transfer Learning

2 code implementations EACL 2021 Jonas Pfeiffer, Aishwarya Kamath, Andreas Rücklé, Kyunghyun Cho, Iryna Gurevych

We show that by separating the two stages, i.e., knowledge extraction and knowledge composition, the classifier can effectively exploit the representations learned from multiple tasks in a non-destructive manner.

Language Modelling Multi-Task Learning

Learning Non-Monotonic Automatic Post-Editing of Translations from Human Orderings

1 code implementation EAMT 2020 António Góis, Kyunghyun Cho, André Martins

Recent research in neural machine translation has explored flexible generation orders, as an alternative to left-to-right generation.

Automatic Post-Editing Translation

Learning to Learn Morphological Inflection for Resource-Poor Languages

no code implementations28 Apr 2020 Katharina Kann, Samuel R. Bowman, Kyunghyun Cho

We propose to cast the task of morphological inflection - mapping a lemma to an indicated inflected form - for resource-poor languages as a meta-learning problem.

Cross-Lingual Transfer Meta-Learning +1

Rapidly Bootstrapping a Question Answering Dataset for COVID-19

1 code implementation23 Apr 2020 Raphael Tang, Rodrigo Nogueira, Edwin Zhang, Nikhil Gupta, Phuong Cam, Kyunghyun Cho, Jimmy Lin

We present CovidQA, the beginnings of a question answering dataset specifically designed for COVID-19, built by hand from knowledge gathered from Kaggle's COVID-19 Open Research Dataset Challenge.

Question Answering

Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned

1 code implementation10 Apr 2020 Edwin Zhang, Nikhil Gupta, Rodrigo Nogueira, Kyunghyun Cho, Jimmy Lin

We present the Neural Covidex, a search engine that exploits the latest neural ranking architectures to provide information access to the COVID-19 Open Research Dataset curated by the Allen Institute for AI.

Decision Making

Asking and Answering Questions to Evaluate the Factual Consistency of Summaries

2 code implementations ACL 2020 Alex Wang, Kyunghyun Cho, Mike Lewis

QAGS is based on the intuition that if we ask questions about a summary and its source, we will receive similar answers if the summary is factually consistent with the source.

Abstractive Text Summarization

Understanding the robustness of deep neural network classifiers for breast cancer screening

no code implementations23 Mar 2020 Witold Oleszkiewicz, Taro Makino, Stanisław Jastrzębski, Tomasz Trzciński, Linda Moy, Kyunghyun Cho, Laura Heacock, Krzysztof J. Geras

Deep neural networks (DNNs) show promise in breast cancer screening, but their robustness to input perturbations must be better understood before they can be clinically implemented.

Unsupervised Question Decomposition for Question Answering

2 code implementations EMNLP 2020 Ethan Perez, Patrick Lewis, Wen-tau Yih, Kyunghyun Cho, Douwe Kiela

We aim to improve question answering (QA) by decomposing hard questions into simpler sub-questions that existing QA systems are capable of answering.

Question Answering

The Break-Even Point on Optimization Trajectories of Deep Neural Networks

no code implementations ICLR 2020 Stanislaw Jastrzebski, Maciej Szymczak, Stanislav Fort, Devansh Arpit, Jacek Tabor, Kyunghyun Cho, Krzysztof Geras

We argue for the existence of the "break-even" point on this trajectory, beyond which the curvature of the loss surface and noise in the gradient are implicitly regularized by SGD.

On the Discrepancy between Density Estimation and Sequence Generation

1 code implementation EMNLP (spnlp) 2020 Jason Lee, Dustin Tran, Orhan Firat, Kyunghyun Cho

In this paper, by comparing several density estimators on five machine translation tasks, we find that the correlation between rankings of models based on log-likelihood and BLEU varies significantly depending on the range of the model families being compared.

Density Estimation Machine Translation +2

Consistency of a Recurrent Language Model With Respect to Incomplete Decoding

1 code implementation EMNLP 2020 Sean Welleck, Ilia Kulikov, Jaedeok Kim, Richard Yuanzhe Pang, Kyunghyun Cho

Despite strong performance on a variety of tasks, neural sequence models trained with maximum likelihood have been shown to exhibit issues such as length bias and degenerate repetition.

Language Modelling

Navigation-Based Candidate Expansion and Pretrained Language Models for Citation Recommendation

no code implementations23 Jan 2020 Rodrigo Nogueira, Zhiying Jiang, Kyunghyun Cho, Jimmy Lin

Citation recommendation systems for the scientific literature, to help authors find papers that should be cited, have the potential to speed up discoveries and uncover new routes for scientific exploration.

Citation Recommendation Domain Adaptation +3

Don't Say That! Making Inconsistent Dialogue Unlikely with Unlikelihood Training

1 code implementation ACL 2020 Margaret Li, Stephen Roller, Ilia Kulikov, Sean Welleck, Y-Lan Boureau, Kyunghyun Cho, Jason Weston

Generative dialogue models currently suffer from a number of problems which standard maximum likelihood training does not address.

Neural Unsupervised Parsing Beyond English

no code implementations WS 2019 Katharina Kann, Anhad Mohananey, Samuel R. Bowman, Kyunghyun Cho

Recently, neural network models which automatically infer syntactic structure from raw text have started to achieve promising results.

Finding Generalizable Evidence by Learning to Convince Q&A Models

no code implementations IJCNLP 2019 Ethan Perez, Siddharth Karamcheti, Rob Fergus, Jason Weston, Douwe Kiela, Kyunghyun Cho

We propose a system that finds the strongest supporting evidence for a given answer to a question, using passage-based question-answering (QA) as a testbed.

Question Answering

Multi-Stage Document Ranking with BERT

2 code implementations31 Oct 2019 Rodrigo Nogueira, Wei Yang, Kyunghyun Cho, Jimmy Lin

The advent of deep neural networks pre-trained via language modeling tasks has spurred a number of successful applications in natural language processing.

Document Ranking Language Modelling

Capacity, Bandwidth, and Compositionality in Emergent Language Learning

1 code implementation24 Oct 2019 Cinjon Resnick, Abhinav Gupta, Jakob Foerster, Andrew M. Dai, Kyunghyun Cho

In this paper, we investigate the learning biases that affect the efficacy and compositionality of emergent languages.

Systematic Generalization

Analyzing the Forgetting Problem in the Pretrain-Finetuning of Dialogue Response Models

no code implementations16 Oct 2019 Tianxing He, Jun Liu, Kyunghyun Cho, Myle Ott, Bing Liu, James Glass, Fuchun Peng

We find that mix-review effectively regularizes the finetuning process, and the forgetting problem is alleviated to some extent.

Response Generation Text Generation +1

Generalized Inner Loop Meta-Learning

3 code implementations3 Oct 2019 Edward Grefenstette, Brandon Amos, Denis Yarats, Phu Mon Htut, Artem Molchanov, Franziska Meier, Douwe Kiela, Kyunghyun Cho, Soumith Chintala

Many (but not all) approaches self-qualifying as "meta-learning" in deep learning and reinforcement learning fit a common pattern of approximating the solution to a nested optimization problem.

Meta-Learning reinforcement-learning

Inducing Constituency Trees through Neural Machine Translation

no code implementations22 Sep 2019 Phu Mon Htut, Kyunghyun Cho, Samuel R. Bowman

Latent tree learning (LTL) methods learn to parse sentences using only indirect supervision from a downstream task.

Language Modelling Machine Translation +1

Finding Generalizable Evidence by Learning to Convince Q&A Models

1 code implementation12 Sep 2019 Ethan Perez, Siddharth Karamcheti, Rob Fergus, Jason Weston, Douwe Kiela, Kyunghyun Cho

We propose a system that finds the strongest supporting evidence for a given answer to a question, using passage-based question-answering (QA) as a testbed.

Question Answering

Countering Language Drift via Visual Grounding

no code implementations IJCNLP 2019 Jason Lee, Kyunghyun Cho, Douwe Kiela

Emergent multi-agent communication protocols are very different from natural language and not easily interpretable by humans.

Language Modelling Translation +1

Neural Machine Translation with Byte-Level Subwords

1 code implementation7 Sep 2019 Changhan Wang, Kyunghyun Cho, Jiatao Gu

Representing text at the level of bytes and using the 256 byte set as vocabulary is a potential solution to this issue.

Machine Translation Translation

Towards Realistic Practices In Low-Resource Natural Language Processing: The Development Set

no code implementations IJCNLP 2019 Katharina Kann, Kyunghyun Cho, Samuel R. Bowman

Here, we aim to answer the following questions: Does using a development set for early stopping in the low-resource setting influence results as compared to a more realistic alternative, where the number of training epochs is tuned on development languages?

Dynamics-aware Embeddings

2 code implementations ICLR 2020 William Whitney, Rajat Agarwal, Kyunghyun Cho, Abhinav Gupta

In this paper we consider self-supervised representation learning to improve sample efficiency in reinforcement learning (RL).

Continuous Control reinforcement-learning +1

Latent-Variable Non-Autoregressive Neural Machine Translation with Deterministic Inference Using a Delta Posterior

1 code implementation20 Aug 2019 Raphael Shu, Jason Lee, Hideki Nakayama, Kyunghyun Cho

By decoding multiple initial latent variables in parallel and rescoring with a teacher model, the proposed model further brings the gap down to 1.0 BLEU point on the WMT'14 En-De task with a 6.8x speedup.

Machine Translation +1

Neural Text Generation with Unlikelihood Training

2 code implementations ICLR 2020 Sean Welleck, Ilia Kulikov, Stephen Roller, Emily Dinan, Kyunghyun Cho, Jason Weston

Neural text generation is a key tool in natural language applications, but it is well known there are major problems at its core.

Text Generation

Improving localization-based approaches for breast cancer screening exam classification

no code implementations1 Aug 2019 Thibault Févry, Jason Phang, Nan Wu, S. Gene Kim, Linda Moy, Kyunghyun Cho, Krzysztof J. Geras

We trained and evaluated a localization-based deep CNN for breast cancer screening exam classification on over 200,000 exams (over 1,000,000 images).

Classification General Classification

Screening Mammogram Classification with Prior Exams

no code implementations30 Jul 2019 Jungkyu Park, Jason Phang, Yiqiu Shen, Nan Wu, S. Gene Kim, Linda Moy, Kyunghyun Cho, Krzysztof J. Geras

Radiologists typically compare a patient's most recent breast cancer screening exam to their previous ones in making informed diagnoses.

Classification General Classification

Can Unconditional Language Models Recover Arbitrary Sentences?

no code implementations NeurIPS 2019 Nishant Subramani, Samuel R. Bowman, Kyunghyun Cho

We then investigate the conditions under which a language model can be made to generate a sentence through the identification of a point in such a space and find that it is possible to recover arbitrary sentences nearly perfectly with language models and representations of moderate size without modifying any model parameters.

Language Modelling Text Classification

A Unified Framework of Online Learning Algorithms for Training Recurrent Neural Networks

no code implementations5 Jul 2019 Owen Marschall, Kyunghyun Cho, Cristina Savin

We present a framework for compactly summarizing many recent results in efficient and/or biologically plausible online training of recurrent neural networks (RNN).

online learning

Generating Diverse Translations with Sentence Codes

no code implementations ACL 2019 Raphael Shu, Hideki Nakayama, Kyunghyun Cho

In this work, we attempt to obtain diverse translations by using sentence codes to condition the sentence generation.

Machine Translation Translation

Deep Unsupervised Drum Transcription

2 code implementations9 Jun 2019 Keunwoo Choi, Kyunghyun Cho

We introduce DrummerNet, a drum transcription system that is trained in an unsupervised manner.

Sound Audio and Speech Processing

Improved Zero-shot Neural Machine Translation via Ignoring Spurious Correlations

no code implementations ACL 2019 Jiatao Gu, Yong Wang, Kyunghyun Cho, Victor O. K. Li

Zero-shot translation, translating between language pairs on which a Neural Machine Translation (NMT) system has never been trained, is an emergent property when training the system in multilingual settings.

Machine Translation Translation

Multi-Turn Beam Search for Neural Dialogue Modeling

1 code implementation1 Jun 2019 Ilia Kulikov, Jason Lee, Kyunghyun Cho

We propose a novel approach for conversation-level inference by explicitly modeling the dialogue partner and running beam search across multiple conversation turns.

A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models

1 code implementation29 May 2019 Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho

We investigate this problem by proposing a generalized model of sequence generation that unifies decoding in directed and undirected models.

Machine Translation +4

Using local plasticity rules to train recurrent neural networks

no code implementations28 May 2019 Owen Marschall, Kyunghyun Cho, Cristina Savin

To learn useful dynamics on long time scales, neurons must use plasticity rules that account for long-term, circuit-wide effects of synaptic changes.

Sequential Graph Dependency Parser

no code implementations RANLP 2019 Sean Welleck, Kyunghyun Cho

We propose a method for non-projective dependency parsing by incrementally predicting a set of edges.

Dependency Parsing

Task-Driven Data Verification via Gradient Descent

no code implementations14 May 2019 Siavash Golkar, Kyunghyun Cho

We introduce a novel algorithm for detecting possible sample corruption, such as mislabeled samples, in a training dataset given a small clean validation set.

Gradient-based learning for F-measure and other performance metrics

no code implementations ICLR 2019 Yu Gai, Zheng Zhang, Kyunghyun Cho

Many important classification performance metrics, e.g. $F$-measure, are non-differentiable and non-decomposable, and are thus unfriendly to gradient descent algorithms.

General Classification

Backplay: 'Man muss immer umkehren'

no code implementations ICLR 2019 Cinjon Resnick, Roberta Raileanu, Sanyam Kapoor, Alexander Peysakhovich, Kyunghyun Cho, Joan Bruna

Our contributions are that we analytically characterize the types of environments where Backplay can improve training speed, demonstrate the effectiveness of Backplay both in large grid worlds and a complex four player zero-sum game (Pommerman), and show that Backplay compares favorably to other competitive methods known to improve sample efficiency.

Advancing GraphSAGE with A Data-Driven Node Sampling

1 code implementation29 Apr 2019 Jihun Oh, Kyunghyun Cho, Joan Bruna

As an efficient and scalable graph neural network, GraphSAGE has enabled an inductive capability for inferring unseen nodes or graphs by aggregating subsampled local neighborhoods and by learning in a mini-batch gradient descent fashion.

General Classification Node Classification

Document Expansion by Query Prediction

4 code implementations17 Apr 2019 Rodrigo Nogueira, Wei Yang, Jimmy Lin, Kyunghyun Cho

One technique to improve the retrieval effectiveness of a search engine is to expand documents with terms that are related or representative of the documents' content. From the perspective of a question answering system, this might comprise questions the document can potentially answer.

Passage Re-Ranking Question Answering +1

Molecular geometry prediction using a deep generative graph neural network

1 code implementation31 Mar 2019 Elman Mansimov, Omar Mahmood, Seokho Kang, Kyunghyun Cho

Conventional conformation generation methods minimize hand-designed molecular force field energy functions that are often not well correlated with the true energy function of a molecule observed in nature.

Context-Aware Learning for Neural Machine Translation

no code implementations12 Mar 2019 Sébastien Jean, Kyunghyun Cho

By comparing performance using actual and random contexts, we show that a model trained with the proposed algorithm is more sensitive to the additional context.

Machine Translation Translation

Continual Learning via Neural Pruning

1 code implementation11 Mar 2019 Siavash Golkar, Michael Kagan, Kyunghyun Cho

We introduce Continual Learning via Neural Pruning (CLNP), a new method aimed at lifelong learning in fixed capacity models based on neuronal model sparsification.

Continual Learning

Augmentation for small object detection

4 code implementations19 Feb 2019 Mate Kisantal, Zbigniew Wojna, Jakub Murawski, Jacek Naruniec, Kyunghyun Cho

We evaluate different pasting augmentation strategies, and ultimately, we achieve 9.7% relative improvement on the instance segmentation and 7.1% on the object detection of small objects, compared to the current state-of-the-art method on

Instance Segmentation Semantic Segmentation +1
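The pasting augmentation in this entry amounts to oversampling small objects by copying their crops to random locations in the image. A minimal sketch, using nested lists as a stand-in for image arrays and masks (toy data, illustrative function names):

```python
import random

def paste_object(image, obj, top, left):
    """Paste a small object crop into a 2D image (list of lists), in place."""
    for r, row in enumerate(obj):
        for c, val in enumerate(row):
            image[top + r][left + c] = val

def augment_with_copies(image, obj, n_copies, rng=None):
    """Copy-paste augmentation: paste the same small-object crop at
    several random positions to oversample small objects."""
    rng = rng or random.Random(0)
    h, w = len(image), len(image[0])
    oh, ow = len(obj), len(obj[0])
    for _ in range(n_copies):
        top = rng.randrange(h - oh + 1)
        left = rng.randrange(w - ow + 1)
        paste_object(image, obj, top, left)
    return image

img = [[0] * 8 for _ in range(8)]
patch = [[1, 1], [1, 1]]
augment_with_copies(img, patch, n_copies=3)
```

The paper compares several such pasting strategies (position, overlap handling, number of copies); this sketch shows only the basic mechanism.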

Non-Monotonic Sequential Text Generation

1 code implementation WS 2019 Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho

Standard sequential generation methods assume a pre-specified generation order, such as text generation methods which generate words from left to right.

Imitation Learning Text Generation

Insertion-based Decoding with automatically Inferred Generation Order

no code implementations TACL 2019 Jiatao Gu, Qi Liu, Kyunghyun Cho

Conventional neural autoregressive decoding commonly assumes a fixed left-to-right generation order, which may be sub-optimal.

Code Generation Machine Translation +1

Emergent Linguistic Phenomena in Multi-Agent Communication Games

1 code implementation IJCNLP 2019 Laura Graesser, Kyunghyun Cho, Douwe Kiela

In this work, we propose a computational framework in which agents equipped with communication capabilities simultaneously play a series of referential games, where agents are trained using deep reinforcement learning.

reinforcement-learning

Passage Re-ranking with BERT

6 code implementations13 Jan 2019 Rodrigo Nogueira, Kyunghyun Cho

Recently, neural models pretrained on a language modeling task, such as ELMo (Peters et al., 2017), OpenAI GPT (Radford et al., 2018), and BERT (Devlin et al., 2018), have achieved impressive results on various natural language processing tasks such as question-answering and natural language inference.

Ranked #2 on Passage Re-Ranking on MS MARCO (using extra training data)

Passage Re-Ranking Passage Retrieval +1
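The re-ranking setup in this entry scores each (query, passage) pair and sorts an initial candidate list by that score. In the paper the scorer is BERT fine-tuned for relevance classification; the token-overlap scorer below is a hypothetical stand-in so the sketch stays self-contained.

```python
def relevance_score(query, passage):
    """Stand-in scorer: the actual system feeds the (query, passage) pair
    through BERT and reads off a relevance probability. Token overlap here."""
    q, p = set(query.lower().split()), set(passage.lower().split())
    return len(q & p) / max(len(q), 1)

def rerank(query, passages, top_k=2):
    # Re-rank an initial candidate list (e.g. retrieved by BM25)
    # by the pairwise relevance score, keeping the top_k passages.
    return sorted(passages, key=lambda p: relevance_score(query, p), reverse=True)[:top_k]

hits = rerank("neural machine translation",
              ["neural machine translation with attention",
               "cooking recipes",
               "statistical machine translation"])
print(hits)
```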

Directional Analysis of Stochastic Gradient Descent via von Mises-Fisher Distributions in Deep learning

no code implementations ICLR 2019 Cheolhyoung Lee, Kyunghyun Cho, Wanmo Kang

We empirically verify our result using deep convolutional networks and observe a higher correlation between the gradient stochasticity and the proposed directional uniformity than that against the gradient norm stochasticity, suggesting that the directional statistics of minibatch gradients is a major factor behind SGD.

Learning with Reflective Likelihoods

no code implementations27 Sep 2018 Adji B. Dieng, Kyunghyun Cho, David M. Blei, Yann Lecun

Furthermore, the reflective likelihood objective prevents posterior collapse when used to train stochastic auto-encoders with amortized inference.

Countering Language Drift via Grounding

no code implementations27 Sep 2018 Jason Lee, Kyunghyun Cho, Douwe Kiela

While reinforcement learning (RL) shows a lot of promise for natural language processing, e.g.

Language Modelling Policy Gradient Methods +2

Jump to better conclusions: SCAN both left and right

1 code implementation WS 2018 Jasmijn Bastings, Marco Baroni, Jason Weston, Kyunghyun Cho, Douwe Kiela

Lake and Baroni (2018) recently introduced the SCAN data set, which consists of simple commands paired with action sequences and is intended to test the strong generalization abilities of recurrent sequence-to-sequence models.

Grammar Induction with Neural Language Models: An Unusual Replication

1 code implementation EMNLP (ACL) 2018 Phu Mon Htut, Kyunghyun Cho, Samuel R. Bowman

A substantial thread of recent work on latent tree learning has attempted to develop neural network models with parse-valued latent variables and train them on non-parsing tasks, in the hope of having them discover interpretable tree structure.

Constituency Parsing Language Modelling

Meta-Learning for Low-Resource Neural Machine Translation

no code implementations EMNLP 2018 Jiatao Gu, Yong Wang, Yun Chen, Kyunghyun Cho, Victor O. K. Li

We frame low-resource translation as a meta-learning problem, and we learn to adapt to low-resource languages based on multilingual high-resource language tasks.

Frame Low-Resource Neural Machine Translation +3

Backplay: "Man muss immer umkehren"

1 code implementation18 Jul 2018 Cinjon Resnick, Roberta Raileanu, Sanyam Kapoor, Alexander Peysakhovich, Kyunghyun Cho, Joan Bruna

Our contributions are that we analytically characterize the types of environments where Backplay can improve training speed, demonstrate the effectiveness of Backplay both in large grid worlds and a complex four player zero-sum game (Pommerman), and show that Backplay compares favorably to other competitive methods known to improve sample efficiency.

Learning Distributed Representations from Reviews for Collaborative Filtering

no code implementations18 Jun 2018 Amjad Almahairi, Kyle Kastner, Kyunghyun Cho, Aaron Courville

However, interestingly, the greater modeling power offered by the recurrent neural network appears to undermine the model's ability to act as a regularizer of the product representations.

Collaborative Filtering Recommendation Systems

Classifier-agnostic saliency map extraction

1 code implementation ICLR 2019 Konrad Zolna, Krzysztof J. Geras, Kyunghyun Cho

To address this problem, we propose classifier-agnostic saliency map extraction, which finds all parts of the image that any classifier could use, not just one given in advance.

General Classification

Conditional molecular design with deep generative models

4 code implementations30 Apr 2018 Seokho Kang, Kyunghyun Cho

Although machine learning has been successfully used to propose novel molecules that satisfy desired properties, it is still challenging to explore a large chemical space efficiently.

Dynamic Meta-Embeddings for Improved Sentence Representations

3 code implementations EMNLP 2018 Douwe Kiela, Changhan Wang, Kyunghyun Cho

While one of the first steps in many NLP systems is selecting what pre-trained word embeddings to use, we argue that such a step is better left for neural networks to figure out by themselves.

Word Embeddings
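The core of the meta-embedding idea above is letting the network weight several pre-trained embeddings of the same word with softmax attention. A minimal sketch with fixed scores in place of the learned ones (toy two-dimensional embeddings, illustrative names):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def meta_embed(word, embedding_sets, scores):
    """Combine several pre-trained embeddings of one word using softmax
    attention weights. In the paper the scores are learned; they are
    fixed constants here for illustration."""
    weights = softmax(scores)
    vecs = [emb[word] for emb in embedding_sets]
    dim = len(vecs[0])
    return [sum(w * v[d] for w, v in zip(weights, vecs)) for d in range(dim)]

glove_like = {"cat": [1.0, 0.0]}    # hypothetical first embedding set
fasttext_like = {"cat": [0.0, 1.0]} # hypothetical second embedding set
vec = meta_embed("cat", [glove_like, fasttext_like], scores=[0.0, 0.0])
```

With equal scores the weights are uniform, so the result is the plain average of the two embeddings; learned scores let the model favor one source per word or per context.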

A Stable and Effective Learning Strategy for Trainable Greedy Decoding

1 code implementation EMNLP 2018 Yun Chen, Victor O. K. Li, Kyunghyun Cho, Samuel R. Bowman

Beam search is a widely used approximate search strategy for neural network decoders, and it generally outperforms simple greedy decoding on tasks like machine translation.

Machine Translation Translation
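The greedy-vs-beam contrast in this entry is easy to see on a toy model: greedy decoding commits to the locally best token and can miss a higher-probability sequence that beam search keeps alive. The per-prefix distributions below are made-up numbers chosen to show the gap, not anything from the paper.

```python
import math

# Toy next-token distributions: prefix -> {token: prob}. "</s>" ends a sequence.
MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "</s>": 0.45},
    ("b",): {"</s>": 0.95, "x": 0.05},
    ("a", "x"): {"</s>": 1.0},
    ("b", "x"): {"</s>": 1.0},
}

def greedy_decode(max_len=3):
    """Pick the single most probable token at each step."""
    seq = ()
    for _ in range(max_len):
        tok = max(MODEL[seq], key=MODEL[seq].get)
        seq += (tok,)
        if tok == "</s>":
            break
    return seq

def beam_decode(beam_size=2, max_len=3):
    """Keep the beam_size highest-scoring partial sequences at each step."""
    beams = [((), 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(max_len):
        cands = []
        for seq, lp in beams:
            if seq and seq[-1] == "</s>":
                cands.append((seq, lp))  # finished sequences carry over
                continue
            for tok, p in MODEL[seq].items():
                cands.append((seq + (tok,), lp + math.log(p)))
        beams = sorted(cands, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams[0][0]
```

Here greedy follows "a" (prob 0.6) into a sequence of total probability 0.33, while a beam of size 2 recovers "b </s>" with probability 0.38. The paper's trainable greedy decoding aims to close this gap without the cost of the beam.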

Multi-lingual Common Semantic Space Construction via Cluster-consistent Word Embedding

no code implementations EMNLP 2018 Lifu Huang, Kyunghyun Cho, Boliang Zhang, Heng Ji, Kevin Knight

We construct a multilingual common semantic space based on distributional semantics, where words from multiple languages are projected into a shared space to enable knowledge and resource transfer across languages.

Word Alignment

Fine-Grained Attention Mechanism for Neural Machine Translation

no code implementations30 Mar 2018 Heeyoul Choi, Kyunghyun Cho, Yoshua Bengio

Neural machine translation (NMT) has been a new paradigm in machine translation, and the attention mechanism has become the dominant approach with the state-of-the-art records in many language pairs.

Machine Translation Translation

Retrieval-Augmented Convolutional Neural Networks for Improved Robustness against Adversarial Examples

no code implementations26 Feb 2018 Jake Zhao, Kyunghyun Cho

We propose a retrieval-augmented convolutional network and propose to train it with local mixup, a novel variant of the recently proposed mixup algorithm.
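The mixup ingredient mentioned above interpolates pairs of examples and their labels with a Beta-distributed coefficient. The sketch below shows standard mixup; the paper's "local mixup" variant restricts the pairs to retrieved neighbors, which this toy version does not implement.

```python
import random

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Standard mixup: form a convex combination of two feature vectors
    and their one-hot labels with lam ~ Beta(alpha, alpha)."""
    rng = rng or random.Random(0)
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y

# Mix two toy examples with one-hot labels.
x, y = mixup([0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0])
```

The mixed label `y` is a soft distribution over the two classes, which is what makes mixup act as a regularizer.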

Boundary Seeking GANs

no code implementations ICLR 2018 R. Devon Hjelm, Athul Paul Jacob, Adam Trischler, Gerry Che, Kyunghyun Cho, Yoshua Bengio

We introduce a method for training GANs with discrete data that uses the estimated difference measure from the discriminator to compute importance weights for generated samples, thus providing a policy gradient for training the generator.

Scene Understanding Text Generation

Simple Nearest Neighbor Policy Method for Continuous Control Tasks

no code implementations ICLR 2018 Elman Mansimov, Kyunghyun Cho

As this policy does not require any optimization, it allows us to investigate the underlying difficulty of a task without being distracted by optimization difficulty of a learning algorithm.

Continuous Control

Loss Functions for Multiset Prediction

no code implementations ICLR 2018 Sean Welleck, Zixin Yao, Yu Gai, Jialin Mao, Zheng Zhang, Kyunghyun Cho

In this paper, we propose a novel multiset loss function by viewing this problem from the perspective of sequential decision making.

Decision Making reinforcement-learning
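The sequential view in this entry can be sketched as a per-step loss: at each prediction step the target is uniform over the items not yet predicted, and the model's distribution is scored against it. This is a simplified reading of the paper's multiset loss, with toy numbers, not a faithful reimplementation.

```python
import math

def multiset_step_loss(probs, remaining):
    """One step of a sequential multiset loss (sketch): cross-entropy
    between a uniform target over the remaining items and the model's
    predicted distribution over all classes."""
    t = 1.0 / len(remaining)
    return -sum(t * math.log(probs[i]) for i in remaining)

# Model distribution over 3 classes; items {0, 2} remain to be predicted.
probs = [0.5, 0.1, 0.4]
loss = multiset_step_loss(probs, remaining=[0, 2])
```

Summing this step loss over the prediction sequence, with predicted items removed from `remaining` after each step, gives the sequential training objective.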

Unsupervised Neural Machine Translation

2 code implementations ICLR 2018 Mikel Artetxe, Gorka Labaka, Eneko Agirre, Kyunghyun Cho

In spite of the recent success of neural machine translation (NMT) in standard benchmarks, the lack of large parallel corpora poses a major practical problem for many language pairs.

Translation Unsupervised Machine Translation

Emergent Translation in Multi-Agent Communication

no code implementations ICLR 2018 Jason Lee, Kyunghyun Cho, Jason Weston, Douwe Kiela

While most machine translation systems to date are trained on large parallel corpora, humans learn language in a different way: by being grounded in an environment and interacting with other humans.

Machine Translation Translation

Graph Convolutional Networks for Classification with a Structured Label Space

no code implementations12 Oct 2017 Meihao Chen, Zhuoru Lin, Kyunghyun Cho

It is common practice to ignore any structural information underlying classes in multi-class classification.

Classification Document Classification +3

Attention-based Mixture Density Recurrent Networks for History-based Recommendation

no code implementations22 Sep 2017 Tian Wang, Kyunghyun Cho

The goal of personalized history-based recommendation is to automatically output a distribution over all the items given a sequence of previous purchases of a user.

A Tutorial on Deep Learning for Music Information Retrieval

2 code implementations13 Sep 2017 Keunwoo Choi, György Fazekas, Kyunghyun Cho, Mark Sandler

Following their success in Computer Vision and other areas, deep learning techniques have recently become widely adopted in Music Information Retrieval (MIR) research.

Information Retrieval Music Information Retrieval

A Comparison of Audio Signal Preprocessing Methods for Deep Neural Networks on Music Tagging

1 code implementation6 Sep 2017 Keunwoo Choi, György Fazekas, Kyunghyun Cho, Mark Sandler

In this paper, we empirically investigate the effect of audio preprocessing on music tagging with deep neural networks.

Music Tagging

Strawman: an Ensemble of Deep Bag-of-Ngrams for Sentiment Analysis

1 code implementation WS 2017 Kyunghyun Cho

This paper describes a builder entry, named "strawman", to the sentence-level sentiment analysis task of the "Build It, Break It" shared task of the First Workshop on Building Linguistically Generalizable NLP Systems.

Sentiment Analysis

Zero-Shot Transfer Learning for Event Extraction

1 code implementation ACL 2018 Lifu Huang, Heng Ji, Kyunghyun Cho, Clare R. Voss

Most previous event extraction studies have relied heavily on features derived from annotated event mentions, thus cannot be applied to new event types without annotation effort.

Event Extraction Transfer Learning

Emergent Communication in a Multi-Modal, Multi-Step Referential Game

1 code implementation ICLR 2018 Katrina Evtimova, Andrew Drozdov, Douwe Kiela, Kyunghyun Cho

Inspired by previous work on emergent communication in referential games, we propose a novel multi-modal, multi-step referential game, where the sender and receiver have access to distinct modalities of an object, and their information exchange is bidirectional and of arbitrary duration.

Search Engine Guided Non-Parametric Neural Machine Translation

no code implementations20 May 2017 Jiatao Gu, Yong Wang, Kyunghyun Cho, Victor O. K. Li

In this paper, we extend an attention-based neural machine translation (NMT) model by allowing it to access an entire training set of parallel sentence pairs even after training.

Machine Translation Translation

Segmentation of the Proximal Femur from MR Images using Deep Convolutional Neural Networks

no code implementations20 Apr 2017 Cem M. Deniz, Siyuan Xiang, Spencer Hallyburton, Arakua Welbeck, James S. Babb, Stephen Honig, Kyunghyun Cho, Gregory Chang

However, manual segmentation of MR images of bone is time-consuming, limiting the use of MRI measurements in clinical practice.

Does Neural Machine Translation Benefit from Larger Context?

no code implementations17 Apr 2017 Sebastien Jean, Stanislas Lauly, Orhan Firat, Kyunghyun Cho

We propose a neural machine translation architecture that models the surrounding text in addition to the source sentence.

Machine Translation Translation

Task-Oriented Query Reformulation with Reinforcement Learning

2 code implementations EMNLP 2017 Rodrigo Nogueira, Kyunghyun Cho

In this work, we introduce a query reformulation system based on a neural network that rewrites a query to maximize the number of relevant documents returned.

reinforcement-learning

Transfer learning for music classification and regression tasks

3 code implementations27 Mar 2017 Keunwoo Choi, György Fazekas, Mark Sandler, Kyunghyun Cho

In this paper, we present a transfer learning approach for music classification and regression tasks.

Classification General Classification +3

Boundary-Seeking Generative Adversarial Networks

5 code implementations27 Feb 2017 R. Devon Hjelm, Athul Paul Jacob, Tong Che, Adam Trischler, Kyunghyun Cho, Yoshua Bengio

We introduce a method for training GANs with discrete data that uses the estimated difference measure from the discriminator to compute importance weights for generated samples, thus providing a policy gradient for training the generator.

Scene Understanding Text Generation

Trainable Greedy Decoding for Neural Machine Translation

1 code implementation EMNLP 2017 Jiatao Gu, Kyunghyun Cho, Victor O. K. Li

Instead of trying to build a new decoding algorithm for any specific decoding objective, we propose the idea of trainable decoding algorithm in which we train a decoding algorithm to find a translation that maximizes an arbitrary decoding objective.

Machine Translation Translation

QCD-Aware Recursive Neural Networks for Jet Physics

4 code implementations2 Feb 2017 Gilles Louppe, Kyunghyun Cho, Cyril Becot, Kyle Cranmer

Recent progress in applying machine learning for jet physics has been built upon an analogy between calorimeters and images.