Search Results for author: Kyunghyun Cho

Found 313 papers, 149 papers with code

Log-Linear Reformulation of the Noisy Channel Model for Document-Level Neural Machine Translation

no code implementations EMNLP (spnlp) 2020 Sébastien Jean, Kyunghyun Cho

We seek to maximally use various data sources, such as parallel and monolingual data, to build an effective and efficient document-level translation system.

Language Modelling +3

Black Box Causal Inference: Effect Estimation via Meta Prediction

no code implementations 7 Mar 2025 Lucius E. J. Bynum, Aahlad Manas Puli, Diego Herrero-Quevedo, Nhi Nguyen, Carlos Fernandez-Granda, Kyunghyun Cho, Rajesh Ranganath

Causal inference and the estimation of causal effects play a central role in decision-making across many areas, including healthcare and economics.

Causal Inference Decision Making +1

An Overview of Large Language Models for Statisticians

no code implementations 25 Feb 2025 Wenlong Ji, Weizhe Yuan, Emily Getzen, Kyunghyun Cho, Michael I. Jordan, Song Mei, Jason E Weston, Weijie J. Su, Jing Xu, Linjun Zhang

Large Language Models (LLMs) have emerged as transformative tools in artificial intelligence (AI), exhibiting remarkable capabilities across diverse tasks such as text generation, reasoning, and decision-making.

Causal Inference Decision Making +3

NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions

no code implementations 18 Feb 2025 Weizhe Yuan, Jane Yu, Song Jiang, Karthik Padthe, Yang Li, Dong Wang, Ilia Kulikov, Kyunghyun Cho, Yuandong Tian, Jason E Weston, Xian Li

Scaling reasoning capabilities beyond traditional domains such as math and coding is hindered by the lack of diverse and high-quality questions.

Knowledge Distillation Math

Meta-Statistical Learning: Supervised Learning of Statistical Inference

no code implementations 17 Feb 2025 Maxime Peyrard, Kyunghyun Cho

These tasks encompass statistical inference problems such as parameter estimation, hypothesis testing, or mutual information estimation.

Mutual Information Estimation parameter estimation

Cost-Efficient Continual Learning with Sufficient Exemplar Memory

no code implementations 11 Feb 2025 Dongkyu Cho, Taesup Moon, Rumi Chunara, Kyunghyun Cho, Sungmin Cha

Continual learning (CL) research typically assumes highly constrained exemplar memory resources.

Continual Learning

The Geometry of Prompting: Unveiling Distinct Mechanisms of Task Adaptation in Language Models

no code implementations 11 Feb 2025 Artem Kirsanov, Chi-Ning Chou, Kyunghyun Cho, SueYeon Chung

Decoder-only language models have the ability to dynamically switch between various computational tasks based on input prompts.

Decoder In-Context Learning

HERITAGE: An End-to-End Web Platform for Processing Korean Historical Documents in Hanja

1 code implementation 21 Jan 2025 Seyoung Song, Haneul Yoo, Jiho Jin, Kyunghyun Cho, Alice Oh

First, anyone interested in these documents can get a general understanding from the model predictions and the interactive glossary, especially MT outputs in Korean and English.

document understanding Machine Translation +3

How To Think About End-To-End Encryption and AI: Training, Processing, Disclosure, and Consent

no code implementations 28 Dec 2024 Mallory Knodel, Andrés Fábrega, Daniella Ferrari, Jacob Leiken, Betty Li Hou, Derek Yen, Sam de Alfaro, Kyunghyun Cho, Sunoo Park

End-to-end encryption (E2EE) has become the gold standard for securing communications, bringing strong confidentiality and privacy guarantees to billions of users worldwide.

Language Models as Causal Effect Generators

1 code implementation 12 Nov 2024 Lucius E. J. Bynum, Kyunghyun Cho

In particular, we define a procedure for turning any language model and any directed acyclic graph (DAG) into a sequence-driven structural causal model (SD-SCM).

Causal Inference counterfactual +4
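
The construction lends itself to a compact sketch. Below is a minimal, hypothetical illustration of sampling the variables of a DAG in topological order with a language model; `lm_sample` and the prompt templates are placeholder assumptions, not the paper's actual interface.

```python
from graphlib import TopologicalSorter

def sample_observation(dag, prompts, lm_sample):
    """Sample one observation from an SD-SCM-style model (sketch).

    dag: {variable: set of parent variables}
    prompts: {variable: template with a {context} slot}
    lm_sample: hypothetical LM call, prompt -> generated string
    """
    values = {}
    for var in TopologicalSorter(dag).static_order():  # parents come first
        context = ", ".join(f"{p}={values[p]}" for p in dag.get(var, ()))
        values[var] = lm_sample(prompts[var].format(context=context))
    return values
```

An intervention do(X=x) then corresponds to writing `values[var] = x` directly instead of sampling it, which is what makes counterfactual-style queries possible.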

Concept Bottleneck Language Models For protein design

1 code implementation 9 Nov 2024 Aya Abdelsalam Ismail, Tuomas Oikarinen, Amy Wang, Julius Adebayo, Samuel Stanton, Taylor Joren, Joseph Kleinhenz, Allen Goodman, Héctor Corrada Bravo, Kyunghyun Cho, Nathan C. Frey

We introduce Concept Bottleneck Protein Language Models (CB-pLM), a generative masked language model with a layer where each neuron corresponds to an interpretable concept.

Decision Making Drug Discovery +3

Aioli: A Unified Optimization Framework for Language Model Data Mixing

1 code implementation 8 Nov 2024 Mayee F. Chen, Michael Y. Hu, Nicholas Lourie, Kyunghyun Cho, Christopher Ré

Finally, we leverage the insights from our framework to derive a new online method named Aioli, which directly estimates the mixing law parameters throughout training and uses them to dynamically adjust proportions.

Language Modelling +1
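
The dynamic-proportion idea can be pictured with a tiny update rule. The sketch below is a generic exponentiated-gradient step on the simplex of mixing proportions; the `benefit` vector stands in for the mixing-law quantities Aioli actually estimates during training (an assumption for illustration, not the paper's estimator).

```python
import numpy as np

def update_proportions(props, benefit, lr=0.1):
    # Exponentiated-gradient step on the simplex of data-mixing proportions:
    # domains with higher estimated benefit get up-weighted, and the result
    # stays a valid probability vector.
    w = np.asarray(props) * np.exp(lr * np.asarray(benefit))
    return w / w.sum()
```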

When Does Classical Chinese Help? Quantifying Cross-Lingual Transfer in Hanja and Kanbun

1 code implementation 7 Nov 2024 Seyoung Song, Haneul Yoo, Jiho Jin, Kyunghyun Cho, Alice Oh

Historical and linguistic connections within the Sinosphere have led researchers to use Classical Chinese resources for cross-lingual transfer when processing historical documents from Korea and Japan.

Cross-Lingual Transfer Language Modeling +6

Semiparametric conformal prediction

no code implementations 4 Nov 2024 Ji Won Park, Robert Tibshirani, Kyunghyun Cho

Many risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables, for which the prediction algorithm may report correlated errors.

Conformal Prediction Prediction
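
For context, the basic split conformal recipe that such work builds on fits in a few lines. This sketch handles a single scalar target and is only the standard baseline, not the paper's semiparametric method for multiple correlated targets.

```python
import numpy as np

def split_conformal_interval(cal_pred, cal_true, test_pred, alpha=0.1):
    scores = np.abs(cal_true - cal_pred)            # nonconformity scores
    n = len(scores)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q = np.quantile(scores, level, method="higher")
    return test_pred - q, test_pred + q             # marginal 1 - alpha coverage
```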

Generalists vs. Specialists: Evaluating LLMs on Highly-Constrained Biophysical Sequence Optimization Tasks

no code implementations 29 Oct 2024 Angelica Chen, Samuel D. Stanton, Frances Ding, Robert G. Alberstein, Andrew M. Watkins, Richard Bonneau, Vladimir Gligorijević, Kyunghyun Cho, Nathan C. Frey

When combined with a novel preference learning loss, we find LLOME can not only learn to solve some Ehrlich functions, but can even outperform LaMBO-2 on moderately difficult Ehrlich variants.

Bilevel Optimization Model Optimization

Generalizing to any diverse distribution: uniformity, gentle finetuning and rebalancing

no code implementations 8 Oct 2024 Andreas Loukas, Karolis Martinkus, Ed Wagstaff, Kyunghyun Cho

Various approaches like domain adaptation, domain generalization, and robust optimization attempt to address the out-of-distribution challenge by posing assumptions about the relation between training and test distribution.

Domain Generalization

Using Deep Autoregressive Models as Causal Inference Engines

no code implementations 27 Sep 2024 Daniel Jiwoong Im, Kevin Zhang, Nakul Verma, Kyunghyun Cho

We propose an autoregressive (AR) CI framework capable of handling complex confounders and sequential actions common in modern applications.

Causal Inference

On the design space between molecular mechanics and machine learning force fields

no code implementations 3 Sep 2024 Yuanqing Wang, Kenichiro Takaba, Michael S. Chen, Marcus Wieder, Yuzhi Xu, Tong Zhu, John Z. H. Zhang, Arnav Nagle, Kuang Yu, Xinyan Wang, Daniel J. Cole, Joshua A. Rackers, Kyunghyun Cho, Joe G. Greener, Peter Eastman, Stefano Martiniani, Mark E. Tuckerman

A force field as accurate as quantum mechanics (QM) and as fast as molecular mechanics (MM), with which one can simulate a biomolecular system efficiently enough and meaningfully enough to get quantitative insights, is among the most ardent dreams of biophysicists -- a dream, nevertheless, not to be fulfilled any time soon.

Targeted Cause Discovery with Data-Driven Learning

1 code implementation 29 Aug 2024 Jang-Hyun Kim, Claudia Skok Gibbs, Sangdoo Yun, Hyun Oh Song, Kyunghyun Cho

We propose a novel machine learning approach for inferring causal variables of a target variable from observations.

Causal Discovery

Non-convolutional Graph Neural Networks

1 code implementation 31 Jul 2024 Yuanqing Wang, Kyunghyun Cho

Rethink convolution-based graph neural networks (GNN) -- they characteristically suffer from limited expressiveness, over-smoothing, and over-squashing, and require specialized sparse kernels for efficient computation.

Graph Learning

Closed-Form Test Functions for Biophysical Sequence Optimization Algorithms

2 code implementations 28 Jun 2024 Samuel Stanton, Robert Alberstein, Nathan Frey, Andrew Watkins, Kyunghyun Cho

There is a growing body of work seeking to replicate the success of machine learning (ML) on domains like computer vision (CV) and natural language processing (NLP) to applications involving biophysical data.

Following Length Constraints in Instructions

no code implementations 25 Jun 2024 Weizhe Yuan, Ilia Kulikov, Ping Yu, Kyunghyun Cho, Sainbayar Sukhbaatar, Jason Weston, Jing Xu

Aligned instruction following models can better fulfill user requests than their unaligned counterparts.

Instruction Following

Training Greedy Policy for Proposal Batch Selection in Expensive Multi-Objective Combinatorial Optimization

1 code implementation 21 Jun 2024 Deokjae Lee, Hyun Oh Song, Kyunghyun Cho

Active learning is increasingly adopted for expensive multi-objective combinatorial optimization problems, but it involves a challenging subset selection problem, optimizing the batch acquisition score that quantifies the goodness of a batch for evaluation.

Active Learning Combinatorial Optimization

A Progressive Risk Formulation for Enhanced Deep Learning based Total Knee Replacement Prediction in Knee Osteoarthritis

no code implementations 14 Jun 2024 Haresh Rengaraj Rajamohan, Richard Kijowski, Kyunghyun Cho, Cem M. Deniz

We developed deep learning models for predicting Total Knee Replacement (TKR) need within various time horizons in knee osteoarthritis patients, with a novel capability: the models can perform TKR prediction using a single scan and, when a previous scan is available, leverage a progressive risk formulation to improve their predictions.

Preference Learning Algorithms Do Not Learn Preference Rankings

no code implementations 29 May 2024 Angelica Chen, Sadhika Malladi, Lily H. Zhang, Xinyi Chen, Qiuyi Zhang, Rajesh Ranganath, Kyunghyun Cho

Preference learning algorithms (e.g., RLHF and DPO) are frequently used to steer LLMs to produce generations that are more preferred by humans, but our understanding of their inner workings is still limited.

Implicitly Guided Design with PropEn: Match your Data to Follow the Gradient

no code implementations 28 May 2024 Nataša Tagasovska, Vladimir Gligorijević, Kyunghyun Cho, Andreas Loukas

Matching, combined with an encoder-decoder architecture, forms a domain-agnostic generative framework for property enhancement.

Decoder Protein Design

Jointly Modeling Inter- & Intra-Modality Dependencies for Multi-modal Learning

1 code implementation 27 May 2024 Divyam Madaan, Taro Makino, Sumit Chopra, Kyunghyun Cho

Previous studies in this field have concentrated on capturing in isolation either the inter-modality dependencies (the relationships between different modalities and the label) or the intra-modality dependencies (the relationships within a single modality and the label).

A Brief Introduction to Causal Inference in Machine Learning

1 code implementation 14 May 2024 Kyunghyun Cho

This is a lecture note produced for DS-GA 3001.003 "Special Topics in DS - Causal Inference in Machine Learning" at the Center for Data Science, New York University in Spring 2024.

Causal Inference Out-of-Distribution Generalization

MR-Transformer: Vision Transformer for Total Knee Replacement Prediction Using Magnetic Resonance Imaging

no code implementations 5 May 2024 Chaojie Zhang, Shengjia Chen, Ozkan Cigdem, Haresh Rengaraj Rajamohan, Kyunghyun Cho, Richard Kijowski, Cem M. Deniz

A transformer-based deep learning model, MR-Transformer, was developed for total knee replacement (TKR) prediction using magnetic resonance imaging (MRI).

Deep Learning Prediction

Iterative Reasoning Preference Optimization

no code implementations 30 Apr 2024 Richard Yuanzhe Pang, Weizhe Yuan, Kyunghyun Cho, He He, Sainbayar Sukhbaatar, Jason Weston

Iterative preference optimization methods have recently been shown to perform well for general instruction tuning tasks, but typically make little improvement on reasoning tasks (Yuan et al., 2024, Chen et al., 2024).

ARC GSM8K +1

Estimation of Time-to-Total Knee Replacement Surgery

no code implementations 29 Apr 2024 Ozkan Cigdem, Shengjia Chen, Chaojie Zhang, Kyunghyun Cho, Richard Kijowski, Cem M. Deniz

A survival analysis model for predicting time-to-total knee replacement (TKR) was developed using features from medical images and clinical measurements.

Deep Learning Survival Analysis

Generalization Measures for Zero-Shot Cross-Lingual Transfer

no code implementations 24 Apr 2024 Saksham Bassi, Duygu Ataman, Kyunghyun Cho

A model's capacity to generalize its knowledge to interpret unseen inputs with different characteristics is crucial to build robust and reliable machine learning systems.

Language Modelling +1

HyperCLOVA X Technical Report

no code implementations 2 Apr 2024 Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, Youngkyun Jin, Hyein Jun, Jaeseung Jung, Chanwoong Kim, jinhong Kim, Jinuk Kim, Dokyeong Lee, Dongwook Park, Jeong Min Sohn, Sujung Han, Jiae Heo, Sungju Hong, Mina Jeon, Hyunhoon Jung, Jungeun Jung, Wangkyo Jung, Chungjoon Kim, Hyeri Kim, Jonghyun Kim, Min Young Kim, Soeun Lee, Joonhee Park, Jieun Shin, Sojin Yang, Jungsoon Yoon, Hwaran Lee, Sanghwan Bae, Jeehwan Cha, Karl Gylleus, Donghoon Ham, Mihak Hong, Youngki Hong, Yunki Hong, Dahyun Jang, Hyojun Jeon, Yujin Jeon, Yeji Jeong, Myunggeun Ji, Yeguk Jin, Chansong Jo, Shinyoung Joo, Seunghwan Jung, Adrian Jungmyung Kim, Byoung Hoon Kim, Hyomin Kim, Jungwhan Kim, Minkyoung Kim, Minseung Kim, Sungdong Kim, Yonghee Kim, Youngjun Kim, Youngkwan Kim, Donghyeon Ko, Dughyun Lee, Ha Young Lee, Jaehong Lee, Jieun Lee, Jonghyun Lee, Jongjin Lee, Min Young Lee, Yehbin Lee, Taehong Min, Yuri Min, Kiyoon Moon, Hyangnam Oh, Jaesun Park, Kyuyon Park, Younghun Park, Hanbae Seo, Seunghyun Seo, Mihyun Sim, Gyubin Son, Matt Yeo, Kyung Hoon Yeom, Wonjoon Yoo, Myungin You, Doheon Ahn, Homin Ahn, Joohee Ahn, Seongmin Ahn, Chanwoo An, Hyeryun An, Junho An, Sang-Min An, Boram Byun, Eunbin Byun, Jongho Cha, Minji Chang, Seunggyu Chang, Haesong Cho, Youngdo Cho, Dalnim Choi, Daseul Choi, Hyoseok Choi, Minseong Choi, Sangho Choi, Seongjae Choi, Wooyong Choi, Sewhan Chun, Dong Young Go, Chiheon Ham, Danbi Han, Jaemin Han, Moonyoung Hong, Sung Bum Hong, Dong-Hyun Hwang, Seongchan Hwang, Jinbae Im, Hyuk Jin Jang, Jaehyung Jang, Jaeni Jang, Sihyeon Jang, Sungwon Jang, Joonha Jeon, Daun Jeong, JoonHyun Jeong, Kyeongseok Jeong, Mini Jeong, Sol Jin, Hanbyeol Jo, Hanju Jo, Minjung Jo, Chaeyoon Jung, Hyungsik Jung, Jaeuk Jung, Ju Hwan Jung, Kwangsun Jung, Seungjae Jung, Soonwon Ka, Donghan Kang, Soyoung Kang, Taeho Kil, Areum Kim, Beomyoung Kim, Byeongwook Kim, Daehee Kim, Dong-Gyun Kim, Donggook Kim, Donghyun Kim, Euna Kim, Eunchul Kim, Geewook Kim, Gyu Ri Kim, Hanbyul Kim, Heesu Kim, Isaac Kim, Jeonghoon Kim, JiHye Kim, Joonghoon Kim, Minjae Kim, Minsub Kim, Pil Hwan Kim, Sammy Kim, Seokhun Kim, Seonghyeon Kim, Soojin Kim, Soong Kim, Soyoon Kim, Sunyoung Kim, TaeHo Kim, Wonho Kim, Yoonsik Kim, You Jin Kim, Yuri Kim, Beomseok Kwon, Ohsung Kwon, Yoo-Hwan Kwon, Anna Lee, Byungwook Lee, Changho Lee, Daun Lee, Dongjae Lee, Ha-Ram Lee, Hodong Lee, Hwiyeong Lee, Hyunmi Lee, Injae Lee, Jaeung Lee, Jeongsang Lee, Jisoo Lee, JongSoo Lee, Joongjae Lee, Juhan Lee, Jung Hyun Lee, Junghoon Lee, Junwoo Lee, Se Yun Lee, Sujin Lee, Sungjae Lee, Sungwoo Lee, Wonjae Lee, Zoo Hyun Lee, Jong Kun Lim, Kun Lim, Taemin Lim, Nuri Na, Jeongyeon Nam, Kyeong-Min Nam, Yeonseog Noh, Biro Oh, Jung-Sik Oh, Solgil Oh, Yeontaek Oh, Boyoun Park, Cheonbok Park, Dongju Park, Hyeonjin Park, Hyun Tae Park, Hyunjung Park, JiHye Park, Jooseok Park, JungHwan Park, Jungsoo Park, Miru Park, Sang Hee Park, Seunghyun Park, Soyoung Park, Taerim Park, Wonkyeong Park, Hyunjoon Ryu, Jeonghun Ryu, Nahyeon Ryu, Soonshin Seo, Suk Min Seo, Yoonjeong Shim, Kyuyong Shin, Wonkwang Shin, Hyun Sim, Woongseob Sim, Hyejin Soh, Bokyong Son, Hyunjun Son, Seulah Son, Chi-Yun Song, Chiyoung Song, Ka Yeon Song, Minchul Song, Seungmin Song, Jisung Wang, Yonggoo Yeo, Myeong Yeon Yi, Moon Bin Yim, Taehwan Yoo, Youngjoon Yoo, Sungmin Yoon, Young Jin Yoon, Hangyeol Yu, Ui Seon Yu, Xingdong Zuo, Jeongin Bae, Joungeun Bae, Hyunsoo Cho, Seonghyun Cho, Yongjin Cho, Taekyoon Choi, Yera Choi, Jiwan Chung, Zhenghui Han, Byeongho Heo, Euisuk Hong, Taebaek Hwang, Seonyeol Im, Sumin Jegal, Sumin Jeon, Yelim Jeong, Yonghyun Jeong, Can Jiang, Juyong Jiang, Jiho Jin, Ara Jo, Younghyun Jo, Hoyoun Jung, Juyoung Jung, Seunghyeong Kang, Dae Hee Kim, Ginam Kim, Hangyeol Kim, Heeseung Kim, Hyojin Kim, Hyojun Kim, Hyun-Ah Kim, Jeehye Kim, Jin-Hwa Kim, Jiseon Kim, Jonghak Kim, Jung Yoon Kim, Rak Yeong Kim, Seongjin Kim, Seoyoon Kim, Sewon Kim, Sooyoung Kim, Sukyoung Kim, Taeyong Kim, Naeun Ko, Bonseung Koo, Heeyoung Kwak, Haena Kwon, Youngjin Kwon, Boram Lee, Bruce W. Lee, Dagyeong Lee, Erin Lee, Euijin Lee, Ha Gyeong Lee, Hyojin Lee, Hyunjeong Lee, Jeeyoon Lee, Jeonghyun Lee, Jongheok Lee, Joonhyung Lee, Junhyuk Lee, Mingu Lee, Nayeon Lee, Sangkyu Lee, Se Young Lee, Seulgi Lee, Seung Jin Lee, Suhyeon Lee, Yeonjae Lee, Yesol Lee, Youngbeom Lee, Yujin Lee, Shaodong Li, Tianyu Liu, Seong-Eun Moon, Taehong Moon, Max-Lasse Nihlenramstroem, Wonseok Oh, Yuri Oh, Hongbeen Park, Hyekyung Park, Jaeho Park, Nohil Park, Sangjin Park, Jiwon Ryu, Miru Ryu, Simo Ryu, Ahreum Seo, Hee Seo, Kangdeok Seo, Jamin Shin, Seungyoun Shin, Heetae Sin, Jiangping Wang, Lei Wang, Ning Xiang, Longxiang Xiao, Jing Xu, Seonyeong Yi, Haanju Yoo, Haneul Yoo, Hwanhee Yoo, Liang Yu, Youngjae Yu, Weijie Yuan, Bo Zeng, Qian Zhou, Kyunghyun Cho, Jung-Woo Ha, Joonsuk Park, Jihyun Hwang, Hyoung Jo Kwon, Soonyong Kwon, Jungyeon Lee, Seungho Lee, Seonghyeon Lim, Hyunkyung Noh, Seungho Choi, Sang-Woo Lee, Jung Hwa Lim, Nako Sung

We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding.

Instruction Following Machine Translation +1

Hyperparameters in Continual Learning: A Reality Check

no code implementations 14 Mar 2024 Sungmin Cha, Kyunghyun Cho

Based on this, we propose a revised two-phase evaluation protocol consisting of a hyperparameter tuning phase and an evaluation phase.

Class Incremental Learning +1

Self-Rewarding Language Models

3 code implementations 18 Jan 2024 Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason Weston

We posit that to achieve superhuman agents, future models require superhuman feedback in order to provide an adequate training signal.

Instruction Following Language Modeling +1

Let's Go Shopping (LGS) -- Web-Scale Image-Text Dataset for Visual Concept Understanding

no code implementations 9 Jan 2024 Yatong Bai, Utsav Garg, Apaar Shanker, Haoming Zhang, Samyak Parajuli, Erhan Bas, Isidora Filipovic, Amelia N. Chu, Eugenia D Fomitcheva, Elliot Branson, Aerin Kim, Somayeh Sojoudi, Kyunghyun Cho

Vision and vision-language applications of neural networks, such as image classification and captioning, rely on large-scale annotated datasets that require non-trivial data-collecting processes.

Image Captioning Image Classification +3

Show Your Work with Confidence: Confidence Bands for Tuning Curves

1 code implementation 16 Nov 2023 Nicholas Lourie, Kyunghyun Cho, He He

We present the first method to construct valid confidence bands for tuning curves.
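
As background, a tuning curve reports the expected best validation score as a function of the hyperparameter search budget. The sketch below computes the standard unbiased point estimate of that curve from n observed trials; the paper's contribution, the confidence bands around such curves, is not reproduced here.

```python
from math import comb

def tuning_curve_point(scores, k):
    """E[best validation score among k random trials], estimated from n trials."""
    s = sorted(scores)  # ascending order statistics
    n = len(s)
    # P(the max of a random size-k subset is the i-th smallest score)
    # equals C(i-1, k-1) / C(n, k).
    return sum(s[i - 1] * comb(i - 1, k - 1) for i in range(k, n + 1)) / comb(n, k)
```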

First Tragedy, then Parse: History Repeats Itself in the New Era of Large Language Models

no code implementations 8 Nov 2023 Naomi Saphra, Eve Fleisig, Kyunghyun Cho, Adam Lopez

Many NLP researchers are experiencing an existential crisis triggered by the astonishing success of ChatGPT and other systems based on large language models (LLMs).

Machine Translation

Multiple Physics Pretraining for Physical Surrogate Models

1 code implementation 4 Oct 2023 Michael McCabe, Bruno Régaldo-Saint Blancard, Liam Holden Parker, Ruben Ohana, Miles Cranmer, Alberto Bietti, Michael Eickenberg, Siavash Golkar, Geraud Krawezik, Francois Lanusse, Mariel Pettee, Tiberiu Tesileanu, Kyunghyun Cho, Shirley Ho

We introduce multiple physics pretraining (MPP), an autoregressive task-agnostic pretraining approach for physical surrogate modeling of spatiotemporal systems with transformers.

AstroCLIP: A Cross-Modal Foundation Model for Galaxies

1 code implementation 4 Oct 2023 Liam Parker, Francois Lanusse, Siavash Golkar, Leopoldo Sarra, Miles Cranmer, Alberto Bietti, Michael Eickenberg, Geraud Krawezik, Michael McCabe, Ruben Ohana, Mariel Pettee, Bruno Regaldo-Saint Blancard, Tiberiu Tesileanu, Kyunghyun Cho, Shirley Ho

These embeddings can then be used - without any model fine-tuning - for a variety of downstream tasks including (1) accurate in-modality and cross-modality semantic similarity search, (2) photometric redshift estimation, (3) galaxy property estimation from both images and spectra, and (4) morphology classification.

Contrastive Learning model +5

Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs

1 code implementation 13 Sep 2023 Angelica Chen, Ravid Shwartz-Ziv, Kyunghyun Cho, Matthew L. Leavitt, Naomi Saphra

Most interpretability research in NLP focuses on understanding the behavior and features of a fully trained model.

Blind Biological Sequence Denoising with Self-Supervised Set Learning

no code implementations 4 Sep 2023 Nathan Ng, Ji Won Park, Jae Hyeon Lee, Ryan Lewis Kelly, Stephen Ra, Kyunghyun Cho

This set embedding represents the "average" of the subreads and can be decoded into a prediction of the clean sequence.

Denoising

Active and Passive Causal Inference Learning

no code implementations 18 Aug 2023 Daniel Jiwoong Im, Kyunghyun Cho

This paper serves as a starting point for machine learning researchers, engineers and students who are interested in but not yet familiar with causal inference.

Causal Identification Causal Inference

Latent State Models of Training Dynamics

1 code implementation 18 Aug 2023 Michael Y. Hu, Angelica Chen, Naomi Saphra, Kyunghyun Cho

We use the HMM representation to study phase transitions and identify latent "detour" states that slow down convergence.

Image Classification Language Modeling +2

Improving Joint Speech-Text Representations Without Alignment

no code implementations 11 Aug 2023 Cal Peyser, Zhong Meng, Ke Hu, Rohit Prabhavalkar, Andrew Rosenberg, Tara N. Sainath, Michael Picheny, Kyunghyun Cho

The last year has seen astonishing progress in text-prompted image generation premised on the idea of a cross-modal representation space in which the text and image domains are represented jointly.

Speech Recognition

Leveraging Implicit Feedback from Deployment Data in Dialogue

no code implementations 26 Jul 2023 Richard Yuanzhe Pang, Stephen Roller, Kyunghyun Cho, He He, Jason Weston

We study improving social conversational agents by learning from natural dialogue between users and a deployed model, without extra annotations.

Making the Most Out of the Limited Context Length: Predictive Power Varies with Clinical Note Type and Note Section

no code implementations 13 Jul 2023 Hongyi Zheng, Yixin Zhu, Lavender Yao Jiang, Kyunghyun Cho, Eric Karl Oermann

Recent advances in large language models have led to renewed interest in natural language processing in healthcare using the free text of clinical notes.

Language Modelling

On Sensitivity and Robustness of Normalization Schemes to Input Distribution Shifts in Automatic MR Image Diagnosis

1 code implementation 23 Jun 2023 Divyam Madaan, Daniel Sodickson, Kyunghyun Cho, Sumit Chopra

However, the image reconstruction process within the MRI pipeline, which requires the use of complex hardware and adjustment of a large number of scanner parameters, is highly susceptible to noise of various forms, resulting in arbitrary artifacts within the images.

Image Reconstruction Medical Image Analysis

Protein Discovery with Discrete Walk-Jump Sampling

1 code implementation 8 Jun 2023 Nathan C. Frey, Daniel Berenberg, Karina Zadorozhny, Joseph Kleinhenz, Julien Lafrance-Vanasse, Isidro Hotzel, Yan Wu, Stephen Ra, Richard Bonneau, Kyunghyun Cho, Andreas Loukas, Vladimir Gligorijevic, Saeed Saremi

We resolve difficulties in training and sampling from a discrete generative model by learning a smoothed energy function, sampling from the smoothed data manifold with Langevin Markov chain Monte Carlo (MCMC), and projecting back to the true data manifold with one-step denoising.

Denoising
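
A minimal sketch of the walk-jump loop, assuming a trained score function `score_fn` for the sigma-smoothed density (an assumption; the paper obtains this for discrete protein sequences, while everything here is generic NumPy):

```python
import numpy as np

def walk_jump_sample(score_fn, y0, sigma, n_steps=100, step=1e-2, rng=None):
    rng = rng or np.random.default_rng()
    y = np.array(y0, dtype=float)
    for _ in range(n_steps):  # walk: Langevin MCMC on the smoothed manifold
        y += step * score_fn(y) + np.sqrt(2 * step) * rng.standard_normal(y.shape)
    return y + sigma**2 * score_fn(y)  # jump: one-step empirical-Bayes denoising
```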

BOtied: Multi-objective Bayesian optimization with tied multivariate ranks

1 code implementation 1 Jun 2023 Ji Won Park, Nataša Tagasovska, Michael Maser, Stephen Ra, Kyunghyun Cho

Motivated by this link, we propose the Pareto-compliant CDF indicator and the associated acquisition function, BOtied.

Bayesian Optimization
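
The CDF view of multivariate ranks is easy to illustrate with an empirical CDF. The sketch below scores observed objective vectors by the fraction of samples they weakly dominate (assuming all objectives are maximized); BOtied itself models the CDF with copulas and wraps it in an acquisition function, which this toy version omits.

```python
import numpy as np

def empirical_cdf_scores(Y):
    """Empirical joint CDF of each row of Y (n points x m objectives)."""
    Y = np.asarray(Y)
    # dominated[i, j] is True when point j is <= point i in every objective.
    dominated = (Y[None, :, :] <= Y[:, None, :]).all(axis=2)
    return dominated.mean(axis=1)  # fraction of samples each point weakly dominates
```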

Two Failures of Self-Consistency in the Multi-Step Reasoning of LLMs

no code implementations 23 May 2023 Angelica Chen, Jason Phang, Alicia Parrish, Vishakh Padmakumar, Chen Zhao, Samuel R. Bowman, Kyunghyun Cho

Large language models (LLMs) have achieved widespread success on a variety of in-context few-shot tasks, but this success is typically evaluated via correctness rather than consistency.

Towards Understanding and Improving GFlowNet Training

1 code implementation 11 May 2023 Max W. Shen, Emmanuel Bengio, Ehsan Hajiramezanali, Andreas Loukas, Kyunghyun Cho, Tommaso Biancalani

We investigate how to learn better flows, and propose (i) prioritized replay training of high-reward $x$, (ii) relative edge flow policy parametrization, and (iii) a novel guided trajectory balance objective, and show how it can solve a substructure credit assignment problem.

A Comparison of Semi-Supervised Learning Techniques for Streaming ASR at Scale

no code implementations 19 Apr 2023 Cal Peyser, Michael Picheny, Kyunghyun Cho, Rohit Prabhavalkar, Ronny Huang, Tara Sainath

Unpaired text and audio injection have emerged as dominant methods for improving ASR performance in the absence of a large labeled corpus.

Decoder

Improving Code Generation by Training with Natural Language Feedback

1 code implementation 28 Mar 2023 Angelica Chen, Jérémy Scheurer, Tomasz Korbak, Jon Ander Campos, Jun Shern Chan, Samuel R. Bowman, Kyunghyun Cho, Ethan Perez

The potential for pre-trained large language models (LLMs) to use natural language feedback at inference time has been an exciting recent development.

Code Generation Imitation Learning +1

Unsupervised Learning of Initialization in Deep Neural Networks via Maximum Mean Discrepancy

no code implementations 8 Feb 2023 Cheolhyoung Lee, Kyunghyun Cho

We first notice that each parameter configuration in the parameter space corresponds to one particular downstream task of d-way classification.

Dual Learning for Large Vocabulary On-Device ASR

no code implementations 11 Jan 2023 Cal Peyser, Ronny Huang, Tara Sainath, Rohit Prabhavalkar, Michael Picheny, Kyunghyun Cho

Dual learning is a paradigm for semi-supervised machine learning that seeks to leverage unsupervised data by solving two opposite tasks at once.

On the Blind Spots of Model-Based Evaluation Metrics for Text Generation

1 code implementation 20 Dec 2022 Tianxing He, Jingyu Zhang, Tianle Wang, Sachin Kumar, Kyunghyun Cho, James Glass, Yulia Tsvetkov

In this work, we explore a useful but often neglected methodology for robustness analysis of text generation evaluation metrics: stress tests with synthetic data.

Text Generation

Can Current Task-oriented Dialogue Models Automate Real-world Scenarios in the Wild?

no code implementations 20 Dec 2022 Sang-Woo Lee, Sungdong Kim, Donghyeon Ko, Donghoon Ham, Youngki Hong, Shin Ah Oh, Hyunhoon Jung, Wangkyo Jung, Kyunghyun Cho, Donghyun Kwak, Hyungsuk Noh, WooMyoung Park

Task-oriented dialogue (TOD) systems are mainly based on the slot-filling-based TOD (SF-TOD) framework, in which dialogues are broken down into smaller, controllable units (i.e., slots) to fulfill a specific task.

Language Modelling +3

Joint Embedding Predictive Architectures Focus on Slow Features

1 code implementation 20 Nov 2022 Vlad Sobal, Jyothir S V, Siddhartha Jalagam, Nicolas Carion, Kyunghyun Cho, Yann Lecun

Many common methods for learning a world model for pixel-based environments use generative architectures trained with pixel-level reconstruction objectives.

Language Model Classifier Aligns Better with Physician Word Sensitivity than XGBoost on Readmission Prediction

1 code implementation 13 Nov 2022 Grace Yang, Ming Cao, Lavender Y. Jiang, Xujin C. Liu, Alexander T. M. Cheung, Hannah Weiss, David Kurland, Kyunghyun Cho, Eric K. Oermann

We assess the sensitivity score on a set of representative words in the test set using two classifiers trained for hospital readmission classification with similar performance statistics.

Decision Making Language Modeling +2

HUE: Pretrained Model and Dataset for Understanding Hanja Documents of Ancient Korea

1 code implementation Findings (NAACL) 2022 Haneul Yoo, Jiho Jin, Juhee Son, JinYeong Bak, Kyunghyun Cho, Alice Oh

Historical records in Korea before the 20th century were primarily written in Hanja, an extinct language based on Chinese characters and not understood by modern Korean or Chinese speakers.

Named Entity Recognition +3

A Non-monotonic Self-terminating Language Model

1 code implementation 3 Oct 2022 Eugene Choi, Kyunghyun Cho, Cheolhyoung Lee

We then propose a non-monotonic self-terminating language model, which significantly relaxes the constraint of monotonically increasing termination probability in the originally proposed self-terminating language model by Welleck et al. (2020), to address the issue of non-terminating sequences when using incomplete probable decoding algorithms.

Language Modelling +2

Towards Disentangled Speech Representations

no code implementations 28 Aug 2022 Cal Peyser, Ronny Huang, Andrew Rosenberg, Tara N. Sainath, Michael Picheny, Kyunghyun Cho

In this paper, we construct a representation learning task based on joint modeling of ASR and TTS, and seek to learn a representation of audio that disentangles that part of the speech signal that is relevant to transcription from that part which is not.

Disentanglement

Predicting Out-of-Domain Generalization with Neighborhood Invariance

no code implementations 5 Jul 2022 Nathan Ng, Neha Hulkund, Kyunghyun Cho, Marzyeh Ghassemi

Developing and deploying machine learning models safely depends on the ability to characterize and compare their abilities to generalize to new environments.

Data Augmentation Domain Generalization +3

Endowing Language Models with Multimodal Knowledge Graph Representations

1 code implementation 27 Jun 2022 Ningyuan Huang, Yash R. Deshpande, Yibo Liu, Houda Alberts, Kyunghyun Cho, Clara Vania, Iacer Calixto

We use the recently released VisualSem KG as our external knowledge repository, which covers a subset of Wikipedia and WordNet entities, and compare a mix of tuple-based and graph-based algorithms to learn entity and relation representations that are grounded on the KG multimodal information.

Multilingual Named Entity Recognition named-entity-recognition +2

Linear Connectivity Reveals Generalization Strategies

1 code implementation 24 May 2022 Jeevesh Juneja, Rachit Bansal, Kyunghyun Cho, João Sedoc, Naomi Saphra

It is widely accepted in the mode connectivity literature that when two neural networks are trained similarly on the same data, they are connected by a path through parameter space over which test set accuracy is maintained.

CoLA Diagnostic +2

Translating Hanja Historical Documents to Contemporary Korean and English

no code implementations 20 May 2022 Juhee Son, Jiho Jin, Haneul Yoo, JinYeong Bak, Kyunghyun Cho, Alice Oh

Built on top of multilingual neural machine translation, H2KE learns to translate a historical document written in Hanja, from both a full dataset of outdated Korean translation and a small dataset of more recently translated contemporary Korean and English.

Machine Translation Translation

Multi-segment preserving sampling for deep manifold sampler

no code implementations 9 May 2022 Daniel Berenberg, Jae Hyeon Lee, Simon Kelow, Ji Won Park, Andrew Watkins, Vladimir Gligorijević, Richard Bonneau, Stephen Ra, Kyunghyun Cho

We introduce an alternative approach to this guided sampling procedure, multi-segment preserving sampling, that enables the direct inclusion of domain-specific knowledge by designating preserved and non-preserved segments along the input sequence, thereby restricting variation to only select regions.

Language Modelling

Translation between Molecules and Natural Language

1 code implementation 25 Apr 2022 Carl Edwards, Tuan Lai, Kevin Ros, Garrett Honke, Kyunghyun Cho, Heng Ji

We present MolT5, a self-supervised learning framework for pretraining models on a vast amount of unlabeled natural language text and molecule strings.

Molecule Captioning Self-Supervised Learning +2

Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks

1 code implementation 10 Feb 2022 Nan Wu, Stanisław Jastrzębski, Kyunghyun Cho, Krzysztof J. Geras

We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning.

Causal Scene BERT: Improving object detection by searching for challenging groups of data

no code implementations 8 Feb 2022 Cinjon Resnick, Or Litany, Amlan Kar, Karsten Kreis, James Lucas, Kyunghyun Cho, Sanja Fidler

Our main contribution is a pseudo-automatic method to discover such groups in foresight by performing causal interventions on simulated scenes.

Autonomous Vehicles Object Detection +1

Generative multitask learning mitigates target-causing confounding

2 code implementations 8 Feb 2022 Taro Makino, Krzysztof J. Geras, Kyunghyun Cho

We propose generative multitask learning (GMTL), a simple and scalable approach to causal representation learning for multitask learning.

Out-of-Distribution Generalization Representation Learning

LINDA: Unsupervised Learning to Interpolate in Natural Language Processing

no code implementations 28 Dec 2021 Yekyung Kim, Seohyeong Jeong, Kyunghyun Cho

Despite the success of mixup in data augmentation, its applicability to natural language processing (NLP) tasks has been limited due to the discrete and variable-length nature of natural languages.

Data Augmentation Text Classification +1

Amortized Noisy Channel Neural Machine Translation

no code implementations 16 Dec 2021 Richard Yuanzhe Pang, He He, Kyunghyun Cho

For all three approaches, the generated translations fail to achieve rewards comparable to BSR, but the translation quality approximated by BLEU and BLEURT is similar to the quality of BSR-produced translations.

Imitation Learning Knowledge Distillation +4

Characterizing and addressing the issue of oversmoothing in neural autoregressive sequence modeling

1 code implementation 16 Dec 2021 Ilia Kulikov, Maksim Eremeev, Kyunghyun Cho

From these observations, we conclude that the high degree of oversmoothing is the main reason behind the degenerate case of overly probable short sequences in a neural autoregressive model.

Machine Translation Translation

Causal Effect Variational Autoencoder with Uniform Treatment

no code implementations 16 Nov 2021 Daniel Jiwoong Im, Kyunghyun Cho, Narges Razavian

In this paper, we introduce uniform treatment variational autoencoders (UTVAE), which are trained with a uniform treatment distribution using importance sampling, and show that using the uniform rather than the observational treatment distribution leads to better causal inference by mitigating the distribution shift that occurs from training to test time.

Causal Inference Domain Adaptation

DEEP: DEnoising Entity Pre-training for Neural Machine Translation

no code implementations ACL 2022 Junjie Hu, Hiroaki Hayashi, Kyunghyun Cho, Graham Neubig

It has been shown that machine translation models usually generate poor translations for named entities that are infrequent in the training corpus.

Denoising Multi-Task Learning +3

AlphaD3M: Machine Learning Pipeline Synthesis

no code implementations 3 Nov 2021 Iddo Drori, Yamuna Krishnamurthy, Remi Rampin, Raoni de Paula Lourenco, Jorge Piazentin Ono, Kyunghyun Cho, Claudio Silva, Juliana Freire

We introduce AlphaD3M, an automatic machine learning (AutoML) system based on meta reinforcement learning using sequence models with self play.

AutoML BIG-bench Machine Learning +4

Monotonic Simultaneous Translation with Chunk-wise Reordering and Refinement

no code implementations WMT (EMNLP) 2021 Hyojung Han, Seokchan Ahn, Yoonjung Choi, Insoo Chung, Sangha Kim, Kyunghyun Cho

Recent work in simultaneous machine translation is often trained with conventional full sentence translation corpora, leading to either excessive latency or necessity to anticipate as-yet-unarrived words, when dealing with a language pair whose word orders significantly differ.

Machine Translation Sentence +2

AAVAE: Augmentation-Augmented Variational Autoencoders

no code implementations 29 Sep 2021 William Alejandro Falcon, Ananya Harsh Jha, Teddy Koker, Kyunghyun Cho

We empirically evaluate the proposed AAVAE on image classification, similar to how recent contrastive and non-contrastive learning algorithms have been evaluated.

Contrastive Learning Data Augmentation +2

Causal Scene BERT: Improving object detection by searching for challenging groups

no code implementations 29 Sep 2021 Cinjon Resnick, Or Litany, Amlan Kar, Karsten Kreis, James Lucas, Kyunghyun Cho, Sanja Fidler

We verify that the prioritized groups found via intervention are challenging for the object detector and show that retraining with data collected from these groups helps inordinately compared to adding more IID data.

Autonomous Vehicles Object Detection +1

Stereo Video Reconstruction Without Explicit Depth Maps for Endoscopic Surgery

no code implementations 16 Sep 2021 Annika Brundyn, Jesse Swanson, Kyunghyun Cho, Doug Kondziolka, Eric Oermann

In the first reader study, a variant of the U-Net that takes as input multiple consecutive video frames and outputs the missing view performs best.

Video Reconstruction

An Empirical Study on Few-shot Knowledge Probing for Pretrained Language Models

1 code implementation 6 Sep 2021 Tianxing He, Kyunghyun Cho, James Glass

Prompt-based knowledge probing for 1-hop relations has been used to measure how much world knowledge is stored in pretrained language models.

Knowledge Probing Prompt Engineering +1

AASAE: Augmentation-Augmented Stochastic Autoencoders

1 code implementation 26 Jul 2021 William Falcon, Ananya Harsh Jha, Teddy Koker, Kyunghyun Cho

We empirically evaluate the proposed AASAE on image classification, similar to how recent contrastive and non-contrastive learning algorithms have been evaluated.

Contrastive Learning Data Augmentation +2

Mode recovery in neural autoregressive sequence modeling

1 code implementation ACL (spnlp) 2021 Ilia Kulikov, Sean Welleck, Kyunghyun Cho

We propose to study these phenomena by investigating how the modes, or local maxima, of a distribution are maintained throughout the full learning chain of the ground-truth, empirical, learned and decoding-induced distributions, via the newly proposed mode recovery cost.

True Few-Shot Learning with Language Models

1 code implementation NeurIPS 2021 Ethan Perez, Douwe Kiela, Kyunghyun Cho

Here, we evaluate the few-shot ability of LMs when such held-out examples are unavailable, a setting we call true few-shot learning.

Few-Shot Learning Model Selection

The Future is not One-dimensional: Complex Event Schema Induction by Graph Modeling for Event Prediction

1 code implementation EMNLP 2021 Manling Li, Sha Li, Zhenhailong Wang, Lifu Huang, Kyunghyun Cho, Heng Ji, Jiawei Han, Clare Voss

We introduce a new concept of Temporal Complex Event Schema: a graph-based schema representation that encompasses events, arguments, temporal connections and argument relations.

NaturalProofs: Mathematical Theorem Proving in Natural Language

1 code implementation 24 Mar 2021 Sean Welleck, Jiacheng Liu, Ronan Le Bras, Hannaneh Hajishirzi, Yejin Choi, Kyunghyun Cho

Understanding and creating mathematics using natural mathematical language - the mixture of symbolic and natural language used by humans - is a challenging and important problem for driving progress in machine learning.

Automated Theorem Proving Domain Generalization +3

Online hyperparameter optimization by real-time recurrent learning

1 code implementation 15 Feb 2021 Daniel Jiwoong Im, Cristina Savin, Kyunghyun Cho

Conventional hyperparameter optimization methods are computationally intensive and hard to generalize to scenarios that require dynamically adapting hyperparameters, such as life-long learning.

Hyperparameter Optimization

Self-Supervised Equivariant Scene Synthesis from Video

no code implementations 1 Feb 2021 Cinjon Resnick, Or Litany, Cosmas Heiß, Hugo Larochelle, Joan Bruna, Kyunghyun Cho

We propose a self-supervised framework to learn scene representations from video that are automatically delineated into background, characters, and their animations.

A Study on the Autoregressive and non-Autoregressive Multi-label Learning

no code implementations 3 Dec 2020 Elham J. Barezi, Iacer Calixto, Kyunghyun Cho, Pascale Fung

These tasks are hard because the label space is usually (i) very large, e.g. thousands or millions of labels, (ii) very sparse, i.e. very few labels apply to each input document, and (iii) highly correlated, meaning that the existence of one label changes the likelihood of predicting all other labels.

Multi-Label Learning

Learned Equivariant Rendering without Transformation Supervision

no code implementations 11 Nov 2020 Cinjon Resnick, Or Litany, Hugo Larochelle, Joan Bruna, Kyunghyun Cho

We propose a self-supervised framework to learn scene representations from video that are automatically delineated into objects and background.

Improving Conversational Question Answering Systems after Deployment using Feedback-Weighted Learning

1 code implementation COLING 2020 Jon Ander Campos, Kyunghyun Cho, Arantxa Otegi, Aitor Soroa, Gorka Azkune, Eneko Agirre

The interaction of conversational systems with users poses an exciting opportunity for improving them after deployment, but little evidence has been provided of its feasibility.

Conversational Question Answering Document Classification

Length-Adaptive Transformer: Train Once with Length Drop, Use Anytime with Search

1 code implementation ACL 2021 Gyuwan Kim, Kyunghyun Cho

We then conduct a multi-objective evolutionary search to find a length configuration that maximizes the accuracy and minimizes the efficiency metric under any given computational budget.

Question Answering Text Classification +1

Reducing false-positive biopsies with deep neural networks that utilize local and global information in screening mammograms

no code implementations 19 Sep 2020 Nan Wu, Zhe Huang, Yiqiu Shen, Jungkyu Park, Jason Phang, Taro Makino, S. Gene Kim, Kyunghyun Cho, Laura Heacock, Linda Moy, Krzysztof J. Geras

Breast cancer is the most common cancer in women, and hundreds of thousands of unnecessary biopsies are done around the world at a tremendous cost.

Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes' Rule

no code implementations ICLR 2021 Shuhei Kurita, Kyunghyun Cho

Vision-and-language navigation (VLN) is a task in which an agent is embodied in a realistic 3D environment and follows an instruction to reach the goal node.

Language Modelling +1

Iterative Refinement in the Continuous Space for Non-Autoregressive Neural Machine Translation

1 code implementation EMNLP 2020 Jason Lee, Raphael Shu, Kyunghyun Cho

Given a continuous latent variable model for machine translation (Shu et al., 2020), we train an inference network to approximate the gradient of the marginal log probability of the target sentence, using only the latent variable as input.

de-en Machine Translation +2
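
In sketch form, decoding then reduces to a few gradient-ascent-style updates of the latent; `inference_net` below stands for the trained network that approximates the gradient of the target's marginal log-probability from the latent alone (a hypothetical interface, not the paper's released code).

```python
def refine_latent(z, inference_net, n_steps=4, step_size=1.0):
    # Each step moves the latent along the predicted gradient of the
    # marginal log-probability of the target sentence.
    for _ in range(n_steps):
        z = z + step_size * inference_net(z)
    return z
```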

Evaluating representations by the complexity of learning low-loss predictors

1 code implementation 15 Sep 2020 William F. Whitney, Min Jae Song, David Brandfonbrener, Jaan Altosaar, Kyunghyun Cho

We consider the problem of evaluating representations of data for use in solving a downstream task.

A Framework For Contrastive Self-Supervised Learning And Designing A New Approach

1 code implementation 31 Aug 2020 William Falcon, Kyunghyun Cho

Contrastive self-supervised learning (CSL) is an approach to learn useful representations by solving a pretext task that selects and compares anchor, negative and positive (APN) features from an unlabeled dataset.

Data Augmentation Image Classification +1

AdapterHub: A Framework for Adapting Transformers

9 code implementations EMNLP 2020 Jonas Pfeiffer, Andreas Rücklé, Clifton Poth, Aishwarya Kamath, Ivan Vulić, Sebastian Ruder, Kyunghyun Cho, Iryna Gurevych

We propose AdapterHub, a framework that allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages.

XLM-R
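
The module being stitched in is typically a Houlsby-style bottleneck adapter. Here is a minimal PyTorch sketch of that architecture (an illustration of the concept, not AdapterHub's API); during adapter training, only these few parameters are updated while the pretrained transformer stays frozen.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Down-project, nonlinearity, up-project, residual: the small module
    inserted into each transformer layer while the base model stays frozen."""

    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(torch.relu(self.down(h)))
```

Because of the residual connection, an adapter initialized near zero leaves the base model's behavior intact, which is what makes "stitching in" pre-trained adapters cheap and composable.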

Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset

1 code implementation EMNLP (sdp) 2020 Edwin Zhang, Nikhil Gupta, Raphael Tang, Xiao Han, Ronak Pradeep, Kuang Lu, Yue Zhang, Rodrigo Nogueira, Kyunghyun Cho, Hui Fang, Jimmy Lin

We present Covidex, a search engine that exploits the latest neural ranking models to provide information access to the COVID-19 Open Research Dataset curated by the Allen Institute for AI.

Compositionality and Capacity in Emergent Languages

no code implementations WS 2020 Abhinav Gupta, Cinjon Resnick, Jakob Foerster, Andrew Dai, Kyunghyun Cho

Our hypothesis is that there should be a specific range of model capacity and channel bandwidth that induces compositional structure in the resulting language and consequently encourages systematic generalization.

Open-Ended Question Answering Systematic Generalization

Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset

no code implementations ACL 2020 Edwin Zhang, Nikhil Gupta, Rodrigo Nogueira, Kyunghyun Cho, Jimmy Lin

The Neural Covidex is a search engine that exploits the latest neural ranking architectures to provide information access to the COVID-19 Open Research Dataset (CORD-19) curated by the Allen Institute for AI.

Decision Making

MLE-guided parameter search for task loss minimization in neural sequence modeling

1 code implementation 4 Jun 2020 Sean Welleck, Kyunghyun Cho

Typical approaches to directly optimizing the task loss such as policy gradient and minimum risk training are based around sampling in the sequence space to obtain candidate update directions that are scored based on the loss of a single sequence.

Machine Translation

AdapterFusion: Non-Destructive Task Composition for Transfer Learning

3 code implementations EACL 2021 Jonas Pfeiffer, Aishwarya Kamath, Andreas Rücklé, Kyunghyun Cho, Iryna Gurevych

We show that by separating the two stages, i.e., knowledge extraction and knowledge composition, the classifier can effectively exploit the representations learned from multiple tasks in a non-destructive manner.

Language Modelling Multi-Task Learning

Learning Non-Monotonic Automatic Post-Editing of Translations from Human Orderings

1 code implementation EAMT 2020 António Góis, Kyunghyun Cho, André Martins

Recent research in neural machine translation has explored flexible generation orders, as an alternative to left-to-right generation.

Automatic Post-Editing Translation

Learning to Learn Morphological Inflection for Resource-Poor Languages

no code implementations 28 Apr 2020 Katharina Kann, Samuel R. Bowman, Kyunghyun Cho

We propose to cast the task of morphological inflection - mapping a lemma to an indicated inflected form - for resource-poor languages as a meta-learning problem.

Cross-Lingual Transfer LEMMA +2

Rapidly Bootstrapping a Question Answering Dataset for COVID-19

1 code implementation 23 Apr 2020 Raphael Tang, Rodrigo Nogueira, Edwin Zhang, Nikhil Gupta, Phuong Cam, Kyunghyun Cho, Jimmy Lin

We present CovidQA, the beginnings of a question answering dataset specifically designed for COVID-19, built by hand from knowledge gathered from Kaggle's COVID-19 Open Research Dataset Challenge.

Question Answering

Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned

1 code implementation 10 Apr 2020 Edwin Zhang, Nikhil Gupta, Rodrigo Nogueira, Kyunghyun Cho, Jimmy Lin

We present the Neural Covidex, a search engine that exploits the latest neural ranking architectures to provide information access to the COVID-19 Open Research Dataset curated by the Allen Institute for AI.

Decision Making

Asking and Answering Questions to Evaluate the Factual Consistency of Summaries

2 code implementations ACL 2020 Alex Wang, Kyunghyun Cho, Mike Lewis

QAGS is based on the intuition that if we ask questions about a summary and its source, we will receive similar answers if the summary is factually consistent with the source.

Abstractive Text Summarization
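
The intuition translates directly into a small pipeline. In this sketch, `gen_questions` and `answer` are placeholders for trained question-generation and QA models (assumed components, not the paper's released ones); answers derived from the summary and from the source are compared with token F1.

```python
from collections import Counter

def token_f1(a: str, b: str) -> float:
    ta, tb = a.lower().split(), b.lower().split()
    common = sum((Counter(ta) & Counter(tb)).values())
    if common == 0:
        return 0.0
    precision, recall = common / len(ta), common / len(tb)
    return 2 * precision * recall / (precision + recall)

def consistency_score(summary, source, gen_questions, answer):
    # Ask questions about the summary, answer them against both texts,
    # and average the agreement between the two sets of answers.
    questions = gen_questions(summary)
    return sum(token_f1(answer(q, summary), answer(q, source))
               for q in questions) / len(questions)
```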

Understanding the robustness of deep neural network classifiers for breast cancer screening

no code implementations 23 Mar 2020 Witold Oleszkiewicz, Taro Makino, Stanisław Jastrzębski, Tomasz Trzciński, Linda Moy, Kyunghyun Cho, Laura Heacock, Krzysztof J. Geras

Deep neural networks (DNNs) show promise in breast cancer screening, but their robustness to input perturbations must be better understood before they can be clinically implemented.

Unsupervised Question Decomposition for Question Answering

2 code implementations EMNLP 2020 Ethan Perez, Patrick Lewis, Wen-tau Yih, Kyunghyun Cho, Douwe Kiela

We aim to improve question answering (QA) by decomposing hard questions into simpler sub-questions that existing QA systems are capable of answering.

Question Answering

The Break-Even Point on Optimization Trajectories of Deep Neural Networks

no code implementations ICLR 2020 Stanislaw Jastrzebski, Maciej Szymczak, Stanislav Fort, Devansh Arpit, Jacek Tabor, Kyunghyun Cho, Krzysztof Geras

We argue for the existence of the "break-even" point on this trajectory, beyond which the curvature of the loss surface and noise in the gradient are implicitly regularized by SGD.

On the Discrepancy between Density Estimation and Sequence Generation

1 code implementation EMNLP (spnlp) 2020 Jason Lee, Dustin Tran, Orhan Firat, Kyunghyun Cho

In this paper, by comparing several density estimators on five machine translation tasks, we find that the correlation between rankings of models based on log-likelihood and BLEU varies significantly depending on the range of the model families being compared.

Density Estimation Machine Translation +3

Consistency of a Recurrent Language Model With Respect to Incomplete Decoding

1 code implementation EMNLP 2020 Sean Welleck, Ilia Kulikov, Jaedeok Kim, Richard Yuanzhe Pang, Kyunghyun Cho

Despite strong performance on a variety of tasks, neural sequence models trained with maximum likelihood have been shown to exhibit issues such as length bias and degenerate repetition.

Language Modelling

Navigation-Based Candidate Expansion and Pretrained Language Models for Citation Recommendation

no code implementations 23 Jan 2020 Rodrigo Nogueira, Zhiying Jiang, Kyunghyun Cho, Jimmy Lin

Citation recommendation systems for the scientific literature, to help authors find papers that should be cited, have the potential to speed up discoveries and uncover new routes for scientific exploration.

Citation Recommendation Domain Adaptation +3

Don't Say That! Making Inconsistent Dialogue Unlikely with Unlikelihood Training

1 code implementation ACL 2020 Margaret Li, Stephen Roller, Ilia Kulikov, Sean Welleck, Y-Lan Boureau, Kyunghyun Cho, Jason Weston

Generative dialogue models currently suffer from a number of problems which standard maximum likelihood training does not address.

Neural Unsupervised Parsing Beyond English

no code implementations WS 2019 Katharina Kann, Anhad Mohananey, Samuel R. Bowman, Kyunghyun Cho

Recently, neural network models which automatically infer syntactic structure from raw text have started to achieve promising results.

Finding Generalizable Evidence by Learning to Convince Q&A Models

no code implementations IJCNLP 2019 Ethan Perez, Siddharth Karamcheti, Rob Fergus, Jason Weston, Douwe Kiela, Kyunghyun Cho

We propose a system that finds the strongest supporting evidence for a given answer to a question, using passage-based question-answering (QA) as a testbed.

Question Answering

Multi-Stage Document Ranking with BERT

3 code implementations 31 Oct 2019 Rodrigo Nogueira, Wei Yang, Kyunghyun Cho, Jimmy Lin

The advent of deep neural networks pre-trained via language modeling tasks has spurred a number of successful applications in natural language processing.

Document Ranking Language Modeling +1
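
A multi-stage pipeline of this kind composes a cheap retriever with increasingly expensive neural rerankers. The sketch below is a schematic of the mono/duo-style design; `bm25_top_k`, `mono_score`, and `duo_prefer` are assumed, hypothetical components (documents are assumed hashable, e.g. ids).

```python
def multi_stage_rank(query, corpus, bm25_top_k, mono_score, duo_prefer,
                     k1=1000, k2=50):
    candidates = bm25_top_k(query, corpus, k1)            # stage 0: keyword search
    pointwise = sorted(candidates,                        # stage 1: pointwise BERT
                       key=lambda d: mono_score(query, d), reverse=True)[:k2]
    pair_scores = {d: sum(duo_prefer(query, d, other)     # stage 2: pairwise BERT
                          for other in pointwise if other is not d)
                   for d in pointwise}
    return sorted(pointwise, key=pair_scores.get, reverse=True)
```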

Analyzing the Forgetting Problem in the Pretrain-Finetuning of Dialogue Response Models

no code implementations 16 Oct 2019 Tianxing He, Jun Liu, Kyunghyun Cho, Myle Ott, Bing Liu, James Glass, Fuchun Peng

We find that mix-review effectively regularizes the finetuning process, and the forgetting problem is alleviated to some extent.

Decoder Response Generation +2

Generalized Inner Loop Meta-Learning

3 code implementations 3 Oct 2019 Edward Grefenstette, Brandon Amos, Denis Yarats, Phu Mon Htut, Artem Molchanov, Franziska Meier, Douwe Kiela, Kyunghyun Cho, Soumith Chintala

Many (but not all) approaches self-qualifying as "meta-learning" in deep learning and reinforcement learning fit a common pattern of approximating the solution to a nested optimization problem.

Meta-Learning Reinforcement Learning +2

Inducing Constituency Trees through Neural Machine Translation

no code implementations 22 Sep 2019 Phu Mon Htut, Kyunghyun Cho, Samuel R. Bowman

Latent tree learning (LTL) methods learn to parse sentences using only indirect supervision from a downstream task.

Language Modelling +2

Finding Generalizable Evidence by Learning to Convince Q&A Models

1 code implementation 12 Sep 2019 Ethan Perez, Siddharth Karamcheti, Rob Fergus, Jason Weston, Douwe Kiela, Kyunghyun Cho

We propose a system that finds the strongest supporting evidence for a given answer to a question, using passage-based question-answering (QA) as a testbed.

Question Answering

Countering Language Drift via Visual Grounding

no code implementations IJCNLP 2019 Jason Lee, Kyunghyun Cho, Douwe Kiela

Emergent multi-agent communication protocols are very different from natural language and not easily interpretable by humans.

Language Modelling +2

Neural Machine Translation with Byte-Level Subwords

1 code implementation 7 Sep 2019 Changhan Wang, Kyunghyun Cho, Jiatao Gu

Representing text at the level of bytes and using the 256 byte set as vocabulary is a potential solution to this issue.

Machine Translation Translation
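
The key observation is that UTF-8 bytes give a complete, tiny base vocabulary; byte-level BPE then learns merges on top of it. A minimal illustration:

```python
text = "번역"  # Korean for "translation"
byte_ids = list(text.encode("utf-8"))
print(byte_ids)  # [235, 178, 136, 236, 151, 173]: six symbols from a 256-entry vocabulary
assert all(0 <= b < 256 for b in byte_ids)
```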

Towards Realistic Practices In Low-Resource Natural Language Processing: The Development Set

no code implementations IJCNLP 2019 Katharina Kann, Kyunghyun Cho, Samuel R. Bowman

Here, we aim to answer the following question: Does using a development set for early stopping in the low-resource setting influence results, as compared to a more realistic alternative where the number of training epochs is tuned on development languages?

Dynamics-aware Embeddings

2 code implementations ICLR 2020 William Whitney, Rajat Agarwal, Kyunghyun Cho, Abhinav Gupta

In this paper we consider self-supervised representation learning to improve sample efficiency in reinforcement learning (RL).

Continuous Control +4

Latent-Variable Non-Autoregressive Neural Machine Translation with Deterministic Inference Using a Delta Posterior

1 code implementation 20 Aug 2019 Raphael Shu, Jason Lee, Hideki Nakayama, Kyunghyun Cho

By decoding multiple initial latent variables in parallel and rescoring with a teacher model, the proposed model further brings the gap down to 1.0 BLEU point on the WMT'14 En-De task with a 6.8x speedup.

Machine Translation Translation

Neural Text Generation with Unlikelihood Training

6 code implementations ICLR 2020 Sean Welleck, Ilia Kulikov, Stephen Roller, Emily Dinan, Kyunghyun Cho, Jason Weston

Neural text generation is a key tool in natural language applications, but it is well known there are major problems at its core.

Blocking Text Generation
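
A minimal sketch of a token-level unlikelihood term, assuming for illustration that the negative candidates are previously generated tokens: the usual likelihood term on the target is kept, and probability mass on the candidates is explicitly pushed down.

```python
import torch
import torch.nn.functional as F

def unlikelihood_loss(logits, target, neg_candidates, alpha=1.0):
    # logits: (vocab,); target: token id; neg_candidates: list of token ids.
    log_probs = F.log_softmax(logits, dim=-1)
    mle = -log_probs[target]                       # standard cross-entropy term
    probs = log_probs.exp()
    # -log(1 - p(c)) for each negative candidate, clamped for stability.
    ul = -torch.log(torch.clamp(1.0 - probs[neg_candidates], min=1e-5)).sum()
    return mle + alpha * ul

logits = torch.randn(100)
loss = unlikelihood_loss(logits, target=7, neg_candidates=[3, 12, 42])
```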

Improving localization-based approaches for breast cancer screening exam classification

no code implementations1 Aug 2019 Thibault Févry, Jason Phang, Nan Wu, S. Gene Kim, Linda Moy, Kyunghyun Cho, Krzysztof J. Geras

We trained and evaluated a localization-based deep CNN for breast cancer screening exam classification on over 200,000 exams (over 1,000,000 images).

Classification General Classification

Screening Mammogram Classification with Prior Exams

no code implementations30 Jul 2019 Jungkyu Park, Jason Phang, Yiqiu Shen, Nan Wu, S. Gene Kim, Linda Moy, Kyunghyun Cho, Krzysztof J. Geras

Radiologists typically compare a patient's most recent breast cancer screening exam to their previous ones in making informed diagnoses.

Classification General Classification

Can Unconditional Language Models Recover Arbitrary Sentences?

no code implementations NeurIPS 2019 Nishant Subramani, Samuel R. Bowman, Kyunghyun Cho

We then investigate the conditions under which a language model can be made to generate a sentence through the identification of a point in such a space and find that it is possible to recover arbitrary sentences nearly perfectly with language models and representations of moderate size without modifying any model parameters.

Language Modeling Language Modelling +3
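
In a similar spirit, the sketch below freezes a small pretrained language model and optimizes only a continuous prefix so that the model assigns high likelihood to a target sentence. The soft-prefix parameterization via `inputs_embeds` is an assumption for illustration, not the paper's exact sentence space.

```python
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
for p in model.parameters():
    p.requires_grad_(False)                    # the model itself stays fixed

target = tok("The cat sat on the mat.", return_tensors="pt").input_ids
prefix = torch.randn(1, 4, model.config.n_embd, requires_grad=True)
opt = torch.optim.Adam([prefix], lr=0.05)
embed = model.get_input_embeddings()

for _ in range(100):
    inputs = torch.cat([prefix, embed(target)], dim=1)
    logits = model(inputs_embeds=inputs).logits
    # Positions from the last prefix slot onward predict the target tokens.
    pred = logits[:, prefix.size(1) - 1:-1, :]
    loss = F.cross_entropy(pred.reshape(-1, pred.size(-1)), target.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```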

A Unified Framework of Online Learning Algorithms for Training Recurrent Neural Networks

no code implementations5 Jul 2019 Owen Marschall, Kyunghyun Cho, Cristina Savin

We present a framework for compactly summarizing many recent results in efficient and/or biologically plausible online training of recurrent neural networks (RNNs).

Clustering

Generating Diverse Translations with Sentence Codes

no code implementations ACL 2019 Raphael Shu, Hideki Nakayama, Kyunghyun Cho

In this work, we attempt to obtain diverse translations by using sentence codes to condition the sentence generation.

Diversity Machine Translation +2

Deep Unsupervised Drum Transcription

2 code implementations9 Jun 2019 Keunwoo Choi, Kyunghyun Cho

We introduce DrummerNet, a drum transcription system that is trained in an unsupervised manner.

Sound Audio and Speech Processing

Improved Zero-shot Neural Machine Translation via Ignoring Spurious Correlations

no code implementations ACL 2019 Jiatao Gu, Yong Wang, Kyunghyun Cho, Victor O. K. Li

Zero-shot translation, translating between language pairs on which a Neural Machine Translation (NMT) system has never been trained, is an emergent property when training the system in multilingual settings.

Decoder Machine Translation +2

Multi-Turn Beam Search for Neural Dialogue Modeling

1 code implementation1 Jun 2019 Ilia Kulikov, Jason Lee, Kyunghyun Cho

We propose a novel approach for conversation-level inference by explicitly modeling the dialogue partner and running beam search across multiple conversation turns.
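
For reference, here is a minimal sketch of ordinary single-turn beam search, the building block the paper extends across conversation turns. `step_log_probs` is a stand-in for a model's next-token log-probabilities.

```python
import math

def beam_search(step_log_probs, beam_size=3, max_len=10, eos=0):
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == eos:           # finished beams are carried over
                candidates.append((seq, score))
                continue
            for tok, lp in step_log_probs(seq).items():
                candidates.append((seq + [tok], score + lp))
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_size]
    return beams

# Toy next-token distribution that ignores the prefix.
def toy_dist(prefix):
    return {0: math.log(0.2), 1: math.log(0.5), 2: math.log(0.3)}

print(beam_search(toy_dist)[0])
```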

A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models

1 code implementation29 May 2019 Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho

We investigate this problem by proposing a generalized model of sequence generation that unifies decoding in directed and undirected models.

Machine Translation Natural Language Inference +3

Using local plasticity rules to train recurrent neural networks

no code implementations28 May 2019 Owen Marschall, Kyunghyun Cho, Cristina Savin

To learn useful dynamics on long time scales, neurons must use plasticity rules that account for long-term, circuit-wide effects of synaptic changes.

Sequential Graph Dependency Parser

no code implementations RANLP 2019 Sean Welleck, Kyunghyun Cho

We propose a method for non-projective dependency parsing by incrementally predicting a set of edges.

Dependency Parsing

Task-Driven Data Verification via Gradient Descent

no code implementations14 May 2019 Siavash Golkar, Kyunghyun Cho

We introduce a novel algorithm for the detection of possible sample corruption such as mislabeled samples in a training dataset given a small clean validation set.

Gradient-based learning for F-measure and other performance metrics

no code implementations ICLR 2019 Yu Gai, Zheng Zhang, Kyunghyun Cho

Many important classification performance metrics, e.g., the $F$-measure, are non-differentiable and non-decomposable, and are thus unfriendly to gradient descent algorithms.

General Classification
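
The standard workaround is easy to sketch: replace hard 0/1 predictions with probabilities to obtain a smooth surrogate. The "soft F1" loss below is shown only for intuition; the paper's own construction is more general and differs from this.

```python
import torch

def soft_f1_loss(probs: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # Soft counts: expected true/false positives and false negatives.
    tp = (probs * targets).sum()
    fp = (probs * (1 - targets)).sum()
    fn = ((1 - probs) * targets).sum()
    f1 = 2 * tp / (2 * tp + fp + fn + 1e-8)
    return 1.0 - f1  # minimizing the loss maximizes soft F1

logits = torch.randn(16, requires_grad=True)
targets = (torch.rand(16) > 0.5).float()
soft_f1_loss(torch.sigmoid(logits), targets).backward()
```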

Backplay: 'Man muss immer umkehren'

no code implementations ICLR 2019 Cinjon Resnick, Roberta Raileanu, Sanyam Kapoor, Alexander Peysakhovich, Kyunghyun Cho, Joan Bruna

Our contributions are that we analytically characterize the types of environments where Backplay can improve training speed, demonstrate the effectiveness of Backplay both in large grid worlds and a complex four-player zero-sum game (Pommerman), and show that Backplay compares favorably to other competitive methods known to improve sample efficiency.

Reinforcement Learning Reinforcement Learning (RL)
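
The curriculum itself is simple to sketch: sample training start states from the tail of an expert demonstration, widening the window toward the true initial state as training progresses. The `progress` schedule and state encoding below are illustrative assumptions.

```python
import random

def backplay_start_state(demonstration, progress):
    # progress in [0, 1]: near 0, start close to the goal; near 1, the window
    # covers the whole demonstration, including the true initial state.
    window = max(1, int(progress * len(demonstration)))
    return random.choice(demonstration[-window:])

demo = ["s0", "s1", "s2", "s3", "goal"]
print(backplay_start_state(demo, progress=0.2))  # "goal" early in training
print(backplay_start_state(demo, progress=1.0))  # any state along the demo
```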

Advancing GraphSAGE with A Data-Driven Node Sampling

1 code implementation29 Apr 2019 Jihun Oh, Kyunghyun Cho, Joan Bruna

As an efficient and scalable graph neural network, GraphSAGE has enabled an inductive capability for inferring unseen nodes or graphs by aggregating subsampled local neighborhoods and by learning in a mini-batch gradient descent fashion.

General Classification Graph Neural Network +2

Document Expansion by Query Prediction

5 code implementations17 Apr 2019 Rodrigo Nogueira, Wei Yang, Jimmy Lin, Kyunghyun Cho

One technique to improve the retrieval effectiveness of a search engine is to expand documents with terms that are related or representative of the documents' content. From the perspective of a question answering system, this might comprise questions the document can potentially answer.

Passage Re-Ranking Prediction +3
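
A minimal sketch of the expansion step, assuming an off-the-shelf passage-to-query seq2seq model (the checkpoint name below is a placeholder): generate a few questions the passage could answer and append them to the passage text before indexing.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "your-doc2query-checkpoint"  # hypothetical placeholder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

passage = "The Amazon rainforest produces about 20 percent of Earth's oxygen."
inputs = tokenizer(passage, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=32,
                         do_sample=True, top_k=10, num_return_sequences=3)
queries = [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

# Index the expanded text in place of the original passage.
expanded_document = passage + " " + " ".join(queries)
```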

Molecular geometry prediction using a deep generative graph neural network

1 code implementation31 Mar 2019 Elman Mansimov, Omar Mahmood, Seokho Kang, Kyunghyun Cho

Conventional conformation generation methods minimize hand-designed molecular force field energy functions that are often not well correlated with the true energy function of a molecule observed in nature.

Graph Neural Network

Context-Aware Learning for Neural Machine Translation

no code implementations12 Mar 2019 Sébastien Jean, Kyunghyun Cho

By comparing performance using actual and random contexts, we show that a model trained with the proposed algorithm is more sensitive to the additional context.

Machine Translation Translation

Continual Learning via Neural Pruning

no code implementations11 Mar 2019 Siavash Golkar, Michael Kagan, Kyunghyun Cho

We introduce Continual Learning via Neural Pruning (CLNP), a new method aimed at lifelong learning in fixed capacity models based on neuronal model sparsification.

Continual Learning Diagnostic

Augmentation for small object detection

5 code implementations19 Feb 2019 Mate Kisantal, Zbigniew Wojna, Jakub Murawski, Jacek Naruniec, Kyunghyun Cho

We evaluate different pasting augmentation strategies and ultimately achieve a 9.7% relative improvement on instance segmentation and 7.1% on object detection of small objects, compared to the current state-of-the-art method.

Instance Segmentation Object +3
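
The copy-paste idea behind the augmentation can be sketched in a few lines: crop a small object, paste it at a random new location, and duplicate its box annotation. Overlap checks and blending, which a real pipeline needs, are omitted, and all names are illustrative.

```python
import numpy as np

def paste_small_object(image: np.ndarray, box: tuple[int, int, int, int],
                       rng: np.random.Generator):
    x1, y1, x2, y2 = box
    crop = image[y1:y2, x1:x2].copy()
    h, w = crop.shape[:2]
    H, W = image.shape[:2]
    nx = int(rng.integers(0, W - w))      # random paste location
    ny = int(rng.integers(0, H - h))
    out = image.copy()
    out[ny:ny + h, nx:nx + w] = crop
    return out, (nx, ny, nx + w, ny + h)  # image plus the duplicated box

rng = np.random.default_rng(0)
img = np.zeros((256, 256, 3), dtype=np.uint8)
aug_img, new_box = paste_small_object(img, (10, 10, 30, 30), rng)
```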

Non-Monotonic Sequential Text Generation

1 code implementation WS 2019 Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho

Standard sequential generation methods assume a pre-specified generation order, such as text generation methods which generate words from left to right.

Imitation Learning Position +1

Insertion-based Decoding with automatically Inferred Generation Order

no code implementations TACL 2019 Jiatao Gu, Qi Liu, Kyunghyun Cho

Conventional neural autoregressive decoding commonly assumes a fixed left-to-right generation order, which may be sub-optimal.

Code Generation Machine Translation +1

Emergent Linguistic Phenomena in Multi-Agent Communication Games

1 code implementation IJCNLP 2019 Laura Graesser, Kyunghyun Cho, Douwe Kiela

In this work, we propose a computational framework in which agents equipped with communication capabilities simultaneously play a series of referential games, where agents are trained using deep reinforcement learning.

Deep Reinforcement Learning Reinforcement Learning (RL)

Passage Re-ranking with BERT

6 code implementations13 Jan 2019 Rodrigo Nogueira, Kyunghyun Cho

Recently, neural models pretrained on a language modeling task, such as ELMo (Peters et al., 2017), OpenAI GPT (Radford et al., 2018), and BERT (Devlin et al., 2018), have achieved impressive results on various natural language processing tasks such as question-answering and natural language inference.

Ranked #3 on Passage Re-Ranking on MS MARCO (using extra training data)

Language Modeling Passage Re-Ranking +3
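
A minimal sketch of the cross-encoder scoring pattern, with the caveat that the untuned `bert-base-uncased` checkpoint below is only a placeholder; a real re-ranker would first be fine-tuned on relevance labels (e.g. MS MARCO) so that its scores are meaningful.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased").eval()

query = "what causes tides"
passages = ["Tides are caused by the gravitational pull of the moon.",
            "The stock market closed higher on Friday."]

with torch.no_grad():
    # Feed the query jointly with each passage, one sequence pair per passage.
    inputs = tokenizer([query] * len(passages), passages,
                       return_tensors="pt", padding=True, truncation=True)
    scores = model(**inputs).logits[:, 1]  # relevance logit per passage

ranked = sorted(zip(scores.tolist(), passages), reverse=True)
```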
