118 code implementations • EMNLP 2014 • Yoon Kim
We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks.
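To make the architecture concrete, here is a minimal PyTorch sketch of a CNN sentence classifier in the spirit of this paper; the vocabulary size, filter widths, and filter counts are illustrative placeholders rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    """Convolutions over word embeddings, max-pooled over time."""
    def __init__(self, vocab_size=10000, embed_dim=300, num_classes=2,
                 filter_widths=(3, 4, 5), num_filters=100):
        super().__init__()
        # in practice, initialized from pre-trained word vectors (e.g., word2vec)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, w) for w in filter_widths
        )
        self.out = nn.Linear(num_filters * len(filter_widths), num_classes)

    def forward(self, tokens):                    # tokens: (batch, seq_len)
        x = self.embed(tokens).transpose(1, 2)    # (batch, embed_dim, seq_len)
        feats = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.out(torch.cat(feats, dim=1))  # (batch, num_classes)

logits = TextCNN()(torch.randint(0, 10000, (8, 20)))  # toy batch of 8 sentences
```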
9 code implementations • WS 2018 • Guillaume Klein, Yoon Kim, Yuntian Deng, Vincent Nguyen, Jean Senellart, Alexander M. Rush
OpenNMT is an open-source toolkit for neural machine translation (NMT).
4 code implementations • ACL 2017 • Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, Alexander M. Rush
We describe an open-source toolkit for neural machine translation (NMT).
6 code implementations • EMNLP 2016 • Yoon Kim, Alexander M. Rush
We demonstrate that standard knowledge distillation applied to word-level prediction can be effective for NMT, and also introduce two novel sequence-level versions of knowledge distillation that further improve performance and, somewhat surprisingly, seem to eliminate the need for beam search (even when applied to the original teacher model).
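As a point of reference, the word-level variant of knowledge distillation can be written in a few lines; this sketch shows only the word-level loss (the sequence-level versions in the paper instead train the student on the teacher's beam-search outputs), and the temperature is an illustrative knob.

```python
import torch
import torch.nn.functional as F

def word_level_kd_loss(student_logits, teacher_logits, T=1.0):
    """Cross-entropy of the student against the teacher's soft
    per-position distributions (word-level knowledge distillation)."""
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    return -(p_teacher * log_p_student).sum(dim=-1).mean()

# toy shapes: (batch, target_len, vocab)
student, teacher = torch.randn(2, 7, 50), torch.randn(2, 7, 50)
loss = word_level_kd_loss(student, teacher)
```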
14 code implementations • 26 Aug 2015 • Yoon Kim, Yacine Jernite, David Sontag, Alexander M. Rush
We describe a simple neural language model that relies only on character-level inputs.
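The core of the model is a convolution over character embeddings that produces a word representation; a minimal sketch follows (the full model also applies a highway network before feeding these vectors to an LSTM language model, and the dimensions here are placeholders).

```python
import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    """Builds a word vector from its characters via a 1D convolution
    and max-over-time pooling."""
    def __init__(self, num_chars=100, char_dim=15, num_filters=100, width=3):
        super().__init__()
        self.char_embed = nn.Embedding(num_chars, char_dim)
        self.conv = nn.Conv1d(char_dim, num_filters, width)

    def forward(self, chars):                       # chars: (batch, word_len)
        x = self.char_embed(chars).transpose(1, 2)  # (batch, char_dim, word_len)
        return torch.relu(self.conv(x)).max(dim=2).values  # (batch, num_filters)

word_vecs = CharWordEncoder()(torch.randint(0, 100, (4, 10)))
```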
2 code implementations • 11 Dec 2023 • Songlin Yang, Bailin Wang, Yikang Shen, Rameswar Panda, Yoon Kim
When used as a replacement for the standard attention layer in Transformers, the resulting gated linear attention (GLA) Transformer is found to perform competitively against the LLaMA-architecture Transformer (Touvron et al., 2023) as well as recent linear-time-inference baselines such as RetNet (Sun et al., 2023a) and Mamba (Gu & Dao, 2023) on moderate-scale language modeling experiments.
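The recurrence behind gated linear attention can be sketched directly; the exact parameterization of the gate in the paper differs (it is produced by learned projections), so treat this as an illustration of the state update rather than the paper's implementation.

```python
import torch

def gated_linear_attention(q, k, v, g):
    """Recurrent view of gated linear attention: the 2D state S decays
    under a data-dependent gate instead of being accumulated without
    decay as in vanilla linear attention. All inputs are (seq_len, d);
    gate values lie in (0, 1)."""
    seq_len, d = q.shape
    S = torch.zeros(d, d)
    outputs = []
    for t in range(seq_len):
        S = g[t].unsqueeze(1) * S + torch.outer(k[t], v[t])  # gated decay + rank-1 update
        outputs.append(S.T @ q[t])                           # read the state with the query
    return torch.stack(outputs)

T, d = 6, 8
out = gated_linear_attention(torch.randn(T, d), torch.randn(T, d),
                             torch.randn(T, d), torch.sigmoid(torch.randn(T, d)))
```

In practice the same computation is chunked and parallelized for hardware efficiency; the loop above is only the mathematical recurrence.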
6 code implementations • 13 Jun 2017 • Jake Zhao, Yoon Kim, Kelly Zhang, Alexander M. Rush, Yann Lecun
This adversarially regularized autoencoder (ARAE) allows us to generate natural textual outputs as well as perform manipulations in the latent space to induce change in the output space.
2 code implementations • 7 Sep 2023 • Yung-Sung Chuang, Yujia Xie, Hongyin Luo, Yoon Kim, James Glass, Pengcheng He
Despite their impressive capabilities, large language models (LLMs) are prone to hallucinations, i.e., generating content that deviates from facts seen during pretraining.
1 code implementation • NeurIPS 2018 • Yuntian Deng, Yoon Kim, Justin Chiu, Demi Guo, Alexander M. Rush
This work considers variational attention networks, alternatives to soft and hard attention for learning latent variable alignment models, with tighter approximation bounds based on amortized variational inference.
2 code implementations • 15 Feb 2024 • Yao Fu, Rameswar Panda, Xinyao Niu, Xiang Yue, Hannaneh Hajishirzi, Yoon Kim, Hao Peng
We demonstrate that continual pretraining of the full model on 1B-5B tokens of such data is an effective and affordable strategy for scaling the context length of language models to 128K.
1 code implementation • NAACL 2022 • Yung-Sung Chuang, Rumen Dangovski, Hongyin Luo, Yang Zhang, Shiyu Chang, Marin Soljačić, Shang-Wen Li, Wen-tau Yih, Yoon Kim, James Glass
We propose DiffCSE, an unsupervised contrastive learning framework for learning sentence embeddings.
1 code implementation • NAACL 2022 • Andrew Drozdov, Jiawei Zhou, Radu Florian, Andrew McCallum, Tahira Naseem, Yoon Kim, Ramon Fernandez Astudillo
These alignments are learned separately from parser training and require a complex pipeline of rule-based components, pre-processing, and post-processing to satisfy domain-specific constraints.
1 code implementation • NAACL 2019 • Yoon Kim, Alexander M. Rush, Lei Yu, Adhiguna Kuncoro, Chris Dyer, Gábor Melis
On language modeling, unsupervised RNNGs perform as well as their supervised counterparts on benchmarks in English and Chinese.
1 code implementation • ICML 2018 • Yoon Kim, Sam Wiseman, Andrew C. Miller, David Sontag, Alexander M. Rush
Amortized variational inference (AVI) replaces instance-specific local inference with a global inference network.
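This entry (semi-amortized variational autoencoders) combines the two regimes: the inference network provides a warm start, which is then refined with a few steps of stochastic variational inference per instance. A minimal sketch, assuming a generic `elbo_fn` for a single instance:

```python
import torch

def refine_with_svi(elbo_fn, mu, logvar, steps=5, lr=1e-2):
    """Start from the inference network's (amortized) output, then take
    a few instance-specific gradient steps on the ELBO."""
    mu = mu.clone().requires_grad_(True)
    logvar = logvar.clone().requires_grad_(True)
    opt = torch.optim.SGD([mu, logvar], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        (-elbo_fn(mu, logvar)).backward()   # minimize the negative ELBO
        opt.step()
    return mu.detach(), logvar.detach()

# toy stand-in ELBO for illustration only
elbo = lambda mu, logvar: -(mu.pow(2).sum() + logvar.exp().sum() - logvar.sum())
mu, logvar = refine_with_svi(elbo, torch.randn(4), torch.zeros(4))
```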
2 code implementations • ACL 2019 • Yoon Kim, Chris Dyer, Alexander M. Rush
We study a formalization of the grammar induction problem that models sentences as being generated by a compound probabilistic context-free grammar.
1 code implementation • 20 Nov 2023 • Han Guo, Philip Greengard, Eric P. Xing, Yoon Kim
Our approach uses an iterative algorithm to decompose each pretrained matrix into a high-precision low-rank component and a memory-efficient quantized component.
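The decomposition can be sketched as alternating projections: fix the low-rank part and re-quantize the residual, then fix the quantized part and refit the low-rank residual by truncated SVD. The round-to-nearest quantizer below is a simple stand-in for the paper's more careful quantization scheme.

```python
import torch

def lowrank_plus_quant(W, rank=8, bits=4, iters=10):
    """Alternate between Q = quantize(W - L) and L = SVD_r(W - Q)
    so that W ≈ Q + L."""
    def quantize(X):                                  # naive uniform quantizer
        qmax = 2 ** (bits - 1) - 1
        scale = X.abs().max() / qmax
        return (X / scale).round().clamp(-qmax - 1, qmax) * scale

    L = torch.zeros_like(W)
    for _ in range(iters):
        Q = quantize(W - L)                           # memory-efficient component
        U, S, Vh = torch.linalg.svd(W - Q, full_matrices=False)
        L = U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank]  # high-precision low-rank part
    return Q, L

Q, L = lowrank_plus_quant(torch.randn(64, 64))
```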
1 code implementation • 6 Mar 2024 • Shannon Zejiang Shen, Hunter Lang, Bailin Wang, Yoon Kim, David Sontag
We propose a method to teach multiple large language models (LLMs) to collaborate by interleaving their generations at the token level.
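A toy version of the decoding loop makes the idea concrete; the paper learns the per-token choice as a latent variable, whereas the gate and the two model stubs below are random stand-ins for illustration.

```python
import torch

vocab = 50
# stand-ins: any callables returning (batch, seq_len, vocab) logits work here
base_lm = lambda ids: torch.randn(1, ids.size(1), vocab)
assistant_lm = lambda ids: torch.randn(1, ids.size(1), vocab)
gate = lambda ids: bool(torch.rand(1) < 0.5)  # the paper learns this decision

def interleaved_decode(ids, max_new=5):
    """At each step, a gate picks which model emits the next token,
    interleaving the two models' generations at the token level."""
    for _ in range(max_new):
        lm = assistant_lm if gate(ids) else base_lm
        next_id = lm(ids)[0, -1].argmax()
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
    return ids

out = interleaved_decode(torch.zeros(1, 1, dtype=torch.long))
```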
2 code implementations • ACL 2021 • Demi Guo, Alexander M. Rush, Yoon Kim
This approach views finetuning as learning a task-specific diff vector that is applied on top of the pretrained parameter vector, which remains fixed and is shared across different tasks.
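The diff-vector view is easy to state in code: the pretrained weight is frozen and shared, and only a task-specific delta is trained (the paper additionally encourages the delta to be sparse, which this sketch omits).

```python
import torch
import torch.nn as nn

class DiffLinear(nn.Module):
    """Task weight = frozen pretrained weight + learned diff vector."""
    def __init__(self, pretrained_weight):
        super().__init__()
        self.register_buffer("w0", pretrained_weight)   # frozen, shared across tasks
        self.diff = nn.Parameter(torch.zeros_like(pretrained_weight))

    def forward(self, x):
        return x @ (self.w0 + self.diff).T

layer = DiffLinear(torch.randn(16, 32))
y = layer(torch.randn(4, 32))      # only layer.diff receives gradients
```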
1 code implementation • NeurIPS 2021 • Yoon Kim
While flexible and performant, these models often require large datasets for training and can fail spectacularly on benchmarks designed to test for compositional generalization.
1 code implementation • NeurIPS 2023 • Thomas Hartvigsen, Swami Sankaranarayanan, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi
We propose GRACE, a lifelong model editing method, which implements spot-fixes on streaming errors of a deployed model, ensuring minimal impact on unrelated inputs.
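A toy version of the codebook mechanism: cached edits are (key, value) pairs over hidden activations, and an edit fires only when an incoming activation falls within a deferral radius of a stored key, leaving all other inputs untouched. The radius handling here is simplified relative to the paper.

```python
import torch

class CodebookAdaptor:
    """Spot-fix adaptor in the spirit of GRACE (simplified)."""
    def __init__(self, radius=1.0):
        self.keys, self.values, self.radius = [], [], radius

    def add_edit(self, key, value):
        self.keys.append(key)
        self.values.append(value)

    def __call__(self, h):
        for k, v in zip(self.keys, self.values):
            if torch.dist(h, k) < self.radius:   # edit fires only near the stored key
                return v
        return h                                 # unrelated inputs pass through

adaptor = CodebookAdaptor()
adaptor.add_edit(torch.ones(4), torch.zeros(4))
out = adaptor(torch.ones(4) + 0.01)              # returns the edited value
```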
1 code implementation • 23 Oct 2023 • Wei Liu, Songlin Yang, Yoon Kim, Kewei Tu
Scaling dense PCFGs to thousands of nonterminals via a low-rank parameterization of the rule probability tensor has been shown to be beneficial for unsupervised parsing.
1 code implementation • NeurIPS 2023 • Bailin Wang, Zi Wang, Xuezhi Wang, Yuan Cao, Rif A. Saurous, Yoon Kim
Large language models (LLMs) can learn to perform a wide range of natural language tasks from just a handful of in-context examples.
1 code implementation • 23 Jan 2024 • Ekin Akyürek, Bailin Wang, Yoon Kim, Jacob Andreas
Finally, we show that hard-wiring these heads into neural models improves performance not just on ICLL but also on natural language modeling -- improving the perplexity of 340M-parameter models by up to 1.14 points (6.7%) on the SlimPajama dataset.
1 code implementation • 5 Jul 2023 • Zhaofeng Wu, Linlu Qiu, Alexis Ross, Ekin Akyürek, Boyuan Chen, Bailin Wang, Najoung Kim, Jacob Andreas, Yoon Kim
The impressive performance of recent language models across a wide range of tasks suggests that they possess a degree of abstract reasoning skills.
1 code implementation • EMNLP 2017 • Allen Schmaltz, Yoon Kim, Alexander M. Rush, Stuart M. Shieber
In a controlled experiment of sequence-to-sequence approaches for the task of sentence correction, we find that character-based models are generally more effective than word-based models and models that encode subword information via convolutions, and that modeling the output data as a series of diffs improves effectiveness over standard approaches.
1 code implementation • CVPR 2022 • Yi Li, Rameswar Panda, Yoon Kim, Chun-Fu Chen, Rogerio Feris, David Cox, Nuno Vasconcelos
In particular, given a source sentence, an autoregressive hallucination transformer is used to predict a discrete visual representation from the input text, and the combined text and hallucinated representations are utilized to obtain the target translation.
1 code implementation • 26 May 2023 • Jiaxin Ge, Hongyin Luo, Yoon Kim, James Glass
Experiments on binary and multi-class classification tasks show that SimPLE leads to more robust self-training results, indicating that the self-trained entailment models are more efficient and trustworthy than large language models on language understanding tasks.
1 code implementation • 19 Sep 2023 • Tianhua Zhang, Jiaxin Ge, Hongyin Luo, Yung-Sung Chuang, Mingye Gao, Yuan Gong, Xixin Wu, Yoon Kim, Helen Meng, James Glass
How can we perform computations over natural language representations to solve tasks that require symbolic and numeric reasoning?
1 code implementation • 13 Nov 2023 • Zilu Tang, Mayank Agarwal, Alex Shypula, Bailin Wang, Derry Wijaya, Jie Chen, Yoon Kim
This work explores the use of self-generated natural language explanations as an intermediate step for code-to-code translation with language models.
1 code implementation • EMNLP 2020 • Demi Guo, Yoon Kim, Alexander M. Rush
Despite their empirical success, neural networks still have difficulty capturing compositional aspects of natural language.
1 code implementation • 12 Oct 2023 • Linlu Qiu, Liwei Jiang, Ximing Lu, Melanie Sclar, Valentina Pyatkin, Chandra Bhagavatula, Bailin Wang, Yoon Kim, Yejin Choi, Nouha Dziri, Xiang Ren
The ability to derive underlying principles from a handful of observations and then generalize to novel situations -- known as inductive reasoning -- is central to human intelligence.
1 code implementation • 2 Feb 2022 • Hunter Lang, Monica Agrawal, Yoon Kim, David Sontag
We demonstrate that co-training (Blum & Mitchell, 1998) can improve the performance of prompt-based learning by using unlabeled data.
1 code implementation • 15 Nov 2022 • Bailin Wang, Ivan Titov, Jacob Andreas, Yoon Kim
We describe a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference.
1 code implementation • Findings (ACL) 2022 • Jiabao Ji, Yoon Kim, James Glass, Tianxing He
This work aims to develop a control mechanism by which a user can select spans of context as "highlights" for the model to focus on, and generate relevant output.
1 code implementation • 18 Dec 2022 • Songlin Yang, Roger P. Levy, Yoon Kim
We study grammar induction with mildly context-sensitive grammars for unsupervised discontinuous parsing.
1 code implementation • 8 Feb 2023 • Han Guo, Philip Greengard, Hongyi Wang, Andrew Gelman, Yoon Kim, Eric P. Xing
A recent alternative formulation instead treats federated learning as a distributed inference problem, where the goal is to infer a global posterior from partitioned client data (Al-Shedivat et al., 2021).
1 code implementation • NeurIPS 2019 • Sam Wiseman, Yoon Kim
We propose to learn deep undirected graphical models (i.e., MRFs) with a non-ELBO objective for which we can calculate exact gradients.
1 code implementation • ICML 2020 • Jonathan Mamou, Hang Le, Miguel Del Rio, Cory Stephenson, Hanlin Tang, Yoon Kim, SueYeon Chung
In addition, we find that the emergence of linear separability in these manifolds is driven by a combined reduction of manifolds' radius, dimensionality and inter-manifold correlations.
1 code implementation • WS 2014 • Yoon Kim, Yi-I Chiu, Kentaro Hanaki, Darshan Hegde, Slav Petrov
We provide a method for automatically detecting change in language across time through a chronologically trained neural language model.
1 code implementation • ICML 2020 • Rares-Darius Buhai, Yoni Halpern, Yoon Kim, Andrej Risteski, David Sontag
One of the most surprising and exciting discoveries in supervised learning was the benefit of overparameterization (i.e., training a very large model) in improving the optimization landscape of a problem, with minimal effect on statistical performance (i.e., generalization).
1 code implementation • 4 Apr 2024 • Aniruddha Nrusimha, Mayank Mishra, Naigang Wang, Dan Alistarh, Rameswar Panda, Yoon Kim
We show that regularizing both the inputs and outputs is crucial for preventing a model from "migrating" the difficulty of input quantization to the weights, which would make post-training quantization (PTQ) of the weights more difficult.
1 code implementation • 17 Nov 2022 • Tiwalayo Eisape, Vineet Gangireddy, Roger P. Levy, Yoon Kim
This suggests implicit incremental syntactic inferences underlie next-word predictions in autoregressive neural language models.
1 code implementation • 24 May 2023 • Lucas Torroba Hennigen, Yoon Kim
Masked language models (MLMs) do not explicitly define a distribution over language, i.e., they are not language models per se.
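The observation can be made concrete with the pseudo-log-likelihood, a common heuristic for scoring sentences with an MLM (shown as background here, not as the distribution-deriving construction the paper studies): mask each position in turn and sum the conditional log-probabilities. A sketch, assuming `mlm` is any callable returning per-token logits of shape (1, seq_len, vocab):

```python
import torch

def pseudo_log_likelihood(mlm, input_ids, mask_id):
    """Sum_i log p(x_i | x_{-i}) under the MLM's unary conditionals.
    Note this is a scoring heuristic, not a normalized distribution
    over sentences."""
    total = 0.0
    for i in range(input_ids.size(1)):
        masked = input_ids.clone()
        masked[0, i] = mask_id
        logits = mlm(masked)
        logp = torch.log_softmax(logits[0, i], dim=-1)
        total += logp[input_ids[0, i]].item()
    return total

# toy stand-in model for illustration
mlm = lambda ids: torch.randn(1, ids.size(1), 30)
score = pseudo_log_likelihood(mlm, torch.randint(0, 30, (1, 6)), mask_id=0)
```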
1 code implementation • 13 Feb 2024 • Kyle O'Brien, Nathan Ng, Isha Puri, Jorge Mendez, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi, Thomas Hartvigsen
Most techniques for improving OOD robustness are not applicable to settings where the model is effectively a black box, such as when the weights are frozen, retraining is costly, or the model is leveraged via an API.
1 code implementation • 26 Feb 2024 • Hang Jiang, Xiajie Zhang, Robert Mahari, Daniel Kessler, Eric Ma, Tal August, Irene Li, Alex 'Sandy' Pentland, Yoon Kim, Jad Kabbara, Deb Roy
Finally, we find that learning with stories shows a higher retention rate for non-native speakers in the follow-up assessment.
1 code implementation • 17 Apr 2024 • Yue Zhou, Yada Zhu, Diego Antognini, Yoon Kim, Yang Zhang
This paper studies the relationship between the surface form of a mathematical problem and its solvability by large language models.
no code implementations • 3 Jun 2018 • Gabriel Grand, Aron Szanto, Yoon Kim, Alexander Rush
Visual question answering (VQA) models respond to open-ended natural language questions about images.
no code implementations • 12 Sep 2017 • Guillaume Klein, Yoon Kim, Yuntian Deng, Josep Crego, Jean Senellart, Alexander M. Rush
We introduce an open-source toolkit for neural machine translation (NMT) to support research into model architectures, feature representations, and source modalities, while maintaining competitive performance, modularity and reasonable training requirements.
no code implementations • 3 Feb 2017 • Yoon Kim, Carl Denton, Luong Hoang, Alexander M. Rush
Attention networks have proven to be an effective approach for embedding categorical inference within a deep neural network.
no code implementations • WS 2016 • Allen Schmaltz, Yoon Kim, Alexander M. Rush, Stuart M. Shieber
We demonstrate that an attention-based encoder-decoder model can be used for sentence-level grammatical error identification for the Automated Evaluation of Scientific Writing (AESW) Shared Task 2016.
no code implementations • WS 2014 • Yoon Kim, Owen Zhang
We provide a simple but novel supervised weighting scheme for adjusting term frequency in tf-idf for sentiment analysis and text classification.
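For intuition, a supervised weight can be derived from how strongly a term skews toward one class; the log-odds form below is illustrative and not identical to the paper's credibility-adjusted term frequency.

```python
import math

def supervised_term_weight(pos_count, neg_count, smoothing=0.5):
    """Terms concentrated in one class get larger weights; class-neutral
    terms get weights near zero."""
    p = (pos_count + smoothing) / (pos_count + neg_count + 2 * smoothing)
    return abs(math.log(p / (1 - p)))    # magnitude of the log-odds

# a term seen 90 times in positive docs and 10 in negative docs
w = supervised_term_weight(90, 10)       # ≈ 2.15, so the term is upweighted
```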
no code implementations • 12 Jul 2018 • Adji B. Dieng, Yoon Kim, Alexander M. Rush, David M. Blei
VAEs can capture complex distributions, but they can also suffer from an issue known as "latent variable collapse," especially if the likelihood model is powerful.
no code implementations • 17 Dec 2018 • Yoon Kim, Sam Wiseman, Alexander M. Rush
There has been much recent, exciting work on combining the complementary strengths of latent variable models and deep learning.
no code implementations • 1 Jan 2021 • Matteo Alleman, Jonathan Mamou, Miguel A Del Rio, Hanlin Tang, Yoon Kim, SueYeon Chung
Importing from computational and cognitive neuroscience the notion of representational invariance, we perform a series of probes designed to test the sensitivity of Transformer representations to several kinds of structure in sentences.
no code implementations • ACL (RepL4NLP) 2021 • Matteo Alleman, Jonathan Mamou, Miguel A Del Rio, Hanlin Tang, Yoon Kim, SueYeon Chung
While vector-based language representations from pretrained language models have set a new standard for many NLP tasks, there is not yet a complete accounting of their inner workings.
no code implementations • 13 Jul 2021 • Stanislav Lukyanenko, Won-Dong Jang, Donglai Wei, Robbert Struyven, Yoon Kim, Brian Leahy, Helen Yang, Alexander Rush, Dalit Ben-Yosef, Daniel Needleman, Hanspeter Pfister
In this work, we propose a two-stream model for developmental stage classification.
no code implementations • 25 May 2022 • Monica Agrawal, Stefan Hegselmann, Hunter Lang, Yoon Kim, David Sontag
A long-running goal of the clinical NLP community is the extraction of important variables trapped in clinical notes.
no code implementations • 2 Mar 2023 • Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David Daniel Cox, Zhangyang Wang, Yoon Kim
Scaling transformers has led to significant breakthroughs in many domains, leading to a paradigm in which larger versions of existing models are trained and released on a periodic basis.
no code implementations • 6 Mar 2023 • Zhen Wang, Rameswar Panda, Leonid Karlinsky, Rogerio Feris, Huan Sun, Yoon Kim
Prompt tuning, in which a base pretrained model is adapted to each task via conditioning on learned prompt vectors, has emerged as a promising approach for efficiently adapting large language models to multiple downstream tasks.
no code implementations • 24 May 2023 • Hongyin Luo, Yung-Sung Chuang, Yuan Gong, Tianhua Zhang, Yoon Kim, Xixin Wu, Danny Fox, Helen Meng, James Glass
Large language models (LLMs) have been significantly improved by instruction fine-tuning, but still lack transparency and the ability to utilize up-to-date knowledge and information.
no code implementations • 15 Jun 2023 • Sarah J. Zhang, Samuel Florin, Ariel N. Lee, Eamon Niknafs, Andrei Marginean, Annie Wang, Keith Tyser, Zad Chin, Yann Hicke, Nikhil Singh, Madeleine Udell, Yoon Kim, Tonio Buonassisi, Armando Solar-Lezama, Iddo Drori
We curate a comprehensive dataset of 4,550 questions and solutions from problem sets, midterm exams, and final exams across all MIT Mathematics and Electrical Engineering and Computer Science (EECS) courses required for obtaining a degree.
no code implementations • 11 Oct 2023 • Cheng-I Jeff Lai, Freda Shi, Puyuan Peng, Yoon Kim, Kevin Gimpel, Shiyu Chang, Yung-Sung Chuang, Saurabhchand Bhati, David Cox, David Harwath, Yang Zhang, Karen Livescu, James Glass
We study phrase structure induction from visually-grounded speech.
no code implementations • 11 Oct 2023 • Bowen Pan, Rameswar Panda, SouYoung Jin, Rogerio Feris, Aude Oliva, Phillip Isola, Yoon Kim
We explore the use of language as a perceptual representation for vision-and-language navigation (VLN), with a focus on low-data settings.
no code implementations • 15 Nov 2023 • Lucas Torroba Hennigen, Shannon Shen, Aniruddha Nrusimha, Bernhard Gapp, David Sontag, Yoon Kim
LLMs are vulnerable to hallucinations, and thus their outputs generally require laborious human verification for high-stakes applications.
no code implementations • 13 Jan 2024 • Yujun Mao, Yoon Kim, Yilun Zhou
And while self-generated verbalizations of intermediate reasoning steps (i.e., chain-of-thought prompting) have been shown to be helpful, whether LLMs can make use of helpful side information such as problem-specific hints has not been investigated before.
no code implementations • 19 Jan 2024 • Mayank Agarwal, Yikang Shen, Bailin Wang, Yoon Kim, Jie Chen
In this work, we explore data-efficient adaptation of pre-trained code models by further pre-training and fine-tuning them with program structures.
no code implementations • 4 Feb 2024 • Peiqi Wang, Yikang Shen, Zhen Guo, Matthew Stallone, Yoon Kim, Polina Golland, Rameswar Panda
Our experiments demonstrate that the proposed diversity measure in the normalized weight gradient space is correlated with downstream instruction-following performance.
no code implementations • 21 Feb 2024 • William Merrill, Zhaofeng Wu, Norihito Naka, Yoon Kim, Tal Linzen
Do LMs infer the semantics of text from co-occurrence patterns in their training data?
no code implementations • 26 Feb 2024 • Jerry Ngo, Yoon Kim
This probe is trained via a contrastive loss that pushes the language representations and sound representations of an object to be close to one another.
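The training signal can be sketched as a standard InfoNCE-style objective over a batch of paired embeddings; the temperature and embedding sizes below are placeholders.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(lang_emb, sound_emb, temperature=0.07):
    """Matched language/sound pairs (the diagonal) are pulled together;
    every other pair in the batch serves as a negative."""
    lang = F.normalize(lang_emb, dim=-1)
    sound = F.normalize(sound_emb, dim=-1)
    logits = lang @ sound.T / temperature      # (batch, batch) similarities
    targets = torch.arange(lang.size(0))
    return F.cross_entropy(logits, targets)

loss = contrastive_loss(torch.randn(16, 128), torch.randn(16, 128))
```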
no code implementations • 17 Mar 2024 • Dong Won Lee, Hae Won Park, Yoon Kim, Cynthia Breazeal, Louis-Philippe Morency
We describe an approach for aligning an LLM-based dialogue agent based on global (i.e., dialogue-level) rewards, while also taking into account naturally-occurring multimodal signals.
no code implementations • 18 Apr 2024 • Zhaofeng Wu, Ananth Balashankar, Yoon Kim, Jacob Eisenstein, Ahmad Beirami
In this work, we evaluate a simple approach for zero-shot cross-lingual alignment, where a reward model is trained on preference data in one source language and directly applied to other target languages.