no code implementations • 3 Apr 2024 • Yejin Jeon, Yunsu Kim, Gary Geunbae Lee
Contemporary neural speech synthesis models have demonstrated remarkable proficiency in synthetic speech generation, attaining a quality comparable to that of human-produced speech.
1 code implementation • 1 Apr 2024 • Jihoo Kim, Wonho Song, Dahyun Kim, Yunsu Kim, Yungi Kim, Chanjun Park
This paper introduces Evalverse, a novel library that streamlines the evaluation of Large Language Models (LLMs) by unifying disparate evaluation tools into a single, user-friendly framework.
1 code implementation • 31 Mar 2024 • Seonjeong Hwang, Yunsu Kim, Gary Geunbae Lee
We also show that our model logically and incrementally increases question complexity, and that the generated multi-hop questions are beneficial for training question answering models.
no code implementations • 28 Mar 2024 • Dahyun Kim, Yungi Kim, Wonho Song, Hyeonwoo Kim, Yunsu Kim, Sanghoon Kim, Chanjun Park
As the development of large language models (LLMs) progresses, aligning them with human preferences has become increasingly important.
1 code implementation • 26 Mar 2024 • Deokhyung Kang, Baikjin Jung, Yunsu Kim, Gary Geunbae Lee
Previous studies in table-text open-domain question answering have two common challenges: firstly, their retrievers can be affected by false-positive labels in training datasets; secondly, they may struggle to provide appropriate evidence for questions that require reasoning across the table.
1 code implementation • 13 Mar 2024 • Heejin Do, Yunsu Kim, Gary Geunbae Lee
Recently, encoder-only pre-trained models such as BERT have been successfully applied in automated essay scoring (AES) to predict a single overall score.
1 code implementation • 20 Jan 2024 • Golara Javadi, Kamer Ali Yuksel, Yunsu Kim, Thiago Castro Ferreira, Mohamed Al-Badrashiny
The findings suggest that NoRefER is not merely a tool for error detection but also a comprehensive framework for enhancing ASR systems' transparency, efficiency, and effectiveness.
2 code implementations • 23 Dec 2023 • Dahyun Kim, Chanjun Park, Sanghoon Kim, Wonsung Lee, Wonho Song, Yunsu Kim, Hyeonwoo Kim, Yungi Kim, Hyeonju Lee, Jihoo Kim, Changbae Ahn, Seonghoon Yang, Sukyung Lee, Hyunbyung Park, Gyoungjin Gim, Mikyoung Cha, Hwalsuk Lee, Sunghun Kim
We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters, demonstrating superior performance in various natural language processing (NLP) tasks.
no code implementations • 6 Dec 2023 • Wonjun Lee, Gary Geunbae Lee, Yunsu Kim
This research contributes to the advancements of two-pass ASR systems in low-resource languages, offering the potential for improved cross-lingual transfer learning.
1 code implementation • 4 Dec 2023 • Jihyun Lee, Yejin Jeon, Wonjun Lee, Yunsu Kim, Gary Geunbae Lee
We address this by investigating synthetic audio data for audio-based DST.
1 code implementation • 26 May 2023 • Heejin Do, Yunsu Kim, Gary Geunbae Lee
With rapid technological growth, automatic pronunciation assessment has transitioned toward systems that evaluate pronunciation in various aspects, such as fluency and stress.
1 code implementation • 26 May 2023 • Heejin Do, Yunsu Kim, Gary Geunbae Lee
Thus, predicting various trait scores of unseen-prompt essays (called cross-prompt essay trait scoring) is a remaining challenge of AES.
no code implementations • 17 May 2023 • Baikjin Jung, Myungji Lee, Jong-Hyeok Lee, Yunsu Kim
Automatic post-editing (APE) is an automated process that refines a given machine translation (MT).
no code implementations • 17 Mar 2023 • Jihyun Lee, Seungyeon Seo, Yunsu Kim, Gary Geunbae Lee
We present our work on Track 2 in the Dialog System Technology Challenges 11 (DSTC11).
no code implementations • 17 Nov 2022 • Jihyun Lee, Chaebin Lee, Yunsu Kim, Gary Geunbae Lee
In dialogue state tracking (DST), labeling the dataset involves considerable human labor.
1 code implementation • 15 Nov 2022 • Heejin Do, Yunsu Kim, Gary Geunbae Lee
In this paper, we propose a Hierarchical Pronunciation Assessment with Multi-aspect Attention (HiPAMA) model, which hierarchically represents the granularity levels to directly capture their linguistic structures, and which introduces multi-aspect attention reflecting associations across aspects at the same level to create more connotative representations.
no code implementations • 24 Oct 2022 • Seonjeong Hwang, Yunsu Kim, Gary Geunbae Lee
Conversational question answering (CQA) facilitates an incremental and interactive understanding of a given context, but building a CQA system is difficult for many domains due to the problem of data scarcity.
no code implementations • EAMT 2020 • Yunsu Kim, Miguel Graça, Hermann Ney
This paper studies the practicality of the current state-of-the-art unsupervised methods in neural machine translation (NMT).
1 code implementation • WS 2019 • Yunsu Kim, Duc Thanh Tran, Hermann Ney
Document-level context has received much attention for compensating for the neural machine translation (NMT) of isolated sentences.
no code implementations • IJCNLP 2019 • Yunsu Kim, Petre Petrov, Pavel Petrushkov, Shahram Khadivi, Hermann Ney
We present effective pre-training strategies for neural machine translation (NMT) using parallel corpora involving a pivot language, i.e., source-pivot and pivot-target, leading to a significant improvement in source-target translation.
no code implementations • WS 2019 • Jan Rosendahl, Christian Herold, Yunsu Kim, Miguel Graça, Weiyue Wang, Parnia Bahar, Yingbo Gao, Hermann Ney
For the De-En task, none of the tested methods gave a significant improvement over last year's winning system and we end up with the same performance, resulting in 39.6% BLEU on newstest2019.
no code implementations • WS 2019 • Miguel Graça, Yunsu Kim, Julian Schamper, Shahram Khadivi, Hermann Ney
Back-translation - data augmentation by translating target monolingual data - is a crucial component in modern neural machine translation (NMT).
no code implementations • WS 2019 • Yunsu Kim, Hendrik Rosendahl, Nick Rossenbach, Jan Rosendahl, Shahram Khadivi, Hermann Ney
We propose a novel model architecture and training algorithm to learn bilingual sentence embeddings from a combination of parallel and monolingual data.
1 code implementation • ACL 2019 • Yunsu Kim, Yingbo Gao, Hermann Ney
Transfer learning or multilingual models are essential for low-resource neural machine translation (NMT), but their applicability is limited to cognate languages that share vocabularies.
no code implementations • WS 2016 • Yunsu Kim, Andreas Guta, Joern Wuebker, Hermann Ney
This work systematically analyzes the smoothing effect of vocabulary reduction for phrase translation models.
no code implementations • EACL 2017 • Yunsu Kim, Julian Schamper, Hermann Ney
We address for the first time unsupervised training for a translation task with hundreds of thousands of vocabulary words.
no code implementations • EMNLP 2018 • Yunsu Kim, Jiahui Geng, Hermann Ney
Unsupervised learning of cross-lingual word embedding offers elegant matching of words across languages, but has fundamental limitations in translating sentences.
no code implementations • WS 2018 • Nick Rossenbach, Jan Rosendahl, Yunsu Kim, Miguel Graça, Aman Gokrani, Hermann Ney
We use several rule-based, heuristic methods to preselect sentence pairs.
1 code implementation • WS 2018 • Julian Schamper, Jan Rosendahl, Parnia Bahar, Yunsu Kim, Arne Nix, Hermann Ney
In total we improve by 6.8% BLEU over our last year's submission and by 4.8% BLEU over the winning system of the 2017 German→English task.
no code implementations • WS 2018 • Miguel Graça, Yunsu Kim, Julian Schamper, Jiahui Geng, Hermann Ney
This paper describes the unsupervised neural machine translation (NMT) systems of the RWTH Aachen University developed for the English↔German news translation task of the EMNLP 2018 Third Conference on Machine Translation (WMT 2018).