no code implementations • 3 Apr 2024 • Yejin Jeon, Yunsu Kim, Gary Geunbae Lee
Contemporary neural speech synthesis models have demonstrated remarkable proficiency in synthetic speech generation, attaining a quality comparable to that of human-produced speech.
1 code implementation • 1 Apr 2024 • Jihoo Kim, Wonho Song, Dahyun Kim, Yunsu Kim, Yungi Kim, Chanjun Park
This paper introduces Evalverse, a novel library that streamlines the evaluation of Large Language Models (LLMs) by unifying disparate evaluation tools into a single, user-friendly framework.
1 code implementation • 31 Mar 2024 • Seonjeong Hwang, Yunsu Kim, Gary Geunbae Lee
We also show that our model logically and incrementally increases question complexity, and that the generated multi-hop questions are beneficial for training question answering models.
no code implementations • 28 Mar 2024 • Dahyun Kim, Yungi Kim, Wonho Song, Hyeonwoo Kim, Yunsu Kim, Sanghoon Kim, Chanjun Park
As the development of large language models (LLMs) progresses, aligning them with human preferences has become increasingly important.
1 code implementation • 26 Mar 2024 • Deokhyung Kang, Baikjin Jung, Yunsu Kim, Gary Geunbae Lee
Previous studies in table-text open-domain question answering have two common challenges: firstly, their retrievers can be affected by false-positive labels in training datasets; secondly, they may struggle to provide appropriate evidence for questions that require reasoning across the table.
1 code implementation • 13 Mar 2024 • Heejin Do, Yunsu Kim, Gary Geunbae Lee
Recently, encoder-only pre-trained models such as BERT have been successfully applied in automated essay scoring (AES) to predict a single overall score.
1 code implementation • 20 Jan 2024 • Golara Javadi, Kamer Ali Yuksel, Yunsu Kim, Thiago Castro Ferreira, Mohamed Al-Badrashiny
The findings suggest that NoRefER is not merely a tool for error detection but also a comprehensive framework for enhancing ASR systems' transparency, efficiency, and effectiveness.
2 code implementations • 23 Dec 2023 • Dahyun Kim, Chanjun Park, Sanghoon Kim, Wonsung Lee, Wonho Song, Yunsu Kim, Hyeonwoo Kim, Yungi Kim, Hyeonju Lee, Jihoo Kim, Changbae Ahn, Seonghoon Yang, Sukyung Lee, Hyunbyung Park, Gyoungjin Gim, Mikyoung Cha, Hwalsuk Lee, Sunghun Kim
We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters, demonstrating superior performance in various natural language processing (NLP) tasks.
no code implementations • 6 Dec 2023 • Wonjun Lee, Gary Geunbae Lee, Yunsu Kim
This research contributes to the advancements of two-pass ASR systems in low-resource languages, offering the potential for improved cross-lingual transfer learning.
1 code implementation • 4 Dec 2023 • Jihyun Lee, Yejin Jeon, Wonjun Lee, Yunsu Kim, Gary Geunbae Lee
We address this by investigating synthetic audio data for audio-based DST.
1 code implementation • 26 May 2023 • Heejin Do, Yunsu Kim, Gary Geunbae Lee
With rapid technological growth, automatic pronunciation assessment has transitioned toward systems that evaluate pronunciation in various aspects, such as fluency and stress.
1 code implementation • 26 May 2023 • Heejin Do, Yunsu Kim, Gary Geunbae Lee
Thus, predicting various trait scores of unseen-prompt essays (called cross-prompt essay trait scoring) is a remaining challenge of AES.
no code implementations • 17 May 2023 • Baikjin Jung, Myungji Lee, Jong-Hyeok Lee, Yunsu Kim
Automatic post-editing (APE) is an automated process that refines a given machine translation (MT).
no code implementations • 17 Mar 2023 • Jihyun Lee, Seungyeon Seo, Yunsu Kim, Gary Geunbae Lee
We present our work on Track 2 in the Dialog System Technology Challenges 11 (DSTC11).
no code implementations • 17 Nov 2022 • Jihyun Lee, Chaebin Lee, Yunsu Kim, Gary Geunbae Lee
In dialogue state tracking (DST), labeling the dataset involves considerable human labor.
1 code implementation • 15 Nov 2022 • Heejin Do, Yunsu Kim, Gary Geunbae Lee
In this paper, we propose a Hierarchical Pronunciation Assessment with Multi-aspect Attention (HiPAMA) model, which hierarchically represents the granularity levels to directly capture their linguistic structures, and which introduces multi-aspect attention reflecting associations across aspects at the same level to create more connotative representations.
no code implementations • 24 Oct 2022 • Seonjeong Hwang, Yunsu Kim, Gary Geunbae Lee
Conversational question answering (CQA) facilitates an incremental and interactive understanding of a given context, but building a CQA system is difficult for many domains due to the problem of data scarcity.
no code implementations • EAMT 2020 • Yunsu Kim, Miguel Graça, Hermann Ney
This paper studies the practicality of the current state-of-the-art unsupervised methods in neural machine translation (NMT).
1 code implementation • WS 2019 • Yunsu Kim, Duc Thanh Tran, Hermann Ney
Document-level context has received much attention for compensating for the neural machine translation (NMT) of isolated sentences.
no code implementations • IJCNLP 2019 • Yunsu Kim, Petre Petrov, Pavel Petrushkov, Shahram Khadivi, Hermann Ney
We present effective pre-training strategies for neural machine translation (NMT) using parallel corpora involving a pivot language, i.e., source-pivot and pivot-target, leading to a significant improvement in source-target translation.
no code implementations • WS 2019 • Jan Rosendahl, Christian Herold, Yunsu Kim, Miguel Graça, Weiyue Wang, Parnia Bahar, Yingbo Gao, Hermann Ney
For the De-En task, none of the tested methods gave a significant improvement over last year's winning system and we end up with the same performance, resulting in 39.6% BLEU on newstest2019.
no code implementations • WS 2019 • Miguel Graça, Yunsu Kim, Julian Schamper, Shahram Khadivi, Hermann Ney
Back-translation - data augmentation by translating target monolingual data - is a crucial component in modern neural machine translation (NMT).
no code implementations • WS 2019 • Yunsu Kim, Hendrik Rosendahl, Nick Rossenbach, Jan Rosendahl, Shahram Khadivi, Hermann Ney
We propose a novel model architecture and training algorithm to learn bilingual sentence embeddings from a combination of parallel and monolingual data.
1 code implementation • ACL 2019 • Yunsu Kim, Yingbo Gao, Hermann Ney
Transfer learning or multilingual models are essential for low-resource neural machine translation (NMT), but their applicability is limited to cognate languages that share vocabularies.
no code implementations • WS 2016 • Yunsu Kim, Andreas Guta, Joern Wuebker, Hermann Ney
This work systematically analyzes the smoothing effect of vocabulary reduction for phrase translation models.
no code implementations • EACL 2017 • Yunsu Kim, Julian Schamper, Hermann Ney
We address for the first time unsupervised training for a translation task with hundreds of thousands of vocabulary words.
no code implementations • EMNLP 2018 • Yunsu Kim, Jiahui Geng, Hermann Ney
Unsupervised learning of cross-lingual word embedding offers elegant matching of words across languages, but has fundamental limitations in translating sentences.
no code implementations • WS 2018 • Nick Rossenbach, Jan Rosendahl, Yunsu Kim, Miguel Graça, Aman Gokrani, Hermann Ney
We use several rule-based, heuristic methods to preselect sentence pairs.
1 code implementation • WS 2018 • Julian Schamper, Jan Rosendahl, Parnia Bahar, Yunsu Kim, Arne Nix, Hermann Ney
In total we improve by 6.8% BLEU over our last year's submission and by 4.8% BLEU over the winning system of the 2017 German→English task.
no code implementations • WS 2018 • Miguel Graça, Yunsu Kim, Julian Schamper, Jiahui Geng, Hermann Ney
This paper describes the unsupervised neural machine translation (NMT) systems of the RWTH Aachen University developed for the English↔German news translation task of the EMNLP 2018 Third Conference on Machine Translation (WMT 2018).