no code implementations • 8 Sep 2023 • Kyoungyeon Cho, Seungkum Han, Wonseok Hwang
Here we provide NESTLE, a no-code tool for large-scale statistical analysis of legal corpora.
1 code implementation • 3 Nov 2022 • Wonseok Hwang, Saehee Eom, Hanuhl Lee, Hai Jin Park, Minjoon Seo
Lawyers, for instance, search for appropriate precedents favorable to their clients, while the number of legal precedents is ever-growing.
1 code implementation • 10 Jun 2022 • Wonseok Hwang, Dongjun Lee, Kyoungyeon Cho, Hanuhl Lee, Minjoon Seo
Here we present the first large-scale benchmark of Korean legal AI datasets, LBOX OPEN, that consists of one legal corpus, two classification tasks, two legal judgement prediction (LJP) tasks, and one summarization task.
no code implementations • 23 Feb 2022 • Geewook Kim, Wonseok Hwang, Minjoon Seo, Seunghyun Park
Semi-structured query systems for document-oriented databases have many real applications.
Optical Character Recognition (OCR)
3 code implementations • 30 Nov 2021 • Geewook Kim, Teakgyu Hong, Moonbin Yim, Jeongyeon Nam, Jinyoung Park, Jinyeong Yim, Wonseok Hwang, Sangdoo Yun, Dongyoon Han, Seunghyun Park
Current Visual Document Understanding (VDU) methods outsource the task of reading text to off-the-shelf Optical Character Recognition (OCR) engines and focus on the understanding task with the OCR outputs.
Ranked #10 on Document Image Classification on RVL-CDIP
1 code implementation • 10 Aug 2021 • Teakgyu Hong, Donghyun Kim, Mingi Ji, Wonseok Hwang, Daehyun Nam, Sungrae Park
On the other hand, this paper tackles the problem by going back to the basics: an effective combination of text and layout.
Ranked #3 on Relation Extraction on FUNSD
no code implementations • EMNLP 2021 • Wonseok Hwang, Hyunji Lee, Jinyeong Yim, Geewook Kim, Minjoon Seo
A real-world information extraction (IE) system for semi-structured document images often involves a long pipeline of multiple modules, whose complexity dramatically increases its development and maintenance cost.
no code implementations • 1 Jan 2021 • Teakgyu Hong, Donghyun Kim, Mingi Ji, Wonseok Hwang, Daehyun Nam, Sungrae Park
Although recent advances in OCR enable the accurate extraction of text segments, it is still challenging to extract key information from documents due to the diversity of layouts.
no code implementations • 27 Nov 2020 • Juno Hwang, Wonseok Hwang, Junghyo Jo
The restricted Boltzmann machine (RBM) is a representative generative model based on the concept of statistical mechanics.
no code implementations • AKBC 2020 • Wonseok Hwang, Jinyeong Yim, Seunghyun Park, Minjoon Seo
Deep learning approaches to semantic parsing require a large amount of labeled data, but annotating complex logical forms is costly.
1 code implementation • Findings (ACL) 2021 • Wonseok Hwang, Jinyeong Yim, Seunghyun Park, Sohee Yang, Minjoon Seo
Information Extraction (IE) for semi-structured document images is often approached as a sequence tagging problem by classifying each recognized input token into one of the IOB (Inside, Outside, and Beginning) categories.
1 code implementation • NeurIPS Workshop Document_Intelligen 2019 • Wonseok Hwang, Seonghyeon Kim, Minjoon Seo, Jinyeong Yim, Seunghyun Park, Sungrae Park, Junyeop Lee, Bado Lee, Hwalsuk Lee
Parsing textual information embedded in images is important for various downstream tasks.
Optical Character Recognition (OCR)
5 code implementations • 4 Feb 2019 • Wonseok Hwang, Jinyeong Yim, Seunghyun Park, Minjoon Seo
We present SQLova, the first Natural-language-to-SQL (NL2SQL) model to achieve human performance on the WikiSQL dataset.