Search Results for author: Zejiang Shen

Found 8 papers, 6 papers with code

Don't Say What You Don't Know: Improving the Consistency of Abstractive Summarization by Constraining Beam Search

no code implementations • 16 Mar 2022 • Daniel King, Zejiang Shen, Nishant Subramani, Daniel S. Weld, Iz Beltagy, Doug Downey

Based on our findings, we present PINOCCHIO, a new decoding method that improves the consistency of a transformer-based abstractive summarizer by constraining beam search to avoid hallucinations.

Abstractive Text Summarization

VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups

1 code implementation • 1 Jun 2021 • Zejiang Shen, Kyle Lo, Lucy Lu Wang, Bailey Kuehl, Daniel S. Weld, Doug Downey

Experiments are conducted on a newly curated evaluation suite, S2-VLUE, that unifies existing automatically-labeled datasets and includes a new dataset of manual annotations covering diverse papers from 19 scientific disciplines.

Language Modelling, Text Classification +1

LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis

5 code implementations • 29 Mar 2021 • Zejiang Shen, Ruochen Zhang, Melissa Dell, Benjamin Charles Germain Lee, Jacob Carlson, Weining Li

Recent advances in document image analysis (DIA) have been primarily driven by the application of neural networks.

PAWLS: PDF Annotation With Labels and Structure

1 code implementation • ACL 2021 • Mark Neumann, Zejiang Shen, Sam Skjonsberg

Adobe's Portable Document Format (PDF) is a popular way of distributing view-only documents with a rich visual markup.

OLALA: Object-Level Active Learning for Efficient Document Layout Annotation

1 code implementation • 5 Oct 2020 • Zejiang Shen, Jian Zhao, Melissa Dell, YaoLiang Yu, Weining Li

Document images often have intricate layout structures, with numerous content regions (e.g., texts, figures, tables) densely arranged on each page.

Active Learning, Object Detection

A Large Dataset of Historical Japanese Documents with Complex Layouts

3 code implementations • 18 Apr 2020 • Zejiang Shen, Kaixuan Zhang, Melissa Dell

Deep learning-based approaches for automatic document layout analysis and content extraction have the potential to unlock rich information trapped in historical documents on a large scale.

Document Layout Analysis

Generating Object Stamps

1 code implementation • 1 Jan 2020 • Youssef Alami Mejjati, Zejiang Shen, Michael Snower, Aaron Gokaslan, Oliver Wang, James Tompkin, Kwang In Kim

We present an algorithm to generate diverse foreground objects and composite them into background images using a GAN architecture.

Information Extraction from Text Regions with Complex Tabular Structure

no code implementations • NeurIPS Workshop Document_Intelligen 2019 • Kaixuan Zhang, Zejiang Shen, Jie Zhou, Melissa Dell

Recent innovations have improved layout analysis of document images, significantly improving our ability to identify text and non-text regions.
