no code implementations • 24 Mar 2025 • Armineh Nourbakhsh, Siddharth Parekh, Pranav Shetty, Zhao Jin, Sameena Shah, Carolyn Rose
Document Visual Question Answering (VQA) models have evolved at an impressive rate over the past few years, coming close to or matching human performance on some benchmarks.
no code implementations • 20 Oct 2024 • Ran Zmigrod, Pranav Shetty, Mathieu Sibue, Zhiqiang Ma, Armineh Nourbakhsh, Xiaomo Liu, Manuela Veloso
In this work, we present K2Q, a diverse collection of five datasets converted from KIE to a prompt-response format using a plethora of bespoke templates.
1 code implementation • 2 May 2024 • Zhiyu Zoey Chen, Jing Ma, Xinlu Zhang, Nan Hao, An Yan, Armineh Nourbakhsh, Xianjun Yang, Julian McAuley, Linda Petzold, William Yang Wang
In the fast-evolving domain of artificial intelligence, large language models (LLMs) such as GPT-3 and GPT-4 are revolutionizing the landscapes of finance, healthcare, and law: domains characterized by their reliance on professional expertise, challenging data acquisition, high stakes, and stringent regulatory compliance.
no code implementations • 5 Apr 2024 • Ran Zmigrod, Dongsheng Wang, Mathieu Sibue, Yulong Pei, Petr Babkin, Ivan Brugere, Xiaomo Liu, Nacho Navarro, Antony Papadimitriou, William Watson, Zhiqiang Ma, Armineh Nourbakhsh, Sameena Shah
Several datasets exist for research on specific VRDU tasks, such as document classification (DC), key entity extraction (KEE), entity linking, and visual question answering (VQA), among others.
no code implementations • 7 Feb 2024 • Ran Zmigrod, Zhiqiang Ma, Armineh Nourbakhsh, Sameena Shah
Visually Rich Form Understanding (VRFU) poses a complex research problem due to the documents' highly structured nature and yet highly variable style and content.
no code implementations • 5 Jan 2024 • Dongsheng Wang, Zhiqiang Ma, Armineh Nourbakhsh, Kang Gu, Sameena Shah
Advances in Visually Rich Document Understanding (VrDU) have enabled information extraction and question answering over documents with complex layouts.
no code implementations • 31 Dec 2023 • Dongsheng Wang, Natraj Raman, Mathieu Sibue, Zhiqiang Ma, Petr Babkin, Simerjot Kaur, Yulong Pei, Armineh Nourbakhsh, Xiaomo Liu
Enterprise documents, such as forms, invoices, receipts, reports, contracts, and other similar records, often carry rich semantics at the intersection of textual and spatial modalities.
no code implementations • 27 Oct 2021 • Simerjot Kaur, Ivan Brugere, Andrea Stefanucci, Armineh Nourbakhsh, Sameena Shah, Manuela Veloso
We compare the performance of our system with human generated recommendations and demonstrate the ability of our algorithm to perform extremely well on this task.
no code implementations • 19 Sep 2021 • Mahmoud Mahfouz, Armineh Nourbakhsh, Sameena Shah
Organizations around the world face an array of risks impacting their operations globally.
no code implementations • 23 Oct 2020 • Natraj Raman, Armineh Nourbakhsh, Sameena Shah, Manuela Veloso
Task-specific fine-tuning of a pre-trained neural language model using a custom softmax output layer is the de facto approach of late when dealing with document classification problems.
no code implementations • 2 Oct 2020 • Vineeth Ravi, Selim Amrouni, Andrea Stefanucci, Armineh Nourbakhsh, Prashant Reddy, Manuela Veloso
Digital reports are often created based on tedious manual analysis as well as visualization of the underlying trends and characteristics of data.
no code implementations • 17 May 2020 • Zhiqiang Ma, Steven Pomerville, Mingyang Di, Armineh Nourbakhsh
In this paper we present SPot, an automated tool for detecting operating segments and their related performance indicators from earnings reports.
no code implementations • 24 Aug 2019 • Armineh Nourbakhsh, Grace Bang
In the finance sector, studies focused on anomaly detection are often associated with time-series and transactional data analytics.
no code implementations • 11 Nov 2017 • Xiaomo Liu, Armineh Nourbakhsh, Quanzhi Li, Sameena Shah, Robert Martin, John Duprey
It has a bottom-up approach to news detection, and does not rely on a predefined set of sources or subjects.
Social and Information Networks
no code implementations • 14 Aug 2017 • Quanzhi Li, Sameena Shah, Xiaomo Liu, Armineh Nourbakhsh
In addition to the data sets learned from just tweet data, we also built embedding sets from the general data and the combination of tweets with the general data.
no code implementations • SEMEVAL 2017 • Quanzhi Li, Armineh Nourbakhsh, Xiaomo Liu, Rui Fang, Sameena Shah
This paper describes the approach we used for SemEval-2017 Task 4: Sentiment Analysis in Twitter.
no code implementations • SEMEVAL 2017 • Quanzhi Li, Sameena Shah, Armineh Nourbakhsh, Rui Fang, Xiaomo Liu
This paper describes the approach we used for SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on Financial Microblogs.