no code implementations • EMNLP 2020 • Stefan Larson, Anthony Zheng, Anish Mahendran, Rishi Tekriwal, Adrian Cheung, Eric Guldan, Kevin Leach, Jonathan K. Kummerfeld
Diverse data is crucial for training robust models, but crowdsourced text often lacks diversity as workers tend to write simple variations from prompts.
1 code implementation • 8 Mar 2024 • Zhijian Li, Stefan Larson, Kevin Leach
Intent classifiers must be able to distinguish when a user's utterance does not belong to any supported intent to avoid producing incorrect and unrelated system responses.
no code implementations • 21 Jun 2023 • Stefan Larson, Gordon Lim, Kevin Leach
The RVL-CDIP benchmark is widely used for measuring performance on the task of document classification.
no code implementations • 16 Mar 2023 • Alexander Groleau, Kok Wei Chee, Stefan Larson, Samay Maini, Jonathan Boarman
Document denoising and binarization are fundamental problems in the document processing space, but current datasets are often too small and lack sufficient complexity to effectively train and benchmark modern data-driven machine learning models.
1 code implementation • 14 Oct 2022 • Stefan Larson, Gordon Lim, Yutong Ai, David Kuang, Kevin Leach
Our new out-of-distribution benchmark consists of two types of documents: those that are not part of any of the 16 in-domain RVL-CDIP categories (RVL-CDIP-O), and those that are one of the 16 in-domain categories yet are drawn from a distribution different from that of the original RVL-CDIP dataset (RVL-CDIP-N).
2 code implementations • 30 Aug 2022 • Alexander Groleau, Kok Wei Chee, Stefan Larson, Samay Maini, Jonathan Boarman
This paper introduces Augraphy, a Python library for constructing data augmentation pipelines which produce distortions commonly seen in real-world document image datasets.
no code implementations • 26 Jul 2022 • Stefan Larson, Kevin Leach
By extension, so too has interest in developing and improving intent classification and slot-filling models, which are two components that are commonly used in task-oriented dialog systems.
1 code implementation • SIGDIAL (ACL) 2022 • Stefan Larson, Kevin Leach
Similarly, developers of such ML-driven systems need to be able to add new training data to an already-existing dataset to support these new skills.
no code implementations • 5 Aug 2021 • Stefan Larson, Navtej Singh, Saarthak Maheshwari, Shanti Stewart, Uma Krishnaswamy
To be robust enough for widespread adoption, document analysis systems involving machine learning models must be able to respond correctly to inputs that fall outside of the data distribution that was used to generate the data on which the models were trained.
1 code implementation • EACL 2021 • Jacob Solawetz, Stefan Larson
We introduce a new dataset by converting the QA-SRL 2.0 dataset to a large-scale OIE dataset (LSOIE).
no code implementations • COLING 2020 • Stefan Larson, Adrian Cheung, Anish Mahendran, Kevin Leach, Jonathan K. Kummerfeld
Using three new noisy crowd-annotated datasets, we show that a wide range of inconsistencies occur and can impact system performance if not addressed.
no code implementations • LREC 2020 • Stefan Larson, Eric Guldan, Kevin Leach
Typical machine learning approaches to developing task-oriented dialog systems require the collection and management of large amounts of training data, especially for the tasks of intent classification and slot-filling.
5 code implementations • IJCNLP 2019 • Stefan Larson, Anish Mahendran, Joseph J. Peper, Christopher Clarke, Andrew Lee, Parker Hill, Jonathan K. Kummerfeld, Kevin Leach, Michael A. Laurenzano, Lingjia Tang, Jason Mars
We find that while the classifiers perform well on in-scope intent classification, they struggle to identify out-of-scope queries.
no code implementations • NAACL 2019 • Stefan Larson, Anish Mahendran, Andrew Lee, Jonathan K. Kummerfeld, Parker Hill, Michael A. Laurenzano, Johann Hauswald, Lingjia Tang, Jason Mars
We also present a novel data collection pipeline built atop our detection technique to automatically and iteratively mine unique data samples while discarding erroneous samples.