Search Results for author: Vincent Perot

Found 11 papers, 2 papers with code

CodecLM: Aligning Language Models with Tailored Synthetic Data

no code implementations • 8 Apr 2024 • Zifeng Wang, Chun-Liang Li, Vincent Perot, Long T. Le, Jin Miao, Zizhao Zhang, Chen-Yu Lee, Tomas Pfister

To this end, we introduce CodecLM, a general framework for adaptively generating high-quality synthetic data for LLM alignment with different downstream instruction distributions and LLMs.

Instruction Following

Paper
Add Code

Noise-Aware Training of Layout-Aware Language Models

no code implementations • 30 Mar 2024 • Ritesh Sarkhel, Xiaoqi Ren, Lauro Beltrao Costa, Guolong Su, Vincent Perot, Yanan Xie, Emmanouil Koukoumidis, Arnab Nandi

Pre-training an extractor model on unlabeled instances of the target document type, followed by a fine-tuning step on human-labeled instances does not work in these scenarios, as it surpasses the maximum allowable training time allocated for the extractor.

Transfer Learning

Paper
Add Code

Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding

no code implementations • 9 Jan 2024 • Zilong Wang, Hao Zhang, Chun-Liang Li, Julian Martin Eisenschlos, Vincent Perot, Zifeng Wang, Lesly Miculicich, Yasuhisa Fujii, Jingbo Shang, Chen-Yu Lee, Tomas Pfister

We propose the Chain-of-Table framework, where tabular data is explicitly used in the reasoning chain as a proxy for intermediate thoughts.

Ranked #3 on Table-based Fact Verification on TabFact

Fact Verification In-Context Learning +3

Paper
Add Code

LMDX: Language Model-based Document Information Extraction and Localization

no code implementations • 19 Sep 2023 • Vincent Perot, Kai Kang, Florian Luisier, Guolong Su, Xiaoyu Sun, Ramya Sree Boppana, Zilong Wang, Jiaqi Mu, Hao Zhang, Nan Hua

Large Language Models (LLM) have revolutionized Natural Language Processing (NLP), improving state-of-the-art on many existing tasks and exhibiting emergent capabilities.

Language Modelling

Paper
Add Code

FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction

no code implementations • 4 May 2023 • Chen-Yu Lee, Chun-Liang Li, Hao Zhang, Timothy Dozat, Vincent Perot, Guolong Su, Xiang Zhang, Kihyuk Sohn, Nikolai Glushnev, Renshen Wang, Joshua Ainslie, Shangbang Long, Siyang Qin, Yasuhisa Fujii, Nan Hua, Tomas Pfister

In FormNetV2, we introduce a centralized multimodal graph contrastive learning strategy to unify self-supervised pre-training for all modalities in one loss.

Contrastive Learning document understanding +1

Paper
Add Code

QueryForm: A Simple Zero-shot Form Entity Query Framework

no code implementations • 14 Nov 2022 • Zifeng Wang, Zizhao Zhang, Jacob Devlin, Chen-Yu Lee, Guolong Su, Hao Zhang, Jennifer Dy, Vincent Perot, Tomas Pfister

Zero-shot transfer learning for document understanding is a crucial yet under-investigated scenario to help reduce the high cost involved in annotating document entities.

document understanding Transfer Learning

Paper
Add Code

DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning

2 code implementations • 10 Apr 2022 • Zifeng Wang, Zizhao Zhang, Sayna Ebrahimi, Ruoxi Sun, Han Zhang, Chen-Yu Lee, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, Tomas Pfister

Continual learning aims to enable a single model to learn a sequence of tasks without catastrophic forgetting.

Continual Learning

371

Paper
Code

FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction

no code implementations • ACL 2022 • Chen-Yu Lee, Chun-Liang Li, Timothy Dozat, Vincent Perot, Guolong Su, Nan Hua, Joshua Ainslie, Renshen Wang, Yasuhisa Fujii, Tomas Pfister

Sequence modeling has demonstrated state-of-the-art performance on natural language and document understanding tasks.

document understanding

Paper
Add Code

Learning to Prompt for Continual Learning

4 code implementations • CVPR 2022 • Zifeng Wang, Zizhao Zhang, Chen-Yu Lee, Han Zhang, Ruoxi Sun, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, Tomas Pfister

The mainstream paradigm behind continual learning has been to adapt the model parameters to non-stationary data distributions, where catastrophic forgetting is the central challenge.

Class Incremental Learning Image Classification

1,657

Paper
Code

Text Classification with Few Examples using Controlled Generalization

no code implementations • NAACL 2019 • Abhijit Mahabal, Jason Baldridge, Burcu Karagol Ayan, Vincent Perot, Dan Roth

Training data for text classification is often limited in practice, especially for applications with many output classes or involving many related classification problems.

General Classification text-classification +2

Paper
Add Code

Counterfactual Fairness in Text Classification through Robustness

no code implementations • 27 Sep 2018 • Sahaj Garg, Vincent Perot, Nicole Limtiaco, Ankur Taly, Ed H. Chi, Alex Beutel

In this paper, we study counterfactual fairness in text classification, which asks the question: How would the prediction change if the sensitive attribute referenced in the example were different?

Attribute counterfactual +4

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.