no code implementations • LREC 2020 • David Wan, Zhengping Jiang, Chris Kedzie, Elsbeth Turcan, Peter Bell, Kathleen McKeown
In this work, we focus on improving ASR output segmentation in the context of low-resource language speech-to-text translation.
1 code implementation • WMT (EMNLP) 2020 • David Wan, Chris Kedzie, Faisal Ladhak, Marine Carpuat, Kathleen McKeown
In this paper, we present both autoregressive and non-autoregressive models for lexically constrained APE, demonstrating that our approach preserves 95% of the terminology constraints and also improves translation quality on English-German benchmarks.
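As a rough illustration of the evaluation side of that claim, the sketch below measures what fraction of required target-side terms survive in post-edited output; the helper and data format are hypothetical, not the paper's code.

```python
# Minimal sketch (hypothetical helper, not the paper's code): measure the
# fraction of required target-side terms that survive in APE output.
def terminology_preservation(outputs, term_lists):
    kept = total = 0
    for out, terms in zip(outputs, term_lists):
        for term in terms:
            total += 1
            kept += term.lower() in out.lower()
    return kept / total if total else 1.0

# Toy usage with made-up data:
outputs = ["Der Server wurde erfolgreich neu gestartet."]
term_lists = [["Server", "neu gestartet"]]
print(f"preserved: {terminology_preservation(outputs, term_lists):.0%}")
```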
no code implementations • EACL 2021 • David Wan, Chris Kedzie, Faisal Ladhak, Elsbeth Turcan, Petra Galuščáková, Elena Zotkina, Zhengping Jiang, Peter Bell, Kathleen McKeown
Typical ASR systems segment the input audio into utterances using purely acoustic information, which may not resemble the sentence-like units that are expected by conventional machine translation (MT) systems for Spoken Language Translation.
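A minimal sketch of the resegmentation idea: greedily split an unpunctuated ASR token stream wherever a boundary model fires. The `boundary_prob` callable here is a toy stand-in for a trained boundary classifier, not the model used in the paper.

```python
def resegment(tokens, boundary_prob, threshold=0.5):
    """Split an ASR token stream into sentence-like units for MT.
    boundary_prob(tokens, i) should return the probability that a
    segment boundary follows token i (toy stand-in here)."""
    segments, current = [], []
    for i, tok in enumerate(tokens):
        current.append(tok)
        if boundary_prob(tokens, i) >= threshold:
            segments.append(" ".join(current))
            current = []
    if current:
        segments.append(" ".join(current))
    return segments

# Toy boundary model: split after words that often end sentences.
finals = {"translation", "well"}
print(resegment("we study speech translation our systems work well".split(),
                lambda toks, i: 1.0 if toks[i] in finals else 0.0))
```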
1 code implementation • NAACL 2022 • David Wan, Mohit Bansal
We present FactPEGASUS, an abstractive summarization model that addresses the problem of factuality during pre-training and fine-tuning: (1) We augment the sentence selection strategy of PEGASUS's (Zhang et al., 2020) pre-training objective to create pseudo-summaries that are both important and factual; (2) We introduce three complementary components for fine-tuning.
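A minimal sketch of the combined selection idea, under the assumption that importance is a ROUGE-like overlap with the rest of the document and factuality comes from any entailment-style scorer; both scorers are stand-ins, not FactPEGASUS's exact components.

```python
from collections import Counter

def overlap(summary, reference):
    # Crude ROUGE-1-precision-style importance proxy.
    cs, cr = Counter(summary.split()), Counter(reference.split())
    return sum((cs & cr).values()) / max(1, sum(cs.values()))

def select_pseudo_summary(sentences, factuality):
    """Pick the sentence maximizing importance * factuality.
    factuality(sentence, rest) is any entailment-style scorer (assumed)."""
    def score(i):
        rest = " ".join(s for j, s in enumerate(sentences) if j != i)
        return overlap(sentences[i], rest) * factuality(sentences[i], rest)
    return max(range(len(sentences)), key=score)
```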
1 code implementation • 8 Sep 2022 • Shiyue Zhang, David Wan, Mohit Bansal
Though extractive summarization is less prone to the common unfaithfulness issues of abstractive summaries, does that mean extractive is equal to faithful?
1 code implementation • 4 Nov 2022 • David Wan, Mohit Bansal
Current metrics for evaluating the factuality of abstractive document summaries have achieved high correlations with human judgment, but they do not account for the vision modality and are therefore inadequate for vision-and-language summarization.
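One way to fold the vision modality into such a metric, sketched under the assumption that CLIP image-text similarity is an acceptable grounding term; the weighting scheme and model choice are illustrative, not the paper's metric.

```python
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def multimodal_factuality(summary, image, text_score, alpha=0.5):
    """Blend a text-only factuality score with CLIP image-text similarity.
    text_score comes from any existing text factuality metric (assumed);
    image is a PIL image from the visual source."""
    inputs = processor(text=[summary], images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        # logits_per_image ~ 100 * cosine similarity; rescale roughly to [0, 1].
        sim = model(**inputs).logits_per_image.item() / 100.0
    return alpha * text_score + (1 - alpha) * sim
```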
1 code implementation • 6 Mar 2023 • David Wan, Mengwen Liu, Kathleen McKeown, Markus Dreyer, Mohit Bansal
We present a systematic study of the effect of generation techniques such as beam search and nucleus sampling on faithfulness in abstractive summarization.
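For context, the two decoding strategies can be compared directly with an off-the-shelf summarizer; the model and settings below are illustrative, not the paper's experimental setup.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")

document = "..."  # any source article
inputs = tok(document, return_tensors="pt", truncation=True)

# Deterministic beam search vs. stochastic nucleus sampling.
beam = model.generate(**inputs, num_beams=5, max_length=80)
nucleus = model.generate(**inputs, do_sample=True, top_p=0.9, max_length=80)
for name, ids in [("beam", beam), ("nucleus", nucleus)]:
    print(name, "->", tok.decode(ids[0], skip_special_tokens=True))
```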
1 code implementation • 8 May 2023 • David Wan, Shiyue Zhang, Mohit Bansal
Cache-LMs, which augment LMs with a memory of recent history, can increase context dependency and have shown remarkable performance in diverse language generation tasks.
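The core cache mechanism is small; the minimal sketch below interpolates the base LM's next-token distribution with a distribution over recently seen tokens. A real continuous cache matches hidden states, so this unigram version is a deliberate simplification.

```python
import torch

def cache_next_token_probs(lm_logits, recent_ids, vocab_size, lam=0.1):
    """Interpolate p_lm with a cache distribution built from recent tokens."""
    p_lm = torch.softmax(lm_logits, dim=-1)
    p_cache = torch.zeros(vocab_size)
    for tid in recent_ids:          # one unit of mass per cached occurrence
        p_cache[tid] += 1.0
    if p_cache.sum() > 0:
        p_cache /= p_cache.sum()
    return (1 - lam) * p_lm + lam * p_cache
```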
no code implementations • 4 Mar 2024 • David Wan, Jaemin Cho, Elias Stengel-Eskin, Mohit Bansal
Highlighting particularly relevant regions of an image can improve the performance of vision-language models (VLMs) on various vision-language (VL) tasks by guiding the model to attend more closely to these regions of interest.
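A minimal sketch of one common highlighting variant, drawing a colored box around the region of interest before the image reaches the VLM; the paper's highlighting method may differ.

```python
from PIL import Image, ImageDraw

def highlight_region(image_path, box, color="red", width=4):
    """box: (left, top, right, bottom) pixel coordinates of the region."""
    img = Image.open(image_path).convert("RGB")
    ImageDraw.Draw(img).rectangle(box, outline=color, width=width)
    return img  # feed this marked-up image to the VLM instead of the original
```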
no code implementations • COLING 2022 • Elsbeth Turcan, David Wan, Faisal Ladhak, Petra Galuščáková, Sukanta Sen, Svetlana Tchistiakova, Weijia Xu, Marine Carpuat, Kenneth Heafield, Douglas Oard, Kathleen McKeown
Query-focused summaries of retrieved foreign-language documents can help a user judge whether a document is actually relevant to the query term.