Search Results for author: Samuel Barham

Found 4 papers, 0 papers with code

To Burst or Not to Burst: Generating and Quantifying Improbable Text

no code implementations • 27 Jan 2024 • Kuleen Sasse, Samuel Barham, Efsun Sarioglu Kayi, Edward W. Staley

While large language models (LLMs) are extremely capable at text generation, their outputs are still distinguishable from human-authored text.

Text Generation

MegaWika: Millions of reports and their sources across 50 diverse languages

no code implementations • 13 Jul 2023 • Samuel Barham, Orion Weller, Michelle Yuan, Kenton Murray, Mahsa Yarmohammadi, Zhengping Jiang, Siddharth Vashishtha, Alexander Martin, Anqi Liu, Aaron Steven White, Jordan Boyd-Graber, Benjamin Van Durme

To foster the development of new models for collaborative AI-assisted report generation, we introduce MegaWika, consisting of 13 million Wikipedia articles in 50 diverse languages, along with their 71 million referenced source materials.

Cross-Lingual Question Answering, Retrieval, +1

Synthetic Cross-language Information Retrieval Training Data

no code implementations • 29 Apr 2023 • James Mayfield, Eugene Yang, Dawn Lawrie, Samuel Barham, Orion Weller, Marc Mason, Suraj Nair, Scott Miller

By repeating this process, collections of arbitrary size can be created in the style of MS MARCO but using naturally-occurring documents in any desired genre and domain of discourse.

Information Retrieval, Language Modelling, +4

Interpretable Adversarial Training for Text

no code implementations • 30 May 2019 • Samuel Barham, Soheil Feizi

SPGD imposes a directional regularization constraint on input perturbations by projecting them onto the directions to nearby word embeddings with highest cosine similarities.

Sentence, Word Embeddings
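
The projection step mentioned in the SPGD summary above can be illustrated with a small sketch. This is not the authors' implementation: the function name `project_perturbation`, the neighbourhood size `k`, and the choice to keep only the single best-aligned direction are all assumptions made for illustration. It restricts a perturbation to the direction from the current word embedding toward the nearby embedding with which the perturbation has the highest cosine similarity.

```python
import numpy as np

def project_perturbation(delta, word_vec, embeddings, k=5):
    """Hypothetical helper: project `delta` onto the direction from
    `word_vec` to the nearby embedding it is best aligned with."""
    # Directions from the current word to every embedding in the vocabulary.
    directions = embeddings - word_vec
    dists = np.linalg.norm(directions, axis=1)

    # Drop zero-length directions (the word itself), then keep the k nearest.
    valid = np.flatnonzero(dists > 1e-12)
    nearest = valid[np.argsort(dists[valid])[:k]]
    unit_dirs = directions[nearest] / dists[nearest, None]

    delta_norm = np.linalg.norm(delta)
    if delta_norm < 1e-12:
        # Nothing to project; -1 signals that no neighbour was selected.
        return np.zeros_like(delta), -1

    # Cosine similarity between the perturbation and each candidate direction.
    cos = unit_dirs @ delta / delta_norm
    best = int(np.argmax(cos))

    # Orthogonal projection of delta onto the best-aligned unit direction.
    projected = (unit_dirs[best] @ delta) * unit_dirs[best]
    return projected, int(nearest[best])

# Toy 2-D vocabulary (assumed data, for illustration only).
vocab = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
projected, neighbour = project_perturbation(
    np.array([0.9, 0.1]), vocab[0], vocab, k=2
)
print(projected, neighbour)
```

Because the projected perturbation lies along a line toward an actual word embedding, it can be read as "moving this word toward that neighbour", which is one way to understand the interpretability claim in the title.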
