Search Results for author: Marc Marone

Found 11 papers, 4 papers with code

AdapterSwap: Continuous Training of LLMs with Data Removal and Access-Control Guarantees

no code implementations • 12 Apr 2024 • William Fleshman, Aleem Khan, Marc Marone, Benjamin Van Durme

Large language models (LLMs) are increasingly capable of completing knowledge-intensive tasks by recalling information from a static pretraining corpus.

Continual Learning


Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data

no code implementations • 5 Apr 2024 • Jingyu Zhang, Marc Marone, Tianjian Li, Benjamin Van Durme, Daniel Khashabi

To address these limitations, we tackle the verifiability goal with a different philosophy: we trivialize the verification process by developing models that quote verbatim statements from trusted sources in pre-training data.

Philosophy


Dated Data: Tracing Knowledge Cutoffs in Large Language Models

no code implementations • 19 Mar 2024 • Jeffrey Cheng, Marc Marone, Orion Weller, Dawn Lawrie, Daniel Khashabi, Benjamin Van Durme

Using this analysis, we find that effective cutoffs often differ from reported cutoffs.


StarCoder 2 and The Stack v2: The Next Generation

no code implementations • 29 Feb 2024 • Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, Evgenii Zheltonozhskii, Nii Osae Osae Dade, Wenhao Yu, Lucas Krauß, Naman Jain, Yixuan Su, Xuanli He, Manan Dey, Edoardo Abati, Yekun Chai, Niklas Muennighoff, Xiangru Tang, Muhtasham Oblokulov, Christopher Akiki, Marc Marone, Chenghao Mou, Mayank Mishra, Alex Gu, Binyuan Hui, Tri Dao, Armel Zebaze, Olivier Dehaene, Nicolas Patry, Canwen Xu, Julian McAuley, Han Hu, Torsten Scholak, Sebastien Paquet, Jennifer Robinson, Carolyn Jane Anderson, Nicolas Chapados, Mostofa Patwary, Nima Tajbakhsh, Yacine Jernite, Carlos Muñoz Ferrandis, Lingming Zhang, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries

Our large model, StarCoder2-15B, significantly outperforms other models of comparable size.

Ranked #24 on Code Generation on MBPP

Code Completion Code Generation +1


"According to ...": Prompting Language Models Improves Quoting from Pre-Training Data

no code implementations • 22 May 2023 • Orion Weller, Marc Marone, Nathaniel Weir, Dawn Lawrie, Daniel Khashabi, Benjamin Van Durme

Large Language Models (LLMs) may hallucinate and generate fake information, despite pre-training on factual data.


StarCoder: may the source be with you!

4 code implementations • 9 May 2023 • Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, Benjamin Lipkin, Muhtasham Oblokulov, Zhiruo Wang, Rudra Murthy, Jason Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Manan Dey, Zhihan Zhang, Nour Fahmy, Urvashi Bhattacharyya, Wenhao Yu, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jan Ebert, Tri Dao, Mayank Mishra, Alex Gu, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries

The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities and fast large-batch inference enabled by multi-query attention.

Ranked #42 on Code Generation on MBPP

8k Code Generation



Data Portraits: Recording Foundation Model Training Data

no code implementations • NeurIPS 2023 • Marc Marone, Benjamin Van Durme

Foundation models are trained on increasingly immense and opaque datasets.

Language Modelling


Pretrained Models for Multilingual Federated Learning

1 code implementation • NAACL 2022 • Orion Weller, Marc Marone, Vladimir Braverman, Dawn Lawrie, Benjamin Van Durme

Since the advent of Federated Learning (FL), research has applied these methods to natural language processing (NLP) tasks.

Federated Learning Language Modelling +3


Everything Is All It Takes: A Multipronged Strategy for Zero-Shot Cross-Lingual Information Extraction

2 code implementations • EMNLP 2021 • Mahsa Yarmohammadi, Shijie Wu, Marc Marone, Haoran Xu, Seth Ebner, Guanghui Qin, Yunmo Chen, Jialiang Guo, Craig Harman, Kenton Murray, Aaron Steven White, Mark Dredze, Benjamin Van Durme

Zero-shot cross-lingual information extraction (IE) describes the construction of an IE model for some target language, given existing annotations exclusively in some other language, typically English.

Dependency Parsing Event Extraction +4


Selecting, Planning, and Rewriting: A Modular Approach for Data-to-Document Generation and Translation

no code implementations • WS 2019 • Lesly Miculicich, Marc Marone, Hany Hassan

In this paper, we report our system submissions to all 6 tracks of the WNGT 2019 shared task on Document-Level Generation and Translation.

Language Modelling Translation


Character Eyes: Seeing Language through Character-Level Taggers

1 code implementation • WS 2019 • Yuval Pinter, Marc Marone, Jacob Eisenstein

Character-level models have been used extensively in recent years in NLP tasks as both supplements and replacements for closed-vocabulary token-level word representations.

POS

