Search Results for author: Moussa Kamal Eddine

Found 10 papers, 3 papers with code

Questioning the Validity of Summarization Datasets and Improving Their Factual Consistency

no code implementations31 Oct 2022 Yanzhu Guo, Chloé Clavel, Moussa Kamal Eddine, Michalis Vazirgiannis

Due to this lack of a well-defined formulation, many popular abstractive summarization datasets are constructed in a manner that neither guarantees validity nor meets one of the most essential criteria of summarization: factual consistency.

Abstractive Text Summarization

DATScore: Evaluating Translation with Data Augmented Translations

no code implementations12 Oct 2022 Moussa Kamal Eddine, Guokan Shang, Michalis Vazirgiannis

The rapid development of large pretrained language models has revolutionized not only the field of Natural Language Generation (NLG) but also its evaluation.

Data Augmentation, Language Modelling +3

Word Sense Induction with Hierarchical Clustering and Mutual Information Maximization

no code implementations11 Oct 2022 Hadi Abdine, Moussa Kamal Eddine, Michalis Vazirgiannis, Davide Buscaldi

In this paper, we propose a novel unsupervised method based on hierarchical clustering and invariant information clustering (IIC).

Clustering, Language Modelling +1
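As a rough illustration of the hierarchical-clustering half of such a pipeline (not the paper's invariant information clustering objective), one can cluster contextual embeddings of an ambiguous word's occurrences; the model name, example sentences, and cluster count below are illustrative assumptions, not details from the paper.

```python
# Illustrative sketch only: induce word senses by hierarchically clustering
# contextual embeddings of one ambiguous word ("bank").
import torch
from sklearn.cluster import AgglomerativeClustering
from transformers import AutoModel, AutoTokenizer

sentences = [
    "He deposited cash at the bank.",
    "The bank approved the loan.",
    "We picnicked on the river bank.",
    "Erosion wore away the bank of the stream.",
]

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

embeddings = []
for s in sentences:
    enc = tok(s, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]
    # take the contextual embedding of the occurrence of "bank"
    idx = enc.input_ids[0].tolist().index(tok.convert_tokens_to_ids("bank"))
    embeddings.append(hidden[idx].numpy())

# Agglomerative (hierarchical) clustering into two hypothesized senses
labels = AgglomerativeClustering(n_clusters=2).fit_predict(embeddings)
print(labels)  # e.g. [0, 0, 1, 1]: financial vs. riverside senses
```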

GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

no code implementations22 Jun 2022 Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo F. R. Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanchi, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Ronald Cardenas, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Štajner, Sebastien Montella, Shailza, Shailza Jolly, Simon Mille, Tahmid Hasan, Tianhao Shen, Tosin Adewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou

This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims.

Benchmarking, Text Generation
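The "single line of code" in the title refers to dataset loading: GEM tasks are hosted on the Hugging Face Hub under the GEM organization, so loading one looks roughly like the sketch below (the specific config name is chosen for illustration).

```python
from datasets import load_dataset  # Hugging Face `datasets` library

# One line loads a GEM benchmark task; "GEM/web_nlg_en" is one hosted config.
data = load_dataset("GEM/web_nlg_en")
print(data["train"][0])
```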

AraBART: a Pretrained Arabic Sequence-to-Sequence Model for Abstractive Summarization

no code implementations21 Mar 2022 Moussa Kamal Eddine, Nadi Tomeh, Nizar Habash, Joseph Le Roux, Michalis Vazirgiannis

Like most natural language understanding and generation tasks, state-of-the-art models for summarization are transformer-based sequence-to-sequence architectures that are pretrained on large corpora.

Abstractive Text Summarization, Natural Language Understanding
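As a usage sketch only, a pretrained seq2seq model like this one can be driven through the standard `transformers` API; the checkpoint name `moussaKam/AraBART` is an assumption based on the authors' Hugging Face releases, not something stated in this listing.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "moussaKam/AraBART"  # assumed checkpoint id; adjust to the actual release
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

article = "..."  # an Arabic news article to summarize
inputs = tok(article, return_tensors="pt", truncation=True, max_length=512)
# Note: the base pretrained checkpoint still needs summarization fine-tuning
# before generate() produces a genuine summary rather than a reconstruction.
summary_ids = model.generate(**inputs, max_length=96, num_beams=4)
print(tok.decode(summary_ids[0], skip_special_tokens=True))
```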

NLP Research and Resources at DaSciM, Ecole Polytechnique

no code implementations1 Dec 2021 Hadi Abdine, Yanzhu Guo, Moussa Kamal Eddine, Giannis Nikolentzos, Stamatis Outsios, Guokan Shang, Christos Xypolopoulos, Michalis Vazirgiannis

DaSciM (Data Science and Mining), part of LIX at Ecole Polytechnique, was established in 2013 and has since been producing research results in the area of large-scale data analysis via methods of machine and deep learning.

Evaluation Of Word Embeddings From Large-Scale French Web Content

1 code implementation5 May 2021 Hadi Abdine, Christos Xypolopoulos, Moussa Kamal Eddine, Michalis Vazirgiannis

In addition, word vectors pretrained on huge text corpora have achieved high performance in many different NLP tasks.

Word Embeddings
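For context, evaluating pretrained word vectors typically means loading them and probing similarity and nearest-neighbour behavior; a minimal sketch with gensim follows (the vector file path is hypothetical).

```python
from gensim.models import KeyedVectors

# Hypothetical path to pretrained French vectors in word2vec text format.
vectors = KeyedVectors.load_word2vec_format("fr_vectors.vec", binary=False)

# Typical intrinsic checks: nearest neighbours and word-pair similarity.
print(vectors.most_similar("roi", topn=5))
print(vectors.similarity("chat", "chien"))
```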

BARThez: a Skilled Pretrained French Sequence-to-Sequence Model

4 code implementations EMNLP 2021 Moussa Kamal Eddine, Antoine J. -P. Tixier, Michalis Vazirgiannis

We show BARThez to be very competitive with state-of-the-art BERT-based French language models such as CamemBERT and FlauBERT.

 Ranked #1 on Text Summarization on OrangeSum (using extra training data)

FLUE, Natural Language Understanding +4
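As a hedged usage note: the authors released BARThez on the Hugging Face Hub, and loading the (assumed) `moussaKam/barthez` checkpoint for mask filling, the denoising objective BART-style models are pretrained with, looks roughly like this.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "moussaKam/barthez"  # assumed checkpoint id
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

# BART-style denoising: the model regenerates the text, filling the mask.
text = "Paris est la <mask> de la France."
inputs = tok(text, return_tensors="pt")
out = model.generate(**inputs, max_length=20)
print(tok.decode(out[0], skip_special_tokens=True))
```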
