no code implementations • ACL 2022 • Moussa Kamal Eddine, Guokan Shang, Antoine Tixier, Michalis Vazirgiannis
While traditional natural language generation metrics are fast, they are not very reliable.
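To make the "fast but not very reliable" point concrete, here is a minimal sketch (illustrative only, not the paper's proposed metric) of a traditional n-gram overlap score: it is cheap to compute, but it penalizes a faithful paraphrase while rewarding a meaning-reversing candidate that happens to share surface tokens.

```python
# Minimal sketch (not the paper's method): a traditional unigram-overlap
# F1 metric is fast but insensitive to meaning.
def unigram_f1(candidate: str, reference: str) -> float:
    cand = candidate.lower().split()
    ref = reference.lower().split()
    overlap = sum(min(cand.count(w), ref.count(w)) for w in set(cand))
    if not cand or not ref or overlap == 0:
        return 0.0
    p, r = overlap / len(cand), overlap / len(ref)
    return 2 * p * r / (p + r)

# A faithful paraphrase is penalized despite preserving the meaning...
print(unigram_f1("the movie was great", "the film was excellent"))  # 0.5
# ...while a negated candidate scores higher because it shares more tokens.
print(unigram_f1("the movie was not great", "the movie was great"))
```

This failure mode is what motivates learned, model-based evaluation metrics for NLG.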
no code implementations • 31 Oct 2022 • Yanzhu Guo, Chloé Clavel, Moussa Kamal Eddine, Michalis Vazirgiannis
Due to this lack of well-defined formulation, a large number of popular abstractive summarization datasets are constructed in a manner that neither guarantees validity nor meets one of the most essential criteria of summarization: factual consistency.
no code implementations • 12 Oct 2022 • Moussa Kamal Eddine, Guokan Shang, Michalis Vazirgiannis
The rapid development of large pretrained language models has revolutionized not only the field of Natural Language Generation (NLG) but also its evaluation.
no code implementations • 11 Oct 2022 • Hadi Abdine, Moussa Kamal Eddine, Michalis Vazirgiannis, Davide Buscaldi
In this paper, we propose a novel unsupervised method based on hierarchical clustering and invariant information clustering (IIC).
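The hierarchical-clustering building block mentioned above can be sketched as follows. This is an illustrative toy example only (random blobs standing in for embedding vectors), not the paper's actual pipeline, which additionally relies on invariant information clustering (IIC).

```python
# Illustrative sketch: agglomerative (hierarchical) clustering of toy
# embedding vectors, then cutting the merge tree into a fixed number
# of clusters. Not the paper's unsupervised pipeline.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Two well-separated blobs standing in for word/sentence embeddings.
X = np.vstack([rng.normal(0.0, 0.1, (5, 8)),
               rng.normal(3.0, 0.1, (5, 8))])

Z = linkage(X, method="ward")                     # bottom-up merge tree
labels = fcluster(Z, t=2, criterion="maxclust")   # cut into 2 clusters
print(labels)  # first five points share one label, last five the other
```

Ward linkage merges the pair of clusters whose union has the smallest increase in within-cluster variance; `fcluster` then flattens the dendrogram at the requested number of clusters.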
no code implementations • 22 Jun 2022 • Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo F. R. Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanchi, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Ronald Cardenas, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Štajner, Sebastien Montella, Shailza, Shailza Jolly, Simon Mille, Tahmid Hasan, Tianhao Shen, Tosin Adewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou
This problem is especially pertinent in natural language generation, which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims.
no code implementations • 21 Mar 2022 • Moussa Kamal Eddine, Nadi Tomeh, Nizar Habash, Joseph Le Roux, Michalis Vazirgiannis
Like most natural language understanding and generation tasks, state-of-the-art models for summarization are transformer-based sequence-to-sequence architectures that are pretrained on large corpora.
Abstractive Text Summarization • Natural Language Understanding
no code implementations • 1 Dec 2021 • Hadi Abdine, Yanzhu Guo, Moussa Kamal Eddine, Giannis Nikolentzos, Stamatis Outsios, Guokan Shang, Christos Xypolopoulos, Michalis Vazirgiannis
DaSciM (Data Science and Mining) is part of LIX at Ecole Polytechnique; established in 2013, it has since been producing research results in the area of large-scale data analysis via machine and deep learning methods.
1 code implementation • 16 Oct 2021 • Moussa Kamal Eddine, Guokan Shang, Antoine J. -P. Tixier, Michalis Vazirgiannis
While traditional natural language generation metrics are fast, they are not very reliable.
1 code implementation • 5 May 2021 • Hadi Abdine, Christos Xypolopoulos, Moussa Kamal Eddine, Michalis Vazirgiannis
In addition, word vectors pretrained on huge text corpora have achieved high performance in many different NLP tasks.
4 code implementations • EMNLP 2021 • Moussa Kamal Eddine, Antoine J. -P. Tixier, Michalis Vazirgiannis
We show BARThez to be very competitive with state-of-the-art BERT-based French language models such as CamemBERT and FlauBERT.
Ranked #1 on Text Summarization on OrangeSum (using extra training data)