Search Results for author: Marco Dinarelli

Found 31 papers, 5 papers with code

Vers la compréhension automatique de la parole bout-en-bout à moindre effort (Towards automatic end-to-end speech understanding with less effort)

no code implementations • JEP/TALN/RECITAL 2022 • Marco Naguib, François Portet, Marco Dinarelli

Les approches de compréhension automatique de la parole ont récemment bénéficié de l’apport de modèles préappris par autosupervision sur de gros corpus de parole.

Paper
Add Code

TArC: Tunisian Arabish Corpus, First complete release

no code implementations • LREC 2022 • Elisa Gugliotta, Marco Dinarelli

In this paper we present the final result of a project focused on Tunisian Arabic encoded in Arabizi, the Latin-based writing system for digital conversations.

Lemmatization POS +2

Paper
Add Code

LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech

no code implementations • 11 Sep 2023 • Titouan Parcollet, Ha Nguyen, Solene Evain, Marcely Zanon Boito, Adrien Pupier, Salima Mdhaffar, Hang Le, Sina Alisamir, Natalia Tomashenko, Marco Dinarelli, Shucong Zhang, Alexandre Allauzen, Maximin Coavoux, Yannick Esteve, Mickael Rouvier, Jerome Goulian, Benjamin Lecouteux, Francois Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier

Self-supervised learning (SSL) is at the origin of unprecedented improvements in many different domains including computer vision and natural language processing.

Self-Supervised Learning

Paper
Add Code

Encoding Sentence Position in Context-Aware Neural Machine Translation with Concatenation

1 code implementation • 13 Feb 2023 • Lorenzo Lupo, Marco Dinarelli, Laurent Besacier

Context-aware translation can be achieved by processing a concatenation of consecutive sentences with the standard Transformer architecture.

Machine Translation Position +2

Paper
Code

Focused Concatenation for Context-Aware Neural Machine Translation

1 code implementation • 24 Oct 2022 • Lorenzo Lupo, Marco Dinarelli, Laurent Besacier

A straightforward approach to context-aware neural machine translation consists in feeding the standard encoder-decoder architecture with a window of consecutive sentences, formed by the current sentence and a number of sentences from its context concatenated to it.

Machine Translation Sentence +1

Paper
Code

TArC: Tunisian Arabish Corpus First complete release

no code implementations • 11 Jul 2022 • Elisa Gugliotta, Marco Dinarelli

In this paper we present the final result of a project on Tunisian Arabic encoded in Arabizi, the Latin-based writing system for digital conversations.

Lemmatization POS +2

Paper
Add Code

Toward Low-Cost End-to-End Spoken Language Understanding

no code implementations • 1 Jul 2022 • Marco Dinarelli, Marco Naguib, François Portet

Recent advances in spoken language understanding benefited from Self-Supervised models trained on large speech corpora.

Spoken Language Understanding

Paper
Add Code

Vers la compréhension automatique de la parole bout-en-bout à moindre effort

no code implementations • 1 Jul 2022 • Marco Naguib, François Portet, Marco Dinarelli

Recent advances in spoken language understanding benefited from Self-Supervised models trained on large speech corpora.

Spoken Language Understanding

Paper
Add Code

LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech

1 code implementation • 23 Apr 2021 • Solene Evain, Ha Nguyen, Hang Le, Marcely Zanon Boito, Salima Mdhaffar, Sina Alisamir, Ziyi Tong, Natalia Tomashenko, Marco Dinarelli, Titouan Parcollet, Alexandre Allauzen, Yannick Esteve, Benjamin Lecouteux, Francois Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier

In this paper, we propose LeBenchmark: a reproducible framework for assessing SSL from speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +6

Paper
Code

Divide and Rule: Effective Pre-Training for Context-Aware Multi-Encoder Translation Models

1 code implementation • ACL 2022 • Lorenzo Lupo, Marco Dinarelli, Laurent Besacier

Multi-encoder models are a broad family of context-aware neural machine translation systems that aim to improve translation quality by encoding document-level contextual information alongside the current sentence.

Machine Translation Retrieval +2

Paper
Code

Multi-Task Sequence Prediction For Tunisian Arabizi Multi-Level Annotation

1 code implementation • COLING (WANLP) 2020 • Elisa Gugliotta, Marco Dinarelli, Olivier Kraif

We show also how we used the system in order to annotate a Tunisian Arabizi corpus, which has been afterwards manually corrected and used to further evaluate sequence models on Tunisian data.

POS POS Tagging +2

Paper
Code

TArC. Un corpus d'arabish tunisien

no code implementations • JEPTALNRECITAL 2020 • Elisa Gugliotta, Marco Dinarelli

TArC : Incrementally and Semi-Automatically Collecting a Tunisian arabish Corpus This article describes the collection process of the first morpho-syntactically annotated Tunisian arabish Corpus (TArC).

Paper
Add Code

TArC: Incrementally and Semi-Automatically Collecting a Tunisian Arabish Corpus

no code implementations • LREC 2020 • Elisa Gugliotta, Marco Dinarelli

This article describes the constitution process of the first morpho-syntactically annotated Tunisian Arabish Corpus (TArC).

Paper
Add Code

A Data Efficient End-To-End Spoken Language Understanding Architecture

no code implementations • 14 Feb 2020 • Marco Dinarelli, Nikita Kapoor, Bassam Jabaian, Laurent Besacier

For that, in many cases, models are combined with an external language model to enhance their performance.

Chunking Language Modelling +2

Paper
Add Code

Hybrid Neural Models For Sequence Modelling: The Best Of Three Worlds

no code implementations • 16 Sep 2019 • Marco Dinarelli, Loïc Grobol

We propose a neural architecture with the main characteristics of the most successful neural models of the last years: bidirectional RNNs, encoder-decoder, and the Transformer model.

Paper
Add Code

Mod\`eles neuronaux hybrides pour la mod\'elisation de s\'equences : le meilleur de trois mondes ()

no code implementations • JEPTALNRECITAL 2019 • Marco Dinarelli, Lo{\"\i}c Grobol

Nous proposons une architecture neuronale avec les caract{\'e}ristiques principales des mod{\`e}les neuronaux de ces derni{\`e}res ann{\'e}es : les r{\'e}seaux neuronaux r{\'e}currents bidirectionnels, les mod{\`e}les encodeur-d{\'e}codeur, et le mod{\`e}le Transformer.

Paper
Add Code

Seq2Biseq: Bidirectional Output-wise Recurrent Neural Networks for Sequence Modelling

no code implementations • 9 Apr 2019 • Marco Dinarelli, Loïc Grobol

During the last couple of years, Recurrent Neural Networks (RNN) have reached state-of-the-art performances on most of the sequence modelling problems.

Paper
Add Code

ANCOR-AS: Enriching the ANCOR Corpus with Syntactic Annotations

no code implementations • LREC 2018 • Lo{\"\i}c Grobol, Isabelle Tellier, {\'E}ric de la Clergerie, Marco Dinarelli, L, Fr{\'e}d{\'e}ric ragin

Paper
Add Code

Effective Spoken Language Labeling with Deep Recurrent Neural Networks

no code implementations • 20 Jun 2017 • Marco Dinarelli, Yoann Dupont, Isabelle Tellier

Understanding spoken language is a highly complex problem, which can be decomposed into several simpler tasks.

Spoken Language Understanding

Paper
Add Code

Label-Dependencies Aware Recurrent Neural Networks

no code implementations • 6 Jun 2017 • Yoann Dupont, Marco Dinarelli, Isabelle Tellier

In this work we propose a solution far simpler but very effective: an evolution of the simple Jordan RNN, where labels are re-injected as input into the network, and converted into embeddings, in the same way as words.

Spoken Language Understanding

Paper
Add Code

Apports des analyses syntaxiques pour la d\'etection automatique de mentions dans un corpus de fran\ccais oral (Experiences in using deep and shallow parsing to detect entity mentions in oral French)

no code implementations • JEPTALNRECITAL 2017 • Lo{\"\i}c Grobol, Isabelle Tellier, {\'E}ric de la Clergerie, Marco Dinarelli, L, Fr{\'e}d{\'e}ric ragin

Cet article pr{\'e}sente trois exp{\'e}riences de d{\'e}tection de mentions dans un corpus de fran{\c{c}}ais oral : ANCOR.

Paper
Add Code

D\'etection des mots non-standards dans les tweets avec des r\'eseaux de neurones (Detecting non-standard words in tweets with neural networks)

no code implementations • JEPTALNRECITAL 2017 • Tian Tian, Isabelle Tellier, Marco Dinarelli, Pedro Cardoso

Dans cet article, nous proposons un mod{\`e}le pour d{\'e}tecter dans les textes g{\'e}n{\'e}r{\'e}s par des utilisateurs (en particulier les tweets), les mots non-standards {\`a} corriger.

SENTS

Paper
Add Code

R\'eseaux neuronaux profonds pour l'\'etiquetage de s\'equences (Deep Neural Networks for Sequence Labeling)

no code implementations • JEPTALNRECITAL 2017 • Yoann Dupont, Marco Dinarelli, Isabelle Tellier

R{\'e}cemment, une variante de r{\'e}seau neuronal particuli{\`e}rement adapt{\'e} {\`a} l{'}{\'e}tiquetage de s{\'e}quences textuelles a {\'e}t{\'e} propos{\'e}e, utilisant des repr{\'e}sentations distributionnelles des {\'e}tiquettes.

Paper
Add Code

\'Etude des r\'eseaux de neurones r\'ecurrents pour \'etiquetage de s\'equences (A study of Recurrent Neural Networks for Sequence Labelling)

no code implementations • JEPTALNRECITAL 2016 • Marco Dinarelli, Isabelle Tellier

Dans cet article nous {\'e}tudions plusieurs types de r{\'e}seaux neuronaux r{\'e}currents (RNN) pour l{'}{\'e}tiquetage de s{\'e}quences.

Paper
Add Code

Improving Recurrent Neural Networks For Sequence Labelling

no code implementations • 8 Jun 2016 • Marco Dinarelli, Isabelle Tellier

In this paper we study different types of Recurrent Neural Networks (RNN) for sequence labeling tasks.

POS POS Tagging +1

Paper
Add Code

Domain Adaptation for Named Entity Recognition Using CRFs

no code implementations • LREC 2016 • Tian Tian, Marco Dinarelli, Isabelle Tellier, Pedro Dias Cardoso

We explain the specificities of this corpus with examples and describe some baseline experiments.

Domain Adaptation named-entity-recognition +2

Paper
Add Code

Data Adaptation for Named Entity Recognition on Tweets with Features-Rich CRF

no code implementations • WS 2015 • Tian Tian, Marco Dinarelli, Isabelle Tellier

Domain Adaptation named-entity-recognition +2

Paper
Add Code

Evaluation of different strategies for domain adaptation in opinion mining

no code implementations • LREC 2014 • Garcia-Fern, Anne ez, Olivier Ferret, Marco Dinarelli

The work presented in this article takes place in the field of opinion mining and aims more particularly at finding the polarity of a text by relying on machine learning methods.

Domain Adaptation Opinion Mining +2

Paper
Add Code

LIMSI @ WMT13

no code implementations • WS 2013 • Alex Allauzen, er, Nicolas P{\'e}cheux, Quoc Khanh Do, Marco Dinarelli, Thomas Lavergne, Aur{\'e}lien Max, Hai-Son Le, Fran{\c{c}}ois Yvon

Language Modelling Machine Translation +1

Paper
Add Code

Tree-Structured Named Entity Recognition on OCR Data: Analysis, Processing and Results

no code implementations • LREC 2012 • Marco Dinarelli, Sophie Rosset

We evaluate our procedure for preprocessing OCR-ized data in two ways: in terms of perplexity and OOV rate of a language model on development and evaluation data, and in terms of the performance of the named entity detection system on the preprocessed data.

Language Modelling named-entity-recognition +3

Paper
Add Code

Tree Representations in Probabilistic Models for Extended Named Entities Detection

no code implementations • EACL 2012 • Marco Dinarelli, Sophie Rosset

Named Entity Recognition (NER) Relation Extraction

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.