Search Results for author: Marco Dinarelli

Found 33 papers, 6 papers with code

TArC: Tunisian Arabish Corpus, First complete release

no code implementations LREC 2022 Elisa Gugliotta, Marco Dinarelli

In this paper we present the final result of a project focused on Tunisian Arabic encoded in Arabizi, the Latin-based writing system for digital conversations.

Lemmatization POS +2

Vers la compréhension automatique de la parole bout-en-bout à moindre effort (Towards automatic end-to-end speech understanding with less effort)

no code implementations JEP/TALN/RECITAL 2022 Marco Naguib, François Portet, Marco Dinarelli

Spoken language understanding approaches have recently benefited from models pre-trained with self-supervision on large speech corpora.

Open Implementation and Study of BEST-RQ for Speech Processing

1 code implementation 7 May 2024 Ryan Whetten, Titouan Parcollet, Marco Dinarelli, Yannick Estève

BERT-based Speech pre-Training with Random-projection Quantizer (BEST-RQ) is an SSL method that has shown great performance on Automatic Speech Recognition (ASR) while being simpler than other SSL methods, such as wav2vec 2.0.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Encoding Sentence Position in Context-Aware Neural Machine Translation with Concatenation

1 code implementation 13 Feb 2023 Lorenzo Lupo, Marco Dinarelli, Laurent Besacier

Context-aware translation can be achieved by processing a concatenation of consecutive sentences with the standard Transformer architecture.

Machine Translation Position +2

Focused Concatenation for Context-Aware Neural Machine Translation

1 code implementation 24 Oct 2022 Lorenzo Lupo, Marco Dinarelli, Laurent Besacier

A straightforward approach to context-aware neural machine translation consists in feeding the standard encoder-decoder architecture with a window of consecutive sentences, formed by the current sentence and a number of sentences from its context concatenated to it.

Decoder Machine Translation +2

TArC: Tunisian Arabish Corpus First complete release

no code implementations 11 Jul 2022 Elisa Gugliotta, Marco Dinarelli

In this paper we present the final result of a project on Tunisian Arabic encoded in Arabizi, the Latin-based writing system for digital conversations.

Lemmatization POS +2

Vers la compréhension automatique de la parole bout-en-bout à moindre effort (Towards automatic end-to-end speech understanding with less effort)

no code implementations 1 Jul 2022 Marco Naguib, François Portet, Marco Dinarelli

Recent advances in spoken language understanding benefited from Self-Supervised models trained on large speech corpora.

Spoken Language Understanding

Toward Low-Cost End-to-End Spoken Language Understanding

no code implementations 1 Jul 2022 Marco Dinarelli, Marco Naguib, François Portet

Recent advances in spoken language understanding benefited from Self-Supervised models trained on large speech corpora.

Spoken Language Understanding

Divide and Rule: Effective Pre-Training for Context-Aware Multi-Encoder Translation Models

1 code implementation ACL 2022 Lorenzo Lupo, Marco Dinarelli, Laurent Besacier

Multi-encoder models are a broad family of context-aware neural machine translation systems that aim to improve translation quality by encoding document-level contextual information alongside the current sentence.

Machine Translation Retrieval +2

Multi-Task Sequence Prediction For Tunisian Arabizi Multi-Level Annotation

1 code implementation COLING (WANLP) 2020 Elisa Gugliotta, Marco Dinarelli, Olivier Kraif

We show also how we used the system in order to annotate a Tunisian Arabizi corpus, which has been afterwards manually corrected and used to further evaluate sequence models on Tunisian data.

POS POS Tagging +2

TArC. Un corpus d'arabish tunisien (TArC: Incrementally and Semi-Automatically Collecting a Tunisian Arabish Corpus)

no code implementations JEP/TALN/RECITAL 2020 Elisa Gugliotta, Marco Dinarelli

This article describes the collection process of the first morpho-syntactically annotated Tunisian Arabish Corpus (TArC).

TArC: Incrementally and Semi-Automatically Collecting a Tunisian Arabish Corpus

no code implementations LREC 2020 Elisa Gugliotta, Marco Dinarelli

This article describes the constitution process of the first morpho-syntactically annotated Tunisian Arabish Corpus (TArC).

Hybrid Neural Models For Sequence Modelling: The Best Of Three Worlds

no code implementations 16 Sep 2019 Marco Dinarelli, Loïc Grobol

We propose a neural architecture with the main characteristics of the most successful neural models of the last years: bidirectional RNNs, encoder-decoder, and the Transformer model.

Decoder

Modèles neuronaux hybrides pour la modélisation de séquences : le meilleur de trois mondes (Hybrid Neural Models For Sequence Modelling: The Best Of Three Worlds)

no code implementations JEP/TALN/RECITAL 2019 Marco Dinarelli, Loïc Grobol

We propose a neural architecture with the main characteristics of the neural models of recent years: bidirectional recurrent neural networks, encoder-decoder models, and the Transformer model.

Seq2Biseq: Bidirectional Output-wise Recurrent Neural Networks for Sequence Modelling

no code implementations 9 Apr 2019 Marco Dinarelli, Loïc Grobol

During the last couple of years, Recurrent Neural Networks (RNN) have reached state-of-the-art performances on most of the sequence modelling problems.

Effective Spoken Language Labeling with Deep Recurrent Neural Networks

no code implementations 20 Jun 2017 Marco Dinarelli, Yoann Dupont, Isabelle Tellier

Understanding spoken language is a highly complex problem, which can be decomposed into several simpler tasks.

Spoken Language Understanding

Label-Dependencies Aware Recurrent Neural Networks

no code implementations 6 Jun 2017 Yoann Dupont, Marco Dinarelli, Isabelle Tellier

In this work we propose a solution far simpler but very effective: an evolution of the simple Jordan RNN, where labels are re-injected as input into the network, and converted into embeddings, in the same way as words.

Spoken Language Understanding

Réseaux neuronaux profonds pour l'étiquetage de séquences (Deep Neural Networks for Sequence Labeling)

no code implementations JEP/TALN/RECITAL 2017 Yoann Dupont, Marco Dinarelli, Isabelle Tellier

Recently, a neural network variant particularly well suited to labeling textual sequences was proposed, using distributional representations of the labels.

es-en

Détection des mots non-standards dans les tweets avec des réseaux de neurones (Detecting non-standard words in tweets with neural networks)

no code implementations JEP/TALN/RECITAL 2017 Tian Tian, Isabelle Tellier, Marco Dinarelli, Pedro Cardoso

In this article, we propose a model for detecting, in user-generated texts (tweets in particular), the non-standard words that need to be corrected.

SENTS

Improving Recurrent Neural Networks For Sequence Labelling

no code implementations 8 Jun 2016 Marco Dinarelli, Isabelle Tellier

In this paper we study different types of Recurrent Neural Networks (RNN) for sequence labeling tasks.

POS POS Tagging +1

Evaluation of different strategies for domain adaptation in opinion mining

no code implementations LREC 2014 Anne Garcia-Fernandez, Olivier Ferret, Marco Dinarelli

The work presented in this article takes place in the field of opinion mining and aims more particularly at finding the polarity of a text by relying on machine learning methods.

Domain Adaptation Opinion Mining +2

Tree-Structured Named Entity Recognition on OCR Data: Analysis, Processing and Results

no code implementations LREC 2012 Marco Dinarelli, Sophie Rosset

We evaluate our procedure for preprocessing OCR-ized data in two ways: in terms of perplexity and OOV rate of a language model on development and evaluation data, and in terms of the performance of the named entity detection system on the preprocessed data.

Language Modelling named-entity-recognition +3
