Search Results for author: Patrick Paroubek

Found 37 papers, 5 papers with code

A Rough Set Formalization of Quantitative Evaluation with Ambiguity

no code implementations LREC 2012 Patrick Paroubek, Xavier Tannier

In this paper, we present the founding elements of a formal model of the evaluation paradigm in natural language processing.

Information Retrieval Machine Translation +2

Facing the Identification Problem in Language-Related Scientific Data Analysis.

no code implementations LREC 2014 Joseph Mariani, Christopher Cieri, Gil Francopoulo, Patrick Paroubek, Marine Delaborde

This paper describes the problems that must be addressed when studying large amounts of data over time which require entity normalization applied not to the usual genres of news or political speech, but to the genre of academic discourse about language resources, technologies and sciences.

Language Identification

Bidirectionnal converter between syntactic annotations : from French Treebank Dependencies to PASSAGE annotations, and back

no code implementations LREC 2014 Munshi Asadullah, Patrick Paroubek, Anne Vilnat

We shall illustrate the mapping of important syntactic phenomena using the corpus made of the examples of the FTB-DEP annotation guidelines, which we have hand-annotated with PASSAGE annotations and used to compute quantitative performance measures on the FTB-DEP guidelines. In this paper we will briefly introduce the two annotation formats.

Rediscovering 15 Years of Discoveries in Language Resources and Evaluation: The LREC Anthology Analysis

no code implementations LREC 2014 Joseph Mariani, Patrick Paroubek, Gil Francopoulo, Olivier Hamon

It follows similar exercises that have been conducted, such as the survey on the IEEE ICASSP conference series from 1976 to 1990, which served in the launching of the ESCA Eurospeech conference, a survey of the Association for Computational Linguistics (ACL) over 50 years of existence, which was presented at the ACL conference in 2012, or a survey over the 25 years (1987-2012) of the conferences contained in the ISCA Archive, presented at Interspeech 2013.

Speech Recognition

Utiliser les interjections pour détecter les émotions

no code implementations JEPTALNRECITAL 2015 Amel Fraisse, Patrick Paroubek

Work in sentiment analysis has shown the value of emoticons and, more recently, hashtags, which prove very useful for polarity classification.

A Study of Reuse and Plagiarism in LREC papers

no code implementations LREC 2016 Gil Francopoulo, Joseph Mariani, Patrick Paroubek

The aim of this experiment is to present an easy way to compare fragments of texts in order to detect (supposed) results of copy & paste operations between articles in the domain of Natural Language Processing (NLP).

Predictive Modeling: Guessing the NLP Terms of Tomorrow

no code implementations LREC 2016 Gil Francopoulo, Joseph Mariani, Patrick Paroubek

Predictive modeling, often called "predictive analytics" in a commercial context, encompasses a variety of statistical techniques that analyze historical and present facts to make predictions about unknown events.

AppFM, une plate-forme de gestion de modules de TAL (AppFM, a tool for managing NLP modules)

no code implementations JEPTALNRECITAL 2016 Paul Bui-Quang, Brigitte Grau, Patrick Paroubek

AppFM is a tool halfway between an environment for building modular NLP pipelines and a system service manager.

Providing and Analyzing NLP Terms for our Community

no code implementations WS 2016 Gil Francopoulo, Joseph Mariani, Patrick Paroubek, Frédéric Vernier

By its own nature, the Natural Language Processing (NLP) community is a priori the best equipped to study the evolution of its own publications, but works in this direction are rare and only recently have we seen a few attempts at charting the field.

Named Entity Recognition (NER) Optical Character Recognition (OCR)

NLP Analytics in Finance with DoRe: A French 250M Tokens Corpus of Corporate Annual Reports

no code implementations LREC 2020 Corentin Masson, Patrick Paroubek

Recent advances in neural computing and word embeddings for semantic processing open up many new application areas which had been left unaddressed so far because of inadequate language understanding capacity.

Stock Market Prediction Word Embeddings

Natural Language Processing for Cognitive Analysis of Emotions

no code implementations 11 Oct 2022 Gustave Cortal, Alain Finkel, Patrick Paroubek, Lina Ye

Emotion analysis in texts suffers from two major limitations: annotated gold-standard corpora are mostly small and homogeneous, and emotion identification is often simplified as a sentence-level classification problem.

Emotion Recognition Management +1

Searching for Snippets of Open-Domain Dialogue in Task-Oriented Dialogue Datasets

no code implementations 23 Nov 2023 Armand Stricker, Patrick Paroubek

Most existing dialogue corpora and models have been designed to fit into two predominant categories: task-oriented dialogues portray functional goals, such as making a restaurant reservation or booking a plane ticket, while chit-chat/open-domain dialogues focus on holding a socially engaging talk with a user.

Enhancing Task-Oriented Dialogues with Chitchat: a Comparative Study Based on Lexical Diversity and Divergence

1 code implementation 23 Nov 2023 Armand Stricker, Patrick Paroubek

As a recent development, task-oriented dialogues (TODs) have been enriched with chitchat in an effort to make dialogues more diverse and engaging.

A Unified Approach to Emotion Detection and Task-Oriented Dialogue Modeling

1 code implementation 24 Jan 2024 Armand Stricker, Patrick Paroubek

In current text-based task-oriented dialogue (TOD) systems, user emotion detection (ED) is often overlooked or is typically treated as a separate and independent task, requiring additional training.

Language Modelling

Chitchat as Interference: Adding User Backstories to Task-Oriented Dialogues

1 code implementation 23 Feb 2024 Armand Stricker, Patrick Paroubek

During task-oriented dialogues (TODs), human users naturally introduce chitchat that is beyond the immediate scope of the task, interfering with the flow of the conversation.

MAPA Project: Ready-to-Go Open-Source Datasets and Deep Learning Technology to Remove Identifying Information from Text Documents

no code implementations LEGAL (LREC) 2022 Victoria Arranz, Khalid Choukri, Montse Cuadros, Aitor García Pablos, Lucie Gianola, Cyril Grouin, Manuel Herranz, Patrick Paroubek, Pierre Zweigenbaum

This paper presents the outcomes of the MAPA project, a set of annotated corpora for 24 languages of the European Union and an open-source customisable toolkit able to detect and substitute sensitive information in text documents from any domain, using state-of-the-art, deep learning-based named entity recognition techniques.

De-identification named-entity-recognition +2

A Fine-Grained Annotated Corpus for Target-Based Opinion Analysis of Economic and Financial Narratives

no code implementations EMNLP (ECONLP) 2021 Jiahui Hu, Patrick Paroubek

In this paper, we present our pre-annotation models and evaluations of their performance, introduce our annotation scheme and report on the main characteristics of our corpus.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA)

A Unifying View On Task-oriented Dialogue Annotation

1 code implementation LREC 2022 Vojtěch Hudeček, Leon-paul Schaub, Daniel Stancl, Patrick Paroubek, Ondřej Dušek

In this paper, we present a new dataset, obtained by merging four publicly available annotated corpora for task-oriented dialogues in several domains (MultiWOZ 2.2, CamRest676, DSTC2 and Schema-Guided Dialogue Dataset).

Dialogue Generation Dialogue State Tracking +1
