Search Results for author: Pierre Zweigenbaum

Found 69 papers, 8 papers with code

TL-Explorer: A Digital Humanities Tool for Mapping and Analyzing Translated Literature

no code implementations • COLING (LaTeCHCLfL, CLFL, LaTeCH) 2020 • Alex Zhai, Zheng Zhang, Amel Fraisse, Ronald Jenn, Shelley Fisher Fishkin, Pierre Zweigenbaum

TL-Explorer is a digital humanities tool for mapping and analyzing translated literature, encompassing the World Map and the Translation Dashboard.

Translation

The Multilingual Anonymisation Toolkit for Public Administrations (MAPA) Project

no code implementations • EAMT 2020 • Ēriks Ajausks, Victoria Arranz, Laurent Bié, Aleix Cerdà-i-Cucó, Khalid Choukri, Montse Cuadros, Hans Degroote, Amando Estela, Thierry Etchegoyhen, Mercedes García-Martínez, Aitor García-Pablos, Manuel Herranz, Alejandro Kohan, Maite Melero, Mike Rosner, Roberts Rozis, Patrick Paroubek, Artūrs Vasiļevskis, Pierre Zweigenbaum

We describe the MAPA project, funded under the Connecting Europe Facility programme, whose goal is the development of an open-source de-identification toolkit for all official European Union languages.

De-identification

Re-train or Train from Scratch? Comparing Pre-training Strategies of BERT in the Medical Domain

no code implementations • LREC 2022 • Hicham El Boukkouri, Olivier Ferret, Thomas Lavergne, Pierre Zweigenbaum

BERT models used in specialized domains all seem to be the result of a simple strategy: initializing with the original BERT and then resuming pre-training on a specialized corpus.

Cross-lingual Approaches for the Detection of Adverse Drug Reactions in German from a Patient’s Perspective

1 code implementation • LREC 2022 • Lisa Raithel, Philippe Thomas, Roland Roller, Oliver Sapina, Sebastian Möller, Pierre Zweigenbaum

In this work, we present the first corpus for German Adverse Drug Reaction (ADR) detection in patient-generated content.

Binary Classification Few-Shot Learning

Building Comparable Corpora for Assessing Multi-Word Term Alignment

no code implementations • LREC 2022 • Omar Adjali, Emmanuel Morin, Pierre Zweigenbaum

To that aim, we exploit parallel corpora to perform automatic bilingual MWT extraction and comparable corpus construction.

Machine Translation

Differential Evaluation: a Qualitative Analysis of Natural Language Processing System Behavior Based Upon Data Resistance to Processing

no code implementations • EMNLP (Eval4NLP) 2021 • Lucie Gianola, Hicham El Boukkouri, Cyril Grouin, Thomas Lavergne, Patrick Paroubek, Pierre Zweigenbaum

MAPA Project: Ready-to-Go Open-Source Datasets and Deep Learning Technology to Remove Identifying Information from Text Documents

no code implementations • LEGAL (LREC) 2022 • Victoria Arranz, Khalid Choukri, Montse Cuadros, Aitor García Pablos, Lucie Gianola, Cyril Grouin, Manuel Herranz, Patrick Paroubek, Pierre Zweigenbaum

This paper presents the outcomes of the MAPA project, a set of annotated corpora for 24 languages of the European Union and an open-source customisable toolkit able to detect and substitute sensitive information in text documents from any domain, using state-of-the-art, deep learning-based named entity recognition techniques.

De-identification named-entity-recognition +2

SEME at SemEval-2024 Task 2: Comparing Masked and Generative Language Models on Natural Language Inference for Clinical Trials

no code implementations • 5 Apr 2024 • Mathilde Aguiar, Pierre Zweigenbaum, Nona Naderi

This paper describes our submission to Task 2 of SemEval-2024: Safe Biomedical Natural Language Inference for Clinical Trials.

Natural Language Inference Task 2

A Dataset for Pharmacovigilance in German, French, and Japanese: Annotating Adverse Drug Reactions across Languages

2 code implementations • 27 Mar 2024 • Lisa Raithel, Hui-Syuan Yeh, Shuntaro Yada, Cyril Grouin, Thomas Lavergne, Aurélie Névéol, Patrick Paroubek, Philippe Thomas, Tomohiro Nishiyama, Sebastian Möller, Eiji Aramaki, Yuji Matsumoto, Roland Roller, Pierre Zweigenbaum

User-generated data sources have gained significance in uncovering Adverse Drug Reactions (ADRs), with an increasing number of discussions occurring in the digital world.

Attribute

Cross-lingual Approaches for the Detection of Adverse Drug Reactions in German from a Patient's Perspective

1 code implementation • 3 Aug 2022 • Lisa Raithel, Philippe Thomas, Roland Roller, Oliver Sapina, Sebastian Möller, Pierre Zweigenbaum

In this work, we present the first corpus for German Adverse Drug Reaction (ADR) detection in patient-generated content.

Binary Classification Few-Shot Learning

Decorate the Examples: A Simple Method of Prompt Design for Biomedical Relation Extraction

no code implementations • LREC 2022 • Hui-Syuan Yeh, Thomas Lavergne, Pierre Zweigenbaum

In this paper, we investigate prompting for biomedical relation extraction, with experiments on the ChemProt dataset.

Cloze Test Relation +1

Global alignment for relation extraction in Microbiology

no code implementations • 25 Nov 2021 • Anfu Tang, Claire Nédellec, Pierre Zweigenbaum, Louise Deléger, Robert Bossy

We investigate a method to extract relations from texts based on global alignment and syntactic information.

Relation Relation Extraction

Does constituency analysis enhance domain-specific pre-trained BERT models for relation extraction?

1 code implementation • 25 Nov 2021 • Anfu Tang, Louise Deléger, Robert Bossy, Pierre Zweigenbaum, Claire Nédellec

Recently many studies have been conducted on the topic of relation extraction.

DrugProt Relation

C-Norm: a neural approach to few-shot entity normalization

1 code implementation • BMC Bioinformatics 2020 • Arnaud Ferré, Louise Deléger, Robert Bossy, Pierre Zweigenbaum, Claire Nédellec

Entity normalization is an important information extraction task which has gained renewed attention in the last decade, particularly in the biomedical and life science domains.

Ranked #1 on Medical Concept Normalization on BB-norm-phenotype

Few-Shot Learning Medical Concept Normalization

CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters

2 code implementations • COLING 2020 • Hicham El Boukkouri, Olivier Ferret, Thomas Lavergne, Hiroshi Noji, Pierre Zweigenbaum, Junichi Tsujii

Due to the compelling improvements brought by BERT, many recent representation models adopted the Transformer architecture as their main building block, consequently inheriting the wordpiece tokenization system despite it not being intrinsically linked to the notion of Transformers.

Ranked #1 on Semantic Similarity on ClinicalSTS

Clinical Concept Extraction Drug–drug Interaction Extraction +3

Handling Entity Normalization with no Annotated Corpus: Weakly Supervised Methods Based on Distributional Representation and Ontological Information

no code implementations • LREC 2020 • Arnaud Ferré, Robert Bossy, Mouhamadou Ba, Louise Deléger, Thomas Lavergne, Pierre Zweigenbaum, Claire Nédellec

We propose a new approach to address the scarcity of training data that extends the CONTES method by corpus selection, pre-processing and weak supervision strategies, which can yield high-performance results without any manually annotated examples.

BIG-bench Machine Learning Entity Linking

Overview of the Fourth BUCC Shared Task: Bilingual Dictionary Induction from Comparable Corpora

no code implementations • LREC 2020 • Reinhard Rapp, Pierre Zweigenbaum, Serge Sharoff

The shared task of the 13th Workshop on Building and Using Comparable Corpora was devoted to the induction of bilingual dictionaries from comparable rather than parallel corpora.

Cross-Lingual Contextual Word Embeddings Mapping With Multi-Sense Words In Mind

no code implementations • 18 Sep 2019 • Zheng Zhang, Ruiqing Yin, Jun Zhu, Pierre Zweigenbaum

Recent work in cross-lingual contextual word embedding learning cannot handle multi-sense words well.

Bilingual Lexicon Induction Word Embeddings

Embedding Strategies for Specialized Domains: Application to Clinical Entity Recognition

1 code implementation • ACL 2019 • Hicham El Boukkouri, Olivier Ferret, Thomas Lavergne, Pierre Zweigenbaum

Using pre-trained word embeddings in conjunction with Deep Learning models has become the "de facto" approach in Natural Language Processing (NLP).

Ranked #4 on Clinical Concept Extraction on 2010 i2b2/VA

Clinical Concept Extraction Word Embeddings

GNEG: Graph-Based Negative Sampling for word2vec

no code implementations • ACL 2018 • Zheng Zhang, Pierre Zweigenbaum

Negative sampling is an important component in word2vec for distributed word representation learning.

Language Modelling Representation Learning +1

Efficient Generation and Processing of Word Co-occurrence Networks Using corpus2graph

1 code implementation • WS 2018 • Zheng Zhang, Pierre Zweigenbaum, Ruiqing Yin

Corpus2graph is an open-source NLP-application-oriented tool that generates a word co-occurrence network from a large corpus.

Keyword Extraction

Combining rule-based and embedding-based approaches to normalize textual entities with an ontology

no code implementations • LREC 2018 • Arnaud Ferré, Louise Deléger, Pierre Zweigenbaum, Claire Nédellec

Entity Linking Word Embeddings

Automating Document Discovery in the Systematic Review Process: How to Use Chaff to Extract Wheat

no code implementations • LREC 2018 • Christopher Norman, Mariska Leeflang, Pierre Zweigenbaum, Aurélie Névéol

Decision Making

A Multilingual Dataset for Evaluating Parallel Sentence Extraction from Comparable Corpora

no code implementations • LREC 2018 • Pierre Zweigenbaum, Serge Sharoff, Reinhard Rapp

Machine Translation Semantic Textual Similarity +1

Three Dimensions of Reproducibility in Natural Language Processing

no code implementations • LREC 2018 • K. Bretonnel Cohen, Jingbo Xia, Pierre Zweigenbaum, Tiffany Callahan, Orin Hargraves, Foster Goss, Nancy Ide, Aurélie Névéol, Cyril Grouin, Lawrence E. Hunter

Détection des couples de termes translittérés à partir d'un corpus parallèle anglais-arabe (Detection of transliterated term pairs from an English-Arabic parallel corpus)

no code implementations • JEPTALNRECITAL 2018 • Wafa Neifar, Thierry Hamon, Pierre Zweigenbaum, Mariem Ellouze, Lamia Hadrich Belguith

Automatic classification of doctor-patient questions for a virtual patient record query task

no code implementations • WS 2017 • Leonardo Campillos Llanos, Sophie Rosset, Pierre Zweigenbaum

We present the work-in-progress of automating the classification of doctor-patient questions in the context of a simulated consultation with a virtual patient.

BIG-bench Machine Learning Dialogue Management +4

Representation of complex terms in a vector space structured by an ontology for a normalization task

no code implementations • WS 2017 • Arnaud Ferré, Pierre Zweigenbaum, Claire Nédellec

The method generates continuous vector representations of complex terms in a semantic space structured by the ontology.

zNLP: Identifying Parallel Sentences in Chinese-English Comparable Corpora

no code implementations • WS 2017 • Zheng Zhang, Pierre Zweigenbaum

This paper describes the zNLP system for the BUCC 2017 shared task.

Machine Translation Sentence

Overview of the Second BUCC Shared Task: Spotting Parallel Sentences in Comparable Corpora

no code implementations • WS 2017 • Pierre Zweigenbaum, Serge Sharoff, Reinhard Rapp

We examined manually a small sample of the false negative sentence pairs for the most precise French-English runs and estimated the number of parallel sentence pairs not yet in the provided gold standard.

Machine Translation Sentence

Traitement automatique de la langue biomédicale au LIMSI (Biomedical language processing at LIMSI)

no code implementations • JEPTALNRECITAL 2017 • Christopher Norman, Cyril Grouin, Thomas Lavergne, Aurélie Névéol, Pierre Zweigenbaum

We offer demonstrations of three tools developed by LIMSI in natural language processing applied to the biomedical domain: the detection of medical concepts in short texts, the categorization of scientific articles to assist in writing systematic reviews, and the anonymization of clinical texts.

Tri Automatique de la Littérature pour les Revues Systématiques (Automatically Ranking the Literature in Support of Systematic Reviews)

no code implementations • JEPTALNRECITAL 2017 • Christopher Norman, Mariska Leeflang, Pierre Zweigenbaum, Aurélie Névéol

We apply a logistic regression model to two corpora drawn from systematic reviews conducted in the fields of natural language processing and drug efficacy.

Classification General Classification

Détection de concepts et granularité de l'annotation (Concept detection and annotation granularity)

no code implementations • JEPTALNRECITAL 2017 • Pierre Zweigenbaum, Thomas Lavergne

We hypothesize that annotation at a finer level of granularity, typically at the utterance level, should improve the performance of an automatic detector trained on these data.

Supervised classification of end-of-lines in clinical text with no manual annotation

no code implementations • WS 2016 • Pierre Zweigenbaum, Cyril Grouin, Thomas Lavergne

In some plain text documents, end-of-line marks may or may not mark the boundary of a text unit (e.g., of a paragraph).

General Classification Language Modelling +1

Detection of Text Reuse in French Medical Corpora

no code implementations • WS 2016 • Eva D'hondt, Cyril Grouin, Aurélie Névéol, Efstathios Stamatatos, Pierre Zweigenbaum

Electronic Health Records (EHRs) are increasingly available in modern health care institutions either through the direct creation of electronic documents in hospitals' health information systems, or through the digitization of historical paper records.

De-identification Optical Character Recognition (OCR)

A Dataset for ICD-10 Coding of Death Certificates: Creation and Usage

no code implementations • WS 2016 • Thomas Lavergne, Aurélie Névéol, Aude Robert, Cyril Grouin, Grégoire Rey, Pierre Zweigenbaum

Very few datasets have been released for the evaluation of diagnosis coding with the International Classification of Diseases, and only one so far in a language other than English.

Named Entity Recognition (NER)

Hybrid methods for ICD-10 coding of death certificates

no code implementations • WS 2016 • Pierre Zweigenbaum, Thomas Lavergne

Overview of the Regulatory Network of Plant Seed Development (SeeDev) Task at the BioNLP Shared Task 2016.

no code implementations • WS 2016 • Estelle Chaix, Bertrand Dubreucq, Abdelhak Fatihi, Dialekti Valsamou, Robert Bossy, Mouhamadou Ba, Louise Deléger, Pierre Zweigenbaum, Philippe Bessières, Loic Lepiniec, Claire Nédellec

Entity Extraction using GAN

Impact de l'agglutination dans l'extraction de termes en arabe standard moderne (Adaptation of a term extractor to the Modern Standard Arabic language)

no code implementations • JEPTALNRECITAL 2016 • Wafa Neifar, Thierry Hamon, Pierre Zweigenbaum, Mariem Ellouze, Lamia Hadrich Belguith

The adaptation first consisted in describing the term extraction process in a way similar to that defined for English and French, while taking into account certain morpho-syntactic particularities of the Arabic language.

Une catégorisation de fins de lignes non-supervisée (End-of-line classification with no supervision)

no code implementations • JEPTALNRECITAL 2016 • Pierre Zweigenbaum, Cyril Grouin, Thomas Lavergne

We propose a fully unsupervised method to determine whether an end of line should be seen as a simple space or as a true text-unit boundary, and test it on a corpus of medical reports.

Managing Linguistic and Terminological Variation in a Medical Dialogue System

no code implementations • LREC 2016 • Leonardo Campillos Llanos, Dhouha Bouamor, Pierre Zweigenbaum, Sophie Rosset

We introduce a dialogue task between a virtual patient and a doctor where the dialogue system, playing the patient part in a simulated consultation, must reconcile a specialized level, to understand what the doctor says, and a lay level, to output realistic patient-language utterances.

Sentence Spoken Language Understanding

Identification of Drug-Related Medical Conditions in Social Media

no code implementations • LREC 2016 • François Morlane-Hondère, Cyril Grouin, Pierre Zweigenbaum

When trained on the output of the first classifier, the second classifier's performance is as follows: p=0.683; r=0.956; f1=0.797.

Transfer-Based Learning-to-Rank Assessment of Medical Term Technicality

no code implementations • LREC 2016 • Dhouha Bouamor, Leonardo Campillos Llanos, Anne-Laure Ligozat, Sophie Rosset, Pierre Zweigenbaum

While measuring the readability of texts has been a long-standing research topic, assessing the technicality of terms has only been addressed more recently and mostly for the English language.

Language Modelling Learning-To-Rank

Description of the PatientGenesys Dialogue System

no code implementations • WS 2015 • Leonardo Campillos Llanos, Dhouha Bouamor, Éric Bilinski, Anne-Laure Ligozat, Pierre Zweigenbaum, Sophie Rosset

BUCC Shared Task: Cross-Language Document Similarity

no code implementations • WS 2015 • Serge Sharoff, Pierre Zweigenbaum, Reinhard Rapp

Étude des verbes introducteurs de noms de médicaments dans les forums de santé (A study of verbs introducing drug names in health forums)

no code implementations • JEPTALNRECITAL 2015 • François Morlane-Hondère, Cyril Grouin, Pierre Zweigenbaum

We believe that analyzing these variants could make it possible to model the errors forum users make when writing drug names, and consequently to improve information retrieval systems.

Un patient virtuel dialogant (A conversing virtual patient)

no code implementations • JEPTALNRECITAL 2015 • Leonardo Campillos, Dhouha Bouamor, Éric Bilinski, Anne-Laure Ligozat, Pierre Zweigenbaum, Sophie Rosset

The demonstrator we describe here is a prototype dialogue system whose goal is to simulate a patient.

Médicaments qui soignent, médicaments qui rendent malades : étude des relations causales pour identifier les effets secondaires (Drugs that cure, drugs that make people ill: a study of causal relations to identify side effects)

no code implementations • JEPTALNRECITAL 2015 • François Morlane-Hondère, Cyril Grouin, Véronique Moriceau, Pierre Zweigenbaum

In this article, we examine how the links between a medical treatment and a side effect are expressed.

Identification de facteurs de risque pour des patients diabétiques à partir de comptes-rendus cliniques par des approches hybrides (Identification of risk factors for diabetic patients from clinical reports using hybrid approaches)

no code implementations • JEPTALNRECITAL 2015 • Cyril Grouin, Véronique Moriceau, Sophie Rosset, Pierre Zweigenbaum

In this article, we present the methods we developed to analyze hospital reports written in English.

Automatic Analysis of Scientific and Literary Texts. Presentation and Results of the DEFT2014 Text Mining Challenge (Analyse automatique de textes littéraires et scientifiques : présentation et résultats du défi fouille de texte DEFT2014) [in French]

no code implementations • JEPTALNRECITAL 2014 • Thierry Hamon, Quentin Pleplé, Patrick Paroubek, Pierre Zweigenbaum, Cyril Grouin

Opinion Mining

Use of unsupervised word classes for entity recognition: Application to the detection of disorders in clinical reports

no code implementations • LREC 2014 • Maria Evangelia Chatzimina, Cyril Grouin, Pierre Zweigenbaum

We design and test two syntax-based methods to produce word classes: one applies the Brown clustering algorithm to syntactic dependencies, the other collects latent categories created by a PCFG-LA parser.

Chunking Clustering +2

Annotation of specialized corpora using a comprehensive entity and relation scheme

no code implementations • LREC 2014 • Louise Deléger, Anne-Laure Ligozat, Cyril Grouin, Pierre Zweigenbaum, Aurélie Névéol

We present the annotation scheme as well as the results of a pilot annotation study covering 35 clinical documents in a variety of subfields and genres.

Relation

Language Resources for French in the Biomedical Domain

no code implementations • LREC 2014 • Aurélie Névéol, Julien Grosjean, Stéfan Darmoni, Pierre Zweigenbaum

The biomedical domain offers a wealth of linguistic resources for Natural Language Processing, including terminologies and corpora.

The Quaero French Medical Corpus: A Ressource for Medical Entity Recognition and Normalization

no code implementations • LREC 2014 • Cyril Grouin, Jeremy Leixa, Aurélie Névéol, Sophie Rosset, Xavier Tannier, Pierre Zweigenbaum

Overall, a total of 26,409 entity annotations were mapped to 5,797 unique UMLS concepts.

Building Specialized Bilingual Lexicons Using Large Scale Background Knowledge

no code implementations • EMNLP 2013 • Dhouha Bouamor, Adrian Popescu, Nasredine Semmar, Pierre Zweigenbaum

Information Retrieval Machine Translation

Building Specialized Bilingual Lexicons Using Word Sense Disambiguation

no code implementations • IJCNLP 2013 • Dhouha Bouamor, Nasredine Semmar, Pierre Zweigenbaum

Semantic Textual Similarity Word Sense Disambiguation

Context Vector Disambiguation for Bilingual Lexicon Extraction from Comparable Corpora

no code implementations • ACL 2013 • Dhouha Bouamor, Nasredine Semmar, Pierre Zweigenbaum

Semantic Textual Similarity Word Sense Disambiguation

Overview of BioNLP Shared Task 2013

no code implementations • WS 2013 • Claire Nédellec, Robert Bossy, Jin-Dong Kim, Jung-jae Kim, Tomoko Ohta, Sampo Pyysalo, Pierre Zweigenbaum

Automatic Named Entity Pre-annotation for Out-of-domain Human Annotation

no code implementations • WS 2013 • Sophie Rosset, Cyril Grouin, Thomas Lavergne, Mohamed Ben Jannet, Jérémy Leixa, Olivier Galibert, Pierre Zweigenbaum

Using WordNet and Semantic Similarity for Bilingual Terminology Mining from Comparable Corpora

no code implementations • WS 2013 • Dhouha Bouamor, Nasredine Semmar, Pierre Zweigenbaum

Information Retrieval Machine Translation +3

Using semantic similarity for bilingual lexicon extraction from comparable corpora (Utilisation de la similarité sémantique pour l'extraction de lexiques bilingues à partir de corpus comparables) [in French]

no code implementations • JEPTALNRECITAL 2013 • Dhouha Bouamor, Nasredine Semmar, Pierre Zweigenbaum

Extraction of temporal relations between clinical events in clinical documents (Extraction des relations temporelles entre événements médicaux dans des comptes rendus hospitaliers) [in French]

no code implementations • JEPTALNRECITAL 2013 • Pierre Zweigenbaum, Xavier Tannier

Automatic Construction of a MultiWord Expressions Bilingual Lexicon: A Statistical Machine Translation Evaluation Perspective

no code implementations • WS 2012 • Dhouha Bouamor, Nasredine Semmar, Pierre Zweigenbaum

Machine Translation Translation

Manual Corpus Annotation: Giving Meaning to the Evaluation Metrics

no code implementations • COLING 2012 • Yann Mathet, Antoine Widlöcher, Karën Fort, Claire François, Olivier Galibert, Cyril Grouin, Juliette Kahn, Sophie Rosset, Pierre Zweigenbaum

Structured Named Entities in two distinct press corpora: Contemporary Broadcast News and Old Newspapers

no code implementations • WS 2012 • Sophie Rosset, Cyril Grouin, Karën Fort, Olivier Galibert, Juliette Kahn, Pierre Zweigenbaum

Named Entity Recognition (NER)

Extraction d'information automatique en domaine médical par projection inter-langue : vers un passage à l'échelle (Automatic Information Extraction in the Medical Domain by Cross-Lingual Projection) [in French]

no code implementations • JEPTALNRECITAL 2012 • Asma Ben Abacha, Pierre Zweigenbaum, Aurélien Max

Indexation libre et contrôlée d'articles scientifiques. Présentation et résultats du défi fouille de textes DEFT2012 (Controlled and free indexing of scientific papers. Presentation and results of the DEFT2012 text-mining challenge) [in French]

no code implementations • JEPTALNRECITAL 2012 • Patrick Paroubek, Pierre Zweigenbaum, Dominic Forest, Cyril Grouin

Lemmatization

Extended Named Entities Annotation on OCRed Documents: From Corpus Constitution to Evaluation Campaign

no code implementations • LREC 2012 • Olivier Galibert, Sophie Rosset, Cyril Grouin, Pierre Zweigenbaum, Ludovic Quintard

Within the framework of the Quaero project, we proposed a new definition of named entities, based upon an extension of the coverage of named entities as well as the structure of those named entities.

Named Entity Recognition (NER) Optical Character Recognition (OCR)

Identifying bilingual Multi-Word Expressions for Statistical Machine Translation

no code implementations • LREC 2012 • Dhouha Bouamor, Nasredine Semmar, Pierre Zweigenbaum

MultiWord Expressions (MWEs) represent a key issue for numerous applications in Natural Language Processing (NLP), especially for Machine Translation (MT).

Machine Translation Translation
