no code implementations • COLING (PEOPLES) 2020 • Nikola Ljubešić, Ilia Markov, Darja Fišer, Walter Daelemans
We further showcase the usage of the lexicons by calculating the difference in emotion distributions in texts containing and not containing socially unacceptable discourse, comparing them across four languages (English, Croatian, Dutch, Slovene) and two topics (migrants and LGBT).
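The lexicon-based comparison described in this entry can be sketched in a few lines. The mini-lexicon and example texts below are invented placeholders for illustration, not the paper's emotion lexicons or data:

```python
# Minimal sketch: compare emotion-word rates between two sets of texts.
# ANGER_LEXICON and the example texts are invented, not from the paper.

ANGER_LEXICON = {"hate", "furious", "angry"}

def emotion_rate(texts, lexicon):
    """Fraction of tokens that occur in the emotion lexicon."""
    tokens = [tok.lower() for text in texts for tok in text.split()]
    hits = sum(1 for tok in tokens if tok in lexicon)
    return hits / len(tokens) if tokens else 0.0

unacceptable = ["I hate this furious mob"]   # texts flagged as unacceptable
acceptable = ["what a lovely day today"]     # texts without such discourse

# The comparison boils down to a difference of such rates.
diff = emotion_rate(unacceptable, ANGER_LEXICON) - emotion_rate(acceptable, ANGER_LEXICON)
```

The same difference would be computed per emotion category and per language to build up the cross-lingual comparison.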
no code implementations • EACL (WASSA) 2021 • Ilia Markov, Nikola Ljubešić, Darja Fišer, Walter Daelemans
In this paper, we describe experiments designed to evaluate the impact of stylometric and emotion-based features on hate speech detection: the task of classifying textual content into hate or non-hate speech classes.
no code implementations • NAACL (BioNLP) 2021 • Madhumita Sushil, Simon Suster, Walter Daelemans
For evaluation of explanations, we create a synthetic sepsis-identification dataset, as well as apply our technique on additional clinical and sentiment analysis datasets.
no code implementations • NAACL (BioNLP) 2021 • Madhumita Sushil, Simon Suster, Walter Daelemans
We explore whether state-of-the-art BERT models encode sufficient domain knowledge to correctly perform domain-specific inference.
1 code implementation • NAACL (BioNLP) 2021 • Pieter Fivez, Simon Suster, Walter Daelemans
Recent research on robust representations of biomedical names has focused on modeling large amounts of fine-grained conceptual distinctions using complex neural encoders.
1 code implementation • EACL (Louhi) 2021 • Pieter Fivez, Simon Suster, Walter Daelemans
It has not yet been empirically confirmed that training biomedical name encoders on fine-grained distinctions automatically leads to bottom-up encoding of such higher-level semantics.
no code implementations • NAACL (NLP4IF) 2021 • Ilia Markov, Walter Daelemans
Hate speech detection is an actively growing field of research, with a variety of recently proposed approaches that have pushed the state-of-the-art results forward.
no code implementations • NAACL (NLP4IF) 2021 • Jens Lemmens, Ilia Markov, Walter Daelemans
We study the usefulness of hateful metaphors as features for the identification of the type and target of hate speech in Dutch Facebook comments.
no code implementations • COLING (LaTeCHCLfL, CLFL, LaTeCH) 2020 • Nikolay Banar, Walter Daelemans, Mike Kestemont
We investigate the use of Iconclass in the context of neural machine translation for NL<->EN artwork titles.
no code implementations • COLING 2022 • Jeska Buhmann, Maxime De Bruyn, Ehsan Lotfi, Walter Daelemans
In addition, we show that large groups of semantically similar questions are important for obtaining well-performing intent classification models.
no code implementations • EMNLP 2021 • Simon Suster, Pieter Fivez, Pietro Totis, Angelika Kimmig, Jesse Davis, Luc De Raedt, Walter Daelemans
While solving math word problems automatically has received considerable attention in the NLP community, few works have addressed probability word problems specifically.
no code implementations • TRAC (COLING) 2022 • Ilia Markov, Walter Daelemans
Online hate speech detection is an inherently challenging task that has recently received much attention from the natural language processing community.
no code implementations • 10 Dec 2024 • Ehsan Lotfi, Nikolay Banar, Nerses Yuzbashyan, Walter Daelemans
Using bBSARD, we conduct extensive benchmarking of retrieval models available for Dutch and French.
no code implementations • 14 Jun 2024 • Ine Gevers, Walter Daelemans
Using continuous pre-training, we control what entity knowledge is available to the model.
1 code implementation • 14 Jan 2024 • Ehsan Lotfi, Maxime De Bruyn, Jeska Buhmann, Walter Daelemans
The new wave of Large Language Models (LLMs) has offered an efficient tool to curate sizeable conversational datasets.
1 code implementation • COLING 2022 • Maxime De Bruyn, Ehsan Lotfi, Jeska Buhmann, Walter Daelemans
Automatic evaluation of open-domain dialogs remains an unsolved problem.
no code implementations • COLING 2022 • Jens Lemmens, Jens Van Nooten, Tim Kreutz, Walter Daelemans
We present CoNTACT: a Dutch language model adapted to the domain of COVID-19 tweets.
1 code implementation • LREC 2022 • Chris Emmery, Ákos Kádár, Grzegorz Chrupała, Walter Daelemans
The perturbed data, models, and code are available for reproduction at https://github.com/cmry/augtox
no code implementations • EMNLP (NLP4ConvAI) 2021 • Ehsan Lotfi, Maxime De Bruyn, Jeska Buhmann, Walter Daelemans
In this work we study the unsupervised selection abilities of pre-trained generative models (e.g. BART) and show that by adding a score-and-aggregate module between encoder and decoder, they are capable of learning to pick the proper knowledge through minimising the language modelling loss (i.e. without having access to knowledge labels).
1 code implementation • EMNLP (MRQA) 2021 • Maxime De Bruyn, Ehsan Lotfi, Jeska Buhmann, Walter Daelemans
In this paper, we present the first multilingual FAQ dataset publicly available.
1 code implementation • 2 Aug 2021 • Maxime De Bruyn, Ehsan Lotfi, Jeska Buhmann, Walter Daelemans
While powerful and efficient retrieval-based models exist for English, this is rarely the case for other languages, for which the same amount of training data is not available.
1 code implementation • EACL 2021 • Pieter Fivez, Simon Suster, Walter Daelemans
Effective representation of biomedical names for downstream NLP tasks requires the encoding of both lexical as well as domain-specific semantic information.
no code implementations • COLING 2020 • Ehsan Lotfi, Ilia Markov, Walter Daelemans
Native language identification (NLI), the task of identifying the native language (L1) of a person based on their writing in a second language (L2), is useful for a variety of purposes, including marketing, security, and educational applications.
no code implementations • WS 2020 • Jens Lemmens, Ben Burtenshaw, Ehsan Lotfi, Ilia Markov, Walter Daelemans
We present an ensemble approach for the detection of sarcasm in Reddit and Twitter responses in the context of The Second Workshop on Figurative Language Processing held in conjunction with ACL 2020.
no code implementations • 22 May 2020 • Nikolay Banar, Walter Daelemans, Mike Kestemont
To stimulate further research in this area and close the gap with subword-level NMT, we make all our code and models publicly available.
2 code implementations • 14 May 2020 • Madhumita Sushil, Simon Šuster, Walter Daelemans
For evaluation of explanations, we create a synthetic sepsis-identification dataset, as well as apply our technique on additional clinical and sentiment analysis datasets.
no code implementations • LREC 2020 • Tim Kreutz, Walter Daelemans
In these cases, key phrases that limit finding the competitive language are selected, and overall recall on the target language also decreases.
no code implementations • LREC 2020 • Stéphan Tulkens, Dominiek Sandra, Walter Daelemans
We consider the orthographic neighborhood effect: the effect that words with more orthographic similarity to other words are read faster.
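One common operationalisation of orthographic neighborhood size is Coltheart's N: the number of same-length words that differ from the target in exactly one letter. A minimal sketch, with an invented toy vocabulary:

```python
# Coltheart's N: count vocabulary words of the same length that differ from
# the target in exactly one letter position. The vocabulary is a toy example.

def neighborhood_size(word, vocabulary):
    def one_letter_apart(a, b):
        return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1
    return sum(1 for w in vocabulary if one_letter_apart(word, w))

vocab = {"cat", "bat", "hat", "cot", "dog", "cart"}
n = neighborhood_size("cat", vocab)  # neighbors: bat, hat, cot -> 3
```

Under the neighborhood effect, words with a larger N would be read faster than matched words with a smaller N.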
no code implementations • LREC 2020 • Georg Rehm, Katrin Marheinecke, Stefanie Hegele, Stelios Piperidis, Kalina Bontcheva, Jan Hajič, Khalid Choukri, Andrejs Vasiļjevs, Gerhard Backfried, Christoph Prinz, José Manuel Gómez Pérez, Luc Meertens, Paul Lukowicz, Josef van Genabith, Andrea Lösch, Philipp Slusallek, Morten Irgens, Patrick Gatellier, Joachim Köhler, Laure Le Bars, Dimitra Anastasiou, Albina Auksoriūtė, Núria Bel, António Branco, Gerhard Budin, Walter Daelemans, Koenraad De Smedt, Radovan Garabík, Maria Gavriilidou, Dagmar Gromann, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Jan Odijk, Maciej Ogrodniczuk, Eiríkur Rögnvaldsson, Mike Rosner, Bolette Sandford Pedersen, Inguna Skadiņa, Marko Tadić, Dan Tufiş, Tamás Váradi, Kadri Vider, Andy Way, François Yvon
Multilingualism is a cultural cornerstone of Europe and firmly anchored in the European treaties including full language equality.
1 code implementation • 25 Oct 2019 • Chris Emmery, Ben Verhoeven, Guy De Pauw, Gilles Jacobs, Cynthia Van Hee, Els Lefever, Bart Desmet, Véronique Hoste, Walter Daelemans
The detection of online cyberbullying has seen an increase in societal importance, popularity in research, and available open data.
no code implementations • 16 Oct 2019 • Simon Šuster, Madhumita Sushil, Walter Daelemans
Memory networks have been a popular choice among neural architectures for machine reading comprehension and question answering.
no code implementations • 15 Jun 2019 • Janneke van de Loo, Guy De Pauw, Walter Daelemans
This paper describes continuing work on semantic frame slot filling for a command and control task using a weakly-supervised approach.
1 code implementation • 30 Jan 2019 • Janneke van de Loo, Jort F. Gemmeke, Guy De Pauw, Bart Ons, Walter Daelemans, Hugo Van hamme
We present a framework for the induction of semantic frames from utterances in the context of an adaptive command-and-control interface.
1 code implementation • CONLL 2018 • Stéphan Tulkens, Dominiek Sandra, Walter Daelemans
We conclude that the neighborhood effect is unlikely to have a perceptual basis, but is more likely to be the result of items co-activating after recognition.
no code implementations • WS 2018 • Lisa Hilte, Walter Daelemans, Reinhild Vandekerckhove
We aim to predict Flemish adolescents' educational track based on their Dutch social media writing.
1 code implementation • WS 2018 • Simon Šuster, Madhumita Sushil, Walter Daelemans
Recently, segment convolutional neural networks have been proposed for end-to-end relation extraction in the clinical domain, achieving results comparable to or outperforming the approaches with heavy manual feature engineering.
no code implementations • 11 Sep 2018 • Tom De Smedt, Sylvia Jaki, Eduan Kotzé, Leïla Saoud, Maja Gwóźdź, Guy De Pauw, Walter Daelemans
In this report, we present a study of eight corpora of online hate speech, by demonstrating the NLP techniques that we used to collect and analyze the jihadist, extremist, racist, and sexist content.
1 code implementation • WS 2018 • Madhumita Sushil, Simon Šuster, Walter Daelemans
We find that the output rule-sets can explain the predictions of a neural network trained for 4-class text classification from the 20 newsgroups dataset to a macro-averaged F-score of 0.80.
no code implementations • COLING 2018 • Tim Kreutz, Walter Daelemans
Lexicon based methods for sentiment analysis rely on high quality polarity lexicons.
no code implementations • COLING 2018 • Tim Kreutz, Walter Daelemans
This paper describes CLiPS's submissions for the Discriminating between Dutch and Flemish in Subtitles (DFS) shared task at VarDial 2018.
no code implementations • 3 Jul 2018 • Madhumita Sushil, Simon Šuster, Kim Luyckx, Walter Daelemans
We compare the model performance of the feature set constructed from a bag of words to that obtained from medical concepts.
1 code implementation • NAACL 2018 • Simon Šuster, Walter Daelemans
We present a new dataset for machine comprehension in the medical domain.
Ranked #1 on Question Answering on CliCR
no code implementations • 17 Jan 2018 • Cynthia Van Hee, Gilles Jacobs, Chris Emmery, Bart Desmet, Els Lefever, Ben Verhoeven, Guy De Pauw, Walter Daelemans, Véronique Hoste
While social media offer great communication opportunities, they also increase the vulnerability of young people to threatening situations online.
no code implementations • 14 Nov 2017 • Madhumita Sushil, Simon Šuster, Kim Luyckx, Walter Daelemans
To understand and interpret the representations, we explore the best encoded features within the patient representations obtained from the autoencoder model.
1 code implementation • 19 Oct 2017 • Pieter Fivez, Simon Šuster, Walter Daelemans
We present an unsupervised context-sensitive spelling correction method for clinical free-text that uses word and character n-gram embeddings.
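The candidate-generation step underlying such correctors can be sketched with a simple frequency-ranked baseline. Note that this omits the paper's embedding-based context model, and the toy frequency list is invented:

```python
# Baseline sketch: generate all edit-distance-1 candidates and pick the most
# frequent known word. The actual method re-ranks candidates with word and
# character n-gram embeddings of the context; that step is omitted here.
import string
from collections import Counter

def candidates(word):
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    substitutes = [l + c + r[1:] for l, r in splits if r for c in letters]
    inserts = [l + c + r for l, r in splits for c in letters]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    return set(deletes + substitutes + inserts + transposes)

def correct(word, freq):
    known = [c for c in candidates(word) if c in freq] or [word]
    return max(known, key=lambda c: freq[c])

freq = Counter({"patient": 10, "patience": 2})  # toy corpus counts
fixed = correct("patiemt", freq)  # -> "patient"
```

Context sensitivity would enter at the ranking step, scoring each candidate by how well it fits the surrounding clinical text rather than by raw frequency alone.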
no code implementations • WS 2017 • Chris Emmery, Grzegorz Chrupa{\l}a, Walter Daelemans
The majority of research on extracting missing user attributes from social media profiles uses costly hand-annotated labels for supervised learning.
no code implementations • RANLP 2017 • Lea Canales, Walter Daelemans, Ester Boldrini, Patricio Martínez-Barco
Our objective in this paper is to show the pre-annotation process, as well as to evaluate the usability of subjective and polarity information in this process.
no code implementations • WS 2017 • Enrique Manjavacas, Jeroen De Gussem, Walter Daelemans, Mike Kestemont
Recent applications of neural language models have led to an increased interest in the automatic generation of natural language.
no code implementations • WS 2017 • Pieter Fivez, Simon Šuster, Walter Daelemans
We present an unsupervised context-sensitive spelling correction method for clinical free-text that uses word and character n-gram embeddings.
1 code implementation • WS 2017 • Simon Šuster, Stéphan Tulkens, Walter Daelemans
Clinical NLP has an immense potential in contributing to how clinical practice will be revolutionized by the advent of large scale processing of clinical records.
1 code implementation • 31 Aug 2016 • Stéphan Tulkens, Lisa Hilte, Elise Lodewyckx, Ben Verhoeven, Walter Daelemans
The best-performing model used the manually cleaned dictionary and obtained an F-score of 0.46 for the racist class on a test set consisting of unseen Dutch comments, retrieved from the same sites used for the training set.
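For reference, a per-class F-score such as the reported 0.46 is the harmonic mean of precision and recall. The counts below are invented to reproduce that value and are not the paper's:

```python
# F-score from true positives, false positives, and false negatives.
# The counts are invented for illustration; they are not the paper's.

def f1(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

score = f1(tp=46, fp=54, fn=54)  # precision = recall = 0.46, so F1 = 0.46
```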
2 code implementations • WS 2016 • Stéphan Tulkens, Simon Šuster, Walter Daelemans
In this paper, we report a knowledge-based method for Word Sense Disambiguation in the domains of biomedical and clinical text.
1 code implementation • LREC 2016 • Stéphan Tulkens, Chris Emmery, Walter Daelemans
With this research, we provide the embeddings themselves, the relation evaluation task benchmark for use in further research, and demonstrate how the benchmarked embeddings prove a useful unsupervised linguistic resource, effectively used in a downstream task.
no code implementations • LREC 2016 • Ben Verhoeven, Walter Daelemans, Barbara Plank
Personality profiling is the task of detecting personality traits of authors based on writing style.
no code implementations • 13 Jan 2016 • Vincent Van Asch, Walter Daelemans
The goal of this paper is to investigate the connection between the performance gain that can be obtained by self-training and the similarity between the corpora used in this approach.
no code implementations • 11 Jan 2016 • Claudia Peersman, Walter Daelemans, Reinhild Vandekerckhove, Bram Vandekerckhove, Leona Van Vaerenbergh
We present a corpus-based analysis of the effects of age, gender and region of origin on the production of both "netspeak" or "chatspeak" features and regional speech features in Flemish Dutch posts that were collected from a Belgian online social network platform.
no code implementations • LREC 2014 • Ben Verhoeven, Walter Daelemans
We performed a supervised machine learning experiment using the SVM algorithm in a 10-fold cross-validation setup.
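The fold bookkeeping behind such a 10-fold setup can be sketched with the standard library. In practice one would shuffle (and often stratify) the indices first; the item count here is arbitrary:

```python
# Split n item indices into k near-equal folds, as used in k-fold
# cross-validation. Real experiments shuffle and stratify the indices first.

def k_fold(n_items, k=10):
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

folds = k_fold(25, k=10)  # each fold serves once as the held-out test set
```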
no code implementations • LREC 2014 • Georg Rehm, Hans Uszkoreit, Sophia Ananiadou, Núria Bel, Audronė Bielevičienė, Lars Borin, António Branco, Gerhard Budin, Nicoletta Calzolari, Walter Daelemans, Radovan Garabík, Marko Grobelnik, Carmen García-Mateo, Josef van Genabith, Jan Hajič, Inma Hernáez, John Judge, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Joseph Mariani, John McNaught, Maite Melero, Monica Monachini, Asunción Moreno, Jan Odijk, Maciej Ogrodniczuk, Piotr Pęzik, Stelios Piperidis, Adam Przepiórkowski, Eiríkur Rögnvaldsson, Michael Rosner, Bolette Pedersen, Inguna Skadiņa, Koenraad De Smedt, Marko Tadić, Paul Thompson, Dan Tufiş, Tamás Váradi, Andrejs Vasiļjevs, Kadri Vider, Jolanta Zabarskaite
This article provides an overview of the dissemination work carried out in META-NET from 2010 until early 2014; we describe its impact on the regional, national and international level, mainly with regard to politics and the situation of funding for LT topics.
no code implementations • LREC 2012 • Mike Kestemont, Claudia Peersman, Benny De Decker, Guy De Pauw, Kim Luyckx, Roser Morante, Frederik Vaassen, Janneke van de Loo, Walter Daelemans
Although in recent years numerous forms of Internet communication ― such as e-mail, blogs, chat rooms and social network environments ― have emerged, balanced corpora of Internet speech with trustworthy meta-information (e.g. age and gender) or linguistic annotations are still limited.
no code implementations • LREC 2012 • Roser Morante, Walter Daelemans
In this paper we present ConanDoyle-neg, a corpus of stories by Conan Doyle annotated with negation information.
no code implementations • LREC 2012 • Tom De Smedt, Walter Daelemans
The lexicon is a dictionary of 1,100 adjectives that occur frequently in online product reviews, manually annotated with polarity strength, subjectivity and intensity, for each word sense.
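Scoring with such a lexicon typically averages the polarity of the lexicon words found in a text. A minimal sketch, with an invented mini-lexicon standing in for the full dictionary:

```python
# Average the polarity strength of lexicon adjectives found in a text.
# LEXICON is an invented stand-in for the paper's 1,100-adjective dictionary.

LEXICON = {"great": 0.8, "nice": 0.6, "awful": -0.8, "slow": -0.4}

def polarity(text, lexicon):
    scores = [lexicon[tok] for tok in text.lower().split() if tok in lexicon]
    return sum(scores) / len(scores) if scores else 0.0

score = polarity("Great camera but awful battery", LEXICON)  # (0.8 - 0.8) / 2 = 0.0
```

A full implementation would additionally disambiguate word senses, since the lexicon annotates polarity per sense rather than per word form.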
no code implementations • 22 Dec 1998 • Walter Daelemans, Antal Van den Bosch, Jakub Zavrel
We provide explanations for both results in terms of the properties of the natural language processing tasks and the learning algorithms.
1 code implementation • 11 Jul 1996 • Walter Daelemans, Jakub Zavrel, Peter Berck, Steven Gillis
In this paper we show that a large-scale application of the memory-based approach is feasible: we obtain a tagging accuracy that is on a par with that of known statistical approaches, and with attractive space and time complexity properties when using IGTree, a tree-based formalism for indexing and searching huge case bases.