no code implementations • WS (NoDaLiDa) 2019 • Marina Santini, Benjamin Danielsson, Arne Jönsson
We explore the effectiveness of four feature representations – bag-of-words, word embeddings, principal components and autoencoders – for the binary categorization of the easy-to-read variety vs standard language.
no code implementations • READI (LREC) 2022 • Evelina Rennes, Marina Santini, Arne Jönsson
The toolkit allows users to select the types of simplification that meet the specific needs of the target audience they belong to.
no code implementations • TERM (LREC) 2022 • Oskar Jerdhaf, Marina Santini, Peter Lundberg, Tomas Bjerner, Yosef Al-Abasse, Arne Jönsson, Thomas Vakili
In the experiments briefly presented in this abstract, we compare the performance of a generalist Swedish pre-trained language model with that of a domain-specific Swedish pre-trained model on the downstream task of focused terminology extraction of implant terms, i.e. terms that indicate the presence of implants in patients' bodies.
no code implementations • LEGAL (LREC) 2022 • Olle Bridal, Thomas Vakili, Marina Santini
Privacy preservation of sensitive information is one of the main concerns in clinical text mining.
no code implementations • LREC 2022 • Benjamin Danielsson, Marina Santini, Peter Lundberg, Yosef Al-Abasse, Arne Jönsson, Emma Eneling, Magnus Stridsman
In this paper, we compare the performance of two BERT-based text classifiers whose task is to classify patients (more precisely, their medical histories) as having or not having implant(s) in their body.
no code implementations • LREC 2020 • Marina Santini, Arne Jönsson, Evelina Rennes
In this paper, we propose visualizing results of a corpus-based study on text complexity using radar charts.