Search Results for author: Chris Emmery

Found 15 papers, 7 papers with code

Native Language Identification with Big Bird Embeddings

1 code implementation 13 Sep 2023 Sergey Kramp, Giovanni Cassani, Chris Emmery

Native Language Identification (NLI) intends to classify an author's native language based on their writing in another language.

Computational Efficiency Feature Engineering +1

Tailoring Domain Adaptation for Machine Translation Quality Estimation

1 code implementation 18 Apr 2023 Javad PourMostafa Roshan Sharami, Dimitar Shterionov, Frédéric Blain, Eva Vanmassenhove, Mirella De Sisto, Chris Emmery, Pieter Spronck

While quality estimation (QE) can play an important role in the translation process, its effectiveness relies on the availability and quality of training data.

Data Augmentation Domain Adaptation +3

User-Centered Security in Natural Language Processing

no code implementations 10 Jan 2023 Chris Emmery

This dissertation proposes a framework of user-centered security in Natural Language Processing (NLP), and demonstrates how it can improve the accessibility of related research.

Privacy Preserving

Neural Data-to-Text Generation Based on Small Datasets: Comparing the Added Value of Two Semi-Supervised Learning Approaches on Top of a Large Language Model

no code implementations 14 Jul 2022 Chris van der Lee, Thiago Castro Ferreira, Chris Emmery, Travis Wiltshire, Emiel Krahmer

In terms of output quality, extending the training set of a data-to-text system with a language model using the pseudo-labeling approach did increase text quality scores, but the data augmentation approach yielded similar scores to the system without training set extension.

Data Augmentation Data-to-Text Generation +2

Cyberbullying Classifiers are Sensitive to Model-Agnostic Perturbations

1 code implementation LREC 2022 Chris Emmery, Ákos Kádár, Grzegorz Chrupała, Walter Daelemans

The perturbed data, models, and code are available for reproduction at https://github.com/cmry/augtox

Adversarial Stylometry in the Wild: Transferable Lexical Substitution Attacks on Author Profiling

1 code implementation EACL 2021 Chris Emmery, Ákos Kádár, Grzegorz Chrupała

Written language contains stylistic cues that can be exploited to automatically infer a variety of potentially sensitive author information.

Privacy Preserving

Style Obfuscation by Invariance

1 code implementation COLING 2018 Chris Emmery, Enrique Manjavacas, Grzegorz Chrupała

The task of obfuscating writing style using sequence models has previously been investigated under the framework of obfuscation-by-transfer, where the input text is explicitly rewritten in another style.

Style Transfer

Automatic Detection of Cyberbullying in Social Media Text

no code implementations 17 Jan 2018 Cynthia Van Hee, Gilles Jacobs, Chris Emmery, Bart Desmet, Els Lefever, Ben Verhoeven, Guy De Pauw, Walter Daelemans, Véronique Hoste

While social media offer great communication opportunities, they also increase the vulnerability of young people to threatening situations online.

Binary Classification

Simple Queries as Distant Labels for Predicting Gender on Twitter

no code implementations WS 2017 Chris Emmery, Grzegorz Chrupała, Walter Daelemans

The majority of research on extracting missing user attributes from social media profiles use costly hand-annotated labels for supervised learning.

Gender Classification General Classification

Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource

1 code implementation LREC 2016 Stéphan Tulkens, Chris Emmery, Walter Daelemans

With this research, we provide the embeddings themselves, the relation evaluation task benchmark for use in further research, and demonstrate how the benchmarked embeddings prove a useful unsupervised linguistic resource, effectively used in a downstream task.

Dialect Identification Relation +1

The Development of Dutch and Afrikaans Language Resources for Compound Boundary Analysis

no code implementations LREC 2014 Menno van Zaanen, Gerhard van Huyssteen, Suzanne Aussems, Chris Emmery, Roald Eiselen

Whereas in languages such as English the components that make up a compound are separated by a space, in languages such as Finnish, German, Afrikaans and Dutch these components are concatenated into one word.

Boundary Detection
