We present an extended version of a tool developed for calculating linguistic distances and asymmetries in auditory perception of closely related languages.
We tested the widely used Penn Discourse Treebank full parser (Lin et al., 2010) and the state-of-the-art neural models NeuralEDUSeg (Wang et al., 2018) and XLNet (Yang et al., 2019) on two-stage discourse segmentation and discourse relation recognition.
We focus on the syntactic variation and measure syntactic distances between nine Slavic languages (Belarusian, Bulgarian, Croatian, Czech, Polish, Slovak, Slovene, Russian, and Ukrainian) using symmetric measures of insertion, deletion and movement of syntactic units in the parallel sentences of the fable “The North Wind and the Sun”.
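As a rough, purely illustrative sketch of counting edit operations between aligned token sequences (not the tool's actual distance measure), one could align two parallel clauses and tally insertions, deletions, and replacements; the example clauses below are hypothetical, diacritics-stripped approximations of Czech and Slovak.

```python
from difflib import SequenceMatcher

def edit_distance_ratio(tokens_a, tokens_b):
    """Toy symmetric distance: share of tokens touched by insert/delete/replace
    operations when aligning two token sequences."""
    ops = SequenceMatcher(a=tokens_a, b=tokens_b).get_opcodes()
    edited = sum(max(i2 - i1, j2 - j1)
                 for tag, i1, i2, j1, j2 in ops if tag != "equal")
    return edited / max(len(tokens_a), len(tokens_b))

# Hypothetical, diacritics-stripped parallel clauses (roughly Czech vs. Slovak)
cs = "severni vitr a slunce se hadali".split()
sk = "severny vietor a slnko sa hadali".split()
print(f"{edit_distance_ratio(cs, sk):.2f}")
```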
In recent years, voice-controlled personal assistants have revolutionized the interaction with smart devices and mobile applications.
Mickaël Rigault, Claudia Cevenini, Khalid Choukri, Martin Kocour, Karel Veselý, Igor Szoke, Petr Motlicek, Juan Pablo Zuluaga-Gomez, Alexander Blatt, Dietrich Klakow, Allan Tart, Pavel Kolčárek, Jan Černocký
In this paper the authors detail the various legal and ethical issues faced during the ATCO2 project.
We hypothesise that the ISO 24617-2 dialogue act annotation framework adequately supports sales negotiation assessment in the domain of call centre conversations.
In this paper, label propagation-based semi-supervised learning is explored for the task of hate speech classification.
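A minimal sketch of the general idea, using scikit-learn's LabelPropagation over TF-IDF features; the toy comments and labels below are invented for illustration and do not reflect the paper's data or features.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.semi_supervised import LabelPropagation

# Toy data: -1 marks unlabeled comments whose labels we want to propagate.
texts = ["you are awful people", "have a nice day",
         "awful awful people", "nice game today"]
labels = np.array([1, 0, -1, -1])  # 1 = hate, 0 = neutral, -1 = unlabeled

X = TfidfVectorizer().fit_transform(texts).toarray()
model = LabelPropagation(kernel="rbf", gamma=1.0).fit(X, labels)
print(model.transduction_)  # propagated labels for all four comments
```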
We apply hyperbolic embeddings to trace the dynamics of change of conceptual-semantic relationships in a large diachronic scientific corpus (200 years).
The paper presents a novel discourse-based approach to argument quality assessment defined as a graph classification task, where the depth of reasoning (argumentation) is evident from the number and type of detected discourse units and relations between them.
In this paper, we take a closer analytical look at AWEs learned from English speech and study how the choice of the learning objective and the architecture shapes their representational profile.
22 Oct 2022 • David Ifeoluwa Adelani, Graham Neubig, Sebastian Ruder, Shruti Rijhwani, Michael Beukman, Chester Palen-Michel, Constantine Lignos, Jesujoba O. Alabi, Shamsuddeen H. Muhammad, Peter Nabende, Cheikh M. Bamba Dione, Andiswa Bukula, Rooweither Mabuya, Bonaventure F. P. Dossou, Blessing Sibanda, Happy Buzaaba, Jonathan Mukiibi, Godson Kalipe, Derguene Mbaye, Amelia Taylor, Fatoumata Kabore, Chris Chinenye Emezue, Anuoluwapo Aremu, Perez Ogayo, Catherine Gitau, Edwin Munkoh-Buabeng, Victoire M. Koagne, Allahsera Auguste Tapo, Tebogo Macucwa, Vukosi Marivate, Elvis Mboning, Tajuddeen Gwadabe, Tosin Adewumi, Orevaoghene Ahia, Joyce Nakatumba-Nabende, Neo L. Mokono, Ignatius Ezeani, Chiamaka Chukwuneke, Mofetoluwa Adeyemi, Gilles Q. Hacheme, Idris Abdulmumin, Odunayo Ogundepo, Oreen Yousuf, Tatiana Moteu Ngoli, Dietrich Klakow
African languages are spoken by over a billion people, but are underrepresented in NLP research and development.
In noisy environments, speech can be hard to understand for humans.
Models of acoustic word embeddings (AWEs) learn to map variable-length spoken word segments onto fixed-dimensionality vector representations such that different acoustic exemplars of the same word are projected nearby in the embedding space.
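As a minimal sketch (architecture and dimensions are assumptions, and the contrastive training objective is omitted), such an encoder can be a bidirectional GRU whose final states are projected to a fixed-size, L2-normalised vector:

```python
import torch
import torch.nn as nn

class AWEEncoder(nn.Module):
    """Toy acoustic word embedding encoder: a BiGRU over acoustic frames
    whose final hidden states form a fixed-dimensional word embedding."""
    def __init__(self, n_mels=40, hidden=256, emb_dim=128):
        super().__init__()
        self.rnn = nn.GRU(n_mels, hidden, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden, emb_dim)

    def forward(self, frames):            # frames: (batch, time, n_mels)
        _, h = self.rnn(frames)           # h: (2, batch, hidden)
        h = torch.cat([h[0], h[1]], dim=-1)
        return nn.functional.normalize(self.proj(h), dim=-1)

# Two word segments padded to the same number of acoustic frames
segments = torch.randn(2, 80, 40)
print(AWEEncoder()(segments).shape)       # torch.Size([2, 128])
```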
We present an LSTM-based autoregressive language model which uses prefix embeddings (from a pretrained masked language model) via fusion (e.g., concatenation) to obtain a richer context representation for language modelling.
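A minimal sketch of the fusion-by-concatenation idea (module names, dimensions, and the use of a single prefix vector per sequence are illustrative assumptions):

```python
import torch
import torch.nn as nn

class FusionLSTMLM(nn.Module):
    """Toy autoregressive LM: at every step the token embedding is fused
    (here: concatenated) with a precomputed prefix embedding."""
    def __init__(self, vocab, emb=256, prefix_dim=768, hidden=512):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb + prefix_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, tokens, prefix_emb):
        # tokens: (batch, seq); prefix_emb: (batch, prefix_dim), e.g. an MLM's [CLS] vector
        x = self.emb(tokens)                                   # (batch, seq, emb)
        p = prefix_emb.unsqueeze(1).expand(-1, x.size(1), -1)  # broadcast over time
        h, _ = self.lstm(torch.cat([x, p], dim=-1))
        return self.out(h)                                     # next-token logits

logits = FusionLSTMLM(vocab=10000)(torch.randint(0, 10000, (2, 12)), torch.randn(2, 768))
print(logits.shape)  # torch.Size([2, 12, 10000])
```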
Transferring knowledge from one domain to another is of practical importance for many tasks in natural language processing, especially when the amount of available data in the target domain is limited.
However, text classification in low-resource languages is still challenging due to the lack of annotated data.
Analyzing ethnic or religious bias is important for improving fairness, accountability, and transparency of natural language processing models.
The detection and normalization of temporal expressions is an important task and a preprocessing step for many applications.
Recent research on style transfer takes inspiration from unsupervised neural machine translation (UNMT), learning from large amounts of non-parallel data by exploiting cycle consistency loss, back-translation, and denoising autoencoders.
However, labels from weak supervision can be rather noisy, and the high capacity of DNNs makes them prone to overfitting these noisy labels.
David Ifeoluwa Adelani, Jesujoba Oluwadara Alabi, Angela Fan, Julia Kreutzer, Xiaoyu Shen, Machel Reid, Dana Ruiter, Dietrich Klakow, Peter Nabende, Ernie Chang, Tajuddeen Gwadabe, Freshia Sackey, Bonaventure F. P. Dossou, Chris Chinenye Emezue, Colin Leong, Michael Beukman, Shamsuddeen Hassan Muhammad, Guyo Dub Jarso, Oreen Yousuf, Andre Niyongabo Rubungo, Gilles Hacheme, Eric Peter Wairagala, Muhammad Umair Nasir, Benjamin Ayoade Ajibade, Tunde Oluwaseyi Ajayi, Yvonne Wambui Gitau, Jade Abbott, Mohamed Ahmed, Millicent Ochieng, Anuoluwapo Aremu, Perez Ogayo, Jonathan Mukiibi, Fatoumata Ouoba Kabore, Godson Koffi Kalipe, Derguene Mbaye, Allahsera Auguste Tapo, Victoire Memdjokam Koagne, Edwin Munkoh-Buabeng, Valencia Wagner, Idris Abdulmumin, Ayodele Awokoya, Happy Buzaaba, Blessing Sibanda, Andiswa Bukula, Sam Manthalu
We focus on two questions: 1) How can pre-trained models be used for languages not included in the initial pre-training?
Even though hate speech (HS) online has been an important object of research in the last decade, most HS-related corpora over-simplify the phenomenon of hate by attempting to label user comments as "hate" or "neutral".
Learning semantically meaningful sentence embeddings is an open problem in natural language processing.
Incorrect labels in training data occur when human annotators make mistakes or when the data is generated via weak or distant supervision.
The introduced data augmentation yields additional gains on high-WER transcripts and allows the model to adapt to unseen airspaces.
Multilingual pre-trained language models (PLMs) have demonstrated impressive performance on several downstream tasks for both high-resourced and low-resourced languages.
Finally, we show that PCA can be combined with quantization to one bit per dimension.
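A rough illustration of that combination (dimensionalities and the sign-based binarisation are assumptions): reduce the embeddings with PCA, then keep one bit per dimension and compare words via Hamming distance.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 300))        # stand-in word embeddings

reduced = PCA(n_components=128).fit_transform(embeddings)
bits = (reduced > 0)                              # 1 bit per dimension (sign)
packed = np.packbits(bits, axis=1)                # 128 dims -> 16 bytes per word

# Hamming distance as a cheap proxy for dissimilarity between two words
hamming = np.unpackbits(packed[0] ^ packed[1]).sum()
print(packed.shape, hamming)
```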
The field of natural language processing (NLP) has recently seen a large change towards using pre-trained language models for solving almost any task.
Even though most interfaces in the real world are discrete, there is not yet an efficient way to train neural networks to make use of them.
Characterizing these errors in easily interpretable terms not only gives insight into whether a classifier is prone to making systematic errors, but also provides a way to act on and improve the classifier.
We further discuss the implications of our work on modeling speech processing and language similarity with neural networks.
Documents as short as a single sentence may inadvertently reveal sensitive information about their authors, including, e.g., their gender or ethnicity.
For most language combinations, parallel data is either scarce or simply unavailable.
We evaluate the intelligibility of synonyms in context and find that choosing the lexical unit that is less likely to be misheard than its synonym yields an average gain in comprehension of 37% at SNR -5 dB and 21% at SNR 0 dB for babble noise.
Welcome to WeaSuL 2021, the First Workshop on Weakly Supervised Learning, co-located with ICLR 2021.
Our experiments show that (1) the distance in the embedding space in the best cases only moderately correlates with phonological distance, and (2) improving the performance on the word discrimination task does not necessarily yield models that better reflect word phonological similarity.
We observe that, on both similar and distant target tasks and across all languages, the subspace-based representations transfer more effectively than standard BERT representations in the zero-shot setting, with improvements between F1 +10.9 and F1 +42.9 over the baselines across all tested monolingual and cross-lingual scenarios.
For this, we study the effects of model transfer on sequence labeling across various domains and tasks and show that our methods based on model similarity and support vector machines are able to predict promising sources, resulting in performance increases of up to 24 F1 points.
We present a deep neural model of spoken word recognition which is trained to retrieve the meaning of a word (in the form of a word embedding) given its spoken form, a task which resembles that faced by a human listener.
Theories and models of spoken word recognition aim to explain the process of accessing lexical knowledge given an acoustic realization of a word form.
Distant supervision allows obtaining labeled training corpora for low-resource settings where only limited hand-annotated data exists.
This is done using a transfer learning approach, where the parameters learned by an emoji-based source task are transferred to a sentiment target task.
Distant and weak supervision make it possible to obtain large amounts of labeled training data quickly and cheaply, but these automatic annotations tend to contain many errors.
We present our submission and results for SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020) where we participated in offensive tweet classification tasks in English, Arabic, Greek, Turkish and Danish.
Transformer-based language models achieve high performance on various tasks, but we still lack understanding of the kind of linguistic knowledge they learn and rely on.
Building on these recent developments, and with the aim of improving the quality of generated captions, the contribution of our work in this paper is two-fold: First, we propose a generic multimodal model fusion framework for caption generation as well as emendation where we utilize different fusion strategies to integrate a pretrained Auxiliary Language Model (AuxLM) within the traditional encoder-decoder visual captioning frameworks.
Combining several embeddings typically improves performance in downstream tasks as different embeddings encode different information.
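Two common ways of combining embeddings are concatenation and dimension-aligned averaging; the sketch below is a generic illustration of these meta-embedding operations, not the paper's method, and the embedding tables are random stand-ins.

```python
import numpy as np

def concat_meta(emb_a, emb_b):
    """Meta-embedding by concatenation (dimensions simply add up)."""
    return np.concatenate([emb_a, emb_b], axis=-1)

def average_meta(emb_a, emb_b):
    """Meta-embedding by averaging; the smaller space is zero-padded
    to the larger dimensionality for illustration."""
    d = max(emb_a.shape[-1], emb_b.shape[-1])
    pad = lambda e: np.pad(e, [(0, 0), (0, d - e.shape[-1])])
    return (pad(emb_a) + pad(emb_b)) / 2.0

table_a = np.random.randn(5, 300)   # hypothetical embedding table A
table_b = np.random.randn(5, 100)   # hypothetical embedding table B
print(concat_meta(table_a, table_b).shape)   # (5, 400)
print(average_meta(table_a, table_b).shape)  # (5, 300)
```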
Deep neural networks and huge language models are becoming omnipresent in natural language applications.
In this paper, we present a neural model for Slavic language identification in speech signals and analyze its emergent representations to investigate whether they reflect objective measures of language relatedness and/or non-linguists' perception of language similarity.
Multilingual transformer models like mBERT and XLM-RoBERTa have obtained great improvements for many NLP tasks on a variety of languages.
Our analysis reveals that while fine-tuning indeed changes the representations of a pre-trained model, and these changes are typically larger for higher layers, only in very few cases does fine-tuning have a positive effect on probing accuracy that is larger than just using the pre-trained model with a strong pooling method.
A lot of real-world phenomena are complex and cannot be captured by single task annotations.
Machine Learning approaches to Natural Language Processing tasks benefit from a comprehensive collection of real-life user data.
State-of-the-art spoken language identification (LID) systems, which are based on end-to-end deep neural networks, have shown remarkable success not only in discriminating between distant languages but also between closely-related languages or even different spoken varieties of the same language.
In particular, these image features are subdivided into global and local features, where global features are extracted from the global representation of the image, while local features are extracted from the objects detected locally in an image.
Generating longer textual sequences when conditioned on the visual information is an interesting problem to explore.
Differentially private stochastic gradient descent (DPSGD) is a variation of stochastic gradient descent based on the Differential Privacy (DP) paradigm, which can mitigate privacy threats that arise from the presence of sensitive information in training data.
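A toy sketch of one DP-SGD step in plain PyTorch (the per-example loop, clipping norm, and noise multiplier are illustrative; production code would use a DP library with proper privacy accounting):

```python
import torch

def dpsgd_step(model, loss_fn, batch_x, batch_y, lr=0.1, clip=1.0, noise_mult=1.0):
    """One toy DP-SGD step: clip each per-example gradient to norm `clip`,
    sum, add Gaussian noise scaled by `noise_mult * clip`, average, update."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    for x, y in zip(batch_x, batch_y):                     # per-example gradients
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in params))
        scale = min(1.0, clip / (norm + 1e-12))
        for g, p in zip(summed, params):
            g += p.grad * scale                            # clipped gradient

    n = len(batch_x)
    with torch.no_grad():
        for p, g in zip(params, summed):
            noisy = (g + noise_mult * clip * torch.randn_like(g)) / n
            p -= lr * noisy
```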
Although multitask learning has achieved improved performance in some problems, there are also tasks that lose performance when trained together.
Fine-tuning pre-trained transformer-based language models such as BERT has become a common practice dominating leaderboards across various NLP benchmarks.
The neural attention model has achieved great success in data-to-text generation tasks.
In air traffic control, assistant systems support air traffic controllers in their work.
Techniques such as distant and weak supervision can be used to create labeled data in a (semi-) automatic way.
We propose the Two-sidEd Attentive conditional Generative Adversarial Network (TEA-cGAN) to generate semantically manipulated images while keeping other content, such as the background, intact.
In low-resource settings, the performance of supervised labeling models can be improved with automatically annotated or distantly supervised data, which is cheap to create but often noisy.
As a result, the content to be described in the text cannot be explicitly controlled.
Languages may be differently distant from each other and their mutual intelligibility may be asymmetric.
Interest in Artificial Intelligence (AI) and its applications has seen unprecedented growth in the last few years.
This work explores cross-lingual transfer learning (TL) for named entity recognition, focusing on bootstrapping Japanese from English.
However, they typically cannot serve as a drop-in replacement for conventional single-sense embeddings, because the correct sense vector needs to be selected for each word.
In our experiments on Chunking and NER, this approach performs more robustly than the baselines.
Developing conventional natural language generation systems requires extensive attention from human experts in order to craft complex sets of sentence planning rules.
Recently, Kannan et al. proposed several logit regularization methods to improve the adversarial robustness of classifiers.
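One of these methods, adversarial logit pairing, penalises the gap between the logits produced on clean and adversarial inputs; the sketch below is a simplified, illustrative rendering of that loss, with adversarial-example generation (e.g., PGD/FGSM) left out.

```python
import torch.nn.functional as F

def adversarial_logit_pairing_loss(model, x, y, x_adv, lam=0.5):
    """Toy adversarial logit pairing: standard loss on adversarial inputs
    plus a penalty on the distance between clean and adversarial logits."""
    logits_clean = model(x)
    logits_adv = model(x_adv)
    ce = F.cross_entropy(logits_adv, y)
    pairing = F.mse_loss(logits_adv, logits_clean)
    return ce + lam * pairing

# x_adv would come from an attack such as PGD or FGSM (not shown here).
```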
Sequence-to-Sequence (seq2seq) models have become overwhelmingly popular in building end-to-end trainable dialogue systems.
Volha Petukhova, Andrei Malchanau, Youssef Oualil, Dietrich Klakow, Saturnino Luz, Fasih Haider, Nick Campbell, Dimitris Koryzis, Dimitris Spiliotopoulos, Pierre Albert, Nicklas Linz, Jan Alexandersson
The performance of Neural Network (NN)-based language models is steadily improving due to the emergence of new architectures, which are able to learn different natural language characteristics.
The goal of language modeling techniques is to capture the statistical and structural properties of natural languages from training corpora.
Training large vocabulary Neural Network Language Models (NNLMs) is a difficult task due to the explicit requirement of the output layer normalization, which typically involves the evaluation of the full softmax function over the complete vocabulary.
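The cost stems from scoring every vocabulary entry just to normalise a single probability; the sketch below contrasts the full softmax with a simple sampled approximation (vocabulary size, dimensions, and the sampling scheme are illustrative, not the paper's method).

```python
import torch
import torch.nn.functional as F

V, d = 50_000, 256                        # vocabulary size, hidden size
W = torch.randn(V, d) * 0.01              # output embedding matrix
h = torch.randn(d)                        # hidden state for one position
target = 42

# Full softmax: scores for *all* V words just to normalise one probability.
log_p_full = F.log_softmax(W @ h, dim=0)[target]

# Sampled approximation: score the target against a few random negatives only.
neg = torch.randint(0, V, (64,))
logits = torch.cat([(W[target] @ h).unsqueeze(0), W[neg] @ h])
loss_sampled = F.cross_entropy(logits.unsqueeze(0), torch.tensor([0]))
print(log_p_full.item(), loss_sampled.item())
```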
Word embeddings are high-dimensional vector representations of words and are thus difficult to interpret.
Feedforward Neural Network (FNN)-based language models estimate the probability of the next word based on the history of the last N words, whereas Recurrent Neural Networks (RNN) perform the same task based only on the last word and some context information that cycles in the network.
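A minimal sketch contrasting the two model families (vocabulary size, window length, and dimensions are illustrative): the FNN LM sees only a fixed window of the last N words, while the RNN LM consumes one word at a time and carries context in its recurrent state.

```python
import torch
import torch.nn as nn

class FNNLM(nn.Module):
    """Feedforward LM: predicts the next word from a fixed window of N words."""
    def __init__(self, vocab, n=4, emb=64, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.mlp = nn.Sequential(nn.Linear(n * emb, hidden), nn.Tanh(),
                                 nn.Linear(hidden, vocab))

    def forward(self, last_n):                        # (batch, n)
        return self.mlp(self.emb(last_n).flatten(1))  # (batch, vocab)

class RNNLM(nn.Module):
    """Recurrent LM: conditions on the last word plus the recurrent state."""
    def __init__(self, vocab, emb=64, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.rnn = nn.RNN(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, tokens):                        # (batch, seq)
        h, _ = self.rnn(self.emb(tokens))
        return self.out(h)                            # (batch, seq, vocab)

print(FNNLM(1000)(torch.randint(0, 1000, (2, 4))).shape)   # (2, 1000)
print(RNNLM(1000)(torch.randint(0, 1000, (2, 10))).shape)  # (2, 10, 1000)
```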
We augmented pre-trained word embeddings with these novel embeddings and evaluated on a rare word similarity task, obtaining up to a threefold improvement in correlation over the original set of embeddings.
In an intercomprehension scenario, typically a native speaker of language L1 is confronted with output from an unknown, but related language L2.
This paper describes a method to automatically create dialogue resources annotated with dialogue act information by reusing existing dialogue corpora.
Volha Petukhova, Martin Gropp, Dietrich Klakow, Gregor Eigner, Mario Topf, Stefan Srb, Petr Motlicek, Blaise Potard, John Dines, Olivier Deroo, Ronny Egeler, Uwe Meinz, Steffen Liersch, Anna Schmidt
We first conduct human-human Wizard-of-Oz experiments to collect data for modelling natural human dialogue behaviour, in order to better understand the phenomena of human interaction and to predict interlocutors' actions; we then replace the human Wizard with an increasingly advanced dialogue system, using evaluation data for system improvement.
In the TAC KBP 2013 English Slotfilling evaluation, the submitted main run of the LSV RelationFactory system achieved the top-ranked F1-score of 37.3%.
We present a gold standard for semantic relation extraction in the food domain for German.
In this paper we explore a task-driven approach to interfacing NLP components, where language processing is guided by the end-task that each application requires.