Even though most interfaces in the real world are discrete, there is as yet no efficient way to train neural networks to make use of them.
Through two real-world case studies we confirm that Premise gives clear and actionable insight into the systematic errors made by modern NLP classifiers.
We further discuss the implications of our work for modeling speech processing and language similarity with neural networks.
Documents as short as a single sentence may inadvertently reveal sensitive information about their authors, such as their gender or ethnicity.
For most language combinations, parallel data is either scarce or simply unavailable.
We evaluate the intelligibility of synonyms in context and find that choosing the lexical unit that is less likely to be misheard than its synonym yields an average comprehension gain of 37% at SNR -5 dB and 21% at SNR 0 dB in babble noise.
Welcome to WeaSuL 2021, the First Workshop on Weakly Supervised Learning, co-located with ICLR 2021.
Our experiments show that (1) the distance in the embedding space in the best cases only moderately correlates with phonological distance, and (2) improving the performance on the word discrimination task does not necessarily yield models that better reflect word phonological similarity.
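To make this kind of analysis concrete, here is a minimal sketch of such a correlation study: pairwise distances in a learned embedding space are compared against independently computed phonological distances (e.g. phoneme-level edit distance). All vectors and distances below are dummies, not values from the paper.

```python
# Hedged sketch: correlate embedding-space distances with phonological distances.
import numpy as np
from scipy.stats import spearmanr
from scipy.spatial.distance import cosine

emb = {"cat": np.random.randn(64), "cap": np.random.randn(64),
       "dog": np.random.randn(64)}                       # dummy word embeddings
phon_dist = {("cat", "cap"): 1, ("cat", "dog"): 3,
             ("cap", "dog"): 3}                          # illustrative phoneme edit distances

pairs = list(phon_dist)
emb_d = [cosine(emb[a], emb[b]) for a, b in pairs]       # cosine *distance* per pair
phon_d = [phon_dist[p] for p in pairs]
rho, p = spearmanr(emb_d, phon_d)                        # rank correlation between the two
print(f"Spearman rho = {rho:.2f}")
```

A moderate rho in such a study would match the finding quoted above: the embedding geometry tracks phonological similarity only partially.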
We observe that, on both similar and distant target tasks and across all languages, the subspace-based representations transfer more effectively than standard BERT representations in the zero-shot setting, with improvements between F1 +10.9 and F1 +42.9 over the baselines across all tested monolingual and cross-lingual scenarios.
For this, we study the effects of model transfer on sequence labeling across various domains and tasks and show that our methods based on model similarity and support vector machines are able to predict promising sources, resulting in performance increases of up to 24 F1 points.
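As a hedged illustration of this kind of source selection (not the paper's exact pipeline), one can fit an SVM regressor that maps a model-similarity feature to the transfer gain observed on known source-target pairs, then rank candidate sources by the predicted gain. All numbers below are dummies.

```python
# Hedged sketch: predict promising transfer sources from model similarity.
import numpy as np
from sklearn.svm import SVR

sim = np.array([[0.9], [0.4], [0.7], [0.2]])     # similarity of each source model to the target
gain = np.array([21.0, 3.5, 12.0, 1.0])          # observed F1 gain after transfer (dummy data)

ranker = SVR(kernel="rbf").fit(sim, gain)        # learn similarity -> gain
candidates = np.array([[0.8], [0.3]])            # unseen candidate sources
scores = ranker.predict(candidates)
best = candidates[np.argmax(scores)]             # pick the most promising source
```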
Theories and models of spoken word recognition aim to explain the process of accessing lexical knowledge given an acoustic realization of a word form.
We present a deep neural model of spoken word recognition which is trained to retrieve the meaning of a word (in the form of a word embedding) given its spoken form, a task which resembles that faced by a human listener.
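A minimal sketch of this mapping task, assuming nothing about the paper's actual architecture: an acoustic encoder (here an illustrative GRU) maps a sequence of speech frames to a fixed-size vector and is trained to bring that vector close to the pretrained embedding of the spoken word.

```python
# Hedged sketch: train an acoustic encoder to predict a word's embedding.
import torch
import torch.nn as nn

class SpokenWordEncoder(nn.Module):
    def __init__(self, n_mel=40, hidden=256, emb_dim=300):
        super().__init__()
        self.rnn = nn.GRU(n_mel, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, emb_dim)

    def forward(self, frames):               # frames: (batch, time, n_mel)
        _, h = self.rnn(frames)              # final hidden state: (1, batch, hidden)
        return self.proj(h.squeeze(0))       # predicted embedding: (batch, emb_dim)

model = SpokenWordEncoder()
loss_fn = nn.CosineEmbeddingLoss()           # pull predictions toward target embeddings

frames = torch.randn(8, 120, 40)             # dummy batch of log-mel features
targets = torch.randn(8, 300)                # pretrained embeddings of the spoken words
loss = loss_fn(model(frames), targets, torch.ones(8))
loss.backward()
```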
Distant supervision allows obtaining labeled training corpora for low-resource settings where only limited hand-annotated data exists.
This is done using a transfer learning approach, where the parameters learned by an emoji-based source task are transferred to a sentiment target task.
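The recipe can be sketched as follows; the module names and sizes are illustrative assumptions, not the paper's. The key step is copying the encoder parameters trained on the emoji source task into the sentiment model before fine-tuning.

```python
# Hedged sketch of the transfer step: emoji source task -> sentiment target task.
import torch
import torch.nn as nn

class TextEncoder(nn.Module):
    def __init__(self, vocab=30000, dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)

    def forward(self, ids):                  # ids: (batch, seq_len)
        _, h = self.rnn(self.emb(ids))
        return h.squeeze(0)                  # sentence representation: (batch, dim)

# Source task: predict which of 64 emojis accompanies a text.
encoder = TextEncoder()
emoji_head = nn.Linear(128, 64)
# ... train encoder + emoji_head on distantly labeled emoji data ...

# Target task: transfer the learned encoder parameters, attach a fresh head.
sentiment_encoder = TextEncoder()
sentiment_encoder.load_state_dict(encoder.state_dict())  # the actual transfer step
sentiment_head = nn.Linear(128, 3)           # negative / neutral / positive
# ... fine-tune sentiment_encoder + sentiment_head on sentiment data ...
```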
Distant and weak supervision make it possible to obtain large amounts of labeled training data quickly and cheaply, but these automatic annotations tend to contain a large number of errors.
We present our submission and results for SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020) where we participated in offensive tweet classification tasks in English, Arabic, Greek, Turkish and Danish.
Transformer-based language models achieve high performance on various tasks, but we still lack understanding of the kind of linguistic knowledge they learn and rely on.
Building on these recent developments, and with the aim of improving the quality of generated captions, the contribution of our work in this paper is two-fold: First, we propose a generic multimodal model fusion framework for caption generation as well as emendation where we utilize different fusion strategies to integrate a pretrained Auxiliary Language Model (AuxLM) within the traditional encoder-decoder visual captioning frameworks.
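One common fusion strategy, shown here as a hedged sketch (the framework above is more general and supports several strategies), is shallow fusion: the caption decoder's next-token scores are interpolated with those of the pretrained AuxLM at decoding time.

```python
# Hedged sketch of shallow fusion between a caption decoder and an AuxLM.
import torch

def shallow_fusion(decoder_logits, auxlm_logits, lam=0.3):
    """Both inputs: (batch, vocab) unnormalized scores over the same vocabulary."""
    log_p_dec = torch.log_softmax(decoder_logits, dim=-1)
    log_p_lm = torch.log_softmax(auxlm_logits, dim=-1)
    return log_p_dec + lam * log_p_lm        # fused scores for picking the next token

dec = torch.randn(2, 5000)                   # dummy decoder scores
lm = torch.randn(2, 5000)                    # dummy AuxLM scores
next_token = shallow_fusion(dec, lm).argmax(dim=-1)
```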
Deep neural networks and huge language models are becoming omnipresent in natural language applications.
Second, the different embedding types can form clusters in the common embedding space, preventing the computation of a meaningful average of different embeddings and thus, reducing performance.
In this paper, we present a neural model for Slavic language identification in speech signals and analyze its emergent representations to investigate whether they reflect objective measures of language relatedness and/or non-linguists' perception of language similarity.
Multilingual transformer models like mBERT and XLM-RoBERTa have obtained great improvements for many NLP tasks on a variety of languages.
Our analysis reveals that while fine-tuning indeed changes the representations of a pre-trained model, and these changes are typically larger for higher layers, only in very few cases does fine-tuning have a positive effect on probing accuracy that is larger than just using the pre-trained model with a strong pooling method.
Machine Learning approaches to Natural Language Processing tasks benefit from a comprehensive collection of real-life user data.
State-of-the-art spoken language identification (LID) systems, which are based on end-to-end deep neural networks, have shown remarkable success not only in discriminating between distant languages but also between closely-related languages or even different spoken varieties of the same language.
In particular, these image features are subdivided into global and local features, where global features are extracted from the global representation of the image, while local features are extracted from the objects detected locally in an image.
Generating longer textual sequences conditioned on visual information is an interesting problem to explore.
Although multitask learning has achieved improved performance in some problems, there are also tasks that lose performance when trained together.
Fine-tuning pre-trained transformer-based language models such as BERT has become a common practice dominating leaderboards across various NLP benchmarks.
The neural attention model has achieved great success in data-to-text generation tasks.
In air traffic control, assistant systems support air traffic controllers in their work.
Techniques such as distant and weak supervision can be used to create labeled data in a (semi-) automatic way.
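A minimal sketch of distant supervision for sequence labeling, assuming a simple gazetteer lookup (an illustrative resource, not one from any of the papers listed here): tokens found in the list are auto-labeled, and everything else receives the O tag. The resulting annotations are cheap but noisy, which is exactly the trade-off discussed above.

```python
# Hedged sketch: distant supervision via gazetteer matching for NER-style labels.
CITY_GAZETTEER = {"berlin", "paris", "london"}   # illustrative knowledge-base list

def distant_label(tokens):
    """Auto-label tokens: B-LOC if the token appears in the gazetteer, else O."""
    return [("B-LOC" if t.lower() in CITY_GAZETTEER else "O") for t in tokens]

print(distant_label("I flew from Berlin to Paris yesterday".split()))
# ['O', 'O', 'O', 'B-LOC', 'O', 'B-LOC', 'O']
```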
We propose the Two-sidEd Attentive conditional Generative Adversarial Network (TEA-cGAN) to generate semantically manipulated images while keeping other content, such as the background, intact.
In low-resource settings, the performance of supervised labeling models can be improved with automatically annotated or distantly supervised data, which is cheap to create but often noisy.
Languages may be differently distant from each other and their mutual intelligibility may be asymmetric.
The interest in Artificial Intelligence (AI) and its applications has seen unprecedented growth in the last few years.
This work explores cross-lingual transfer learning (TL) for named entity recognition, focusing on bootstrapping Japanese from English.
However, they typically cannot serve as a drop-in replacement for conventional single-sense embeddings, because the correct sense vector needs to be selected for each word.
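A common selection heuristic, shown as a hedged sketch rather than any particular paper's method, is to pick the sense vector most similar to the average embedding of the surrounding context.

```python
# Hedged sketch: choose a word's sense vector by similarity to its context.
import numpy as np

def select_sense(sense_vectors, context_vectors):
    """sense_vectors: (n_senses, dim); context_vectors: (n_ctx, dim)."""
    context = context_vectors.mean(axis=0)                 # average context embedding
    context /= np.linalg.norm(context) + 1e-9
    sims = sense_vectors @ context / (
        np.linalg.norm(sense_vectors, axis=1) + 1e-9)      # cosine similarity per sense
    return sense_vectors[np.argmax(sims)]                  # best-matching sense

senses = np.random.randn(3, 50)      # e.g. three senses of an ambiguous word like "bank"
context = np.random.randn(5, 50)     # embeddings of surrounding words (dummy)
chosen = select_sense(senses, context)
```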
In our experiments on Chunking and NER, this approach performs more robustly than the baselines.
Developing conventional natural language generation systems requires extensive attention from human experts in order to craft complex sets of sentence planning rules.
Recently, Kannan et al. proposed several logit regularization methods to improve the adversarial robustness of classifiers.
Sequence-to-Sequence (seq2seq) models have become overwhelmingly popular in building end-to-end trainable dialogue systems.
The performance of Neural Network (NN)-based language models is steadily improving due to the emergence of new architectures, which are able to learn different natural language characteristics.
The goal of language modeling techniques is to capture the statistical and structural properties of natural languages from training corpora.
Training large vocabulary Neural Network Language Models (NNLMs) is a difficult task due to the explicit requirement of the output layer normalization, which typically involves the evaluation of the full softmax function over the complete vocabulary.
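To make the cost concrete, here is a small sketch of the full-softmax output layer; the vocabulary and hidden sizes are illustrative. Both the matrix-vector product and the normalizing sum scale with the full vocabulary size at every prediction step, which is the bottleneck described above.

```python
# Hedged sketch: the full softmax over the complete vocabulary.
import numpy as np

V, H = 50_000, 256                  # vocabulary size, hidden size (illustrative)
W = np.random.randn(V, H) * 0.01    # output embedding matrix
h = np.random.randn(H)              # hidden state for one prediction step

logits = W @ h                      # O(V * H) multiply: the expensive part
logits -= logits.max()              # shift for numerical stability
probs = np.exp(logits)
probs /= probs.sum()                # explicit normalization over all V words
```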
Word embeddings are high-dimensional vector representations of words and are thus difficult to interpret.
Feedforward Neural Network (FNN)-based language models estimate the probability of the next word based on the history of the last N words, whereas Recurrent Neural Networks (RNNs) perform the same task based only on the last word and some context information that cycles in the network.
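The contrast can be sketched schematically (all shapes and sizes are illustrative only): the FNN LM consumes a fixed window of N embedded words, while the RNN LM consumes one word at a time together with a recurrent state that carries the earlier context.

```python
# Hedged sketch: FNN n-gram LM vs. RNN LM forward step.
import torch
import torch.nn as nn

V, D, H, N = 10_000, 64, 128, 4     # vocab, embedding dim, hidden dim, window size

# FNN LM: concatenate the embeddings of the last N words into one input vector.
emb = nn.Embedding(V, D)
fnn = nn.Sequential(nn.Linear(N * D, H), nn.Tanh(), nn.Linear(H, V))
history = torch.randint(0, V, (1, N))
fnn_logits = fnn(emb(history).view(1, -1))          # scores over the vocabulary

# RNN LM: one word at a time, with a state that cycles through the network.
rnn = nn.RNN(D, H, batch_first=True)
out_layer = nn.Linear(H, V)
word = torch.randint(0, V, (1, 1))
state = torch.zeros(1, 1, H)                        # recurrent context
out, state = rnn(emb(word), state)
rnn_logits = out_layer(out[:, -1])                  # scores over the vocabulary
```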
We augmented pre-trained word embeddings with these novel embeddings and evaluated on a rare word similarity task, obtaining up to a threefold improvement in correlation over the original set of embeddings.
This paper describes a method to automatically create dialogue resources annotated with dialogue act information by reusing existing dialogue corpora.
In an intercomprehension scenario, typically a native speaker of language L1 is confronted with output from an unknown, but related language L2.
We start with human-human Wizard-of-Oz experiments to collect data for modeling natural human dialogue behaviour, in order to better understand the phenomena of human interaction and to predict interlocutors' actions; we then replace the human Wizard with an increasingly advanced dialogue system, using evaluation data for system improvement.
In the TAC KBP 2013 English Slotfilling evaluation, the submitted main run of the LSV RelationFactory system achieved the top-ranked F1-score of 37.3%.
In this paper we explore a task-driven approach to interfacing NLP components, where language processing is guided by the end-task that each application requires.
We present a gold standard for semantic relation extraction in the food domain for German.