Search Results for author: Anders Søgaard

Found 100 papers, 45 papers with code

Sociolectal Analysis of Pretrained Language Models

no code implementations EMNLP 2021 Sheng Zhang, Xin Zhang, Weiming Zhang, Anders Søgaard

Using data from English cloze tests, in which subjects also self-reported their gender, age, education, and race, we examine performance differences of pretrained language models across demographic groups, defined by these (protected) attributes.

Pretrained Language Models

Common Sense Bias in Semantic Role Labeling

no code implementations WNUT (ACL) 2021 Heather Lent, Anders Søgaard

Large-scale language models such as ELMo and BERT have pushed the horizon of what is possible in semantic role labeling (SRL), solving the out-of-vocabulary problem and enabling end-to-end systems, but they have also introduced significant biases.

Common Sense Reasoning Semantic Role Labeling +1

Resources and Evaluations for Danish Entity Resolution

no code implementations CRAC (ACL) 2021 Maria Barrett, Hieu Lam, Martin Wu, Ophélie Lacroix, Barbara Plank, Anders Søgaard

Automatic coreference resolution is understudied in Danish even though most of the Danish Dependency Treebank (Buch-Kromann, 2003) is annotated with coreference relations.

Coreference Resolution Entity Disambiguation +2

Multilingual Negation Scope Resolution for Clinical Text

no code implementations EACL (Louhi) 2021 Mareike Hartmann, Anders Søgaard

Negation scope resolution is key to high-quality information extraction from clinical texts, but so far, efforts to make encoders used for information extraction negation-aware have been limited to English.

Multi-Task Learning Negation Scope Resolution

Locke’s Holiday: Belief Bias in Machine Reading

no code implementations EMNLP 2021 Anders Søgaard

I highlight a simple failure mode of state-of-the-art machine reading systems: when contexts do not align with commonly shared beliefs.

Pretrained Language Models Reading Comprehension

Word Order Does Matter and Shuffled Language Models Know It

no code implementations ACL 2022 Mostafa Abdou, Vinit Ravishankar, Artur Kulmizev, Anders Søgaard

Recent studies have shown that language models pretrained and/or fine-tuned on randomly permuted sentences exhibit competitive performance on GLUE, putting into question the importance of word order information.
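
For concreteness, the perturbation these studies rely on is a within-sentence token shuffle; a minimal sketch (function name and corpus format are mine, not from the paper):

    import random

    def shuffle_within_sentences(sentences, seed=13):
        # Permute the tokens of each sentence independently: word order
        # is destroyed, but the bag of words is preserved.
        rng = random.Random(seed)
        shuffled = []
        for sentence in sentences:
            tokens = sentence.split()
            rng.shuffle(tokens)
            shuffled.append(" ".join(tokens))
        return shuffled

    print(shuffle_within_sentences(["the cat sat on the mat"]))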

How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task

no code implementations ACL (WAT) 2021 Rahul Aralikatte, Héctor Ricardo Murrieta Bello, Miryam de Lhoneux, Daniel Hershcovich, Marcel Bollmann, Anders Søgaard

This work shows that competitive translation results can be obtained in a constrained setting by incorporating the latest advances in memory and compute optimization.

Benchmark Translation

A Multilingual Benchmark for Probing Negation-Awareness with Minimal Pairs

1 code implementation CoNLL (EMNLP) 2021 Mareike Hartmann, Miryam de Lhoneux, Daniel Hershcovich, Yova Kementchedjhieva, Lukas Nielsen, Chen Qiu, Anders Søgaard

Negation is one of the most fundamental concepts in human cognition and language, and several natural language inference (NLI) probes have been designed to investigate pretrained language models’ ability to detect and reason with negation.

Benchmark Natural Language Inference +1

Square One Bias in NLP: Towards a Multi-Dimensional Exploration of the Research Manifold

1 code implementation Findings (ACL) 2022 Sebastian Ruder, Ivan Vulić, Anders Søgaard

Most work targeting multilinguality, for example, considers only accuracy; most work on fairness or interpretability considers only English; and so on.

Fairness

Ancestor-to-Creole Transfer is Not a Walk in the Park

no code implementations insights (ACL) 2022 Heather Lent, Emanuele Bugliarello, Anders Søgaard

We aim to learn language models for Creole languages for which large volumes of data are not readily available, and therefore explore the potential transfer from ancestor languages (the 'Ancestry Transfer Hypothesis').

What a Creole Wants, What a Creole Needs

no code implementations 1 Jun 2022 Heather Lent, Kelechi Ogueji, Miryam de Lhoneux, Orevaoghene Ahia, Anders Søgaard

We demonstrate, through conversations with Creole experts and surveys of Creole-speaking communities, how the things needed from language technology can change dramatically from one language to another, even when the languages are considered to be very similar to each other, as with Creoles.

Natural Language Processing

Evaluating Deep Taylor Decomposition for Reliability Assessment in the Wild

1 code implementation 3 May 2022 Stephanie Brandl, Daniel Hershcovich, Anders Søgaard

We argue that we need to evaluate model interpretability methods 'in the wild', i.e., in situations where professionals make critical decisions, and models can potentially assist them.

Decision Making

Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze?

1 code implementation ACL 2022 Stephanie Brandl, Oliver Eberle, Jonas Pilot, Anders Søgaard

We investigate whether self-attention in large-scale pre-trained language models is as predictive of human eye fixation patterns during task-reading as classical cognitive models of human attention.

Relation Extraction Sentiment Analysis

Generalized Quantifiers as a Source of Error in Multilingual NLU Benchmarks

1 code implementation 22 Apr 2022 Ruixiang Cui, Daniel Hershcovich, Anders Søgaard

Logical approaches to representing language have developed and evaluated computational models of quantifier words since the 19th century, but today's NLU models still struggle to capture their semantics.

How Conservative are Language Models? Adapting to the Introduction of Gender-Neutral Pronouns

1 code implementation 11 Apr 2022 Stephanie Brandl, Ruixiang Cui, Anders Søgaard

Gender-neutral pronouns have recently been introduced in many languages, both (a) to include non-binary people and (b) to serve as a generic singular.

Factual Consistency of Multilingual Pretrained Language Models

1 code implementation Findings (ACL) 2022 Constanza Fierro, Anders Søgaard

However, for that, we need to know how reliable this knowledge is, and recent work has shown that monolingual English language models lack consistency when predicting factual knowledge, that is, they fill-in-the-blank differently for paraphrases describing the same fact.

Pretrained Language Models
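
A paraphrase-consistency probe of this kind can be sketched with the Hugging Face fill-mask pipeline; the model name and prompts below are illustrative placeholders, not the paper's benchmark or code:

    from transformers import pipeline

    fill = pipeline("fill-mask", model="bert-base-multilingual-cased")

    # Two paraphrases describing the same fact: a consistent model
    # should rank the same filler highest for both.
    paraphrases = [
        "Paris is the capital of [MASK].",
        "The capital city of [MASK] is Paris.",
    ]
    top = [fill(p)[0]["token_str"] for p in paraphrases]
    print(top, "consistent" if top[0] == top[1] else "inconsistent")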

Word Order Does Matter (And Shuffled Language Models Know It)

no code implementations 21 Mar 2022 Vinit Ravishankar, Mostafa Abdou, Artur Kulmizev, Anders Søgaard

Recent studies have shown that language models pretrained and/or fine-tuned on randomly permuted sentences exhibit competitive performance on GLUE, putting into question the importance of word order information.

Zero-Shot Dependency Parsing with Worst-Case Aware Automated Curriculum Learning

1 code implementation ACL 2022 Miryam de Lhoneux, Sheng Zhang, Anders Søgaard

Large multilingual pretrained language models such as mBERT and XLM-RoBERTa have been found to be surprisingly effective for cross-lingual transfer of syntactic parsing models (Wu and Dredze 2019), but only between related languages.

Cross-Lingual Transfer Dependency Parsing +2

Improved Multi-label Classification under Temporal Concept Drift: Rethinking Group-Robust Algorithms in a Label-Wise Setting

1 code implementation Findings (ACL) 2022 Ilias Chalkidis, Anders Søgaard

In document classification for, e.g., legal and biomedical text, we often deal with hundreds of classes, including very infrequent ones, as well as temporal concept drift caused by the influence of real world events, e.g., policy changes, conflicts, or pandemics.

Document Classification Multi-Label Classification

FairLex: A Multilingual Benchmark for Evaluating Fairness in Legal Text Processing

1 code implementation ACL 2022 Ilias Chalkidis, Tommaso Pasini, Sheng Zhang, Letizia Tomada, Sebastian Felix Schwemer, Anders Søgaard

We present a benchmark suite of four datasets for evaluating the fairness of pre-trained language models and the techniques used to fine-tune them for downstream tasks.

Benchmark Fairness

The Impact of Differential Privacy on Group Disparity Mitigation

1 code implementation 5 Mar 2022 Victor Petrén Bach Hansen, Atula Tejaswi Neerkaje, Ramit Sawhney, Lucie Flek, Anders Søgaard

The performance cost of differential privacy has, for some applications, been shown to be higher for minority groups; fairness, conversely, has been shown to disproportionally compromise the privacy of members of such groups.

Computer Vision Fairness

Exploring the Unfairness of DP-SGD Across Settings

no code implementations 24 Feb 2022 Frederik Noe, Rasmus Herskind, Anders Søgaard

We establish a negative, logarithmic correlation between privacy and fairness in the case of linear classification and robust deep learning.

Classification Dimensionality Reduction +1
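
For reference, DP-SGD, the algorithm whose disparate impact both entries above study, clips each per-example gradient and adds calibrated Gaussian noise; a minimal numpy sketch with illustrative hyperparameters:

    import numpy as np

    def dp_sgd_step(params, per_example_grads, lr=0.1, clip=1.0,
                    sigma=1.0, rng=None):
        # Clip each example's gradient to L2 norm <= clip, sum, add
        # Gaussian noise with scale sigma * clip, then average and step.
        rng = rng or np.random.default_rng(0)
        clipped = [g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))
                   for g in per_example_grads]
        noisy = (np.sum(clipped, axis=0)
                 + rng.normal(0.0, sigma * clip, size=params.shape))
        return params - lr * noisy / len(per_example_grads)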

Do We Still Need Automatic Speech Recognition for Spoken Language Understanding?

no code implementations 29 Nov 2021 Lasse Borgholt, Jakob Drachmann Havtorn, Mostafa Abdou, Joakim Edin, Lars Maaløe, Anders Søgaard, Christian Igel

We compare learned speech features from wav2vec 2.0, state-of-the-art ASR transcripts, and the ground truth text as input for a novel speech-based named entity recognition task, a cardiac arrest detection task on real-world emergency calls and two existing SLU benchmarks.

Ranked #5 on Spoken Language Understanding on Fluent Speech Commands (using extra training data)

Automatic Speech Recognition Machine Translation +6

Revisiting Methods for Finding Influential Examples

no code implementations 8 Nov 2021 Karthikeyan K, Anders Søgaard

Several instance-based explainability methods for finding influential training examples for test-time decisions have been proposed recently, including Influence Functions, TracIn, Representer Point Selection, Grad-Dot, and Grad-Cos.

Dynamic Forecasting of Conversation Derailment

no code implementations EMNLP 2021 Yova Kementchedjhieva, Anders Søgaard

This approach shows mixed results: in a high-quality data setting, a longer average forecast horizon can be achieved at the cost of a small drop in F1; in a low-quality data setting, however, dynamic training propagates the noise and is highly detrimental to performance.

Do Language Models Know the Way to Rome?

no code implementations EMNLP (BlackboxNLP) 2021 Bastien Liétard, Mostafa Abdou, Anders Søgaard

The global geometry of language models is important for a range of applications, but language model probes tend to evaluate rather local relations, for which ground truths are easily obtained.

Language Modelling

On Language Models for Creoles

1 code implementation CoNLL (EMNLP) 2021 Heather Lent, Emanuele Bugliarello, Miryam de Lhoneux, Chen Qiu, Anders Søgaard

Creole languages such as Nigerian Pidgin English and Haitian Creole are under-resourced and largely ignored in the NLP literature.

Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color

no code implementations CoNLL (EMNLP) 2021 Mostafa Abdou, Artur Kulmizev, Daniel Hershcovich, Stella Frank, Ellie Pavlick, Anders Søgaard

Pretrained language models have been shown to encode relational information, such as the relations between entities or concepts in knowledge-bases -- (Paris, Capital, France).

Pretrained Language Models

The Impact of Positional Encodings on Multilingual Compression

no code implementations EMNLP 2021 Vinit Ravishankar, Anders Søgaard

In order to preserve word-order information in a non-autoregressive setting, transformer architectures tend to include positional knowledge, by (for instance) adding positional encodings to token embeddings.

Inductive Bias
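
The most common instance of such positional knowledge is the fixed sinusoidal encoding of the original Transformer, added elementwise to token embeddings; a numpy sketch (assumes an even model dimension):

    import numpy as np

    def sinusoidal_positional_encodings(seq_len, d_model):
        # Sine/cosine pairs with geometrically increasing wavelengths,
        # as in Vaswani et al. (2017).
        positions = np.arange(seq_len)[:, None]
        rates = 1.0 / np.power(10000.0, np.arange(0, d_model, 2) / d_model)
        pe = np.zeros((seq_len, d_model))
        pe[:, 0::2] = np.sin(positions * rates)
        pe[:, 1::2] = np.cos(positions * rates)
        return pe

    # embeddings = token_embeddings + sinusoidal_positional_encodings(T, d)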

On the Interaction of Belief Bias and Explanations

no code implementations Findings (ACL) 2021 Ana Valeria Gonzalez, Anna Rogers, Anders Søgaard

A myriad of explainability methods have been proposed in recent years, but there is little consensus on how to evaluate them.

Minimax and Neyman-Pearson Meta-Learning for Outlier Languages

1 code implementation 2 Jun 2021 Edoardo Maria Ponti, Rahul Aralikatte, Disha Shrivastava, Siva Reddy, Anders Søgaard

In fact, under a decision-theoretic framework, MAML can be interpreted as minimising the expected risk across training languages (with a uniform prior), which is known as the Bayes criterion.

Meta-Learning Part-Of-Speech Tagging +1
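
In symbols (my notation; a hedged reconstruction of the framing above, with R_l(theta) the expected loss on language l): the Bayes criterion averages risk over the training languages, while the minimax criterion optimises the worst case:

    \theta^{*}_{\text{Bayes}}   = \arg\min_{\theta} \; \mathbb{E}_{\ell \sim p(\ell)} \, R_{\ell}(\theta)
    \qquad
    \theta^{*}_{\text{minimax}} = \arg\min_{\theta} \; \max_{\ell \in \mathcal{L}} \, R_{\ell}(\theta)

With a uniform p(l), the first recovers the interpretation of MAML given above; the Neyman-Pearson variant named in the title instead bounds the risk on designated languages.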

John praised Mary because he? Implicit Causality Bias and Its Interaction with Explicit Cues in LMs

no code implementations 2 Jun 2021 Yova Kementchedjhieva, Mark Anderson, Anders Søgaard

We hypothesize that the temporary challenge humans face in integrating the two contradicting signals, one from the lexical semantics of the verb, one from the sentence-level semantics, would be reflected in higher error rates for models on tasks dependent on causal links.

Replicating and Extending "Because Their Treebanks Leak": Graph Isomorphism, Covariants, and Parser Performance

no code implementations 1 Jun 2021 Mark Anderson, Anders Søgaard, Carlos Gómez-Rodríguez

Søgaard (2020) obtained results suggesting the fraction of trees occurring in the test data isomorphic to trees in the training set accounts for a non-trivial variation in parser performance.
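
The covariate in question can be computed directly; a sketch with networkx (function and variable names are mine; pairwise checks can be slow for large treebanks):

    import networkx as nx

    def leakage_fraction(train_trees, test_trees):
        # Fraction of test trees isomorphic, as unlabelled graphs, to at
        # least one tree in the training set.
        return sum(
            any(nx.is_isomorphic(t, s) for s in train_trees)
            for t in test_trees
        ) / len(test_trees)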

Do End-to-End Speech Recognition Models Care About Context?

no code implementations 17 Feb 2021 Lasse Borgholt, Jakob Drachmann Havtorn, Željko Agić, Anders Søgaard, Lars Maaløe, Christian Igel

We test this hypothesis by measuring temporal context sensitivity and evaluate how the models perform when we constrain the amount of contextual information in the audio input.

Speech Recognition

Does injecting linguistic structure into language models lead to better alignment with brain recordings?

no code implementations 29 Jan 2021 Mostafa Abdou, Ana Valeria Gonzalez, Mariya Toneva, Daniel Hershcovich, Anders Søgaard

We evaluate across two fMRI datasets whether language models align better with brain recordings, if their attention is biased by annotations from syntactic or semantic formalisms.

Natural Language Processing

Attention Can Reflect Syntactic Structure (If You Let It)

no code implementations EACL 2021 Vinit Ravishankar, Artur Kulmizev, Mostafa Abdou, Anders Søgaard, Joakim Nivre

Since the popularization of the Transformer as a general-purpose feature encoder for NLP, many studies have attempted to decode linguistic structure from its novel multi-head attention mechanism.

Grammatical Error Correction in Low Error Density Domains: A New Benchmark and Analyses

no code implementations EMNLP 2020 Simon Flachs, Ophélie Lacroix, Helen Yannakoudakis, Marek Rei, Anders Søgaard

Evaluation of grammatical error correction (GEC) systems has primarily focused on essays written by non-native learners of English, which however is only part of the full spectrum of GEC applications.

Benchmark Grammatical Error Correction +1

Joint Semantic Analysis with Document-Level Cross-Task Coherence Rewards

1 code implementation 12 Oct 2020 Rahul Aralikatte, Mostafa Abdou, Heather Lent, Daniel Hershcovich, Anders Søgaard

Coreference resolution and semantic role labeling are NLP tasks that capture different aspects of semantics, indicating respectively, which expressions refer to the same entity, and what semantic roles expressions serve in the sentence.

Coreference Resolution Natural Language Understanding +1

Type B Reflexivization as an Unambiguous Testbed for Multilingual Multi-Task Gender Bias

1 code implementation EMNLP 2020 Ana Valeria Gonzalez, Maria Barrett, Rasmus Hvingelby, Kellie Webster, Anders Søgaard

The one-sided focus on English in previous studies of gender bias in NLP misses out on opportunities in other languages: English challenge datasets such as GAP and WinoGender highlight model preferences that are "hallucinatory", e.g., disambiguating gender-ambiguous occurrences of 'doctor' as male doctors.

Translation

Worst-Case-Aware Curriculum Learning for Zero and Few Shot Transfer

1 code implementation 23 Sep 2020 Sheng Zhang, Xin Zhang, Weiming Zhang, Anders Søgaard

Multi-task transfer learning based on pre-trained language encoders achieves state-of-the-art performance across a range of tasks.

Transfer Learning
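
The worst-case-aware idea can be sketched as a sampling curriculum: weight the next training task by its current development loss, so the hardest task dominates. This is an illustrative simplification, not the paper's exact algorithm:

    import random

    def sample_next_task(dev_losses, rng=random.Random(0)):
        # dev_losses: {task_name: current dev loss}. Sampling mass is
        # proportional to loss, approximating a minimax objective.
        tasks, losses = zip(*dev_losses.items())
        return rng.choices(tasks, weights=losses, k=1)[0]

    print(sample_next_task({"pos": 0.3, "ner": 0.9, "parsing": 1.4}))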

The Sensitivity of Language Models and Humans to Winograd Schema Perturbations

2 code implementations ACL 2020 Mostafa Abdou, Vinit Ravishankar, Maria Barrett, Yonatan Belinkov, Desmond Elliott, Anders Søgaard

Large-scale pretrained language models are the major driving force behind recent improvements in performance on the Winograd Schema Challenge, a widely employed test of common sense reasoning ability.

Common Sense Reasoning Pretrained Language Models

We Need to Talk About Random Splits

1 code implementation EACL 2021 Anders Søgaard, Sebastian Ebert, Jasmijn Bastings, Katja Filippova

We argue that random splits, like standard splits, lead to overly optimistic performance estimates.

Domain Adaptation
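
A concrete alternative in the spirit of this argument (illustrative only, not the paper's exact protocol) is a biased split, e.g. holding out the longest sentences so that test data is deliberately unlike training data:

    def length_split(examples, test_fraction=0.2):
        # examples: list of token lists. Train on shorter sentences,
        # test on the longest ones; random splits hide this kind of gap.
        ordered = sorted(examples, key=len)
        cut = int(len(ordered) * (1 - test_fraction))
        return ordered[:cut], ordered[cut:]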

Weakly Supervised POS Taggers Perform Poorly on Truly Low-Resource Languages

no code implementations 28 Apr 2020 Katharina Kann, Ophélie Lacroix, Anders Søgaard

Part-of-speech (POS) taggers for low-resource languages which are exclusively based on various forms of weak supervision - e.g., cross-lingual transfer, type-level supervision, or a combination thereof - have been reported to perform almost as well as supervised ones.

Cross-Lingual Transfer POS

Are All Good Word Vector Spaces Isomorphic?

1 code implementation EMNLP 2020 Ivan Vulić, Sebastian Ruder, Anders Søgaard

Existing algorithms for aligning cross-lingual word vector spaces assume that vector spaces are approximately isomorphic.

Comparing Unsupervised Word Translation Methods Step by Step

no code implementations NeurIPS 2019 Mareike Hartmann, Yova Kementchedjhieva, Anders Søgaard

Cross-lingual word vector space alignment is the task of mapping the vocabularies of two languages into a shared semantic space, which can be used for dictionary induction, unsupervised machine translation, and transfer learning.

Transfer Learning Translation +2

Retrieval-based Goal-Oriented Dialogue Generation

no code implementations 30 Sep 2019 Ana Valeria Gonzalez, Isabelle Augenstein, Anders Søgaard

Most research on dialogue has focused either on dialogue generation for open-ended chit-chat or on state tracking for goal-directed dialogue.

Dialogue Generation

Domain Transfer in Dialogue Systems without Turn-Level Supervision

1 code implementation 16 Sep 2019 Joachim Bingel, Victor Petrén Bach Hansen, Ana Valeria Gonzalez, Paweł Budzianowski, Isabelle Augenstein, Anders Søgaard

Task oriented dialogue systems rely heavily on specialized dialogue state tracking (DST) modules for dynamically predicting user intent throughout the conversation.

Dialogue State Tracking reinforcement-learning +1

Lost in Evaluation: Misleading Benchmarks for Bilingual Dictionary Induction

2 code implementations IJCNLP 2019 Yova Kementchedjhieva, Mareike Hartmann, Anders Søgaard

We study the composition and quality of the test sets for five diverse languages from this dataset, with concerning findings: (1) a quarter of the data consists of proper nouns, which can be hardly indicative of BDI performance, and (2) there are pervasive gaps in the gold-standard targets.

Cross-Lingual Word Embeddings Word Embeddings

Rewarding Coreference Resolvers for Being Consistent with World Knowledge

1 code implementation IJCNLP 2019 Rahul Aralikatte, Heather Lent, Ana Valeria Gonzalez, Daniel Hershcovich, Chen Qiu, Anders Sandholm, Michael Ringaard, Anders Søgaard

Unresolved coreference is a bottleneck for relation extraction, and high-quality coreference resolvers may produce an output that makes it a lot easier to extract knowledge triples.

reinforcement-learning Relation Extraction

Higher-order Comparisons of Sentence Encoder Representations

no code implementations IJCNLP 2019 Mostafa Abdou, Artur Kulmizev, Felix Hill, Daniel M. Low, Anders Søgaard

Representational Similarity Analysis (RSA) is a technique developed by neuroscientists for comparing activity patterns of different measurement modalities (e. g., fMRI, electrophysiology, behavior).
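
In code, RSA is two steps: a representational dissimilarity matrix (RDM) per measurement space, then a rank correlation between the RDMs; a scipy sketch with random placeholder data:

    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr

    def rsa(reps_a, reps_b):
        # reps_*: (n_stimuli, n_features) arrays; feature dimensions may
        # differ, since only stimulus-by-stimulus geometry is compared.
        rdm_a = pdist(reps_a, metric="correlation")
        rdm_b = pdist(reps_b, metric="correlation")
        return spearmanr(rdm_a, rdm_b).correlation

    print(rsa(np.random.rand(10, 768), np.random.rand(10, 64)))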

Ellipsis Resolution as Question Answering: An Evaluation

1 code implementation EACL 2021 Rahul Aralikatte, Matthew Lamm, Daniel Hardt, Anders Søgaard

Most, if not all, forms of ellipsis (e.g., so does Mary) are similar to reading comprehension questions (what does Mary do), in that in order to resolve them, we need to identify an appropriate text span in the preceding discourse.

Coreference Resolution Machine Reading Comprehension +2

X-WikiRE: A Large, Multilingual Resource for Relation Extraction as Machine Comprehension

1 code implementation WS 2019 Mostafa Abdou, Cezar Sas, Rahul Aralikatte, Isabelle Augenstein, Anders Søgaard

Although the vast majority of knowledge bases (KBs) are heavily biased towards English, Wikipedias do cover very different topics in different languages.

Reading Comprehension Relation Extraction

Issue Framing in Online Discussion Fora

no code implementations NAACL 2019 Mareike Hartmann, Tallulah Jansen, Isabelle Augenstein, Anders Søgaard

In online discussion fora, speakers often make arguments for or against something, say birth control, by highlighting certain aspects of the topic.

Better, Faster, Stronger Sequence Tagging Constituent Parsers

1 code implementation NAACL 2019 David Vilares, Mostafa Abdou, Anders Søgaard

Combining these techniques, we clearly surpass the performance of sequence tagging constituent parsers on the English and Chinese Penn Treebanks, and reduce their parsing time even further.

Multi-Task Learning

Jointly Learning to Label Sentences and Tokens

2 code implementations 14 Nov 2018 Marek Rei, Anders Søgaard

Learning to construct text representations in end-to-end systems can be difficult, as natural languages are highly compositional and task-specific annotated datasets are often limited in size.

Grammatical Error Detection Sentence Classification

Why is unsupervised alignment of English embeddings from different algorithms so hard?

no code implementations EMNLP 2018 Mareike Hartmann, Yova Kementchedjhieva, Anders Søgaard

This paper presents a challenge to the community: Generative adversarial networks (GANs) can perfectly align independent English word embeddings induced using the same algorithm, based on distributional information alone, but fail to do so for two different embedding algorithms.

Word Embeddings

Generalizing Procrustes Analysis for Better Bilingual Dictionary Induction

1 code implementation CONLL 2018 Yova Kementchedjhieva, Sebastian Ruder, Ryan Cotterell, Anders Søgaard

Most recent approaches to bilingual dictionary induction find a linear alignment between the word vector spaces of two languages.
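
The baseline those approaches generalize is orthogonal Procrustes: given row-aligned source and target matrices X and Y (e.g., via a seed lexicon), the rotation minimising ||XW - Y||_F has a closed form; a numpy sketch:

    import numpy as np

    def procrustes_align(X, Y):
        # W = U V^T, where U S V^T is the SVD of X^T Y; W is orthogonal,
        # so distances within the source space are preserved.
        U, _, Vt = np.linalg.svd(X.T @ Y)
        return U @ Vt

    X, Y = np.random.rand(5000, 300), np.random.rand(5000, 300)
    W = procrustes_align(X, Y)  # map source vectors as X @ W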

Nightmare at test time: How punctuation prevents parsers from generalizing

no code implementations WS 2018 Anders Søgaard, Miryam de Lhoneux, Isabelle Augenstein

Punctuation is a strong indicator of syntactic structure, and parsers trained on text with punctuation often rely heavily on this signal.

A strong baseline for question relevancy ranking

no code implementations EMNLP 2018 Ana V. González-Garduño, Isabelle Augenstein, Anders Søgaard

The best systems at the SemEval-16 and SemEval-17 community question answering shared tasks -- a problem that amounts to question relevancy ranking -- involve complex pipelines and manual feature engineering.

Community Question Answering Feature Engineering

Parameter sharing between dependency parsers for related languages

1 code implementation EMNLP 2018 Miryam de Lhoneux, Johannes Bjerva, Isabelle Augenstein, Anders Søgaard

We find that sharing transition classifier parameters always helps, whereas the usefulness of sharing word and/or character LSTM parameters varies.

On the Limitations of Unsupervised Bilingual Dictionary Induction

no code implementations ACL 2018 Anders Søgaard, Sebastian Ruder, Ivan Vulić

Unsupervised machine translation---i.e., not assuming any cross-lingual supervision signal, whether a dictionary, translations, or comparable corpora---seems impossible, but nevertheless, Lample et al. (2018) recently proposed a fully unsupervised machine translation (MT) model.

Graph Similarity Translation +1

Zero-shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens

no code implementations NAACL 2018 Marek Rei, Anders Søgaard

Can attention- or gradient-based visualization techniques be used to infer token-level labels for binary sequence tagging problems, using networks trained only on sentence-level labels?

Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces

1 code implementation NAACL 2018 Isabelle Augenstein, Sebastian Ruder, Anders Søgaard

We combine multi-task learning and semi-supervised learning by inducing a joint embedding space between disparate label spaces and learning transfer functions between label embeddings, enabling us to jointly leverage unlabelled data and auxiliary, annotated datasets.

General Classification Multi-Task Learning +1

Is writing style predictive of scientific fraud?

no code implementations 13 Jul 2017 Chloé Braud, Anders Søgaard

The problem of detecting scientific fraud using machine learning was recently introduced, with initial, positive results from a model taking into account various general indicators.

A Survey Of Cross-lingual Word Embedding Models

no code implementations 15 Jun 2017 Sebastian Ruder, Ivan Vulić, Anders Søgaard

Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing models for low-resource languages.

Cross-Lingual Transfer Cross-Lingual Word Embeddings +2

Latent Multi-task Architecture Learning

2 code implementations 23 May 2017 Sebastian Ruder, Joachim Bingel, Isabelle Augenstein, Anders Søgaard

In practice, however, MTL involves searching an enormous space of possible parameter sharing architectures to find (a) the layers or subspaces that benefit from sharing, (b) the appropriate amount of sharing, and (c) the appropriate relative weights of the different task losses.

Multi-Task Learning

Multi-Task Learning of Keyphrase Boundary Classification

no code implementations ACL 2017 Isabelle Augenstein, Anders Søgaard

Keyphrase boundary classification (KBC) is the task of detecting keyphrases in scientific articles and labelling them with respect to predefined types.

Classification General Classification +1

Identifying beneficial task relations for multi-task learning in deep neural networks

1 code implementation EACL 2017 Joachim Bingel, Anders Søgaard

Multi-task learning (MTL) in deep neural networks for NLP has recently received increasing interest due to some compelling benefits, including its potential to efficiently regularize models and to reduce the need for labeled data.

Multi-Task Learning

Cross-lingual RST Discourse Parsing

1 code implementation EACL 2017 Chloé Braud, Maximin Coavoux, Anders Søgaard

Discourse parsing is an integral part of understanding information flow and argumentative structure in documents.

Discourse Parsing

Cross-Lingual Dependency Parsing with Late Decoding for Truly Low-Resource Languages

1 code implementation 6 Jan 2017 Michael Sejr Schlichtkrull, Anders Søgaard

In cross-lingual dependency annotation projection, information is often lost during transfer because of early decoding.

Dependency Parsing

Spikes as regularizers

no code implementations 18 Nov 2016 Anders Søgaard

We present a confidence-based single-layer feed-forward learning algorithm SPIRAL (Spike Regularized Adaptive Learning) relying on an encoding of activation spikes.

A Strong Baseline for Learning Cross-Lingual Word Embeddings from Sentence Alignments

no code implementations EACL 2017 Omer Levy, Anders Søgaard, Yoav Goldberg

While cross-lingual word embeddings have been studied extensively in recent years, the qualitative differences between the different algorithms remain vague.

Cross-Lingual Word Embeddings Word Embeddings

Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss

3 code implementations ACL 2016 Barbara Plank, Anders Søgaard, Yoav Goldberg

Bidirectional long short-term memory (bi-LSTM) networks have recently proven successful for various NLP sequence modeling tasks, but little is known about their reliance on input representations, target languages, data set size, and label noise.

Part-Of-Speech Tagging POS

Improving sentence compression by learning to predict gaze

no code implementations NAACL 2016 Sigrid Klerke, Yoav Goldberg, Anders Søgaard

We show how eye-tracking corpora can be used to improve sentence compression models, presenting a novel multi-task learning algorithm based on multi-layer LSTMs.

Multi-Task Learning Sentence Compression

Empirical Gaussian priors for cross-lingual transfer learning

no code implementations 9 Jan 2016 Anders Søgaard

We instead propose to use the k source language models to estimate the parameters of a Gaussian prior for learning new POS taggers.

Cross-Lingual Transfer online learning +3
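
A hedged sketch of the idea: fit a diagonal Gaussian to the parameters of the k source-language models and penalise the new tagger's deviation from it (the names and the diagonal assumption are mine, not the paper's):

    import numpy as np

    def empirical_gaussian_penalty(theta, source_thetas):
        # source_thetas: (k, n_params) stacked source-model weights.
        # The penalty is the Gaussian negative log-density up to a
        # constant: per-parameter L2 toward the empirical mean,
        # scaled by the empirical variance.
        mu = source_thetas.mean(axis=0)
        var = source_thetas.var(axis=0) + 1e-8
        return 0.5 * np.sum((theta - mu) ** 2 / var)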
