Search Results for author: Johannes Bjerva

Found 47 papers, 11 papers with code

Semantic Tagging with Deep Residual Networks

1 code implementation COLING 2016 Johannes Bjerva, Barbara Plank, Johan Bos

We propose a novel semantic tagging task, sem-tagging, tailored for the purpose of multilingual semantic parsing, and present the first tagger using deep residual networks (ResNets).

Part-Of-Speech Tagging POS +2

Byte-based Language Identification with Deep Convolutional Networks

no code implementations WS 2016 Johannes Bjerva

The system, named ResIdent, is trained only on the data released with the task (closed training).

Language Identification

Morphological Complexity Influences Verb-Object Order in Swedish Sign Language

no code implementations WS 2016 Johannes Bjerva, Carl Börstell

Computational linguistic approaches to sign languages could benefit from investigating how complexity influences structure.

The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations

1 code implementation EACL 2017 Lasha Abzianidze, Johannes Bjerva, Kilian Evang, Hessel Haagsma, Rik van Noord, Pierre Ludmann, Duc-Duy Nguyen, Johan Bos

The Parallel Meaning Bank is a corpus of translations annotated with shared, formal meaning representations comprising over 11 million words divided over four languages (English, German, Italian, and Dutch).

Articulation rate in Swedish child-directed speech increases as a function of the age of the child even when surprisal is controlled for

no code implementations 10 Jun 2017 Johan Sjons, Thomas Hörberg, Robert Östling, Johannes Bjerva

In earlier work, we have shown that articulation rate in Swedish child-directed speech (CDS) increases as a function of the age of the child, even when utterance length and differences in articulation rate between subjects are controlled for.

The Power of Character N-grams in Native Language Identification

no code implementations WS 2017 Artur Kulmizev, Bo Blankers, Johannes Bjerva, Malvina Nissim, Gertjan van Noord, Barbara Plank, Martijn Wieling

In this paper, we explore the performance of a linear SVM trained on language independent character features for the NLI Shared Task 2017.

Native Language Identification Text Classification

One Model to Rule them all: Multitask and Multilingual Modelling for Lexical Analysis

no code implementations 3 Nov 2017 Johannes Bjerva

For instance, if you are a skilled violinist, you will likely have an easier time learning to play cello.

Language Acquisition Lexical Analysis

Tracking Typological Traits of Uralic Languages in Distributed Language Representations

no code implementations WS 2018 Johannes Bjerva, Isabelle Augenstein

Although linguistic typology has a long history, computational approaches have only recently gained popularity.

From Phonology to Syntax: Unsupervised Linguistic Typology at Different Levels with Language Embeddings

no code implementations NAACL 2018 Johannes Bjerva, Isabelle Augenstein

A core part of linguistic typology is the classification of languages according to linguistic properties, such as those detailed in the World Atlas of Language Structures (WALS).

Morphological Inflection Part-Of-Speech Tagging

Cross-lingual complex word identification with multitask learning

no code implementations WS 2018 Joachim Bingel, Johannes Bjerva

We approach the 2018 Shared Task on Complex Word Identification by leveraging a cross-lingual multitask learning approach.

Complex Word Identification Lexical Simplification

Parameter sharing between dependency parsers for related languages

1 code implementation EMNLP 2018 Miryam de Lhoneux, Johannes Bjerva, Isabelle Augenstein, Anders Søgaard

We find that sharing transition classifier parameters always helps, whereas the usefulness of sharing word and/or character LSTM parameters varies.

Copenhagen at CoNLL–SIGMORPHON 2018: Multilingual Inflection in Context with Explicit Morphosyntactic Decoding

no code implementations CONLL 2018 Yova Kementchedjhieva, Johannes Bjerva, Isabelle Augenstein

This paper documents the Team Copenhagen system, which placed first in the CoNLL–SIGMORPHON 2018 shared task on universal morphological reinflection, Task 2, with an overall accuracy of 49.87.

LEMMA Morphological Inflection +2

Multitask and Multilingual Modelling for Lexical Analysis

no code implementations 7 Sep 2018 Johannes Bjerva

In Natural Language Processing (NLP), one traditionally considers a single task (e.g. part-of-speech tagging) for a single language (e.g. English) at a time.

Lexical Analysis Part-Of-Speech Tagging

What do Language Representations Really Represent?

no code implementations CL 2019 Johannes Bjerva, Robert Östling, Maria Han Veiga, Jörg Tiedemann, Isabelle Augenstein

If the corpus is multilingual, the same model can be used to learn distributed representations of languages, such that similar languages end up with similar representations.

Language Modelling Translation

A Probabilistic Generative Model of Linguistic Typology

1 code implementation NAACL 2019 Johannes Bjerva, Yova Kementchedjhieva, Ryan Cotterell, Isabelle Augenstein

In the principles-and-parameters framework, the structural features of languages depend on parameters that may be toggled on or off, with a single parameter often dictating the status of multiple features.

Uncovering Probabilistic Implications in Typological Knowledge Bases

no code implementations ACL 2019 Johannes Bjerva, Yova Kementchedjhieva, Ryan Cotterell, Isabelle Augenstein

The study of linguistic typology is rooted in the implications we find between linguistic features, such as the fact that languages with object-verb word ordering tend to have post-positions.

Knowledge Base Population

Transductive Auxiliary Task Self-Training for Neural Multi-Task Models

no code implementations WS 2019 Johannes Bjerva, Katharina Kann, Isabelle Augenstein

Multi-task learning and self-training are two common ways to improve a machine learning model's performance in settings with limited training data.

Multi-Task Learning

Back to the Future -- Sequential Alignment of Text Representations

1 code implementation 8 Sep 2019 Johannes Bjerva, Wouter Kouw, Isabelle Augenstein

In particular, language evolution causes data drift between time-steps in sequential decision-making tasks.

Decision Making Rumour Detection

Zero-Shot Cross-Lingual Transfer with Meta Learning

1 code implementation EMNLP 2020 Farhad Nooralahzadeh, Giannis Bekoulis, Johannes Bjerva, Isabelle Augenstein

We show that this challenging setup can be approached using meta-learning, where, in addition to training a source language model, another model learns to select which training instances are the most beneficial to the first.

Few-Shot NLI Language Modelling +5

SubjQA: A Dataset for Subjectivity and Review Comprehension

1 code implementation EMNLP 2020 Johannes Bjerva, Nikita Bhutani, Behzad Golshan, Wang-Chiew Tan, Isabelle Augenstein

We find that subjectivity is also an important feature in the case of QA, albeit with more intricate interactions between subjectivity and QA performance.

Question Answering Sentiment Analysis +1

Unsupervised Evaluation for Question Answering with Transformers

no code implementations EMNLP (BlackboxNLP) 2020 Lukas Muttenthaler, Isabelle Augenstein, Johannes Bjerva

We observe a consistent pattern in the answer representations, which we show can be used to automatically evaluate whether or not a predicted answer span is correct.

Question Answering

Does Typological Blinding Impede Cross-Lingual Sharing?

no code implementations EACL 2021 Johannes Bjerva, Isabelle Augenstein

Our hypothesis is that a model trained in a cross-lingual setting will pick up on typological cues from the input data, thus overshadowing the utility of explicitly using such features.

Colexifications for Bootstrapping Cross-lingual Datasets: The Case of Phonology, Concreteness, and Affectiveness

no code implementations 5 Jun 2023 Yiyi Chen, Johannes Bjerva

Colexification refers to the linguistic phenomenon where a single lexical form is used to convey multiple meanings.

A Framework for Responsible Development of Automated Student Feedback with Generative AI

no code implementations 29 Aug 2023 Euan D Lindsay, Aditya Johri, Johannes Bjerva

These questions are important from an ethical perspective; but they are also important from an operational perspective.

Language Modelling

The Past, Present, and Future of Typological Databases in NLP

no code implementations 20 Oct 2023 Emi Baylor, Esther Ploeger, Johannes Bjerva

We propose that such a view of typology has significant potential in the future, including in language modeling in low-resource scenarios.

Language Modelling

Patterns of Closeness and Abstractness in Colexifications: The Case of Indigenous Languages in the Americas

no code implementations 18 Dec 2023 Yiyi Chen, Johannes Bjerva

Colexification refers to linguistic phenomena where multiple concepts (meanings) are expressed by the same lexical form, such as polysemy or homophony.

Patterns of Persistence and Diffusibility across the World's Languages

no code implementations 3 Jan 2024 Yiyi Chen, Johannes Bjerva

Language similarities can be caused by genetic relatedness, areal contact, universality, or chance.

Multilingual NLP

Text Embedding Inversion Security for Multilingual Language Models

no code implementations 22 Jan 2024 Yiyi Chen, Heather Lent, Johannes Bjerva

However, storing sensitive information as embeddings can be vulnerable to security breaches, as research shows that text can be reconstructed from embeddings, even without knowledge of the underlying model.

Multilingual Gradient Word-Order Typology from Universal Dependencies

no code implementations 2 Feb 2024 Emi Baylor, Esther Ploeger, Johannes Bjerva

While information from the field of linguistic typology has the potential to improve performance on NLP tasks, reliable typological data is a prerequisite.

Sociolinguistically Informed Interpretability: A Case Study on Hinglish Emotion Classification

no code implementations 5 Feb 2024 Kushal Tatariya, Heather Lent, Johannes Bjerva, Miryam de Lhoneux

Emotion classification is a challenging task in NLP due to the inherent idiosyncratic and subjective nature of linguistic expression, especially with code-mixed data.

Emotion Classification

What is 'Typological Diversity' in NLP?

1 code implementation 6 Feb 2024 Esther Ploeger, Wessel Poelman, Miryam de Lhoneux, Johannes Bjerva

We recommend future work to include an operationalization of 'typological diversity' that empirically justifies the diversity of language samples.

Multilingual NLP
