Search Results for author: Anne Lauscher

Found 28 papers, 16 papers with code

From Zero to Hero: On the Limitations of Zero-Shot Language Transfer with Multilingual Transformers

no code implementations EMNLP 2020 Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš

Massively multilingual transformers (MMTs) pretrained via language modeling (e.g., mBERT, XLM-R) have become a default paradigm for zero-shot language transfer in NLP, offering unmatched transfer performance.

Cross-Lingual Word Embeddings, Dependency Parsing +4

DS-TOD: Efficient Domain Specialization for Task-Oriented Dialog

1 code implementation Findings (ACL) 2022 Chia-Chien Hung, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavaš

Recent work has shown that self-supervised dialog-specific pretraining on large conversational datasets yields substantial gains over traditional language modeling (LM) pretraining in downstream task-oriented dialog (TOD).

Language Modelling, Masked Language Modeling +1

Multi2WOZ: A Robust Multilingual Dataset and Conversational Pretraining for Task-Oriented Dialog

no code implementations 20 May 2022 Chia-Chien Hung, Anne Lauscher, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš

We then introduce a new framework for multilingual conversational specialization of pretrained language models (PrLMs) that aims to facilitate cross-lingual transfer for arbitrary downstream TOD tasks.

Cross-Lingual Transfer, Pretrained Language Models

Fair and Argumentative Language Modeling for Computational Argumentation

1 code implementation ACL 2022 Carolin Holtermann, Anne Lauscher, Simone Paolo Ponzetto

We employ our resource to assess the effect of argumentative fine-tuning and debiasing on the intrinsic bias found in transformer-based language models using a lightweight adapter-based approach that is more sustainable and parameter-efficient than full fine-tuning.

Language Modelling

Welcome to the Modern World of Pronouns: Identity-Inclusive Natural Language Processing beyond Gender

no code implementations 24 Feb 2022 Anne Lauscher, Archie Crowley, Dirk Hovy

Based on our observations and ethical considerations, we define a series of desiderata for modeling pronouns in language technology.

DS-TOD: Efficient Domain Specialization for Task Oriented Dialog

1 code implementation 15 Oct 2021 Chia-Chien Hung, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavaš

Recent work has shown that self-supervised dialog-specific pretraining on large conversational datasets yields substantial gains over traditional language modeling (LM) pretraining in downstream task-oriented dialog (TOD).

Language Modelling, Masked Language Modeling +1

Sustainable Modular Debiasing of Language Models

no code implementations Findings (EMNLP) 2021 Anne Lauscher, Tobias Lüken, Goran Glavaš

Unfair stereotypical biases (e.g., gender, racial, or religious biases) encoded in modern pretrained language models (PLMs) have negative ethical implications for the widespread adoption of state-of-the-art language technology.

Fairness, Language Modelling +1

Diachronic Analysis of German Parliamentary Proceedings: Ideological Shifts through the Lens of Political Biases

1 code implementation 13 Aug 2021 Tobias Walter, Celina Kirschner, Steffen Eger, Goran Glavaš, Anne Lauscher, Simone Paolo Ponzetto

We analyze bias in historical corpora as encoded in diachronic distributional semantic models by focusing on two specific forms of bias, namely a political one (i.e., anti-communism) and a racist one (i.e., antisemitism).

Diachronic Word Embeddings, Word Embeddings

Scientia Potentia Est -- On the Role of Knowledge in Computational Argumentation

no code implementations 1 Jul 2021 Anne Lauscher, Henning Wachsmuth, Iryna Gurevych, Goran Glavaš

In this survey paper, we fill this gap by (1) proposing a pyramid of types of knowledge required in CA tasks, (2) analysing the state of the art with respect to the reliance on and exploitation of these types of knowledge, for each of the four main research areas in CA, and (3) outlining and discussing directions for future research efforts in CA.

Common Sense Reasoning, Natural Language Understanding

MultiCite: Modeling realistic citations requires moving beyond the single-sentence single-label setting

1 code implementation 1 Jul 2021 Anne Lauscher, Brandon Ko, Bailey Kuehl, Sophie Johnson, David Jurgens, Arman Cohan, Kyle Lo

In our work, we address this research gap by proposing a novel framework for CCA as a document-level context extraction and labeling task.

Text Classification

Self-Supervised Learning for Visual Summary Identification in Scientific Publications

no code implementations 21 Dec 2020 Shintaro Yamamoto, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavaš, Shigeo Morishima

Providing visual summaries of scientific publications can increase information access for readers and thereby help deal with the exponential growth in the number of scientific publications.

Self-Supervised Learning

AraWEAT: Multidimensional Analysis of Biases in Arabic Word Embeddings

no code implementations COLING (WANLP) 2020 Anne Lauscher, Rafik Takieddin, Simone Paolo Ponzetto, Goran Glavaš

Our analysis yields several interesting findings, e.g., that implicit gender bias in embeddings trained on Arabic news corpora steadily increases over time (between 2007 and 2017).

Word Embeddings

Creating a Domain-diverse Corpus for Theory-based Argument Quality Assessment

1 code implementation COLING (ArgMining) 2020 Lily Ng, Anne Lauscher, Joel Tetreault, Courtney Napoles

Computational models of argument quality (AQ) have focused primarily on assessing the overall quality or just one specific characteristic of an argument, such as its convincingness or its clarity.

Rhetoric, Logic, and Dialectic: Advancing Theory-based Argument Quality Assessment in Natural Language Processing

1 code implementation COLING 2020 Anne Lauscher, Lily Ng, Courtney Napoles, Joel Tetreault

Though preceding work in computational argument quality (AQ) mostly focuses on assessing overall AQ, researchers agree that writers would benefit from feedback targeting individual dimensions of argumentation theory.

The OpenCitations Data Model

1 code implementation 25 May 2020 Marilena Daquino, Silvio Peroni, David Shotton, Giovanni Colavizza, Behnam Ghavimi, Anne Lauscher, Philipp Mayr, Matteo Romanello, Philipp Zumstein

A variety of schemas and ontologies are currently used for the machine-readable description of bibliographic entities and citations.

Digital Libraries

Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers

1 code implementation EMNLP (DeeLIO) 2020 Anne Lauscher, Olga Majewska, Leonardo F. R. Ribeiro, Iryna Gurevych, Nikolai Rozanov, Goran Glavaš

Following the major success of neural language models (LMs) such as BERT or GPT-2 on a variety of language understanding tasks, recent work has focused on injecting (structured) knowledge from external resources into these models.

Common Sense Reasoning

From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers

no code implementations 1 May 2020 Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš

Massively multilingual transformers pretrained with language modeling objectives (e.g., mBERT, XLM-R) have become a de facto default transfer paradigm for zero-shot cross-lingual transfer in NLP, offering unmatched transfer performance.

Cross-Lingual Word Embeddings, Dependency Parsing +5

A General Framework for Implicit and Explicit Debiasing of Distributional Word Vector Spaces

3 code implementations 13 Sep 2019 Anne Lauscher, Goran Glavaš, Simone Paolo Ponzetto, Ivan Vulić

Moreover, we successfully transfer debiasing models, by means of cross-lingual embedding spaces, and remove or attenuate biases in distributional word vector spaces of languages that lack readily available bias specifications.

Word Embeddings

Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

1 code implementation COLING 2020 Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš

In this work, we complement such distributional knowledge with external lexical knowledge, that is, we integrate the discrete knowledge on word-level semantic similarity into pretraining.

Language Modelling, Lexical Simplification +6

MinScIE: Citation-centered Open Information Extraction

1 code implementation Joint Conference on Digital Libraries (JCDL) 2019 Anne Lauscher, Yide Song, Kiril Gashteovski

Acknowledging the importance of citations in scientific literature, in this work we present MinScIE, an Open Information Extraction system which provides structured knowledge enriched with semantic information about citations.

Open Information Extraction

Are We Consistently Biased? Multidimensional Analysis of Biases in Distributional Word Vectors

1 code implementation SEMEVAL 2019 Anne Lauscher, Goran Glavaš

In this work, we present a systematic study of biases encoded in distributional word vector spaces: we analyze how consistent the bias effects are across languages, corpora, and embedding models.

Cross-Lingual Transfer, Word Embeddings

An Argument-Annotated Corpus of Scientific Publications

no code implementations WS 2018 Anne Lauscher, Goran Glavaš, Simone Paolo Ponzetto

We analyze the annotated argumentative structures and investigate the relations between argumentation and other rhetorical aspects of scientific writing, such as discourse roles and citation contexts.

Argument Mining
