no code implementations • LREC (MWE) 2022 • Sabine Schulte im Walde
A variety of distributional and multi-modal computational approaches has been suggested for modelling the degrees of compositionality across types of multiword expressions and languages.
1 code implementation • LREC 2022 • Annerose Eichel, Gabriella Lapesa, Sabine Schulte im Walde
Agenda-setting is a widely explored phenomenon in political science: powerful stakeholders (governments or their financial supporters) have control over the media and set its agenda, i.e., political and economic powers determine which news should be salient.
no code implementations • EACL (VarDial) 2021 • Diego Frassinelli, Gabriella Lapesa, Reem Alatrash, Dominik Schlechtweg, Sabine Schulte im Walde
Kiezdeutsch is a variety of German predominantly spoken by teenagers from multi-ethnic urban neighborhoods in casual conversations with their peers.
no code implementations • LREC 2022 • Gioia Baldissin, Dominik Schlechtweg, Sabine Schulte im Walde
We provide a novel dataset – DiaWUG – with judgements on diatopic lexical semantic variation for six Spanish variants in Europe and Latin America.
1 code implementation • 15 Oct 2024 • Tarun Tater, Sabine Schulte im Walde, Diego Frassinelli
The visual representation of a concept varies significantly depending on its meaning and the context where it occurs; this poses multiple challenges both for vision and multimodal models.
1 code implementation • 5 Apr 2024 • Annerose Eichel, Tana Deeg, André Blessing, Milena Belosevic, Sabine Arndt-Lappe, Sabine Schulte im Walde
We present a comprehensive computational study of the under-investigated phenomenon of personal name compounds (PNCs) in German such as Willkommens-Merkel ('Welcome-Merkel').
1 code implementation • 5 Apr 2024 • Annerose Eichel, Sabine Schulte im Walde
We present a novel dataset for physical and abstract plausibility of events in English.
no code implementations • 27 Jan 2024 • Filip Miletić, Sabine Schulte im Walde
Our findings overall question the ability of transformer models to robustly capture fine-grained semantics.
no code implementations • 21 Nov 2023 • Dominik Schlechtweg, Shafqat Mumtaz Virk, Pauline Sander, Emma Sköldberg, Lukas Theuer Linke, Tuo Zhang, Nina Tahmasebi, Jonas Kuhn, Sabine Schulte im Walde
We present the DURel tool that implements the annotation of semantic proximity between uses of words into an online, open source interface.
no code implementations • 8 Nov 2023 • Urban Knupleš, Diego Frassinelli, Sabine Schulte im Walde
Humans tend to strongly agree on ratings on a scale for extreme cases (e.g., a CAT is judged as very concrete), but judgements on mid-scale words exhibit more disagreement.
1 code implementation • 28 Apr 2023 • Annerose Eichel, Helena Schlipf, Sabine Schulte im Walde
To overcome the lack of annotated datasets, we propose a novel approach that learns domain-specific plausible materials for components in the vehicle repair domain by probing Pretrained Language Models (PLMs) in a cloze-task-style setting.
no code implementations • *SEM (NAACL) 2022 • Prisca Piccirilli, Sabine Schulte im Walde
Given a specific discourse, which discourse properties trigger the use of metaphorical language, rather than using literal alternatives?
no code implementations • LREC 2022 • Prisca Piccirilli, Sabine Schulte im Walde
First, is a metaphorically-perceived discourse more abstract and more emotional in comparison to a literally-perceived discourse?
no code implementations • Joint Conference on Lexical and Computational Semantics 2021 • Anna Hätty, Julia Bettinger, Michael Dorna, Jonas Kuhn, Sabine Schulte im Walde
Predicting the difficulty of domain-specific vocabulary is an important task towards a better understanding of a domain and towards enhancing the communication between lay people and experts.
1 code implementation • Joint Conference on Lexical and Computational Semantics 2021 • Dominik Schlechtweg, Enrique Castaneda, Jonas Kuhn, Sabine Schulte im Walde
We suggest modelling human-annotated Word Usage Graphs, which capture fine-grained semantic proximity distinctions between word uses, with a Bayesian formulation of the Weighted Stochastic Block Model, a generative model for random graphs that is popular in biology, physics and the social sciences.
1 code implementation • ACL 2021 • Sinan Kurtyigit, Maike Park, Dominik Schlechtweg, Jonas Kuhn, Sabine Schulte im Walde
While there is a large amount of research in the field of Lexical Semantic Change Detection, only a few approaches go beyond a standard benchmark evaluation of existing models.
1 code implementation • Findings (ACL) 2021 • Thomas Bott, Dominik Schlechtweg, Sabine Schulte im Walde
This paper presents a comparison of unsupervised methods of hypernymy prediction (i.e., to predict which word in a pair of words such as fish-cod is the hypernym and which the hyponym).
no code implementations • EACL 2021 • Severin Laicher, Sinan Kurtyigit, Dominik Schlechtweg, Jonas Kuhn, Sabine Schulte im Walde
Type- and token-based embedding architectures are still competing in lexical semantic change detection.
1 code implementation • 14 Nov 2020 • Severin Laicher, Gioia Baldissin, Enrique Castañeda, Dominik Schlechtweg, Sabine Schulte im Walde
We present the results of our participation in the DIACR-Ita shared task on lexical semantic change detection for Italian.
no code implementations • 6 Nov 2020 • Jens Kaiser, Dominik Schlechtweg, Sabine Schulte im Walde
We present the results of our participation in the DIACR-Ita shared task on lexical semantic change detection for Italian.
no code implementations • SEMEVAL 2020 • Jens Kaiser, Dominik Schlechtweg, Sean Papay, Sabine Schulte im Walde
We present the results of our system for SemEval-2020 Task 1 that exploits a commonly used lexical semantic change detection model based on Skip-Gram with Negative Sampling.
no code implementations • ACL 2020 • Anna Hätty, Dominik Schlechtweg, Michael Dorna, Sabine Schulte im Walde
While automatic term extraction is a well-researched area, computational approaches to distinguish between degrees of technicality are still understudied.
no code implementations • LREC 2020 • Pegah Alipoor, Sabine Schulte im Walde
Predicting the degree of compositionality of noun compounds such as "snowball" and "butterfly" is a crucial ingredient for lexicography and Natural Language Processing applications: it determines whether the compound should be treated as a whole or through its constituents, and what it means.
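A common distributional proxy for compound compositionality (an assumption-laden sketch, not necessarily the models compared in the paper) scores a compound by its vector similarity to its constituents:

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def compositionality(compound_vec, head_vec, modifier_vec):
    """Average cosine similarity of a compound to its constituents:
    high for transparent compounds ('snowball'), low for opaque ones ('butterfly')."""
    return 0.5 * (cosine(compound_vec, head_vec) + cosine(compound_vec, modifier_vec))

# Toy vectors: 'snowball' close to 'snow'/'ball'; 'butterfly' unrelated to 'butter'/'fly'.
snowball, snow, ball = np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.2, 0.0]), np.array([0.2, 1.0, 0.0])
butterfly, butter, fly = np.array([0.0, 0.0, 1.0]), np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
assert compositionality(snowball, snow, ball) > compositionality(butterfly, butter, fly)
```

Real systems would substitute corpus-derived embeddings for the toy vectors and correlate the scores with human compositionality ratings.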
no code implementations • LREC 2020 • Julia Bettinger, Anna H{\"a}tty, Michael Dorna, Sabine Schulte im Walde
We present a dataset with difficulty ratings for 1,030 German closed noun compounds extracted from domain-specific texts for do-it-yourself (DIY), cooking and automotive.
no code implementations • LREC 2020 • Anurag Nigam, Anna Hätty, Sabine Schulte im Walde
We perform a comparative study for automatic term extraction from domain-specific language using a PageRank model with different edge-weighting methods.
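A PageRank model with edge weights can be sketched as a small power iteration; the toy term graph and its co-occurrence weights below are hypothetical, not the paper's data or its particular edge-weighting methods:

```python
def pagerank(edges, damping=0.85, iters=100):
    """Weighted PageRank via power iteration.
    edges: dict mapping node -> {neighbour: weight}."""
    nodes = set(edges) | {m for nbrs in edges.values() for m in nbrs}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for n, nbrs in edges.items():
            total = sum(nbrs.values())
            for m, w in nbrs.items():
                # Each node passes its rank along outgoing edges, proportional to weight.
                new[m] += damping * rank[n] * w / total
        # Mass from dangling nodes (no outgoing edges) is spread uniformly.
        dangling = sum(rank[n] for n in nodes if not edges.get(n))
        for n in nodes:
            new[n] += damping * dangling / len(nodes)
        rank = new
    return rank

# Toy term graph: edge weights stand in for co-occurrence counts in a domain corpus.
g = {"engine": {"motor": 3, "oil": 1}, "motor": {"engine": 3}, "oil": {"engine": 1}}
r = pagerank(g)
```

Varying how the edge weights are derived (raw counts, PMI, similarity scores) is exactly the kind of comparison such a study performs.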
no code implementations • LREC 2020 • Reem Alatrash, Dominik Schlechtweg, Jonas Kuhn, Sabine Schulte im Walde
Modelling language change is an increasingly important area of interest within the fields of sociolinguistics and historical linguistics.
no code implementations • 9 Jan 2020 • Dominik Schlechtweg, Sabine Schulte im Walde
We present a novel procedure to simulate lexical semantic change from synchronic sense-annotated data, and demonstrate its usefulness for assessing lexical semantic change detection models.
no code implementations • IJCNLP 2019 • Marco Del Tredici, Diego Marcheggiani, Sabine Schulte im Walde, Raquel Fernández
Information about individuals can help to better understand what they say, particularly in social media where texts are short.
1 code implementation • ACL 2019 • Dominik Schlechtweg, Anna Hätty, Marco del Tredici, Sabine Schulte im Walde
We perform an interdisciplinary large-scale evaluation for detecting lexical semantic divergences in a diachronic and in a synchronic task: semantic sense changes across time, and semantic sense changes across domains.
1 code implementation • WS 2019 • Dominik Schlechtweg, Cennet Oguz, Sabine Schulte im Walde
We simulate first- and second-order context overlap and show that Skip-Gram with Negative Sampling is similar to Singular Value Decomposition in capturing second-order co-occurrence information, while Pointwise Mutual Information is agnostic to it.
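The count-based side of such comparisons builds a PPMI matrix and factorises it with truncated SVD; a minimal numpy sketch with invented toy counts:

```python
import numpy as np

def ppmi(counts):
    """Positive pointwise mutual information from a word-context count matrix."""
    total = counts.sum()
    p_w = counts.sum(axis=1, keepdims=True) / total
    p_c = counts.sum(axis=0, keepdims=True) / total
    p_wc = counts / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log2(p_wc / (p_w * p_c))
    # Keep only positive PMI values for observed pairs; everything else is 0.
    return np.where((counts > 0) & (pmi > 0), pmi, 0.0)

# Toy word-context co-occurrence counts (rows = words, columns = contexts).
counts = np.array([[10.0, 2.0, 0.0],
                   [3.0, 8.0, 1.0],
                   [0.0, 1.0, 9.0]])
M = ppmi(counts)
# Low-rank word vectors via truncated SVD (here rank 2).
U, S, Vt = np.linalg.svd(M)
word_vecs = U[:, :2] * S[:2]
```

SGNS reaches similar second-order information implicitly through its training objective, which is what makes the two model families comparable in this kind of simulation study.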
no code implementations • SEMEVAL 2019 • Anna Hätty, Dominik Schlechtweg, Sabine Schulte im Walde
We introduce SURel, a novel dataset with human-annotated meaning shifts between general-language and domain-specific contexts.
no code implementations • WS 2019 • Diego Frassinelli, Sabine Schulte im Walde
In recent years, both cognitive and computational research have provided empirical analyses of the contextual co-occurrence of concrete and abstract words, with partly inconsistent results.
1 code implementation • COLING 2018 • Jeremy Barnes, Roman Klinger, Sabine Schulte im Walde
Our analysis shows that our model performs comparably to state-of-the-art approaches on domains that are similar, while performing significantly better on highly divergent domains.
no code implementations • COLING 2018 • Anna Hätty, Sabine Schulte im Walde
Automatic term identification and investigating the understandability of terms in a specialized domain are often treated as two separate lines of research.
1 code implementation • 12 Jun 2018 • Jeremy Barnes, Roman Klinger, Sabine Schulte im Walde
Our analysis shows that our model performs comparably to state-of-the-art approaches on domains that are similar, while performing significantly better on highly divergent domains.
no code implementations • SEMEVAL 2018 • Daniela Naumann, Diego Frassinelli, Sabine Schulte im Walde
Across disciplines, researchers are eager to gain insight into empirical features of abstract vs. concrete concepts.
no code implementations • NAACL 2018 • Maximilian Köper, Sabine Schulte im Walde
We present a computational model to detect and distinguish analogies in meaning shifts between German base and complex verbs.
no code implementations • NAACL 2018 • Eleri Aedmaa, Maximilian Köper, Sabine Schulte im Walde
This paper presents two novel datasets and a random-forest classifier to automatically predict literal vs. non-literal language usage for a highly frequent type of multi-word expression in a low-resource language, i.e., Estonian.
no code implementations • WS 2018 • Ina Roesiger, Maximilian Köper, Kim Anh Nguyen, Sabine Schulte im Walde
Cases of coreference and bridging resolution often require knowledge about semantic relations between anaphors and antecedents.
no code implementations • NAACL 2018 • Anna Hätty, Sabine Schulte im Walde
This paper introduces a new dataset of term annotation.
no code implementations • SEMEVAL 2018 • Sabine Schulte im Walde, Maximilian Köper, Sylvia Springorum
This paper presents a collection to assess meaning components in German complex verbs, which frequently undergo meaning shifts.
1 code implementation • ACL 2018 • Jeremy Barnes, Roman Klinger, Sabine Schulte im Walde
Sentiment analysis in low-resource languages suffers from a lack of annotated corpora to estimate high-performing models.
no code implementations • NAACL 2018 • Dominik Schlechtweg, Sabine Schulte im Walde, Stefanie Eckmann
We propose a framework that extends synchronic polysemy annotation to diachronic changes in lexical meaning, to counteract the lack of resources for evaluating computational models of lexical semantic change.
no code implementations • NAACL 2018 • Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu
We present two novel datasets for the low-resource language Vietnamese to assess models of semantic similarity: ViCon comprises pairs of synonyms and antonyms across word classes, thus offering data to distinguish between similarity and dissimilarity.
no code implementations • 14 Apr 2018 • Dominik Schlechtweg, Sabine Schulte im Walde
We test the hypothesis that the degree of grammaticalization of German prepositions correlates with their corpus-based contextual dispersion measured by word entropy.
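Word entropy as a contextual dispersion measure can be computed directly from a word's context distribution; the German context tokens below are invented for illustration:

```python
from collections import Counter
from math import log2

def word_entropy(contexts):
    """Shannon entropy of the distribution of a word's context words:
    a dispersion measure -- higher values mean less restricted usage."""
    counts = Counter(contexts)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Toy contexts: a highly grammaticalized preposition combines with many different
# nouns, a lexically restricted one with only a few.
broad = ["Haus", "Tisch", "Berg", "Stadt", "Wald", "Fluss", "Weg", "Tor"]
narrow = ["Bezug", "Bezug", "Bezug", "Hinblick"]
assert word_entropy(broad) > word_entropy(narrow)
```

The hypothesis tested in the paper is that this entropy correlates with the degree of grammaticalization; the sketch only shows how the measure itself is obtained.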
no code implementations • WS 2017 • Jeremy Barnes, Roman Klinger, Sabine Schulte im Walde
We show that Bi-LSTMs perform well across datasets and that both LSTMs and Bi-LSTMs are particularly good at fine-grained sentiment tasks (i.e., with more than two classes).
no code implementations • EMNLP 2017 • Kim Anh Nguyen, Maximilian Köper, Sabine Schulte im Walde, Ngoc Thang Vu
We present a novel neural model HyperVec to learn hierarchical embeddings for hypernymy detection and directionality.
1 code implementation • CONLL 2017 • Dominik Schlechtweg, Stefanie Eckmann, Enrico Santus, Sabine Schulte im Walde, Daniel Hole
This paper explores the information-theoretic measure entropy to detect metaphoric change, transferring ideas from hypernym detection to research on language change.
no code implementations • WS 2017 • Stefan Bott, Sabine Schulte im Walde
Ambiguity represents an obstacle for distributional semantic models (DSMs), which typically subsume the contexts of all word senses within one vector.
no code implementations • WS 2017 • Maximilian Köper, Sabine Schulte im Walde
Abstract words refer to things that cannot be seen, heard, felt, smelled, or tasted, as opposed to concrete words.
no code implementations • WS 2017 • Maximilian Köper, Sabine Schulte im Walde
This paper compares a neural network DSM relying on textual co-occurrences with a multi-modal model integrating visual information.
no code implementations • EACL 2017 • Anna Hätty, Michael Dorna, Sabine Schulte im Walde
Feature design and selection is a crucial aspect when treating terminology extraction as a machine learning classification problem.
no code implementations • EACL 2017 • Maximilian Köper, Sabine Schulte im Walde
To date, the majority of computational models still determines the semantic relatedness between words (or larger linguistic units) on the type level.
no code implementations • EACL 2017 • Marion Weller-Di Marco, Alexander Fraser, Sabine Schulte im Walde
Many errors in phrase-based SMT can be attributed to problems on three linguistic levels: morphological complexity in the target language, structural differences and lexical choice.
1 code implementation • EACL 2017 • Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu
Distinguishing between antonyms and synonyms is a key task to achieve high performance in NLP systems.
no code implementations • WS 2016 • Stefan Bott, Nana Khvtisavrishvili, Max Kisselew, Sabine Schulte im Walde
German particle verbs represent a frequent type of multi-word expression that forms a highly productive paradigm in the lexicon.
1 code implementation • COLING 2016 • Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu
Word embeddings have been demonstrated to benefit NLP tasks impressively.
no code implementations • ACL 2016 • Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu
We propose a novel vector representation that integrates lexical contrast into distributional vectors and strengthens the most salient features for determining degrees of word similarity.
no code implementations • LREC 2016 • Maximilian Köper, Melanie Zaiß, Qi Han, Steffen Koch, Sabine Schulte im Walde
Vector space models and distributional information are widely used in NLP.
no code implementations • LREC 2016 • Maximilian Köper, Sabine Schulte im Walde
This paper presents a collection of 350,000 German lemmatised words, rated on four psycholinguistic affective attributes.
no code implementations • LREC 2016 • Sabine Schulte im Walde, Anna Hätty, Stefan Bott, Nana Khvtisavrishvili
This paper presents a novel gold standard of German noun-noun compounds (Ghost-NN) including 868 compounds annotated with corpus frequencies of the compounds and their constituents, productivity and ambiguity of the constituents, semantic relations between the constituents, and compositionality ratings of compound-constituent pairs.
no code implementations • LREC 2014 • Jason Utt, Sylvia Springorum, Maximilian Köper, Sabine Schulte im Walde
This paper discusses an extension of the V-measure (Rosenberg and Hirschberg, 2007), an entropy-based cluster evaluation metric.
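For reference, the original V-measure (before any extension) is the weighted harmonic mean of homogeneity and completeness, both defined via conditional entropies; a compact pure-Python version:

```python
from collections import Counter
from math import log

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def cond_entropy(labels, given):
    """H(labels | given) over paired assignments."""
    n = len(labels)
    h = 0.0
    for g, count in Counter(given).items():
        sub = [l for l, gg in zip(labels, given) if gg == g]
        h += (count / n) * entropy(sub)
    return h

def v_measure(classes, clusters, beta=1.0):
    """V-measure: weighted harmonic mean of homogeneity and completeness."""
    h_c, h_k = entropy(classes), entropy(clusters)
    hom = 1.0 if h_c == 0 else 1.0 - cond_entropy(classes, clusters) / h_c
    com = 1.0 if h_k == 0 else 1.0 - cond_entropy(clusters, classes) / h_k
    if hom + com == 0:
        return 0.0
    return (1 + beta) * hom * com / (beta * hom + com)

classes = ["a", "a", "b", "b"]
print(v_measure(classes, [0, 0, 1, 1]))  # perfect clustering -> 1.0
```

The extension discussed in the paper builds on this base metric; the code above only reproduces the standard Rosenberg and Hirschberg (2007) definition.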
no code implementations • LREC 2014 • Stefan Bott, Sabine Schulte im Walde
In the work presented here we assess the degree of compositionality of German Particle Verbs with a Distributional Semantics Model which only relies on word window information and has no access to syntactic information as such.
no code implementations • LREC 2014 • Moritz Wittmann, Marion Weller, Sabine Schulte im Walde
In our evaluation against a gold standard, we compare different pre-processing strategies (lemmatized vs. inflected forms) and introduce language model scores of synonym candidates in the context of the input particle verb as well as distributional similarity as additional re-ranking criteria.
no code implementations • LREC 2014 • Maximilian Köper, Sabine Schulte im Walde
This paper addresses vector space models of prepositions, a notoriously ambiguous word class.
no code implementations • LREC 2012 • Sabine Schulte im Walde, Susanne Borgwaldt, Ronny Jauch
This paper introduces association norms of German noun compounds as a lexical semantic resource for cognitive and computational linguistics research on compositionality.
no code implementations • LREC 2012 • Sylvia Springorum, Sabine Schulte im Walde, Antje Roßdeutscher
A focus of the study was the mutual benefit of theoretical and empirical perspectives with respect to salient semantic properties of the 'an' particle verbs: (a) how can we transform the theoretical insights into empirical, corpus-based features, (b) to what extent can we replicate the theoretical classification with a machine learning approach, and (c) can the computational analysis in turn deepen our insights into the semantic properties of the particle verbs?