Figurative Language in Noun Compound Models across Target Properties, Domains and Time

no code implementations LREC (MWE) 2022 Sabine Schulte im Walde

A variety of distributional and multi-modal computational approaches has been suggested for modelling the degrees of compositionality across types of multiword expressions and languages.

Investigating Independence vs. Control: Agenda-Setting in Russian News Coverage on Social Media

1 code implementation LREC 2022 Annerose Eichel, Gabriella Lapesa, Sabine Schulte im Walde

Agenda-setting is a widely explored phenomenon in political science: powerful stakeholders (governments or their financial supporters) have control over the media and set their agenda: political and economic powers determine which news should be salient.

DiaWUG: A Dataset for Diatopic Lexical Semantic Variation in Spanish

no code implementations LREC 2022 Gioia Baldissin, Dominik Schlechtweg, Sabine Schulte im Walde

We provide a novel dataset – DiaWUG – with judgements on diatopic lexical semantic variation for six Spanish variants in Europe and Latin America.

Unveiling the Mystery of Visual Attributes of Concrete and Abstract Concepts: Variability, Nearest Neighbors, and Challenging Categories

1 code implementation 15 Oct 2024 Tarun Tater, Sabine Schulte im Walde, Diego Frassinelli

The visual representation of a concept varies significantly depending on its meaning and the context where it occurs; this poses multiple challenges both for vision and multimodal models.


Willkommens-Merkel, Chaos-Johnson, and Tore-Klose: Modeling the Evaluative Meaning of German Personal Name Compounds

1 code implementation 5 Apr 2024 Annerose Eichel, Tana Deeg, André Blessing, Milena Belosevic, Sabine Arndt-Lappe, Sabine Schulte im Walde

We present a comprehensive computational study of the under-investigated phenomenon of personal name compounds (PNCs) in German such as Willkommens-Merkel ('Welcome-Merkel').

A Dataset for Physical and Abstract Plausibility and Sources of Human Disagreement

1 code implementation 5 Apr 2024 Annerose Eichel, Sabine Schulte im Walde

We present a novel dataset for physical and abstract plausibility of events in English.

Semantics of Multiword Expressions in Transformer-Based Models: A Survey

no code implementations 27 Jan 2024 Filip Miletić, Sabine Schulte im Walde

Our findings overall question the ability of transformer models to robustly capture fine-grained semantics.

Investigating the Nature of Disagreements on Mid-Scale Ratings: A Case Study on the Abstractness-Concreteness Continuum

no code implementations 8 Nov 2023 Urban Knupleš, Diego Frassinelli, Sabine Schulte im Walde

Humans tend to strongly agree on ratings on a scale for extreme cases (e.g., a CAT is judged as very concrete), but judgements on mid-scale words exhibit more disagreement.


Made of Steel? Learning Plausible Materials for Components in the Vehicle Repair Domain

1 code implementation 28 Apr 2023 Annerose Eichel, Helena Schlipf, Sabine Schulte im Walde

We propose a novel approach to learn domain-specific plausible materials for components in the vehicle repair domain by probing Pretrained Language Models (PLMs) in a cloze task style setting to overcome the lack of annotated datasets.

Domain Adaptation

Features of Perceived Metaphoricity on the Discourse Level: Abstractness and Emotionality

no code implementations LREC 2022 Prisca Piccirilli, Sabine Schulte im Walde

First, is a metaphorically-perceived discourse more abstract and more emotional in comparison to a literally-perceived discourse?


Modeling Sense Structure in Word Usage Graphs with the Weighted Stochastic Block Model

1 code implementation Joint Conference on Lexical and Computational Semantics 2021 Dominik Schlechtweg, Enrique Castaneda, Jonas Kuhn, Sabine Schulte im Walde

We suggest to model human-annotated Word Usage Graphs capturing fine-grained semantic proximity distinctions between word uses with a Bayesian formulation of the Weighted Stochastic Block Model, a generative model for random graphs popular in biology, physics and social sciences.

Stochastic Block Model

Lexical Semantic Change Discovery

1 code implementation ACL 2021 Sinan Kurtyigit, Maike Park, Dominik Schlechtweg, Jonas Kuhn, Sabine Schulte im Walde

While there is a large amount of research in the field of Lexical Semantic Change Detection, only few approaches go beyond a standard benchmark evaluation of existing models.

Change Detection

More than just Frequency? Demasking Unsupervised Hypernymy Prediction Methods

1 code implementation Findings (ACL) 2021 Thomas Bott, Dominik Schlechtweg, Sabine Schulte im Walde

This paper presents a comparison of unsupervised methods of hypernymy prediction (i.e., to predict which word in a pair of words such as fish-cod is the hypernym and which the hyponym).

OP-IMS @ DIACR-Ita: Back to the Roots: SGNS+OP+CD still rocks Semantic Change Detection

no code implementations 6 Nov 2020 Jens Kaiser, Dominik Schlechtweg, Sabine Schulte im Walde

We present the results of our participation in the DIACR-Ita shared task on lexical semantic change detection for Italian.

Change Detection

IMS at SemEval-2020 Task 1: How low can you go? Dimensionality in Lexical Semantic Change Detection

no code implementations SEMEVAL 2020 Jens Kaiser, Dominik Schlechtweg, Sean Papay, Sabine Schulte im Walde

We present the results of our system for SemEval-2020 Task 1 that exploits a commonly used lexical semantic change detection model based on Skip-Gram with Negative Sampling.

Change Detection

Predicting Degrees of Technicality in Automatic Terminology Extraction

no code implementations ACL 2020 Anna Hätty, Dominik Schlechtweg, Michael Dorna, Sabine Schulte im Walde

While automatic term extraction is a well-researched area, computational approaches to distinguish between degrees of technicality are still understudied.

Term Extraction Word Embeddings

Variants of Vector Space Reductions for Predicting the Compositionality of English Noun Compounds

no code implementations LREC 2020 Pegah Alipoor, Sabine Schulte im Walde

Predicting the degree of compositionality of noun compounds such as "snowball" and "butterfly" is a crucial ingredient for lexicography and Natural Language Processing applications, to know whether the compound should be treated as a whole, or through its constituents, and what it means.

A Domain-Specific Dataset of Difficulty Ratings for German Noun Compounds in the Domains DIY, Cooking and Automotive

no code implementations LREC 2020 Julia Bettinger, Anna Hätty, Michael Dorna, Sabine Schulte im Walde

We present a dataset with difficulty ratings for 1,030 German closed noun compounds extracted from domain-specific texts for do-it-yourself (DIY), cooking and automotive.

Varying Vector Representations and Integrating Meaning Shifts into a PageRank Model for Automatic Term Extraction

no code implementations LREC 2020 Anurag Nigam, Anna Hätty, Sabine Schulte im Walde

We perform a comparative study for automatic term extraction from domain-specific language using a PageRank model with different edge-weighting methods.

Term Extraction

CCOHA: Clean Corpus of Historical American English

no code implementations LREC 2020 Reem Alatrash, Dominik Schlechtweg, Jonas Kuhn, Sabine Schulte im Walde

Modelling language change is an increasingly important area of interest within the fields of sociolinguistics and historical linguistics.

Simulating Lexical Semantic Change from Sense-Annotated Data

no code implementations 9 Jan 2020 Dominik Schlechtweg, Sabine Schulte im Walde

We present a novel procedure to simulate lexical semantic change from synchronic sense-annotated data, and demonstrate its usefulness for assessing lexical semantic change detection models.

Change Detection

You Shall Know a User by the Company It Keeps: Dynamic Representations for Social Media Users in NLP

no code implementations IJCNLP 2019 Marco Del Tredici, Diego Marcheggiani, Sabine Schulte im Walde, Raquel Fernández

Information about individuals can help to better understand what they say, particularly in social media where texts are short.

Graph Attention

A Wind of Change: Detecting and Evaluating Lexical Semantic Change across Times and Domains

1 code implementation ACL 2019 Dominik Schlechtweg, Anna Hätty, Marco del Tredici, Sabine Schulte im Walde

We perform an interdisciplinary large-scale evaluation for detecting lexical semantic divergences in a diachronic and in a synchronic task: semantic sense changes across time, and semantic sense changes across domains.

Term Extraction

Second-order Co-occurrence Sensitivity of Skip-Gram with Negative Sampling

1 code implementation WS 2019 Dominik Schlechtweg, Cennet Oguz, Sabine Schulte im Walde

We simulate first- and second-order context overlap and show that Skip-Gram with Negative Sampling is similar to Singular Value Decomposition in capturing second-order co-occurrence information, while Pointwise Mutual Information is agnostic to it.

SURel: A Gold Standard for Incorporating Meaning Shifts into Term Extraction

no code implementations SEMEVAL 2019 Anna Hätty, Dominik Schlechtweg, Sabine Schulte im Walde

We introduce SURel, a novel dataset with human-annotated meaning shifts between general-language and domain-specific contexts.

Term Extraction

Distributional Interaction of Concreteness and Abstractness in Verb–Noun Subcategorisation

no code implementations WS 2019 Diego Frassinelli, Sabine Schulte im Walde

In recent years, both cognitive and computational research has provided empirical analyses of contextual co-occurrence of concrete and abstract words, partially resulting in inconsistent pictures.

Language Identification Object

Projecting Embeddings for Domain Adaption: Joint Modeling of Sentiment Analysis in Diverse Domains

1 code implementation COLING 2018 Jeremy Barnes, Roman Klinger, Sabine Schulte im Walde

Our analysis shows that our model performs comparably to state-of-the-art approaches on domains that are similar, while performing significantly better on highly divergent domains.

Domain Adaptation Sentiment Analysis +1

Fine-Grained Termhood Prediction for German Compound Terms Using Neural Networks

no code implementations COLING 2018 Anna Hätty, Sabine Schulte im Walde

Automatic term identification and investigating the understandability of terms in a specialized domain are often treated as two separate lines of research.

General Classification

Projecting Embeddings for Domain Adaptation: Joint Modeling of Sentiment Analysis in Diverse Domains

1 code implementation 12 Jun 2018 Jeremy Barnes, Roman Klinger, Sabine Schulte im Walde

Our analysis shows that our model performs comparably to state-of-the-art approaches on domains that are similar, while performing significantly better on highly divergent domains.

Domain Adaptation Sentiment Analysis +1

Combining Abstractness and Language-specific Theoretical Indicators for Detecting Non-Literal Usage of Estonian Particle Verbs

no code implementations NAACL 2018 Eleri Aedmaa, Maximilian Köper, Sabine Schulte im Walde

This paper presents two novel datasets and a random-forest classifier to automatically predict literal vs. non-literal language usage for a highly frequent type of multi-word expression in a low-resource language, i.e., Estonian.

Diachronic Usage Relatedness (DURel): A Framework for the Annotation of Lexical Semantic Change

no code implementations NAACL 2018 Dominik Schlechtweg, Sabine Schulte im Walde, Stefanie Eckmann

We propose a framework that extends synchronic polysemy annotation to diachronic changes in lexical meaning, to counteract the lack of resources for evaluating computational models of lexical semantic change.

Introducing two Vietnamese Datasets for Evaluating Semantic Models of (Dis-)Similarity and Relatedness

no code implementations NAACL 2018 Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu

We present two novel datasets for the low-resource language Vietnamese to assess models of semantic similarity: ViCon comprises pairs of synonyms and antonyms across word classes, thus offering data to distinguish between similarity and dissimilarity.

Semantic Similarity Semantic Textual Similarity +1

Distribution-based Prediction of the Degree of Grammaticalization for German Prepositions

no code implementations 14 Apr 2018 Dominik Schlechtweg, Sabine Schulte im Walde

We test the hypothesis that the degree of grammaticalization of German prepositions correlates with their corpus-based contextual dispersion measured by word entropy.

Assessing State-of-the-Art Sentiment Models on State-of-the-Art Sentiment Datasets

no code implementations WS 2017 Jeremy Barnes, Roman Klinger, Sabine Schulte im Walde

We show that Bi-LSTMs perform well across datasets and that both LSTMs and Bi-LSTMs are particularly good at fine-grained sentiment tasks (i.e., with more than two classes).

Sentiment Analysis Word Embeddings

German in Flux: Detecting Metaphoric Change via Word Entropy

1 code implementation CONLL 2017 Dominik Schlechtweg, Stefanie Eckmann, Enrico Santus, Sabine Schulte im Walde, Daniel Hole

This paper explores the information-theoretic measure entropy to detect metaphoric change, transferring ideas from hypernym detection to research on language change.

Factoring Ambiguity out of the Prediction of Compositionality for German Multi-Word Expressions

no code implementations WS 2017 Stefan Bott, Sabine Schulte im Walde

Ambiguity represents an obstacle for distributional semantic models (DSMs), which typically subsume the contexts of all word senses within one vector.

Clustering Machine Translation

Addressing Problems across Linguistic Levels in SMT: Combining Approaches to Model Morphology, Syntax and Lexical Choice

no code implementations EACL 2017 Marion Weller-Di Marco, Alexander Fraser, Sabine Schulte im Walde

Many errors in phrase-based SMT can be attributed to problems on three linguistic levels: morphological complexity in the target language, structural differences and lexical choice.

Word Alignment Word Sense Disambiguation

GhoSt-PV: A Representative Gold Standard of German Particle Verbs

no code implementations WS 2016 Stefan Bott, Nana Khvtisavrishvili, Max Kisselew, Sabine Schulte im Walde

German particle verbs represent a frequent type of multi-word-expression that forms a highly productive paradigm in the lexicon.

Integrating Distributional Lexical Contrast into Word Embeddings for Antonym-Synonym Distinction

no code implementations ACL 2016 Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu

We propose a novel vector representation that integrates lexical contrast into distributional vectors and strengthens the most salient features for determining degrees of word similarity.

Word Embeddings Word Similarity

GhoSt-NN: A Representative Gold Standard of German Noun-Noun Compounds

no code implementations LREC 2016 Sabine Schulte im Walde, Anna Hätty, Stefan Bott, Nana Khvtisavrishvili

This paper presents a novel gold standard of German noun-noun compounds (GhoSt-NN) including 868 compounds annotated with corpus frequencies of the compounds and their constituents, productivity and ambiguity of the constituents, semantic relations between the constituents, and compositionality ratings of compound-constituent pairs.

Fuzzy V-Measure - An Evaluation Method for Cluster Analyses of Ambiguous Data

no code implementations LREC 2014 Jason Utt, Sylvia Springorum, Maximilian Köper, Sabine Schulte im Walde

This paper discusses an extension of the V-measure (Rosenberg and Hirschberg, 2007), an entropy-based cluster evaluation metric.

Optimizing a Distributional Semantic Model for the Prediction of German Particle Verb Compositionality

no code implementations LREC 2014 Stefan Bott, Sabine Schulte im Walde

In the work presented here we assess the degree of compositionality of German Particle Verbs with a Distributional Semantics Model which only relies on word window information and has no access to syntactic information as such.

Lemmatization Semantic Composition

Automatic Extraction of Synonyms for German Particle Verbs from Parallel Data with Distributional Similarity as a Re-Ranking Feature

no code implementations LREC 2014 Moritz Wittmann, Marion Weller, Sabine Schulte im Walde

In our evaluation against a gold standard, we compare different pre-processing strategies (lemmatized vs. inflected forms) and introduce language model scores of synonym candidates in the context of the input particle verb as well as distributional similarity as additional re-ranking criteria.

Language Modeling Language Modelling +5

Association Norms of German Noun Compounds

no code implementations LREC 2012 Sabine Schulte im Walde, Susanne Borgwaldt, Ronny Jauch

This paper introduces association norms of German noun compounds as a lexical semantic resource for cognitive and computational linguistics research on compositionality.

Automatic classification of German an particle verbs

no code implementations LREC 2012 Sylvia Springorum, Sabine Schulte im Walde, Antje Roßdeutscher

A focus of the study was on the mutual profit of theoretical and empirical perspectives with respect to salient semantic properties of the an particle verbs: (a) how can we transform the theoretical insights into empirical, corpus-based features, (b) to what extent can we replicate the theoretical classification by a machine learning approach, and (c) can the computational analysis in turn deepen our insights into the semantic properties of the particle verbs?

Classification General Classification

