Search Results for author: Simon Hengchen

The computational study of lexical semantic change (LSC) has taken off in the past few years and we are seeing increasing interest in the field, from both computational sciences and linguistics.

Topic modelling discourse dynamics in historical newspapers

no code implementations • 20 Nov 2020 • Jani Marjanen, Elaine Zosa, Simon Hengchen, Lidia Pivovarova, Mikko Tolonen

This paper addresses methodological issues in diachronic data analysis for historical research.

Topic Models

An Unsupervised method for OCR Post-Correction and Spelling Normalisation for Finnish

1 code implementation • NoDaLiDa 2021 • Quan Duong, Mika Hämäläinen, Simon Hengchen

Historical corpora are known to contain errors introduced by OCR (optical character recognition) methods used in the digitization process, often said to be degrading the performance of NLP systems.

Machine Translation • NMT

SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection

2 code implementations • SEMEVAL 2020 • Dominik Schlechtweg, Barbara McGillivray, Simon Hengchen, Haim Dubossarsky, Nina Tahmasebi

Lexical Semantic Change detection, i.e., the task of identifying words that change meaning over time, is a very active research area, with applications in NLP, lexicography, and linguistics.

Change Detection

Dataset for Temporal Analysis of English-French Cognates

no code implementations • LREC 2020 • Esteban Frossard, Mickael Coustaty, Antoine Doucet, Adam Jatowt, Simon Hengchen

Languages change over time and, thanks to the abundance of digital corpora, their evolutionary analysis using computational techniques has recently gained much research attention.

From the Paft to the Fiiture: a Fully Automatic NMT and Word Embeddings Method for OCR Post-Correction

1 code implementation • RANLP 2019 • Mika Hämäläinen, Simon Hengchen

Many historical corpora suffer from errors introduced by the OCR (optical character recognition) methods used in the digitization process.

BIG-bench Machine Learning • Machine Translation

Time-Out: Temporal Referencing for Robust Modeling of Lexical Semantic Change

1 code implementation • ACL 2019 • Haim Dubossarsky, Simon Hengchen, Nina Tahmasebi, Dominik Schlechtweg

State-of-the-art models of lexical semantic change detection suffer from noise stemming from vector space alignment.

Change Detection

GASC: Genre-Aware Semantic Change for Ancient Greek

no code implementations • WS 2019 • Valerio Perrone, Marco Palma, Simon Hengchen, Alessandro Vatri, Jim Q. Smith, Barbara McGillivray

Word meaning changes over time, depending on linguistic and extra-linguistic factors.

Information Retrieval • Retrieval
