Search Results for author: Simon Hengchen

Found 12 papers, 6 papers with code

Lexical semantic change for Ancient Greek and Latin

1 code implementation 22 Jan 2021 Valerio Perrone, Simon Hengchen, Marco Palma, Alessandro Vatri, Jim Q. Smith, Barbara McGillivray

In this chapter we build on GASC, a recent computational approach to semantic change based on a dynamic Bayesian mixture model.

Challenges for Computational Lexical Semantic Change

no code implementations 19 Jan 2021 Simon Hengchen, Nina Tahmasebi, Dominik Schlechtweg, Haim Dubossarsky

The computational study of lexical semantic change (LSC) has taken off in the past few years and we are seeing increasing interest in the field, from both computational sciences and linguistics.

Topic modelling discourse dynamics in historical newspapers

no code implementations 20 Nov 2020 Jani Marjanen, Elaine Zosa, Simon Hengchen, Lidia Pivovarova, Mikko Tolonen

This paper addresses methodological issues in diachronic data analysis for historical research.

Topic Models

An Unsupervised method for OCR Post-Correction and Spelling Normalisation for Finnish

1 code implementation NoDaLiDa 2021 Quan Duong, Mika Hämäläinen, Simon Hengchen

Historical corpora are known to contain errors introduced by OCR (optical character recognition) methods used in the digitization process, often said to be degrading the performance of NLP systems.

Machine Translation, NMT, +3

SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection

2 code implementations SEMEVAL 2020 Dominik Schlechtweg, Barbara McGillivray, Simon Hengchen, Haim Dubossarsky, Nina Tahmasebi

Lexical Semantic Change detection, i.e., the task of identifying words that change meaning over time, is a very active research area, with applications in NLP, lexicography, and linguistics.

Change Detection

Dataset for Temporal Analysis of English-French Cognates

no code implementations LREC 2020 Esteban Frossard, Mickael Coustaty, Antoine Doucet, Adam Jatowt, Simon Hengchen

Languages change over time and, thanks to the abundance of digital corpora, their evolutionary analysis using computational techniques has recently gained much research attention.

From the Paft to the Fiiture: a Fully Automatic NMT and Word Embeddings Method for OCR Post-Correction

1 code implementation RANLP 2019 Mika Hämäläinen, Simon Hengchen

A great deal of historical corpora suffer from errors introduced by the OCR (optical character recognition) methods used in the digitization process.

BIG-bench Machine Learning, Machine Translation, +5
