1 code implementation • 28 Nov 2021 • Zein Shaheen, Gerhard Wohlgenannt, Dmitry Mouromtsev
Also, gradual unfreezing of the pre-trained model's layers during training results in a relative improvement of 38-45% for French and 58-70% for German.
1 code implementation • 24 Oct 2020 • Zein Shaheen, Gerhard Wohlgenannt, Erwin Filtz
Large multi-label text classification is a challenging Natural Language Processing (NLP) problem that is concerned with text classification for datasets with thousands of labels.
1 code implementation • 5 May 2020 • Zein Shaheen, Gerhard Wohlgenannt, Bassel Zaity, Dmitry Mouromtsev, Vadim Pak
Generating coherent, grammatically correct, and meaningful text is very challenging; however, it is crucial to many modern NLP systems.
no code implementations • 19 Jul 2019 • Gerhard Wohlgenannt, Dmitry Mouromtsev, Dmitry Pavlov, Yury Emelyanov, Alexey Morozov
With the growing number and size of Linked Data datasets, it is crucial to make the data accessible and useful for users without knowledge of formal query languages.
1 code implementation • 8 Apr 2019 • Ponrudee Netisopakul, Gerhard Wohlgenannt, Aleksei Pulich
In this work, we create three Thai word similarity datasets by translating and re-rating the popular WordSim-353, SimLex-999 and SemEval-2017-Task-2 datasets.
no code implementations • 7 Mar 2019 • Gerhard Wohlgenannt, Ariadna Barinova, Dmitry Ilvovsky, Ekaterina Chernyak
Among the contributions is the evaluation of various word embedding techniques on the different task types, with the finding that even embeddings trained on small corpora perform well, for example, on the word intrusion task.
no code implementations • 4 Mar 2019 • Gerhard Wohlgenannt, Nikolay Klimov, Dmitry Mouromtsev, Daniil Razdyakonov, Dmitry Pavlov, Yury Emelyanov
One of the big challenges in Linked Data consumption is to create visual and natural language interfaces to the data usable for non-technical users.
1 code implementation • 4 Mar 2019 • Gerhard Wohlgenannt, Artemii Babushkin, Denis Romashov, Igor Ukrainets, Anton Maskaykin, Ilya Shutov
In this paper, we present Russian language datasets in the digital humanities domain for the evaluation of word embedding techniques or similar language modeling and feature learning algorithms.
no code implementations • 4 Mar 2019 • Gerhard Wohlgenannt, Ekaterina Chernyak, Dmitry Ilvovsky, Ariadna Barinova, Dmitry Mouromtsev
In this research, we manually create high-quality datasets in the digital humanities domain for the evaluation of language models, specifically word embedding models.
no code implementations • WS 2016 • Gerhard Wohlgenannt, Ekaterina Chernyak, Dmitry Ilvovsky
In this paper, a social network is extracted from a literary text.