no code implementations • WS 2019 • Andrei M. Butnaru
*This is a submission for the Third VarDial Evaluation Campaign* In this paper, we present a machine learning approach for the German Dialect Identification (GDI) Closed Shared Task of the DSL 2019 Challenge.
no code implementations • WS 2019 • Marcos Zampieri, Shervin Malmasi, Yves Scherrer, Tanja Samardžić, Francis Tyers, Miikka Silfverberg, Natalia Klyueva, Tung-Le Pan, Chu-Ren Huang, Radu Tudor Ionescu, Andrei M. Butnaru, Tommi Jauhiainen
In this paper, we present the findings of the Third VarDial Evaluation Campaign organized as part of the sixth edition of the workshop on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects (VarDial), co-located with NAACL 2019.
1 code implementation • NAACL 2019 • Radu Tudor Ionescu, Andrei M. Butnaru
The Vector of Locally-Aggregated Word Embeddings (VLAWE) representation of a document is then computed by accumulating the differences between each codeword vector and each word vector (from the document) associated to the respective codeword.
Ranked #1 on Sentiment Analysis on MR
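The VLAWE excerpt above describes accumulating, per codeword, the differences between that codeword and the word vectors associated with it. A minimal sketch of that accumulation follows. It assumes that each word vector is associated with its nearest codeword by Euclidean distance, that the residual is taken as word vector minus codeword, and that the per-codeword sums are concatenated into one document vector. The function names (`vlawe`, `nearest_codeword`) are hypothetical, and any normalization step is omitted; this is not the authors' released implementation.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(word_vec: Vector, codewords: Sequence[Vector]) -> int:
    """Index of the codeword closest to word_vec (assumed association rule)."""
    return min(
        range(len(codewords)),
        key=lambda i: squared_distance(word_vec, codewords[i]),
    )


def vlawe(word_vectors: Sequence[Vector], codewords: Sequence[Vector]) -> List[float]:
    """Accumulate residuals (word vector - codeword) per codeword and
    concatenate them into a single vector of length k * dim."""
    dim = len(codewords[0])
    residuals = [[0.0] * dim for _ in codewords]
    for w in word_vectors:
        k = nearest_codeword(w, codewords)
        for d in range(dim):
            residuals[k][d] += w[d] - codewords[k][d]
    # Flatten the k residual vectors into one document representation.
    return [value for residual in residuals for value in residual]


# Toy example: two 2-dimensional codewords and a three-word document.
toy_codewords = [[0.0, 0.0], [10.0, 10.0]]
toy_document = [[1.0, 0.0], [0.0, 2.0], [9.0, 10.0]]
toy_representation = vlawe(toy_document, toy_codewords)
```

In this sketch the codewords would typically come from clustering pre-trained word embeddings; here they are fixed toy values so the accumulation step can be followed by hand.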
1 code implementation • ACL 2019 • Andrei M. Butnaru, Radu Tudor Ionescu
In this work, we introduce the MOldavian and ROmanian Dialectal COrpus (MOROCO), which is freely available for download at https://github.com/butnaruandrei/MOROCO.
no code implementations • 2 Nov 2018 • Radu Tudor Ionescu, Andrei M. Butnaru
Although classifiers for a target domain can be trained on labeled text data from a related source domain, the accuracy of such classifiers is usually lower in the cross-domain setting.
no code implementations • EMNLP 2018 • Radu Tudor Ionescu, Andrei M. Butnaru
Instead, we use the labels predicted by the classifier in the first training iteration.
no code implementations • COLING 2018 • Andrei M. Butnaru, Radu Tudor Ionescu
Furthermore, our top macro-F1 score (58.92%) is significantly better than the second best score (57.59%) in the 2018 ADI Shared Task, according to the statistical significance test performed by the organizers.
no code implementations • ACL 2018 • Mădălina Cozma, Andrei M. Butnaru, Radu Tudor Ionescu
In this work, we present an approach based on combining string kernels and word embeddings for automatic essay scoring.
Ranked #3 on Automated Essay Scoring on ASAP
no code implementations • WS 2018 • Andrei M. Butnaru, Radu Tudor Ionescu
In this paper, we present a kernel-based learning approach for the 2018 Complex Word Identification (CWI) Shared Task.
no code implementations • EACL 2017 • Andrei M. Butnaru, Radu Tudor Ionescu, Florentina Hristea
In this paper, we present a novel unsupervised algorithm for word sense disambiguation (WSD) at the document level.
Ranked #8 on Word Sense Disambiguation on SemEval 2007 Task 7
no code implementations • 25 Jul 2017 • Andrei M. Butnaru, Radu Tudor Ionescu
In this paper, we propose a novel approach for text classification based on clustering word embeddings, inspired by the bag of visual words model, which is widely used in computer vision.
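The idea in the excerpt above, borrowing the bag of visual words model for text, can be illustrated with a small sketch: word embeddings are grouped into clusters, and a document is represented by how many of its words fall into each cluster. The sketch assumes the cluster centroids are already available (for example from k-means over pre-trained embeddings) and that each word is assigned to its nearest centroid by Euclidean distance. The function names are hypothetical and the details may differ from the paper.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_centroid(word_vec: Vector, centroids: Sequence[Vector]) -> int:
    """Index of the centroid closest to word_vec in Euclidean distance."""
    def squared_distance(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(
        range(len(centroids)),
        key=lambda i: squared_distance(word_vec, centroids[i]),
    )


def bag_of_embedding_clusters(
    word_vectors: Sequence[Vector], centroids: Sequence[Vector]
) -> List[int]:
    """Histogram with one bin per cluster, counting the words of a document
    assigned to each cluster (the text analogue of a visual-word histogram)."""
    histogram = [0] * len(centroids)
    for w in word_vectors:
        histogram[nearest_centroid(w, centroids)] += 1
    return histogram


# Toy example: two centroids and a three-word document.
toy_centroids = [[0.0, 0.0], [10.0, 10.0]]
toy_words = [[1.0, 0.0], [0.0, 2.0], [9.0, 10.0]]
toy_histogram = bag_of_embedding_clusters(toy_words, toy_centroids)
```

The resulting histogram has a fixed length equal to the number of clusters, so documents of different lengths map to vectors of the same size that a standard classifier can consume.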