Search Results for author: Andrei M. Butnaru

Found 11 papers, 2 papers with code

BAM: A combination of deep and shallow models for German Dialect Identification.

no code implementations WS 2019 Andrei M. Butnaru

*This is a submission for the Third VarDial Evaluation Campaign* In this paper, we present a machine learning approach for the German Dialect Identification (GDI) Closed Shared Task of the DSL 2019 Challenge.

Dialect Identification

A Report on the Third VarDial Evaluation Campaign

no code implementations WS 2019 Marcos Zampieri, Shervin Malmasi, Yves Scherrer, Tanja Samardžić, Francis Tyers, Miikka Silfverberg, Natalia Klyueva, Tung-Le Pan, Chu-Ren Huang, Radu Tudor Ionescu, Andrei M. Butnaru, Tommi Jauhiainen

In this paper, we present the findings of the Third VarDial Evaluation Campaign organized as part of the sixth edition of the workshop on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects (VarDial), co-located with NAACL 2019.

Dialect Identification Morphological Analysis

Vector of Locally-Aggregated Word Embeddings (VLAWE): A Novel Document-level Representation

1 code implementation NAACL 2019 Radu Tudor Ionescu, Andrei M. Butnaru

The Vector of Locally-Aggregated Word Embeddings (VLAWE) representation of a document is then computed by accumulating the differences between each codeword vector and each word vector (from the document) associated to the respective codeword.

Multi-Label Text Classification Sentiment Analysis +3
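
The aggregation step described in the VLAWE abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `vlawe`, the nearest-codeword assignment by Euclidean distance, and the sign convention (word vector minus codeword) are assumptions made for illustration, and the codewords are assumed to come from a separate clustering step (for example k-means over pre-trained word embeddings) that is not shown. Any normalization of the final vector is also left out.

```python
import numpy as np


def vlawe(word_vectors, codewords):
    """Sketch of a VLAWE-style document vector (hypothetical helper).

    word_vectors: (n_words, dim) embeddings of the words in one document.
    codewords:    (k, dim) cluster centers, assumed to be precomputed.
    Returns a flat vector of length k * dim.
    """
    word_vectors = np.asarray(word_vectors, dtype=float)
    codewords = np.asarray(codewords, dtype=float)

    # Squared Euclidean distance from every word vector to every codeword.
    diffs = word_vectors[:, None, :] - codewords[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)

    # Assumption: each word is associated with its nearest codeword.
    nearest = sq_dists.argmin(axis=1)

    # Accumulate, per codeword, the differences between the word vectors
    # assigned to it and the codeword itself.
    residuals = np.zeros_like(codewords)
    for word_index, codeword_index in enumerate(nearest):
        residuals[codeword_index] += (
            word_vectors[word_index] - codewords[codeword_index]
        )

    # Concatenate the per-codeword residual sums into one document vector.
    return residuals.ravel()
```

Calling `vlawe(doc_embeddings, codebook)` with a document of any length yields a fixed-size vector of `k * dim` values, which is what makes the representation usable as input to a standard classifier.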

MOROCO: The Moldavian and Romanian Dialectal Corpus

1 code implementation ACL 2019 Andrei M. Butnaru, Radu Tudor Ionescu

In this work, we introduce the MOldavian and ROmanian Dialectal COrpus (MOROCO), which is freely available for download at https://github.com/butnaruandrei/MOROCO.

Cultural Vocal Bursts Intensity Prediction

Transductive Learning with String Kernels for Cross-Domain Text Classification

no code implementations 2 Nov 2018 Radu Tudor Ionescu, Andrei M. Butnaru

Although classifiers for a target domain can be trained on labeled text data from a related source domain, the accuracy of such classifiers is usually lower in the cross-domain setting.

Cross-Domain Text Classification General Classification +3

UnibucKernel Reloaded: First Place in Arabic Dialect Identification for the Second Year in a Row

no code implementations COLING 2018 Andrei M. Butnaru, Radu Tudor Ionescu

Furthermore, our top macro-F1 score (58.92%) is significantly better than the second best score (57.59%) in the 2018 ADI Shared Task, according to the statistical significance test performed by the organizers.

Dialect Identification

From Image to Text Classification: A Novel Approach based on Clustering Word Embeddings

no code implementations 25 Jul 2017 Andrei M. Butnaru, Radu Tudor Ionescu

In this paper, we propose a novel approach for text classification based on clustering word embeddings, inspired by the bag of visual words model, which is widely used in computer vision.

Clustering General Classification +3
