Search Results for author: Andrei M. Butnaru

Found 11 papers, 2 papers with code

BAM: A combination of deep and shallow models for German Dialect Identification.

no code implementations • WS 2019 • Andrei M. Butnaru

*This is a submission for the Third VarDial Evaluation Campaign* In this paper, we present a machine learning approach for the German Dialect Identification (GDI) Closed Shared Task of the DSL 2019 Challenge.

Dialect Identification

A Report on the Third VarDial Evaluation Campaign

no code implementations • WS 2019 • Marcos Zampieri, Shervin Malmasi, Yves Scherrer, Tanja Samardžić, Francis Tyers, Miikka Silfverberg, Natalia Klyueva, Tung-Le Pan, Chu-Ren Huang, Radu Tudor Ionescu, Andrei M. Butnaru, Tommi Jauhiainen

In this paper, we present the findings of the Third VarDial Evaluation Campaign organized as part of the sixth edition of the workshop on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects (VarDial), co-located with NAACL 2019.

Dialect Identification Morphological Analysis

Vector of Locally-Aggregated Word Embeddings (VLAWE): A Novel Document-level Representation

1 code implementation • NAACL 2019 • Radu Tudor Ionescu, Andrei M. Butnaru

The Vector of Locally-Aggregated Word Embeddings (VLAWE) representation of a document is then computed by accumulating the differences between each codeword vector and each word vector (from the document) associated to the respective codeword.
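The aggregation described above can be illustrated with a minimal sketch. It assumes the codewords are already available (for example, as cluster centroids of pre-trained word embeddings), that each word vector is associated with its nearest codeword by Euclidean distance, and that the difference is taken as word vector minus codeword; the function names and the toy vectors are hypothetical and are not taken from the authors' code.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(word_vec: Vector, codewords: Sequence[Vector]) -> int:
    """Index of the codeword closest to the given word vector."""
    return min(range(len(codewords)),
               key=lambda i: squared_distance(word_vec, codewords[i]))


def vlawe(word_vectors: Sequence[Vector], codewords: Sequence[Vector]) -> List[float]:
    """Accumulate, per codeword, the residuals of the word vectors assigned to it,
    then concatenate the per-codeword sums into one document vector."""
    dim = len(codewords[0])
    residuals = [[0.0] * dim for _ in codewords]
    for vec in word_vectors:
        k = nearest_codeword(vec, codewords)
        for j in range(dim):
            # Assumed sign convention: word vector minus codeword.
            residuals[k][j] += vec[j] - codewords[k][j]
    return [value for row in residuals for value in row]


# Toy data (hypothetical): two 2-dimensional codewords, three word vectors.
codewords = [[0.0, 0.0], [10.0, 10.0]]
document = [[1.0, 0.0], [0.0, 2.0], [9.0, 10.0]]
representation = vlawe(document, codewords)
print(representation)  # [1.0, 2.0, -1.0, 0.0]
```

The resulting vector has length (number of codewords) × (embedding dimension). Any normalization or classifier applied afterwards is outside the scope of this sketch.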

Ranked #1 on Sentiment Analysis on MR

Multi-Label Text Classification Sentiment Analysis +3

MOROCO: The Moldavian and Romanian Dialectal Corpus

1 code implementation • ACL 2019 • Andrei M. Butnaru, Radu Tudor Ionescu

In this work, we introduce the MOldavian and ROmanian Dialectal COrpus (MOROCO), which is freely available for download at https://github.com/butnaruandrei/MOROCO.

Cultural Vocal Bursts Intensity Prediction

Transductive Learning with String Kernels for Cross-Domain Text Classification

no code implementations • 2 Nov 2018 • Radu Tudor Ionescu, Andrei M. Butnaru

Although classifiers for a target domain can be trained on labeled text data from a related source domain, the accuracy of such classifiers is usually lower in the cross-domain setting.

Cross-Domain Text Classification General Classification +3

Improving the results of string kernels in sentiment analysis and Arabic dialect identification by adapting them to your test set

no code implementations • EMNLP 2018 • Radu Tudor Ionescu, Andrei M. Butnaru

Instead, we use the labels predicted by the classifier in the first training iteration.

Dialect Identification General Classification +5

UnibucKernel Reloaded: First Place in Arabic Dialect Identification for the Second Year in a Row

no code implementations • COLING 2018 • Andrei M. Butnaru, Radu Tudor Ionescu

Furthermore, our top macro-F1 score (58.92%) is significantly better than the second best score (57.59%) in the 2018 ADI Shared Task, according to the statistical significance test performed by the organizers.

Dialect Identification

Automated essay scoring with string kernels and word embeddings

no code implementations • ACL 2018 • Mădălina Cozma, Andrei M. Butnaru, Radu Tudor Ionescu

In this work, we present an approach based on combining string kernels and word embeddings for automatic essay scoring.

Ranked #3 on Automated Essay Scoring on ASAP

Automated Essay Scoring Dialect Identification +5

UnibucKernel: A kernel-based learning method for complex word identification

no code implementations • WS 2018 • Andrei M. Butnaru, Radu Tudor Ionescu

In this paper, we present a kernel-based learning approach for the 2018 Complex Word Identification (CWI) Shared Task.

Binary Classification Complex Word Identification +2

ShotgunWSD: An unsupervised algorithm for global word sense disambiguation inspired by DNA sequencing

no code implementations • EACL 2017 • Andrei M. Butnaru, Radu Tudor Ionescu, Florentina Hristea

In this paper, we present a novel unsupervised algorithm for word sense disambiguation (WSD) at the document level.

Ranked #8 on Word Sense Disambiguation on SemEval 2007 Task 7

Common Sense Reasoning Word Sense Disambiguation

From Image to Text Classification: A Novel Approach based on Clustering Word Embeddings

no code implementations • 25 Jul 2017 • Andrei M. Butnaru, Radu Tudor Ionescu

In this paper, we propose a novel approach for text classification based on clustering word embeddings, inspired by the bag of visual words model, which is widely used in computer vision.
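By analogy with the bag of visual words model, one plausible reading of this idea is to represent a document as a histogram over clusters of word embeddings. The sketch below assumes the cluster centroids have already been computed (for instance with k-means), assigns each word vector to its nearest centroid, and normalizes the counts to sum to one; these choices, the function names, and the toy data are illustrative assumptions rather than the paper's actual pipeline.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_cluster(word_vec: Vector, centroids: Sequence[Vector]) -> int:
    """Index of the centroid closest (squared Euclidean distance) to word_vec."""
    def dist(i: int) -> float:
        return sum((x - y) ** 2 for x, y in zip(word_vec, centroids[i]))
    return min(range(len(centroids)), key=dist)


def cluster_histogram(word_vectors: Sequence[Vector],
                      centroids: Sequence[Vector]) -> List[float]:
    """Count how many word vectors fall into each cluster and normalize."""
    counts = [0] * len(centroids)
    for vec in word_vectors:
        counts[nearest_cluster(vec, centroids)] += 1
    total = sum(counts)
    if total == 0:
        # An empty document yields an all-zero histogram.
        return [0.0] * len(centroids)
    return [c / total for c in counts]


# Toy data (hypothetical): two centroids, four word vectors.
centroids = [[0.0, 0.0], [10.0, 10.0]]
document = [[1.0, 0.0], [0.0, 2.0], [9.0, 10.0], [11.0, 9.0]]
histogram = cluster_histogram(document, centroids)
print(histogram)  # [0.5, 0.5]
```

A histogram of this kind has one entry per cluster and could be passed to any standard classifier.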

Clustering General Classification +3
