Search Results for author: Claudia Borg

Found 13 papers, 4 papers with code

Crowd-sourcing evaluation of automatically acquired, morphologically related word groupings

no code implementations • LREC 2014 • Claudia Borg, Albert Gatt

The automatic discovery and clustering of morphologically related words is an important problem with several practical applications.

Clustering

Morphological Analysis for the Maltese Language: The Challenges of a Hybrid System

no code implementations • WS 2017 • Claudia Borg, Albert Gatt

In particular, we analyse a dataset of morphologically related word clusters to evaluate the difference in results for concatenative and nonconcatenative clusters.

Clustering Morphological Analysis

Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions

1 code implementation • LREC 2018 • Albert Gatt, Marc Tanti, Adrian Muscat, Patrizia Paggio, Reuben A. Farrugia, Claudia Borg, Kenneth P. Camilleri, Mike Rosner, Lonneke van der Plas

To gain a better understanding of the variation we find in face description and the possible issues that this may raise, we also conducted an annotation study on a subset of the corpus.

CUNI-Malta system at SIGMORPHON 2019 Shared Task on Morphological Analysis and Lemmatization in context: Operation-based word formation

no code implementations • WS 2019 • Ronald Cardenas, Claudia Borg, Daniel Zeman

This paper presents the submission by the Charles University-University of Malta team to the SIGMORPHON 2019 Shared Task on Morphological Analysis and Lemmatization in context.

Lemmatization Morphological Analysis +1

Creating Expert Knowledge by Relying on Language Learners: a Generic Approach for Mass-Producing Language Resources by Combining Implicit Crowdsourcing and Language Learning

no code implementations • LREC 2020 • Lionel Nicolas, Verena Lyding, Claudia Borg, Corina Forascu, Karën Fort, Katerina Zdravkova, Iztok Kosem, Jaka Čibej, Špela Arhar Holdt, Alice Millour, Alexander König, Christos Rodosthenous, Federico Sangati, Umair ul Hassan, Anisia Katinskaia, Anabela Barreiro, Lavinia Aparaschivei, Yaakov HaCohen-Kerner

We introduce in this paper a generic approach to combine implicit crowdsourcing and language learning in order to mass-produce language resources (LRs) for any language for which a crowd of language learners can be involved.

MASRI-HEADSET: A Maltese Corpus for Speech Recognition

no code implementations • LREC 2020 • Carlos Mena, Albert Gatt, Andrea DeMarco, Claudia Borg, Lonneke van der Plas, Amanda Muscat, Ian Padovani

Maltese, the national language of Malta, is spoken by approximately 500,000 people.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

On the Language-specificity of Multilingual BERT and the Impact of Fine-tuning

1 code implementation • EMNLP (BlackboxNLP) 2021 • Marc Tanti, Lonneke van der Plas, Claudia Borg, Albert Gatt

Recent work has shown evidence that the knowledge acquired by multilingual BERT (mBERT) has two components: a language-specific and a language-neutral one.

Language Identification Natural Language Inference +3

Analysis of Data Augmentation Methods for Low-Resource Maltese ASR

no code implementations • 15 Nov 2021 • Andrea DeMarco, Carlos Mena, Albert Gatt, Claudia Borg, Aiden Williams, Lonneke van der Plas

Recent years have seen an increased interest in the computational speech processing of Maltese, but resources remain sparse.

Data Augmentation Language Modelling +2

Pre-training Data Quality and Quantity for a Low-Resource Language: New Corpus and BERT Models for Maltese

1 code implementation • DeepLo 2022 • Kurt Micallef, Albert Gatt, Marc Tanti, Lonneke van der Plas, Claudia Borg

We also present a newly created corpus for Maltese, and determine the effect that the pre-training data size and domain have on the downstream performance.

Cross-Lingual Transfer Dependency Parsing +4

Face2Text revisited: Improved data set and baseline results

no code implementations • PVLAM (LREC) 2022 • Marc Tanti, Shaun Abdilla, Adrian Muscat, Claudia Borg, Reuben A. Farrugia, Albert Gatt

To encourage the development of more human-focused descriptions, we developed a new data set of facial descriptions based on the CelebA image data set.

Transfer Learning

Cross-Lingual Transfer from Related Languages: Treating Low-Resource Maltese as Multilingual Code-Switching

1 code implementation • 30 Jan 2024 • Kurt Micallef, Nizar Habash, Claudia Borg, Fadhl Eryani, Houda Bouamor

Although multilingual language models exhibit impressive cross-lingual transfer capabilities on unseen languages, the performance on downstream tasks is impacted when there is a script disparity with the languages used in the multilingual model's pre-training data.

Cross-Lingual Transfer Transliteration

National Language Technology Platform (NLTP): overall view

no code implementations • EAMT 2022 • Artūrs Vasiļevskis, Jānis Ziediņš, Marko Tadić, Željka Motika, Mark Fishel, Hrafn Loftsson, Jón Guðnason, Claudia Borg, Keith Cortis, Judie Attard, Donatienne Spiteri

The work in progress on the CEF Action National Language Technology Platform (NLTP) is presented.

National Language Technology Platform for Public Administration

no code implementations • TDLE (LREC) 2022 • Marko Tadić, Daša Farkaš, Matea Filko, Artūrs Vasiļevskis, Andrejs Vasiļjevs, Jānis Ziediņš, Željka Motika, Mark Fishel, Hrafn Loftsson, Jón Guðnason, Claudia Borg, Keith Cortis, Judie Attard, Donatienne Spiteri

This article presents the work in progress on a collaborative project of several European countries to develop the National Language Technology Platform (NLTP).
