Search Results for author: Surangika Ranathunga

Found 45 papers, 10 papers with code

Classification of Code-Mixed Text Using Capsule Networks

no code implementations RANLP 2021 Shanaka Chathuranga, Surangika Ranathunga

A major challenge in analysing social media data belonging to languages that use non-English script is its code-mixed nature.

Classification XLM-R

BERTifying Sinhala - A Comprehensive Analysis of Pre-trained Language Models for Sinhala Text Classification

1 code implementation LREC 2022 Vinura Dhananjaya, Piyumal Demotte, Surangika Ranathunga, Sanath Jayasena

We test on a set of different Sinhala text classification tasks and our analysis shows that out of the pre-trained multilingual models that include Sinhala (XLM-R, LaBSE, and LASER), XLM-R is the best model by far for Sinhala text classification.

text-classification Text Classification +1

Dataset and Baseline for Automatic Student Feedback Analysis

no code implementations LREC 2022 Missaka Herath, Kushan Chamindu, Hashan Maduwantha, Surangika Ranathunga

In this paper, we present a student feedback corpus, which contains 3000 instances of feedback written by university students.

Aspect Extraction Sentence +1

Beyond Vanilla Fine-Tuning: Leveraging Multistage, Multilingual, and Domain-Specific Methods for Low-Resource Machine Translation

no code implementations 28 Mar 2025 Sarubi Thillainathan, Songchen Yuan, En-Shiun Annie Lee, Sanath Jayasena, Surangika Ranathunga

Our experiments reveal that these approaches enhance translation performance by an average of +1.47 bilingual evaluation understudy (BLEU) score compared to the standard single-stage fine-tuning baseline across all translation directions.

Low Resource NMT NMT +2

Linguistic Entity Masking to Improve Cross-Lingual Representation of Multilingual Language Models for Low-Resource Languages

no code implementations 10 Jan 2025 Aloka Fernando, Surangika Ranathunga

In this paper, we introduce a novel masking strategy, Linguistic Entity Masking (LEM) to be used in the continual pre-training step to further improve the cross-lingual representations of existing multiPLMs.

Language Modelling Sentiment Analysis

Sinhala Transliteration: A Comparative Analysis Between Rule-based and Seq2Seq Approaches

1 code implementation 31 Dec 2024 Yomal De Mel, Kasun Wickramasinghe, Nisansa de Silva, Surangika Ranathunga

We propose two methods to address this problem: Our baseline is a rule-based method, which is then compared against our second method where we approach the transliteration problem as a sequence-to-sequence task akin to the established Neural Machine Translation (NMT) task.

Decoder NMT +1

Large Language Models for Ingredient Substitution in Food Recipes using Supervised Fine-tuning and Direct Preference Optimization

1 code implementation 6 Dec 2024 Thevin Senath, Kumuthu Athukorala, Ransika Costa, Surangika Ranathunga, Rishemjit Kaur

Given that LLMs have barely been used for this task, we carry out an extensive set of experiments to determine the best LLM, prompt, and fine-tuning setup.

Multi-Task Learning

Transfer Learning on Transformers for Building Energy Consumption Forecasting -- A Comparative Study

no code implementations 18 Oct 2024 Robert Spencer, Surangika Ranathunga, Mikael Boulic, Andries van Heerden, Teo Susnjak

This study investigates the application of Transfer Learning (TL) on Transformer architectures to enhance building energy consumption forecasting.

Time Series Forecasting Transfer Learning

Shoulders of Giants: A Look at the Degree and Utility of Openness in NLP Research

no code implementations 10 Jun 2024 Surangika Ranathunga, Nisansa de Silva, Dilith Jayakody, Aloka Fernando

We analysed a sample of NLP research papers archived in ACL Anthology as an attempt to quantify the degree of openness and the benefit of such an open culture in the NLP community.

Automating Research Synthesis with Domain-Specific Large Language Model Fine-Tuning

no code implementations 8 Apr 2024 Teo Susnjak, Peter Hwang, Napoleon H. Reyes, Andre L. C. Barczak, Timothy R. McIntosh, Surangika Ranathunga

This study broadens the appeal of AI-enhanced tools across various academic and research fields, setting a new standard for conducting comprehensive and accurate literature reviews with more efficiency in the face of ever-increasing volumes of academic studies.

Hallucination Language Modeling +2

Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation

no code implementations 5 Apr 2024 Tong Su, Xin Peng, Sarubi Thillainathan, David Guzmán, Surangika Ranathunga, En-Shiun Annie Lee

Parameter-efficient fine-tuning (PEFT) methods are increasingly vital in adapting large-scale pre-trained language models for diverse tasks, offering a balance between adaptability and computational efficiency.

Computational Efficiency Machine Translation +3

Harnessing the power of LLMs for normative reasoning in MASs

no code implementations 25 Mar 2024 Bastin Tony Roy Savarimuthu, Surangika Ranathunga, Stephen Cranefield

This paper thus aims to foster collaboration between MAS, NLP and LLM researchers in order to advance the field of normative agents.

Decision Making

Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora

1 code implementation 12 Feb 2024 Surangika Ranathunga, Nisansa de Silva, Menan Velayuthan, Aloka Fernando, Charitha Rathnayake

We conducted a detailed analysis on the quality of web-mined corpora for two low-resource languages (making three language pairs, English-Sinhala, English-Tamil and Sinhala-Tamil).

Machine Translation NMT +1

Leveraging Auxiliary Domain Parallel Data in Intermediate Task Fine-tuning for Low-resource Translation

1 code implementation 2 Jun 2023 Shravan Nayak, Surangika Ranathunga, Sarubi Thillainathan, Rikki Hung, Anthony Rinaldi, Yining Wang, Jonah Mackey, Andrew Ho, En-Shiun Annie Lee

In this paper, we show that intermediate-task fine-tuning (ITFT) of PMSS models is extremely beneficial for domain-specific NMT, especially when target domain data is limited/unavailable and the considered languages are missing or under-represented in the PMSS model.

NMT

Some Languages are More Equal than Others: Probing Deeper into the Linguistic Disparity in the NLP World

1 code implementation 16 Oct 2022 Surangika Ranathunga, Nisansa de Silva

Using an existing language categorisation based on speaker population and vitality, we analyse the distribution of language data resources, amount of NLP/CL research, inclusion in multilingual web-based platforms and the inclusion in pre-trained multilingual models.

BERTifying Sinhala -- A Comprehensive Analysis of Pre-trained Language Models for Sinhala Text Classification

no code implementations 16 Aug 2022 Vinura Dhananjaya, Piyumal Demotte, Surangika Ranathunga, Sanath Jayasena

We test on a set of different Sinhala text classification tasks and our analysis shows that out of the pre-trained multilingual models that include Sinhala (XLM-R, LaBSE, and LASER), XLM-R is the best model by far for Sinhala text classification.

text-classification Text Classification +1

Data Augmentation to Address Out-of-Vocabulary Problem in Low-Resource Sinhala-English Neural Machine Translation

no code implementations 18 May 2022 Aloka Fernando, Surangika Ranathunga

However, existing DA techniques have addressed only one of these OOV types and limit to considering either syntactic constraints or semantic constraints.

Data Augmentation Machine Translation +1

vue4logs -- Automatic Structuring of Heterogeneous Computer System Logs

no code implementations 14 Feb 2022 Isuru Boyagane, Oshadha Katulanda, Surangika Ranathunga, Srinath Perera

Computer system log data is commonly used in system monitoring, performance characteristic investigation, workflow modeling and anomaly detection.

Anomaly Detection Information Retrieval +1

Dual-State Capsule Networks for Text Classification

no code implementations 10 Sep 2021 Piyumal Demotte, Surangika Ranathunga

Thus, they could be considered as a viable alternative for text classification for languages that do not have pre-trained contextual embedding models.

Language Modeling Language Modelling +3

Neural Machine Translation for Low-Resource Languages: A Survey

no code implementations 29 Jun 2021 Surangika Ranathunga, En-Shiun Annie Lee, Marjana Prifti Skenduli, Ravi Shekhar, Mehreen Alam, Rishemjit Kaur

Neural Machine Translation (NMT) has seen a tremendous spurt of growth in less than ten years, and has already entered a mature phase.

Machine Translation NMT +2

Exploiting Parallel Corpora to Improve Multilingual Embedding based Document and Sentence Alignment

no code implementations 12 Jun 2021 Dilan Sachintha, Lakmali Piyarathna, Charith Rajitha, Surangika Ranathunga

This paper presents a weighting mechanism that makes use of available small-scale parallel corpora to improve the performance of multilingual sentence representations on document and sentence alignment.

Sentence

Sentiment Analysis for Sinhala Language using Deep Learning Techniques

1 code implementation 14 Nov 2020 Lahiru Senevirathne, Piyumal Demotte, Binod Karunanayake, Udyogi Munasinghe, Surangika Ranathunga

For sentiment analysis, there exist only two previous studies with deep learning approaches, which focused only on document-level sentiment analysis for the binary case.

Deep Learning Sentiment Analysis

Data Augmentation and Terminology Integration for Domain-Specific Sinhala-English-Tamil Statistical Machine Translation

no code implementations 5 Nov 2020 Aloka Fernando, Surangika Ranathunga, Gihan Dias

This paper focuses on data augmentation techniques where bilingual lexicon terms are expanded based on case-markers with the objective of generating new words, to be used in Statistical Machine Translation (SMT).

Data Augmentation Machine Translation +1

Multi-lingual Mathematical Word Problem Generation using Long Short Term Memory Networks with Enhanced Input Features

no code implementations LREC 2020 Vijini Liyanage, Surangika Ranathunga

A Mathematical Word Problem (MWP) differs from a general textual representation due to the fact that it is comprised of numerical quantities and units, in addition to text.

POS TAG +1

Word Embedding Evaluation for Sinhala

no code implementations LREC 2020 Dimuthu Lakmal, Surangika Ranathunga, Saman Peramuna, Indu Herath

This paper presents the first ever comprehensive evaluation of different types of word embeddings for Sinhala language.

Part-Of-Speech Tagging POS +3

A Multi-language Platform for Generating Algebraic Mathematical Word Problems

no code implementations 19 Nov 2019 Vijini Liyanage, Surangika Ranathunga

Existing approaches for automatically generating mathematical word problems are deprived of customizability and creativity due to the inherent nature of template-based mechanisms they employ.

POS Text Generation

Comprehensive Part-Of-Speech Tag Set and SVM based POS Tagger for Sinhala

no code implementations WS 2016 Sandareka Fernando, Surangika Ranathunga, Sanath Jayasena, Gihan Dias

This paper presents a new comprehensive multi-level Part-Of-Speech tag set and a Support Vector Machine based Part-Of-Speech tagger for the Sinhala language.

POS TAG
