Search Results for author: Nisansa de Silva

Found 33 papers, 7 papers with code

Building a WordNet for Sinhala

no code implementations • WS 2014 • Indeewari Wijesiri, Malaka Gallage, Buddhika Gunathilaka, Madhuranga Lakjeewa, Daya Wimalasuriya, Gihan Dias, Rohini Paranavithana, Nisansa de Silva

Information Retrieval • Word Sense Disambiguation

Subject Specific Stream Classification Preprocessing Algorithm for Twitter Data Stream

no code implementations • 28 May 2017 • Nisansa de Silva, Danaja Maldeniya, Chamilka Wijeratne

Micro-blogging service Twitter is a lucrative source for data mining applications on global sentiment.

General Classification

Synergistic Union of Word2Vec and Lexicon for Domain Specific Semantic Similarity

no code implementations • 6 Jun 2017 • Keet Sugathadasa, Buddhi Ayesha, Nisansa de Silva, Amal Shehan Perera, Vindula Jayawardana, Dimuthu Lakmal, Madhavi Perera

Semantic similarity measures are an important part of Natural Language Processing tasks.

Lemmatization • Semantic Similarity +1

Deriving a Representative Vector for Ontology Classes with Instance Word Vector Embeddings

no code implementations • 8 Jun 2017 • Vindula Jayawardana, Dimuthu Lakmal, Nisansa de Silva, Amal Shehan Perera, Keet Sugathadasa, Buddhi Ayesha

Selecting a representative vector for a set of vectors is a very common requirement in many algorithmic tasks.

Semi-Supervised Instance Population of an Ontology using Word Vector Embeddings

no code implementations • 9 Sep 2017 • Vindula Jayawardana, Dimuthu Lakmal, Nisansa de Silva, Amal Shehan Perera, Keet Sugathadasa, Buddhi Ayesha, Madhavi Perera

Word embeddings have become a popular topic in the field of natural language processing because of their ability to cope with semantic sensitivity.

Management • Word Embeddings

Legal Document Retrieval using Document Vector Embeddings and Deep Learning

no code implementations • 27 May 2018 • Keet Sugathadasa, Buddhi Ayesha, Nisansa de Silva, Amal Shehan Perera, Vindula Jayawardana, Dimuthu Lakmal, Madhavi Perera

The ensemble model built in this study shows a significantly higher accuracy level, which indeed proves the need for incorporation of domain-specific semantic similarity measures into the information retrieval process.

Information Retrieval • Retrieval +4

Identifying Relationships Among Sentences in Court Case Transcripts Using Discourse Relations

no code implementations • 10 Sep 2018 • Gathika Ratnayaka, Thejan Rupasinghe, Nisansa de Silva, Menuka Warushavithana, Viraj Gamage, Amal Shehan Perera

To the best of our knowledge, this is the first study where discourse relationships between sentences have been used to determine relationships among sentences in legal court case transcripts.

Fast Approach to Build an Automatic Sentiment Annotator for Legal Domain using Transfer Learning

no code implementations • WS 2018 • Viraj Gamage, Menuka Warushavithana, Nisansa de Silva, Amal Shehan Perera, Gathika Ratnayaka, Thejan Rupasinghe

This study proposes a novel way of identifying the sentiment of the phrases used in the legal domain.

Domain Adaptation • Transfer Learning

Logic Rules Powered Knowledge Graph Embedding

no code implementations • 9 Mar 2019 • Pengwei Wang, Dejing Dou, Fangzhao Wu, Nisansa de Silva, Lianwen Jin

And then, to put both triples and mined logic rules within the same semantic space, all triples in the knowledge graph are represented as first-order logic.

Knowledge Graph Embedding • Link Prediction +1

Survey on Publicly Available Sinhala Natural Language Processing Tools and Research

1 code implementation • 5 Jun 2019 • Nisansa de Silva

Sinhala is the native language of the Sinhalese people who make up the largest ethnic group of Sri Lanka.

Shift-of-Perspective Identification Within Legal Cases

no code implementations • 6 Jun 2019 • Gathika Ratnayaka, Thejan Rupasinghe, Nisansa de Silva, Viraj Salaka Gamage, Menuka Warushavithana, Amal Shehan Perera

Therefore, the process of automatic information extraction from documents containing legal opinions related to court cases can be considered to be of significant importance.

Open Information Extraction • Sentiment Analysis

Sinhala Language Corpora and Stopwords from a Decade of Sri Lankan Facebook

1 code implementation • 15 Jul 2020 • Yudhanjaya Wijeratne, Nisansa de Silva

This paper presents two colloquial Sinhala language corpora from the language efforts of the Data, Analysis and Policy team of LIRNEasia, as well as a list of algorithmically derived stopwords.

Effective Approach to Develop a Sentiment Annotator For Legal Domain in a Low Resource Setting

no code implementations • PACLIC 2020 • Gathika Ratnayaka, Nisansa de Silva, Amal Shehan Perera, Ramesh Pathirana

Analyzing the sentiments of legal opinions available in Legal Opinion Texts can facilitate several use cases such as legal judgement prediction, contradictory statements identification and party-based sentiment analysis.

Sentiment Analysis

Rule-Based Approach for Party-Based Sentiment Analysis in Legal Opinion Texts

no code implementations • 11 Nov 2020 • Isanka Rajapaksha, Chanika Ruchini Mudalige, Dilini Karunarathna, Nisansa de Silva, Gathika Ratnayaka, Amal Shehan Perera

A document which elaborates opinions and arguments related to the previous court cases is known as a legal opinion text.

Sentiment Analysis

SigmaLaw-ABSA: Dataset for Aspect-Based Sentiment Analysis in Legal Opinion Texts

no code implementations • 12 Nov 2020 • Chanika Ruchini Mudalige, Dilini Karunarathna, Isanka Rajapaksha, Nisansa de Silva, Gathika Ratnayaka, Amal Shehan Perera, Ramesh Pathirana

A number of publicly available datasets for a wide range of domains usually fulfill the needs of researchers to perform their studies in the field of ABSA.

Aspect-Based Sentiment Analysis • Aspect-Based Sentiment Analysis (ABSA)

Exploiting Node Content for Multiview Graph Convolutional Network and Adversarial Regularization

1 code implementation • COLING 2020 • Qiuhao Lu, Nisansa de Silva, Dejing Dou, Thien Huu Nguyen, Prithviraj Sen, Berthold Reinwald, Yunyao Li

Network representation learning (NRL) is crucial in the area of graph learning.

Graph Learning • Link Prediction +3

Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets

no code implementations • 22 Mar 2021 • Julia Kreutzer, Isaac Caswell, Lisa Wang, Ahsan Wahab, Daan van Esch, Nasanbayar Ulzii-Orshikh, Allahsera Tapo, Nishant Subramani, Artem Sokolov, Claytone Sikasote, Monang Setyawan, Supheakmungkol Sarin, Sokhar Samb, Benoît Sagot, Clara Rivera, Annette Rios, Isabel Papadimitriou, Salomey Osei, Pedro Ortiz Suarez, Iroro Orife, Kelechi Ogueji, Andre Niyongabo Rubungo, Toan Q. Nguyen, Mathias Müller, André Müller, Shamsuddeen Hassan Muhammad, Nanda Muhammad, Ayanda Mnyakeni, Jamshidbek Mirzakhalov, Tapiwanashe Matangira, Colin Leong, Nze Lawson, Sneha Kudugunta, Yacine Jernite, Mathias Jenny, Orhan Firat, Bonaventure F. P. Dossou, Sakhile Dlamini, Nisansa de Silva, Sakine Çabuk Ballı, Stella Biderman, Alessia Battisti, Ahmed Baruwa, Ankur Bapna, Pallavi Baljekar, Israel Abebe Azime, Ayodele Awokoya, Duygu Ataman, Orevaoghene Ahia, Oghenefego Ahia, Sweta Agrawal, Mofetoluwa Adeyemi

With the success of large-scale pre-training and multilingual modeling in Natural Language Processing (NLP), recent years have seen a proliferation of large, web-mined text datasets covering hundreds of languages.

Semantic Oppositeness Assisted Deep Contextual Modeling for Automatic Rumor Detection in Social Networks

no code implementations • EACL 2021 • Nisansa de Silva, Dejing Dou

Social networks face a major challenge in the form of rumors and fake news, due to their intrinsic nature of connecting users to millions of others, and of giving any individual the power to post anything.

Critical Sentence Identification in Legal Cases Using Multi-Class Classification

no code implementations • 10 Nov 2021 • Sahan Jayasinghe, Lakith Rambukkanage, Ashan Silva, Nisansa de Silva, Amal Shehan Perera

Inherently, the legal domain contains a vast amount of data in text format.

Classification • Multi-class Classification +2

Seeking Sinhala Sentiment: Predicting Facebook Reactions of Sinhala Posts

no code implementations • 1 Dec 2021 • Vihanga Jayawickrama, Gihan Weeraprameshwara, Nisansa de Silva, Yudhanjaya Wijeratne

This paper uses millions of such reactions, derived from a decade's worth of Facebook post data centred around a Sri Lankan context, to model an eye-of-the-beholder approach to sentiment detection for online Sinhala textual content.

Binary Classification • Sentiment Analysis

Sentiment Analysis with Deep Learning Models: A Comparative Study on a Decade of Sinhala Language Facebook Data

no code implementations • 11 Jan 2022 • Gihan Weeraprameshwara, Vihanga Jayawickrama, Nisansa de Silva, Yudhanjaya Wijeratne

The relationship between Facebook posts and the corresponding reaction feature is an interesting subject to explore and understand.

Sentiment Analysis

Selecting Seed Words for Wordle using Character Statistics

no code implementations • 7 Feb 2022 • Nisansa de Silva

Wordle, a word-guessing game, rose to global popularity in January 2022.

Some Languages are More Equal than Others: Probing Deeper into the Linguistic Disparity in the NLP World

1 code implementation • 16 Oct 2022 • Surangika Ranathunga, Nisansa de Silva

Using an existing language categorisation based on speaker population and vitality, we analyse the distribution of language data resources, amount of NLP/CL research, inclusion in multilingual web-based platforms and the inclusion in pre-trained multilingual models.

Sinhala Sentence Embedding: A Two-Tiered Structure for Low-Resource Languages

no code implementations • 26 Oct 2022 • Gihan Weeraprameshwara, Vihanga Jayawickrama, Nisansa de Silva, Yudhanjaya Wijeratne

In the process of numerically modeling natural languages, developing language embeddings is a vital step.

Sentence • Sentence Embedding +4

Synthesis and Evaluation of a Domain-specific Large Data Set for Dungeons & Dragons

1 code implementation • 18 Dec 2022 • Akila Peiris, Nisansa de Silva

This paper introduces the Forgotten Realms Wiki (FRW) data set and domain specific natural language generation using FRW along with related analyses.

Text Generation

Sinhala-English Parallel Word Dictionary Dataset

1 code implementation • 4 Aug 2023 • Kasun Wickramasinghe, Nisansa de Silva

However, in the cases where one of the considered language pairs is a low-resource language, the existing top-down parallel data such as corpora are lacking in both tally and quality due to the dearth of human annotation.

Machine Translation • Sentence

Multi-document Summarization: A Comparative Evaluation

no code implementations • 10 Sep 2023 • Kushan Hewapathirana, Nisansa de Silva, C. D. Athuraliya

This work serves as a reference for future MDS research and contributes to the development of accurate and robust models which can be utilized on demanding datasets with academically and/or scientifically complex data as well as generalized, relatively simple datasets.

Document Summarization • Multi-Document Summarization

Comparative Analysis of Named Entity Recognition in the Dungeons and Dragons Domain

no code implementations • 29 Sep 2023 • Gayashan Weerasundara, Nisansa de Silva

Many NLP tasks, although well-resolved for general English, face challenges in specific domains like fantasy literature.

named-entity-recognition • Named Entity Recognition +1

Sinhala-English Word Embedding Alignment: Introducing Datasets and Benchmark for a Low Resource Language

no code implementations • 17 Nov 2023 • Kasun Wickramasinghe, Nisansa de Silva

In this paper, we try to align Sinhala and English word embedding spaces based on available alignment techniques and introduce a benchmark for Sinhala language embedding alignment.

Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora

1 code implementation • 12 Feb 2024 • Surangika Ranathunga, Nisansa de Silva, Menan Velayuthan, Aloka Fernando, Charitha Rathnayake

We conducted a detailed analysis on the quality of web-mined corpora for two low-resource languages (making three language pairs, English-Sinhala, English-Tamil and Sinhala-Tamil).

Machine Translation • NMT +1

Fine Tuning Named Entity Extraction Models for the Fantasy Domain

no code implementations • 16 Feb 2024 • Aravinth Sivaganeshan, Nisansa de Silva

This work uses available lore of monsters in the D&D domain to fine-tune Trankit, which is a prolific NER framework that uses a pre-trained model for NER.

Miscellaneous • named-entity-recognition +3

Automatic Generation of Abstracts for Research Papers

no code implementations • ROCLING 2022 • Dushan Kumarasinghe, Nisansa de Silva

Summarizing has always been an important utility for reading long documents.

Abstractive Text Summarization • Descriptive

Legal Case Winning Party Prediction With Domain Specific Auxiliary Models

no code implementations • ROCLING 2022 • Sahan Jayasinghe, Lakith Rambukkanage, Ashan Silva, Nisansa de Silva, Amal Shehan Perera

The model is built with and experimented using legal domain specific sub-models to provide more visibility to the final model, along with other variations.

Sentence • Sentence Embedding +1
