Adapting the Tesseract Open-Source OCR Engine for Tamil and Sinhala Legacy Fonts and Creating a Parallel Corpus for Tamil-Sinhala-English

1 code implementation • 13 Sep 2021 • Charangan Vasantharajan, Laksika Tharmalingam, Uthayasanker Thayasivam

Since Tamil and Sinhala are Low-Resource Languages, we improved the performance of Tesseract by employing LSTM-based training on more than 20 legacy fonts to recognize printed characters in these languages.

Optical Character Recognition (OCR)

Data-Driven Simulation of Ride-Hailing Services using Imitation and Reinforcement Learning

no code implementations • 6 Apr 2021 • Haritha Jayasinghe, Tarindu Jayatilaka, Ravin Gunawardena, Uthayasanker Thayasivam

Thus, a need arises for a simulated environment where they can predict users' reactions to changes in the platform-specific parameters such as trip fares and incentives.

Imitation Learning, Reinforcement Learning

Exploring Deep Neural Networks and Transfer Learning for Analyzing Emotions in Tweets

no code implementations • 10 Dec 2020 • Yasas Senarath, Uthayasanker Thayasivam

In this paper, we present an experiment on using deep learning and transfer learning techniques for emotion analysis in tweets and suggest a method to interpret our deep learning models.

Deep Learning, Emotion Classification

A Privacy Preserving Data Publishing Middleware for Unstructured, Textual Social Media Data

no code implementations • LREC 2020 • Prasadi Abeywardana, Uthayasanker Thayasivam

The next hype of data experimentation is going to be heavily dependent on privacy preserving techniques mainly as it's going to be a legal responsibility rather than a mere social responsibility.

Privacy Preserving

