Neural-based Tamil Grammar Error Detection

This paper describes the ongoing development of a grammar error checker for the Tamil language using a state-of-the-art deep neural approach. The proposed checker captures an important type of grammar error: subject-predicate agreement errors. Specifically, we target agreement errors that occur between nominal subjects and verbal predicates. We also created the first grammar-error-annotated corpus for Tamil. In addition, we experimented with different multilingual pre-trained language models to capture syntactic information and found that IndicBERT performs better on our tasks. We implemented the grammar checker as a multi-class classifier on top of the IndicBERT pre-trained model, which we fine-tuned on our annotated data. This baseline model achieves an F1 score of 73.4. We are now improving the proposed system with the use of a dependency parser.
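As a rough illustration of how an F1 score for a multi-class error classifier can be computed, the sketch below derives per-class and macro-averaged F1 from gold and predicted labels. The label names (`ok`, `person`, `number`, `gender`) and the choice of macro averaging are assumptions made for this example only; the abstract does not state the label set or the averaging scheme behind the reported 73.4.

```python
from collections import Counter


def per_class_f1(gold, pred):
    """Return {label: F1} for every label seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p wrongly
            fn[g] += 1  # missed the gold class g
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    # Hypothetical labels: "ok" = no error; the rest name the mismatched feature.
    gold = ["ok", "ok", "person", "number", "gender", "number"]
    pred = ["ok", "person", "person", "number", "number", "number"]
    print(per_class_f1(gold, pred))
    print(round(100 * macro_f1(gold, pred), 1))  # scaled to 0-100, like the reported score
```

Macro averaging weights every error class equally regardless of how often it occurs. If the classes are heavily imbalanced, a micro-averaged or weighted F1 could give a noticeably different number.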
