Search Results for author: Anoop Kunchukuttan

Existing research on Tabular Natural Language Inference (TNLI) exclusively examines the task in a monolingual setting where the tabular premise and hypothesis are in the same language.

Natural Language Inference

Overview of the 8th Workshop on Asian Translation

no code implementations • ACL (WAT) 2021 • Toshiaki Nakazawa, Hideki Nakayama, Chenchen Ding, Raj Dabre, Shohei Higashiyama, Hideya Mino, Isao Goto, Win Pa Pa, Anoop Kunchukuttan, Shantipriya Parida, Ondřej Bojar, Chenhui Chu, Akiko Eriguchi, Kaori Abe, Yusuke Oda, Sadao Kurohashi

This paper presents the results of the shared tasks from the 8th workshop on Asian translation (WAT2021).

Translation

Synthetic Data Generation and Joint Learning for Robust Code-Mixed Translation

no code implementations • 25 Mar 2024 • Kartik, Sanjana Soni, Anoop Kunchukuttan, Tanmoy Chakraborty, Md Shad Akhtar

In this paper, we tackle the problem of code-mixed (Hinglish and Bengalish) to English machine translation.

Machine Translation Sentence +2

IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages

1 code implementation • 11 Mar 2024 • Mohammed Safi Ur Rahman Khan, Priyam Mehta, Ananth Sankar, Umashankar Kumaravelan, Sumanth Doddapaneni, Suriyaprasaad G, Varun Balan G, Sparsh Jain, Anoop Kunchukuttan, Pratyush Kumar, Raj Dabre, Mitesh M. Khapra

We hope that the datasets, tools, and resources released as a part of this work will not only propel the research and development of Indic LLMs but also establish an open-source blueprint for extending such efforts to other languages.

Airavata: Introducing Hindi Instruction-tuned LLM

1 code implementation • 26 Jan 2024 • Jay Gala, Thanmay Jayakumar, Jaavid Aktar Husain, Aswanth Kumar M, Mohammed Safi Ur Rahman Khan, Diptesh Kanojia, Ratish Puduppully, Mitesh M. Khapra, Raj Dabre, Rudra Murthy, Anoop Kunchukuttan

We announce the initial release of "Airavata," an instruction-tuned LLM for Hindi.

RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization

no code implementations • 25 Jan 2024 • Jaavid Aktar Husain, Raj Dabre, Aswanth Kumar, Jay Gala, Thanmay Jayakumar, Ratish Puduppully, Anoop Kunchukuttan

This study addresses the challenge of extending Large Language Models (LLMs) to non-English languages using non-Roman scripts.

Continual Pretraining Sentiment Analysis

Bhasha-Abhijnaanam: Native-script and romanized Language Identification for 22 Indic languages

1 code implementation • 25 May 2023 • Yash Madhani, Mitesh M. Khapra, Anoop Kunchukuttan

We create publicly available language identification (LID) datasets and models in all 22 Indian languages listed in the Indian constitution in both native-script and romanized text.

Language Identification

IndicTrans2: Towards High-Quality and Accessible Machine Translation Models for all 22 Scheduled Indian Languages

2 code implementations • 25 May 2023 • Jay Gala, Pranjal A. Chitale, Raghavan AK, Varun Gumma, Sumanth Doddapaneni, Aswanth Kumar, Janki Nawale, Anupama Sujatha, Ratish Puduppully, Vivek Raghavan, Pratyush Kumar, Mitesh M. Khapra, Raj Dabre, Anoop Kunchukuttan

Prior to this work, there was (i) no parallel training data spanning all 22 languages, (ii) no robust benchmarks covering all these languages and containing content relevant to India, and (iii) no existing translation models which support all the 22 scheduled languages of India.

Machine Translation Sentence +1

CTQScorer: Combining Multiple Features for In-context Example Selection for Machine Translation

1 code implementation • 23 May 2023 • Aswanth Kumar, Ratish Puduppully, Raj Dabre, Anoop Kunchukuttan

We learn a regression model, CTQ Scorer (Contextual Translation Quality), that selects examples based on multiple features in order to maximize the translation quality.

In-Context Learning Machine Translation +2

Decomposed Prompting for Machine Translation Between Related Languages using Large Language Models

1 code implementation • 22 May 2023 • Ratish Puduppully, Anoop Kunchukuttan, Raj Dabre, Ai Ti Aw, Nancy F. Chen

This study investigates machine translation between related languages, i.e., languages within the same family that share linguistic characteristics such as word order and lexical similarity.

Machine Translation Translation

A Comprehensive Analysis of Adapter Efficiency

2 code implementations • 12 May 2023 • Nandini Mundra, Sumanth Doddapaneni, Raj Dabre, Anoop Kunchukuttan, Ratish Puduppully, Mitesh M. Khapra

However, adapters have not been sufficiently analyzed to understand if PEFT translates to benefits in training/deployment efficiency and maintainability/extensibility.

Natural Language Understanding

CharSpan: Utilizing Lexical Similarity to Enable Zero-Shot Machine Translation for Extremely Low-resource Languages

no code implementations • 9 May 2023 • Kaushal Kumar Maurya, Rahul Kejriwal, Maunendra Sankar Desarkar, Anoop Kunchukuttan

We address the task of machine translation (MT) from an extremely low-resource language (ELRL) to English by leveraging cross-lingual transfer from a 'closely-related' high-resource language (HRL).

Cross-Lingual Transfer Machine Translation +1

Evaluating Inter-Bilingual Semantic Parsing for Indian Languages

1 code implementation • 25 Apr 2023 • Divyanshu Aggarwal, Vivek Gupta, Anoop Kunchukuttan

Despite significant progress in Natural Language Generation for Indian languages (IndicNLP), there is a lack of datasets around complex structured tasks such as semantic parsing.

Semantic Parsing Text Generation +1

Naamapadam: A Large-Scale Named Entity Annotated Data for Indic Languages

1 code implementation • 20 Dec 2022 • Arnav Mhaske, Harshit Kedia, Sumanth Doddapaneni, Mitesh M. Khapra, Pratyush Kumar, Rudra Murthy V, Anoop Kunchukuttan

The dataset contains more than 400k sentences annotated with a total of at least 100k entities from three standard entity categories (Person, Location, and Organization) for 9 out of the 11 languages.

Named Entity Recognition Sentence

IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation metrics for Indian Languages

1 code implementation • 20 Dec 2022 • Ananya B. Sai, Vignesh Nagarajan, Tanay Dixit, Raj Dabre, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

In this paper, we fill this gap by creating an MQM dataset consisting of 7000 fine-grained annotations, spanning 5 Indian languages and 7 MT systems, and use it to establish correlations between annotator scores and scores obtained using existing automatic metrics.

Machine Translation

Towards Leaving No Indic Language Behind: Building Monolingual Corpora, Benchmark and Models for Indic Languages

1 code implementation • 11 Dec 2022 • Sumanth Doddapaneni, Rahul Aralikatte, Gowtham Ramesh, Shreya Goyal, Mitesh M. Khapra, Anoop Kunchukuttan, Pratyush Kumar

Across languages and tasks, IndicXTREME contains a total of 105 evaluation sets, of which 52 are new contributions to the literature.

Natural Language Understanding XLM-R

Effectiveness of Mining Audio and Text Pairs from Public Data for Improving ASR Systems for Low-Resource Languages

no code implementations • 26 Aug 2022 • Kaushal Santosh Bhogale, Abhigyan Raman, Tahir Javed, Sumanth Doddapaneni, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

Significantly, we show that adding Shrutilipi to the training set of Wav2Vec models leads to an average decrease in WER of 5.8% for 7 languages on the IndicSUPERB benchmark.

Optical Character Recognition (OCR) Self-Supervised Learning +3

IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages

1 code implementation • 24 Aug 2022 • Tahir Javed, Kaushal Santosh Bhogale, Abhigyan Raman, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

We hope IndicSUPERB contributes to the progress of developing speech language understanding models for Indian languages.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +6

Aksharantar: Open Indic-language Transliteration datasets and models for the Next Billion Users

2 code implementations • 6 May 2022 • Yash Madhani, Sushane Parthan, Priyanka Bedekar, Gokul NC, Ruchi Khapra, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

Transliteration is very important in the Indian language context due to the usage of multiple scripts and the widespread use of romanized inputs.

Transliteration

IndicXNLI: Evaluating Multilingual Inference for Indian Languages

1 code implementation • 19 Apr 2022 • Divyanshu Aggarwal, Vivek Gupta, Anoop Kunchukuttan

While Indic NLP has made rapid advances recently in terms of the availability of corpora and pre-trained models, benchmark datasets on standard NLU tasks are limited.

Cross-Lingual Transfer Machine Translation +1

IndicNLG Benchmark: Multilingual Datasets for Diverse NLG Tasks in Indic Languages

no code implementations • 10 Mar 2022 • Aman Kumar, Himani Shrotriya, Prachi Sahu, Raj Dabre, Ratish Puduppully, Anoop Kunchukuttan, Amogh Mishra, Mitesh M. Khapra, Pratyush Kumar

Natural Language Generation (NLG) for non-English languages is hampered by the scarcity of datasets in these languages.

Benchmarking Headline Generation +6

Towards Building ASR Systems for the Next Billion Users

no code implementations • 6 Nov 2021 • Tahir Javed, Sumanth Doddapaneni, Abhigyan Raman, Kaushal Santosh Bhogale, Gowtham Ramesh, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

Second, using this raw speech data we pretrain several variants of wav2vec style models for 40 Indian languages.

An Empirical Investigation of Multi-bridge Multilingual NMT models

no code implementations • 14 Oct 2021 • Anoop Kunchukuttan

In this paper, we present an extensive investigation of multi-bridge, many-to-many multilingual NMT models (MB-M2M), i.e., models trained on non-English language pairs in addition to English-centric language pairs.

NMT Translation

IndicBART: A Pre-trained Model for Indic Natural Language Generation

1 code implementation • Findings (ACL) 2022 • Raj Dabre, Himani Shrotriya, Anoop Kunchukuttan, Ratish Puduppully, Mitesh M. Khapra, Pratyush Kumar

We present IndicBART, a multilingual, sequence-to-sequence pre-trained model focusing on 11 Indic languages and English.

Extreme Summarization Machine Translation +4

A Primer on Pretrained Multilingual Language Models

no code implementations • 1 Jul 2021 • Sumanth Doddapaneni, Gowtham Ramesh, Mitesh M. Khapra, Anoop Kunchukuttan, Pratyush Kumar

Multilingual Language Models (MLLMs) such as mBERT, XLM, XLM-R, etc.

Joint Multilingual Sentence Representations Multilingual text classification +4

Itihasa: A large-scale corpus for Sanskrit to English translation

no code implementations • ACL (WAT) 2021 • Rahul Aralikatte, Miryam de Lhoneux, Anoop Kunchukuttan, Anders Søgaard

This work introduces Itihasa, a large-scale translation dataset containing 93,000 pairs of Sanskrit shlokas and their English translations.

Ranked #1 on Machine Translation on Itihasa

Machine Translation Translation

Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages

1 code implementation • 12 Apr 2021 • Gowtham Ramesh, Sumanth Doddapaneni, Aravinth Bheemaraj, Mayank Jobanputra, Raghavan AK, Ajitesh Sharma, Sujit Sahoo, Harshita Diddee, Mahalakshmi J, Divyanshu Kakwani, Navneet Kumar, Aswin Pradeep, Srihari Nagaraj, Kumar Deepak, Vivek Raghavan, Anoop Kunchukuttan, Pratyush Kumar, Mitesh Shantadevi Khapra

We mine the parallel sentences from the web by combining many corpora, tools, and methods: (a) web-crawled monolingual corpora, (b) document OCR for extracting sentences from scanned documents, (c) multilingual representation models for aligning sentences, and (d) approximate nearest neighbor search for searching in a large collection of sentences.

Machine Translation Multilingual NLP +3

A Large-scale Evaluation of Neural Machine Transliteration for Indic Languages

1 code implementation • EACL 2021 • Anoop Kunchukuttan, Siddharth Jain, Rahul Kejriwal

We take up the task of large-scale evaluation of neural machine transliteration between English and Indic languages, with a focus on multilingual transliteration to utilize orthographic similarity between Indian languages.

Translation Transliteration

Multilingual Neural Machine Translation

no code implementations • COLING 2020 • Raj Dabre, Chenhui Chu, Anoop Kunchukuttan

The advent of neural machine translation (NMT) has opened up exciting research in building multilingual translation systems, i.e., translation models that can handle more than one language pair.

Machine Translation NMT +2

IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Divyanshu Kakwani, Anoop Kunchukuttan, Satish Golla, Gokul N.C., Avik Bhattacharyya, Mitesh M. Khapra, Pratyush Kumar

These resources include: (a) large-scale sentence-level monolingual corpora, (b) pre-trained word embeddings, (c) pre-trained language models, and (d) multiple NLU evaluation datasets (IndicGLUE benchmark).

Ranked #2 on Multiple Choice Question Answering (MCQA) on IndicGLUE WSTP Pa

Genre classification Multiple-choice +9

AI4Bharat-IndicNLP Corpus: Monolingual Corpora and Word Embeddings for Indic Languages

2 code implementations • 30 Apr 2020 • Anoop Kunchukuttan, Divyanshu Kakwani, Satish Golla, Gokul N. C., Avik Bhattacharyya, Mitesh M. Khapra, Pratyush Kumar

We present the IndicNLP corpus, a large-scale, general-domain corpus containing 2.7 billion words for 10 Indian languages from two language families.

Word Embeddings

Learning Geometric Word Meta-Embeddings

no code implementations • WS 2020 • Pratik Jawanpuria, N T V Satya Dev, Anoop Kunchukuttan, Bamdev Mishra

We propose a geometric framework for learning meta-embeddings of words from different embedding sources.

Word Similarity

Utilizing Language Relatedness to improve Machine Translation: A Case Study on Languages of the Indian Subcontinent

no code implementations • 19 Mar 2020 • Anoop Kunchukuttan, Pushpak Bhattacharyya

To the best of our knowledge, this is the first large-scale study specifically devoted to utilizing language relatedness to improve translation between related languages.

Machine Translation Translation

A Comprehensive Survey of Multilingual Neural Machine Translation

no code implementations • 4 Jan 2020 • Raj Dabre, Chenhui Chu, Anoop Kunchukuttan

We present a survey on multilingual neural machine translation (MNMT), which has gained a lot of traction in recent years.

Machine Translation NMT +2

Overview of the 6th Workshop on Asian Translation

no code implementations • WS 2019 • Toshiaki Nakazawa, Nobushige Doi, Shohei Higashiyama, Chenchen Ding, Raj Dabre, Hideya Mino, Isao Goto, Win Pa Pa, Anoop Kunchukuttan, Yusuke Oda, Shantipriya Parida, Ondřej Bojar, Sadao Kurohashi

This paper presents the results of the shared tasks from the 6th workshop on Asian translation (WAT2019) including Ja↔En, Ja↔Zh scientific paper translation subtasks, Ja↔En, Ja↔Ko, Ja↔En patent translation subtasks, Hi↔En, My↔En, Km↔En, Ta↔En mixed domain subtasks and Ru↔Ja news commentary translation task.

Translation

A Brief Survey of Multilingual Neural Machine Translation

no code implementations • 14 May 2019 • Raj Dabre, Chenhui Chu, Anoop Kunchukuttan

We present a survey on multilingual neural machine translation (MNMT), which has gained a lot of traction in recent years.

Machine Translation Transfer Learning +1

Addressing word-order Divergence in Multilingual Neural Machine Translation for extremely Low Resource Languages

no code implementations • NAACL 2019 • Rudra Murthy V, Anoop Kunchukuttan, Pushpak Bhattacharyya

To bridge this divergence, we propose to pre-order the assisting language sentence to match the word order of the source language and train the parent model.

Machine Translation NMT +3

McTorch, a manifold optimization library for deep learning

1 code implementation • 3 Oct 2018 • Mayank Meghwanshi, Pratik Jawanpuria, Anoop Kunchukuttan, Hiroyuki Kasai, Bamdev Mishra

In this paper, we introduce McTorch, a manifold optimization library for deep learning that extends PyTorch.

Learning Multilingual Word Embeddings in Latent Metric Space: A Geometric Approach

2 code implementations • TACL 2019 • Pratik Jawanpuria, Arjun Balgovind, Anoop Kunchukuttan, Bamdev Mishra

Our approach decouples learning the transformation from the source language to the target language into (a) learning rotations for language-specific embeddings to align them to a common space, and (b) learning a similarity metric in the common space to model similarities between the embeddings.

Bilingual Lexicon Induction Multilingual Word Embeddings +4

Judicious Selection of Training Data in Assisting Language for Multilingual Neural NER

1 code implementation • ACL 2018 • Rudra Murthy, Anoop Kunchukuttan, Pushpak Bhattacharyya

Multilingual learning for Neural Named Entity Recognition (NNER) involves jointly training a neural network for multiple languages.

Domain Adaptation Machine Translation +4

Leveraging Orthographic Similarity for Multilingual Neural Transliteration

no code implementations • TACL 2018 • Anoop Kunchukuttan, Mitesh Khapra, Gurneet Singh, Pushpak Bhattacharyya

We address the task of joint training of transliteration models for multiple language pairs (multilingual transliteration).

Information Retrieval Multi-Task Learning +1

Comparing Recurrent and Convolutional Architectures for English-Hindi Neural Machine Translation

no code implementations • WS 2017 • Sandhya Singh, Ritesh Panjwani, Anoop Kunchukuttan, Pushpak Bhattacharyya

In this paper, we empirically compare the two encoder-decoder neural machine translation architectures: convolutional sequence to sequence model (ConvS2S) and recurrent sequence to sequence model (RNNS2S) for English-Hindi language pair as part of IIT Bombay's submission to WAT2017 shared task.

Image Captioning Language Modelling +4

The IIT Bombay English-Hindi Parallel Corpus

no code implementations • LREC 2018 • Anoop Kunchukuttan, Pratik Mehta, Pushpak Bhattacharyya

We present the IIT Bombay English-Hindi Parallel Corpus.

Machine Translation NMT +1

Utilizing Lexical Similarity between Related, Low-resource Languages for Pivot-based SMT

no code implementations • IJCNLP 2017 • Anoop Kunchukuttan, Maulik Shah, Pradyot Prakash, Pushpak Bhattacharyya

We investigate pivot-based translation between related languages in a low resource, phrase-based SMT setting.

Translation

IIT Bombay's English-Indonesian submission at WAT: Integrating Neural Language Models with SMT

no code implementations • WS 2016 • Sandhya Singh, Anoop Kunchukuttan, Pushpak Bhattacharyya

The Neural Probabilistic Language Model (NPLM) gave relatively high BLEU points for Indonesian to English translation system while the Neural Network Joint Model (NNJM) performed better for English to Indonesian direction of translation system.

Language Modelling Machine Translation +1

Faster decoding for subword level Phrase-based SMT between related languages

no code implementations • WS 2016 • Anoop Kunchukuttan, Pushpak Bhattacharyya

The increase in length is also impacted by the specific choice of data format for representing the sentences as subwords.

Translation

Learning variable length units for SMT between related languages via Byte Pair Encoding

no code implementations • WS 2017 • Anoop Kunchukuttan, Pushpak Bhattacharyya

We explore the use of segments learnt using Byte Pair Encoding (referred to as BPE units) as basic units for statistical machine translation between related languages and compare it with orthographic syllables, which are currently the best performing basic units for this translation task.

Machine Translation Translation

Orthographic Syllable as basic unit for SMT between Related Languages

no code implementations • EMNLP 2016 • Anoop Kunchukuttan, Pushpak Bhattacharyya

We explore the use of the orthographic syllable, a variable-length consonant-vowel sequence, as a basic unit of translation between related languages which use abugida or alphabetic scripts.

Translation

Substring-based unsupervised transliteration with phonetic and contextual knowledge

no code implementations • CONLL 2016 • Anoop Kunchukuttan, Pushpak Bhattacharyya, Mitesh M. Khapra

Information Retrieval Transliteration

Statistical Machine Translation between Related Languages

no code implementations • NAACL 2016 • Pushpak Bhattacharyya, Mitesh M. Khapra, Anoop Kunchukuttan

Machine Translation Translation

Augmenting Pivot based SMT with word segmentation

no code implementations • WS 2015 • Rohit More, Anoop Kunchukuttan, Pushpak Bhattacharyya, Raj Dabre

Machine Translation

Addressing Class Imbalance in Grammatical Error Detection with Evaluation Metric Optimization

no code implementations • WS 2015 • Anoop Kunchukuttan, Pushpak Bhattacharyya

Grammatical Error Detection

Investigating the potential of post-ordering SMT output to improve translation quality

no code implementations • WS 2015 • Pratik Mehta, Anoop Kunchukuttan, Pushpak Bhattacharyya

Machine Translation Translation

Data representation methods and use of mined corpora for Indian language transliteration

no code implementations • WS 2015 • Anoop Kunchukuttan, Pushpak Bhattacharyya

Information Retrieval Transliteration

Brahmi-Net: A transliteration and script conversion system for languages of the Indian subcontinent

no code implementations • NAACL 2015 • Anoop Kunchukuttan, Ratish Puduppully, Pushpak Bhattacharyya

Information Retrieval Question Answering +1

Supertag Based Pre-ordering in Machine Translation

no code implementations • WS 2014 • Rajen Chatterjee, Anoop Kunchukuttan, Pushpak Bhattacharyya

Machine Translation Translation

The IIT Bombay Hindi-English Translation System at WMT 2014

no code implementations • WS 2014 • Piyush Dungarwal, Rajen Chatterjee, Abhijit Mishra, Anoop Kunchukuttan, Ritesh Shah, Pushpak Bhattacharyya

Machine Translation Translation

Tuning a Grammar Correction System for Increased Precision

no code implementations • WS 2014 • Anoop Kunchukuttan, Sriram Chaudhury, Pushpak Bhattacharyya

Grammatical Error Correction Language Modelling +2

Shata-Anuvadak: Tackling Multiway Translation of Indian Languages

no code implementations • LREC 2014 • Anoop Kunchukuttan, Abhijit Mishra, Rajen Chatterjee, Ritesh Shah, Pushpak Bhattacharyya

We present a compendium of 110 Statistical Machine Translation systems built from parallel corpora of 11 Indian languages belonging to both Indo-Aryan and Dravidian families.

Translation Transliteration

When Transliteration Met Crowdsourcing : An Empirical Study of Transliteration via Crowdsourcing using Efficient, Non-redundant and Fair Quality Control

no code implementations • LREC 2014 • Mitesh M. Khapra, Ananthakrishnan Ramanathan, Anoop Kunchukuttan, Karthik Visweswariah, Pushpak Bhattacharyya

In contrast, we propose a low-cost QC mechanism which is fair to both workers and requesters.

Fairness Transliteration

TransDoop: A Map-Reduce based Crowdsourced Translation for Complex Domain

no code implementations • ACL 2013 • Anoop Kunchukuttan, Rajen Chatterjee, Shourya Roy, Abhijit Mishra, Pushpak Bhattacharyya

Machine Translation Translation

IITB System for CoNLL 2013 Shared Task: A Hybrid Approach to Grammatical Error Correction

no code implementations • WS 2013 • Anoop Kunchukuttan, Ritesh Shah, Pushpak Bhattacharyya

Grammatical Error Correction Machine Translation +1

Partially modelling word reordering as a sequence labelling problem

no code implementations • WS 2012 • Anoop Kunchukuttan, Pushpak Bhattacharyya

Machine Translation Word Alignment

Experiences in Resource Generation for Machine Translation through Crowdsourcing

no code implementations • LREC 2012 • Anoop Kunchukuttan, Shourya Roy, Pratik Patel, Kushal Ladha, Somya Gupta, Mitesh M. Khapra, Pushpak Bhattacharyya

The logistics of collecting resources for Machine Translation (MT) has always been a cause of concern for some of the resource deprived languages of the world.

Machine Translation Translation
