no code implementations • WILDRE (LREC) 2022 • Vanlalmuansangi Khenglawt, Sahinur Rahman Laskar, Santanu Pal, Partha Pakray, Ajoy Kumar Khan
A multilingual country like India has enormous linguistic diversity and an increasing demand for language resources that can support natural language processing applications such as machine translation.
no code implementations • WMT (EMNLP) 2020 • Santanu Pal, Marcos Zampieri
In this paper we present the WIPRO-RIT systems submitted to the Similar Language Translation shared task at WMT 2020.
no code implementations • AACL (WAT) 2020 • Santanu Pal
In this paper we present an English–Hindi and Hindi–English neural machine translation (NMT) system, submitted to the Translation shared Task organized at WAT 2020.
1 code implementation • ICON 2021 • Rishabh Jha, Varshith Kaki, Varuna Kolla, Shubham Bhagat, Parth Patwa, Amitava Das, Santanu Pal
The aim is to generate specialized text, such as a tweet, that is not a direct result of the visual-linguistic grounding usually leveraged in similar tasks, but that conveys a message factoring in not only the visual content of the image but also additional real-world contextual information associated with the event described within the image as closely as possible.
no code implementations • WMT (EMNLP) 2021 • Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussa, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri
This paper presents the results of the news translation task, the multilingual low-resource translation for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories.
no code implementations • 3 Jul 2024 • Ramakrishna Appicharla, Baban Gain, Santanu Pal, Asif Ekbal, Pushpak Bhattacharyya
Evaluation results show that the proposed MTL approach performs better than concatenation-based and multi-encoder DocNMT models in low-resource settings and is sensitive to the choice of context.
no code implementations • 20 Sep 2023 • Prottay Kumar Adhikary, Bandaru Sugandhi, Subhojit Ghimire, Santanu Pal, Partha Pakray
In today's globalized world, effective communication with people from diverse linguistic backgrounds has become increasingly crucial.
no code implementations • 11 Aug 2023 • Ramakrishna Appicharla, Baban Gain, Santanu Pal, Asif Ekbal
In this paper, we further explore this idea by training multi-encoder models on three different context settings, viz., the previous two sentences, two random sentences, and a mix of both, and evaluating them on a context-aware pronoun translation test set.
no code implementations • 5 Oct 2021 • Atanu Mandal, Santanu Pal, Indranil Dutta, Mahidas Bhattacharya, Sudip Kumar Naskar
Language Identification (LID) is a crucial preliminary process in the field of Automatic Speech Recognition (ASR) that involves the identification of a spoken language from audio samples.
Ranked #1 on Spoken language identification on IndicTTS
no code implementations • EMNLP 2020 • Loïc Barrault, Magdalena Biesialska, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Matthias Huck, Eric Joanis, Tom Kocmi, Philipp Koehn, Chi-kiu Lo, Nikola Ljubešić, Christof Monz, Makoto Morishita, Masaaki Nagata, Toshiaki Nakazawa, Santanu Pal, Matt Post, Marcos Zampieri
In the news task, participants were asked to build machine translation systems for any of 11 language pairs, to be evaluated on test sets consisting mainly of news stories.
no code implementations • ACL 2020 • Nico Herbig, Tim Düwel, Santanu Pal, Kalliopi Meladaki, Mahsa Monshizadeh, Antonio Krüger, Josef van Genabith
On the other hand, speech and multi-modal combinations of select & speech are considered suitable for replacements and insertions but offer less potential for deletion and reordering.
no code implementations • ACL 2020 • Nico Herbig, Santanu Pal, Tim Düwel, Kalliopi Meladaki, Mahsa Monshizadeh, Vladislav Hnatovskiy, Antonio Krüger, Josef van Genabith
The shift from traditional translation to post-editing (PE) of machine-translated (MT) text can save time and reduce errors, but it also affects the design of translation interfaces, as the task changes from mainly generating text to correcting errors within otherwise helpful translation proposals.
no code implementations • COLING 2020 • Santanu Pal, Hongfei Xu, Nico Herbig, Sudip Kumar Naskar, Antonio Krueger, Josef van Genabith
In automatic post-editing (APE) it makes sense to condition post-editing (pe) decisions on both the source (src) and the machine translated text (mt) as input.
no code implementations • 16 Aug 2019 • Santanu Pal, Marcos Zampieri, Josef van Genabith
The first edition of this shared task featured data from three pairs of similar languages: Czech and Polish, Hindi and Nepali, and Portuguese and Spanish.
no code implementations • WS 2019 • Mihaela Vela, Santanu Pal, Marcos Zampieri, Sudip Kumar Naskar, Josef van Genabith
User feedback revealed that the users preferred using CATaLog Online over existing CAT tools in some respects, especially by selecting the output of the MT system and taking advantage of the color scheme for TM suggestions.
no code implementations • WS 2019 • Santanu Pal, Hongfei Xu, Nico Herbig, Antonio Krüger, Josef van Genabith
In this paper we present an English–German Automatic Post-Editing (APE) system called transference, submitted to the APE Task organized at WMT 2019.
no code implementations • WS 2019 • Riktim Mondal, Shankha Raj Nayek, Aditya Chowdhury, Santanu Pal, Sudip Kumar Naskar, Josef van Genabith
In this paper we describe our joint submission (JU-Saarland) from Jadavpur University and Saarland University in the WMT 2019 news translation shared task for the English–Gujarati language pair within the translation task sub-track.
no code implementations • WS 2019 • Santanu Pal, Marcos Zampieri, Josef van Genabith
The first edition of this shared task featured data from three pairs of similar languages: Czech and Polish, Hindi and Nepali, and Portuguese and Spanish.
no code implementations • WS 2019 • Loïc Barrault, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Philipp Koehn, Shervin Malmasi, Christof Monz, Mathias Müller, Santanu Pal, Matt Post, Marcos Zampieri
This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2019.
no code implementations • 7 Mar 2019 • Nico Herbig, Santanu Pal, Josef van Genabith, Antonio Krüger
Current advances in machine translation increase the need for translators to switch from traditional translation to post-editing of machine-translated text, a process that saves time and improves quality.
no code implementations • WS 2018 • Santanu Pal, Nico Herbig, Antonio Krüger, Josef van Genabith
The proposed model is an extension of the transformer architecture: two separate self-attention-based encoders encode the machine translation output (mt) and the source (src), followed by a joint encoder that attends over a combination of these two encoded sequences (encsrc and encmt) for generating the post-edited sentence.
no code implementations • WS 2018 • Prasenjit Basu, Santanu Pal, Sudip Kumar Naskar
The paper presents our participation in the WMT 2018 shared task on word level quality estimation (QE) of machine translated (MT) text, i.e., to predict whether a word in MT output for a given source context is correctly translated and hence should be retained in the post-edited translation (PE), or not.
no code implementations • COLING 2018 • Alina Maria Ciobanu, Marcos Zampieri, Shervin Malmasi, Santanu Pal, Liviu P. Dinu
In this paper we present a system based on SVM ensembles trained on characters and words to discriminate between five similar languages of the Indo-Aryan family: Hindi, Braj Bhasha, Awadhi, Bhojpuri, and Magahi.
no code implementations • COLING 2018 • Marta R. Costa-jussà, Marcos Zampieri, Santanu Pal
In this paper we present the first neural-based machine translation system trained to translate between standard national varieties of the same language.
no code implementations • WS 2018 • Soumyadeep Kundu, Sayantan Paul, Santanu Pal
In this paper, we propose different architectures for language-independent machine transliteration, which is extremely important for natural language processing (NLP) applications.
no code implementations • EACL 2017 • Santanu Pal, Sudip Kumar Naskar, Mihaela Vela, Qun Liu, Josef van Genabith
APE translations produced by our system show statistically significant improvements over the first-stage MT, phrase-based APE and the best reported score on the WMT 2016 APE dataset by a previous neural APE system.
no code implementations • COLING 2016 • Santanu Pal, Sudip Kumar Naskar, Josef van Genabith
In the paper we show that parallel system combination in the APE stage of a sequential MT-APE combination yields substantial translation improvements, measured both in terms of automatic evaluation metrics and in terms of productivity improvements in a post-editing experiment.
no code implementations • COLING 2016 • Santanu Pal, Sudip Kumar Naskar, Marcos Zampieri, Tapas Nayak, Josef van Genabith
We present a free web-based CAT tool called CATaLog Online which provides a novel and user-friendly online CAT environment for post-editors/translators.
no code implementations • LREC 2016 • Santanu Pal, Marcos Zampieri, Sudip Kumar Naskar, Tapas Nayak, Mihaela Vela, Josef van Genabith
The tool features a number of editing and log functions similar to the desktop version of CATaLog enhanced with several new features that we describe in detail in this paper.
no code implementations • LREC 2014 • Santanu Pal, Sudip Kumar Naskar, Sivaji Bandyopadhyay
Reordering poses a big challenge in statistical machine translation between distant language pairs.