no code implementations • EAMT 2022 • Luisa Bentivogli, Mauro Cettolo, Marco Gaido, Alina Karakanta, Matteo Negri, Marco Turchi
This project aimed at extending the test sets of the MuST-C speech translation (ST) corpus with new reference translations.
no code implementations • LoResMT (AACL) 2020 • Atul Kr. Ojha, Valentin Malykh, Alina Karakanta, Chao-Hong Liu
This paper presents the findings of the LoResMT 2020 Shared Task on zero-shot translation for low resource languages.
1 code implementation • EAMT 2022 • Alina Karakanta, Luisa Bentivogli, Mauro Cettolo, Matteo Negri, Marco Turchi
Subtitling tools have recently been adapted for post-editing: they provide automatically generated subtitles and feature not only machine translation, but also automatic segmentation and synchronisation.
no code implementations • EAMT 2022 • Alina Karakanta, Luisa Bentivogli, Mauro Cettolo, Matteo Negri, Marco Turchi
In response to the increasing interest in automatic subtitling, this EAMT-funded project aimed at collecting subtitle post-editing data in a real use-case scenario where professional subtitlers edit automatically generated subtitles.
1 code implementation • 27 Sep 2022 • Sara Papi, Marco Gaido, Alina Karakanta, Mauro Cettolo, Matteo Negri, Marco Turchi
Automatic subtitling is the task of automatically translating the speech of audiovisual content into short pieces of timed text, i.e. subtitles and their corresponding timestamps.
1 code implementation • 21 Sep 2022 • Sara Papi, Alina Karakanta, Matteo Negri, Marco Turchi
Speech translation for subtitling (SubST) is the task of automatically translating speech data into well-formed subtitles by inserting subtitle breaks compliant with specific displaying guidelines.
1 code implementation • LREC 2022 • Alina Karakanta, François Buet, Mauro Cettolo, François Yvon
Subtitle segmentation can be evaluated with sequence segmentation metrics against a human reference.
1 code implementation • MTSummit 2021 • Alina Karakanta, Sara Papi, Matteo Negri, Marco Turchi
Experiments on three language pairs (en→it, de, fr) show that scrolling lines is the only mode achieving an acceptable reading speed while keeping delay close to a 4-second threshold.
1 code implementation • ACL (IWSLT) 2021 • Alina Karakanta, Marco Gaido, Matteo Negri, Marco Turchi
Speech translation (ST) has lately received growing interest for the generation of subtitles without the need for an intermediate source language transcription and timing (i.e. captions).
no code implementations • ACL 2021 • Luisa Bentivogli, Mauro Cettolo, Marco Gaido, Alina Karakanta, Alberto Martinelli, Matteo Negri, Marco Turchi
Five years after the first published proofs of concept, direct approaches to speech translation (ST) are now competing with traditional cascade solutions.
no code implementations • COLING 2020 • Alina Karakanta, Supratik Bhattacharya, Shravan Nayak, Timo Baumann, Matteo Negri, Marco Turchi
Dubbing has two shades; synchronisation constraints are applied only when the actor's mouth is visible on screen, while the translation is unconstrained for off-screen dubbing.
no code implementations • WS 2020 • Alina Karakanta, Matteo Negri, Marco Turchi
Subtitling is becoming increasingly important for disseminating information, given the enormous amounts of audiovisual content becoming available daily.
no code implementations • LREC 2020 • Alina Karakanta, Matteo Negri, Marco Turchi
Growing needs in localising audiovisual content in multiple languages through subtitles call for the development of automatic solutions for human subtitling.
1 code implementation • EMNLP (IWSLT) 2019 • Surafel M. Lakew, Alina Karakanta, Marcello Federico, Matteo Negri, Marco Turchi
To improve neural machine translation (NMT) for low-resource languages (LRLs), we employ perplexity to select high-resource language (HRL) data that are most similar to the LRL on the basis of language distance.