no code implementations • 26 Feb 2023 • Shruti Rijhwani, Daisy Rosenblum, Michayla King, Antonios Anastasopoulos, Graham Neubig
There has been recent interest in improving optical character recognition (OCR) for endangered languages, particularly because a large number of documents and books in these languages are not in machine-readable formats.
Optical Character Recognition (OCR)
1 code implementation • 4 Nov 2021 • Shruti Rijhwani, Daisy Rosenblum, Antonios Anastasopoulos, Graham Neubig
In addition, to enforce consistency in the recognized vocabulary, we introduce a lexically-aware decoding method that augments the neural post-correction model with a count-based language model constructed from the recognized texts, implemented using weighted finite-state automata (WFSA) for efficient and effective decoding.
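The idea of combining a neural model's score with a count-based language model built from the recognized text can be illustrated with a small rescoring sketch. This is not the authors' WFSA-based decoder: it is a simplified stand-in that reranks whole candidate strings with an add-alpha unigram word model. The function names `build_count_lm` and `rescore`, the `alpha` and `weight` parameters, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def build_count_lm(recognized_lines, alpha=1.0):
    """Return a log-probability function for an add-alpha unigram word model.

    Assumption: words are whitespace-separated tokens of the recognized text.
    """
    counts = Counter(w for line in recognized_lines for w in line.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen words

    def logprob(word):
        return math.log((counts[word] + alpha) / (total + alpha * vocab))

    return logprob


def rescore(candidates, lm_logprob, weight=0.5):
    """Pick the candidate with the best combined score.

    candidates: list of (text, model_log_score) pairs, where the score is
    assumed to come from a post-correction model (hypothetical here).
    """
    def combined(item):
        text, model_score = item
        lm_score = sum(lm_logprob(w) for w in text.split())
        return model_score + weight * lm_score

    return max(candidates, key=combined)[0]


# Toy data: "cat" is frequent in the recognized text, "cal" never occurs.
recognized = ["the cat sat", "the cat ran", "the cat"]
lm = build_count_lm(recognized)
candidates = [("the cat", -1.0), ("the cal", -0.9)]
best = rescore(candidates, lm)
```

In this toy case the model alone slightly prefers the unseen form, while the count-based term pulls the choice toward the spelling that already appears in the recognized vocabulary, which is the consistency effect the abstract describes.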
no code implementations • COLING 2020 • Roland Kuhn, Fineen Davis, Alain Désilets, Eric Joanis, Anna Kazantseva, Rebecca Knowles, Patrick Littell, Delaney Lothian, Aidan Pine, Caroline Running Wolf, Eddie Santos, Darlene Stewart, Gilles Boulianne, Vishwa Gupta, Brian Maracle Owennatékha, Akwiratékha' Martin, Christopher Cox, Marie-Odile Junker, Olivia Sammons, Delasie Torkornoo, Nathan Thanyehténhas Brinklow, Sara Child, Benoît Farley, David Huggins-Daines, Daisy Rosenblum, Heather Souter
This paper surveys the first, three-year phase of a project at the National Research Council of Canada that is developing software to assist Indigenous communities in Canada in preserving their languages and extending their use.
Automatic Speech Recognition (ASR)