no code implementations • NAACL (AmericasNLP) 2021 • Diego Barriga Martínez, Victor Mijangos, Ximena Gutierrez-Vasques
Our work also comprises the pre-processing and annotation of the corpus.
no code implementations • NAACL (AmericasNLP) 2021 • Manuel Mager, Arturo Oncevay, Abteen Ebrahimi, John Ortega, Annette Rios, Angela Fan, Ximena Gutierrez-Vasques, Luis Chiruzzo, Gustavo Giménez-Lugo, Ricardo Ramos, Ivan Vladimir Meza Ruiz, Rolando Coto-Solano, Alexis Palmer, Elisabeth Mager-Hois, Vishrav Chaudhary, Graham Neubig, Ngoc Thang Vu, Katharina Kann
This paper presents the results of the 2021 Shared Task on Open Machine Translation for Indigenous Languages of the Americas.
1 code implementation • LREC 2022 • Steven Moran, Christian Bentz, Ximena Gutierrez-Vasques, Olga Pelloni, Tanja Samardzic
We present the TeDDi sample, a diversity sample of text data for language comparison and multilingual Natural Language Processing.
1 code implementation • EACL 2021 • Tatyana Ruzsics, Olga Sozinova, Ximena Gutierrez-Vasques, Tanja Samardzic
We apply our methodology to analyze the model's decisions on three typologically different languages and find that a) our pattern extraction method applied to cross-attention weights uncovers variation in form of inflection morphemes, b) pattern extraction from self-attention shows triggers for such variation, c) both types of patterns are closely aligned with grammar inflection classes and class assignment criteria, for all three languages.
1 code implementation • EACL 2021 • Ximena Gutierrez-Vasques, Christian Bentz, Olga Sozinova, Tanja Samardzic
The distributions of orthographic word types are very different across languages due to typological characteristics, different writing traditions and potentially other factors.
no code implementations • WS 2018 • Ximena Gutierrez-Vasques, Victor Mijangos
We use two small parallel corpora for comparing the morphological complexity of Spanish, Otomi and Nahuatl.
1 code implementation • COLING 2018 • Manuel Mager, Ximena Gutierrez-Vasques, Gerardo Sierra, Ivan Meza
Indigenous languages of the American continent are highly diverse.
no code implementations • 6 Oct 2017 • Ximena Gutierrez-Vasques, Victor Mijangos
Our proposal is to construct bilingual word vectors from a graph.
no code implementations • LREC 2016 • Ximena Gutierrez-Vasques, Gerardo Sierra, Isaac Hernández Pompa
This paper describes the project called Axolotl which comprises a Spanish-Nahuatl parallel corpus and its search interface.