no code implementations • 3 Apr 2020 • Felipe Sánchez-Martínez, Juan Antonio Pérez-Ortiz, Rafael C. Carrasco
Translation models based on hierarchical phrase-based statistical machine translation (HSMT) have shown better performances than the non-hierarchical phrase-based counterparts for some language pairs.
no code implementations • 18 Jan 2014 • Felipe Sánchez-Martínez, Rafael C. Carrasco, Miguel A. Martínez-Prieto, Joaquin Adiego
For example, a bitext can be seen as a sequence of biwords ---pairs of parallel words with a high probability of co-occurrence--- that can be used as an intermediate representation in the compression process.
no code implementations • 16 Jun 2013 • Felipe Sánchez-Martínez, Isabel Martínez-Sempere, Xavier Ivars-Ribes, Rafael C. Carrasco
The IMPACT-es diachronic corpus of historical Spanish compiles over one hundred books --containing approximately 8 million words-- in addition to a complementary lexicon which links more than 10 thousand lemmas with attestations of the different variants found in the documents.