Search Results for author: Patricio Cerda

Found 3 papers, 3 papers with code

Similarity encoding for learning with dirty categorical variables

2 code implementations · 4 Jun 2018 · Patricio Cerda, Gaël Varoquaux, Balázs Kégl

We show that a simple approach that exposes the redundancy to the learning algorithm brings significant gains.

Dimensionality Reduction

Encoding high-cardinality string categorical variables

1 code implementation · 3 Jul 2019 · Patricio Cerda, Gaël Varoquaux

We introduce two encoding approaches for string categories: a Gamma-Poisson matrix factorization on substring counts, and the min-hash encoder, for fast approximation of string similarities.

AutoML, Feature Engineering
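
As a rough illustration of the min-hash idea mentioned in the summary above, the sketch below encodes a string as the minimum hash value of its character 3-grams under several seeded hash functions. It is not the authors' implementation: the function names (`char_ngrams`, `minhash_encode`, `minhash_similarity`), the choice of 3-grams, the space padding, and the use of salted BLAKE2b as the hash family are all assumptions made for this example. It relies on the standard min-hash property that two sets share a minimum under a random hash with probability equal to their Jaccard similarity, so the fraction of matching components approximates n-gram overlap.

```python
import hashlib


def char_ngrams(s, n=3):
    """Return the set of character n-grams of s, padded with one space on each side."""
    padded = f" {s} "
    # max(..., 1) guarantees at least one (possibly short) gram for very short strings.
    return {padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))}


def _seeded_hash(gram, seed):
    """64-bit hash of gram; the seed selects one member of the hash family (assumed: salted BLAKE2b)."""
    digest = hashlib.blake2b(
        gram.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_encode(s, n_components=8, n=3):
    """Encode s as a vector holding, per seed, the minimum hash over its n-grams."""
    grams = char_ngrams(s, n)
    return [min(_seeded_hash(g, seed) for g in grams) for seed in range(n_components)]


def minhash_similarity(a, b, n_components=256, n=3):
    """Estimate the Jaccard similarity of the n-gram sets as the share of equal components."""
    ea = minhash_encode(a, n_components, n)
    eb = minhash_encode(b, n_components, n)
    return sum(x == y for x, y in zip(ea, eb)) / n_components


if __name__ == "__main__":
    print(minhash_encode("london", n_components=4))
    print(minhash_similarity("london", "londres"))
    print(minhash_similarity("london", "paris"))
```

Under these assumptions, strings that share many substrings (such as typo variants of the same category) agree on more components than unrelated strings, which is what lets a downstream learner treat the encoding as a cheap proxy for string similarity.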

Exploring the Relationship between Alignment and Cross-lingual Transfer in Multilingual Transformers

1 code implementation · 5 Jun 2023 · Félix Gaschi, Patricio Cerda, Parisa Rastin, Yannick Toussaint

Namely, we find that realignment works better on tasks for which alignment is correlated with cross-lingual transfer when generalizing to a distant language and with smaller models, as well as when using a bilingual dictionary rather than FastAlign to extract realignment pairs.

Cross-Lingual Transfer, POS
