no code implementations • LChange (ACL) 2022 • Iiro Rastas, Yann Ciarán Ryan, Iiro Tiihonen, Mohammadreza Qaraei, Liina Repo, Rohit Babbar, Eetu Mäkelä, Mikko Tolonen, Filip Ginter
In this paper, we describe a BERT model trained on the Eighteenth Century Collections Online (ECCO) dataset of digitized documents.
Optical Character Recognition (OCR)
no code implementations • 16 Oct 2024 • Kamaledin Ghiasi-Shirazi, Mohammadreza Qaraei
Kernel methods in machine learning use a kernel function that takes two data points as input and returns their inner product after mapping them to a Hilbert space, implicitly and without actually computing the mapping.
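As a minimal illustration of a kernel returning an inner product without computing the mapping, the sketch below compares a homogeneous degree-2 polynomial kernel on 2-D inputs with the inner product of its explicit feature map. The kernel choice, the function names, and the sample points are assumptions made for illustration; they are not taken from the paper.

```python
import math

def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: k(x, y) = (x . y)^2.
    Works in the input space; the feature map is never built."""
    return sum(a * b for a, b in zip(x, y)) ** 2

def explicit_map(x):
    """Explicit feature map for 2-D inputs matching poly2_kernel:
    phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def inner(u, v):
    """Plain Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

# Illustrative sample points (assumed, not from the source).
x, y = (1.0, 2.0), (3.0, -1.0)

k_implicit = poly2_kernel(x, y)                        # kernel evaluation only
k_explicit = inner(explicit_map(x), explicit_map(y))   # map first, then inner product
```

Both values agree up to floating-point error. For kernels such as the Gaussian kernel the corresponding feature space is infinite-dimensional, so only the implicit route is practical.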
1 code implementation • 14 Dec 2021 • Mohammadreza Qaraei, Rohit Babbar
Extreme Multilabel Text Classification (XMTC) is a text classification problem in which (i) the output space is extremely large, (ii) each data point may have multiple positive labels, and (iii) the data follows a strongly imbalanced distribution.
no code implementations • 26 Feb 2021 • Ahmad Navid Ghanizadeh, Kamaledin Ghiasi-Shirazi, Reza Monsefi, Mohammadreza Qaraei
By this interpretation, we propose a Neural Generalization of Multiple Kernel Learning (NGMKL), which extends the conventional multiple kernel learning framework to a multi-layer neural network with nonlinear activation functions.
no code implementations • 1 Jul 2020 • Erik Schultheis, Mohammadreza Qaraei, Priyanshu Gupta, Rohit Babbar
In addition to the computational burden arising from the large number of training instances, features, and labels, problems in XMC face two statistical challenges: (i) a large number of 'tail-labels' (those that occur very infrequently), and (ii) missing labels, as it is virtually impossible to manually assign every relevant label to an instance.
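To make the notion of tail-labels concrete, the sketch below counts how often each label occurs in a toy multi-label dataset and lists the labels at or below a frequency threshold. The threshold `max_count`, the helper names, and the toy data are hypothetical; the paper does not prescribe this particular cutoff.

```python
from collections import Counter

def label_frequencies(label_sets):
    """Count how many instances each label is attached to."""
    counts = Counter()
    for labels in label_sets:
        counts.update(set(labels))  # count each label once per instance
    return counts

def tail_labels(label_sets, max_count=1):
    """Return labels occurring at most `max_count` times (assumed cutoff)."""
    counts = label_frequencies(label_sets)
    return sorted(label for label, c in counts.items() if c <= max_count)

# Toy data (assumed): one frequent 'head' label and several rare ones.
toy = [{"a", "b"}, {"a"}, {"a", "c"}, {"a", "b", "d"}]

freqs = label_frequencies(toy)
rare = tail_labels(toy, max_count=1)
```

Here label `a` appears in every instance while `c` and `d` each appear once. In real XMC datasets the imbalance is far more pronounced, and missing labels mean the observed counts understate true label relevance.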