Search Results for author: Mourad Ouzzani

Found 6 papers, 2 papers with code

RetClean: Retrieval-Based Data Cleaning Using Foundation Models and Data Lakes

no code implementations • 29 Mar 2023 • Mohammad Shahmeer Ahmad, Zan Ahmad Naeem, Mohamed Eltabakh, Mourad Ouzzani, Nan Tang

To assist with this scenario, we developed a custom RoBERTa-based foundation model that can be locally deployed.

Retrieval

RPT: Relational Pre-trained Transformer Is Almost All You Need towards Democratizing Data Preparation

no code implementations • 4 Dec 2020 • Nan Tang, Ju Fan, Fangyi Li, Jianhong Tu, Xiaoyong Du, Guoliang Li, Sam Madden, Mourad Ouzzani

RPT is pre-trained for a tuple-to-tuple model by corrupting the input tuple and then learning a model to reconstruct the original tuple.

Denoising, Entity Resolution, +4 more
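
The corrupt-then-reconstruct idea in the RPT summary above can be illustrated with a minimal sketch. It is not the authors' implementation: the `[MASK]` token, the `[A]`/`[V]` serialization markers, the helper names, and the sample tuple are all assumptions made for illustration. The sketch only builds one (corrupted input, original target) training pair; the model that learns the reconstruction is out of scope.

```python
import random

MASK = "[MASK]"  # hypothetical mask token, assumed for illustration


def corrupt_tuple(row, rng):
    """Mask one randomly chosen attribute value.

    Returns (corrupted, original): the corrupted copy is the model input,
    the untouched copy is the reconstruction target.
    """
    corrupted = dict(row)
    attr = rng.choice(sorted(row))  # sorted for a deterministic choice order
    corrupted[attr] = MASK
    return corrupted, dict(row)


def serialize(row):
    """Flatten a tuple into a token string (assumed format, not from the source)."""
    return " ".join(f"[A] {k} [V] {v}" for k, v in row.items())


rng = random.Random(0)
original = {"name": "Michael Jordan", "expertise": "Machine Learning", "city": "Berkeley"}
corrupted, target = corrupt_tuple(original, rng)

print("input :", serialize(corrupted))
print("target:", serialize(target))
```

A denoising objective would then train a sequence-to-sequence model to map the serialized input back to the serialized target.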

Reuse and Adaptation for Entity Resolution through Transfer Learning

no code implementations • 28 Sep 2018 • Saravanan Thirumuruganathan, Shameem A Puthiya Parambath, Mourad Ouzzani, Nan Tang, Shafiq Joty

Entity resolution (ER) is one of the fundamental problems in data integration, where machine learning (ML) based classifiers often provide the state-of-the-art results.

Entity Resolution, Feature Engineering, +1 more

DeepER -- Deep Entity Resolution

3 code implementations • 2 Oct 2017 • Muhammad Ebraheem, Saravanan Thirumuruganathan, Shafiq Joty, Mourad Ouzzani, Nan Tang

word embeddings), we present a novel ER system, called DeepER, that achieves good accuracy, high efficiency, as well as ease-of-use (i.e., much less human effort).

Databases

A large scale study of SVM based methods for abstract screening in systematic reviews

no code implementations • 1 Oct 2016 • Tanay Kumar Saha, Mourad Ouzzani, Hossam M. Hammady, Ahmed K. Elmagarmid, Wajdi Dhifli, Mohammad Al Hasan

However, it is very hard to clearly understand the applicability of these methods in a systematic review platform because of the following challenges: (1) the use of non-overlapping metrics for the evaluation of the proposed methods, (2) usage of features that are very hard to collect, (3) using a small set of reviews for the evaluation, and (4) no solid statistical testing or equivalence grouping of the methods.
