1 code implementation • 23 Nov 2023 • Madelon Hulsebos, Paul Groth, Çağatay Demiralp
A key source for understanding a table is the semantics of its columns.
no code implementations • 10 Aug 2022 • Robert Redmond, Nathan W. Weckwerth, Brian S. Xia, Qian Li, Peter Kraft, Deeptaanshu Kumar, Çağatay Demiralp, Michael Stonebraker
We recently proposed a new cluster operating system stack, DBOS, centered on a DBMS.
no code implementations • 13 Sep 2021 • Sneha Gathani, Madelon Hulsebos, James Gale, Peter J. Haas, Çağatay Demiralp
The fundamental goal of business data analysis is to improve business decisions using data.
no code implementations • 11 Sep 2021 • Madelon Hulsebos, Sneha Gathani, James Gale, Isil Dillig, Paul Groth, Çağatay Demiralp
However, we observe a gap between the performance of these models on these benchmarks and their applicability in practice.
1 code implementation • 24 Jun 2021 • Dongjin Choi, Sara Evensen, Çağatay Demiralp, Estevam Hruschka
In this work, we extend the DPBD framework to span-level annotation tasks, arguably one of the most time-consuming NLP labeling tasks.
2 code implementations • 14 Jun 2021 • Madelon Hulsebos, Çağatay Demiralp, Paul Groth
Existing table corpora primarily contain tables extracted from HTML pages, limiting the capability to represent offline database tables.
1 code implementation • 5 Apr 2021 • Yoshihiko Suhara, Jinfeng Li, Yuliang Li, Dan Zhang, Çağatay Demiralp, Chen Chen, Wang-Chiew Tan
Inferring meta information about tables, such as column headers or relationships between columns, is an active research topic in data management as we find many tables are missing some of this information.
Ranked #1 on Column Type Annotation on VizNet-Sato-MultiColumn
no code implementations • 8 Sep 2020 • Sajjadur Rahman, Peter Griggs, Çağatay Demiralp
Text data analysis is an iterative, non-linear process with diverse workflows spanning multiple stages, from data cleaning to visualization.
1 code implementation • 3 Sep 2020 • Sara Evensen, Chang Ge, Dongjin Choi, Çağatay Demiralp
We operationalize our framework with Ruler, an interactive system that synthesizes labeling rules for document classification by using span-level annotations of users on document examples.
1 code implementation • 15 Jan 2020 • Xiong Zhang, Jonathan Engel, Sara Evensen, Yuliang Li, Çağatay Demiralp, Wang-Chiew Tan
They contain a wealth of information about the opinions and experiences of users, which can help better understand consumer decisions and improve user experience with products and services.
1 code implementation • 14 Nov 2019 • Dan Zhang, Yoshihiko Suhara, Jinfeng Li, Madelon Hulsebos, Çağatay Demiralp, Wang-Chiew Tan
Detecting the semantic types of data columns in relational tables is important for various data preparation and information retrieval tasks such as data cleaning, schema matching, data discovery, and semantic search.
Ranked #2 on Column Type Annotation on VizNet-Sato-MultiColumn
2 code implementations • 25 May 2019 • Madelon Hulsebos, Kevin Hu, Michiel Bakker, Emanuel Zgraggen, Arvind Satyanarayan, Tim Kraska, Çağatay Demiralp, César Hidalgo
Correctly detecting the semantic type of data columns is crucial for data science tasks such as automated data cleaning, schema matching, and data discovery.
1 code implementation • 12 May 2019 • Kevin Hu, Neil Gaikwad, Michiel Bakker, Madelon Hulsebos, Emanuel Zgraggen, César Hidalgo, Tim Kraska, Guoliang Li, Arvind Satyanarayan, Çağatay Demiralp
Researchers currently rely on ad hoc datasets to train automated visualization tools and evaluate the effectiveness of visualization designs.
no code implementations • 28 Nov 2018 • Marco Cavallo, Çağatay Demiralp
However, reasoning dynamically about the results of a dimensionality reduction is difficult.
no code implementations • 25 Jun 2018 • Marco Cavallo, Çağatay Demiralp
We demonstrate how Track Xplorer helps identify possible systemic data errors early on, effectively track and compare the results of different classifiers, and reason about and pinpoint the causes of misclassifications.
2 code implementations • 9 Apr 2018 • Victor Dibia, Çağatay Demiralp
Rapidly creating effective visualizations using expressive grammars is challenging for users who have limited time and limited skills in statistics and data visualization.
no code implementations • 9 Apr 2018 • Marco Cavallo, Çağatay Demiralp
Data scientists need adequate interactive tools to explore and navigate the large clustering space, and thereby improve the effectiveness of exploratory clustering analysis.