no code implementations • 30 Nov 2024 • Daomin Ji, Hui Luo, Zhifeng Bao, Shane Culpepper
In this work, we investigate the challenge of integrating multiple tables from a data lake, focusing on three core tasks: 1) pairwise integrability judgment, which determines whether a tuple pair is integrable, accounting for any occurrences of semantic equivalence or typographical errors; 2) integrable set discovery, which identifies all integrable sets in a table based on pairwise integrability judgments established in the first task; 3) multi-tuple conflict resolution, which resolves conflicts between multiple tuples during integration.
1 code implementation • 29 Jul 2024 • Minxiao Chen, Haitao Yuan, Nan Jiang, Zhifeng Bao, Shangguang Wang
In particular, it should adequately consider the regional background, accurately capture both spatial proximity and semantic similarity, and effectively address the sparsity of traffic accidents.
no code implementations • 8 May 2024 • Sha Wang, Yuchen Li, Hanhua Xiao, Zhifeng Bao, Lambert Deng, Yanfei Dong
Efficient news exploration is crucial in real-world applications, particularly within the financial sector, where numerous control and risk assessment tasks rely on the analysis of public news reports.
no code implementations • 18 Mar 2024 • Yile Chen, Xiucheng Li, Gao Cong, Zhifeng Bao, Cheng Long
In this study, we introduce a novel framework called Toast for learning general-purpose representations of road networks, along with its advanced counterpart DyToast, designed to enhance the integration of temporal dynamics to boost the performance of various time-sensitive downstream tasks.
no code implementations • 15 Dec 2023 • Tianhao Peng, Wenjun Wu, Haitao Yuan, Zhifeng Bao, Zhao Pengrui, Xin Yu, Xuetao Lin, Yu Liang, Yanjun Pu
To address this limitation, this paper presents GraphRARE, a general framework built upon node relative entropy and deep reinforcement learning, to strengthen the expressive capability of GNNs.
Ranked #3 on Node Classification on Cornell
1 code implementation • 20 Sep 2023 • Xin Zheng, Yixin Liu, Zhifeng Bao, Meng Fang, Xia Hu, Alan Wee-Chung Liew, Shirui Pan
Data-centric AI, with its primary focus on the collection, management, and utilization of data to drive AI models and applications, has attracted increasing attention in recent years.
no code implementations • 28 Feb 2022 • Yile Chen, Xiucheng Li, Gao Cong, Cheng Long, Zhifeng Bao, Shang Liu, Wanli Gu, Fuzheng Zhang
As a fundamental component of location-based services, inferring the relationships between points of interest (POIs) is critical for service providers that want to offer a good user experience to business owners and customers.
no code implementations • 8 Jan 2021 • Hui Luo, Jingbo Zhou, Zhifeng Bao, Shuangli Li, J. Shane Culpepper, Haochao Ying, Hao Liu, Hui Xiong
We design a novel multi-task learning model called MPR (short for Multi-level POI Recommendation), where each task aims to return the top-k POIs at a certain spatial granularity level.
no code implementations • 5 Jan 2021 • Hai Lan, Zhifeng Bao, Yuwei Peng
A cost-based optimizer uses a plan enumeration algorithm to find a (sub)plan, applies a cost model to estimate the cost of that plan, and selects the plan with the lowest cost.
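The enumerate, cost, and select loop described above can be illustrated with a minimal sketch. It is not the paper's optimizer: the table names, cardinalities, uniform selectivity, and the cost formula (sum of intermediate result sizes over left-deep join orders) are all hypothetical values chosen only for illustration.

```python
from itertools import permutations

# Hypothetical row counts per table (assumption, not from the paper).
CARDINALITY = {"A": 1000, "B": 100, "C": 10}
# Hypothetical uniform join selectivity (assumption).
SELECTIVITY = 0.01


def plan_cost(order):
    """Toy cost model: sum of intermediate result sizes of a left-deep join order."""
    rows = CARDINALITY[order[0]]
    cost = 0.0
    for table in order[1:]:
        # Estimated size of joining the current intermediate result with `table`.
        rows = rows * CARDINALITY[table] * SELECTIVITY
        cost += rows
    return cost


def best_plan(tables):
    """Plan enumeration: try every left-deep join order and keep the cheapest."""
    return min(permutations(tables), key=plan_cost)


plan = best_plan(["A", "B", "C"])
print(plan, plan_cost(plan))
```

Exhaustive enumeration is only practical for a handful of tables; it is used here because it makes the three stages (enumerate, cost, select) easy to see.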
no code implementations • 13 Oct 2020 • Sheng Wang, Yuan Sun, Zhifeng Bao
This paper presents a thorough evaluation of the existing methods that accelerate Lloyd's algorithm for fast k-means clustering.
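For reference, the unaccelerated baseline that such methods speed up can be sketched as follows. This is a plain textbook version of Lloyd's algorithm, not any of the accelerated variants the paper evaluates; the function name `lloyd_kmeans`, its parameters, and the sample points are illustrative assumptions.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd_kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm: alternate assignment and centroid update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise with k distinct input points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_centroids.append(tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centroids.append(centroids[i])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, clusters


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = lloyd_kmeans(points, k=2)
print(centroids)
```

The assignment step compares every point with every centroid in each iteration, which is the cost that acceleration techniques typically aim to reduce.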
1 code implementation • 30 Mar 2020 • Shixun Huang, Zhifeng Bao, Guoliang Li, Yanghao Zhou, J. Shane Culpepper
More specifically, we first propose a temporal random walk that can identify relevant nodes in historical neighborhoods that influence edge formation.
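To make the idea of a time-respecting walk concrete, here is a minimal sketch of a generic temporal random walk over timestamped edges. It is an assumption-laden illustration, not the walk proposed in the paper: the uniform choice among earlier edges, the strictly decreasing timestamps, and the names `build_index` and `temporal_random_walk` are all hypothetical.

```python
import random
from collections import defaultdict


def build_index(edges):
    """Index undirected timestamped edges (u, v, t) by endpoint."""
    adj = defaultdict(list)
    for u, v, t in edges:
        adj[u].append((v, t))
        adj[v].append((u, t))
    return adj


def temporal_random_walk(adj, start, start_time, length, rng):
    """Walk backwards in time: each step follows an edge older than the last one used."""
    walk = [start]
    node, time = start, start_time
    for _ in range(length):
        # Only edges strictly earlier than the current time are eligible.
        candidates = [(n, t) for n, t in adj.get(node, []) if t < time]
        if not candidates:
            break  # no earlier interaction to follow
        node, time = rng.choice(candidates)  # uniform choice (assumption)
        walk.append(node)
    return walk


edges = [("a", "b", 1), ("b", "c", 2), ("c", "d", 3)]
adj = build_index(edges)
walk = temporal_random_walk(adj, start="d", start_time=4, length=5, rng=random.Random(0))
print(walk)
```

Restricting each step to earlier edges keeps the walk inside the historical neighborhood of the start node, so the visited nodes are ones that interacted before the time of interest.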
1 code implementation • 21 Feb 2020 • Jiacheng Huang, Wei Hu, Zhifeng Bao, Yuzhong Qu
Knowledge bases (KBs) store rich yet heterogeneous entities and facts.
no code implementations • 7 Jan 2019 • Guangliang Gao, Zhifeng Bao, Jie Cao, A. K. Qin, Timos Sellis, Zhiang Wu
Regarding the choice of prediction model, we observe that a variety of approaches either consider the entire house data for modeling, or split the entire data and model each partition independently.