no code implementations • 18 Dec 2024 • Xin Wang, Boyan Gao, Yi Dai, Lei Cao, Liang Zhao, Yibo Yang, David Clifton
We further study the benefits of the proposed Cognition Chain format by using it as a synthetic dataset generation template for LLM instruction-tuning, and introduce CogInstruct, an instruction-tuning dataset for stress detection.
no code implementations • 10 Dec 2024 • Meihao Fan, Ju Fan, Nan Tang, Lei Cao, Guoliang Li, Xiaoyong Du
Many of these tables are derived from web sources or real-world scenarios, which require meticulous data preparation (or data prep) to ensure accurate responses.
no code implementations • 19 Nov 2024 • Xiang Li, Jianpeng Qi, Zhongying Zhao, Guanjie Zheng, Lei Cao, Junyu Dong, Yanwei Yu
To address the above challenges, we propose a novel Unsupervised Multiplex Graph Anomaly Detection method, named UMGAD.
1 code implementation • 31 Oct 2024 • Hao Zhang, Lei Cao, Jiayi Ma
Second, by embedding the combination of the text and zero-shot location model into the diffusion fusion process, a text-controlled fusion re-modulation strategy is developed.
no code implementations • 25 Sep 2024 • Chi Zhang, Huaping Zhong, Kuan Zhang, Chengliang Chai, Rui Wang, Xinlin Zhuang, Tianyi Bai, Jiantao Qiu, Lei Cao, Ju Fan, Ye Yuan, Guoren Wang, Conghui He
For each cluster, if we opt to select data from it, we evaluate the influence on a few samples rather than processing all instances.
no code implementations • 20 Jun 2024 • Ferdi Kossmann, Ziniu Wu, Alex Turk, Nesime Tatbul, Lei Cao, Samuel Madden
In the offline phase, the system pre-computes a gear plan that specifies how to serve inferences online.
1 code implementation • 23 May 2024 • Chunwei Liu, Matthew Russo, Michael Cafarella, Lei Cao, Peter Baille Chen, Zui Chen, Michael Franklin, Tim Kraska, Samuel Madden, Gerardo Vitagliano
We describe the workload of AI-powered analytics tasks, the optimization methods that Palimpzest uses, and the prototype system itself.
1 code implementation • 27 Mar 2024 • Long Shi, Lei Cao, Yunshan Ye, Yu Zhao, Badong Chen
In the context of multi-view clustering, graph learning is recognized as a crucial technique, which generally involves constructing an adaptive neighbor graph based on probabilistic neighbors, and then learning a consensus graph for clustering.
no code implementations • 3 Feb 2024 • Long Shi, Lei Cao, Zhongpu Chen, Badong Chen, Yu Zhao
Additionally, we introduce a convex combination subspace clustering scheme, which combines a linear subspace clustering method with the functional link neural network subspace clustering approach.
1 code implementation • 22 Dec 2023 • Long Shi, Lei Cao, Jun Wang, Badong Chen
Specifically, we stack the data matrices from various views into the block-diagonal locations of the augmented matrix to exploit the complementary information.
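The block-diagonal stacking described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `stack_views_block_diagonal`, the toy view shapes, and the choice of zeros for the off-diagonal blocks are all assumptions made for the example.

```python
import numpy as np


def stack_views_block_diagonal(views):
    """Place each view's data matrix on the diagonal of one augmented matrix.

    Hypothetical helper: off-diagonal blocks are filled with zeros, which is
    an assumption of this sketch rather than a detail given in the abstract.
    """
    total_rows = sum(v.shape[0] for v in views)
    total_cols = sum(v.shape[1] for v in views)
    augmented = np.zeros((total_rows, total_cols), dtype=float)

    row, col = 0, 0
    for v in views:
        n_rows, n_cols = v.shape
        # Copy this view into its own diagonal block.
        augmented[row:row + n_rows, col:col + n_cols] = v
        row += n_rows
        col += n_cols
    return augmented


# Two toy "views" with different shapes (illustrative data only).
view_a = np.ones((2, 3))
view_b = 2 * np.ones((3, 4))
augmented = stack_views_block_diagonal([view_a, view_b])
print(augmented.shape)
```

Each view keeps its own rows and columns in the augmented matrix, so views with different dimensions can be combined without padding or truncation.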
1 code implementation • 7 Oct 2023 • Ferdinand Kossmann, Ziniu Wu, Eugenie Lai, Nesime Tatbul, Lei Cao, Tim Kraska, Samuel Madden
We find that no current system sufficiently fulfills both needs and therefore propose Skyscraper, a system tailored to V-ETL.
no code implementations • 1 Oct 2023 • Zui Chen, Lei Cao, Sam Madden, Tim Kraska, Zeyuan Shang, Ju Fan, Nan Tang, Zihui Gu, Chunwei Liu, Michael Cafarella
As a result, data scientists often have to develop domain-specific solutions tailored to both the dataset and the task, e.g., writing domain-specific code or training machine learning models on a sufficient number of annotated examples.
no code implementations • 6 Jul 2023 • Nan Tang, Chenyu Yang, Ju Fan, Lei Cao, Yuyu Luo, Alon Halevy
We propose that verifying the outputs of generative AI from a data management perspective is an emerging issue for generative AI.
no code implementations • 20 Jun 2023 • Zui Chen, Lei Cao, Sam Madden
Data curation is a wide-ranging area which contains many critical but time-consuming data processing tasks.
no code implementations • 20 Jun 2023 • Zui Chen, Lei Cao, Sam Madden
In addition to the row-based architecture, we introduce several techniques to improve the performance of the RoTaR model: cell-aware position embedding, a teacher-student training paradigm, and selective backward.
1 code implementation • 15 Jun 2023 • Zihui Gu, Ju Fan, Nan Tang, Songyue Zhang, Yuxin Zhang, Zui Chen, Lei Cao, Guoliang Li, Sam Madden, Xiaoyong Du
PLMs can perform well in schema alignment but struggle to achieve complex reasoning, while LLMs are superior in complex reasoning tasks but cannot achieve precise schema alignment.
no code implementations • 2 Jun 2023 • Jiaming Liang, Lei Cao, Samuel Madden, Zachary Ives, Guoliang Li
Timeseries analytics is of great importance in many real-world applications.
no code implementations • 11 Mar 2023 • Yu Wang, Lei Cao, Yizhou Yan, Samuel Madden
Moreover, to effectively handle high-dimensional, highly complex data sets that are hard to summarize with simple rules, we propose a localized STAIR approach, called L-STAIR.
1 code implementation • 20 Apr 2022 • Zhongqiang Gao, Chuanqi Cheng, Yanwei Yu, Lei Cao, Chao Huang, Junyu Dong
We first categorize the temporal motifs based on their distinct properties, and then design customized algorithms that offer efficient strategies to exactly count the motif instances of each category.
no code implementations • 4 Mar 2022 • Guocheng Zhou, Shaohui Zhang, Yao Hu, Lei Cao, Yong Huang, Qun Hao
Fourier ptychography has attracted wide attention for its large space-bandwidth product and its capability for quantitative phase measurement.
no code implementations • 16 Dec 2020 • Lei Cao, Huijun Zhang, Ling Feng
On social media, the most popular platform for self-expression, emotion release, and personal interaction, individuals may exhibit a number of symptoms of suicidal ideation.
no code implementations • IJCNLP 2019 • Lei Cao, Huijun Zhang, Ling Feng, Zihan Wei, Xin Wang, Ningyun Li, Xiaohao He
Although detection of suicidal ideation on social media has made great progress in recent years, posts in which people express themselves implicitly or in an anti-real, contrary way remain an obstacle, preventing detectors from achieving better performance.
no code implementations • 25 Sep 2019 • Yizhou Yan, Lei Cao, Samuel Madden, Elke Rundensteiner
Although the state-of-the-art object detection methods are successful in detecting and classifying objects by leveraging deep convolutional neural networks (CNNs), these methods overlook the semantic context which implies the probabilities that different classes of objects occur jointly.
no code implementations • 25 Sep 2019 • Lei Cao, Yizhou Yan, Samuel Madden, Elke Rundensteiner
Unfortunately, although the strong generalization ability of existing CNNs ensures their accuracy when classifying known objects, it also causes them to often assign an unknown to a target class with high confidence.
no code implementations • ICLR 2019 • Lei Cao, Yizhou Yan, Samuel Madden, Elke Rundensteiner
Modern applications from Autonomous Vehicles to Video Surveillance generate massive amounts of image data.