While existing work has explored utilizing knowledge graphs to enhance language modeling via joint training and customized model architectures, applying this to LLMs is problematic owing to their large number of parameters and high computational cost.
This discrepancy stems from a fundamental limitation: while MPNNs excel at node-level representation, they struggle to encode the joint structural features essential to link prediction, such as Common Neighbor (CN).
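To make the notion of a joint structural feature concrete, the following minimal sketch counts common neighbors for a node pair. The toy graph and the helper name `common_neighbors` are illustrative assumptions, not taken from the paper; the point is only that the value depends on both endpoints together and cannot be read off either node in isolation.

```python
def common_neighbors(adj, u, v):
    """Return the number of nodes adjacent to both u and v."""
    return len(adj[u] & adj[v])


# Toy undirected graph stored as an adjacency map of sets (hypothetical data).
adj = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b"},
    "e": {"b"},
}

cn_ab = common_neighbors(adj, "a", "b")  # a and b share the neighbors c and d
cn_ae = common_neighbors(adj, "a", "e")  # a and e share no neighbor
```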
This tutorial paper provides a general overview of symbolic regression (SR) with a specific focus on standards of interpretability.
In this paper, rather than pursuing state-of-the-art performance, we aim to evaluate the capabilities of LLMs in a wide range of tasks across the chemistry domain.
Data augmentation forms the cornerstone of many modern machine learning training pipelines; yet, the mechanisms by which it works are not clearly understood.
Missing data frequently occurs in datasets across various domains, such as medicine, sports, and finance.
Therefore, to improve the applicability of GNNs and fully encode the complicated topological information, knowledge distillation on graphs (KDG) has been introduced to build a smaller yet effective model and exploit more knowledge from data, leading to model compression and performance improvement.
We propose a set of techniques that can be used by both deep learning model users to identify, visualize and understand class prototypes, sub-concepts and outlier instances; and by imbalanced learning algorithm developers to detect features and class exemplars that are key to model performance.
In this paper, we first identify the dataset shift problem in the link prediction task and provide theoretical analyses on how existing link prediction methods are vulnerable to it.
In this work, to combine the advantages of GNNs and MLPs, we start by exploring direct knowledge distillation (KD) methods for link prediction, i.e., predicted logit-based matching and node representation-based matching.
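The two matching strategies named above can be sketched as loss functions. This is a hedged illustration under stated assumptions, not the paper's formulation: logit-based matching is written here as a soft-target binary cross-entropy between teacher (GNN) and student (MLP) link scores, and representation-based matching as a mean squared error between node embeddings. The function names and the `temperature` parameter are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit_matching_loss(student_logits, teacher_logits, temperature=1.0):
    """Soft-target binary cross-entropy between teacher and student link scores.

    Assumption: each logit scores one candidate link; the teacher's
    softened probability serves as the target for the student.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for s, t in zip(student_logits, teacher_logits):
        p_t = sigmoid(t / temperature)
        p_s = sigmoid(s / temperature)
        total -= p_t * math.log(p_s + eps) + (1.0 - p_t) * math.log(1.0 - p_s + eps)
    return total / len(student_logits)


def representation_matching_loss(student_emb, teacher_emb):
    """Mean squared error between student and teacher node embeddings."""
    total, count = 0.0, 0
    for s_vec, t_vec in zip(student_emb, teacher_emb):
        for s, t in zip(s_vec, t_vec):
            total += (s - t) ** 2
            count += 1
    return total / count


teacher = [2.0, -1.0]
loss_same = logit_matching_loss([2.0, -1.0], teacher)
loss_diff = logit_matching_loss([-2.0, 1.0], teacher)
rep_loss = representation_matching_loss([[0.0, 1.0]], [[0.0, 3.0]])
```

With soft targets, the cross-entropy is smallest when the student reproduces the teacher's probabilities, so `loss_same` is lower than `loss_diff`.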
Existing methods attempt to address this scalability issue by training multi-layer perceptrons (MLPs) exclusively on node content features using labels derived from trained GNNs.
In light of this, we study the problem of generative SSL on heterogeneous graphs and propose HGMAE, a novel heterogeneous graph masked autoencoder model to address these challenges.
Recently, MRL has achieved considerable progress, especially in methods based on deep molecular graph learning.
Graph neural networks (GNNs) continue to achieve state-of-the-art performance on many graph learning tasks, but rely on the assumption that a given graph is a sufficient approximation of the true neighborhood structure.
Learning effective recipe representations is essential in food studies.
We then propose RecipeRec, a novel heterogeneous graph learning model for recipe recommendation.
In light of this, few-shot learning on graphs (FSLG), which combines the strengths of graph representation learning and few-shot learning, has been proposed to tackle the performance degradation caused by limited annotated data.
From a machine learning perspective, we found that the Random Forest model outperformed several deep models on our multimodal, noisy, and imbalanced data set, thus demonstrating the efficacy of our novel feature representation method in such a context.
The self-supervised learning (SSL) paradigm is an essential area of exploration that seeks to eliminate the need for expensive data labeling.
An important advantage of DeepSMOTE over GAN-based oversampling is that DeepSMOTE does not require a discriminator, and it generates high-quality artificial images that are both information-rich and suitable for visual inspection.
The recent success of graph neural networks has significantly boosted molecular property prediction, advancing activities such as drug discovery.
Ranked #1 on Molecular Property Prediction (1-shot) on Tox21
Representation learning has overcome the often arduous and manual featurization of networks through (unsupervised) feature learning as it results in embeddings that can apply to a variety of downstream learning tasks.
In this work, we present a novel framework called CoEvoGNN for modeling dynamic attributed graph sequences.
Noun phrases and relational phrases in Open Knowledge Bases are often not canonical, leading to redundant and ambiguous facts.
The user embeddings preserve spatial patterns and temporal patterns of varying periodicity (e.g., hourly, weekly, and weekday patterns).
10 Jun 2020 • Pablo Robles-Granda, Suwen Lin, Xian Wu, Sidney D'Mello, Gonzalo J. Martinez, Koustuv Saha, Kari Nies, Gloria Mark, Andrew T. Campbell, Munmun De Choudhury, Anind D. Dey, Julie Gregg, Ted Grover, Stephen M. Mattingly, Shayan Mirjafari, Edward Moskal, Aaron Striegel, Nitesh V. Chawla
In this paper, we create a benchmark for predictive analysis of individuals from a perspective that integrates: physical and physiological behavior, psychological states and traits, and job performance.
The challenging problem of semi-supervised node classification has been studied extensively.
Representation learning on networks offers a powerful alternative to the oft painstaking process of manual feature engineering, and as a result, has enjoyed considerable success in recent years.
Conditions are essential in the statements of biological literature.
Experimental results on several downstream tasks, over seven real-world data sets, show that FILDNE is able to reduce memory and computational time costs while providing competitive quality measure gains with respect to the contemporary methods for representation learning on dynamic graphs.
Subsequently, given the signature matrices, a convolutional encoder is employed to encode the inter-sensor (time series) correlations, and an attention-based Convolutional Long Short-Term Memory (ConvLSTM) network is developed to capture the temporal patterns.
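The excerpt does not define how the signature matrices are built. As a hedged sketch, assume each entry is the inner product of two sensors' readings over a recent window, scaled by the window length; the function name, arguments, and toy data below are illustrative assumptions rather than the authors' implementation.

```python
def signature_matrix(series, end, window):
    """Pairwise scaled inner products of sensor readings over one window.

    series: list of n equal-length lists, one per sensor.
    end:    exclusive index of the last time step in the window.
    window: number of time steps to include.
    """
    segments = [s[end - window:end] for s in series]
    n = len(segments)
    return [
        [sum(a * b for a, b in zip(segments[i], segments[j])) / window
         for j in range(n)]
        for i in range(n)
    ]


# Three toy sensors over four time steps (hypothetical data).
sensors = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, -1.0, 1.0, -1.0],
]
m = signature_matrix(sensors, end=4, window=2)
```

Under this assumption the matrix is symmetric and its size depends only on the number of sensors, so matrices computed at successive time steps form an image-like sequence that a convolutional encoder and ConvLSTM can consume.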
A major branch of anomaly detection methods relies on dynamic networks: raw sequence data is first converted to a series of networks, then critical change points are identified in the evolving network structure.
From medical charts to national censuses, healthcare has traditionally operated under a paper-based paradigm.
The effectiveness of such predictions, however, is fundamentally limited by the power-law distribution of citations, whereby publications with few citations are extremely common and publications with many citations are relatively rare.