Knowledge graph inference has been studied extensively due to its wide applications.
We further bridge GCN's preferential attachment bias with unfairness in link prediction and propose a new within-group fairness metric.
Recent advances in large language models (LLMs) have demonstrated notable progress on many mathematical benchmarks.
no code implementations • 19 Jul 2023 • Wei Jin, Haitao Mao, Zheng Li, Haoming Jiang, Chen Luo, Hongzhi Wen, Haoyu Han, Hanqing Lu, Zhengyang Wang, Ruirui Li, Zhen Li, Monica Xiao Cheng, Rahul Goutam, Haiyang Zhang, Karthik Subbian, Suhang Wang, Yizhou Sun, Jiliang Tang, Bing Yin, Xianfeng Tang
To test the potential of the dataset, we introduce three tasks in this work: (1) next-product recommendation, (2) next-product recommendation with domain shifts, and (3) next-product title generation.
The expressive power of graph neural networks is usually measured by comparing how many pairs of graphs or nodes an architecture can possibly distinguish as non-isomorphic to those distinguishable by the $k$-dimensional Weisfeiler-Lehman ($k$-WL) test.
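For reference, the 1-dimensional case of the WL test mentioned above can be sketched as iterative color refinement. This is a minimal illustration and not code from the paper: the function name `wl_distinguishes` and the adjacency-dict input format are assumptions made for the example.

```python
from collections import Counter


def wl_distinguishes(adj_a, adj_b, rounds=3):
    """Return True if 1-WL color refinement tells the two graphs apart.

    adj_a, adj_b: dict mapping each node to a list of its neighbours
    (hypothetical input format chosen for this sketch).
    """
    graphs = [adj_a, adj_b]
    # Every node starts with the same color.
    colors = [{v: 0 for v in g} for g in graphs]
    for _ in range(rounds):
        # A node's signature is its own color plus the sorted multiset
        # of its neighbours' colors.
        sigs = [
            {v: (c[v], tuple(sorted(c[u] for u in g[v]))) for v in g}
            for g, c in zip(graphs, colors)
        ]
        # Relabel signatures with a palette shared by both graphs so the
        # resulting color histograms are directly comparable.
        all_sigs = set(sigs[0].values()) | set(sigs[1].values())
        palette = {s: i for i, s in enumerate(sorted(all_sigs))}
        colors = [{v: palette[s[v]] for v in s} for s in sigs]
        if Counter(colors[0].values()) != Counter(colors[1].values()):
            return True
    return False
```

A known limitation shows up immediately: two disjoint triangles and a single 6-cycle are both 2-regular, so this refinement never separates them even though they are non-isomorphic.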
In practice, however, we might observe multiple systems that are generated across different environments, which differ in latent exogenous factors such as temperature and gravity.
Heterogeneous Information Networks (HINs) are information networks with multiple types of nodes and edges.
We propose Concept2Box, a novel approach that jointly embeds the two views of a KG using dual geometric representations.
Drug-target interaction (DTI) prediction, which aims to predict whether a drug will bind to a target, has received wide attention recently, with the goal of automating and accelerating the costly process of drug design.
Graph neural networks (GNNs) are emerging for machine learning research on graph-structured data.
We model a multi-agent dynamical system as a graph and propose CounterFactual GraphODE (CF-GODE), a causal model that estimates continuous-time counterfactual outcomes in the presence of inter-dependencies between units.
Second, we use examples of user decision-making to provide our LLM-powered planner and reasoner with relevant contextual instances, enhancing their capacity to make informed decisions.
In addition, these programs can be compiled and converted into a control data flow graph (CDFG), and the compiler also provides fine-grained alignment between the code tokens and the CDFG nodes.
NCRL detects the best compositional structure of a rule body, and breaks it into small compositions in order to infer the rule head.
However, GNN explanation for link prediction (LP) is lacking in the literature.
REVEAL consists of four key components: the memory, the encoder, the retriever and the generator.
Ranked #1 on Visual Question Answering (VQA) on A-OKVQA (Accuracy metric)
The existing Active Graph Embedding framework proposes to use centrality score, density score, and entropy score to evaluate the value of unlabeled nodes, and it has been shown to be capable of bringing some improvement to the node classification tasks of Graph Convolutional Networks.
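Of the three scores named above, the entropy score is the simplest to illustrate. The sketch below assumes it is the Shannon entropy of a node's predicted class distribution, which is a common choice in active learning; the exact definition used by the framework is not given here, and the names `entropy_score` and `most_uncertain_node` are hypothetical.

```python
import math


def entropy_score(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    # Terms with zero probability contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0)


def most_uncertain_node(predictions):
    """Pick the unlabeled node whose prediction has the highest entropy.

    predictions: dict mapping node id -> list of class probabilities
    (hypothetical input format for this sketch).
    """
    return max(predictions, key=lambda node: entropy_score(predictions[node]))
```

A node with a near-uniform prediction scores higher than one with a confident prediction, so it would be queried for a label first.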
Answering open-domain questions requires world knowledge about in-context entities.
no code implementations • 15 Nov 2022 • Derek Xu, Shuyan Dong, Changhan Wang, Suyoun Kim, Zhaojiang Lin, Akshat Shrivastava, Shang-Wen Li, Liang-Hsuan Tseng, Alexei Baevski, Guan-Ting Lin, Hung-Yi Lee, Yizhou Sun, Wei Wang
Recent studies find existing self-supervised speech encoders contain primarily acoustic rather than semantic information.
In this paper, we present a machine learning framework that uses the bounds of the benefit function that are estimable from the finite population data to learn the bounds of the benefit function for each cell of characteristics.
In this paper, we formulate the novel problem of code recommendation, whose purpose is to predict the future contribution behaviors of developers given their interaction history, the semantic features of source code, and the hierarchical file structures of projects.
For works that seek to put both views of the KG together, the instance and ontology views are assumed to belong to the same geometric space, such as all nodes embedded in the same Euclidean space or non-Euclidean product space, an assumption no longer reasonable for two-view KGs where different portions of the graph exhibit different structures.
We examine a variety of applications and we thereby demonstrate the effectiveness of our PEM model.
In this paper, we propose NSUBS with two innovations to tackle the challenges: (1) A novel encoder-decoder neural network architecture to dynamically compute the matching information between the query and the target graphs at each search state; (2) A novel look-ahead loss function for training the policy network.
Existing link prediction or graph completion methods have difficulty dealing with event graphs because they are usually designed for a single large graph such as a social network or a knowledge graph, rather than multiple small dynamic event graphs.
First, the risk of having non-causal knowledge is higher, as the shared MTL model needs to encode all knowledge from different tasks, and causal knowledge for one task could be potentially spurious to the other.
In this paper, we explore multilingual KG completion, which leverages limited seed alignment as a bridge, to embrace the collective knowledge from multiple languages.
Ranked #3 on Knowledge Graph Completion on DPB-5L (French)
This research studies graph-based approaches for Answer Sentence Selection (AS2), an essential component for retrieval-based Question Answering (QA) systems.
Explaining machine learning models is an important and increasingly popular area of research interest.
To this end, we construct a network mapping $\phi$, converting a neural network $G_A$ to a directed line graph $G_B$ that is defined on those edges in $G_A$.
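To make the idea of a graph "defined on the edges" concrete, here is a minimal sketch of the standard directed line graph construction: each edge of $G_A$ becomes a node of $G_B$, and $(u, v)$ points to $(v, w)$ whenever the first edge ends where the second begins. This is the textbook construction and only an assumption about what $\phi$ does; the paper's mapping may carry additional structure such as weights. The function name is hypothetical.

```python
def directed_line_graph(edges):
    """Build the directed line graph of a directed graph.

    edges: list of (u, v) pairs, the directed edges of G_A.
    Returns a dict mapping each edge of G_A (a node of G_B) to the list
    of edges it points to in G_B.
    """
    line_adj = {edge: [] for edge in edges}
    for (u, v) in edges:
        for (x, w) in edges:
            # (u, v) -> (x, w) in G_B when the head of the first edge
            # is the tail of the second.
            if v == x:
                line_adj[(u, v)].append((x, w))
    return line_adj
```

For a tiny feed-forward layout with edges a→b, b→c and b→d, the edge a→b becomes a node of $G_B$ pointing to both b→c and b→d.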
High-level synthesis (HLS) has freed computer architects from developing their designs in a very low-level language and from needing to specify exactly how data should be transferred at the register level.
Conversely, multi-layer perceptrons (MLPs) have no graph dependency and infer much faster than GNNs, even though they are less accurate than GNNs for node classification in general.
Ranked #3 on Node Classification on AMZ Computers
This review systematizes the emerging literature for causal inference using deep neural networks under the potential outcomes framework.
Answering complex open-domain questions requires understanding the latent relations between the entities involved.
The detection of fake news often requires sophisticated reasoning skills, such as logically combining information by considering word-level subtle clues.
Answering complex First-Order Logical (FOL) queries on large-scale incomplete knowledge graphs (KGs) is an important yet challenging task.
To facilitate various downstream applications using clinical case reports (CCRs), we pre-train two deep contextualized language models, Clinical Embeddings from Language Model (C-ELMo) and Clinical Contextual String Embeddings (C-Flair), using the clinical-related corpus from PubMed Central.
The cross-view association model is learned to bridge the embeddings of ontological concepts and their corresponding instance-view entities.
Leveraging a wide range of biological knowledge, such as gene ontology and protein-protein interaction (PPI) networks from other closely related species, presents a vital approach to infer the molecular impact of a new species.
Clinical case reports are written descriptions of the unique aspects of a particular clinical case, playing an essential role in sharing clinical experiences about atypical disease phenotypes and new therapies.
Detecting the Maximum Common Subgraph (MCS) between two input graphs is fundamental for applications in biomedical analysis, malware detection, cloud computing, etc.
Our framework MotIf-driven Contrastive leaRning Of Graph representations (MICRO-Graph) can: 1) use GNNs to extract motifs from large graph datasets; 2) leverage learned motifs to sample informative subgraphs for contrastive learning of GNN.
A heterogeneous information network (HIN) has as vertices objects of different types and as edges the relations between objects, which are also of various types.
There has been a steady need in the medical community to precisely extract the temporal relations between clinical events.
To bridge the gap between theoretical graph attacks and real-world scenarios, in this work, we propose a novel and more realistic setting: strict black-box graph attack, in which the attacker has no knowledge about the victim model at all and is not allowed to send any queries.
Additionally, we explore a provable connection between the robustness of the unsupervised graph encoder and that of models on downstream tasks.
In this paper, we propose to learn system dynamics from irregularly-sampled partial observations with underlying graph structure for the first time.
Then we combine GNNs and our proposed variational graph pooling layers for joint graph representation learning and graph coarsening, after which the graph is progressively coarsened to one node.
Predicting missing facts in a knowledge graph (KG) is a crucial task in knowledge base construction and reasoning, and it has been the subject of much research in recent works using KG embeddings.
Ranked #2 on Knowledge Graph Completion on DPB-5L (French)
We introduce Bi-GNN for modeling biological link prediction tasks such as drug-drug interaction (DDI) and protein-protein interaction (PPI).
However, the incompleteness of the labels and the features in social network datasets is tricky to handle, not to mention the enormous data size and the heterogeneity.
To continue to advance this research, we present the program-derived semantics graph, a new graphical structure to capture semantics of code.
Since there is already a broad body of HNE algorithms, as the first contribution of this work, we provide a generic paradigm for the systematic categorization and analysis of the merits of various existing HNE algorithms.
Recent years have witnessed the emerging success of graph neural networks (GNNs) for modeling structured data.
Ranked #21 on Node Property Prediction on ogbn-mag
However, MCS computation is NP-hard, and state-of-the-art MCS solvers rely on heuristic search algorithms, which in practice cannot find a good solution for large graph pairs given a limited computation budget.
Original full-batch GCN training requires calculating the representation of all the nodes in the graph per GCN layer, which brings in high computation and memory costs.
However, an in-depth analysis is still lacking on (1) whether there exists a single filter that performs best on all graph data; (2) which graph properties influence the optimal choice of graph filter; and (3) how to design an appropriate filter adapted to the graph data.
Embedding layers are commonly used to map discrete symbols into continuous embedding vectors that reflect their semantic meanings.
Maximum Common Subgraph (MCS) is defined as the largest subgraph that is commonly present in both graphs of a graph pair.
Despite the impressive success of graph convolutional networks (GCNs) on numerous applications, training on large-scale sparse networks remains challenging.
Existing approaches for learning word embeddings often assume there are sufficient occurrences for each word in the corpus, such that the representation of words can be accurately estimated from their contexts.
With the proposed pre-training procedure, the generic structural information is learned and preserved; thus the pre-trained GNN requires less labeled data and fewer domain-specific features to achieve high performance on different downstream tasks.
By training on small-scale networks, the learned model is capable of assigning relative BC scores to nodes for any unseen networks, and thus identifying the highly-ranked nodes.
In this work, we propose a dissection of GNNs on graph classification into two parts: 1) the graph filtering, where graph-based neighbor aggregations are performed, and 2) the set function, where a set of hidden node features are composed for prediction.
Ranked #1 on Graph Classification on RE-M12K
Fairness has become a central issue for our research community as classification algorithms are adopted in societally critical domains such as recidivism prediction and loan approval.
We introduce a novel approach to graph-level representation learning, which is to embed an entire graph into a vector space where the embeddings of two graphs preserve their graph-graph proximity.
Ranked #1 on Graph Classification on Web
However, many KGs model uncertain knowledge, typically representing the inherent uncertainty of relation facts with a confidence score, and embedding such uncertain knowledge remains an unresolved challenge.
We introduce GSimCNN (Graph Similarity Computation via Convolutional Neural Networks) for predicting the similarity score between two graphs.
Since computing the exact distance/similarity between two graphs is typically NP-hard, a series of approximate methods have been proposed with a trade-off between accuracy and speed.
Our model achieves better generalization on unseen graphs, and in the worst case runs in quadratic time with respect to the number of nodes in two graphs.
Ranked #1 on Graph Similarity on IMDb
The selection of heroes, also known as pick or draft, takes place before the match starts and alternates between the two teams until each player has selected one hero.
Deck building is a crucial component in playing Collectible Card Games (CCGs).
Conventional embedding methods directly associate each symbol with a continuous embedding vector, which is equivalent to applying a linear transformation based on a "one-hot" encoding of the discrete symbols.
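The equivalence stated above can be checked directly: looking up row `i` of an embedding table gives the same vector as multiplying the one-hot encoding of `i` by that table. The sketch below uses plain Python lists and made-up table values purely for illustration; all names are hypothetical.

```python
def one_hot(index, vocab_size):
    """One-hot row vector for a discrete symbol."""
    return [1.0 if i == index else 0.0 for i in range(vocab_size)]


def vec_times_matrix(vec, matrix):
    """Row vector times matrix, i.e. a linear transformation."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [sum(vec[i] * matrix[i][j] for i in range(n_rows)) for j in range(n_cols)]


# Illustrative embedding table: 3 symbols, 2 dimensions (values made up).
embedding_table = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
]


def lookup(index):
    """Conventional embedding lookup: select one row of the table."""
    return embedding_table[index]
```

For every symbol, `vec_times_matrix(one_hot(i, 3), embedding_table)` returns the same vector as `lookup(i)`, which is why the table can be read as the weight matrix of a linear layer applied to one-hot inputs.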
To the best of our knowledge, this is the first study to use Heterogeneous Information Network for modeling clinical data and disease diagnosis.
In this paper, we propose a general neural network-based recommendation framework, which subsumes several existing state-of-the-art recommendation algorithms, and address the efficiency issue by investigating sampling strategies in the stochastic gradient descent training for the framework.
While latent factors of items can be learned effectively from user interaction data, in many cases, such data is not available, especially for newly emerged items.
The problem of ideology detection is to study the latent (political) placement for people, which is traditionally studied on politicians according to their voting behaviors.
To address the challenges, we propose a task-guided and path-augmented heterogeneous network embedding model.
Anomaly detection plays an important role in modern data-driven security applications, such as detecting suspicious access to a socket from a process.
First, we can boost the diversity of classification ensemble by incorporating multiple clustering outputs, each of which provides grouping constraints for the joint label predictions of a set of related objects.