The idea is to supplement the main GNN-based supervised recommendation task with temporal representations via an auxiliary cross-view contrastive learning mechanism.
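As a rough sketch of how such an auxiliary objective can be wired in (not the paper's actual implementation), the snippet below combines a standard supervised recommendation loss with an InfoNCE-style cross-view contrastive term; the view embeddings `z_graph`, `z_temporal` and the weight `lambda_cl` are illustrative placeholders.

```python
# Minimal sketch: supervised recommendation loss + auxiliary cross-view contrastive loss.
# The two "views" (e.g. a structural GNN view and a temporal view) are assumed to be
# produced by upstream encoders; all tensors here are toy stand-ins.
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.2) -> torch.Tensor:
    """Contrastive loss treating (z1[i], z2[i]) as the positive pair for row i."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau                      # cosine similarities / temperature
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

# toy per-user embeddings from the two views, plus toy recommendation logits
z_graph = torch.randn(128, 64, requires_grad=True)
z_temporal = torch.randn(128, 64, requires_grad=True)
scores = torch.randn(128, 10, requires_grad=True)
targets = torch.randint(0, 10, (128,))

lambda_cl = 0.1                                      # weight of the auxiliary task (assumed)
loss = F.cross_entropy(scores, targets) + lambda_cl * info_nce(z_graph, z_temporal)
loss.backward()
```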
In this work, we present the Knowledge Graph Transformer (kgTransformer) with masked pre-training and fine-tuning strategies.
NES then efficiently computes the network embedding from this representative subgraph.
The results demonstrate that our method outperforms various SOTA GNNs for stable prediction on graphs with agnostic distribution shift, including shifts caused by node labels and attributes.
RetroPrime achieves Top-1 accuracies of 64.8% and 51.4% on the USPTO-50K dataset when the reaction type is known and unknown, respectively.
In this work, we show that with merely a small fraction of contexts (Q-contexts) which are typical in the whole corpus (and their mutual information with words), one can construct high-quality word embeddings with negligible errors.
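A hedged illustration of the general idea (not the paper's algorithm): select a small set of contexts, compute pointwise mutual information between words and those contexts, and factorize the resulting matrix; the selection rule and the name `q_contexts` are assumptions made for the example.

```python
# Sketch: word vectors from the word-by-(selected context) PMI matrix via truncated SVD.
import numpy as np

rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(1000, 1000)).astype(float)   # toy word-context co-occurrence counts

# pick a small fraction of "typical" contexts; here simply the most frequent ones (assumption)
q_contexts = np.argsort(counts.sum(axis=0))[-100:]

total = counts.sum()
p_w = counts.sum(axis=1, keepdims=True) / total
p_c = counts.sum(axis=0, keepdims=True) / total
pmi = np.log(np.maximum(counts / total, 1e-12) / (p_w * p_c))
pmi = np.maximum(pmi, 0.0)[:, q_contexts]                     # positive PMI, Q-contexts only

# low-rank factorization of the word-by-Q-context PMI matrix yields the embeddings
u, s, _ = np.linalg.svd(pmi, full_matrices=False)
embeddings = u[:, :64] * np.sqrt(s[:64])
print(embeddings.shape)                                        # (1000, 64)
```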
The emergence of deep learning models makes modeling data patterns in large quantities of data possible.
no code implementations • 14 Jun 2021 • Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, Yuqi Huo, Jiezhong Qiu, Yuan YAO, Ao Zhang, Liang Zhang, Wentao Han, Minlie Huang, Qin Jin, Yanyan Lan, Yang Liu, Zhiyuan Liu, Zhiwu Lu, Xipeng Qiu, Ruihua Song, Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu
Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved great success and become a milestone in the field of artificial intelligence (AI).
However, training trillion-scale MoE requires algorithm and system co-design for a well-tuned high performance distributed training system.
On a wide range of tasks across NLU, conditional and unconditional generation, GLM outperforms BERT, T5, and GPT given the same model sizes and data, and achieves the best performance from a single pretrained model with 1.25x the parameters of BERT-Large, demonstrating its generalizability to different downstream tasks.
Based on the theoretical analysis, we propose Local Clustering Graph Neural Networks (LCGNN), a GNN learning paradigm that utilizes local clustering to efficiently search for small but compact subgraphs for GNN training and inference.
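The following is a minimal sketch of the general local-clustering-for-subgraph-selection idea, not LCGNN's exact procedure: personalized PageRank from a seed node scores its neighborhood, and the top-scoring nodes form a small subgraph for GNN training.

```python
# Sketch: extract a small, compact subgraph around a seed node via personalized PageRank.
import numpy as np

def ppr_local_cluster(adj: np.ndarray, seed: int, alpha: float = 0.15, k: int = 32) -> np.ndarray:
    """Return indices of the k nodes with the highest personalized PageRank from `seed`."""
    n = adj.shape[0]
    deg = np.maximum(adj.sum(axis=1), 1.0)
    P = adj / deg[:, None]                          # row-stochastic transition matrix
    e = np.zeros(n)
    e[seed] = 1.0
    r = e.copy()
    for _ in range(100):                            # power iteration for PPR
        r = alpha * e + (1 - alpha) * P.T @ r
    return np.argsort(r)[-k:]                       # top-k nodes form the local cluster

adj = (np.random.rand(500, 500) < 0.01).astype(float)
adj = np.maximum(adj, adj.T)                        # symmetrize the toy graph
nodes = ppr_local_cluster(adj, seed=0)
sub_adj = adj[np.ix_(nodes, nodes)]                 # induced subgraph for GNN training/inference
print(nodes.shape, sub_adj.shape)
```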
Our result gives the first bound on the convergence rate of the co-occurrence matrix and the first sample complexity analysis in graph representation learning.
Graph representation learning has emerged as a powerful technique for addressing real-world problems.
We present BlockBERT, a lightweight and efficient BERT model for better modeling long-distance dependencies.
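To illustrate the blockwise-attention idea in general terms (this is not BlockBERT's exact masking scheme), the sketch below restricts self-attention to fixed-size blocks so each token only attends within its own block.

```python
# Sketch: block-diagonal attention mask applied to scaled dot-product attention.
import torch

def block_diagonal_mask(seq_len: int, block: int) -> torch.Tensor:
    """Boolean mask that only allows attention within the same block."""
    ids = torch.arange(seq_len) // block
    return ids[:, None] == ids[None, :]

def blocked_attention(q, k, v, block: int):
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    mask = block_diagonal_mask(q.size(-2), block).to(q.device)
    scores = scores.masked_fill(~mask, float("-inf"))  # forbid cross-block attention
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(2, 512, 64)                 # (batch, seq_len, head_dim)
out = blocked_attention(q, k, v, block=128)
print(out.shape)                                     # torch.Size([2, 512, 64])
```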
Previous research shows that 1) popular network embedding methods, such as DeepWalk, are in essence implicitly factorizing a matrix with a closed form, and 2) the explicit factorization of such a matrix generates more powerful embeddings than existing methods.
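In the spirit of this explicit-factorization view (a sketch, not a faithful reproduction of the paper's method), one can build the closed-form matrix that DeepWalk implicitly factorizes and decompose it with a truncated SVD; the window size `T` and negative-sampling parameter `b` below are illustrative.

```python
# Sketch: explicit factorization of the closed-form DeepWalk matrix (NetMF-style).
import numpy as np

def netmf_embed(adj: np.ndarray, T: int = 10, b: float = 1.0, dim: int = 32) -> np.ndarray:
    vol = adj.sum()
    d = np.maximum(adj.sum(axis=1), 1.0)
    P = adj / d[:, None]                             # random-walk transition matrix
    S = np.zeros_like(adj)
    Pr = np.eye(adj.shape[0])
    for _ in range(T):                               # P + P^2 + ... + P^T
        Pr = Pr @ P
        S += Pr
    M = (vol / (b * T)) * S / d[None, :]             # closed-form matrix (right-multiplied by D^-1)
    logM = np.log(np.maximum(M, 1.0))                # truncated element-wise logarithm
    u, s, _ = np.linalg.svd(logM, full_matrices=False)
    return u[:, :dim] * np.sqrt(s[:dim])

adj = (np.random.rand(200, 200) < 0.05).astype(float)
adj = np.maximum(adj, adj.T)
emb = netmf_embed(adj)
print(emb.shape)                                      # (200, 32)
```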
1 code implementation • 22 Jun 2019 • Guangyong Chen, Pengfei Chen, Chang-Yu Hsieh, Chee-Kong Lee, Benben Liao, Renjie Liao, Weiwen Liu, Jiezhong Qiu, Qiming Sun, Jie Tang, Richard Zemel, Shengyu Zhang
We introduce a new molecular dataset, named Alchemy, for developing machine learning models useful in chemistry and material science.
Inspired by the recent success of deep neural networks in a wide range of computing applications, we design an end-to-end framework, DeepInf, to learn users' latent feature representation for predicting social influence.
We study the problem of knowledge base (KB) embedding, which is usually addressed through two frameworks: neural KB embedding and tensor decomposition.
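As a hedged illustration of the two frameworks, using well-known representatives rather than the paper's own models, TransE stands in for neural KB embedding and DistMult for tensor decomposition below.

```python
# Sketch: scoring a (head, relation, tail) triple under the two KB-embedding views.
import torch

n_ent, n_rel, dim = 1000, 50, 64
ent = torch.nn.Embedding(n_ent, dim)
rel = torch.nn.Embedding(n_rel, dim)

h, r, t = torch.tensor([3]), torch.tensor([7]), torch.tensor([42])  # toy triple indices

# neural / translation-style score: plausible triples have small ||h + r - t||
transe_score = -torch.norm(ent(h) + rel(r) - ent(t), p=1, dim=-1)

# tensor-decomposition-style score: trilinear product <h, r, t>
distmult_score = (ent(h) * rel(r) * ent(t)).sum(dim=-1)

print(transe_score.item(), distmult_score.item())
```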
This work lays the theoretical foundation for skip-gram based network embedding methods, leading to a better understanding of latent network representation learning.