1 code implementation • EMNLP 2021 • Shitao Xiao, Zheng Liu, Yingxia Shao, Defu Lian, Xing Xie
In this work, we propose the Matching-oriented Product Quantization (MoPQ), where a novel objective Multinoulli Contrastive Loss (MCL) is formulated.
no code implementations • ICML 2020 • Fangcheng Fu, Yuzheng Hu, Yihan He, Jiawei Jiang, Yingxia Shao, Ce Zhang, Bin Cui
Recent years have witnessed intensive research interests on training deep neural networks (DNNs) more efficiently by quantization-based compression methods, which facilitate DNNs training in two ways: (1) activations are quantized to shrink the memory consumption, and (2) gradients are quantized to decrease the communication cost.
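As a toy illustration of these two mechanisms, here is a minimal uniform quantize/dequantize round trip in NumPy; the 8-bit width, rounding scheme, and function names are illustrative assumptions, not the specific compression scheme of the paper. Storing activations as `uint8` instead of `float32` shrinks memory fourfold, and applying the same round trip to gradients before sending them shrinks communication similarly.

```python
import numpy as np

def quantize(x, num_bits=8):
    """Uniformly quantize a float array to num_bits integers plus a scale and offset."""
    zero = x.min()
    scale = float(x.max() - zero) / (2 ** num_bits - 1) or 1.0  # avoid div-by-zero on constant input
    q = np.round((x - zero) / scale).astype(np.uint8)           # 4x smaller than float32
    return q, scale, zero

def dequantize(q, scale, zero):
    """Recover an approximation of the original float array."""
    return q.astype(np.float32) * scale + zero
```

The reconstruction error of each element is bounded by half the quantization step `scale`, which is the usual accuracy/footprint trade-off these methods tune.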
no code implementations • 20 Oct 2024 • Zhenyu Lin, Hongzheng Li, Yingxia Shao, Guanhua Ye, Yawen Li, Quanqing Xu
The existing research on efficient data augmentation methods and ideal pretext tasks for graph contrastive learning remains limited, resulting in suboptimal node representation in the unsupervised setting.
1 code implementation • 24 Sep 2024 • Chaofan Li, Minghao Qin, Shitao Xiao, Jianlyu Chen, Kun Luo, Yingxia Shao, Defu Lian, Zheng Liu
To this end, we introduce a novel model bge-en-icl, which employs few-shot examples to produce high-quality text embeddings.
no code implementations • 19 Aug 2024 • Yuanhao Zeng, Fei Ren, Xinpeng Zhou, Yihang Wang, Yingxia Shao
Although instruction tuning is widely used to adjust behavior in Large Language Models (LLMs), extensive empirical evidence and research indicate that it is primarily a process where the model fits specific task formats, rather than acquiring new knowledge or capabilities.
1 code implementation • 16 Aug 2024 • Rui Wang, Mengshi Qi, Yingxia Shao, Anfu Zhou, Huadong Ma
To tackle this challenge, we introduce a novel physics-informed temporal network (PITN) with adversarial contrastive learning to enable precise BP estimation with very limited data.
1 code implementation • 7 Jun 2024 • Xizhi Gu, Hongzheng Li, Shihong Gao, Xinyan Zhang, Lei Chen, Yingxia Shao
To address this memory problem, a popular solution is mini-batch GNN training.
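A minimal sketch of what mini-batch GNN training with neighbor sampling typically looks like; the uniform sampler, the `adj`/`fanout` names, and the dict-of-lists graph are illustrative assumptions rather than this paper's method:

```python
import random

def minibatches(train_nodes, batch_size, rng):
    """Shuffle the training nodes (in place) and yield fixed-size seed batches."""
    rng.shuffle(train_nodes)
    for i in range(0, len(train_nodes), batch_size):
        yield train_nodes[i:i + batch_size]

def sample_neighbors(adj, seeds, fanout, rng):
    """For each seed node, keep at most `fanout` uniformly sampled neighbors,
    so the memory of one training step is bounded by the batch, not the graph."""
    return {v: rng.sample(adj.get(v, []), min(fanout, len(adj.get(v, []))))
            for v in seeds}
```

Each batch plus its sampled neighborhood forms the small computation graph on which one forward/backward pass runs, which is what keeps memory bounded.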
1 code implementation • 1 Apr 2024 • Yuanhao Zeng, Min Wang, Yihang Wang, Yingxia Shao
With the same amount of task data, TELL improves task performance more effectively than SFT.
no code implementations • 12 Mar 2024 • Linmei Hu, Hongyu He, Duokang Wang, Ziwang Zhao, Yingxia Shao, Liqiang Nie
Furthermore, we utilize the LLM to enrich the information of personality labels for enhancing the detection performance.
1 code implementation • 24 Dec 2023 • Chaofan Li, Zheng Liu, Shitao Xiao, Yingxia Shao
LLaRA consists of two pretext tasks: EBAE (Embedding-Based Auto-Encoding) and EBAR (Embedding-Based Auto-Regression), where the text embeddings from the LLM are used to reconstruct the tokens of the input sentence and predict the tokens of the next sentence, respectively.
1 code implementation • 27 Nov 2023 • Hailin Zhang, Penghao Zhao, Xupeng Miao, Yingxia Shao, Zirui Liu, Tong Yang, Bin Cui
Learnable embedding vectors are among the most important applications in machine learning and are widely used in various database-related domains.
no code implementations • 5 Nov 2023 • Peiyu Liu, Junping Du, Yingxia Shao, Zeli Guan
The CasAug model proposed in this paper, which builds on the CasRel framework and incorporates a semantic enhancement mechanism, can alleviate this problem to a certain extent.
no code implementations • 2 Nov 2023 • Weikang Chen, Junping Du, Yingxia Shao, Jia Wang, Yangxi Zhou
Federated learning enables collaborative training and optimization of global models among a group of devices without sharing local data samples.
no code implementations • 1 Nov 2023 • Runze Fang, Yawen Li, Yingxia Shao, Zeli Guan, Zhe Xue
The entity alignment of science and technology patents aims to link equivalent entities across the knowledge graphs of different science and technology patent data sources.
no code implementations • 17 Oct 2023 • Xinyi Gao, Wentao Zhang, Junliang Yu, Yingxia Shao, Quoc Viet Hung Nguyen, Bin Cui, Hongzhi Yin
To further accelerate the inference of Scalable GNNs in this inductive setting, we propose an online propagation framework and two novel node-adaptive propagation methods that customize the optimal propagation depth for each node based on its topological information, thereby avoiding redundant feature propagation.
no code implementations • 22 Jun 2023 • Tianyu Zhao, Junping Du, Yingxia Shao, Zeli Guan
The algorithm combines OPTICS clustering with adaptive learning technology and can effectively deal with non-independent and identically distributed (non-IID) data across different user terminals.
1 code implementation • 4 May 2023 • Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
It is designed to improve the quality of semantic representation where all contextualized embeddings of the pre-trained model can be leveraged.
no code implementations • 1 Nov 2022 • Xinyi Gao, Wentao Zhang, Yingxia Shao, Quoc Viet Hung Nguyen, Bin Cui, Hongzhi Yin
Graph neural networks (GNNs) have demonstrated excellent performance in a wide range of applications.
no code implementations • 1 Nov 2022 • Yingxia Shao, Hongzheng Li, Xizhi Gu, Hongbo Yin, Yawen Li, Xupeng Miao, Wentao Zhang, Bin Cui, Lei Chen
In recent years, many efforts have been made on distributed GNN training, and an array of training algorithms and systems have been proposed.
no code implementations • 12 Oct 2022 • Yuxin Liu, Yawen Li, Yingxia Shao, Zeli Guan
Therefore, a hypergraph neural network model based on dual-channel convolution is proposed.
no code implementations • 7 Oct 2022 • Runze Fang, Junping Du, Yingxia Shao, Zeli Guan
However, most of them only establish separate table features for each relationship, which ignores the implicit relationship between different entity pairs and different relationship features.
2 code implementations • 2 Sep 2022 • Ling Yang, Zhilong Zhang, Yang Song, Shenda Hong, Runsheng Xu, Yue Zhao, Yingxia Shao, Wentao Zhang, Bin Cui, Ming-Hsuan Yang
This survey aims to provide a contextualized, in-depth look at the state of diffusion models, identifying the key areas of focus and pointing to potential areas for further exploration.
no code implementations • 30 Jun 2022 • Chengjie Ma, Junping Du, Yingxia Shao, Ang Li, Zeli Guan
We provide a simple and general solution for discovering scarce topics in unbalanced short-text datasets: CWIBTD, a word co-occurrence network-based model that simultaneously addresses the sparsity and imbalance of short-text topics and attenuates the effect of occasional pairwise word occurrences, allowing the model to focus more on the discovery of scarce topics.
no code implementations • 6 Jun 2022 • Xingchen Liu, Yawen Li, Yingxia Shao, Ang Li, Jian Liang
Based on this, we propose a car review text sentiment analysis model based on adversarial training and whole-word-mask BERT (ATWWM-BERT).
no code implementations • 5 Jun 2022 • Jia Wang, Junping Du, Yingxia Shao, Ang Li
In this paper, we study the text sentiment classification of online travel reviews posted on social media and propose the SCCL model based on a capsule network and a sentiment lexicon.
1 code implementation • 24 May 2022 • Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
The sentence embedding is generated from the encoder's masked input; then, the original sentence is recovered based on the sentence embedding and the decoder's masked input via masked language modeling.
Ranked #1 on Information Retrieval on MSMARCO
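The encode-then-reconstruct workflow described above can be sketched schematically as follows; `encode` and `decode` are stand-ins for the actual encoder and decoder networks, and the two mask ratios are illustrative assumptions, not the paper's exact settings:

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, ratio, rng):
    """Replace a fraction of tokens with [MASK]; return the masked copy and positions."""
    n = max(1, int(len(tokens) * ratio))
    pos = set(rng.sample(range(len(tokens)), n))
    return [MASK if i in pos else t for i, t in enumerate(tokens)], sorted(pos)

def train_step(tokens, encode, decode, rng):
    # Encoder sees a lightly masked input and compresses it into one sentence embedding.
    enc_in, _ = mask_tokens(tokens, 0.30, rng)
    sent_emb = encode(enc_in)
    # Decoder sees an aggressively masked copy plus the embedding and must recover
    # the original tokens at the masked positions (masked language modeling).
    dec_in, targets = mask_tokens(tokens, 0.50, rng)
    recovered = decode(sent_emb, dec_in, targets)
    # Pair each masked position with its gold token and the decoder's prediction.
    return [(i, tokens[i], recovered[i]) for i in targets]
```

The point of the asymmetry is that the aggressive decoder-side masking forces most of the reconstruction signal through the single sentence embedding, improving its quality.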
no code implementations • 20 Apr 2022 • Bowen Yu, Yingxia Shao, Ang Li
In recent years, with the rapid growth of Internet data, the number and types of scientific and technological resources are also rapidly expanding.
no code implementations • 13 Apr 2022 • Suyu Ouyang, Yingxia Shao, Ang Li
The scientific and technological resources of experts and scholars are mainly composed of basic attributes and scientific research achievements.
no code implementations • 13 Apr 2022 • Yuhui Wang, Yingxia Shao, Ang Li
In the era of big data, intellectual-property-oriented scientific and technological resources exhibit large data scale, high information density, and low value density. This brings severe challenges to the effective use of intellectual property resources, and the demand for mining the hidden information in intellectual property is growing.
2 code implementations • 1 Apr 2022 • Shitao Xiao, Zheng Liu, Weihao Han, Jianjin Zhang, Defu Lian, Yeyun Gong, Qi Chen, Fan Yang, Hao Sun, Yingxia Shao, Denvy Deng, Qi Zhang, Xing Xie
We perform comprehensive explorations for the optimal conduct of knowledge distillation, which may provide useful insights for learning VQ-based ANN indexes.
no code implementations • 31 Mar 2022 • Suyu Ouyang, Yingxia Shao, Junping Du, Ang Li
The knowledge extraction task is to extract triple relations (head entity-relation-tail entity) from unstructured text data.
no code implementations • 21 Mar 2022 • Bowen Yu, Junping Du, Yingxia Shao
With the rapid growth in the number and types of web resources, problems remain when a single strategy is used to extract the text information of different pages.
no code implementations • 21 Mar 2022 • Yuhui Wang, Junping Du, Yingxia Shao
This paper proposes a method for extracting intellectual property entities based on Transformer and technical word information, and provides accurate word vector representations in combination with the BERT language model.
1 code implementation • 18 Feb 2022 • Tianyu Zhao, Cheng Yang, Yibo Li, Quan Gan, Zhenyi Wang, Fengqi Liang, Huan Zhao, Yingxia Shao, Xiao Wang, Chuan Shi
Heterogeneous Graph Neural Network (HGNN) has been successfully employed in various tasks, but we cannot accurately know the importance of different design dimensions of HGNNs due to diverse architectures and applied scenarios.
no code implementations • 13 Feb 2022 • Jianjin Zhang, Zheng Liu, Weihao Han, Shitao Xiao, Ruicheng Zheng, Yingxia Shao, Hao Sun, Hanqing Zhu, Premkumar Srinivasan, Denvy Deng, Qi Zhang, Xing Xie
On the other hand, the capability of high-CTR retrieval is optimized by learning to discriminate the user's clicked ads from the entire corpus.
2 code implementations • 14 Jan 2022 • Shitao Xiao, Zheng Liu, Weihao Han, Jianjin Zhang, Yingxia Shao, Defu Lian, Chaozhuo Li, Hao Sun, Denvy Deng, Liangjie Zhang, Qi Zhang, Xing Xie
In this work, we tackle this problem with Bi-Granular Document Representation, where the lightweight sparse embeddings are indexed and standby in memory for coarse-grained candidate search, and the heavyweight dense embeddings are hosted in disk for fine-grained post verification.
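A sketch of such two-stage retrieval: a coarse pass scores the lightweight in-memory embeddings to form a candidate set, then a fine pass re-ranks the candidates against heavyweight embeddings fetched on demand (e.g. from disk). The brute-force dot-product scoring and all names here are simplifications standing in for a real ANN index:

```python
import numpy as np

def coarse_search(sparse_index, query_sparse, k):
    """Stage 1: score the lightweight embeddings held in memory, keep top-k candidates."""
    scores = sparse_index @ query_sparse
    return np.argsort(-scores)[:k]

def post_verify(load_dense, candidates, query_dense, k):
    """Stage 2: fetch heavyweight dense embeddings only for the candidates
    (the loader abstracts disk access) and re-rank them."""
    dense = np.stack([load_dense(i) for i in candidates])
    order = np.argsort(-(dense @ query_dense))[:k]
    return [candidates[i] for i in order]
```

Because only the small candidate set ever touches the dense store, the expensive embeddings never need to fit in memory.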
4 code implementations • Science China Information Sciences 2021 • Shitao Xiao, Yingxia Shao, Yawen Li, Hongzhi Yin, Yanyan Shen, Bin Cui
In this paper, we model an interaction between user and item as an edge and propose a novel CF framework, called learnable edge collaborative filtering (LECF).
2 code implementations • 24 Aug 2021 • Xin Xia, Hongzhi Yin, Junliang Yu, Yingxia Shao, Lizhen Cui
In this paper, for informative session-based data augmentation, we combine self-supervised learning with co-training, and then develop a framework to enhance session-based recommendation.
no code implementations • The VLDB Journal 2021 • Yingxia Shao, Shiyue Huang, Yawen Li, Xupeng Miao, Bin Cui, Lei Chen
In this paper, to clearly compare the efficiency of various node sampling methods, we first design a cost model and propose two new node sampling methods: one follows the acceptance-rejection paradigm to achieve a better balance between memory and time cost, and the other is optimized for fast sampling of the skewed probability distributions that exist in natural graphs.
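Acceptance-rejection sampling over a skewed discrete distribution can be sketched as follows; this is the generic textbook version with a uniform proposal, not necessarily the exact sampler the paper proposes:

```python
import random

def ar_sample(weights, rng):
    """Draw index i with probability weights[i] / sum(weights) via
    acceptance-rejection against a uniform proposal."""
    n, wmax = len(weights), max(weights)
    while True:
        i = rng.randrange(n)                 # propose uniformly: O(1) extra memory
        if rng.random() * wmax <= weights[i]:
            return i                          # accept with probability weights[i] / wmax
```

The memory/time trade-off the paper refers to shows up here directly: the sampler stores nothing beyond the weights, but heavily skewed weights raise the rejection rate and hence the expected time per draw.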
1 code implementation • 18 Feb 2021 • Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
Secondly, it improves the data efficiency of the training workflow, where non-informative data can be eliminated from encoding.
no code implementations • 8 Dec 2020 • Yang Li, Jiawei Jiang, Jinyang Gao, Yingxia Shao, Ce Zhang, Bin Cui
In this framework, the BO methods are used to solve the HPO problem for each ML algorithm separately, incorporating a much smaller hyperparameter space for BO methods.
1 code implementation • 10 Oct 2020 • Xingyu Yao, Yingxia Shao, Bin Cui, Lei Chen
Finally, with the new edge sampler and random walk model abstraction, we carefully implement a scalable NRL framework called UniNet.
no code implementations • 10 Oct 2019 • Xupeng Miao, Nezihe Merve Gürel, Wentao Zhang, Zhichao Han, Bo Li, Wei Min, Xi Rao, Hansheng Ren, Yinan Shan, Yingxia Shao, Yujie Wang, Fan Wu, Hui Xue, Yaming Yang, Zitao Zhang, Yang Zhao, Shuai Zhang, Yujing Wang, Bin Cui, Ce Zhang
Despite the wide application of Graph Convolutional Network (GCN), one major limitation is that it does not benefit from the increasing depth and suffers from the oversmoothing problem.
no code implementations • 3 Jul 2019 • Fangcheng Fu, Jiawei Jiang, Yingxia Shao, Bin Cui
Gradient boosting decision tree (GBDT) is a widely-used machine learning algorithm in both data analytic competitions and real-world industrial applications.
6 code implementations • 16 Dec 2018 • Yongqi Zhang, Quanming Yao, Yingxia Shao, Lei Chen
Negative sampling, which samples negative triplets from non-observed ones in the training data, is an important step in KG embedding.
Ranked #6 on Link Prediction on FB15k
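The corruption-based negative sampling step described above can be sketched as follows; this is a generic uniform corrupter for illustration, not the adaptive sampling strategy this paper proposes:

```python
import random

def negative_sample(triplet, num_entities, observed, rng):
    """Corrupt the head or the tail of (h, r, t) with a random entity,
    rejecting corruptions that already appear among the observed triplets."""
    h, r, t = triplet
    while True:
        e = rng.randrange(num_entities)
        cand = (e, r, t) if rng.random() < 0.5 else (h, r, e)
        if cand not in observed:
            return cand
```

Each positive triplet paired with such a corrupted negative gives the contrastive signal used to train the KG embedding.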
no code implementations • 6 Nov 2018 • Yang Li, Jiawei Jiang, Yingxia Shao, Bin Cui
The performance of deep neural networks crucially depends on good hyperparameter configurations.