Based on continuous prompt embeddings, we propose TransPrompt, a transferable prompting framework for few-shot learning across similar tasks.
We theoretically show that the consensus mechanism can guarantee the convergence of the global objective.
Specifically, the framework consists of three components: a backbone GNN model, a propagation controller that determines the optimal number of propagation steps for each node, and a weight controller that computes priority scores for nodes.
The recent success of large pre-trained language models (PLMs) hinges heavily on massive amounts of labeled data, and their performance typically degrades in low-resource scenarios.
Chain-of-Thought (CoT) prompting has proven to be effective in enhancing the reasoning capabilities of Large Language Models (LLMs) with at least 100 billion parameters.
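To make the setting concrete, a minimal chain-of-thought prompt prepends one or more worked exemplars, each ending in an explicit answer, before the target question; the exemplar below is the standard illustrative one from the CoT literature, not tied to any system described here.

    # A minimal chain-of-thought prompt: a worked exemplar with explicit
    # reasoning, followed by the target question. The exemplar text is the
    # standard illustrative one; model and API are left unspecified.
    cot_prompt = (
        "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
        "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
        "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
        "6 tennis balls. 5 + 6 = 11. The answer is 11.\n\n"
        "Q: The cafeteria had 23 apples. If they used 20 to make lunch and "
        "bought 6 more, how many apples do they have?\n"
        "A:"  # the LLM continues with step-by-step reasoning
    )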
In-Context Learning (ICL) over large language models (LLMs) aims to solve previously unseen tasks by conditioning on a few training examples, eliminating the need for parameter updates while achieving competitive performance.
To address these issues, in this paper we propose MuSE, a novel exchanging-based multimodal fusion model for text-vision fusion built on the Transformer.
Further, since the triplet loss only optimizes the relative distance between the anchor and its positive/negative samples, it is difficult to ensure the absolute distance between the anchor and the positive sample.
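A minimal sketch of the standard triplet loss (PyTorch-style Python; names are illustrative) makes this limitation concrete: the hinge vanishes once the negative is at least a margin farther away than the positive, leaving the absolute anchor-positive distance unconstrained.

    import torch
    import torch.nn.functional as F

    def triplet_loss(anchor, positive, negative, margin=1.0):
        # Only the relative gap d(a, p) - d(a, n) is penalized; once that
        # gap exceeds the margin, d(a, p) itself can be arbitrarily large.
        d_pos = F.pairwise_distance(anchor, positive)
        d_neg = F.pairwise_distance(anchor, negative)
        return torch.clamp(d_pos - d_neg + margin, min=0.0).mean()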
We propose TransPrompt v2, a novel transferable prompting framework for few-shot learning across similar or distant text classification tasks.
The resistance of pFL methods with parameter decoupling is attributed to the heterogeneous classifiers between malicious clients and their benign counterparts.
To address this challenge, we extend the adaptive risk minimization technique into the unsupervised personalized federated learning setting and propose our method, FedTTA.
To mitigate this brittleness, we propose a novel Chain-of-Knowledge (CoK) prompting method, which aims to elicit LLMs to generate explicit pieces of knowledge evidence in the form of structured triples.
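As a rough illustration, such evidence might take a (subject, relation, object) form; the rendering below is an assumption for exposition, not the paper's verbatim template.

    # Hypothetical knowledge-evidence triples elicited before answering;
    # the (subject, relation, object) rendering is an assumed format.
    evidence = [
        ("Paris", "capital_of", "France"),
        ("France", "located_in", "Europe"),
    ]
    evidence_text = "\n".join(f"({s}, {r}, {o})" for s, r, o in evidence)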
In this paper, we propose a novel language-guided 3D arbitrary neural style transfer method (CLIP3Dstyler).
To tackle the issue, in this paper, we present TransCoder, a unified Transferable fine-tuning strategy for Code representation learning.
In this paper, we introduce gradient descent into the black-box tuning scenario through knowledge distillation.
Large language models (LLMs) have shown increasing power on various natural language processing (NLP) tasks.
In this paper, we introduce HugNLP, a unified and comprehensive library for natural language processing (NLP) with the prevalent backend of HuggingFace Transformers, which is designed for NLP researchers to easily utilize off-the-shelf algorithms and develop novel methods with user-defined models and tasks in real-world scenarios.
Federated Learning (FL) has emerged as an important machine learning area and has received rapidly increasing research interest from the community.
Neural sequence labeling (NSL) aims to assign labels to input language tokens and covers a broad range of applications, such as named entity recognition (NER) and slot filling.
We design an improved triplet network to map samples and prototype vectors into a low-dimensional space in which classification is easier, and propose an adaptive margin for each entity type.
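A hedged sketch of how per-type margins could enter the loss (the margin values and lookup scheme below are illustrative assumptions, not the paper's exact formulation):

    import torch
    import torch.nn.functional as F

    # Illustrative per-entity-type margins replacing a single global margin.
    TYPE_MARGINS = {"PER": 0.5, "ORG": 0.8, "LOC": 1.2}

    def adaptive_triplet_loss(anchor, positive, negative, entity_type):
        d_pos = F.pairwise_distance(anchor, positive)
        d_neg = F.pairwise_distance(anchor, negative)
        margin = TYPE_MARGINS[entity_type]  # margin depends on the type
        return torch.clamp(d_pos - d_neg + margin, min=0.0).mean()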
Few-shot learning has been used to tackle the problem of label scarcity in text classification; among such approaches, meta-learning-based methods such as prototypical networks (PROTO) have proven effective.
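For reference, the PROTO classification rule is simple enough to state in a few lines: each class prototype is the mean embedding of its support examples, and a query is assigned to the nearest prototype (a generic sketch, not any paper's full training loop).

    import torch

    def proto_classify(support, support_labels, queries, n_classes):
        # Prototype of each class = mean embedding of its support examples.
        protos = torch.stack([support[support_labels == c].mean(dim=0)
                              for c in range(n_classes)])
        # Each query is assigned to the class of the nearest prototype.
        dists = torch.cdist(queries, protos)  # (n_query, n_classes)
        return dists.argmin(dim=1)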
In this paper, to comprehensively enhance the performance of generative graph SSL against other GCL models on both unsupervised and supervised learning tasks, we propose the SeeGera model, which is based on the family of self-supervised variational graph auto-encoders (VGAE).
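For context, the generic VGAE objective that this family builds on is the evidence lower bound
$$\mathcal{L} = \mathbb{E}_{q(\mathbf{Z}\mid\mathbf{X},\mathbf{A})}\big[\log p(\mathbf{A}\mid\mathbf{Z})\big] - \mathrm{KL}\big(q(\mathbf{Z}\mid\mathbf{X},\mathbf{A})\,\Vert\,p(\mathbf{Z})\big),$$
where $\mathbf{A}$ is the adjacency matrix, $\mathbf{X}$ the node features, and $\mathbf{Z}$ the latent node embeddings; SeeGera's exact objective may differ from this generic form.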
It abstracts the shape prior of a category, and thus can provide constraints on the overall shape of an instance.
In open-source project governance, considerable attention has been paid to how developers' contributions should be measured.
Few-shot Named Entity Recognition (NER) aims to identify named entities with very little annotated data.
In this paper, to address these problems, we introduce a novel knowledge-prompting paradigm and further propose KP-PLM, a knowledge-prompting-based PLM framework.
To interpret these models, some probing methods have been applied.
With top-$k$ sparse attention, the most crucial attention relations can be obtained at a lower computational cost.
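A minimal sketch of the mechanism (illustrative, not any specific paper's implementation): keep only the top-$k$ scores per query and mask the rest before the softmax, so that attention weight concentrates on the most crucial relations.

    import torch
    import torch.nn.functional as F

    def topk_sparse_attention(q, k, v, top_k):
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        # Keep the top-k scores per query; everything else gets -inf and
        # hence zero weight after the softmax.
        kth = scores.topk(top_k, dim=-1).values[..., -1:]
        scores = scores.masked_fill(scores < kth, float("-inf"))
        return F.softmax(scores, dim=-1) @ v

Note that this sketch still materializes the full score matrix; efficient implementations avoid doing so, which is where the computational savings actually come from.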
Prompt-based fine-tuning has boosted the performance of Pre-trained Language Models (PLMs) on few-shot text classification by employing task-specific prompts.
In this paper, we consider human behaviors and propose the PGNN-EK model that consists of two main components.
Extractive Question Answering (EQA) is one of the most important tasks in Machine Reading Comprehension (MRC); it can be solved by fine-tuning the span-selection heads of Pre-trained Language Models (PLMs).
In other words, at least for Gaussian models with equal error variances, learning a directed graphical model is statistically no more difficult than learning an undirected graphical model.
More specifically, we construct a bipartite graph for programming problem embedding, and design an improved pre-training model PLCodeBERT for code embedding, as well as a double-sequence RNN model with exponential decay attention for effective feature fusion.
Perhaps surprisingly, we show that for certain graph ensembles, a simple forward greedy search algorithm (i.e., one without a backward pruning phase) suffices to learn the Markov boundary of each node.
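A hedged sketch of such a forward greedy search, using residual-variance reduction under least squares as the selection criterion (one standard instantiation; the paper's exact score and stopping rule may differ):

    import numpy as np

    def greedy_markov_boundary(X, i, budget):
        """Greedily add the candidate that most reduces the residual
        variance of node i under least-squares regression (illustrative)."""
        n, p = X.shape
        selected = []
        for _ in range(budget):
            best_j, best_var = None, np.inf
            for j in range(p):
                if j == i or j in selected:
                    continue
                cols = selected + [j]
                beta, *_ = np.linalg.lstsq(X[:, cols], X[:, i], rcond=None)
                var = np.mean((X[:, i] - X[:, cols] @ beta) ** 2)
                if var < best_var:
                    best_j, best_var = j, var
            selected.append(best_j)
        return selected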
Greedy algorithms have long been a workhorse for learning graphical models, and more broadly for learning statistical models with sparse structure.
Meta-learning has emerged as a trending technique to tackle few-shot text classification and achieved state-of-the-art performance.
In this paper, we model author disambiguation as a collaboration-network reconstruction problem and propose IUAD, an incremental and unsupervised author disambiguation method that operates in a bottom-up manner.
In this paper, we propose a general approach to learn relation prototypes from unlabeled texts, facilitating long-tail relation extraction by transferring knowledge from relation types with sufficient training data.
The symbol-level image encoder of EDSL consists of a segmentation module and a reconstruction module.
We propose Hierarchical Optimization Time Integration (HOT) for efficient implicit time-stepping of the Material Point Method (MPM) irrespective of simulated materials and conditions.
In contrast to existing distant-supervision approaches, which suffer from insufficient training corpora for extracting relations, our proposal mines implicit mutual relations from massive unlabeled corpora and transfers the semantic information of entity pairs into the RE model, making it more expressive and semantically plausible.
Recent years have witnessed a widespread increase of interest in network representation learning (NRL).
In this paper, we study the problem of ranking vertices of a bipartite graph, based on the graph's link structure as well as prior information about vertices (which we term a query vector).
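One natural formulation, sketched below, is a personalized-PageRank-style power iteration that biases the ranking toward the query vector (the restart form is an assumption for illustration, not necessarily the paper's exact scheme).

    import numpy as np

    def query_biased_rank(W, q, alpha=0.85, iters=100):
        """W: column-stochastic transition matrix over the bipartite graph's
        vertices; q: nonnegative, normalized query vector (prior scores)."""
        r = q.copy()
        for _ in range(iters):
            # Random walk with restart: follow links with prob. alpha,
            # jump back to the query distribution with prob. 1 - alpha.
            r = alpha * (W @ r) + (1 - alpha) * q
        return r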