Interpolation-based regularisation methods such as Mixup, which generate virtual training samples, have proven to be effective for various tasks and modalities. We extend Mixup and propose DMix, an adaptive distance-aware interpolative Mixup that selects samples based on their diversity in the embedding space.
Interpolation-based regularisation methods for data augmentation have proven to be effective for various tasks and modalities.
In this work, we propose to automatically convert the background knowledge documents into document semantic graphs and then perform knowledge selection over such graphs.
As information filtering services, recommender systems have extremely enriched our daily life by providing personalized suggestions and facilitating people in decision-making, which makes them vital and indispensable to human society in the information era.
Socratic Questioning is driven by a Self-Questioning module that employs a large-scale language model to propose sub-problems related to the original problem as intermediate steps and Socratic Questioning recursively backtracks and answers the sub-problems until reaches the original problem.
We hope this task and dataset can promote further research on TOD and subjective content understanding.
We design a dual-intent network to learn user intent from an attention mechanism and the distribution of historical data respectively, which can simulate users' decision-making process in interacting with a new item.
Based on this, we propose a sampling-based node-level residual module (SNR) that can achieve a more flexible utilization of different hops of subgraph aggregation by introducing node-level parameters sampled from a learnable distribution.
Graph neural networks for trust evaluation typically adopt a straightforward way such as one-hot or node2vec to comprehend node characteristics, which ignores the valuable semantic knowledge attached to nodes.
This work focuses on in-context data augmentation for intent detection.
Ranked #1 on Intent Detection on HWU64 5-shot
For instance, using automatic evaluation, we find our best fine-tuned baseline only generates safe responses to unsafe dialogue contexts from DiaSafety 4. 04% more than our approach.
Dialogue state tracking (DST) is an important step in dialogue management to keep track of users' beliefs.
A key premise for the remarkable performance of GNNs relies on complete and trustworthy initial graph descriptions (i. e., node features and graph structure), which is often not satisfied since real-world graphs are often incomplete due to various unavoidable factors.
These guidelines provide information about the context they are applicable to and what should be included in the response, allowing the models to generate responses that are more closely aligned with the developer's expectations and intent.
Prefix-tuning, or more generally continuous prompt tuning, has become an essential paradigm of parameter-efficient transfer learning.
In this work, we curated a dataset where responses from multiple response generators produced for the same dialog context are manually annotated as appropriate (positive) and inappropriate (negative).
Previous work has treated contradiction detection in bot responses as a task similar to natural language inference, e. g., detect the contradiction between a pair of bot utterances.
Due to the homophily assumption of Graph Convolutional Networks (GCNs) that these methods use, they are not suitable for heterophily graphs where nodes with different labels or dissimilar attributes tend to be adjacent.
no code implementations • 22 Jun 2022 • Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo F. R. Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanchi, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Ronald Cardenas, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Štajner, Sebastien Montella, Shailza, Shailza Jolly, Simon Mille, Tahmid Hasan, Tianhao Shen, Tosin Adewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou
This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims.
Providing conversation models with background knowledge has been shown to make open-domain dialogues more informative and engaging.
Graph Neural Networks (GNNs) have gained great popularity in tackling various analytical tasks on graph-structured data (i. e., networks).
This paper revisits the approach from a matrix approximation perspective, and identifies two issues in the existing layer-wise sampling methods: suboptimal sampling probabilities and estimation biases induced by sampling without replacement.
As most of the existing graph neural networks yield effective graph representations of a single graph, little effort has been made for jointly learning two graph representations and calculating their similarity score.
In this work, we propose a new GNN based trust evaluation method named TrustGNN, which integrates smartly the propagative and composable nature of trust graphs into a GNN framework for better trust evaluation.
Natural language guided embodied task completion is a challenging problem since it requires understanding natural language instructions, aligning them with egocentric visual observations, and choosing appropriate actions to execute in the environment to produce desired changes.
The massive amount of trainable parameters in the pre-trained language models (PLMs) makes them hard to be deployed to multiple downstream tasks.
However, the existing contrastive learning methods are inadequate for heterogeneous graphs because they construct contrastive views only based on data perturbation or pre-defined structural properties (e. g., meta-path) in graph data while ignore the noises that may exist in both node attributes and graph topologies.
In many real-world settings, machine learning models need to identify user inputs that are out-of-domain (OOD) so as to avoid performing wrong actions.
Therefore, in the most current state-of-the-art network architectures, only a few branches corresponding to a limited number of temporal scales could be designed for speaker embeddings.
Recent years have witnessed fast developments of graph neural networks (GNNs) that have benefited myriads of graph analytic tasks and applications.
To address this problem, in this paper we design a novel propagation mechanism, which can automatically change the propagation and aggregation process according to homophily or heterophily between node pairs.
Transformer-based models are not efficient in processing long sequences due to the quadratic space and time complexity of the self-attention modules.
In this review, we focus on the interpretability of the DL models in healthcare.
So can we reasonably utilize these segmentation rules to design a universal propagation mechanism independent of the network structural assumption?
Understanding the training dynamics of deep neural networks (DNNs) is important as it can lead to improved training efficiency and task performance.
AdaMEL models the attribute importance that is used to match entities through an attribute-level self-attention mechanism, and leverages the massive unlabeled data from new data sources through domain adaptation to make it generic and data-source agnostic.
To accelerate the training of graph convolutional networks (GCN), many sampling-based methods have been developed for approximating the embedding aggregation.
Most prior work in dialogue modeling has been on written conversations mostly because of existing data sets.
Most prior work on task-oriented dialogue systems is restricted to supporting domain APIs.
Most prior work on task-oriented dialogue systems are restricted to limited coverage of domain APIs.
A community reveals the features and connections of its members that are different from those in other communities in a network.
Recent developments in Natural Language Processing (NLP) demonstrate that large-scale, self-supervised pre-training can be extremely beneficial for downstream tasks.
While most network embedding techniques model the proximity between nodes in a network, recently there has been significant interest in structural embeddings that are based on node equivalences, a notion rooted in sociology: equivalences or positions are collections of nodes that have similar roles--i. e., similar functions, ties or interactions with nodes in other positions--irrespective of their distance or reachability in the network.
Network Embedding Social and Information Networks
We propose in this paper a novel Second-order Relevance, which is fundamentally different from the previous First-order Relevance, to improve result relevance prediction.
We conclude with discussions of the challenges of the field and suggestions of possible directions for future research.
Models with a large number of parameters are prone to over-fitting and often fail to capture the underlying input distribution.
Text style transfer is an important task in natural language generation, which aims to control certain attributes in the generated text, such as politeness, emotion, humor, and many others.
We propose BiTe-GCN, a novel GCN architecture with bidirectional convolution of both topology and features on text-rich networks to solve these limitations.
Relevance has significant impact on user experience and business profit for e-commerce search platform.
Open domain question answering (OpenQA) tasks have been recently attracting more and more attention from the natural language processing (NLP) community.
While previous work on dynamic modeling and embedding has focused on representing a stream of timestamped edges using a time-series of graphs based on a specific time-scale (e. g., 1 month), we propose the notion of an $\epsilon$-graph time-series that uses a fixed number of edges for each graph, and show its superiority over the time-scale representation used in previous work.
Based on the SemEval 2014 dataset, we construct the Aspect Robustness Test Set (ARTS) as a comprehensive probe of the aspect robustness of ABSA models.
The goal of this work is to develop a novel learning framework for accurate and expressive fashion captioning.
Graph neural networks for HIN embeddings typically adopt a hierarchical attention (including node-level and meta-path-level attentions) to capture the information from meta-path-based neighbors.
Multiple-choice question answering (MCQA) is one of the most challenging tasks in machine reading comprehension since it requires more advanced reading comprehension skills such as logical reasoning, summarization, and arithmetic operations.
Learning continuous-time stochastic dynamics is a fundamental and essential problem in modeling sporadic time series, whose observations are irregular and sparse in both time and dimension.
TextAttack also includes data augmentation and adversarial training modules for using components of adversarial attacks to improve model accuracy and robustness.
In this paper, we propose using machine reading comprehension (RC) in state tracking from two perspectives: model architectures and datasets.
Current summarization systems only produce plain, factual headlines, but do not meet the practical needs of creating memorable titles to increase exposure.
State-of-the-art neural machine translation (NMT) systems are data-hungry and perform poorly on new domains with no supervised data.
Machine Reading Comprehension (MRC) for question answering (QA), which aims to answer a question given the relevant context passages, is an important way to test the ability of intelligence systems to understand human language.
Unfortunately, recent work has sometimes confused the notion of structural roles and communities (based on proximity) leading to misleading or incorrect claims about the capabilities of network embedding methods.
Machine learning algorithms are often vulnerable to adversarial examples that have imperceptible alterations from the original counterparts but can fool the state-of-the-art models.
We propose a new neural transfer method termed Dual Adversarial Transfer Network (DATNet) for addressing low-resource Named Entity Recognition (NER).
We propose a new architecture termed Dual Adversarial Transfer Network (DATNet) for addressing low-resource Named Entity Recognition (NER).
Identity stitching, the task of identifying and matching various online references (e. g., sessions over different devices and timespans) to the same user in real-world web services, is crucial for personalization and recommendations.
Contextual word embedding models such as ELMo (Peters et al., 2018) and BERT (Devlin et al., 2018) have dramatically improved performance for many natural language processing (NLP) tasks in recent months.
Text attribute transfer aims to automatically rewrite sentences such that they possess certain linguistic attributes, while simultaneously preserving their semantic content.
Motivated by the computational and storage challenges that dense embeddings pose, we introduce the problem of latent network summarization that aims to learn a compact, latent representation of the graph structure with dimensionality that is independent of the input graph size (i. e., #nodes and #edges), while retaining the ability to derive node representations on the fly.
Social and Information Networks
One is the PubMed-PICO dataset, where our best results outperform the previous best by 5. 5%, 7. 9%, and 5. 8% for P, I, and O elements in terms of F1 score, respectively.
Prevalent models based on artificial neural network (ANN) for sentence classification often classify sentences in isolation without considering the context in which sentences appear.
Ranked #1 on Sentence Classification on PubMed 20k RCT
In this paper, we propose a novel neural Tensor network framework with Interactive Attention and Sparse Learning (TIASL) for implicit discourse relation recognition.
They ignore that one discusses diverse topics when dynamically interacting with different people.
Successful evidence-based medicine (EBM) applications rely on answering clinical questions by analyzing large medical literature databases.
SemEval 2018 Task 7 tasked participants to build a system to classify two entities within a sentence into one of the 6 possible relation types.
We combine generative adversarial network (GAN) with light microscopy to achieve deep learning super-resolution under a large field of view (FOV).
Our proposed system consists of two subsystems and one regression model for predicting STS scores.
Framing is a political strategy in which politicians carefully word their statements in order to control public perception of issues.