Furthermore, we propose a threshold-free multi-intent classifier that utilizes the output of the IND task and detects multiple intents without depending on a threshold.
Non-autoregressive translation (NAT) models, which eliminate the sequential dependencies within the target sentence, have achieved remarkable inference speed, but suffer from inferior translation quality.
Instead, we employ a proxy model to extract state features that are both discriminative (adaptive to the agent) and generally applicable (robust to agent noise).
To obtain demonstration examples with high-quality explanations, we propose a new explanation generation bootstrapping method that iteratively refines generated explanations by considering the previous generation and a template-based hint.
An ideal detection model is expected to achieve all three critical properties of (I) early detection, (II) good interpretability, and (III) versatility for various illicit activities.
Data volumes have soared in recent years and the computational cost of an exhaustive exact nearest neighbor search is often prohibitive, necessitating the adoption of approximate techniques.
The proposed methods not only significantly outperform the conventional PINN method in terms of computational efficiency and computational accuracy, but also compare favorably with the state-of-the-art methods in the recent literature.
Notably, our model achieves state-of-the-art performance on all action categories in the Human3.6M dataset using detected 2D poses from CPN, and our code is available at: https://github.com/KHB1698/DC-GCT.
Ranked #41 on 3D Human Pose Estimation on Human3.6M
Given a limited labeling budget, active learning (AL) aims to sample the most informative instances from an unlabeled pool to acquire labels for subsequent model training.
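As a minimal sketch of what "most informative" can mean, the snippet below ranks unlabeled instances by the entropy of a model's predicted class probabilities and picks the top ones within the budget. Entropy-based uncertainty sampling is an assumption chosen for illustration; the excerpt does not say which acquisition strategy is used, and the names `select_for_labeling` and `pool_probs` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(pool_probs, budget):
    """Return indices of the `budget` most uncertain pool instances.

    Assumption: informativeness is approximated by predictive entropy.
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled instances (two classes).
pool_probs = [[0.98, 0.02], [0.55, 0.45], [0.70, 0.30], [0.50, 0.50]]
chosen = select_for_labeling(pool_probs, budget=2)
```

Here the two instances whose predictions are closest to uniform are selected, since they carry the highest predictive entropy.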
Dynamic multi-objective optimisation (DMO) handles optimisation problems with multiple (often conflicting) objectives in varying environments.
Target detection in use-case environments is a challenging task, which is influenced by complex and dynamic landscapes, illumination, and vibrations.
To enable each agent to accurately understand the current network state and the status of multicast tree construction, the state space of each agent is designed based on the traffic and multicast tree status matrices, and the set of AP nodes in the network is used as the action space.
Second, a DRL-based data forwarding mechanism is designed in the knowledge plane.
To detect fraud behaviors of malicious addresses in the early stage, we present Evolve Path Tracer, which consists of Evolve Path Encoder LSTM, Evolve Path Graph GCN, and Hierarchical Survival Predictor.
Because existing wireless sensor network (WSN)-based anomaly detection methods consider and analyze only temporal features, this paper designs a self-supervised learning-based anomaly node detection method built on an autoencoder.
In this work, we propose an ID-preserving talking head generation framework, which advances previous methods in two aspects.
This paper proposes reconstructing the binary adjacency matrix via tensor decomposition and, on that basis, a traffic flow forecasting method.
As the climate deteriorates, rain-induced flooding has become more frequent.
Gait recognition is widely used in diversified practical applications.
With the type-dependent selection strategy and global status vectors, our model can be applied to detect various illicit activities with strong interpretability.
Traditional multicast routing methods have some problems in constructing a multicast tree, such as limited access to network state information, poor adaptability to dynamic and complex changes in the network, and inflexible data forwarding.
An efficient method to help users conduct gesture exploration is lacking; the task is challenging due to the intrinsically temporal evolution of gestures and their complex correlation to speech content.
For the graph construction, a two-stage graph diversification scheme is proposed, which makes a good trade-off between the efficiency and reachability for the search procedure that builds upon it.
Anomaly detection is widely used to distinguish system anomalies by analyzing the temporal and spatial features of wireless sensor network (WSN) data streams; it is one of the critical techniques that ensure the reliability of WSNs.
Natural language interfaces (NLIs) provide users with a convenient way to interactively analyze data through natural language queries.
To this end, we believe that local attention is crucial to strike the balance between computational efficiency and modeling capacity.
Ranked #1 on Image Generation on CelebA 256x256 (FID metric)
In this paper, we show our solution to the Google Landmark Recognition 2021 Competition.
However, those models do not consider the numerical properties of numbers and cannot perform robustly on numerical reasoning tasks (e.g., math word problems and measurement estimation).
Despite being a critical communication skill, grasping humor is challenging -- a successful use of humor requires a mixture of both engaging content build-up and an appropriate vocal delivery (e.g., pause).
Much research focuses on modeling the complex intra- and inter-modal interactions between different communication channels.
This paper presents our systems for the three Subtasks of SemEval Task4: Reading Comprehension of Abstract Meaning (ReCAM).
Ranked #1 on Reading Comprehension on ReCAM (using extra training data)
In this paper, we rely on representative prototypes, the feature centroids of classes, to address the two issues for unsupervised domain adaptation.
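To make "prototypes, the feature centroids of classes" concrete, the sketch below averages feature vectors per class and then assigns a new feature to its nearest prototype. The nearest-prototype assignment with squared Euclidean distance is an assumption added for illustration and is not claimed to be the paper's procedure; the function names and toy data are hypothetical.

```python
from collections import defaultdict


def class_prototypes(features, labels):
    """Compute one prototype per class as the mean of its feature vectors."""
    sums = {}
    counts = defaultdict(int)
    for feat, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
        sums[label] = [s + x for s, x in zip(sums[label], feat)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def nearest_prototype(feature, prototypes):
    """Return the class whose prototype is closest (assumed metric: squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(prototypes, key=lambda label: sq_dist(feature, prototypes[label]))


# Toy 2-D features with hypothetical class names.
features = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [12.0, 10.0]]
labels = ["road", "road", "car", "car"]
prototypes = class_prototypes(features, labels)
predicted = nearest_prototype([9.0, 9.0], prototypes)
```

Because each class is summarized by a single centroid, the comparison cost grows with the number of classes rather than the number of samples.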
Ranked #10 on Semantic Segmentation on GTAV-to-Cityscapes Labels
In this paper, we classify affine Ricci solitons associated to canonical connections and Kobayashi-Nomizu connections and perturbed canonical connections and perturbed Kobayashi-Nomizu connections on three-dimensional Lorentzian Lie groups with some product structure.
Differential Geometry 53C40, 53C42
In this paper, we try to eliminate semantic ambiguity in skip connection operations by adding attention gates (AGs). We also use attention mechanisms to combine local features with their corresponding global dependencies, explicitly model the dependencies between channels, and use multi-scale predictive fusion to utilize global information at different scales.
Starting from the first region, the feedforward control parameters are learned simultaneously with the low-order plant model in the same region; the procedure then moves to the next region until all regions have been processed.
Two case studies and interviews with domain experts demonstrate the effectiveness of GNNLens in facilitating the understanding of GNN models and their errors.
Modern neural machine translation (NMT) models employ a large number of parameters, which leads to serious over-parameterization and typically causes the underutilization of computational resources.
Fusing multi-modality medical images, such as MR and PET, can provide various anatomical or functional information about the human body.
Specifically, we model the relationship between students and questions using student interactions to construct the student-interaction-question network and further present a new GNN model, called R^2GCN, which intrinsically works for the heterogeneous networks, to achieve generalizable student performance prediction in interactive online question pools.
The growing use of automated decision-making in critical applications, such as crime prediction and college admission, has raised questions about fairness in machine learning.
These optimization problems usually have complex properties, such as non-convexity and NP-hardness, which may not be addressed by the traditional convex optimization-based solutions.
Pixel-wise operations between polarimetric images are important for processing polarization information.
In the occluded region, as depth and camera motion can provide more reliable motion estimation, they can be used to instruct unsupervised learning of optical flow.
The modulation of voice properties, such as pitch, volume, and speed, is crucial for delivering a successful public speech.
The key challenge of multi-domain translation lies in simultaneously encoding both the general knowledge shared across domains and the particular knowledge distinctive to each domain in a unified model.
With the increasing popularity of online learning, a surge of E-learning platforms has emerged to facilitate education opportunities for K-12 (from kindergarten to 12th grade) students, and with this, a wealth of information from their learning logs is being recorded.
Our visualization system features a channel coherence view and a sentence clustering view that together enable users to obtain a quick overview of emotion coherence and its temporal evolution.
Zero-shot translation, translating between language pairs on which a Neural Machine Translation (NMT) system has never been trained, is an emergent property when training the system in multilingual settings.
To leverage the data and resources, a new machine learning paradigm, called edge learning, has emerged where learning algorithms are deployed at the edge for providing fast and intelligent services to mobile users.
Inspired by the conditional integration idea in the classical control community, we propose SPI-Optimizer, an integral-Separated PI controller based optimizer that introduces no extra hyperparameters.
We frame low-resource translation as a meta-learning problem, and we learn to adapt to low-resource languages based on multilingual high-resource language tasks.
In this paper, we present a novel deep learning-based approach to evaluate the readability of graph layouts by directly using graph images.
In this paper, we present a multi-scale Fully Convolutional Network (MSP-RFCN) to robustly detect and classify human hands under various challenging conditions.
Specifically, the former is devised to find a pair of individuals with the minimum vector angle, which means that these two individuals share the most similar search direction.
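The pair search described above can be sketched as follows: compute the angle between every pair of vectors and keep the pair with the smallest one. Treating each individual as a plain non-zero vector (for example, its objective vector) is an assumption, since the excerpt does not say which representation or normalization is used; `min_angle_pair` and the toy population are hypothetical.

```python
import math
from itertools import combinations


def vector_angle(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def min_angle_pair(vectors):
    """Return (i, j, angle) for the pair of vectors with the smallest angle."""
    best = None
    for i, j in combinations(range(len(vectors)), 2):
        angle = vector_angle(vectors[i], vectors[j])
        if best is None or angle < best[2]:
            best = (i, j, angle)
    return best


# Three hypothetical individuals in a two-objective space.
population = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0)]
i, j, angle = min_angle_pair(population)
```

This exhaustive version compares all pairs, so its cost is quadratic in the population size; that is usually acceptable for the population sizes typical of evolutionary algorithms.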
In this paper, we extend an attention-based neural machine translation (NMT) model by allowing it to access an entire training set of parallel sentence pairs even after training.
In the evolutionary computation research community, the performance of most evolutionary algorithms (EAs) depends strongly on their implemented coordinate system.
This paper proposes a convolutional neural network (CNN)-based method that learns traffic as images and predicts large-scale, network-wide traffic speed with high accuracy.
Multi-objective optimisation is regarded as one of the most promising ways for dealing with constrained optimisation problems in evolutionary optimisation.
In this work, we propose to assess classifiers in terms of normalized mutual information (NI), which is novel and well defined in a compact range for classifier evaluation.
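As an illustration of evaluating a classifier with normalized mutual information, the sketch below computes the mutual information between true and predicted labels from a confusion matrix and divides it by the entropy of the true labels. That particular normalization, I(T;Y)/H(T), is an assumption; the excerpt does not spell out the definition of NI, and the function name is hypothetical.

```python
import math


def normalized_mutual_information(confusion):
    """NI from a confusion matrix (rows: true classes, columns: predictions).

    Assumption: NI = I(T; Y) / H(T), which lies in [0, 1].
    """
    total = sum(sum(row) for row in confusion)
    n_cols = len(confusion[0])
    row_p = [sum(row) / total for row in confusion]
    col_p = [sum(row[j] for row in confusion) / total for j in range(n_cols)]

    mutual_info = 0.0
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            if count > 0:
                p = count / total
                mutual_info += p * math.log2(p / (row_p[i] * col_p[j]))

    h_true = -sum(p * math.log2(p) for p in row_p if p > 0)
    # With a single true class the entropy is zero; return 0.0 by convention.
    return mutual_info / h_true if h_true > 0 else 0.0


perfect = normalized_mutual_information([[5, 0], [0, 5]])
uninformative = normalized_mutual_information([[5, 5], [5, 5]])
```

Under this normalization a perfect classifier scores 1 and a classifier whose predictions are independent of the true labels scores 0, which is what makes the measure bounded to a compact range.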