To address this phenomenon, we propose a novel end-to-end training scheme that brings the three separate modules into a single model.
Emotion inference in multi-turn conversations aims to predict the participant’s emotion in the next upcoming turn without knowing the participant’s response yet, and is a necessary step for applications such as dialogue planning.
However, a majority of generative modeling approaches are focused solely on the joint distribution $p(x)$ and utilize models where it is intractable to obtain the conditional distribution of some arbitrary subset of features $x_u$ given the rest of the observed covariates $x_o$: $p(x_u \mid x_o)$.
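For reference, the conditional in question follows from the joint by marginalizing over the unobserved features. The difficulty is that the integral in the denominator is typically intractable for such models. Here $x_u'$ is a dummy integration variable introduced only for illustration:

$$
p(x_u \mid x_o) = \frac{p(x_u, x_o)}{\int p(x_u', x_o)\, dx_u'}
$$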
The ever-growing demand and complexity of machine learning are putting pressure on hyper-parameter tuning systems: while the evaluation cost of models continues to increase, the scalability of state-of-the-art systems is becoming a crucial bottleneck.
In this model, the upper-level aims to seek the optimal location and capacity of DGs and energy storage, while the lower-level optimizes the operation of energy storage devices.
The layout of a mobile screen is a critical data source for UI design research and semantic understanding of the screen.
The IM and OM were 2D convolutional layers, and the FEM was composed of a cascade of residual Swin transformer blocks (RSTBs) and 2D convolutional layers.
Few-shot learning (learning with a few samples) is one of the most important capacities of the human brain.
With the rapid growth of power market reform and power demand, the power transmission capacity of a power grid is approaching its limit, and the secure and stable operation of power systems becomes increasingly important.
Recent advances in neuroimaging along with algorithmic innovations in statistical learning from network data offer a unique pathway to integrate brain structure and function, and thus facilitate revealing some of the brain's organizing principles at the system level.
Moreover, motivated by the observation of the relationship between coarse- and fine-grained emotions, we adopt a dual-head module that enables the PGCN to progressively learn more discriminative EEG features, from coarse-grained (easy) to fine-grained categories (difficult), referring to the hierarchical characteristic of emotion.
An operating entity utilizing community-integrated energy systems with a large number of small-scale distributed energy sources can easily trade with existing distribution markets.
DeGBBBA is an advanced variant of GBBBA in which a modified Gaussian distribution is introduced to allow dynamic adaptation of exploration and exploitation in the proposed algorithm.
Correlations between imaging findings and clinical lab tests suggested the value of this system as a potential tool to assess disease severity of COVID-19.
Our model consists of a multimodal Transformer encoder that jointly encodes UI images and structures, and performs UI object detection when the UI structures are absent in the input.
Given that data collection and annotation are expensive and tedious, making a deep learning-based short-term voltage stability assessment (STVSA) model work well on a small training dataset is a challenging and urgent problem.
In this paper, we propose two methods to improve the performance of GCs: 1) Utilizing structural information in the feature space, and 2) exploiting the multi-hop information in one GC step.
Starting at a random initial point or an existing estimate, our method iteratively updates the pairwise vertex distances, the sets of similar vertices, and connecting probabilities to improve the precision of the estimate.
In this setting, we embed an additional pair of “latent-latent” to reduce the domain gap between the source and different latent domains, allowing the model to adapt well on multiple target domains simultaneously.
We present Lepard, a Learning based approach for partial point cloud matching for rigid and deformable scenes.
Ranked #1 on Partial Point Cloud Matching on 4DMatch
In order to solve this model, this research combines the Jaya algorithm and the interior point method (IPM) to develop a hybrid analysis-heuristic solution method called Jaya-IPM, where the lower and upper levels are addressed by the IPM and Jaya, respectively, and the scheduling scheme is obtained via iterations between the two levels.
The lung volume was first delineated using a pre-trained U-net and served as the input to the subsequent network.
Our Spiking CapsNet fully combines the strengths of SNN and CapsNet, and shows strong robustness to noise and affine transformation.
no code implementations • 15 Nov 2021 • Xiang Huang, Zhanhong Ye, Hongsheng Liu, Beiji Shi, Zidong Wang, Kang Yang, Yang Li, Bingya Weng, Min Wang, Haotian Chu, Jing Zhou, Fan Yu, Bei Hua, Lei Chen, Bin Dong
In these applications, our goal is to solve parametric PDEs rather than one instance of them.
Unsupervised dialogue structure learning is an important and meaningful task in natural language processing.
In recent years, deep learning technology has been used to solve partial differential equations (PDEs), among which physics-informed neural networks (PINNs) have emerged as a promising method for solving both forward and inverse PDE problems.
Transient stability assessment (TSA) has always been a fundamental means for ensuring the secure and stable operation of power systems.
In this paper, we describe our method for tackling the automated hyperparameter optimization challenge in the QQ Browser 2021 AI Algorithm Competition (ACM CIKM 2021 AnalyticCup Track 2).
The multi-relational Knowledge Base Question Answering (KBQA) system performs multi-hop reasoning over the knowledge graph (KG) to arrive at the answer.
The real-time prediction of NOx emissions is of great significance for pollutant emission control and unit operation of coal-fired power plants.
Recent works reveal that feature or label smoothing lies at the core of Graph Neural Networks (GNNs).
StrainNet predicts the strain field directly from the image input without relying on the displacement prediction, which significantly improves the strain prediction accuracy.
Designing neural architectures requires immense manual effort.
The design process of user interfaces (UIs) often begins with articulating high-level design goals.
Then, a strategy for automatically generating transactions is constructed on top of PPO, with an LSTM as the basis of the policy.
Despite the wide applications of non-Gaussian fluctuations in numerous physical phenomena, data-driven approaches to extract stochastic dynamical systems with (non-Gaussian) Lévy noise are relatively few so far.
Recent analytical transferability metrics are mainly designed for image classification, and there is currently no specific investigation of transferability estimation for semantic segmentation, which is an essential problem in autonomous driving, medical image analysis, etc.
It consists of (1) a pairwise type-enriched sentence encoding module injecting both context-free and -related backgrounds to alleviate sentence-level wrong labeling, and (2) a hierarchical type-sentence alignment module enriching a sentence with the triple fact's basic attributes to support long-tail relations.
Given multiple source domains, domain generalization aims at learning a universal model that performs well on any unseen but related target domain.
In this work, we propose a data-driven approach to extract stochastic governing laws with both (Gaussian) Brownian motion and (non-Gaussian) Lévy motion, from short bursts of simulation data.
Modern deep neural networks (DNNs) have greatly facilitated the development of sequential recommender systems by achieving state-of-the-art recommendation performance on various sequential recommendation tasks.
The community integrated energy system (CIES) is an essential energy internet carrier that has recently been the focus of much attention.
District energy systems can not only reduce energy consumption but also set energy supply dispatching schemes according to demand.
In order to reduce the negative impact of the uncertainty of load and renewable energy outputs on microgrid operation, an optimal scheduling model is proposed for isolated microgrids by using automated reinforcement learning-based multi-period forecasting of renewable power generation and loads.
Modeling tap or click sequences of users on a mobile device can improve our understanding of interaction behavior and offers opportunities for UI optimization by recommending the next element the user might want to click on.
We present HelpViz, a tool for generating contextual visual mobile tutorials from text-based instructions that are abundant on the web.
Mobile User Interface Summarization generates succinct language descriptions of mobile screens for conveying important contents and functionalities of the screen, which can be useful for many language-based application scenarios.
To address the difficulty of determining the delay time and the low prediction accuracy when building a prediction model of an SCR system, a dynamic modeling scheme based on a hybrid of multiple data-driven algorithms was proposed.
However, for the existing frequency regulation scheme of wind turbines, the control gains in the auxiliary frequency controller are difficult to set because of the compromise of the frequency regulation performance and the stable operation of wind turbines, especially when the wind speed remains variable.
Data selection methods, such as active learning and core-set selection, are useful tools for improving the data efficiency of deep learning models on large-scale datasets.
Unfortunately, many real-world networks are sparse in terms of both edges and labels, leading to sub-optimal performance of GNNs.
Advances in data science are leading to new progress in the analysis and understanding of complex dynamics for systems with experimental and observational data.
End-to-end AutoML, which automatically searches for ML pipelines in a space induced by feature engineering, algorithm/model selection, and hyper-parameter tuning, has attracted intensive interest from both academia and industry.
A community integrated energy system (CIES) with an electric vehicle charging station (EVCS) provides a new way of tackling growing concerns about energy efficiency and environmental pollution. Coordinating flexible demand response and multiple renewable uncertainties is a critical task.
Scenario generation is a fundamental and crucial tool for decision-making in power systems with high-penetration renewables.
Our framework can easily handle a large number of features using a hierarchical acquisition policy and is more robust to OOD inputs with the help of an OOD detector for partially observed data.
As a result, this creates a severe bottleneck when trying to improve recommendation accuracy and generate fine-grained explanations, since the explicit attributes have only loose connections to the actual recommendation process.
PV generation reserves a part of the active power in accordance with the pre-defined power versus voltage curve.
Furthermore, the sparse POI-POI transitions restrict the ability of a model to learn effective sequential patterns for recommendation.
It aims at suggesting the next POI to a user in spatial and temporal context, which is a practical yet challenging task in various applications.
Histological subtype of papillary (p) renal cell carcinoma (RCC), type 1 vs. type 2, is an essential prognostic factor.
Therefore, in this paper, we introduce a new capsule network with graph routing to learn both relationships, where capsules in each layer are treated as the nodes of a graph.
In this paper, we propose a Composite High-Resolution Network for ccRCC nuclei grading.
During training, this relative order prediction network and the feature embedding network are tightly coupled, providing mutual constraints to each other to improve overall metric learning performance in a cooperative manner.
It has been long recognized that deep neural networks are sensitive to changes in spatial configurations or scene structures.
Transferability estimation is an essential problem in transfer learning to predict how good the performance is when transferring a source model (or source task) to a target task.
Based on this observation, we propose a necessary condition of IID generation and provide a new loss to encourage the closeness between the inverse source of real data and the Gaussian source in the latent space to regularize the generation to be IID from the target distribution.
Black-box optimization (BBO) has a broad range of applications, including automatic machine learning, engineering, physics, and experimental design.
With the combination of the two mechanisms, we propose a deep spiking neural network with adaptive self-feedback and balanced excitatory and inhibitory neurons (BackEISNN).
Also, when ResNet structure-based ANNs are converted, the information of output neurons is incomplete due to the rapid transmission of the shortcut path.
In this work, we propose to leverage the prior information embedded in pretrained language models (LM) to improve generalization for intent classification and slot labeling tasks with limited training data.
no code implementations • 7 May 2021 • Jinjin Gu, Haoming Cai, Chao Dong, Jimmy S. Ren, Yu Qiao, Shuhang Gu, Radu Timofte, Manri Cheon, SungJun Yoon, Byungyeon Kang, Junwoo Lee, Qing Zhang, Haiyang Guo, Yi Bin, Yuqing Hou, Hengliang Luo, Jingyu Guo, ZiRui Wang, Hai Wang, Wenming Yang, Qingyan Bai, Shuwei Shi, Weihao Xia, Mingdeng Cao, Jiahao Wang, Yifan Chen, Yujiu Yang, Yang Li, Tao Zhang, Longtao Feng, Yiting Liao, Junlin Li, William Thong, Jose Costa Pereira, Ales Leonardis, Steven McDonagh, Kele Xu, Lehan Yang, Hengxing Cai, Pengfei Sun, Seyed Mehdi Ayyoubzadeh, Ali Royat, Sid Ahmed Fezza, Dounia Hammou, Wassim Hamidouche, Sewoong Ahn, Gwangjin Yoon, Koki Tsubota, Hiroaki Akutsu, Kiyoharu Aizawa
This paper reports on the NTIRE 2021 challenge on perceptual image quality assessment (IQA), held in conjunction with the New Trends in Image Restoration and Enhancement (NTIRE) workshop at CVPR 2021.
Tracking non-rigidly deforming scenes using range sensors has numerous applications including computer vision, AR/VR, and robotics.
Moreover, modeling and inferring complex relations of one-to-many (1-N), many-to-one (N-1), and many-to-many (N-N) by previous knowledge graph completion approaches requires high model complexity and a large amount of training instances.
In recent studies, neural message passing has proved to be an effective way to design graph neural networks (GNNs), which have achieved state-of-the-art performance in many graph-based tasks.
This task is challenging as models need not only to capture spatial dependency and temporal dependency within the data, but also to leverage useful auxiliary information for accurate predictions.
Specifically, we use optimal transport to estimate domain difference and the optimal coupling between source and target distributions, which is then used to derive the conditional entropy of the target task (task difference).
In order to balance the interests of the integrated energy operator (IEO) and users, a novel Stackelberg game-based optimization framework is proposed for the optimal scheduling of integrated demand response (IDR)-enabled integrated energy systems with uncertain renewable generation, where the IEO acts as the leader that maximizes its profits by setting energy prices, while the users are the followers who adjust their energy consumption plans to minimize their energy costs.
Existing coordinated cyber-attack detection methods have low detection accuracy and efficiency and poor generalization ability due to difficulties dealing with unbalanced attack data samples, high data dimensionality, and noisy data sets.
To obtain accurate transient states of large-scale natural gas pipeline networks under bad data and non-zero-mean noise conditions, this paper proposes a robust Kalman filter-based dynamic state estimation method using the linearized gas pipeline transient flow equations.
no code implementations • 25 Feb 2021 • Justin R. Gagnon, Orad Reshef, Daniel H. G. Espinosa, M. Zahirul Alam, Daryl I. Vulis, Erik N. Knall, Jeremy Upham, Yang Li, Ksenia Dolgaleva, Eric Mazur, Robert W. Boyd
However, the utility of all such parametric nonlinear optical processes is hampered by phase-matching requirements.
Optics Applied Physics
We find that the thermodynamics of this class of conformally related black holes is independent of scale factors.
General Relativity and Quantum Cosmology High Energy Physics - Theory
To make off-screen interaction without specialized hardware practical, we investigate using deep learning methods to process the common built-in IMU sensor (accelerometers and gyroscopes) on mobile phones into a useful set of one-handed interaction events.
Based on our experiments, Spacewalker allows designers to effectively search a large design space of a UI, using the language they are familiar with, and improve their design rapidly at a minimal cost.
Secondly, a long short-term memory (LSTM) based assessment model is built through learning the time dependencies from the post-disturbance system dynamics.
Considering the constraints of the temporal conversion of information flow and energy flow, a microgrid CPS coupling model is established, the effectiveness of which is verified by simulating false data injection attack (FDIA) scenarios.
In recent years, deep learning-based automated personality trait detection has received a lot of attention, especially now, due to the massive digital footprints of an individual.
Generative adversarial networks have shown their ability to capture high-dimensional complex distributions and generate realistic data samples, e.g., images.
In this framework, the BO methods are used to solve the HPO problem for each ML algorithm separately, incorporating a much smaller hyperparameter space for BO methods.
Instead of sampling configurations randomly in HB, BOHB samples configurations based on a BO surrogate model, which is constructed with the high-fidelity measurements only.
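The sampling step described above can be pictured with a minimal sketch of successive halving, the inner loop that Hyperband repeats. This is an illustration under stated assumptions, not the BOHB implementation: `successive_halving`, `sample_config`, and `evaluate` are hypothetical names, and `sample_config` marks the single place where a random sampler (as in HB) could be swapped for draws from a surrogate model (as in BOHB).

```python
def successive_halving(sample_config, evaluate, n_configs=9, min_budget=1, eta=3):
    """Keep the best 1/eta of the configurations each round while growing the budget.

    sample_config: zero-argument callable returning a configuration (hypothetical hook;
                   random in Hyperband, surrogate-guided in BOHB).
    evaluate:      callable (config, budget) -> loss, lower is better (hypothetical hook).
    """
    configs = [sample_config() for _ in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        # Rank all surviving configurations at the current budget.
        ranked = sorted(configs, key=lambda c: evaluate(c, budget))
        # Keep the top 1/eta (at least one) and give survivors eta times more budget.
        configs = ranked[:max(1, len(configs) // eta)]
        budget *= eta
    return configs[0]
```

With the default arguments, 9 configurations are evaluated at budget 1, the best 3 at budget 3, and the single survivor is returned.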
Then, facilitated by the proposed base model, we introduce collaborating relation features shared among relations in the hierarchies to promote the relation-augmenting process and balance the training data for long-tail relations.
We propose widget captioning, a novel task for automatically generating language descriptions for UI elements from multimodal input including both the image and the structural representations of user interfaces.
The emergence of transition phenomena between metastable states induced by noise plays a fundamental role in a broad range of nonlinear systems.
The success of Deep Neural Networks (DNNs) highly depends on data quality.
So it is necessary to give more attention to the EEG samples with strong transferability rather than forcefully training a classification model on all the samples.
no code implementations • 21 Aug 2020 • Zhang Li, Jiehua Zhang, Tao Tan, Xichao Teng, Xiaoliang Sun, Yang Li, Lihong Liu, Yang Xiao, Byungjae Lee, Yilong Li, Qianni Zhang, Shujiao Sun, Yushan Zheng, Junyu Yan, Ni Li, Yiyu Hong, Junsu Ko, Hyun Jung, Yanling Liu, Yu-cheng Chen, Ching-Wei Wang, Vladimir Yurovskiy, Pavel Maevskikh, Vahid Khanagha, Yi Jiang, Xiangjun Feng, Zhihong Liu, Daiqiang Li, Peter J. Schüffler, Qifeng Yu, Hui Chen, Yuling Tang, Geert Litjens
All methods were based on deep learning and categorized into two groups: multi-model methods and single-model methods.
To further enhance the inter-class discriminative power of the feature generated by this network, we adapt the concept of triplet loss from supervised metric learning to our unsupervised case and introduce the contrastive clustering loss.
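For readers unfamiliar with the starting point, the following is a minimal sketch of the standard supervised triplet loss only. The unsupervised adaptation and the contrastive clustering loss mentioned above are not reproduced here. The function names, the Euclidean distance, and the margin value are assumptions made for illustration.

```python
from math import sqrt


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that is zero once the negative is at least `margin`
    farther from the anchor than the positive is."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

The loss only penalizes triplets where the negative sample is not yet separated from the anchor by the margin, which is what pushes different classes apart in the embedding space.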
Reasoning over an instance composed of a set of vectors, like a point cloud, requires that one accounts for intra-set dependent features among elements.
Question Answering (QA) over Knowledge Base (KB) aims to automatically answer natural language questions via well-structured relation information between entities stored in knowledge bases.
This paper proposes to use sentiment analysis to extract useful information from multiple textual data sources and a blending ensemble deep learning model to predict future stock movement.
With this score, we can identify the pretraining examples in the pretraining task that contribute most to a prediction in the finetuning task.
To solve this challenge, we proposed a Teacher-Student GAN model (TS-GAN) to adapt to different domains and guide the ReID backbone to learn better ReID information.
We present SplitFusion, a novel dense RGB-D SLAM framework that simultaneously performs tracking and dense reconstruction for both rigid and non-rigid components of the scene.
The probabilistic classification vector machine (PCVM) synthesizes the advantages of both the support vector machine and the relevance vector machine, delivering a sparse Bayesian solution to classification problems.
In previous studies, decoding electroencephalography (EEG) signals has not considered the topological relationship of EEG electrodes.
To trade off the improvement with the cost of acquisition, we leverage an information theoretic metric, conditional mutual information, to select the most informative feature to acquire.
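As a rough illustration of this selection rule, the sketch below estimates the conditional mutual information empirically on discrete data by restricting to the samples that agree with the features already observed, then picks the unobserved feature with the highest score. This is a toy plug-in estimator under stated assumptions, not the paper's method, and `mutual_information` and `next_feature` are hypothetical names.

```python
from collections import Counter
from math import log


def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in nats from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in joint.items())


def next_feature(rows, labels, observed):
    """Pick the unobserved feature index with the highest I(X_i; Y | x_o).

    rows:     list of tuples of discrete feature values
    labels:   list of discrete targets, aligned with rows
    observed: dict {feature index: observed value}; must leave at least one feature unobserved
    """
    # Conditioning on x_o: keep only samples consistent with the observed values.
    matching = [(r, y) for r, y in zip(rows, labels)
                if all(r[i] == v for i, v in observed.items())]
    candidates = [i for i in range(len(rows[0])) if i not in observed]
    scores = {i: mutual_information([(r[i], y) for r, y in matching]) for i in candidates}
    return max(scores, key=scores.get), scores
```

In practice the score would be weighed against each feature's acquisition cost, and acquisition would stop once no candidate is worth its cost.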
In order to relieve this problem, we propose a novel grafted network (GraftedNet), which is designed by grafting a high-accuracy rootstock and a light-weighted scion.
A new heterogeneous branch, SE-Res-Branch, is proposed based on the SE-Res module, which consists of the Squeeze-and-Excitation block and the residual block.
We then design a numerical algorithm to compute the drift, diffusion coefficient and jump measure, and thus extract a governing stochastic differential equation with Gaussian and non-Gaussian noise.
We present a new problem: grounding natural language instructions to mobile user interface actions, and create three new datasets for it.
We describe the annotation process in detail and compare it with other similar evaluation systems.
We calculated the scattering force distribution of a micro-particle trapped in optical tweezers formed by the strongly focused LG beam, and showed that there exist stable trajectories of the particle that are controlled by the negative torque.
One of the widespread solutions for non-rigid tracking has a nested-loop structure: with Gauss-Newton to minimize a tracking objective in the outer loop, and Preconditioned Conjugate Gradient (PCG) to solve a sparse linear system in the inner loop.
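The nested-loop structure can be sketched on a toy problem. The example below fits $y = a e^{bt}$ by Gauss-Newton (outer loop) and solves each step's normal equations with Jacobi-preconditioned conjugate gradient (inner loop). It is a small dense illustration, not a non-rigid tracker: the curve-fitting objective, the Jacobi preconditioner, and the names `pcg` and `gauss_newton` are all assumptions chosen for brevity, whereas real trackers operate on large sparse systems.

```python
from math import exp


def pcg(A, b, tol=1e-20, max_iter=50):
    """Solve A x = b for symmetric positive definite A with Jacobi-preconditioned CG."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                                    # residual b - A x for x = 0
    z = [r[i] / A[i][i] for i in range(n)]      # apply M^-1 = diag(A)^-1
    p = z[:]
    rz = sum(r[i] * z[i] for i in range(n))
    for _ in range(max_iter):
        if rz <= tol:                           # preconditioned residual small enough
            break
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [r[i] / A[i][i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x


def gauss_newton(ts, ys, a, b, iters=10):
    """Outer loop: linearize the residuals, then hand J^T J dx = -J^T r to PCG."""
    for _ in range(iters):
        res = [a * exp(b * t) - y for t, y in zip(ts, ys)]
        jac = [[exp(b * t), a * t * exp(b * t)] for t in ts]   # d/da, d/db
        JtJ = [[sum(row[i] * row[j] for row in jac) for j in range(2)] for i in range(2)]
        Jtr = [sum(row[i] * r for row, r in zip(jac, res)) for i in range(2)]
        da, db = pcg(JtJ, [-g for g in Jtr])
        a, b = a + da, b + db
    return a, b
```

Because the inner solve is iterative, its cost per outer step depends on how quickly PCG converges, which is why the choice of preconditioner matters in this kind of solver.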
In this work we develop a novel Bayesian neural network methodology to achieve strong adversarial robustness without the need for online adversarial training.
Dynamic environments are challenging for visual SLAM since the moving objects occlude the static environment features and lead to wrong camera motion estimation.
Robotic automation in surgery requires precise tracking of surgical tools and mapping of deformable tissue.
Versions of the ten DE algorithms based on individuals redistribution are compared not only with the original versions but also with versions based on complete restart, where individuals redistribution and complete restart are based on the same entry criterion.
To the best of our knowledge, this study is the first research on the prospective use of a deep learning-based diagnosis system for AVNFH by conducting two pilot studies representing real-world application scenarios.
Most existing works in Person Re-identification (ReID) focus on settings where illumination either is kept the same or has very little fluctuation.
While we focus on interface layout prediction, our model is generally applicable to other layout prediction problems that involve tree structures and 2-dimensional placements.
Distantly supervised relation extraction intrinsically suffers from noisy labels due to the strong assumption of distant supervision.
In this paper, we proposed a general framework for data poisoning attacks on graph-based semi-supervised learning (G-SSL).
To tackle this critical problem, we propose an attribute-aware pedestrian detector to explicitly model people's semantic attributes in a high-level feature detection fashion.
In this work, we propose a novel surgical perception framework, SuPer, for surgical robotic control.
Specifically, we start by learning the structure of the graph that parsimoniously represents the spatial dependency between the data at different locations.
We propose a hybrid hardware-software framework that has the potential to significantly reduce the computational complexity and memory requirements of on-device machine learning.
In this work, we develop a joint sample discovery and iterative model evolution method for semi-supervised learning on very small labeled training sets.
This paper addresses the issue and proposes a criterion for linear dynamical systems based on the principle of minimum description length.
We show that there is a simple linear time algorithm for verifying a single tree, and for tree ensembles, the verification problem can be cast as a max-clique problem on a multi-partite graph with bounded boxicity.
Similar to IQA models, the structural dissimilarity is computed based on the correlation of the structural features.
Furthermore, Cluster-GCN allows us to train much deeper GCN without much time and memory overhead, which leads to improved prediction accuracy: using a 5-layer Cluster-GCN, we achieve a state-of-the-art test F1 score of 99.36 on the PPI dataset, while the previous best result was 98.71 by .
Ranked #1 on Node Classification on Amazon2M
The performance of most clustering methods hinges on the pairwise affinity used, which is usually denoted by a similarity matrix.
An important question in task transfer learning is to determine task transferability, i.e., given a common input domain, estimating to what extent representations learned from a source task can help in learning a target task.
Thanks to the flexibility of ARM, many smooth or non-smooth parametric functions, such as scaled sigmoid or hard sigmoid, can be used to parameterize this binary optimization problem and the unbiasedness of the ARM estimator is retained, while the hard concrete estimator has to rely on the hard sigmoid function to achieve conditional computation and thus accelerated training.
To overcome this shortcoming, in this study we proposed a deep learning-based framework for reconstructing a full image from its much smaller sub-area(s).
Due to its low storage cost and fast query speed, hashing has been widely used for similarity search in large-scale multimedia retrieval applications.
In this paper, we propose to leverage graph optimization and loop closure detection to overcome limitations of unsupervised learning based monocular visual odometry.
Moral graphs were introduced in the 1980s as an intermediate step when transforming a Bayesian network to a junction tree, on which exact belief propagation can be efficiently done.
In this work, we develop a new approach to generative density estimation for exchangeable, non-i.i.d.
Sensor fusion has wide applications in many domains including health care and autonomous systems.
Furthermore, we also fuse phonetic features with textual and visual features in order to mimic the way humans read and understand Chinese text.
The concept of conditional computation for deep nets has been proposed previously to improve model performance by selectively using only parts of the model conditioned on the sample it is processing.
A brain-computer interface (BCI) system provides a pathway between humans and the outside world by analyzing brain signals, which contain potential neural information.
The performance of deep neural networks crucially depends on good hyperparameter configurations.
The algorithm achieves an order of magnitude faster inference than the original softmax layer for predicting top-$k$ words in various tasks such as beam search in machine translation or next-word prediction.
We propose area attention: a way to attend to areas in the memory, where each area contains a group of items that are structurally adjacent, e.g., spatially for a 2D memory such as images, or temporally for a 1D memory such as natural language sentences.
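To make the notion of an area concrete for a 1D memory, the sketch below enumerates every contiguous span up to a maximum width and builds one key and one value per span. Using the mean of the item keys as the area key and the sum of the item values as the area value is an assumption of this sketch, as are the name `area_features` and the `max_width` parameter. Attention would then be computed over these areas instead of over individual items.

```python
def area_features(keys, values, max_width=2):
    """Enumerate contiguous 1D areas of width 1..max_width.

    keys, values: lists of equal-length float vectors, one per memory item.
    Returns a list of (start, width, area_key, area_value) tuples, where
    area_key is the mean of the item keys and area_value is the sum of the item values.
    """
    areas = []
    for width in range(1, max_width + 1):
        for start in range(len(keys) - width + 1):
            ks = keys[start:start + width]
            vs = values[start:start + width]
            area_key = [sum(k[d] for k in ks) / width for d in range(len(ks[0]))]
            area_value = [sum(v[d] for v in vs) for d in range(len(vs[0]))]
            areas.append((start, width, area_key, area_value))
    return areas
```

With `max_width=1` this reduces to ordinary item-level memory, so regular attention is the special case in which every area holds a single item.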
Decoding EEG signals of different mental states is a challenging task for brain-computer interfaces (BCIs) due to nonstationarity of perceptual decision processes.
Existing image paragraph captioning methods give a series of sentences to represent the objects and regions of interest, where the descriptions are essentially generated by feeding the image fragments containing objects and regions into conventional image single-sentence captioning models.
Hashing techniques are in great demand for a wide range of real-world applications such as image retrieval and network compression.
Here we explore two related but important tasks based on the recently released REalistic Single Image DEhazing (RESIDE) benchmark dataset: (i) single image dehazing as a low-level image restoration problem; and (ii) high-level visual understanding (e.g., object detection) of hazy images.
Model compression is essential for serving large deep neural nets on devices with limited resources or applications that require real-time responses.
The inference structures and computational complexity of existing deep neural networks, once trained, are fixed and remain the same for all test images.
For cross graph convolution, a parameterized Kronecker sum operation is proposed to generate a conjunctive adjacency matrix characterizing the relationship between every pair of nodes across two subgraphs.
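For orientation, the sketch below computes the standard, unparameterized Kronecker sum $A \oplus B = A \otimes I_m + I_n \otimes B$ of two adjacency matrices, which yields the adjacency matrix of the Cartesian product graph over all node pairs. The parameterization proposed above is not specified in this text and is not reproduced. The helper names `kron`, `identity`, and `kronecker_sum` are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
        for i in range(len(a)) for k in range(len(b))
    ]


def identity(n):
    """n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def kronecker_sum(a, b):
    """A (+) B = A (x) I_m + I_n (x) B for square A (n x n) and B (m x m)."""
    n, m = len(a), len(b)
    left = kron(a, identity(m))     # edges of A, replicated for each node of B
    right = kron(identity(n), b)    # edges of B, replicated for each node of A
    size = n * m
    return [[left[i][j] + right[i][j] for j in range(size)] for i in range(size)]
```

For two single-edge graphs, the result is the adjacency matrix of a 4-cycle: each of the four node pairs is connected to the two pairs that differ from it in exactly one coordinate.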