Specifically, we design a Patch Message Passing (PMP) module based on the Message Passing mechanism to establish global interaction for pathological semantic features and to exploit the subtle differences further between different diseases.
1 code implementation • 1 Nov 2023 • Xingru Huang, Yihao Guo, Jian Huang, Zhi Li, Tianyun Zhang, Kunyan Cai, Gaopeng Huang, WenHao Chen, Zhaoyang Xu, Liangqiong Qu, Ji Hu, Tinyu Wang, Shaowei Jiang, Chenggang Yan, Yaoqi Sun, Xin Ye, Yaqi Wang
Macular hole diagnosis and treatment rely heavily on spatial and quantitative data, yet the scarcity of such data has impeded the progress of deep learning techniques for effective segmentation and real-time 3D reconstruction.
In contrast to conventional pipeline Spoken Language Understanding (SLU) which consists of automatic speech recognition (ASR) and natural language understanding (NLU), end-to-end SLU infers the semantic meaning directly from speech and overcomes the error propagation caused by ASR.
We present a novel two-layer hierarchical reinforcement learning approach equipped with a Goals Relational Graph (GRG) for tackling the partially observable goal-driven task, such as goal-driven visual navigation.
Despite the significant success at enabling robots with autonomous behaviors makes deep reinforcement learning a promising approach for robotic object search task, the deep reinforcement learning approach severely suffers from the nature sparse reward setting of the task.
Visual Indoor Navigation (VIN) task has drawn increasing attention from the data-driven machine learning communities especially with the recently reported success from learning-based methods.
no code implementations • 8 Nov 2019 • Shanxin Yuan, Radu Timofte, Gregory Slabaugh, Ales Leonardis, Bolun Zheng, Xin Ye, Xiang Tian, Yaowu Chen, Xi Cheng, Zhen-Yong Fu, Jian Yang, Ming Hong, Wenying Lin, Wenjin Yang, Yanyun Qu, Hong-Kyu Shin, Joon-Yeon Kim, Sung-Jea Ko, Hang Dong, Yu Guo, Jie Wang, Xuan Ding, Zongyan Han, Sourya Dipta Das, Kuldeep Purohit, Praveen Kandula, Maitreya Suin, A. N. Rajagopalan
A new dataset, called LCDMoire was created for this challenge, and consists of 10, 200 synthetically generated image pairs (moire and clean ground truth).
Multiple-choice VQA has drawn increasing attention from researchers and end-users recently.
Despite significant progress in Robotic Object Search (ROS) over the recent years with deep reinforcement learning based approaches, the sparsity issue in reward setting as well as the lack of interpretability of the previous ROS approaches leave much to be desired.
In this paper, we introduce an ensemble method of incremental classifier to alleviate this problem, which is based on the cosine distance between the output feature and the pre-defined center, and can let each task to be preserved in different networks.
With the development of convolutional neural networks (CNNs) in recent years, the network structure has become more and more complex and varied, and has achieved very good results in pattern recognition, image classification, object detection and tracking.
We study the problem of learning a generalizable action policy for an intelligent agent to actively approach an object of interest in an indoor environment solely from its visual inputs.
We study the problem of learning a navigation policy for a robot to actively search for an object of interest in an indoor environment solely from its visual inputs.
This paper, for the first time, introduces a quadratic programming model with the network flow constraints to improve the accuracy of crowd counting.