no code implementations • 1 Aug 2024 • Xuri Ge, Junchen Fu, Fuhai Chen, Shan An, Nicu Sebe, Joemon M. Jose
Facial action units (AUs), as defined in the Facial Action Coding System (FACS), have received significant research interest owing to their diverse range of applications in facial state analysis.
1 code implementation • 26 Apr 2024 • Xuri Ge, Songpei Xu, Fuhai Chen, Jie Wang, Guoxin Wang, Shan An, Joemon M. Jose
In this paper, we propose a novel visual Semantic-Spatial Self-Highlighting Network (termed 3SHNet) for high-precision, high-efficiency and high-generalization image-sentence retrieval.
Ranked #1 on Cross-Modal Retrieval on MSCOCO
no code implementations • 15 Jan 2024 • Guoxin Wang, Sheng Shi, Shan An, Fengmei Fan, Wenshu Ge, Qi Wang, Feng Yu, Zhiren Wang
Previous research on the diagnosis of bipolar disorder has mainly focused on resting-state functional magnetic resonance imaging.
2 code implementations • 31 Dec 2023 • Dimitrios Psychogyios, Emanuele Colleoni, Beatrice van Amsterdam, Chih-Yang Li, Shu-Yu Huang, Yuchong Li, Fucang Jia, Baosheng Zou, Guotai Wang, Yang Liu, Maxence Boels, Jiayu Huo, Rachel Sparks, Prokar Dasgupta, Alejandro Granados, Sebastien Ourselin, Mengya Xu, An Wang, Yanan Wu, Long Bai, Hongliang Ren, Atsushi Yamada, Yuriko Harai, Yuto Ishikawa, Kazuyuki Hayashi, Jente Simoens, Pieter DeBacker, Francesco Cisternino, Gabriele Furnari, Alex Mottrie, Federica Ferraguti, Satoshi Kondo, Satoshi Kasai, Kousuke Hirasawa, Soohee Kim, Seung Hyun Lee, Kyu Eun Lee, Hyoun-Joong Kong, Kui Fu, Chao Li, Shan An, Stefanie Krell, Sebastian Bodenstedt, Nicolas Ayobi, Alejandra Perez, Santiago Rodriguez, Juanita Puentes, Pablo Arbelaez, Omid Mohareri, Danail Stoyanov
Surgical tool segmentation and action recognition are fundamental building blocks in many computer-assisted intervention applications, ranging from surgical skills assessment to decision support systems.
no code implementations • 4 Oct 2023 • Guoxin Wang, Xuyang Cao, Shan An, Fengmei Fan, Chao Zhang, Jinsong Wang, Feng Yu, Zhiren Wang
In this work, we propose a multi-dimension-embedding-aware modality fusion transformer (MFFormer) for schizophrenia and bipolar disorder classification using rs-fMRI and T1-weighted structural MRI (T1w sMRI).
3 code implementations • 11 Sep 2023 • Dingfeng Shi, Qiong Cao, Yujie Zhong, Shan An, Jian Cheng, Haogang Zhu, DaCheng Tao
Temporal action detection (TAD) aims to detect all action boundaries and their corresponding categories in an untrimmed video.
Ranked #1 on Temporal Action Localization on MultiTHUMOS
1 code implementation • 20 Mar 2023 • Junjie Ye, Changhong Fu, Ziang Cao, Shan An, Guangze Zheng, Bowen Li
To realize reliable UAV tracking at night, a spatial-channel Transformer-based low-light enhancer (namely SCT), which is trained in a novel task-inspired manner, is proposed and plugged prior to tracking approaches.
no code implementations • 24 Aug 2021 • Shan An, Guangfu Che, Jinghao Guo, Haogang Zhu, Junjie Ye, Fangru Zhou, Zhaoqi Zhu, Dong Wei, Aishan Liu, Wei Zhang
To address this concern, this work proposes a real-time augmented reality virtual shoe try-on system for smartphones, namely ARShoe.
1 code implementation • 24 Aug 2021 • Shan An, Fangru Zhou, Mei Yang, Haogang Zhu, Changhong Fu, Konstantinos A. Tsintotas
Estimating a scene's depth to achieve collision avoidance against moving pedestrians is a crucial and fundamental problem in the robotic field.
no code implementations • 14 Feb 2021 • Shan An, Xiajie Zhang, Dong Wei, Haogang Zhu, Jianyu Yang, Konstantinos A. Tsintotas
Hand pose estimation is a fundamental task in many human-robot interaction-related applications.
2 code implementations • 29 Sep 2020 • Shan An, Haogang Zhu, Dong Wei, Konstantinos A. Tsintotas, Antonios Gasteratos
In recent years, the robotics community has extensively examined methods concerning the place recognition task within the scope of simultaneous localization and mapping applications. This article proposes an appearance-based loop closure detection pipeline named "FILD++" (Fast and Incremental Loop closure Detection). First, the system is fed consecutive images and, by passing them twice through a single convolutional neural network, extracts global and local deep features. Subsequently, a hierarchical navigable small-world graph incrementally constructs a visual database representing the robot's traversed path based on the computed global features. Finally, a query image, captured at each time step, is used to retrieve similar locations on the traversed route. An image-to-image pairing follows, which exploits local features to evaluate the spatial information.
Loop Closure Detection
Simultaneous Localization and Mapping
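The incremental retrieval step described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the hierarchical navigable small-world graph for a brute-force cosine-similarity search, uses hand-written vectors in place of CNN global features, and omits the local-feature verification stage. The class name `IncrementalPlaceDatabase`, the `exclude_recent` window, and the `threshold` value are all hypothetical choices made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class IncrementalPlaceDatabase:
    """Toy visual database that grows by one global feature per time step.

    Assumption: brute-force search stands in for the graph-based index
    named in the abstract; only the query-then-insert flow is shown.
    """

    def __init__(self, exclude_recent=2, threshold=0.9):
        self.features = []
        # Hypothetical parameter: skip the most recent frames so that a
        # query does not simply match its immediate predecessors.
        self.exclude_recent = exclude_recent
        # Hypothetical parameter: minimum similarity for a candidate.
        self.threshold = threshold

    def query_then_add(self, feature):
        """Return (index, similarity) of the best earlier match, or None,
        then append the feature to the database."""
        searchable = max(len(self.features) - self.exclude_recent, 0)
        best = None
        for i in range(searchable):
            score = cosine(feature, self.features[i])
            if score >= self.threshold and (best is None or score > best[1]):
                best = (i, score)
        self.features.append(list(feature))
        return best


# Stand-in global features for five consecutive frames; the last frame
# revisits the place seen in frame 0.
path = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [0.99, 0.05, 0.0],
]
db = IncrementalPlaceDatabase(exclude_recent=2, threshold=0.95)
candidates = [db.query_then_add(f) for f in path]
print(candidates)
```

In this toy run only the final frame yields a loop closure candidate, pointing back to frame 0. In a full pipeline such a candidate would still need the image-to-image check on local features before being accepted.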
1 code implementation • 25 Nov 2019 • Shan An, Guangfu Che, Fangru Zhou, Xianglong Liu, Xin Ma, Yu Chen
Visual loop closure detection, which can be considered as an image retrieval task, is an important problem in SLAM (Simultaneous Localization and Mapping) systems.