1 code implementation • 6 Aug 2024 • Cheng Ye, Weidong Chen, Jingyu Li, Lei Zhang, Zhendong Mao
Emotional Video Captioning is an emerging task that aims to describe factual content with the intrinsic emotions expressed in videos.
1 code implementation • 16 Jul 2024 • Yanbo Wang, Wentao Zhao, Chuan Cao, Tianchen Deng, Jingchuan Wang, Weidong Chen
Although LiDAR semantic segmentation advances rapidly, state-of-the-art methods often incorporate specifically designed inductive bias derived from benchmarks originating from mechanical spinning LiDAR.
1 code implementation • 26 May 2024 • Tianchen Deng, Yi Zhou, Wenhua Wu, Mingrui Li, Jingwei Huang, Shuhong Liu, Yanzeng Song, Hao Zuo, Yanbo Wang, Yutao Yue, Hesheng Wang, Weidong Chen
Leveraging this information, we propose a multi-modal UAV detection, classification, and 3D tracking method for accurate UAV classification and tracking.
1 code implementation • 19 Apr 2024 • Fengyi Fu, Shancheng Fang, Weidong Chen, Zhendong Mao
Furthermore, a batch attention module is proposed to alleviate the problem of missing sentimental samples caused by data imbalance, which is common in live videos because video popularity varies.
no code implementations • 9 Apr 2024 • Tianchen Deng, Nailin Wang, Chongdi Wang, Shenghai Yuan, Jingchuan Wang, Danwei Wang, Weidong Chen
For pose estimation, a feature-metric bundle adjustment (FBA) method is designed for accurate and robust camera tracking in large-scale scenes.
1 code implementation • 29 Mar 2024 • Tianchen Deng, Yanbo Wang, Hongle Xie, Hesheng Wang, Jingchuan Wang, Danwei Wang, Weidong Chen
Second, the occupancy scene representation is replaced with a hierarchical Signed Distance Field (SDF) scene representation for high-quality reconstruction and view synthesis.
1 code implementation • 17 Mar 2024 • Tianchen Deng, Yaohui Chen, Leyan Zhang, Jianfei Yang, Shenghai Yuan, Jiuming Liu, Danwei Wang, Hesheng Wang, Weidong Chen
Recent work has shown that 3D Gaussian-based SLAM enables high-quality reconstruction, accurate pose estimation, and real-time rendering of scenes.
no code implementations • CVPR 2024 • Tianchen Deng, Guole Shen, Tong Qin, Jianyu Wang, Wentao Zhao, Jingchuan Wang, Danwei Wang, Weidong Chen
To this end, we introduce PLGSLAM, a neural visual SLAM system capable of high-fidelity surface reconstruction and robust camera tracking in real time.
no code implementations • 14 Dec 2023 • Tianchen Deng, Siyang Liu, Xuan Wang, Yejia Liu, Danwei Wang, Weidong Chen
Implicit neural representation has demonstrated promising results in view synthesis for large and complex scenes.
no code implementations • 14 Nov 2023 • Ting Wang, Weidong Chen, Yuanhe Tian, Yan Song, Zhendong Mao
Because bridging the semantic gap between images and texts is difficult in image captioning, conventional studies in this area have treated semantic concepts as a bridge between the two modalities and improved captioning performance accordingly.
1 code implementation • 20 Jul 2023 • Weidong Chen, Xiaofen Xing, Peihao Chen, Xiangmin Xu
Although PTMs shed new light on artificial general intelligence, they are constructed with general tasks in mind, and thus, their efficacy for specific tasks can be further improved.
no code implementations • 14 Mar 2023 • Tengjun Liu, Yansong Chua, Yiwei Zhang, Yuxiao Ning, Pengfu Liu, Guihua Wan, Zijun Wan, Shaomin Zhang, Weidong Chen
Despite their better bio-plausibility, goal-driven spiking neural networks (SNNs) have not achieved applicable performance for classifying biological spike trains, and have shown little bio-functional similarity compared to traditional artificial neural networks.
1 code implementation • 3 Mar 2023 • Shuaiqi Chen, Xiaofen Xing, Weibin Zhang, Weidong Chen, Xiangmin Xu
A self-attention mechanism is applied within windows to capture temporally important information locally in a fine-grained way.
1 code implementation • 27 Feb 2023 • Weidong Chen, Xiaofen Xing, Xiangmin Xu, Jianxin Pang, Lan Du
Paralinguistic speech processing is important in addressing many issues, such as sentiment and neurocognitive disorder analyses.
1 code implementation • 26 Jul 2022 • Weidong Chen, Dexiang Hong, Yuankai Qi, Zhenjun Han, Shuhui Wang, Laiyun Qing, Qingming Huang, Guorong Li
To address this problem, we propose a multi-attention network which consists of a dual-path dual-attention module and a query-based cross-modal Transformer module.
Ranked #5 on Referring Expression Segmentation on A2D Sentences
1 code implementation • 1 Apr 2021 • Guangming Wang, Hesheng Wang, Yiling Liu, Weidong Chen
A new unsupervised learning method of depth and ego-motion using multiple masks from monocular video is proposed in this paper.
1 code implementation • 30 Jan 2021 • Weiquan Fan, Xiangmin Xu, Xiaofen Xing, Weidong Chen, DongYan Huang
Speech emotion recognition is a vital contributor to the next generation of human-computer interaction (HCI).
Ranked #1 on Speech Emotion Recognition on LSSED
1 code implementation • 16 Sep 2020 • Hanjiang Hu, Hesheng Wang, Zhe Liu, Weidong Chen
Visual localization is a crucial component in mobile robot and autonomous driving applications.
no code implementations • 30 Mar 2020 • Xiyi Wei, Yu-Tian Xiao, Jian Wang, Rui Chen, Wei Zhang, Yue Yang, Daojun Lv, Chao Qin, Di Gu, Bo Zhang, Weidong Chen, Jianquan Hou, Ninghong Song, Guohua Zeng, Shancheng Ren
Objective: To conduct a meta-analysis of current studies that examined sex differences in severity and mortality in patients with COVID-19, and identify potential mechanisms underpinning these differences.
1 code implementation • 23 Sep 2019 • Hanjiang Hu, Hesheng Wang, Zhe Liu, Chenguang Yang, Weidong Chen, Le Xie
To retrieve a target image from the database, the query image is first encoded using the encoder belonging to the query domain to obtain a domain-invariant feature vector.
1 code implementation • 7 Jan 2019 • Baoyuan Wu, Weidong Chen, Yanbo Fan, Yong Zhang, Jinlong Hou, Jie Liu, Tong Zhang
In this work, we propose to train CNNs from images annotated with multiple tags, to enhance the quality of visual representation of the trained CNN model.
no code implementations • CVPR 2018 • Baoyuan Wu, Weidong Chen, Peng Sun, Wei Liu, Bernard Ghanem, Siwei Lyu
In D2IA, we generate a relevant and distinct tag subset, in which the tags are relevant to the image contents and semantically distinct from each other, using sequential sampling from a determinantal point process (DPP) model.
no code implementations • 1 Mar 2017 • Nevrez Imamoglu, Zhixuan Wei, Huangjun Shi, Yuki Yoshida, Myagmarbayar Nergui, Jose Gonzalez, Dongyun Gu, Weidong Chen, Kenzo Nonami, Wenwei Yu
Saliency computation has become a popular research field for many applications due to the useful information provided by saliency maps.