no code implementations • 14 Jan 2025 • Mobai Xue, Jun Du, Zhenrong Zhang, Jiefeng Ma, Qikai Chang, Pengfei Hu, Jianshu Zhang, Yu Hu
We used generated misspelled characters as data augmentation in Chinese character error correction tasks, simulating the scenario where students learn handwritten Chinese characters with the help of misspelled characters.
no code implementations • 2 Jan 2025 • Huashan Chen, Yifan Xu, Yue Feng, Ming Jian, Feng Liu, Pengfei Hu, KeBin Peng, Sen He, Zi Wang
We study the dynamic characteristics of lip biometrics based on articulator motion.
1 code implementation • 22 Dec 2024 • Wenhang Shi, Yiren Chen, Shuqing Bian, Xinyi Zhang, Zhe Zhao, Pengfei Hu, Wei Lu, Xiaoyong Du
Knowledge stored in large language models requires timely updates to reflect the dynamic nature of real-world information.
1 code implementation • 10 Dec 2024 • Qikai Chang, Mingjun Chen, Changpeng Pi, Pengfei Hu, Zhenrong Zhang, Jiefeng Ma, Jun Du, BaoCai Yin, Jinshui Hu
The primary objective of Optical Chemical Structure Recognition is to convert chemical structure images into corresponding markup sequences.
no code implementations • 17 Nov 2024 • Eric Yang, Pengfei Hu, Xiaoxue Han, Yue Ning
The adoption of digital systems in healthcare has resulted in the accumulation of vast electronic health records (EHRs), offering valuable data for machine learning methods to predict patient health outcomes.
no code implementations • 11 Nov 2024 • Runming Yang, Taiqiang Wu, Jiahao Wang, Pengfei Hu, Ngai Wong, Yujiu Yang
Inspired by this observation, we explore the strategy that combines LoRA and KD to enhance the efficiency of knowledge transfer.
no code implementations • 25 Oct 2024 • Pengfei Hu, Chang Lu, Fei Wang, Yue Ning
Electronic Health Records (EHRs) have revolutionized healthcare data management and prediction in the field of AI and machine learning.
1 code implementation • 17 Oct 2024 • Hanbo Cheng, Limin Lin, Chenyu Liu, Pengcheng Xia, Pengfei Hu, Jiefeng Ma, Jun Du, Jia Pan
To address these challenges, we present DAWN (Dynamic frame Avatar With Non-autoregressive diffusion), a framework that enables all-at-once generation of dynamic-length video sequences.
no code implementations • 13 Oct 2024 • Pengfei Hu, Yuhang Qian, Tianyue Zheng, Ang Li, Zhe Chen, Yue Gao, Xiuzhen Cheng, Jun Luo
Given the wide adoption of multimodal sensors (e.g., camera, lidar, radar) by autonomous vehicles (AVs), deep analytics to fuse their outputs for a robust perception become imperative.
no code implementations • 29 Sep 2024 • Shuhang Liu, Zhenrong Zhang, Pengfei Hu, Jiefeng Ma, Jun Du, Qing Wang, Jianshu Zhang, Chenyu Liu
Positioned at the outset of the answer text, the <see> token allows the model to first see--observing the regions of the image related to the input question--and then tell--providing articulated textual responses.
no code implementations • 18 Sep 2024 • Pengfei Hu, Zhenrong Zhang, Jiefeng Ma, Shuhang Liu, Jun Du, Jianshu Zhang
In recent years, visually-rich document understanding has attracted increasing attention.
no code implementations • 21 Jun 2024 • Ya Jiang, Qing Wang, Jun Du, Maocheng Hu, Pengfei Hu, Zeyan Liu, Shi Cheng, Zhaoxu Nian, Yuxuan Dong, Mingqi Cai, Xin Fang, Chin-Hui Lee
Evaluation results on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2023 Challenge data set demonstrate significant improvements in sound event localization and detection (SELD) performance.
no code implementations • 13 Jun 2024 • Jiefeng Ma, Yan Wang, Chenyu Liu, Jun Du, Yu Hu, Zhenrong Zhang, Pengfei Hu, Qing Wang, Jianshu Zhang
Accurately identifying and organizing textual content is crucial for the automation of document processing in the field of form understanding.
1 code implementation • 20 May 2024 • Chunxia Qin, Zhenrong Zhang, Pengfei Hu, Chenyu Liu, Jiefeng Ma, Jun Du
The "split-and-merge" paradigm is a pivotal approach to parsing table structure, where the detection of table separation lines is crucial.
no code implementations • 26 Feb 2024 • Jiahao Wang, Sikun Yang, Heinz Koeppl, Xiuzhen Cheng, Pengfei Hu, Guoming Zhang
Probabilistic approaches for handling count-valued time sequences have attracted considerable research attention because of their ability to infer explainable latent structures and to estimate uncertainties, which makes them especially suitable for dealing with noisy and incomplete count data.
no code implementations • 31 Dec 2023 • Hanbo Cheng, Chenyu Liu, Pengfei Hu, Zhenrong Zhang, Jiefeng Ma, Jun Du
The Handwritten Mathematical Expression Recognition (HMER) task is a critical branch in the field of OCR.
no code implementations • 21 Sep 2023 • Feng Li, Yuqi Chai, Huan Yang, Pengfei Hu, Lingjie Duan
How to incentivize strategic workers with a limited budget is a fundamental problem for crowdsensing systems. However, because the workers' sensing abilities may not be known in advance, owing to the diversity of their sensor devices and behaviors, it is difficult to properly select and pay the unknown workers.
no code implementations • 11 Sep 2023 • Haotian Wang, Yuxuan Xi, Hang Chen, Jun Du, Yan Song, Qing Wang, Hengshun Zhou, Chenxi Wang, Jiefeng Ma, Pengfei Hu, Ya Jiang, Shi Cheng, Jie Zhang, Yuzhe Weng
Three different structures based on attention-guided feature gathering (AFG) are designed for deep feature fusion.
1 code implementation • ICCV 2023 • Xiuzhe Wu, Pengfei Hu, Yang Wu, Xiaoyang Lyu, Yan-Pei Cao, Ying Shan, Wenming Yang, Zhongqian Sun, Xiaojuan Qi
Therefore, directly learning a mapping function from speech to the entire head image is prone to ambiguity, particularly when using a short video for training.
no code implementations • 30 Jul 2023 • Pengfei Hu, Jiefeng Ma, Zhenrong Zhang, Jun Du, Jianshu Zhang
This poses a challenge when dealing with an unseen misspelled character, as the decoder may generate an IDS sequence that matches a seen character instead.
1 code implementation • 24 Mar 2023 • Jiefeng Ma, Jun Du, Pengfei Hu, Zhenrong Zhang, Jianshu Zhang, Huihui Zhu, Cong Liu
Moreover, we proposed an encoder-decoder-based hierarchical document structure parsing system (DSPS) to tackle this problem.
1 code implementation • 8 Mar 2023 • Zhenrong Zhang, Pengfei Hu, Jiefeng Ma, Jun Du, Jianshu Zhang, Huihui Zhu, BaoCai Yin, Bing Yin, Cong Liu
Table structure recognition is an indispensable element for enabling machines to comprehend tables.
1 code implementation • 6 Dec 2022 • Pengfei Hu, Zhenrong Zhang, Jianshu Zhang, Jun Du, Jiajia Wu
Next, to parse the hierarchical relationship between the heading entities, a tree-structured decoder is designed.
1 code implementation • 24 Nov 2022 • Huanle Zhang, Lei Fu, Mi Zhang, Pengfei Hu, Xiuzhen Cheng, Prasant Mohapatra, Xin Liu
In this paper, we propose FedTune, an automatic FL hyper-parameter tuning algorithm tailored to applications' diverse system requirements in FL training.
no code implementations • 9 Sep 2022 • Peiwen Sun, Shanshan Zhang, Zishan Liu, Yougen Yuan, Taotao Zhang, Honggang Zhang, Pengfei Hu
It has already been observed that audio-visual embedding is more robust than uni-modality embedding for person verification.
no code implementations • 6 Sep 2022 • Guangrong Zhao, Yiran Shen, Ning Chen, Pengfei Hu, Lei Liu, Hongkai Wen
By designing a series of signal processing algorithms bespoke for dynamic vision sensing on mobile devices, EV-Tach is able to extract the rotational speed accurately from the event stream produced by dynamic vision sensing on rotary targets.
1 code implementation • CVPR 2022 • Jiakai Wang, Zixin Yin, Pengfei Hu, Aishan Liu, Renshuai Tao, Haotong Qin, Xianglong Liu, DaCheng Tao
To generalize against diverse noise, we inject class-specific identifiable patterns into a confined local patch prior, so that defensive patches preserve more recognizable features of specific classes and help models achieve better recognition under noise.
no code implementations • 13 Dec 2021 • Guodong Ma, Pengfei Hu, Nurmemet Yolwas, Shen Huang, Hao Huang
To boost the performance of PMT, we propose fusing a multi-modeling unit training (MMUT) architecture with PMT (PM-MMUT).
1 code implementation • 16 Sep 2021 • Minxing Zhang, Zhaochun Ren, Zihan Wang, Pengjie Ren, Zhumin Chen, Pengfei Hu, Yang Zhang
In this paper, we make the first attempt at quantifying the privacy leakage of recommender systems through the lens of membership inference.
no code implementations • 5 Jul 2021 • Huanhai Xin, Kehao Zhuang, Pengfei Hu, Yunjie Gu, Ping Ju
Based on the dual synchronous idea, a dual synchronous generator (DSG) control is applied to the VSC to form an inertial current source.
no code implementations • 29 Oct 2019 • Lingchen Zhao, Shengshan Hu, Qian Wang, Jianlin Jiang, Chao Shen, Xiangyang Luo, Pengfei Hu
Collaborative learning allows multiple clients to train a joint model without sharing their data with each other.