3 code implementations • 19 Aug 2024 • Zhiyong Zhang, Aniket Gupta, Huaizu Jiang, Hanumant Singh
In this paper, we propose a highly efficient optical flow method that balances high accuracy with reduced computational demands.
no code implementations • 14 Jul 2024 • Yang Liu, Zhiyong Zhang
Video-based 3D human pose estimation methods have made significant progress; however, they still face the challenge of depth ambiguity.
1 code implementation • 15 Mar 2024 • Zhiyong Zhang, Huaizu Jiang, Hanumant Singh
Given features of the input images extracted at different spatial resolutions, global matching estimates an initial optical flow at 1/16 resolution, capturing large displacements, which is then refined at 1/8 resolution with lightweight CNN layers for better accuracy.
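The global matching step described above can be illustrated with a minimal sketch. It assumes a softmax-weighted matching over dot-product similarities, which the excerpt does not specify; the function name `global_matching_flow` and the toy one-hot features are hypothetical, and the CNN refinement at 1/8 resolution is not covered.

```python
import math

def global_matching_flow(feat1, feat2):
    """Estimate a coarse flow field by matching every pixel of feat1
    against every pixel of feat2 (assumed softmax-weighted matching).

    feat1, feat2: H x W grids whose cells are feature vectors (lists of floats).
    Returns an H x W grid of (dx, dy) displacements.
    """
    height, width = len(feat1), len(feat1[0])
    dim = len(feat1[0][0])
    # All candidate target positions in the second image.
    targets = [(tx, ty) for ty in range(height) for tx in range(width)]
    flow = []
    for y in range(height):
        row = []
        for x in range(width):
            # Scaled dot-product similarity to every target pixel.
            scores = [
                sum(a * b for a, b in zip(feat1[y][x], feat2[ty][tx])) / math.sqrt(dim)
                for tx, ty in targets
            ]
            # Numerically stable softmax over all targets.
            peak = max(scores)
            weights = [math.exp(s - peak) for s in scores]
            total = sum(weights)
            # Expected matching position, then displacement from the source pixel.
            exp_x = sum(w * tx for w, (tx, _) in zip(weights, targets)) / total
            exp_y = sum(w * ty for w, (_, ty) in zip(weights, targets)) / total
            row.append((exp_x - x, exp_y - y))
        flow.append(row)
    return flow

# Toy data (hypothetical): distinct one-hot features, second image shifted right by one pixel.
H, W = 2, 3

def one_hot(index, size, scale=20.0):
    return [scale if j == index else 0.0 for j in range(size)]

feat1 = [[one_hot(y * W + x, H * W) for x in range(W)] for y in range(H)]
feat2 = [[feat1[y][(x - 1) % W] for x in range(W)] for y in range(H)]

flow = global_matching_flow(feat1, feat2)
print(flow[0][0])
```

Because every pixel is compared with every other pixel, a match can be found regardless of distance, which is why this step suits large displacements; the quadratic cost is also why it is run at a coarse resolution.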
1 code implementation • 29 Feb 2024 • Yang Liu, Changzhen Qiu, Zhiyong Zhang
To the best of our knowledge, this survey is the first to comprehensively cover deep learning methods for 3D human pose estimation, including both single-person and multi-person approaches, as well as human mesh recovery, encompassing methods based on explicit models and implicit representations.
no code implementations • 13 Dec 2023 • Yuyang Sun, Huy H. Nguyen, Chun-Shien Lu, Zhiyong Zhang, Lu Sun, Isao Echizen
The growing diversity of digital face manipulation techniques has led to an urgent need for a universal and robust detection technology to mitigate the risks posed by malicious forgeries.
no code implementations • 11 Jan 2023 • Yuhan Xie, Zhiyong Zhang, Shaolong Chen, Changzhen Qiu
The segmentation of atrial scan images is of great significance for three-dimensional reconstruction of the atrium and for surgical positioning.
no code implementations • 31 Dec 2022 • Jie Feng, Ruimin Feng, Qing Wu, Zhiyong Zhang, Yuyao Zhang, Hongjiang Wei
The high quality and inner continuity of the images provided by INR have great potential to further improve the spatiotemporal resolution of dynamic MRI, without the need for any training data.
no code implementations • 7 Dec 2022 • Yuyang Sun, Zhiyong Zhang, Isao Echizen, Huy H. Nguyen, Changzhen Qiu, Lu Sun
We introduce a method for detecting manipulated videos that is based on the trajectory of the facial region displacement.
no code implementations • 25 Oct 2022 • Xulong Zhang, Jianzong Wang, Ning Cheng, Mengyuan Zhao, Zhiyong Zhang, Jing Xiao
We also find that in a joint CTC-Attention ASR model, the decoder is more sensitive to linguistic information than to acoustic information.
Automatic Speech Recognition (ASR)
no code implementations • 7 May 2022 • Yuhan Xie, Kexin Jiang, Zhiyong Zhang, Shaolong Chen, Xiaodong Zhang, Changzhen Qiu
Deep learning-based medical image segmentation often faces the problems of insufficient datasets and time-consuming labeling.
no code implementations • 27 Apr 2022 • Zhiyong Zhang, Pushyami Kaveti, Hanumant Singh, Abigail Powell, Erica Fruh, M. Elizabeth Clarke
In this paper, we present a methodology for fisheries-related data that allows us to converge on a labeled image dataset by iterating over the dataset with multiple training and production loops that can exploit crowdsourcing interfaces.
no code implementations • 15 Nov 2021 • Yuyang Sun, Zhiyong Zhang, Changzhen Qiu, Liang Wang, Zekai Wang
With the rapid development of generative models, AI-based face manipulation technology, known as DeepFakes, has become increasingly realistic.
no code implementations • 13 Aug 2020 • Xueli Jia, Jianzong Wang, Zhiyong Zhang, Ning Cheng, Jing Xiao
However, the increased complexity of a model can also introduce a high risk of over-fitting, which is a major challenge in SLU tasks due to the limited amount of available data.
Automatic Speech Recognition (ASR)
no code implementations • 10 Oct 2019 • Manzhang Xu, Bijun Tang, Yuhao Lu, Chao Zhu, Lu Zheng, Jingyu Zhang, Nannan Han, Yuxi Guo, Jun Di, Pin Song, Yongmin He, Lixing Kang, Zhiyong Zhang, Wu Zhao, Cuntai Guan, Xuewen Wang, Zheng Liu
Reducing the lateral scale of two-dimensional (2D) materials to one-dimensional (1D) has attracted substantial research interest not only to achieve competitive electronic device applications but also for the exploration of fundamental physical properties.
no code implementations • 28 Jun 2015 • Lantian Li, Yiye Lin, Zhiyong Zhang, Dong Wang
A deep learning approach has recently been proposed to derive speaker identities (d-vectors) using a deep neural network (DNN).
no code implementations • 16 Jun 2015 • Xi Ma, Xiaoxi Wang, Dong Wang, Zhiyong Zhang
We also employ this approach to deal with out-of-language words in the task of multi-lingual speech recognition.
Automatic Speech Recognition (ASR)
no code implementations • 7 Jun 2015 • Zhiyuan Tang, Dong Wang, Yiqiao Pan, Zhiyong Zhang
Compared to conventional layer-wise methods, this new method does not depend on the model structure, so it can be used to pre-train very complex models.
no code implementations • 24 May 2015 • Lantian Li, Dong Wang, Zhiyong Zhang, Thomas Fang Zheng
Recent research shows that deep neural networks (DNNs) can be used to extract deep speaker vectors (d-vectors) that preserve speaker characteristics and can be used in speaker verification.
no code implementations • 18 May 2015 • Zhiyuan Tang, Dong Wang, Zhiyong Zhang
Recent research found that a well-trained model can be used as a teacher to train other child models, by using the predictions generated by the teacher model as supervision.
Automatic Speech Recognition (ASR)
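The teacher-to-child supervision mentioned in the entry above can be sketched as a loss that compares the child model's predictions with the teacher's. This is a minimal illustration only: the temperature-softened softmax and the names `softmax` and `distillation_loss` are assumptions, not details taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature gives softer outputs."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student's predictions against the teacher's soft targets.

    The temperature value is an illustrative assumption.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))

# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
matched = distillation_loss([4.0, 1.0, 0.5], teacher)
mismatched = distillation_loss([0.5, 1.0, 4.0], teacher)
print(matched < mismatched)
```

The loss is smallest when the student reproduces the teacher's distribution, so minimizing it pushes the child model toward the teacher's predictions without needing ground-truth labels.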