Search Results for author: Shoukang Hu

Found 28 papers, 9 papers with code

GauHuman: Articulated Gaussian Splatting from Monocular Human Videos

1 code implementation • 5 Dec 2023 • Shoukang Hu, Ziwei Liu

We present, GauHuman, a 3D human model with Gaussian Splatting for both fast training (1 ~ 2 minutes) and real-time rendering (up to 189 FPS), compared with existing NeRF-based implicit representation modelling frameworks demanding hours of training and seconds of rendering per frame.

Ranked #1 on Generalizable Novel View Synthesis on ZJU-MoCap

Generalizable Novel View Synthesis Novel View Synthesis

250

Paper
Code

HumanLiff: Layer-wise 3D Human Generation with Diffusion Model

no code implementations • 18 Aug 2023 • Shoukang Hu, Fangzhou Hong, Tao Hu, Liang Pan, Haiyi Mei, Weiye Xiao, Lei Yang, Ziwei Liu

In this work, we propose HumanLiff, the first layer-wise 3D human generative model with a unified diffusion process.

3D Generation Neural Rendering

Paper
Add Code

Hyper-parameter Adaptation of Conformer ASR Systems for Elderly and Dysarthric Speech Recognition

no code implementations • 27 Jun 2023 • Tianzi Wang, Shoukang Hu, Jiajun Deng, Zengrui Jin, Mengzhe Geng, Yi Wang, Helen Meng, Xunying Liu

Automatic recognition of disordered and elderly speech remains highly challenging tasks to date due to data scarcity.

Domain Adaptation speech-recognition +1

Paper
Add Code

ConsistentNeRF: Enhancing Neural Radiance Fields with 3D Consistency for Sparse View Synthesis

1 code implementation • 18 May 2023 • Shoukang Hu, Kaichen Zhou, Kaiyu Li, Longhui Yu, Lanqing Hong, Tianyang Hu, Zhenguo Li, Gim Hee Lee, Ziwei Liu

In this paper, we propose ConsistentNeRF, a method that leverages depth information to regularize both multi-view and single-view 3D consistency among pixels.

3D Reconstruction SSIM

Paper
Code

SHERF: Generalizable Human NeRF from a Single Image

1 code implementation • ICCV 2023 • Shoukang Hu, Fangzhou Hong, Liang Pan, Haiyi Mei, Lei Yang, Ziwei Liu

To this end, we propose a bank of 3D-aware hierarchical features, including global, point-level, and pixel-aligned features, to facilitate informative encoding.

3D Human Reconstruction

285

Paper
Code

Exploiting prompt learning with pre-trained language models for Alzheimer's Disease detection

1 code implementation • 29 Oct 2022 • Yi Wang, Jiajun Deng, Tianzi Wang, Bo Zheng, Shoukang Hu, Xunying Liu, Helen Meng

Early diagnosis of Alzheimer's disease (AD) is crucial in facilitating preventive care and to delay further progression.

Alzheimer's Disease Detection Language Modelling +1

Paper
Code

Bayesian Neural Network Language Modeling for Speech Recognition

1 code implementation • 28 Aug 2022 • Boyang Xue, Shoukang Hu, Junhao Xu, Mengzhe Geng, Xunying Liu, Helen Meng

State-of-the-art neural network language models (NNLMs) represented by long short term memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming highly complex.

Data Augmentation Language Modelling +4

Paper
Code

Exploring linguistic feature and model combination for speech recognition based automatic AD detection

no code implementations • 28 Jun 2022 • Yi Wang, Tianzi Wang, Zi Ye, Lingwei Meng, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng

Early diagnosis of Alzheimer's disease (AD) is crucial in facilitating preventive care and delay progression.

Model Selection speech-recognition +1

Paper
Add Code

Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems

no code implementations • 23 Jun 2022 • Mingyu Cui, Jiajun Deng, Shoukang Hu, Xurong Xie, Tianzi Wang, Shujie Hu, Mengzhe Geng, Boyang Xue, Xunying Liu, Helen Meng

Fundamental modelling differences between hybrid and end-to-end (E2E) automatic speech recognition (ASR) systems create large diversity and complementarity among them.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

Conformer Based Elderly Speech Recognition System for Alzheimer's Disease Detection

no code implementations • 23 Jun 2022 • Tianzi Wang, Jiajun Deng, Mengzhe Geng, Zi Ye, Shoukang Hu, Yi Wang, Mingyu Cui, Zengrui Jin, Xunying Liu, Helen Meng

Early diagnosis of Alzheimer's disease (AD) is crucial in facilitating preventive care to delay further progression.

Alzheimer's Disease Detection Data Augmentation +3

Paper
Add Code

Neural Architecture Search for Speech Emotion Recognition

no code implementations • 31 Mar 2022 • Xixin Wu, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng

Deep neural networks have brought significant advancements to speech emotion recognition (SER).

Neural Architecture Search Speech Emotion Recognition

Paper
Add Code

Generalizing Few-Shot NAS with Gradient Matching

1 code implementation • ICLR 2022 • Shoukang Hu, Ruochen Wang, Lanqing Hong, Zhenguo Li, Cho-Jui Hsieh, Jiashi Feng

Efficient performance estimation of architectures drawn from large search spaces is essential to Neural Architecture Search.

Neural Architecture Search

Paper
Code

Exploiting Cross Domain Acoustic-to-articulatory Inverted Features For Disordered Speech Recognition

no code implementations • 19 Mar 2022 • Shujie Hu, Shansong Liu, Xurong Xie, Mengzhe Geng, Tianzi Wang, Shoukang Hu, Mingyu Cui, Xunying Liu, Helen Meng

Articulatory features are inherently invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition (ASR) systems for normal speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

Recent Progress in the CUHK Dysarthric Speech Recognition System

no code implementations • 15 Jan 2022 • Shansong Liu, Mengzhe Geng, Shoukang Hu, Xurong Xie, Mingyu Cui, Jianwei Yu, Xunying Liu, Helen Meng

Despite the rapid progress of automatic speech recognition (ASR) technologies in the past few decades, recognition of disordered speech remains a highly challenging task to date.

Audio-Visual Speech Recognition Automatic Speech Recognition +4

Paper
Add Code

Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition

no code implementations • 14 Jan 2022 • Mengzhe Geng, Shansong Liu, Jianwei Yu, Xurong Xie, Shoukang Hu, Zi Ye, Zengrui Jin, Xunying Liu, Helen Meng

Automatic recognition of disordered speech remains a highly challenging task to date.

Data Augmentation speech-recognition +1

Paper
Add Code

Investigation of Data Augmentation Techniques for Disordered Speech Recognition

no code implementations • 14 Jan 2022 • Mengzhe Geng, Xurong Xie, Shansong Liu, Jianwei Yu, Shoukang Hu, Xunying Liu, Helen Meng

This paper investigates a set of data augmentation techniques for disordered speech recognition, including vocal tract length perturbation (VTLP), tempo perturbation and speed perturbation.

Data Augmentation speech-recognition +1

Paper
Add Code

Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks

1 code implementation • 8 Jan 2022 • Shoukang Hu, Xurong Xie, Mingyu Cui, Jiajun Deng, Shansong Liu, Jianwei Yu, Mengzhe Geng, Xunying Liu, Helen Meng

State-of-the-art automatic speech recognition (ASR) system development is data and computation intensive.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Code

Mixed Precision of Quantization of Transformer Language Models for Speech Recognition

no code implementations • 29 Nov 2021 • Junhao Xu, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Meng

Experiments conducted on Penn Treebank (PTB) and a Switchboard corpus trained LF-MMI TDNN system suggest the proposed mixed precision Transformer quantization techniques achieved model size compression ratios of up to 16 times over the full precision baseline with no recognition performance degradation.

Quantization speech-recognition +1

Paper
Add Code

Mixed Precision Low-bit Quantization of Neural Network Language Models for Speech Recognition

no code implementations • 29 Nov 2021 • Junhao Xu, Jianwei Yu, Shoukang Hu, Xunying Liu, Helen Meng

In order to overcome the difficulty in using gradient descent methods to directly estimate discrete quantized weights, alternating direction methods of multipliers (ADMM) are used to efficiently train quantized LMs.

Neural Architecture Search Quantization +2

Paper
Add Code

Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers

no code implementations • 29 Nov 2021 • Junhao Xu, Xie Chen, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Meng

Index Terms: Language models, Recurrent neural networks, Quantization, Alternating direction methods of multipliers.

Quantization

Paper
Add Code

DHA: End-to-End Joint Optimization of Data Augmentation Policy, Hyper-parameter and Architecture

no code implementations • 13 Sep 2021 • Kaichen Zhou, Lanqing Hong, Shoukang Hu, Fengwei Zhou, Binxin Ru, Jiashi Feng, Zhenguo Li

In view of these, we propose DHA, which achieves joint optimization of Data augmentation policy, Hyper-parameter and Architecture.

Data Augmentation Neural Architecture Search

Paper
Add Code

Bayesian Transformer Language Models for Speech Recognition

no code implementations • 9 Feb 2021 • Boyang Xue, Jianwei Yu, Junhao Xu, Shansong Liu, Shoukang Hu, Zi Ye, Mengzhe Geng, Xunying Liu, Helen Meng

Performance improvements were also obtained on a cross domain LM adaptation task requiring porting a Transformer LM trained on the Switchboard and Fisher data to a low-resource DementiaBank elderly speech corpus.

speech-recognition Speech Recognition +1

Paper
Add Code

Bayesian Learning of LF-MMI Trained Time Delay Neural Networks for Speech Recognition

no code implementations • 8 Dec 2020 • Shoukang Hu, Xurong Xie, Shansong Liu, Jianwei Yu, Zi Ye, Mengzhe Geng, Xunying Liu, Helen Meng

On a third cross domain adaptation task requiring rapidly porting a 1000 hour LibriSpeech data trained system to a small DementiaBank elderly speech corpus, the proposed Bayesian TDNN LF-MMI systems outperformed the baseline system using direct weight fine-tuning by up to 2. 5\% absolute WER reduction.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

Audio-visual Multi-channel Integration and Recognition of Overlapped Speech

no code implementations • 16 Nov 2020 • Jianwei Yu, Shi-Xiong Zhang, Bo Wu, Shansong Liu, Shoukang Hu, Mengzhe Geng, Xunying Liu, Helen Meng, Dong Yu

Automatic speech recognition (ASR) technologies have been significantly advanced in the past few decades.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

Understanding the wiring evolution in differentiable neural architecture search

1 code implementation • 2 Sep 2020 • Sirui Xie, Shoukang Hu, Xinjiang Wang, Chunxiao Liu, Jianping Shi, Xunying Liu, Dahua Lin

To this end, we pose questions that future differentiable methods for neural wiring discovery need to confront, hoping to evoke a discussion and rethinking on how much bias has been enforced implicitly in existing NAS methods.

Neural Architecture Search

144

Paper
Code

Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks

no code implementations • 17 Jul 2020 • Shoukang Hu, Xurong Xie, Shansong Liu, Mingyu Cui, Mengzhe Geng, Xunying Liu, Helen Meng

Deep neural networks (DNNs) based automatic speech recognition (ASR) systems are often designed using expert knowledge and empirical evaluation.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification

no code implementations • 8 Apr 2020 • Xu Li, Jinghua Zhong, Jianwei Yu, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng

Our experiment results indicate that the DNN x-vector system could benefit from BNNs especially when the mismatch problem is severe for evaluations using out-of-domain data.

Speaker Verification

Paper
Add Code

DSNAS: Direct Neural Architecture Search without Parameter Retraining

1 code implementation • CVPR 2020 • Shoukang Hu, Sirui Xie, Hehui Zheng, Chunxiao Liu, Jianping Shi, Xunying Liu, Dahua Lin

We argue that given a computer vision task for which a NAS method is expected, this definition can reduce the vaguely-defined NAS evaluation to i) accuracy of this task and ii) the total computation consumed to finally obtain a model with satisfying accuracy.

Ranked #16 on Neural Architecture Search on NAS-Bench-201, ImageNet-16-120 (Accuracy (Val) metric)

Neural Architecture Search

144

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.