no code implementations • 11 Sep 2021 • Yangyang Xia, Buye Xu, Anurag Kumar
Supervised speech enhancement relies on parallel databases of degraded speech signals and their clean reference signals during training.
no code implementations • CVPR 2021 • Yu Wang, Rui Zhang, Shuo Zhang, Miao Li, Yangyang Xia, Xishan Zhang, Shaoli Liu
The directions of the weights and of the gradients can be divided into domain-specific and domain-invariant parts, and the goal of domain adaptation is to concentrate on the domain-invariant direction while eliminating the disturbance from the domain-specific one.
1 code implementation • 19 Oct 2020 • Tyler Vuong, Yangyang Xia, Richard Stern
Voice Type Discrimination (VTD) refers to distinguishing regions of a recording in which speech was produced by speakers physically in proximity to the recording device ("Live Speech") from regions containing speech and other types of audio that were played back, such as traffic noise and television broadcasts ("Distractor Audio").
Audio and Speech Processing