no code implementations • 24 Mar 2023 • Jiguo Li, Xiaobin Liu, Lirong Zheng
Prior work on text-to-image synthesis typically concatenates the sentence embedding with the noise vector, even though the sentence embedding and the noise vector are two different factors that control different aspects of the generation.
no code implementations • 6 Sep 2022 • Jiguo Li, Chuanmin Jia, Xinfeng Zhang, Siwei Ma, Wen Gao
Building on recent advances in cross-modal translation and generation, in this paper we propose cross modal compression (CMC), a semantic compression framework for visual data, to transform the highly redundant visual data (such as image, video, etc.)
1 code implementation • 7 Apr 2020 • Jiguo Li, Xinfeng Zhang, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao
Due to the widespread deployment of fingerprint/face/speaker recognition systems, attacking deep learning based biometric systems has drawn more and more attention.
Audio and Speech Processing • Cryptography and Security • Sound
1 code implementation • 7 Apr 2020 • Jiguo Li, Xinfeng Zhang, Chuanmin Jia, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao
In this paper, we attempt to translate speech signals into image signals without a transcription stage.
Multimedia • Sound • Audio and Speech Processing
1 code implementation • 7 Apr 2020 • Jiguo Li, Xinfeng Zhang, Chuanmin Jia, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao
Attacking deep learning based biometric systems has drawn more and more attention with the wide deployment of fingerprint/face/speaker recognition systems, given that neural networks are vulnerable to adversarial examples, which are intentionally perturbed yet remain almost imperceptible to humans.