no code implementations • 17 Apr 2025 • Libo Zhang, Yongsheng Yu, Jiali Yao, Heng Fan
Despite their excellence, they ignore a hard constraint: the unmasked regions in the input and the output should be the same. This leaves a gap between GAN inversion and image inpainting and degrades performance.
1 code implementation • 13 Mar 2025 • Jiali Yao, Xinran Deng, Xin Gu, Mengrui Dai, Bing Fan, Zhipeng Zhang, Yan Huang, Heng Fan, Libo Zhang
Each sequence in BOSTVG, paired with a free-form textual query, encompasses a varying number of targets ranging from 1 to 10.
1 code implementation • 8 Mar 2024 • Yunhao Li, Qin Li, Hao Wang, Xue Ma, Jiali Yao, Shaohua Dong, Heng Fan, Libo Zhang
Current multi-object tracking (MOT) aims to predict trajectories of targets (i.e., "where") in videos.
no code implementations • 17 Jan 2022 • Tianyi Xie, Liucheng Liao, Cheng Bi, Benlai Tang, Xiang Yin, Jianfei Yang, Mingjie Wang, Jiali Yao, Yang Zhang, Zejun Ma
The task of few-shot visual dubbing focuses on synchronizing the lip movements with arbitrary speech input for any talking head video.
no code implementations • 3 Nov 2020 • Mingkun Huang, Jun Zhang, Meng Cai, Yang Zhang, Jiali Yao, Yongbin You, Yi He, Zejun Ma
In this work, we analyze the cause of the huge gradient variance in RNN-T training and propose a new normalized jointer network to overcome it.
Automatic Speech Recognition (ASR)
1 code implementation • 26 Feb 2020 • Xinjian Li, Siddharth Dalmia, Juncheng Li, Matthew Lee, Patrick Littell, Jiali Yao, Antonios Anastasopoulos, David R. Mortensen, Graham Neubig, Alan W. Black, Florian Metze
Multilingual models can improve language processing, particularly in low-resource situations, by sharing parameters across languages.
1 code implementation • NAACL 2019 • Jiali Yao, Raphael Shu, Xinjian Li, Katsutoshi Ohtsuki, Hideki Nakayama
An input method editor (IME) converts sequential alphabet key inputs to words in a target language.
no code implementations • ICLR 2019 • Jiali Yao, Raphael Shu, Xinjian Li, Katsutoshi Ohtsuki, Hideki Nakayama
The input method is an essential service on every mobile and desktop device that provides text suggestions.