no code implementations • 9 Aug 2021 • Xinsheng Wang, Qicong Xie, Jihua Zhu, Lei Xie, Scharenborg
In this paper, we present an automatic method that generates synchronized speech and talking-head videos from text and a single face image of an arbitrary person.