Search Results for author: Albert Antony

Found 2 papers, 1 papers with code

FastVLM: Efficient Vision Encoding for Vision Language Models

1 code implementation17 Dec 2024 Pavan Kumar Anasosalu Vasu, Fartash Faghri, Chun-Liang Li, Cem Koc, Nate True, Albert Antony, Gokul Santhanam, James Gabriel, Peter Grasch, Oncel Tuzel, Hadi Pouransari

At different operational resolutions, the vision encoder of a VLM can be optimized along two axes: reducing encoding latency and minimizing the number of visual tokens passed to the LLM, thereby lowering overall latency.

On-device neural speech synthesis

no code implementations17 Sep 2021 Sivanand Achanta, Albert Antony, Ladan Golipour, Jiangchuan Li, Tuomo Raitio, Ramya Rasipuram, Francesco Rossi, Jennifer Shi, Jaimin Upadhyay, David Winarsky, Hepeng Zhang

Recent advances in text-to-speech (TTS) synthesis, such as Tacotron and WaveRNN, have made it possible to construct a fully neural network based TTS system, by coupling the two components together.

Speech Synthesis text-to-speech +1

Cannot find the paper you are looking for? You can Submit a new open access paper.