no code implementations • 16 Jan 2024 • Yang Yang, George Sung, Shao-Fu Shih, Hakan Erdogan, Chehung Lee, Matthias Grundmann
We propose a neural network model that can separate target speech sources from interfering sources at different angular regions using two microphones.
no code implementations • 5 Jan 2024 • Yang Yang, Yury Kartynnik, Yunpeng Li, Jiuqiang Tang, Xing Li, George Sung, Matthias Grundmann
We present StreamVC, a streaming voice conversion solution that preserves the content and prosody of any source speech while matching the voice timbre from any target speech.
no code implementations • 19 Sep 2023 • Esha Uboweja, David Tian, Qifei Wang, Yi-Chun Kuo, Joe Zou, Lu Wang, George Sung, Matthias Grundmann
Our framework provides a pre-trained single-hand embedding model that can be fine-tuned for custom gesture recognition.
no code implementations • 10 Apr 2023 • Norberto Adrian Goussies, Kenji Hata, Shruthi Prabhakara, Abhishek Amit, Tony Aube, Carl Cepress, Diana Chang, Li-Te Cheng, Horia Stefan Ciurdar, Mike Cleron, Chelsey Fleming, Ashwin Ganti, Divyansh Garg, Niloofar Gheissari, Petra Luna Grutzik, David Hendon, Daniel Iglesia, Jin Kim, Stuart Kyle, Chris LaRosa, Roman Lewkow, Peter F McDermott, Chris Melancon, Paru Nackeeran, Neal Norwitz, Ali Rahimi, Brett Rampata, Carlos Sobrinho, George Sung, Natalie Zauhar, Palash Nandy
We present a novel self-contained camera-projector tabletop system with a lamp form factor that brings digital intelligence to our tables.
no code implementations • 13 Mar 2023 • Yang Yang, Shao-Fu Shih, Hakan Erdogan, Jamie Menjay Lin, Chehung Lee, Yunpeng Li, George Sung, Matthias Grundmann
The multi-microphone speech enhancement problem is often decomposed into two decoupled steps: a beamformer that provides spatial filtering and a single-channel speech enhancement model that cleans up the beamformer output.
no code implementations • 24 Aug 2022 • Jamie Menjay Lin, Siargey Pisarchyk, Juhyun Lee, David Tian, Tingbo Hou, Karthik Raveendran, Raman Sarokin, George Sung, Trent Tolley, Matthias Grundmann
We introduce an efficient video segmentation system for resource-limited edge devices leveraging heterogeneous compute.
no code implementations • 29 Oct 2021 • George Sung, Kanstantsin Sokal, Esha Uboweja, Valentin Bazarevsky, Jonathan Baccash, Eduard Gabriel Bazavan, Chuo-Ling Chang, Matthias Grundmann
We present an on-device real-time hand gesture recognition (HGR) system, which detects a set of predefined static gestures from a single RGB camera.
4 code implementations • 18 Jun 2020 • Fan Zhang, Valentin Bazarevsky, Andrey Vakunov, Andrei Tkachenka, George Sung, Chuo-Ling Chang, Matthias Grundmann
We present a real-time on-device hand tracking pipeline that predicts a hand skeleton from a single RGB camera for AR/VR applications.