Search Results for author: Bandhav Veluri

Found 7 papers, 4 with code

Beyond Turn-Based Interfaces: Synchronous LLMs as Full-Duplex Dialogue Agents

no code implementations • 23 Sep 2024 • Bandhav Veluri, Benjamin N Peloquin, Bokai Yu, Hongyu Gong, Shyamnath Gollakota

Despite broad interest in modeling spoken dialogue agents, most approaches are inherently "half-duplex" -- restricted to turn-based interaction with responses requiring explicit prompting by the user or implicit tracking of interruption or silence events.

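Purely as an illustration of the half-duplex vs. full-duplex distinction described above (not the paper's Synchronous LLM architecture), here is a minimal Python sketch; the listen, respond, and speak helpers are hypothetical placeholders.

```python
# Illustrative only: contrasts a turn-based (half-duplex) agent loop with a
# full-duplex loop that listens and speaks concurrently. listen(), respond(),
# and speak() are hypothetical stand-ins, not the paper's API.
import queue
import threading
import time

def listen():                      # pretend ASR: blocks until a user turn ends
    time.sleep(0.5)
    return "user utterance"

def respond(text):                 # pretend response generation
    return f"reply to: {text}"

def speak(text):                   # pretend TTS playback
    print(text)

def half_duplex_agent(turns=3):
    """Turn-based: the agent responds only after the user finishes a turn."""
    for _ in range(turns):
        utterance = listen()       # wait for an explicit end-of-turn
        speak(respond(utterance))  # then take the floor

def full_duplex_agent(duration_s=2.0):
    """Synchronous: listening and speaking run concurrently, so the agent can
    overlap speech (backchannels, interruptions) instead of waiting for turns."""
    incoming = queue.Queue()
    stop = threading.Event()

    def listener():                # continuously streams what it hears
        while not stop.is_set():
            incoming.put(listen())

    def speaker():                 # speaks whenever it has something to say
        while not stop.is_set():
            try:
                speak(respond(incoming.get(timeout=0.1)))
            except queue.Empty:
                pass               # could emit fillers/backchannels here

    threads = [threading.Thread(target=listener), threading.Thread(target=speaker)]
    for t in threads:
        t.start()
    time.sleep(duration_s)
    stop.set()
    for t in threads:
        t.join()

if __name__ == "__main__":
    half_duplex_agent()
    full_duplex_agent()
```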

IRIS: Wireless Ring for Vision-based Smart Home Interaction

no code implementations • 25 Jul 2024 • Maruchi Kim, Antonio Glenn, Bandhav Veluri, Yunseo Lee, Eyoel Gebre, Aditya Bagaria, Shwetak Patel, Shyamnath Gollakota

Integrating cameras into wireless smart rings has been challenging due to size and power constraints.

SeamlessExpressiveLM: Speech Language Model for Expressive Speech-to-Speech Translation with Chain-of-Thought

no code implementations • 30 May 2024 • Hongyu Gong, Bandhav Veluri

Expressive speech-to-speech translation (S2ST) is a key research topic in seamless communication; it focuses on preserving both the semantics and the speaker's vocal style in the translated speech.

Language Modeling • Language Modelling +2

Look Once to Hear: Target Speech Hearing with Noisy Examples

1 code implementation • 10 May 2024 • Bandhav Veluri, Malek Itani, Tuochao Chen, Takuya Yoshioka, Shyamnath Gollakota

We present the first enrollment interface in which the wearer looks at the target speaker for a few seconds to capture a single, short, highly noisy binaural example of their speech.

Speech Extraction
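A rough sketch of how such an enroll-then-extract pipeline could be wired up: a few seconds of noisy binaural audio captured while the wearer faces the target speaker are encoded into a speaker embedding that conditions a separation network. The modules and shapes below are toy assumptions, not the released model.

```python
# Toy enroll-then-extract sketch: condition a masking network on a speaker
# embedding computed from a short, noisy binaural enrollment clip.
# Architecture and shapes are illustrative assumptions only.
import torch
import torch.nn as nn

class SpeakerEncoder(nn.Module):
    """Maps a short, noisy binaural enrollment clip to a speaker embedding."""
    def __init__(self, emb_dim=128):
        super().__init__()
        self.conv = nn.Conv1d(2, 64, kernel_size=400, stride=160)  # 2 = binaural channels
        self.proj = nn.Linear(64, emb_dim)

    def forward(self, enroll):                  # enroll: (batch, 2, samples)
        feats = torch.relu(self.conv(enroll))   # (batch, 64, frames)
        return self.proj(feats.mean(dim=-1))    # pool over time -> (batch, emb_dim)

class TargetSpeechExtractor(nn.Module):
    """Extracts the enrolled speaker from a binaural mixture via masking."""
    def __init__(self, emb_dim=128):
        super().__init__()
        self.encode = nn.Conv1d(2, 256, kernel_size=16, stride=8)
        self.film = nn.Linear(emb_dim, 256)      # conditions the net on the speaker
        self.mask = nn.Conv1d(256, 256, kernel_size=3, padding=1)
        self.decode = nn.ConvTranspose1d(256, 2, kernel_size=16, stride=8)

    def forward(self, mixture, spk_emb):         # mixture: (batch, 2, samples)
        h = torch.relu(self.encode(mixture))
        h = h * torch.sigmoid(self.film(spk_emb)).unsqueeze(-1)  # FiLM-style gating
        m = torch.sigmoid(self.mask(h))
        return self.decode(h * m)                # binaural estimate of the target

# Usage with random stand-ins: ~3 s of enrollment and a 10 s mixture at 16 kHz.
enc, ext = SpeakerEncoder(), TargetSpeechExtractor()
enrollment = torch.randn(1, 2, 3 * 16000)        # captured while looking at the speaker
mixture = torch.randn(1, 2, 10 * 16000)
target = ext(mixture, enc(enrollment))
print(target.shape)                              # (1, 2, 160000)
```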

Semantic Hearing: Programming Acoustic Scenes with Binaural Hearables

1 code implementation • 1 Nov 2023 • Bandhav Veluri, Malek Itani, Justin Chan, Takuya Yoshioka, Shyamnath Gollakota

To achieve this, we make two technical contributions: 1) we present the first neural network that can achieve binaural target sound extraction in the presence of interfering sounds and background noise, and 2) we design a training methodology that allows our system to generalize to real-world use.

Target Sound Extraction
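To illustrate the binaural target sound extraction idea, a small class-conditioned sketch: the user selects which sound classes to keep, and the network is gated on their embeddings. The class list, shapes, and masking net are illustrative assumptions, not the paper's architecture.

```python
# Toy class-conditioned binaural target sound extraction.
# CLASSES and the network below are illustrative assumptions only.
import torch
import torch.nn as nn

CLASSES = ["speech", "sirens", "bird_chirps", "vacuum_cleaner", "alarm"]  # example labels

class ClassConditionedExtractor(nn.Module):
    def __init__(self, n_classes=len(CLASSES), emb_dim=64):
        super().__init__()
        self.class_emb = nn.Embedding(n_classes, emb_dim)
        self.encode = nn.Conv1d(2, 128, kernel_size=16, stride=8)   # 2 = binaural channels
        self.gate = nn.Linear(emb_dim, 128)
        self.decode = nn.ConvTranspose1d(128, 2, kernel_size=16, stride=8)

    def forward(self, mixture, class_ids):            # mixture: (batch, 2, samples)
        # Sum the embeddings of all selected classes into one conditioning vector.
        cond = self.class_emb(class_ids).sum(dim=1)   # (batch, emb_dim)
        h = torch.relu(self.encode(mixture))
        h = h * torch.sigmoid(self.gate(cond)).unsqueeze(-1)
        return self.decode(h)                         # binaural audio with kept sounds

# Usage: keep only sirens and bird chirps from a 4 s binaural recording.
model = ClassConditionedExtractor()
mixture = torch.randn(1, 2, 4 * 16000)
keep = torch.tensor([[CLASSES.index("sirens"), CLASSES.index("bird_chirps")]])
out = model(mixture, keep)
print(out.shape)                                      # (1, 2, 64000)
```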

NeuriCam: Key-Frame Video Super-Resolution and Colorization for IoT Cameras

1 code implementation • 25 Jul 2022 • Bandhav Veluri, Collin Pernu, Ali Saffari, Joshua Smith, Michael Taylor, Shyamnath Gollakota

Our idea is to design a dual-mode camera system where the first mode is low-power (1.1 mW) but outputs only grayscale, low-resolution, noisy video, and the second mode consumes much higher power (100 mW) but outputs color, higher-resolution images.

Colorization • Key-Frame-based Video Super-Resolution (K = 15)
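As a rough illustration of the dual-mode idea (using the K = 15 key-frame interval from the benchmark above), the loop below captures one high-power color key frame every K frames, otherwise runs the low-power mode, and fuses the two streams with a reconstruction step. The capture functions and the fusion stand-in are hypothetical placeholders, not NeuriCam's actual API.

```python
# Illustrative dual-mode capture loop: one high-power color key frame every K
# low-power frames, with a stand-in reconstruction combining the two streams.
import numpy as np

K = 15                     # key-frame interval from the benchmark setting

def capture_low_power():
    """Low-power mode (~1.1 mW): grayscale, low-resolution, noisy frame."""
    return np.random.randint(0, 256, (120, 160), dtype=np.uint8)

def capture_key_frame():
    """High-power mode (~100 mW): color, higher-resolution key frame."""
    return np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)

def reconstruct(low_res_gray, key_frame):
    """Stand-in for the neural net that upsamples and colorizes each low-power
    frame by borrowing color and detail from the most recent key frame."""
    h, w, _ = key_frame.shape
    up = np.kron(low_res_gray, np.ones((h // low_res_gray.shape[0],
                                        w // low_res_gray.shape[1]), dtype=np.uint8))
    return (0.5 * up[..., None] + 0.5 * key_frame).astype(np.uint8)

def run_camera(num_frames=45):
    key_frame = None
    video = []
    for i in range(num_frames):
        if i % K == 0:
            key_frame = capture_key_frame()        # occasional expensive frame
        frame = capture_low_power()                # cheap frame every step
        video.append(reconstruct(frame, key_frame))
    return video

frames = run_camera()
print(len(frames), frames[0].shape)                # 45 frames of (480, 640, 3)
```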
