Search Results for author: Saurabh Adya

Found 16 papers, 1 papers with code

Device-Directed Speech Detection for Follow-up Conversations Using Large Language Models

no code implementations28 Oct 2024 Ognjen, Rudovic, Pranay Dighe, Yi Su, Vineet Garg, Sameer Dharur, Xiaochuan Niu, Ahmed H. Abdelaziz, Saurabh Adya, Ahmed Tewfik

To this end, we explore the notion of Large Language Models (LLMs) and model the first query when making inference about the follow-ups (based on the ASR-decoded text), via prompting of a pretrained LLM, or by adapting a binary classifier on top of the LLM.

Modality Dropout for Multimodal Device Directed Speech Detection using Verbal and Non-Verbal Features

no code implementations23 Oct 2023 Gautam Krishna, Sameer Dharur, Oggi Rudovic, Pranay Dighe, Saurabh Adya, Ahmed Hussen Abdelaziz, Ahmed H Tewfik

Device-directed speech detection (DDSD) is the binary classification task of distinguishing between queries directed at a voice assistant versus side conversation or background speech.

Automatic Speech Recognition Binary Classification +2

Streaming Anchor Loss: Augmenting Supervision with Temporal Significance

no code implementations9 Oct 2023 Utkarsh Oggy Sarawgi, John Berkowitz, Vineet Garg, Arnav Kundu, Minsik Cho, Sai Srujana Buddi, Saurabh Adya, Ahmed Tewfik

Streaming neural network models for fast frame-wise responses to various speech and sensory signals are widely adopted on resource-constrained platforms.

eDKM: An Efficient and Accurate Train-time Weight Clustering for Large Language Models

no code implementations2 Sep 2023 Minsik Cho, Keivan A. Vahid, Qichen Fu, Saurabh Adya, Carlo C Del Mundo, Mohammad Rastegari, Devang Naik, Peter Zatloukal

Since Large Language Models or LLMs have demonstrated high-quality performance on many complex language tasks, there is a great interest in bringing these LLMs to mobile devices for faster responses and better privacy protection.

Clustering Quantization

Efficient Multimodal Neural Networks for Trigger-less Voice Assistants

no code implementations20 May 2023 Sai Srujana Buddi, Utkarsh Oggy Sarawgi, Tashweena Heeramun, Karan Sawnhey, Ed Yanosik, Saravana Rathinam, Saurabh Adya

The adoption of multimodal interactions by Voice Assistants (VAs) is growing rapidly to enhance human-computer interactions.

Decision Making

R2 Loss: Range Restriction Loss for Model Compression and Quantization

no code implementations14 Mar 2023 Arnav Kundu, Chungkuk Yoo, Srijan Mishra, Minsik Cho, Saurabh Adya

To overcome the challenge, we focus on outliers in weights of a pre-trained model which disrupt effective lower bit quantization and compression.

Classification Model Compression +2

DKM: Differentiable K-Means Clustering Layer for Neural Network Compression

no code implementations ICLR 2022 Minsik Cho, Keivan A. Vahid, Saurabh Adya, Mohammad Rastegari

For MobileNet-v1, which is a challenging DNN to compress, DKM delivers 63. 9% top-1 ImageNet1k accuracy with 0. 72 MB model size (22. 4x model compression factor).

Clustering Neural Network Compression

Streaming Transformer for Hardware Efficient Voice Trigger Detection and False Trigger Mitigation

no code implementations14 May 2021 Vineet Garg, Wonil Chang, Siddharth Sigtia, Saurabh Adya, Pramod Simha, Pranay Dighe, Chandra Dhir

We propose a streaming transformer (TF) encoder architecture, which progressively processes incoming audio chunks and maintains audio context to perform both VTD and FTM tasks using only acoustic features.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Lattice-based Improvements for Voice Triggering Using Graph Neural Networks

no code implementations25 Jan 2020 Pranay Dighe, Saurabh Adya, Nuoyu Li, Srikanth Vishnubhotla, Devang Naik, Adithya Sagar, Ying Ma, Stephen Pulman, Jason Williams

A pure trigger-phrase detector model doesn't fully utilize the intent of the user speech whereas by using the complete decoding lattice of user audio, we can effectively mitigate speech not intended for the smart assistant.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Nonlinear Conjugate Gradients For Scaling Synchronous Distributed DNN Training

1 code implementation7 Dec 2018 Saurabh Adya, Vinay Palakkode, Oncel Tuzel

In this work, we propose and evaluate the stochastic preconditioned nonlinear conjugate gradient algorithm for large scale DNN training tasks.

16k General Classification

Democratizing Production-Scale Distributed Deep Learning

no code implementations31 Oct 2018 Minghuang Ma, Hadi Pouransari, Daniel Chao, Saurabh Adya, Santiago Akle Serrano, Yi Qin, Dan Gimnicher, Dominic Walsh

The interest and demand for training deep neural networks have been experiencing rapid growth, spanning a wide range of applications in both academia and industry.

Deep Learning

Cannot find the paper you are looking for? You can Submit a new open access paper.