no code implementations • 21 Jun 2024 • Dhananjay Ram, Aditya Rawal, Momchil Hardalov, Nikolaos Pappas, Sheng Zha
Training with mixed data distributions is a common and important part of creating multi-task and instruction-following models.
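As a purely illustrative companion to this summary, here is a minimal sketch of sampling training examples from a weighted mixture of data sources; the source names and weights are assumptions for the example, not the paper's actual recipe:

```python
import random

# Illustrative mixture weights; the real proportions would come from the
# training recipe, which this sketch does not reproduce.
MIXTURE = {
    "web_text": 0.6,
    "code": 0.25,
    "instructions": 0.15,
}

def sample_source(rng: random.Random) -> str:
    """Pick a data source for the next example according to mixture weights."""
    names = list(MIXTURE)
    weights = [MIXTURE[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(0)
print([sample_source(rng) for _ in range(8)])
```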
1 code implementation • 16 Apr 2024 • Haozheng Fan, Hao Zhou, Guangtai Huang, Parameswaran Raman, Xinwei Fu, Gaurav Gupta, Dhananjay Ram, Yida Wang, Jun Huan
In this paper, we showcase HLAT: a family of 7B and 70B decoder-only LLMs pre-trained using 4096 AWS Trainium accelerators over 1.8 trillion tokens.
no code implementations • 19 Oct 2023 • Qingru Zhang, Dhananjay Ram, Cole Hawkins, Sheng Zha, Tuo Zhao
These models leverage the attention mechanism to capture long- and short-range dependencies in the sequence.
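For context, a minimal NumPy sketch of standard scaled dot-product attention, the generic mechanism such models build on; this is the textbook formulation, not the specific architecture studied in the paper:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d)) V.

    Q: (n, d) queries, K: (m, d) keys, V: (m, d_v) values.
    Returns an (n, d_v) array of value vectors mixed per query.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                  # pairwise query-key similarity
    scores -= scores.max(axis=-1, keepdims=True)   # subtract max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over keys
    return weights @ V

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal(s) for s in [(4, 8), (6, 8), (6, 8)])
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```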
no code implementations • 19 Nov 2019 • Dhananjay Ram, Lesly Miculicich, Hervé Bourlard
Here, we show that CNN-based matching outperforms DTW-based matching using bottleneck features as well.
no code implementations • 30 Jun 2019 • Dhananjay Ram, Lesly Miculicich, Hervé Bourlard
State-of-the-art solutions to query-by-example spoken term detection (QbE-STD) usually rely on bottleneck feature representations of the query and the audio document to perform dynamic time warping (DTW) based template matching.
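To illustrate the DTW baseline the summary refers to, a minimal NumPy sketch of template matching between two frame-level feature sequences using cosine frame distances; this is plain full-sequence DTW under assumed shapes, not the subsequence variant or the exact system of the paper:

```python
import numpy as np

def dtw_distance(query, doc):
    """Dynamic time warping cost between two feature sequences.

    query: (n, d) and doc: (m, d) frame-level features (e.g. bottleneck
    features). Lower cumulative cost means a better query/document match.
    """
    n, m = len(query), len(doc)
    # Frame-wise cosine distance matrix between all query/document frames.
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    x = doc / np.linalg.norm(doc, axis=1, keepdims=True)
    dist = 1.0 - q @ x.T
    # Cumulative cost with the standard (match, insertion, deletion) moves.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    return acc[n, m]

rng = np.random.default_rng(0)
print(dtw_distance(rng.standard_normal((10, 32)), rng.standard_normal((15, 32))))
```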
2 code implementations • EMNLP 2018 • Lesly Miculicich, Dhananjay Ram, Nikolaos Pappas, James Henderson
Neural Machine Translation (NMT) can be improved by including document-level contextual information.
1 code implementation • NAACL 2018 • Lesly Miculicich Werlen, Nikolaos Pappas, Dhananjay Ram, Andrei Popescu-Belis
Neural sequence-to-sequence networks with attention have achieved remarkable performance for machine translation.
no code implementations • 19 Oct 2016 • Dhananjay Ram, Debasis Kundu, Rajesh M. Hegde
In this work, a Bayesian approach to speaker normalization is proposed to compensate for the performance degradation of a speaker-independent speech recognition system.