Search Results for author: Julian Salazar

Found 12 papers, 8 with code

Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM

no code implementations • 24 May 2023 • Eliya Nachmani, Alon Levkovitch, Roy Hirsch, Julian Salazar, Chulayuth Asawaroengchai, Soroosh Mariooryad, Ehud Rivlin, RJ Skerry-Ryan, Michelle Tadmor Ramanovich

Key to our approach is a training objective that jointly supervises speech recognition, text continuation, and speech synthesis using only paired speech-text data, enabling a "cross-modal" chain-of-thought within a single decoding pass.

Language Modelling Question Answering +3

Zero-Shot End-to-End Spoken Language Understanding via Cross-Modal Selective Self-Training

1 code implementation • 22 May 2023 • Jianfeng He, Julian Salazar, Kaisheng Yao, Haoqi Li, Jinglun Cai

End-to-end (E2E) spoken language understanding (SLU) is constrained by the cost of collecting speech-semantics pairs, especially when label domains change.

Natural Language Understanding Spoken Language Understanding

Align-Refine: Non-Autoregressive Speech Recognition via Iterative Realignment

no code implementations • NAACL 2021 • Ethan A. Chi, Julian Salazar, Katrin Kirchhoff

Non-autoregressive models greatly improve decoding speed over typical sequence-to-sequence models, but suffer from degraded performance.

Speech Recognition

Unsupervised Bitext Mining and Translation via Self-trained Contextual Embeddings

no code implementations • 15 Oct 2020 • Phillip Keung, Julian Salazar, Yichao Lu, Noah A. Smith

We then improve an XLM-based unsupervised neural MT system pre-trained on Wikipedia by supplementing it with pseudo-parallel text mined from the same corpus, boosting unsupervised translation performance by up to 3.5 BLEU on the WMT'14 French-English and WMT'16 German-English tasks and outperforming the previous state-of-the-art.

Machine Translation Sentence +2

Don't Use English Dev: On the Zero-Shot Cross-Lingual Evaluation of Contextual Embeddings

no code implementations • EMNLP 2020 • Phillip Keung, Yichao Lu, Julian Salazar, Vikas Bhardwaj

Multilingual contextual embeddings have demonstrated state-of-the-art performance in zero-shot cross-lingual transfer learning, where multilingual BERT is fine-tuned on one source language and evaluated on a different target language.

Model Selection Transfer Learning +2

Attentional Speech Recognition Models Misbehave on Out-of-domain Utterances

1 code implementation • 12 Feb 2020 • Phillip Keung, Wei Niu, Yichao Lu, Julian Salazar, Vikas Bhardwaj

We discuss the problem of echographic transcription in autoregressive sequence-to-sequence attentional architectures for automatic speech recognition, where a model produces very long sequences of repetitive outputs when presented with out-of-domain utterances.

Automatic Speech Recognition (ASR) +1

Masked Language Model Scoring

6 code implementations • ACL 2020 • Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff

Instead, we evaluate MLMs out of the box via their pseudo-log-likelihood scores (PLLs), which are computed by masking tokens one by one.
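The PLL computation described above can be sketched as follows. This is a minimal illustration, not the paper's released implementation: `masked_token_logprob` is a hypothetical stand-in for a masked-LM forward pass (mask position `i`, read off the log-probability of the true token there), and the toy distribution is invented purely so the sketch runs end to end.

```python
import math

# Hypothetical per-token distribution standing in for an MLM's output
# at a masked position; a real model would condition on the unmasked
# context, but the PLL bookkeeping is identical.
TOY_LOGPROBS = {"the": math.log(0.5), "cat": math.log(0.3), "sat": math.log(0.2)}

def masked_token_logprob(tokens, i):
    # Stand-in for one MLM forward pass with tokens[i] replaced by [MASK]:
    # return log P(tokens[i] | all other tokens).
    return TOY_LOGPROBS[tokens[i]]

def pseudo_log_likelihood(tokens):
    # PLL(W) = sum over positions t of log P(w_t | W with w_t masked),
    # i.e. mask each token one by one and sum the resulting log-probs.
    return sum(masked_token_logprob(tokens, i) for i in range(len(tokens)))

score = pseudo_log_likelihood(["the", "cat", "sat"])
```

With a real MLM this costs one forward pass per token; higher (less negative) PLL indicates a sequence the model finds more plausible, which is what makes PLLs usable as out-of-the-box sentence scores.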

Attribute Domain Adaptation +4

BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition

1 code implementation • 30 Jun 2019 • Shaoshi Ling, Julian Salazar, Yuzong Liu, Katrin Kirchhoff

We introduce BERTphone, a Transformer encoder trained on large speech corpora that outputs phonetically-aware contextual representation vectors that can be used for both speaker and language recognition.

Avg Representation Learning +2

Self-Attention Networks for Connectionist Temporal Classification in Speech Recognition

1 code implementation • 22 Jan 2019 • Julian Salazar, Katrin Kirchhoff, Zhiheng Huang

The success of self-attention in NLP has led to recent applications in end-to-end encoder-decoder architectures for speech recognition.

Classification General Classification +3
