1 code implementation • 16 Jun 2023 • Huang Xie, Khazar Khorrami, Okko Räsänen, Tuomas Virtanen
Conversely, the results suggest that using only binary relevances defined by captioning-based audio-caption pairs is sufficient for contrastive learning.
no code implementations • 5 Jun 2023 • Khazar Khorrami, María Andrea Cruz Blandón, Tuomas Virtanen, Okko Räsänen
As a result, we find that sequential training with wav2vec 2.0 first and VGS next provides higher performance on audio-visual retrieval compared to simultaneous optimization of both learning mechanisms.
1 code implementation • 29 Sep 2021 • Khazar Khorrami, Okko Räsänen
We review the extent to which the audiovisual aspect of LLH is supported by existing computational studies.
1 code implementation • 5 Jul 2021 • Khazar Khorrami, Okko Räsänen
We compare the alignment performance using our proposed evaluation metrics to the semantic retrieval task commonly used to evaluate VGS models.
no code implementations • 24 Jun 2019 • Okko Räsänen, Khazar Khorrami
Earlier research has suggested that human infants might use statistical dependencies between speech and non-linguistic multimodal input to bootstrap their language learning before they know how to segment words from running speech.