Search Results for author: Naomi Harte

Found 12 papers, 9 papers with code

Bioacoustic Event Detection with prototypical networks and data augmentation

no code implementations • 16 Dec 2021 • Mark Anderson, Naomi Harte

This report presents deep learning and data augmentation techniques used by a system entered into the Few-Shot Bioacoustic Event Detection task of the DCASE2021 Challenge.

Data Augmentation, Event Detection, +1

Low Resource Species Agnostic Bird Activity Detection

no code implementations • 16 Dec 2021 • Mark Anderson, John Kennedy, Naomi Harte

This paper explores low-resource classifiers and features for the detection of bird activity, suitable for embedded Automatic Recording Units, which are typically deployed for long-term remote monitoring of bird populations.

Action Detection, Activity Detection

AV Taris: Online Audio-Visual Speech Recognition

1 code implementation • 14 Dec 2020 • George Sterpu, Naomi Harte

In recent years, Automatic Speech Recognition (ASR) technology has approached human-level performance on conversational speech under relatively clean listening conditions.

Action Detection, Activity Detection, +3

Deep Multi-Scale Feature Learning for Defocus Blur Estimation

1 code implementation • 24 Sep 2020 • Ali Karaali, Naomi Harte, Claudio Rosito Jung

This paper presents an edge-based defocus blur estimation method from a single defocused image.

Edge Classification

Learning to Count Words in Fluent Speech enables Online Speech Recognition

1 code implementation • 8 Jun 2020 • George Sterpu, Christian Saam, Naomi Harte

Sequence-to-sequence models, in particular the Transformer, achieve state-of-the-art results in Automatic Speech Recognition.

Automatic Speech Recognition

Should we hard-code the recurrence concept or learn it instead ? Exploring the Transformer architecture for Audio-Visual Speech Recognition

1 code implementation • 19 May 2020 • George Sterpu, Christian Saam, Naomi Harte

The audio-visual speech fusion strategy AV Align has shown significant performance improvements in audio-visual speech recognition (AVSR) on the challenging LRS2 dataset.

Audio-Visual Speech Recognition, Visual Speech Recognition

Neural Generation of Dialogue Response Timings

1 code implementation • ACL 2020 • Matthew Roddy, Naomi Harte

The timings of spoken response offsets in human dialogue have been shown to vary based on contextual elements of the dialogue.

How to Teach DNNs to Pay Attention to the Visual Modality in Speech Recognition

1 code implementation • 17 Apr 2020 • George Sterpu, Christian Saam, Naomi Harte

A recently proposed multimodal fusion strategy, AV Align, based on state-of-the-art sequence to sequence neural networks, attempts to model this relationship by explicitly aligning the acoustic and visual representations of speech.

Audio-Visual Speech Recognition, Visual Speech Recognition

Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition

3 code implementations • 5 Sep 2018 • George Sterpu, Christian Saam, Naomi Harte

Automatic speech recognition can potentially benefit from the lip motion patterns, complementing acoustic speech to improve the overall recognition performance, particularly in noise.

Automatic Speech Recognition

Multimodal Continuous Turn-Taking Prediction Using Multiscale RNNs

1 code implementation • 31 Aug 2018 • Matthew Roddy, Gabriel Skantze, Naomi Harte

To design spoken dialog systems that can conduct fluid interactions, it is desirable to incorporate cues from separate modalities into turn-taking models.

Investigating Speech Features for Continuous Turn-Taking Prediction Using LSTMs

1 code implementation • 29 Jun 2018 • Matthew Roddy, Gabriel Skantze, Naomi Harte

The continuous predictions represent generalized turn-taking behaviors observed in the training data and can be applied to make decisions that are not just limited to end-of-turn detection.

Can DNNs Learn to Lipread Full Sentences?

no code implementations • 29 May 2018 • George Sterpu, Christian Saam, Naomi Harte

Finding visual features and suitable models for lipreading tasks that are more complex than a well-constrained vocabulary has proven challenging.

General Classification, Language Modelling, +1
