Search Results for author: Helen L. Bear

Found 17 papers, 1 paper with code

Visually Exploring Multi-Purpose Audio Data

no code implementations • 9 Oct 2021 • David Heise, Helen L. Bear

We analyse multi-purpose audio using visualisation tools to reveal similarities within the data that may be observed via unsupervised methods.

Acoustic Scene Classification • Classification • +1
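
As an illustrative aside, here is a minimal sketch of this kind of unsupervised similarity visualisation, assuming MFCC features, t-SNE, and the librosa/scikit-learn/matplotlib stack; the paper's actual tools and features are not specified here, so this is only one plausible realisation:

# Illustrative sketch only: project audio clips into 2-D with t-SNE to
# eyeball similarity structure. The feature choice (MFCCs) and embedding
# (t-SNE) are assumptions, not necessarily the tools used in the paper.
import numpy as np
import librosa
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def clip_features(path, sr=22050, n_mfcc=20):
    # Summarise one clip as the mean of its MFCC frames.
    y, sr = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

paths = ["clip_000.wav", "clip_001.wav", "clip_002.wav"]  # hypothetical files
X = np.stack([clip_features(p) for p in paths])

# t-SNE perplexity must stay below the sample count; tune on real data.
emb = TSNE(n_components=2, perplexity=min(30, len(paths) - 1)).fit_transform(X)
plt.scatter(emb[:, 0], emb[:, 1])
plt.title("Unsupervised 2-D view of audio clip similarity")
plt.show()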

Memory Controlled Sequential Self Attention for Sound Recognition

1 code implementation • 13 May 2020 • Arjun Pankajakshan, Helen L. Bear, Vinod Subramanian, Emmanouil Benetos

In this paper we investigate the importance of the extent of memory in sequential self attention for sound recognition.

Event Detection • Sound Event Detection
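
A minimal sketch of the mechanism under study: controlling the extent of memory by masking self-attention to a bounded window of past frames. This is an assumed PyTorch illustration of the masking idea, not the paper's exact model:

# Sketch of "memory-controlled" self-attention: mask attention so each
# time step only sees the previous `memory` frames.
import torch
import torch.nn.functional as F

def windowed_self_attention(x, memory):
    # x: (T, d) sequence of frame embeddings; memory: window size in frames.
    T, d = x.shape
    scores = x @ x.T / d ** 0.5                 # (T, T) dot-product scores
    idx = torch.arange(T)
    dist = idx.unsqueeze(0) - idx.unsqueeze(1)  # dist[q, k] = k - q
    # Allow position t to attend only to positions in [t - memory, t]:
    mask = (dist > 0) | (dist < -memory)        # no future, bounded past
    scores = scores.masked_fill(mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ x        # (T, d) attended output

out = windowed_self_attention(torch.randn(100, 64), memory=16)

Sweeping the `memory` parameter is one direct way to measure how much temporal context the attention actually needs for sound recognition.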

Comparing phonemes and visemes with DNN-based lipreading

no code implementations • 8 May 2018 • Kwanchiva Thangthai, Helen L. Bear, Richard Harvey

We compare the performance of a lipreading system by modeling visual speech using either 13 viseme or 38 phoneme units.

Lipreading
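
The two unit sets are related by a many-to-one phoneme-to-viseme map. A hypothetical fragment of such a map follows; the paper's actual 38-to-13 grouping may differ:

# Hypothetical fragment of a many-to-one phoneme-to-viseme map; the
# actual 38-to-13 mapping used in the paper may group phonemes differently.
P2V = {
    # Bilabials look alike on the lips, so they share one viseme class.
    "p": "V_bilabial", "b": "V_bilabial", "m": "V_bilabial",
    # Labiodentals form another commonly cited class.
    "f": "V_labiodental", "v": "V_labiodental",
}

def to_visemes(phonemes):
    # Collapse a phoneme transcription into viseme units.
    return [P2V[p] for p in phonemes]

print(to_visemes(["p", "f"]))  # ['V_bilabial', 'V_labiodental']
print(to_visemes(["b", "f"]))  # identical output: the pair differs only in audio

Because distinct phonemes collapse into shared classes, viseme units trade discriminative power for visually cleaner training targets, which is exactly the trade-off the comparison probes.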

Phoneme-to-viseme mappings: the good, the bad, and the ugly

no code implementations • 8 May 2018 • Helen L. Bear, Richard Harvey

Not only is this ambiguity damaging to the performance of audio-visual classifiers operating on real expressive speech, but there is also considerable choice between possible mappings.

Comparing heterogeneous visual gestures for measuring the diversity of visual speech signals

no code implementations • 8 May 2018 • Helen L. Bear, Richard Harvey

Visual lip gestures observed whilst lipreading have a few working definitions; the two most common are 'the visual equivalent of a phoneme' and 'phonemes which are indistinguishable on the lips'.

Clustering • Lipreading

Resolution limits on visual speech recognition

no code implementations • 3 Oct 2017 • Helen L. Bear, Richard Harvey, Barry-John Theobald, Yuxuan Lan

Visual-only speech recognition is dependent upon a number of factors that can be difficult to control, such as lighting, identity, motion, emotion, and expression.

Lip Reading • speech-recognition • +1
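
One generic way to probe resolution limits is to synthetically degrade the lip region and score a recogniser at each effective scale. In the sketch below, `recognise`, `frames`, and `labels` are hypothetical stand-ins, and this is not the paper's experimental protocol:

# Generic sketch of a resolution-limit probe: shrink lip-region frames,
# upscale back, and score a recogniser at each effective resolution.
import cv2

def degrade(frame, scale):
    # Simulate a low-resolution capture by down- then up-sampling.
    h, w = frame.shape[:2]
    small = cv2.resize(frame, (max(1, int(w * scale)), max(1, int(h * scale))),
                       interpolation=cv2.INTER_AREA)
    return cv2.resize(small, (w, h), interpolation=cv2.INTER_LINEAR)

def accuracy_at_scale(frames, labels, recognise, scale):
    degraded = [degrade(f, scale) for f in frames]
    preds = recognise(degraded)  # hypothetical trained lipreading model
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Sweep effective resolutions from full size down to 1/16 scale:
# for scale in (1.0, 0.5, 0.25, 0.125, 0.0625):
#     print(scale, accuracy_at_scale(frames, labels, recognise, scale))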

Speaker-independent machine lip-reading with speaker-dependent viseme classifiers

no code implementations • 3 Oct 2017 • Helen L. Bear, Stephen J. Cox, Richard W. Harvey

In machine lip-reading, which is identification of speech from visual-only information, there is evidence to show that visual speech is highly dependent upon the speaker [1].

Clustering • Lip Reading

Some observations on computer lip-reading: moving from the dream to the reality

no code implementations • 3 Oct 2017 • Helen L. Bear, Gari Owen, Richard Harvey, Barry-John Theobald

In the quest for greater computer lip-reading performance, there are a number of tacit assumptions which are either present in the datasets (for example, high resolution) or in the methods (for example, recognition of spoken visual units called visemes).

Lip Reading

Visual gesture variability between talkers in continuous visual speech

no code implementations • 3 Oct 2017 • Helen L. Bear

Benchmarked against speaker-dependent (SD) results and against isolated-word performance, we test with RMAV dataset speakers and observe that, with continuous speech, the trajectory between visemes has a greater negative effect on speaker differentiation.

Lipreading

Understanding the visual speech signal

no code implementations • 3 Oct 2017 • Helen L. Bear

To lipread, or understand speech from lip movement, machines decode lip motions (known as visemes) into the spoken sounds.

Lipreading
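
A toy illustration of why this decoding is hard: words that differ only in visually indistinguishable phonemes (homophenes) collapse to identical viseme strings. The mapping and lexicon below are hypothetical:

# Toy illustration of homophene ambiguity: distinct words can share one
# viseme realisation, so a decoder must disambiguate from context or a
# language model. Mapping and lexicon are made up for the example.
P2V = {"p": "B", "b": "B", "m": "B", "ae": "A", "t": "T", "d": "T"}

LEXICON = {
    "pat": ["p", "ae", "t"],
    "bat": ["b", "ae", "t"],
    "mad": ["m", "ae", "d"],
}

def viseme_string(phonemes):
    return "".join(P2V[p] for p in phonemes)

# Group words by viseme realisation: each group is one ambiguity set.
groups = {}
for word, phones in LEXICON.items():
    groups.setdefault(viseme_string(phones), []).append(word)
print(groups)  # {'BAT': ['pat', 'bat', 'mad']}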

Visual speech recognition: aligning terminologies for better understanding

no code implementations • 3 Oct 2017 • Helen L. Bear, Sarah Taylor

This joining of two previously disparate areas with different perspectives on computer lipreading is creating opportunities for collaboration, but the literature now faces challenges in knowledge sharing due to the multiple uses of terms and phrases and the range of methods for scoring results.

Lipreading • speech-recognition • +1

Which phoneme-to-viseme maps best improve visual-only computer lip-reading?

no code implementations • 3 Oct 2017 • Helen L. Bear, Richard W. Harvey, Barry-John Theobald, Yuxuan Lan

A critical assumption of all current visual speech recognition systems is that there are visual speech units called visemes which can be mapped to units of acoustic speech, the phonemes.

Lip Reading • speech-recognition • +1

Decoding visemes: improving machine lipreading

no code implementations • 3 Oct 2017 • Helen L. Bear

The term "viseme" is used in machine lipreading to represent a visual cue or gesture which corresponds to a subgroup of phonemes where the phonemes are visually indistinguishable.

Clustering • General Classification • +4

Finding phonemes: improving machine lip-reading

no code implementations • 3 Oct 2017 • Helen L. Bear, Richard W. Harvey, Yuxuan Lan

In machine lip-reading there is continued debate and research around the correct classes to be used for recognition.

Lip Reading
