The standard approach to assessing the reliability of automatic speech
transcriptions is to use confidence scores. If accurate, these
scores provide a flexible mechanism to flag transcription errors for upstream
and downstream applications...
One challenging type of error that recognisers
make is the deletion. Deletions are not accounted for by standard
confidence estimation schemes and are hard to rectify in upstream and
downstream processing. High deletion rates are prevalent in limited-resource
and highly mismatched training/testing conditions studied under the IARPA Babel
and Material programs. This paper looks at the use of bidirectional recurrent
neural networks to yield confidence estimates for predicted as well as deleted
words. Several simple schemes are examined for combining these estimates. To
assess the usefulness of this approach, the combined confidence score is used
to select untranscribed data, favouring transcriptions with fewer deletion
errors. Experiments are conducted using IARPA Babel/Material program languages.