LREC 2020 • Joao Monteiro, Md Jahangir Alam, Tiago Falk
Automatic speech processing applications often have to aggregate local descriptors (i.e., representations of the input speech corresponding to specific portions of the time dimension) into a single fixed-dimension representation, known as a global descriptor, on top of which downstream classification tasks can be performed.
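To make the aggregation problem concrete, here is a minimal sketch of one common baseline, statistics pooling, which concatenates the per-dimension mean and standard deviation over time. It is an illustrative assumption, not the method proposed by the authors; the function name `statistics_pooling` and the toy inputs are hypothetical.

```python
from statistics import fmean, pstdev


def statistics_pooling(local_descriptors):
    """Map T local descriptors of dimension D to one global descriptor of dimension 2*D.

    Assumption: a baseline pooling scheme chosen for illustration only,
    not the approach described in the paper.
    """
    if not local_descriptors:
        raise ValueError("need at least one local descriptor")
    dim = len(local_descriptors[0])
    if any(len(frame) != dim for frame in local_descriptors):
        raise ValueError("all local descriptors must share the same dimension")

    # Transpose so that each tuple holds one feature dimension across all time steps.
    per_dimension = list(zip(*local_descriptors))
    means = [fmean(values) for values in per_dimension]
    stds = [pstdev(values) for values in per_dimension]  # population std over time
    return means + stds


# Two toy utterances of different lengths (2 and 4 frames), each with D = 2.
short_utterance = [[1.0, 2.0], [3.0, 4.0]]
long_utterance = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0], [6.0, 1.0]]

global_short = statistics_pooling(short_utterance)
global_long = statistics_pooling(long_utterance)

# Both global descriptors have the same length regardless of utterance duration.
print(len(global_short), len(global_long))
```

Whatever the number of frames, the output length depends only on the descriptor dimension, which is what allows a fixed-input classifier to sit on top of it.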