Light Gated Multi Mini-patch Extractor For Audio Classification

Published in IEEE ICASSP HSCMA, 2024

Audio classification, which serves as a fundamental step in acoustic signal processing, has attracted a lot of research interest, and numerous audio classification neural networks have been proposed. In these networks, down-sampling blocks that compress audio features are essential because computational capacity is limited. However, compressing the signal inevitably causes a loss of relevant information. To mitigate this issue, a large number of parameters is typically used. In this paper, we present a novel down-sampling method called gated multi mini-patch extractor (GMME), in which multiple convolutional layers are used to extract relevant information at different levels, including time frames, pseudo-frequency bins, and global features. A gate mechanism is adopted to retain the correlation with the original features. Several simulations demonstrate that, compared to the baseline, our method achieves comparable or slightly better performance with a significant reduction in the number of parameters.
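The abstract gives no layer definitions, so the following is only a rough illustration of the general idea, not the GMME architecture from the paper. It assumes a single-channel feature map of shape (pseudo-frequency bins x time frames), non-overlapping s x s mini-patches, one weight vector per branch, and a sigmoid gate that blends the extracted features with an average-pooled copy of the original features. The function name `gmme_sketch` and all parameter names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def patches(x, s):
    """Split an F x T map into a grid of non-overlapping s x s mini-patches."""
    F, T = len(x), len(x[0])
    return [[[row[j:j + s] for row in x[i:i + s]]
             for j in range(0, T - s + 1, s)]
            for i in range(0, F - s + 1, s)]


def gmme_sketch(x, s, w_time, w_freq, w_glob, w_gate, b_gate):
    """Down-sample x by a factor s along both axes (hypothetical sketch).

    w_time, w_freq: length-s weight lists (assumed stand-ins for learned kernels)
    w_glob, w_gate, b_gate: scalars (assumed stand-ins for learned parameters)
    """
    F, T = len(x), len(x[0])
    # Global feature: mean of the whole map, shared by every output cell.
    glob = sum(sum(row) for row in x) / (F * T)
    out = []
    for patch_row in patches(x, s):
        out_row = []
        for p in patch_row:
            # Time branch: average over frequency inside the patch, weight along time.
            t_prof = [sum(p[r][c] for r in range(s)) / s for c in range(s)]
            time_feat = sum(w * v for w, v in zip(w_time, t_prof))
            # Frequency branch: average over time inside the patch, weight along frequency.
            f_prof = [sum(p[r]) / s for r in range(s)]
            freq_feat = sum(w * v for w, v in zip(w_freq, f_prof))
            extracted = time_feat + freq_feat + w_glob * glob
            # Gate: computed from the average-pooled original features; it blends
            # the extracted value with the pooled original so the output stays
            # tied to the input.
            pooled = sum(t_prof) / s
            g = sigmoid(w_gate * pooled + b_gate)
            out_row.append(g * extracted + (1.0 - g) * pooled)
        out.append(out_row)
    return out


# Usage: a 4 x 6 map down-sampled by 2 gives a 2 x 3 map.
x = [[float(i * 6 + j) for j in range(6)] for i in range(4)]
y = gmme_sketch(x, 2, [0.5, 0.5], [0.5, 0.5], 0.1, 1.0, 0.0)
```

In this sketch the branches add only 2s + 3 scalars per channel. That is meant to show how small, specialised extractors combined with a gate can keep the parameter count low, and it is not a claim about the paper's actual parameter budget.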