Audio classification is a difficult task for computers because of the complexity of audio data. There are different approaches to automating it. The most direct approach is End-to-end Learning, where the raw audio data is classified with minimal data conversion. Another approach is Spectrogram Learning, where the audio data is first converted to images, which are then classified. End-to-end Learning is a relatively new technique compared to Spectrogram Learning. The experiments in this work consist of a proof of concept, several optimization experiments, and a set of final experiments. The results show that Spectrogram Learning is more reliable than End-to-end Learning. That does not necessarily mean End-to-end Learning is an uninteresting technique for the future. Rather, the optimization process of Spectrogram Learning is currently far more mature than that of End-to-end Learning. For now, Spectrogram Learning remains the more worthwhile choice for audio classification.
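To make the difference between the two approaches concrete, the following is a minimal sketch of the conversion step that Spectrogram Learning relies on: turning a one-dimensional waveform into a two-dimensional, image-like array. It is not the pipeline used in the paper. The function name `log_spectrogram`, the frame length, the hop size, and the synthetic 1 kHz test tone are all assumptions chosen for illustration.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128):
    """Convert a 1-D waveform into a 2-D log-magnitude spectrogram.

    Hypothetical helper for illustration; frame_len and hop are assumed values.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # One FFT per frame gives the magnitude at each frequency over time.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Log compression, then transpose to (frequency bins, time frames).
    return np.log1p(magnitude).T


# Synthetic input (assumption): one second of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
waveform = np.sin(2 * np.pi * 1000 * t)

image = log_spectrogram(waveform)

# The strongest frequency row should sit at the tone's frequency.
peak_bin = int(image.mean(axis=1).argmax())
peak_hz = peak_bin * sample_rate / 256
print(image.shape, peak_hz)
```

In a Spectrogram Learning setup, an array like `image` would be passed to an image classifier, whereas an End-to-end Learning setup would pass `waveform` to the model directly.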
Rudy van den Bosch