Automatic Music Genre Classification
Abstract
Music genre classification is a branch of audio classification, the task of assigning labels to an audio file or sample based on genre, mood, or instruments. Recent research has shown that Convolutional Neural Networks (CNNs) trained on mel-spectrograms are an effective way to classify audio files. From Spotify and Shazam to applications in construction and even the identification of disease-carrying insects, audio classification has clear value in the real world.
In this paper we replicate the work of Schindler et al., who use CNNs to classify audio clips from four datasets: GTZAN, ISMIR, the Latin Music Database and the Million Song Dataset. We then attempt to improve on their results in two ways. First, data augmentation: we use pitch shifting to create more training samples for the model to learn from. Second, transfer learning: we apply the method, using a DenseNet architecture, to our dataset. Together, these additional techniques result in significantly improved performance overall.
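To make the pitch-shifting idea concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes a clip is a plain list of float samples, and the function names (`semitone_ratio`, `pitch_shift_by_resampling`, `augment`), the set of semitone steps, and the example label are all hypothetical. It shifts pitch by naive resampling, which also changes the clip's duration; a real pipeline would more likely use a duration-preserving shifter such as `librosa.effects.pitch_shift`.

```python
import math


def semitone_ratio(n_steps):
    """Frequency ratio for a shift of n_steps semitones (equal temperament)."""
    return 2.0 ** (n_steps / 12.0)


def pitch_shift_by_resampling(samples, n_steps):
    """Naive pitch shift: read the signal at a different rate.

    Uses linear interpolation between neighbouring samples. Unlike a true
    pitch shifter, this also shortens (shift up) or lengthens (shift down)
    the clip. This simplification is an assumption of the sketch.
    """
    if not samples:
        return []
    ratio = semitone_ratio(n_steps)
    last = len(samples) - 1
    out_len = int(last / ratio) + 1
    shifted = []
    for i in range(out_len):
        pos = i * ratio                  # fractional read position in the input
        lo = min(int(pos), last)         # sample at or before the read position
        hi = min(lo + 1, last)           # next sample, clamped to the end
        frac = pos - lo
        shifted.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return shifted


def augment(samples, label, steps=(-2, -1, 1, 2)):
    """Return the original clip plus one shifted copy per step, same label."""
    out = [(samples, label)]
    for n in steps:
        out.append((pitch_shift_by_resampling(samples, n), label))
    return out


# Illustrative input: 0.1 s of a 440 Hz sine at 8 kHz, with a made-up label.
sr = 8000
tone = [math.sin(2 * math.pi * 440 * t / sr) for t in range(800)]
augmented = augment(tone, "classical")
```

Each shifted copy keeps the genre label of the original clip, so one labelled clip yields several training samples. The step values here are illustrative only.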
Click the attached page to read the full paper.