Album Cover-Based Genre Classification

Final Project
CS 429: Computer Vision
Matt Hoffman

Introduction:
Automatic genre classification is a popular problem in the machine listening community. The ability to classify a song or album into a broad category without human interaction has numerous applications for the recording and music retail industries. Until now, most work on this problem has focused on extracting relevant features from audio data, or occasionally other metadata (musical notation, user playlists, etc.) which is frequently hand-labeled. This project attempts to make use of an additional source of data: the cover artwork associated with albums.

Previous Work:
Although a significant amount of work has been done on applying audio features to genre recognition, those features are unavailable when only the cover artwork is considered. Instead, my project uses image features of the sort commonly used as low-level input to computer vision systems.

Data:
I acquired my data set from the AllMusic Guide (http://www.allmusic.com), which has an extensive database of artists and albums browsable by, among other things, genre. Images of album covers are stored as 200 pixel by 200 pixel JPEG images. I wrote a program in Java to automatically scrape a collection of these images (sorted by genre) from the allmusic.com website. This collection was then culled to 59 albums per genre for training and testing. The genres my system considers are Country, Easy Listening, Electronica, Hard Rock, Jazz, New Age, and Rap. This selection of genres was chosen in the hope that the album covers within each genre would be easily distinguished from those of other genres. Classifying an album as Folk versus Country, for example, might be difficult even for a human being based on album cover alone.

Below are links to tiled images of the album covers used in this project, organized by genre:
Country
Easy Listening
Electronica
Hard Rock
Jazz
New Age
Rap

Features:
The features I used fall into four categories: those based on color histograms, those derived from the frequency domain, those based on the Canny edge detection algorithm, and those based on a corner detection algorithm.

Color Histograms:
These are obtained for each image by calculating three separate 50-bin histograms, one each for the hue, saturation, and brightness values of every pixel in the image. This results in 50 * 3 = 150 features describing the hue, saturation, and brightness content of the image.
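As a rough illustration of the histogram step, the sketch below bins hue, saturation, and brightness into 50 bins each using Python's standard `colorsys` module. It is not the project's own code. The function name `hsv_histograms`, the input format (a list of 8-bit RGB tuples), and the normalization by pixel count are all assumptions made for this example.

```python
import colorsys


def hsv_histograms(pixels, bins=50):
    """Return 3 * bins features: hue, saturation, and brightness histograms.

    `pixels` is an iterable of (r, g, b) tuples with values in 0..255.
    Normalizing by the pixel count is an assumption of this sketch; the
    report does not say whether raw counts or fractions were used.
    """
    hists = [[0] * bins for _ in range(3)]
    count = 0
    for r, g, b in pixels:
        hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        for channel, value in enumerate(hsv):
            # Each channel lies in [0, 1]; a value of exactly 1.0 goes in the last bin.
            index = min(int(value * bins), bins - 1)
            hists[channel][index] += 1
        count += 1
    if count == 0:
        raise ValueError("no pixels given")
    return [c / count for hist in hists for c in hist]


# Two-pixel toy input: one pure red pixel and one black pixel.
features = hsv_histograms([(255, 0, 0), (0, 0, 0)])
```

Because every cover has the same 200 x 200 size, raw counts and normalized fractions differ only by a constant factor.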

Cepstral Coefficients:
The next set of features is obtained from the frequency-domain representation of each image, computed by taking the magnitude spectrum of the image using a Discrete Cosine Transform (DCT). To obtain a coarser representation of the frequency content, I take the DCT of the log-magnitude spectrum, which yields a cepstral representation of the image. I then discard all but the lowest 10x10 square of coefficients, which describe the low-frequency activity in the spectrum and should be the most important. This adds another 10 * 10 = 100 features to describe the frequency content of the image.
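The cepstral pipeline described above (DCT, log magnitude, DCT again, keep the lowest 10x10 block) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's implementation: it takes a single-channel (grayscale) image, uses an unnormalized DCT-II, and adds a small `eps` before the logarithm to avoid log(0). The report specifies none of these details. The direct O(n^2) transform per row is chosen for readability; a real system would use a fast transform from a numerical library.

```python
import math


def dct_1d(values):
    """Unnormalized DCT-II of a sequence (direct O(n^2) evaluation)."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def dct_2d(matrix):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in matrix]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def cepstral_features(gray, keep=10, eps=1e-9):
    """Return keep * keep low-order cepstral coefficients of a grayscale image.

    `gray` is a list of equal-length rows of numbers. `eps` is an assumption
    of this sketch, added so that zero-valued coefficients have a finite log.
    """
    if len(gray) < keep or len(gray[0]) < keep:
        raise ValueError("image is smaller than the block of coefficients to keep")
    spectrum = dct_2d(gray)
    log_magnitude = [[math.log(abs(c) + eps) for c in row] for row in spectrum]
    cepstrum = dct_2d(log_magnitude)
    return [cepstrum[u][v] for u in range(keep) for v in range(keep)]


# Small synthetic 12x12 "image" (invented values, for illustration only).
toy = [[(3 * r + 5 * c) % 17 + 1 for c in range(12)] for r in range(12)]
coefficients = cepstral_features(toy)
```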

Edge Features:
To capture information about the edge content of images, I apply the Canny edge detection algorithm to each album cover and compute statistics on the resulting edge image. The first statistic is simply the number of pixels determined to be edges using nonmaximum suppression and hysteresis thresholding with a sigma of 3 pixels, a horizontal threshold of 0.3, and a vertical threshold of 0.3. To get finer-grained information about edge distribution, I then break the image into a grid of 25x25-pixel sections (an 8x8 grid for a 200x200 cover) and calculate the standard deviation, minimum, and maximum of the number of edge pixels per section. To capture how the edge content around the border of the cover compares with the center, I also separately calculate the mean and standard deviation of the number of edge pixels in the one-third of sections closest to the edge of the image, and in the remaining sections. These operations produce an additional 8 features (1 + 3 + 4).
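The eight grid statistics can be sketched as below, starting from a binary edge map that some edge detector has already produced (the detector itself is not shown). Two things here are assumptions rather than details from the report. First, the standard deviation is the population form (`statistics.pstdev`). Second, the border set is taken to be the outermost ring of sections, because the exact rule for choosing the third of sections closest to the edge is not given. On an 8x8 grid that ring holds 28 of the 64 sections, so it only approximates the one-third split. The names `section_counts` and `grid_features` are hypothetical.

```python
import statistics


def section_counts(mask, cell=25):
    """Count marked pixels in each cell x cell section of a binary mask."""
    height, width = len(mask), len(mask[0])
    counts = {}
    for top in range(0, height, cell):
        for left in range(0, width, cell):
            counts[(top // cell, left // cell)] = sum(
                mask[y][x]
                for y in range(top, min(top + cell, height))
                for x in range(left, min(left + cell, width))
            )
    return counts


def grid_features(mask, cell=25):
    """Return [total, std, min, max, border mean, border std, center mean, center std]."""
    counts = section_counts(mask, cell)
    last_row = max(r for r, _ in counts)
    last_col = max(c for _, c in counts)
    if last_row < 2 or last_col < 2:
        raise ValueError("need at least a 3x3 grid of sections")
    border, center = [], []
    for (r, c), n in counts.items():
        # Assumption: "border" means the outermost ring of sections.
        on_ring = r in (0, last_row) or c in (0, last_col)
        (border if on_ring else center).append(n)
    values = border + center
    return [
        sum(values),
        statistics.pstdev(values),
        min(values),
        max(values),
        statistics.mean(border),
        statistics.pstdev(border),
        statistics.mean(center),
        statistics.pstdev(center),
    ]


# Toy 75x75 mask whose only marked pixels fill the central 25x25 section.
toy_mask = [[1 if 25 <= y < 50 and 25 <= x < 50 else 0 for x in range(75)] for y in range(75)]
toy_features = grid_features(toy_mask)
```

The same function applies unchanged to a binary corner map, since only the detector output differs.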

Corner Features:
These features are very similar to the edge content features, except that they are computed from the output of a corner detection algorithm instead of an edge detection algorithm. Using the pixels marked as corners (within a neighborhood of 3 pixels, with a sigma of 3 pixels and a threshold of 50), I calculate the same statistics as above: the total number of corners; the standard deviation, minimum, and maximum number of corners in the 25x25-pixel sections; and the means and standard deviations of the number of corners in sections around the edge and in the middle. This produces another 8 features.

In total, I extract 150 + 100 + 8 + 8 = 266 features for each image.

Classification:
I use a Support Vector Machine (SVM) classifier with a Radial Basis Function (RBF) kernel to do the actual classification. The implementation is from WEKA, a Java library for machine learning that implements a number of standard classifiers. I tried several other classifiers, including naive Bayes classifiers, AdaBoost with decision stumps, and k-nearest neighbors, but SVMs produced the best results.
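For reference, the RBF kernel is commonly defined as K(x, y) = exp(-gamma * ||x - y||^2), which scores two feature vectors as more similar the closer they are. The sketch below evaluates that formula directly; it is not WEKA's API, and the `gamma` value is a placeholder because the report does not state which kernel parameters were used.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel value exp(-gamma * squared Euclidean distance)."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# gamma=0.5 is a hypothetical value chosen only for this example.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)
apart = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)
```

Identical vectors give a kernel value of 1, and the value decays toward 0 as the vectors move apart.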

Results - Feature Data:

HSV Histograms:
The mean histograms across all albums for hue, saturation, and brightness can be found here:
Hue
Saturation
Brightness

The deviations from these means for the mean histogram values of each of the seven genres can be found here:
Country:
Hue
Saturation
Brightness

Easy Listening:
Hue
Saturation
Brightness

Electronica:
Hue
Saturation
Brightness

Hard Rock:
Hue
Saturation
Brightness

Jazz:
Hue
Saturation
Brightness

New Age:
Hue
Saturation
Brightness

Rap:
Hue
Saturation
Brightness

Composite graphs of deviations from the mean for all genres can be found here:
Hue
Saturation
Brightness

Cepstral Coefficients:
A plot of the mean of the cepstral coefficients for all images can be found here.

Plots of the deviations from the mean of the average cepstral coefficients for each genre can be found below:
Country
Easy Listening
Electronica
Hard Rock
Jazz
New Age
Rap

Edge Features:
Deviations from the mean for the mean edge feature values of each of the seven genres are summarized in this plot. The features are presented in the order described above: sum of all edge pixels; std dev, min, and max of edge pixels in each section; and mean and std dev of edge pixels in each section in the border and center of the image.

Corner Features:
Deviations from the mean for the mean corner feature values of each of the seven genres are summarized in this plot. These features are presented in the same order described above.

Results - Classification:
This graph summarizes the results of my system. The y-axis represents the percentage of albums whose actual genre is among the classifier's top x guesses, where x is the corresponding value on the x-axis. Each line represents a different subset of the available features. So, for example, when using all of the features available, the correct genre is within the classifier's top three guesses in over 60% of cases, whereas random guessing (the baseline) would have had the correct answer in its top three only a little more than 40% of the time (3/7, or about 42.9%). To test my system I used repeated 10-fold cross-validation, dividing the data set randomly into 10 roughly equal-sized subsets, testing the classifier on each subset in turn and training on the rest. I repeated this process many times until the average results showed little change.
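The "correct within x tries" measure and its random-guessing baseline can be written down in a few lines. The sketch below is illustrative only: the function names are hypothetical, and the ranked guesses are invented toy data rather than output from the system described here.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of items whose true label is among the first k ranked guesses."""
    if not true_labels:
        raise ValueError("no labels given")
    hits = sum(
        1
        for ranking, truth in zip(ranked_predictions, true_labels)
        if truth in ranking[:k]
    )
    return hits / len(true_labels)


def random_baseline(k, n_classes=7):
    """Chance that a uniformly random ranking puts the true class in its top k."""
    return min(k, n_classes) / n_classes


# Invented rankings for three albums whose true genre is Jazz.
toy_rankings = [
    ["Jazz", "Rap", "Country"],
    ["Rap", "Jazz", "Country"],
    ["Country", "Rap", "Jazz"],
]
toy_truth = ["Jazz", "Jazz", "Jazz"]
within_two = top_k_accuracy(toy_rankings, toy_truth, k=2)
chance_top_three = random_baseline(3)
```

With seven genres, `random_baseline(1)` is 1/7 and `random_baseline(3)` is 3/7, matching the baseline figures quoted in the text.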

The system successfully guesses the genre on the first try only about 26.6% of the time using all of the available features, but this is substantially better than the baseline result of 1/7, or 14.3%. That the system does not guess the correct genre a majority of the time should not be too surprising, since it does not have access to the semantic information that human beings do, and even with such information it is entirely possible for humans to misclassify an album's genre if only the cover art is available. Examples of such semantic information include titles and artist names ("25 Country Hits" gives a fairly good indication of the genre of an album, for example) or images of artists (rock musicians tend to look recognizably different from country musicians, but accurately distinguishing between them may require the cognitive ability to analyze fine visual details).

Each of the sets of features contributed something to the accuracy of the system, but the returns diminished substantially as more information was added. In the case of adding corner information to edge information, no improvement becomes visible until lower-ranked choices are considered. The most important features seem to be those associated with cepstral information and brightness histograms. It is possible that better results could be had by taking further advantage of information in the frequency domain, or by using some kind of template-based approach of the sort used in scene categorization, to which this problem bears a substantial resemblance.

Below is a confusion matrix detailing what sorts of mistakes the program makes when it misclassifies albums. Names on the left indicate the correct genre, while the names at the top indicate the genre with which an album was confused. For example, 61 New Age albums were identified as Electronica albums. Interestingly, many albums of all genres were misclassified as Electronica. This is perhaps due to the frequently simple color schemes and designs of albums in this genre. Jazz was frequently mistaken for Country, which may be explained by an overlap in the time periods during which the albums under consideration were recorded. It is worth noting that release date probably plays as important a role in classification as genre, since design styles change over time, and the albums AllMusic Guide designates as "classic" for a genre will tend to clump together in time. So it is possible that the system is seizing on information that correlates better with release date than with genre proper.
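A confusion matrix with this layout (correct genre on the rows, assigned genre on the columns) can be tallied as in the sketch below. The genre list comes from the report; the function name and the toy labels are invented for illustration and do not reproduce the counts discussed above.

```python
GENRES = ["Country", "Easy Listening", "Electronica", "Hard Rock", "Jazz", "New Age", "Rap"]


def confusion_matrix(true_labels, predicted_labels, classes=GENRES):
    """Return matrix[i][j] = number of albums of class i that were labeled class j."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


# Invented labels: two New Age albums, one mistaken for Electronica.
toy_true = ["New Age", "New Age", "Jazz"]
toy_predicted = ["Electronica", "New Age", "Country"]
toy_matrix = confusion_matrix(toy_true, toy_predicted)
```

Off-diagonal entries are the misclassifications, so a heavy column (such as Electronica in the discussion above) marks a genre that absorbs many wrong guesses.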



Conclusion:
This system did a marginally successful job of classifying images of album covers into genres. Its performance, though significantly better than baseline, was nonetheless not stellar. It is possible that more sophisticated shape matching techniques could be of use in improving these results, since no attempt to use such information was made. Although the performance of this classifier was not really adequate for use in applications on its own, it is possible that when combined with audio features it could be of some use in slightly improving results, since there is no reason to expect any of the features used in this system to show strong correlations with those

# Posted Tuesday 26 December 2006 13:49

music

yaw belcourage Virage Yibougé 3a9lia tchanger fkol partie ndiro trager
allez allez louissi ou saidi gosso ou traoré samadinho markiha felfilét

wellimaybougé mayg3od m3ana yro7 b3id wella yensana
wellimaybougé mayg3od m3ana yro7 b3id wella yensana

Virage mehbol anti magana ya rajoui felhadra perimé ya rwa7 tchof le rouges fwembley
Welyoum kil3ada ballone ydor virage lhoul wa ya louissi 3awdha fretour
jouez jouez hamra ya 3zizti bla bik nsoufri nrab7o l3adian ondiro festivité

chofo sa7afa 3lina tehki wetgol equipna fort ovirage mehbol
chofo sa7afa 3lina tehki wetgol equipna fort ovirage mehbol

30dawra m3ak pour toujours waya les rouges faites la briller
welbotola wel3ossba obligé

lalalalalal oho oho lalala ohoho
waya les rouges faites la briller
welbotola wel3ossba obligé
Wal 7amra wal bayda mon amour
la la la la la la la la la

wal 7amra wal bayda mon amour
wal victoire aller retour
w fkol stade inod lhool
sa9siw 3lina fal modone
sa9siw 3lina fal modone

wled l7amra les kamikaz
w yro7o m3ak f kol blace
ya wal 3a9lia holliganz
hada virage les winners
hada virage les winners

ya fora fora la camora
ya tfaraj w t3allem lkoora
équipna man bakri machhoora
fidél m3ak f kol dora
fidél m3ak f kol dora

la la la la la la la la

wled lkhadra 3ayach ma7gour
chifouna la7d ydour
ya chaque semaine ga3ad yjkor
flwan l7amra rah madrour
flwan l7amra rah madrour
3ich nhar tsma3 khbar
la raja la les fars
vafin coulo ya oscar
wlidat romao les stars
wlidat romao les stars

lalala lalala lalala lalalal allez les rouges
lalala lalala lalala lalalal nchala bjouj
Green boys ra7et bye bye
yalal yalal ya la la
ya litahwa lkoura hou hou
ya rwa7 tazha m3ana hou hou *
frimija machhoura haaa
wa hna nmoutou 3liha
ou chafouna f derby hou hou
galou machi normal hou hou
virage tsunami hiiiiiiiii
3la balkoum ya l3adyan
yaw merci bcp louissi hou hou
zahitina f derby hou hou
m3ak raj3ou lpasser
nchalah nwalou kizman
kidina championne
zahina l etranger
gracia ragazzi
squaddra l'italie
green boys f 7alt bye bye
khrajna fihoum tay tay
nchalah l3am jay
paris ndouzou lmarseille
cette anne champions ligue ouhou
ndouzou fel babor
ndouzou l tunisie
roubla fi l algerie

Ya Les Verts
ki l3ada rah hna jinakoum ya les verts hadinakoum tsunami ja bin 3aynikoum suisi chahd 3lmikoum
chafou l virage koulou halwassa shab zatla et shab tassa kalou curva du nord madrassa yhabou l ballon machi siassa alllé allé allé


Of Chaque Derby
of chaque derby dima zahine o ndiro lihome l kochmare
o tsonami drabha 3lihome khala kolchi mahtare
c po normale hada li sare jabnaha bl courage
dkika 90 wi node l virage les fumie et la3jaje
alè alè alè alè alè alè robla fi kole blasse
alè alè alè alè alè alè winners les kamikaze


Vas-y bouger mon ami

lalalalla lalalalalall lallalallala laalallalalala
ya l7amra ya mon amour ya mon amour nhwaha man sogri
had l3am lballon ydour lballon ydour inchala championé
allez allez allez allez vas-y bougez ya mon ami
allez allez allez allez vas-y bougez ya mon ami



Vafin koulo ya Rajawi
ohohohoho ihihihihih yalah jina lwydad zina w championat lina
vafin coulou ya rajawi vafin coulou ya rajawi vafin coulou vafin coulou vafin coulou ya rajawi
allez allez allez allez allez winners allez allez allez chantez allez bougez
inchala had l3am championé
winners
hohohohoho ihihihihihi campioné campioné allez winners bouger
campioné campioné allez winners bouger



Merci bcp romao jose
ya choufo l7amra ki tl3at ya choufo niran ki cha3lat
viragna mahboul kil3ada river plait w lboca
merci beaucoup romao jose hohohoho champion maroc cette année
hohohoho merci beaucoup romao jose hohohoho champion maroc cette année hohohoho
la fiesta ndirouha wtgirou jamais twaslona fniveau
la fiesta ndirouha wtgirou jamais twaslona fniveau
w vafin coulou yal 3assima hohohohoho rajawi pirimer ki dima hohohohoho
w vafin coulou yal 3assima hohohohho rajawi pirimer ki dima hohohohoho
l'histoire championat rah ydour equipe l7amra dima fort
lpalmares man zman w3lach tkarhouna yal 3adyan
lalalalalalala hohohohoho hohohoho
lalallalalalalal lalalalalalalal lalalalala lalalalalala




L'Italie mon pays
Ki nchouf l7amra galbi ytfaja n7sab rou7i gawriiiiiiiii
3atal3a ngani w les flammes dawi w nsib rou7i nbougeeeeeeeeeez
wndirou la folie walah tgoul wimbley
whad l3am doubley wnsorna ya rabi
l'italie mon pays forza milan ac
ntya jrada tvibrer tvibrer flmagana tbomber
allez la curve ragazzis milan ac
laclicka tbougez allez allez ma3roufa fl'afrique rana dangez
chi tarlina ywali allez allez
wma3la balich bik yal bojadi rajawi fuck you m3a l7amra gadi
w frimija danger allez allez frimija danger allez allez
3adakhla lcasa tzha wtgani wtgoul ra kaynin jouj wa7da ftouche
l7amra fi bali wngoulha bla fchoch
magana krinboch wynadboch
jibou lina la coupe du trone
wskara 3lihom championne
lhamra wal bayda
ya 3liha man nabra
ohad l3am inchallahh
rwah njibo doublé
ALLEZ ALLEZ
nroho lsan sirooooo
roubla ran7na ndiro
hamra wbayda ac milooo
o les verts vanfanculo
ALLEZ ALLEZ
o les vert vanfanculooo 3lach menna tghiro
kif ma darna tdiroooo
Elroubla fi sansirooo



Rajawi Pirmer

laylo laylo laylo lay
laylo laylo laylo lay
laylo laylo laylo lay
celtic vafin coulo
celtic vafin coulo
alio alio alio allez
alio alio alio allez
alio alio alio allez
allez les rouges gagner
allez les rouges gagner
wntaya rajawi mazal pirimer
n7laf lik bi rabi ma3la bali
jomhourkom chamkar w dawi khawi
fuck 3lik w3la lfarawi
fuck 3lik w3la lfarawi

# Posted Sunday 24 December 2006 14:43

cava les wydad maroc rouge


# Posted Sunday 24 December 2006 08:16

frimija bouji@hotmail.fr my MSN for all the men and women casablanca wac


# Posted Sunday 24 December 2006 08:14