General Strategy
The basic idea is to use Last.fm's tags for genre tagging. In iTunes, the genre tag is in my opinion best used when it contains exactly one genre, i.e. something like "Electronica" rather than "Electronica / Dance". On the other hand, dropping all but one tag would lose a lot of information, so I decided to use the grouping tag for the additional information contained in the list of tags that an artist has on Last.fm. In the example above, that would be something like "Electronica, Dance, 80s, German". That way it is simple to use iTunes' Smart Playlist feature to create playlists of all, say, dance music. This approach is probably not suitable for classical music.
The ID3 field that is exposed in iTunes' UI as "grouping" is defined in the ID3v2 spec as:
- TIT1
- The 'Content group description' frame is used if the sound belongs to a larger category of sounds/music. For example, classical music is often sorted in different musical sections (e.g. "Piano Concerto", "Weather - Hurricane").
Practical Considerations
If one simply took an artist's highest-rated Last.fm tag as the genre, one would end up with pretty inconsistent genre tags (think "hip-hop", "hip hop", and "hiphop"). Therefore, I chose to use a fixed set of values for the genre. In an earlier version of ID3 (ID3v1), the list of possible genres was fixed. While this is clearly a terrible idea to start with, it came in handy in this case: I used that list as the fixed set of genres.
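A minimal sketch of this matching step, assuming a small hand-picked subset of the ID3v1 genre names (KNOWN_GENRES) and a hypothetical helper called pick_genre; the script itself checks against the full list in mutagen's TCON.GENRES. It illustrates how title-casing lets "hip-hop" match the canonical "Hip-Hop", while the other spellings simply fall through to the next tag:

```python
# Hypothetical subset of the ID3v1 genre names, for illustration only.
KNOWN_GENRES = {"Blues", "Classic Rock", "Dance", "Electronic",
                "Hip-Hop", "Jazz", "Pop", "Rock"}


def pick_genre(tag_names, known_genres=KNOWN_GENRES):
    """Return the first tag that is a known genre after title-casing, else None."""
    for name in tag_names:
        candidate = name.title()
        if candidate in known_genres:
            return candidate
    return None


print(pick_genre(["hiphop", "hip hop", "hip-hop", "rock"]))  # Hip-Hop
print(pick_genre(["seen live", "german"]))  # None
```

Tags are tried in the order given, so passing them sorted by weight means the most popular matching tag becomes the genre.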
The second practical consideration was which Last.fm tags to include. In Last.fm parlance, each artist tag comes with a weight (values from 0 to 100). Selecting only the tags with a weight of at least 50 worked out fine for me (usually I had 1-5 tags per artist).
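The weight filter can be sketched on its own as follows. The helper name build_grouping and the sample (name, weight) pairs are made up for illustration; real values come from the Last.fm API. The weights are converted with int(), as the script does:

```python
def build_grouping(weighted_tags, min_weight=50):
    """Join the title-cased names of all tags whose weight is at least min_weight."""
    relevant = [name.title() for name, weight in weighted_tags
                if int(weight) >= min_weight]
    return ", ".join(relevant)


# Made-up sample data, not real Last.fm output.
sample_tags = [("electronica", "100"), ("dance", "72"), ("80s", "55"),
               ("german", "50"), ("seen live", "12")]
print(build_grouping(sample_tags))  # Electronica, Dance, 80S, German
```

Note that str.title() also capitalizes a letter that follows a digit, so "80s" comes out as "80S"; whether that matters is a matter of taste.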
A third thing you might want to be aware of: if you programmatically change the tags in an mp3, iTunes will not pick up these changes automatically. A simple way of letting it know is to select the "Get Info" command on these items. This will trigger a reload of the new tag values.
Script
To run the script you will need the Python libraries mutagen and pylast installed. Run it with the option
-d directory_with_mp3s
The script will walk this directory tree and modify all mp3s it finds. You will also need a Last.fm API key; set API_KEY and API_SECRET in the script accordingly.
#!/usr/bin/env python
# encoding: utf-8
"""
tag_groupings.py
Created by Michael Marth on 2009-11-02.
Copyright (c) 2009 marth.software.services. All rights reserved.
"""
import sys
import getopt
import pylast
import os.path
from mutagen.id3 import TCON, ID3, TIT1
help_message = '''
Adds ID3 tags to mp3 files for genre and groupings. Tag values are retrieved from Last.FM. Usage:
-d mp3_directory
'''
class Usage(Exception):
    def __init__(self, msg):
        self.msg = msg
all_genres = TCON.GENRES
genre_cache = {}
groupings_cache = {}
API_KEY = "your key here"
API_SECRET = "your secret here"
network = pylast.get_lastfm_network(api_key = API_KEY, api_secret = API_SECRET)
def artist_to_genre(artist):
    # Returns the artist's first Last.fm top tag that is a known ID3 genre, or None.
    if artist in genre_cache:
        return genre_cache[artist]
    else:
        tags = network.get_artist(artist).get_top_tags()
        for tag in tags:
            # tag[0] is the Last.fm tag object
            genre = tag[0].name.title()
            if genre in all_genres:
                genre_cache[artist] = genre
                print "%20s %s" % (artist, genre)
                return genre
        return None
def artist_to_groupings(artist):
    # Returns a comma-separated string of the artist's Last.fm tags with weight >= 50.
    if artist in groupings_cache:
        return groupings_cache[artist]
    else:
        tags = network.get_artist(artist).get_top_tags()
        relevant_tags = []
        for tag in tags:
            # tag[1] is the tag's weight
            if int(tag[1]) >= 50:
                relevant_tags.append(tag[0].name.title())
        groupings = ", ".join(relevant_tags)
        groupings_cache[artist] = groupings
        print "%20s %s" % (artist, groupings)
        return groupings
def walk_mp3s():
    # Walks the current directory tree and retags every mp3 file in it.
    for root, dirs, files in os.walk('.'):
        for name in files:
            if name.endswith(".mp3"):
                audio = ID3(os.path.join(root, name))
                artist = audio["TPE1"]
                genre = artist_to_genre(artist[0])
                grouping = artist_to_groupings(artist[0])
                # encoding=3 is UTF-8
                if genre is not None:
                    audio["TCON"] = TCON(encoding=3, text=genre)
                if grouping is not None:
                    audio["TIT1"] = TIT1(encoding=3, text=grouping)
                audio.save()
def main(argv=None):
    if argv is None:
        argv = sys.argv
    try:
        try:
            opts, args = getopt.getopt(argv[1:], "ho:vd:", ["help", "output="])
        except getopt.error, msg:
            raise Usage(msg)
        # option processing
        for option, value in opts:
            if option == "-v":
                verbose = True
            if option in ("-h", "--help"):
                raise Usage(help_message)
            if option in ("-o", "--output"):
                output = value
            if option == "-d":
                try:
                    os.chdir(value)
                except Exception, e:
                    print "error with directory " + value
                    print e
                    # stop here so that the wrong directory is not retagged
                    return 1
        walk_mp3s()
    except Usage, err:
        print >> sys.stderr, sys.argv[0].split("/")[-1] + ": " + str(err.msg)
        print >> sys.stderr, "\t for help use --help"
        return 2

if __name__ == "__main__":
    sys.exit(main())
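
The script changes into the target directory and then walks '.'. One possible alternative, sketched here with a hypothetical helper named find_mp3s, is to pass the directory to os.walk directly and match the file extension case-insensitively. This is only a sketch of the directory walk and does not touch any tags:

```python
import os


def find_mp3s(directory):
    """Yield the path of every .mp3 file below directory (case-insensitive match)."""
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            if name.lower().endswith(".mp3"):
                yield os.path.join(root, name)
```

A loop such as "for path in find_mp3s(some_directory)" could then replace the two nested loops in walk_mp3s, with each path handed to ID3() as before.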

