Rage of the machine: An AI* makes metal music

Introducing the latest album by the versatile metal genius DeepSlayerXL, with song reviews by GPT-3. *Apologies to ML researchers for saying “AI” in the title; the article was written for a general audience.

As a teenager, I played in a small-town metal band. This was the nineties so our music was heavily influenced by the nu-metal movement. In other words, our songs sounded like a wild mix of Korn, System of a Down, Incubus, and other popular bands from that era.

I’ve long been asking myself whether our songwriting process was all that different from how modern generative language models work. After all, we were only “riffing” on what we had learned during musical training (pre-training) and by listening to our favorite bands (fine-tuning). Sure, we introduced some variation (random sampling with varying temperature depending on the amount of alcohol and psychedelics involved), but there was nothing about this process that contemporary language models couldn't achieve as well.

Automated music generation has a rich history, with much of the recent work focusing on auto-regressive models using transformers. There are literally hundreds of research papers and even an open-source framework for music generation. However, during my initial research, I couldn’t find anything quite up to the task: most research focuses on piano music, which isn’t what ear-shattering metal tracks are usually made of. Andrew Shaw’s musicautobot looked pretty close, though. I’d have to do some coding myself, but nothing too difficult, as most of the building blocks already exist.

So, I rolled up my sleeves and scraped 3,604 MIDI songs from a Russian MIDI website (not linking it here, but if you were to, say, type “Russian MIDI website” into Google, who knows what you might find). Metallica, Nine Inch Nails, Korn, Radiohead… you name it. After transposing each song a couple of half steps in either direction, I ended up with 18k full MIDI songs for training my model. I also wrote a MIDI tokenizer with support for multiple instruments and percussion, borrowing the notation from musicautobot and the BAR event from Pop Music Transformer.
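For the curious, the transposition step can be sketched in a few lines of Python. This is only an illustration, not the code I actually ran: the `Note` type, the function names, and the shift range of two half steps in either direction are all assumptions made for the example. Drum hits are left alone because in MIDI percussion the pitch number selects the drum sound rather than a note.

```python
from typing import List, NamedTuple


class Note(NamedTuple):
    """Hypothetical note record, invented for this sketch."""
    start: int     # onset, in 16th-note steps
    duration: int  # length, in 16th-note steps
    pitch: int     # MIDI pitch number, 0-127
    is_drum: bool  # True for percussion hits


def transpose(song: List[Note], semitones: int) -> List[Note]:
    """Shift every pitched note by `semitones`; keep drum hits as they are."""
    shifted = []
    for note in song:
        if note.is_drum:
            shifted.append(note)
            continue
        new_pitch = note.pitch + semitones
        if not 0 <= new_pitch <= 127:
            raise ValueError(f"pitch {new_pitch} is outside the MIDI range")
        shifted.append(note._replace(pitch=new_pitch))
    return shifted


def augment(song: List[Note], max_shift: int = 2) -> List[List[Note]]:
    """Return one copy of the song per shift in [-max_shift, +max_shift]."""
    return [transpose(song, s) for s in range(-max_shift, max_shift + 1)]


# One low guitar note and one kick drum hit.
song = [Note(0, 4, 40, False), Note(0, 1, 36, True)]
variants = augment(song)
print(len(variants))  # number of versions per song, including the original
```

With a shift range of -2 to +2 this gives five versions per song, and 3,604 × 5 = 18,020, which would be consistent with the 18k figure above.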

Tokenization: Each note is represented by two tokens: instrument-duration and pitch. Drum hits are represented by DRUM[NUMBER] tokens. 16th time-steps are separated by “|”.
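To make the scheme a bit more concrete, here is a rough sketch of what such a tokenizer could look like. The exact token spellings (`GUITAR-4`, `P40`, `DRUM36`), the event tuple layout, and the `tokenize` function are made up for this example and are not the real vocabulary; the BAR event is left out to keep it short.

```python
from collections import defaultdict
from typing import List, Tuple

# Hypothetical event layout: (step, instrument, duration, pitch),
# with step and duration counted in 16th notes.
Event = Tuple[int, str, int, int]


def tokenize(events: List[Event], n_steps: int) -> List[str]:
    """Turn note events into a flat token list, one "|" per 16th step."""
    by_step = defaultdict(list)
    for step, instrument, duration, pitch in events:
        if instrument == "DRUM":
            # A drum hit is a single token carrying the drum number.
            by_step[step].append(f"DRUM{pitch}")
        else:
            # A pitched note is two tokens: instrument-duration, then pitch.
            by_step[step].extend([f"{instrument}-{duration}", f"P{pitch}"])

    tokens: List[str] = []
    for step in range(n_steps):
        tokens.extend(by_step[step])
        tokens.append("|")  # close the current 16th time-step
    return tokens


events = [
    (0, "GUITAR", 4, 40),  # quarter-note guitar E on the first step
    (0, "DRUM", 1, 36),    # kick drum on the first step
    (2, "DRUM", 1, 38),    # snare two steps later
]
tokens = tokenize(events, n_steps=4)
print(" ".join(tokens))
# GUITAR-4 P40 DRUM36 | | DRUM38 | |
```

Empty time-steps still emit a separator, so the model can count 16ths simply by counting “|” tokens.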

Using the augmented and tokenized data, I trained a TransformerXL model courtesy of HuggingFace for 200 hours on 4 Nvidia T4 GPUs. Then I ran inference for 24 hours to generate a large amount of MIDI data.
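Generation with an auto-regressive model boils down to repeatedly picking the next token from the model's output scores. The sketch below shows temperature sampling, the general technique behind the "varying temperature" joke earlier. It is a generic illustration in plain Python, not the inference code or the settings I used; `sample_with_temperature` and the toy logits are invented for the example.

```python
import math
import random
from typing import List, Optional


def sample_with_temperature(
    logits: List[float],
    temperature: float = 1.0,
    rng: Optional[random.Random] = None,
) -> int:
    """Sample a token index from softmax(logits / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()
    scaled = [score / temperature for score in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    # random.choices normalizes the weights for us.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


toy_logits = [0.0, 5.0, 1.0]  # pretend scores for a three-token vocabulary
rng = random.Random(0)

# Low temperature: almost always the highest-scoring token.
sober = [sample_with_temperature(toy_logits, 0.01, rng) for _ in range(20)]

# High temperature: the choice spreads across the vocabulary.
wild = [sample_with_temperature(toy_logits, 10.0, rng) for _ in range(20)]
```

Low temperatures make the output safer and more repetitive, while high temperatures produce more surprising (and more often broken) music.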

Debut Album

The results were pretty neat. Sure, DeepSlayerXL would sometimes reproduce the training data (especially in the first couple of measures) and it generated a lot of generic chord progressions, but the same could be said for the music of the aforementioned Austrian band. DeepSlayerXL also didn’t care much about song structure or maintaining a consistent time signature. But there were some gems to be found in its work, which spanned several genres, including grunge, industrial, punk, death metal, and some purely electronic tracks. DeepSlayerXL certainly put its own spin on things.

I merged some pieces taken from DeepSlayerXL’s raw MIDI output into songs and converted them into audio using GarageBand. The results are below. If you want to give it a shot yourself, you can download the PyTorch model or use the Colab notebook.

Here’s a selection of DeepSlayerXL tracks, along with reviews of each track by GPT-3 via AI Dungeon (no, GPT-3 didn’t actually listen to the tracks, I only gave it the titles :))

GPT-3’s take: The song is about death and murder, with several mentions of an “angel of death” or “death angel”. Death angel is a term used to describe an evil spirit that has taken control over another person’s body and will do anything it can to take over their life. This spirit usually manifests itself as a black, winged being with red eyes or wings. It is known to create bad dreams and feed off fear and evil thoughts. It is not known if this song is about such a being or not.

This song gets (2.5) stars out of (5). It is too slow and the vocals are hard to hear. The bass guitar is almost inaudible.

GPT-3’s take: DeepSlayer’s music is a little different. It has an old-school feel to it, but with modern production techniques and instruments. This song is definitely not your average pop punk tune.

This song is about a guy who thinks he’s so cool because he doesn’t drink or do drugs. He actually isn’t that smart, in fact he’s just a fucking loser. This song is a perfect display of how music has progressed in the last 20 years. It’s not just teenagers whining about how life is so hard. It’s a piece of art.

Score: 5 stars

GPT-3’s take: This is a good song, but it doesn’t really have anything to do with pumpkin pie. It has some of that old school ’80s grunge sound to it, and I’m sure you can hear traces of that in this song. But there are other elements here that make it interesting. For one thing, the lyrics are pretty clever. They’re about how the world is getting more dangerous every day and we need to be prepared for what might come next. It’s a very deep concept, and you can really tell that the artist is a creative genius.

Score: 6/10

GPT-3’s take: Dark Meter is dark, but not in an oppressive way. It’s actually quite beautiful and eerie at times. I can’t say that it sounds like anything else on this album. It reminds me of some of the music that I used to listen to on my old cassette player when I was younger. It has more in common with the music of Pink Floyd or even some of the early Talking Heads. But then that’s not necessarily a bad thing. This is an interesting take on a familiar genre.

I give it a score of 10/10

GPT-3’s take: The song is a bit of an odd one, but it’s definitely catchy and well-written. It also has some nice guitar work that keeps you interested throughout. The lyrics are about a band of rebels who overthrow an oppressive government. It’s not exactly what you’d expect from DeepSlayer, but it works.

Score: 3/5

GPT-3’s take: DeepSlayer’s music has always been a bit more than just ambient. It is an amalgamation of many different styles, and this track in particular is a fusion between industrial noise and ambient soundscapes. It is at times almost like you are listening to a mixtape that was mixed by someone who had no idea what they were doing but somehow got it all right. There are moments of harsh noise that then flow seamlessly into a more relaxed and melodic sound. It is an interesting change of pace.

Score: 3.5/5

TL;DR

I trained a polyphonic multi-track music transformer with 3,600 metal/grunge/punk MIDI songs. Try creating your own songs and don’t forget to link them in the comments.


Security researcher, uncertified hacker, crypto trader. Author of the OWASP Mobile Security Guide. Pwnie winner. Former @ConsenSys engineer / auditor.
