Creating a Willie Nelson Inspired Song with textgenrnn and spaCy

 

For this project I wanted to try my hand at text generation. As a Willie Nelson fan, I hoped to use it to write a brand new Willie Nelson song based on all of his song lyrics. The lyrics were easy to obtain through the Genius API, and the text generation was performed with textgenrnn and spaCy. Here’s a link to the notebook on GitHub.


I originally started this project trying to write a new Bob Marley inspired song. The results weren’t great for a few reasons: (1) Bob Marley tends to repeat himself a lot, what the Jamaicans would call “chanting,” (2) Bob Marley spoke Jamaican patois, which is a Jamaican Creole language, and the neural net was trained on English web content, and (3) Bob Marley didn’t have that many songs or a long music career. Here’s what I got…


Not too bad, but I wanted to see if I could do better. So that led me to one of my favorite artists, Willie Nelson. Not only does he have an impressive catalogue of songs, but he is also an accomplished songwriter. 

The easiest way for me to obtain his lyrics was through the Genius API. Genius is a music website that started off with just fun facts about songs and, through crowdsourcing, became the biggest lyrical database on the web. For Python there is a package called lyricsgenius that makes the entire process easy (I’ll post the link at the bottom of this post). I really want to stress that I have used a lot of Python packages and this was by far the easiest one. From installation to accessing the lyrics, it was just so simple. Willie has almost 1000 songs on the platform! After a quick review I decided to remove the live songs, and ended up with about 850 unique songs with lyrics for each, far more than the 100 Bob Marley songs.
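As a rough sketch of that step (not the notebook’s actual code), the download and the clean-up could look something like the following. The function names are my own for illustration, and the live-song filter assumes live recordings are marked in the title with a tag like "(Live)", which may not match the manual review described above. You would also need your own Genius API access token.

```python
import re

# Assumption: live recordings carry a bracketed tag in the title,
# e.g. "(Live)" or "[Live at ...]". Titles that merely contain the
# word "live" outside brackets are left alone.
LIVE_TAG = re.compile(r"[\(\[][^\)\]]*\blive\b[^\)\]]*[\)\]]", re.IGNORECASE)


def fetch_songs(access_token, artist_name, max_songs=None):
    """Download (title, lyrics) pairs for one artist from Genius."""
    # Imported here so the helper below can be used without the package.
    import lyricsgenius

    genius = lyricsgenius.Genius(access_token)
    artist = genius.search_artist(artist_name, max_songs=max_songs)
    if artist is None:  # nothing found for that name
        return []
    return [(song.title, song.lyrics) for song in artist.songs]


def drop_live_and_duplicates(songs):
    """Keep the first copy of each title and skip tagged live versions."""
    seen = set()
    kept = []
    for title, lyrics in songs:
        key = title.strip().lower()
        if LIVE_TAG.search(title) or key in seen:
            continue
        seen.add(key)
        kept.append((title, lyrics))
    return kept
```

Calling fetch_songs with a token and "Willie Nelson", then passing the result to drop_live_and_duplicates, would give a list of unique studio titles with their lyrics.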


Next I had to process all of the lyrics using some beautiful NLP functions written by another data scientist who was hoping to write a new Beatles song… <Link to his GitHub here>. These functions remove non-alphanumeric characters and break each song down into lines of lyrics, which prepares the lyrics for the spaCy NLP package to process further. The spaCy package breaks down and tags each word.

“General-purpose pretrained models to predict named entities, part-of-speech tags and syntactic dependencies”
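The clean-up that happens before the lyrics reach spaCy might look something like this. It is a minimal sketch and not the borrowed functions themselves: the names clean_line and split_into_lines are hypothetical, and I am assuming that apostrophes and the square brackets around section tags such as [Chorus] are kept while other punctuation is dropped.

```python
import re

# Assumption: keep letters, digits, whitespace, apostrophes, and the
# square brackets used by section tags like [Verse 1]; drop the rest.
DISALLOWED = re.compile(r"[^A-Za-z0-9\s'\[\]]")


def clean_line(line):
    """Strip unwanted characters and collapse runs of whitespace."""
    line = line.replace("\u2019", "'")  # curly apostrophe to straight
    line = DISALLOWED.sub("", line)
    return re.sub(r"\s+", " ", line).strip()


def split_into_lines(lyrics):
    """Break one song into cleaned, non-empty lines of lyrics."""
    cleaned = (clean_line(raw) for raw in lyrics.splitlines())
    return [line for line in cleaned if line]
```

Each song then becomes a plain list of short strings, which is a convenient shape to hand to an NLP pipeline one line at a time.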

spaCy supports a handful of languages and offers a few different model options, such as whether the model was trained on web text or just on news, in sizes of large, medium, or small. For Willie Nelson, I opted for the small model to save processing time, and because Willie Nelson uses a lot of small words to express big ideas. I ran the model using Google Colab’s free GPU and it still took just under 6 hours!
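To give a feel for what the tagging looks like, here is a small sketch. I am assuming the small English web model, en_core_web_sm, which has to be downloaded separately, and the helper names are made up for this example, so the notebook may well do this differently.

```python
def load_pipeline(model_name="en_core_web_sm"):
    """Load a pretrained spaCy pipeline.

    Assumption: the model was installed beforehand with
    `python -m spacy download en_core_web_sm`.
    """
    import spacy  # imported here so tag_lines can be used on its own

    return spacy.load(model_name)


def tag_lines(nlp, lines):
    """Return a list of (word, part-of-speech) pairs for each line."""
    return [[(token.text, token.pos_) for token in nlp(line)] for line in lines]
```

With nlp = load_pipeline(), calling tag_lines(nlp, lines) on the cleaned lines gives each word alongside its part-of-speech tag.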

Once that was done, and I had made sure to save the NLP file, it was time to move the data over to textgenrnn, a text-generating neural network that can train on a GPU and run on a CPU. It’s easy to set up and tune as well. On the GPU, training took only 20 minutes or so. After it runs, it leaves you with a few different versions of text, using what it calls “Temperature.” The Temperature controls how creative the algorithm is when generating new words and sentences. For instance, a Temperature of 1.0 is very creative and will generate some gibberish, while a Temperature of 0.5 will generate coherent words closer to what it was trained on.
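If you are curious what Temperature does under the hood, here is a tiny standalone sketch of temperature sampling in general. It is not textgenrnn’s own code, and the function names are mine. The idea is to rescale the model’s predicted probabilities before picking the next character or word: a low Temperature exaggerates the favorite, while a Temperature of 1.0 leaves the probabilities as they are.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Rescale a probability distribution; lower values sharpen it."""
    # Work in log space; the tiny floor avoids log(0) for zero entries.
    logits = [math.log(max(p, 1e-12)) / temperature for p in probs]
    peak = max(logits)  # subtract the peak for numerical stability
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def sample_index(probs, temperature, rng=random):
    """Pick one index at random according to the rescaled distribution."""
    scaled = apply_temperature(probs, temperature)
    return rng.choices(range(len(scaled)), weights=scaled, k=1)[0]
```

For example, with predicted probabilities of 0.6, 0.3, and 0.1, a Temperature of 0.5 raises the top choice to about 0.78, so the safe option wins more often and the long shots (where the gibberish tends to come from) are picked less.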

A New Willie Nelson Song

So here are the new Willie Nelson inspired songs written entirely by a computer!


Conclusion

Not too bad! I especially loved the ending. These lyrics were generated with a low Temperature, meaning they stay very close to the text the model was trained on. You can see that many of the lyrics are just cut-up bits of his actual song lyrics. That was the price I had to pay for not having any gibberish words. I will say that it is very poetic and at times it even rhymes, which is impressive. You can see that I decided to leave in the [Verse 1], [Chorus 1], and [Instrumental Break] separators, since I wanted to see how textgenrnn would handle them, and for this Temperature I think it did a decent job of splitting the song up. And what a long song it is. I’m not sure any of Willie Nelson’s songs are quite this long. Overall I was really impressed with the combination of these algorithms. In the future I would like to try the same thing with GPT-2 and compare the results. If you’re interested in seeing more lyrics (even the silly ones), please follow the links to my GitHub page, where I’ve posted them all as PDFs.

Now, where did I put my guitar?

Get in touch at: mr.sam.tritto@gmail.com