Mapping Hip-Hop: A Lyrical Analysis

Cartography is fascinating because it finds applications in so many areas and takes so many different forms. In this article we are going to talk about hip-hop. Yes, this is a topic that can be mapped, and we will see exactly how. The Pudding, a weekly journal of visual essays, has built a map of hip-hop based on an analysis of each artist's lyrics.


How was the analysis created?

As with any map, this one relies on data. Its objective is to better understand the links between words and music, in this case hip-hop music. To do so, The Pudding used 26 million words from the lyrics of the top 500 charting artists on Billboard’s Rap Chart, which represents about 50,000 songs.

First, the most common words in hip-hop lyrics were ranked: words such as “game” or “love” were identified as recurring. More interesting, though, is comparing how often they are used across musical styles. To do so, more than 47 million words from all musical genres other than hip-hop were used for the comparison (Figure 1).

Figure 1: Usage of hip-hop words


We then see that the word “love” is even more common in other styles of music: 71 occurrences per 10,000 words against 21 in hip-hop songs. By contrast, “game” appears to be specific to hip-hop according to the graph: the further to the right a word sits, in the “blue triangle”, the more it is used in hip-hop rather than in other genres. Finally, some words are closely linked to a single artist, and are sometimes even coined by that artist. These are often derived from slang, such as “swag”, “bro” or “beef”.
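The comparison above boils down to counting how often a word occurs per 10,000 words in each corpus. Here is a minimal sketch in Python, assuming two tiny invented corpora; the names `rate_per_10k` and `hip_hop_share` are hypothetical helpers and do not come from The Pudding's code.

```python
import re
from collections import Counter


def rate_per_10k(text: str) -> dict[str, float]:
    """Occurrences of each word per 10,000 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words)
    return {word: 10_000 * n / total for word, n in counts.items()}


# Toy corpora, invented for illustration only.
hip_hop_rates = rate_per_10k("game love game money game street")
other_rates = rate_per_10k("love love heart love night baby")


def hip_hop_share(word: str) -> float:
    """Share of a word's combined rate that comes from the hip-hop corpus.

    1.0 means the word only appears in hip-hop, 0.0 only elsewhere.
    """
    hh = hip_hop_rates.get(word, 0.0)
    ot = other_rates.get(word, 0.0)
    return hh / (hh + ot) if hh + ot else 0.0


print(hip_hop_share("game"), hip_hop_share("love"))
```

With the figures quoted in the article for “love” (21 against 71 occurrences per 10,000 words), the same ratio would be 21 / (21 + 71), or roughly 0.23, which places the word on the non-hip-hop side.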

Now that we have understood the logic, it has to be applied to each artist covered by the analysis. This lets us differentiate styles within the genre itself, which in turn allows us to create a map that groups artists according to their lyrics.

To achieve this, the question to ask is “what makes a word central to an artist?” We have to characterize each artist by their lyrics. For example, N.W.A (Niggaz Wit Attitudes), a very famous rap group of the late 1980s and early 1990s, often says “police”. Indeed, 37% of their tracks contain this word against 5% for rap in general, still according to the Billboard list. But if we look at the other artists on the list, we can see that everyone uses it: N.W.A just uses it more often. To define a lyrical style, we have to find a more specific word. “Compton” is perfect for this, because 75% of the artists on the list never say it, while a handful of rappers use it in at least 30% of their tracks. In this case the lyrical style is closely tied to a geographic affiliation: Compton is a Californian city in Los Angeles County. This is often what characterizes a style in rap.
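The two quantities used in this reasoning, the share of an artist's tracks that contain a word and the share of artists who never say it, can be sketched as follows. The artists and lyrics below are invented placeholders, and `track_share` and `never_say_share` are hypothetical helper names.

```python
def track_share(tracks: list[str], word: str) -> float:
    """Fraction of tracks whose lyrics contain `word` at least once."""
    if not tracks:
        return 0.0
    hits = sum(1 for lyrics in tracks if word in lyrics.lower().split())
    return hits / len(tracks)


def never_say_share(artists: dict[str, list[str]], word: str) -> float:
    """Fraction of artists with no track containing `word`."""
    silent = sum(1 for tracks in artists.values() if track_share(tracks, word) == 0)
    return silent / len(artists)


# Invented toy data: one string of lyrics per track.
artists = {
    "artist_a": ["police on the block", "compton nights", "police again"],
    "artist_b": ["love and money", "police sirens"],
    "artist_c": ["party all night", "money talks"],
    "artist_d": ["police lights", "summer days"],
}

print(never_say_share(artists, "police"))   # most artists use it
print(never_say_share(artists, "compton"))  # most artists never say it
```

In this toy data, “police” is used by three artists out of four, so it says little about any one of them, whereas “compton” is used by a single artist and therefore identifies that artist well.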


How was the map created?

So, to create the map, the “central” words have to be identified for each artist. Taking into account what we have seen about the rarity and frequency of words, The Pudding chose to define a central word as:

  • « a word that an artist says more than the genre average (NWA’s use of “police”) »
  • « a rare word in hip hop – if an artist says it, there’s a good chance you know who it was »

Then, a short list of these central words is compiled and ranked for each artist, which makes it possible to compare artists lyrically (Figure 2).
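The article does not give The Pudding's exact scoring formula, so the sketch below is only one way to combine the two criteria. It assumes a score equal to how far an artist's usage exceeds the genre average, multiplied by a logarithmic rarity factor in the spirit of TF-IDF. The function name `central_words` and all the numbers are hypothetical, apart from the 37% and 30% figures borrowed from the text.

```python
import math


def central_words(shares: dict[str, dict[str, float]], artist: str, top_n: int = 3) -> list[str]:
    """Rank an artist's words by (usage above genre average) x (rarity).

    `shares[artist][word]` is the fraction of that artist's tracks containing the word.
    This scoring is an assumption for illustration, not The Pudding's formula.
    """
    n_artists = len(shares)
    vocabulary = {word for words in shares.values() for word in words}
    scores = {}
    for word in vocabulary:
        users = sum(1 for words in shares.values() if words.get(word, 0.0) > 0)
        if users == 0:
            continue
        genre_average = sum(words.get(word, 0.0) for words in shares.values()) / n_artists
        excess = max(shares[artist].get(word, 0.0) - genre_average, 0.0)
        rarity = math.log(n_artists / users)  # 0 when every artist uses the word
        scores[word] = excess * rarity
    ranked = sorted(scores, key=lambda word: (-scores[word], word))
    return [word for word in ranked if scores[word] > 0][:top_n]


# Invented toy data.
shares = {
    "artist_a": {"police": 0.37, "compton": 0.30, "love": 0.10},
    "artist_b": {"police": 0.05, "love": 0.20},
    "artist_c": {"police": 0.04, "love": 0.25},
    "artist_d": {"police": 0.06, "love": 0.15},
}

print(central_words(shares, "artist_a"))
```

Under this assumed scoring, “police” drops out for `artist_a` even though it is used far more than average, because every artist uses it; only “compton” remains, which mirrors the reasoning in the text.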

Figure 2: A few examples of central word lists


Next, a mathematical measure called cosine similarity assigns each pair of artists a value reflecting how close their word lists are. A computational technique named t-SNE then considers all the relationships between the artists and tries to position each of them as faithfully as possible. The result is a map (Figure 3).
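Cosine similarity itself is short enough to write out. The sketch below assumes each artist is represented as a dictionary mapping central words to weights; this representation and the sample weights are illustrative assumptions, and the t-SNE positioning step is not sketched here.

```python
import math


def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine of the angle between two sparse word-weight vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented weights for three hypothetical artists.
artist_x = {"compton": 1.0, "police": 1.0}
artist_y = {"compton": 1.0}
artist_z = {"love": 1.0}

print(cosine_similarity(artist_x, artist_y))  # shares one word with artist_x
print(cosine_similarity(artist_x, artist_z))  # no word in common
```

With non-negative weights, the value ranges from 0 (no central word in common) to 1 (identical word profiles), whatever the length of each list.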

Figure 3: The lyrical map of hip-hop artists


As the journal notes, the technique is not perfect: when a rapper is identified as being most similar to another rapper, the reverse is not necessarily true. For example, Schoolboy Q is most similar to Kendrick Lamar, but Kendrick Lamar is most similar to The Game! Still, the map can convey an artistic resemblance that an attribute table cannot. Moreover, some “clusters” can be spotted, such as the one around the Wu-Tang Clan (a rap group), which reveals a very similar lexicon shared by many artists and rooted in Wu-Tang Clan lyrics (Figure 4).
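This asymmetry appears even when the similarity values themselves are symmetric, because "most similar" is a nearest-neighbour relation. Here is a small sketch with invented similarity values for three hypothetical artists (`most_similar` is an illustrative helper, and the numbers are not taken from The Pudding's data):

```python
def most_similar(similarity: dict[str, dict[str, float]], artist: str) -> str:
    """Return the other artist with the highest similarity to `artist`."""
    others = {name: value for name, value in similarity[artist].items() if name != artist}
    return max(others, key=others.get)


# Symmetric toy similarities: sim[a][b] == sim[b][a].
similarity = {
    "artist_x": {"artist_y": 0.8, "artist_z": 0.3},
    "artist_y": {"artist_x": 0.8, "artist_z": 0.9},
    "artist_z": {"artist_x": 0.3, "artist_y": 0.9},
}

print(most_similar(similarity, "artist_x"))  # artist_y
print(most_similar(similarity, "artist_y"))  # artist_z, not artist_x
```

Here `artist_y` is the closest match for `artist_x`, yet `artist_y` has an even closer match in `artist_z`, which is the same pattern the article describes for Schoolboy Q, Kendrick Lamar and The Game.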

Figure 4: Illustration of a “cluster”: Wu-Tang Clan


To Conclude

This process highlights the specificities of language and lyricism that make hip-hop so rich. It is also an original way to make a map, and to take an interest in a topic that may seem very far from cartography. Last but not least, if you are a hip-hop fan, it can help you discover other artists whose lyrical style resembles one you already like, and understand a few lyrical styles within rap.

To explore the geography of hip-hop even further, there is another map, a story map, which lets you view and listen to the different styles according to the geographical affiliation of artists in the USA.



Sources: