Exploring Google

Google has its fingers in many pies but it is best known for its search engine. It’s a testament to the success of any company when its name becomes a verb and, just like Velcro, Superglue, Rollerblade, and Hoover, Google has infiltrated our vernacular. To google now means to search for information about something on the internet. Unlike those companies, however, a search engine by its very nature has us feeding information back to Google, and that feedback loop entrenches it further in society. Whereas looking at data from Velcro (a wee factlet – it’s a portmanteau of the French words velours and crochet) won’t tell us much about how we live our lives, examining how we use Google can deliver much more insight.

An Irish Mass

Google has a feature which allows you to explore how the world’s most popular search engine is used. It is called Google Trends and it can show you how frequently a particular term is searched across the world. The results can be broken down by country and by language, and you can explore the data for different time periods, too. The records go all the way back to 2004, so there is no shortage of data.

For example, the graph below shows interest in the search term mass in Ireland over the past five years, and it can tell you something about the society. Firstly, notice that there are five large peaks, which correspond to Christmas of each year, when mass attendance is at an annual high. Another five smaller peaks mark each Easter, another big day out in the Roman Catholic calendar. The diligent reader might be able to pick out the Saint Patrick’s Day of each year, too.

Figure: Analysis of the search term mass in Ireland over the past five years using Google Trends. The horizontal axis represents time and the vertical axis shows how often the term is searched, relative to the maximum number of searches.

The five places in Ireland where this search term was most popular were (in descending order) Maynooth, Galway, Castlebar, Tralee, and Kilkenny, with nowhere in Dublin within the top ten locations. Either people in Dublin don’t need to Google mass at Christmas and Easter as they already know what it is, or else they’re less bothered about it in the first place.

A little more investigation shows that mass is a distinctly Irish term. Using Google Trends, you will find that the search-term lacks the same temporal structure in other countries, including the United Kingdom.

Cheese Versus Kale

The most useful aspect of Google Trends is that it allows the user to compare the volume of searches for two or more terms. You can also export the raw data and play around with them if you want to do your own analysis. As an example, I explored the interest in the terms cheese and kale for the last five years in the United Kingdom (the data were too noisy for Ireland) and downloaded the data.

I imported the raw numbers into Python and messed around with them. First, I averaged the five years of data to reduce the noise. Cheese was a much more popular term than kale and it was difficult to see them on the same plot, so I normalised each series with respect to its own highest and lowest values (sometimes called min-max normalisation). I then shifted the data so that they were centred around the current time (January), which is also the period of most activity for both terms. I plotted the data and used the Seaborn library with Matplotlib to make it look pretty.
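The three steps described above (averaging the years, min-max normalisation, and centring on January) can be sketched in a few lines of Python. This is a minimal sketch rather than the script behind the plot: the function names, the 52-week year, and the toy numbers are all assumptions made for illustration, and it sticks to the standard library so that it runs anywhere.

```python
from statistics import mean

WEEKS_PER_YEAR = 52  # assumption: each year is trimmed to 52 weekly values


def average_years(weekly, weeks_per_year=WEEKS_PER_YEAR):
    """Average the same week across every year to get one 'typical' year."""
    n_years = len(weekly) // weeks_per_year
    years = [
        weekly[i * weeks_per_year:(i + 1) * weeks_per_year]
        for i in range(n_years)
    ]
    return [mean(week_values) for week_values in zip(*years)]


def min_max_normalise(values):
    """Rescale values so the lowest maps to 0 and the highest maps to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series has no range to rescale, so return all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def centre_on_january(values):
    """Rotate a January-to-December series so its first week sits mid-plot."""
    half = len(values) // 2
    return values[-half:] + values[:-half]


# Made-up numbers for illustration only: two "years" of four "weeks" each.
toy = [10, 2, 4, 6, 14, 2, 8, 6]
typical = average_years(toy, weeks_per_year=4)  # [12, 2, 6, 6]
scaled = min_max_normalise(typical)             # [1.0, 0.0, 0.4, 0.4]
centred = centre_on_january(scaled)             # [0.4, 0.4, 1.0, 0.0]
```

Because each term is rescaled against its own minimum and maximum, the result shows when each term peaks but says nothing about which term is searched more often.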

Figure: The weekly search interest in the terms cheese and kale in the United Kingdom, averaged over the past five years. The data were collected using Google Trends and exported as a CSV file.

The above plot is what I ended up with and a few points are immediately apparent. Firstly, cheese hits its peak at Christmas time, with interest building steadily from the middle of November. Unsurprisingly, kale is at a maximum in the first week of January, as people try to get fit (Googling vegetables is a good start).

Where cheese is at a high point at Christmas, kale is at its lowest point. Again, this makes sense. While interest in cheese builds slowly in the month beforehand, kale rockets up from its lowest point to its highest point in a week as people move from a mindset of indulgence at Christmas to fitness in January. Cheese also drops dramatically moving from December into January.

It is also worth noting that kale shows another peak in October/November, when the vegetable is in season. Interest in both cheese and kale is low in June and July, and this fits with our intuition.

The Year in Review

With 2017 coming to an end, you can also explore the year in general. Hurricane Irma was the most searched term globally, followed by iPhone (an ever-popular term since its inception). The lists of most-searched people and actors are marred by the Hollywood sex scandal rather than filled with those who brought new life to the art, which says a lot. Bitcoin also gets a mention, as do the various terrorist attacks. Amidst these popular search results, which make sense, there are also some less predictable ones – for example, the most searched how-to term in the world was how to make slime.

Conclusion

The big caveat is that there are many biases in these data, none of which I’ve gone into here. (For example, Google is more popular in certain countries and with certain demographics.) However, in broad strokes, the way in which we use Google can give us a glimpse of how society functions, and I think there are at least three components to this.

Firstly, society evolves a little each year. The Hollywood scandal finally coming to light and new technology such as the iPhone being developed are examples of this.

Beneath this, we are undeniably creatures of annual habit. Our interest in kale follows a predictable pattern – one which is less based on when the vegetable is in season and more related to when we decided a new year starts. If the Julian calendar (and the Gregorian calendar, the slight modification of it that we use today) had a new year beginning in March, our interest in kale in January would be non-existent.

We saw cheese hit its high at Christmas, but it wasn’t until the year 336 AD that the Roman Empire settled on 25 December to have birthday parties for Jesus. Other suggestions had been 21 March and 20 May and, if they’d gotten the go-ahead, our cheese habits would be shifted in time.

The third component is the noise, which sits below the march of progress and our cyclic nature, reminding us that it isn’t easy to predict and explain everything – that sometimes people just want to know how to make slime.
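For anyone who wants to try separating these three components in their own exported data, one rough option is a classical additive decomposition: a moving average over one full cycle stands in for the slow year-on-year change, the average leftover at each point in the cycle stands in for the annual habit, and whatever remains is the noise. This is a textbook technique swapped in for illustration, not something used for the plots above; the function name `decompose` and the toy series are assumptions.

```python
from statistics import mean


def decompose(series, period):
    """Split a series into trend, seasonal and residual parts (additive)."""
    n = len(series)
    half = period // 2

    # Trend: moving average over one full period, centred on each point.
    # The first and last `half` points have no full window, so stay None.
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2 == 0:
            # Even period: the two end points get half weight each.
            total = sum(window) - 0.5 * (window[0] + window[-1])
        else:
            total = sum(window)
        trend[t] = total / period

    # Seasonal: average detrended value at each position in the cycle,
    # shifted so that the seasonal part sums to zero over one period.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    raw = [mean(b) if b else 0.0 for b in buckets]
    offset = mean(raw)
    seasonal = [raw[t % period] - offset for t in range(n)]

    # Residual ("noise"): whatever trend and season leave unexplained.
    residual = [
        None if trend[t] is None else series[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, residual


# Made-up series: a steady rise plus a spike every fourth step, no noise.
pattern = [3, -1, -1, -1]
toy = [t + pattern[t % 4] for t in range(16)]
trend, seasonal, residual = decompose(toy, period=4)
```

For weekly Google Trends data the period would be roughly 52, and the residual would be where the slime ends up.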