Handling Tags in a SQL Database

This is something that comes up at least once for engineers working with web applications. There are several ways to achieve it, with different advantages and disadvantages.

Disclaimer

This article is focused on storing tags in a SQL database; it does not suggest this is the best or only way to do it. In fact, there are many other options available with systems like Redisthat do a fantastic job of this. However, if your stuck in SQL, feel it’s unnecessary to use another tool or simply have existing solutions similar to those below this is may be helpful as a cheatsheet.

I will aim to use ANSI SQL so it remains as portable as possible and will focus on the implementation and understanding rather than the abusing specific features of different SQL engines that may make it more efficient.

Let’s Get Relational

Most people will recommend that if your using a relational language like SQL you should be storing the data in a relational fashion, makes sense right? If you’ve done this before the follow code should make sense and may even look very familiar:

You may have a third table in the middle that joins unique tags to artists, and it’s not that much of a leap beyond what we have — I will just keep it simple for now.

Tags in an OR relationship; that is to fetch any artist where they have at least one of the following tags is quite simple:

Most people get stuck with the AND relationship; that is where you want the artist that contains all of the tags in a given set. Theres several says to do this — probably the easiest is to add an additional HAVINGclause:

The count(*) = 2 is more flexible than you think. It also allows you to use specify how many of the tags need to match, so you could say any 3 of the 5 tags:

If your tags are not unique (as in these examples) you simply have to specify tags as DISTINCT:

A less desirable way is to use a JOINfor each of the tags. I say less desirable because this can lead to a more complicated SQL query that is tricky to build. However, if you have a lot of tags to intersect and you know at least one tag is very rare the query planner might be able to do this more efficiently than the method above.

Originally published at http://elliot.land on November 26, 2015.

Written by

I’m a data nerd and TDD enthusiast originally from Sydney. Currently working for Uber in New York. My thoughts here are my own. 🤓 elliotchance@gmail.com

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store