SQL Aggregate Functions

This week’s article will be about SQL’s aggregate functions. I totally missed including it last week so I wanted to do a short write-up on it this week instead. Honestly, if you’re familiar with using formulas for tables in Excel, these functions should be second-nature for you. I’ll also be dropping in a few extra keywords to help you get those searches as specific as you need them to be.

As with last week, I’ll be using the same example using a table based on Medium articles. If you want to check that out, you can find it here.

First things first, here’s a look at our schema:

As you can probably figure out, this function will find us the average of a certain column. The basic structure is as follows:

SELECT AVG(column_name) FROM table_name;

Knowing this, it’s easy to write a SQL query that will gather the average amount of claps throughout all of the articles in our table:

SELECT AVG(claps) FROM articles;

But what if we need something more specific? What if I, personally, want to see my average amount of claps from this table? Well then a help WHERE keyword will do the trick:

SELECT AVG(claps) FROM articles WHERE author = "Cody Dupuis";

And that will give me my answer!

The next function is Sum(), which does exactly what you think it does. So if I wanted to write a statement to pull my total amount of claps from every article I’ve written, it would look like this:

SELECT SUM(claps) FROM articles WHERE author = "Cody Dupuis";

And that’s all there is to it. Simple enough, right? Well, since it’s so simple, let’s learn a new keyword: AS. This keyword allows us to give our output an alias. It’s also a simple implementation:

SELECT SUM(claps) FROM articles WHERE author = "Cody Dupuis" AS my_articles;

The only thing this keyword really adds is a different title to our output in the terminal, so instead of seeing SUM(claps) above the actual sum, it should now say my_articles.

Next we have Min() and Max(). These two functions are super simple, similar to Sum(). Let’s imagine I want to see the two articles published as part of The Startup publication that have the lowest amount of claps and the highest amount of claps:

SELECT MIN(claps) FROM articles WHERE publication = "The Startup";
SELECT MAX(claps) FROM articles WHERE publication = "The Startup";

That’s it. These two go hand in hand like the positive and negative sides of a battery. They are very simple and function the exact same way as using the Sum() function.

The final function I’ll be glossing over today is Count(). This function comes last because the way it works is slightly different from the others. This is a boilerplate statement using Count():

SELECT COUNT(column_name) FROM table_name;

This function’s return output is the amount of entries in the table that meet the criteria and are NOT empty values. Any row with a NULL value will be omitted from Count().

Okay, so now I want to use this function to see how many articles in my table were published as part of larger publication:

SELECT COUNT(publication) FROM articles;

This will return the amount of articles in our table that are part of a publication. Since writers can publish independently, anything that is published independently will have a NULL value in the publication column, so it isn’t included in the results.

But what if I want to know how many articles in my table are part of The Startup? I just need to make a minor adjustment to account for this change:

SELECT COUNT(*) FROM articles WHERE publication = "The Startup";

Easy! But I still want more specific info… so now what? Well, we get to use new keywords: AND, OR, and NOT.

I can now see how many articles in my table are part of a publication and specifically what publication, but now I want to see how many articles I have personally published as part of The Startup, and I’ll even alias it as my_startup_articles.

SELECT COUNT(*) FROM articles WHERE publication = "The Startup" AND author = "Cody Dupuis" AS my_startup_articles;

And there you have it! But there are two publications that I want to see now. What if I want to see how many articles are part of The Startup and Nerd For Tech AND have over 50 claps? It’ll be a little long, but it’s not too bad:

SELECT COUNT(*) FROM articles WHERE publication = "The Startup" OR publication = "Nerd For Tech" AND claps > 50;

That should do the trick! But now I want to see how many articles are part of Nerd For Tech that I didn’t write personally. Well, that’s also pretty easy, and in fact there are two different ways to write it:

SELECT COUNT(*) FROM articles WHERE NOT author = "Cody Dupuis" AND publication = "Nerd For Tech";SELECT COUNT(*) FROM articles WHERE publication = "Nerd For Tech" AND NOT author = "Cody Dupuis";

Lastly, all of the keywords we just use can be combined in a single statement to get the data we want. I now want to see the amount of articles that are written by me AND are part of the two publications I listed earlier:

SELECT COUNT(*) FROM articles WHERE author = "Cody Dupuis" AND (publication = "The Startup" OR publication = "Nerd For Tech");

I hope this helped you learn a thing or two about querying your database using SQL and SQL aggregate functions. All of the above keywords also work for generic SELECT statements and aren’t required to follow the aggregate functions. Thanks for reading and happy coding!

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store