There are times we need to group data based on specific values.
SQL offers the GROUP BY clause which can group rows that have similar values into "summary" rows. It returns one row for each group. The interesting part is that each group can have an aggregate function applied to it that operates only on the grouped data.
Imagine that we have a database with songs and albums:
| song_id | title | album_id |
|---|---|---|
| 1 | Crawl | 10 |
| 2 | Oakland | 10 |
| 3 | Bonfire | 11 |
| 4 | Fire Fly | 11 |
| 5 | Heartbeat | 11 |
| 6 | Sober | 12 |
If we want to see how many songs are on each album. We can use a query like this:
SELECT album_id, count(song_id) AS song_count
FROM songs
GROUP BY album_id;
This query retrieves a count of all the songs on each album. One record is returned per album, and they each have their own count:
| album_id | song_count |
|---|---|
| 10 | 2 |
| 11 | 3 |
| 12 | 1 |
Let's get the balance of every user in the transactions table, all in a single query! Use a combination of the sum aggregation and the GROUP BY clause to return a single row for each user with transactions.
The row for each user should contain the user_id and their balance (a sum of the amounts of their successful transactions) called balance.