SQL 101 — CTE (Common Table Expression)

Nadia Selas Noversega
3 min readJun 15, 2022

--

Have you found yourself in a situation where you needed several steps to process data until you get the desired results, but creating a bunch of views and tables was not an option? In situations like this, I believe most of us will use a subquery.

CTE Form

During my early days in data, I would abuse subquery in from clause and then become confused when re-read the query later because my query was chaotic, and in the end, I would re-do my query just to modify things. Just think about it, even the future me cannot understand the past me, who made the query, let alone other people. This would be a serious problem if I ever resign and need to hand over my job.

After reading lots of articles and hours on YouTube, I finally found a better way to do subqueries, which was using a CTE. This might save you and your teammates hours of confusion in understanding a legacy query.

CTE (Common Table Expression)

As cited from sqlshack.com, CTE or Common Table Expression is a temporary named result set that you can reference within a SELECT, INSERT, UPDATE, or DELETE statement. In short, it allows you to create an imaginary table within a query that you can select and process later on in the same query. There are two types of CTE: recursive CTE and non-recursive CTE. In this article, I will focus on non-recursive CTE.

So, what are the advantages of using CTE rather than normal subquery in from clause?

- Better readability

- Reusability of the temporary tables

Performance-wise, I think it is on par with using a normal subquery because CTE basically is also a subquery.

Please check compatibility with your database before using the window function. For MySQL, the window function is available in MySQL 8.

CTE Syntax

CTE is created using a with statement. It is possible to create multiple CTE in one query. I will give an example on using CTE.

I have a student_score table:

I need to get the names of students who ranked 1 & 2 in each class.

Our plans:

  1. Rank each student within their class
  2. Create a filter so only students with ranks 1 &2 appear.

The basic query for CTE is

with [CTE_name] as(your query for CTE)SELECTfrom [CTE_name]

For our case, I will write query as:

with student_rank as (
SELECT
rank() OVER (PARTITION BY class ORDER BY score DESC) rank,
class,
name,
score
FROM student_score
ORDER BY class asc, score desc
)
SELECT
*
from student_rank
where rank in (1,2)

Plan #1:

Within with statement, we created an imaginary table named student_rank to add rank field from student_score table that we had

Plan #2:

Now that we had ranked all students within their class, all we need is to filter the table to get show only #1 and #2. To do this, we select fields from our preprocessed imaginary table student_rank and apply filter where rank in (1,2).

This would be our results:

If your query gets more complex, it is possible to create more than one CTE and use CTE #1 results in CTE#2’s query.

Use Cases

CTEs are helpful in many ways especially when subqueries are needed. In my experience, I used CTE to:

- Create cohort tables (appears in many technical tests)

- Determine rank (lowest, second-lowest, etc) — (appears in many technical tests)

- Category grouping before another grouping

  • I will add more.. (or if you have another common use case please write in comment section below)

Write queries in CTE form and you (and your teammates) will save a lot of time! (learned this the hard way T_T)

P.S: If you have any questions or suggestions, do not hesitate to comment!

--

--