What are Common Table Expressions?
Common Table Expressions (CTEs) are a valuable feature in SQL that lets you create temporary result sets within a query. They simplify complex queries, enhance code readability, and improve query performance. CTEs are initiated using WITH
keyword.
Fig: CTE Syntax. Image from MariaDB
When to use CTEs?
CTEs are particularly useful to:
- Break down complex operations into simpler steps
- Handle hierarchical data structures
- Implement pagination for large result sets
- Streamline complex aggregation tasks
- Have reusable code if you need the same logic at multiple places
- Improve code readability and maintainability if your query involves subqueries, multiple joins, or intricate filtering conditions
Types of CTEs
Broadly CTEs can be classified into:
- Non-recursive (Simple) CTEs
- Recursive CTEs
1. Simple Common Table Expressions
Non-recursive CTEs are straightforward and do not involve self-reference. They are useful for simplifying complex queries, aggregations, and transformations by breaking them into smaller, more manageable steps.
Example: Total Salary by Department
WITH department_salary AS (
SELECT department_id, SUM(salary) AS total_salary
FROM employees
GROUP BY department_id
)
SELECT * FROM department_salary;
Here, the CTE department_salary
calculates the total salary for each department by using the SUM
and GROUP BY
functions. The main query then fetches the results from the CTE.
2. Recursive Table Expressions
Recursive CTEs are used to work with hierarchical or recursive data structures. They allow a query to reference its own output, enabling operations like traversing a tree structure or finding paths in a graph.
Example: Organization Hierarchy
Suppose we have a table named employees
with columns employee_id
, name
, and manager_id
, where manager_id
refers to the employee_id
of the employee's manager.
WITH RECURSIVE org_hierarchy AS (
SELECT employee_id, name, manager_id, 1 AS level
FROM employees
WHERE manager_id IS NULL -- Root level employees (managers)
UNION ALL
SELECT e.employee_id, e.name, e.manager_id, oh.level + 1
FROM employees AS e
JOIN org_hierarchy AS oh ON e.manager_id = oh.employee_id
)
SELECT * FROM org_hierarchy;
In this example, we define a recursive CTE named org_hierarchy
. The initial query retrieves root-level employees (managers) by selecting those with a NULL
manager_id
. The recursive part of the CTE uses the UNION ALL
clause to join the employees
table with the CTE itself, connecting employees to their respective managers using the manager_id
.
The recursive CTE is structured as follows:
- The anchor query selects the root-level employees (managers) and assigns them a level of 1.
- The recursive query selects employees who report to the managers found in the previous iteration, incrementing the level by 1.
- The final query retrieves the entire organizational hierarchy, including employees and their respective levels within the hierarchy.
Yes, recursive CTEs are confusing. I myself struggle a lot with them. It takes a long time to understand when to use them and why. 馃檭
Conclusion
In conclusion, Common Table Expressions (CTEs) are powerful for enhancing the readability, maintainability, and efficiency of complex queries.