Simplify Complex SQL Queries with Common Table Expressions (CTEs)

Karishma Shukla - Aug 21 '23 - - Dev Community

What are Common Table Expressions?

Common Table Expressions (CTEs) are a valuable feature in SQL that lets you create temporary result sets within a query. They simplify complex queries, enhance code readability, and improve query performance. CTEs are initiated using WITH keyword.

CTE Syntax
Fig: CTE Syntax. Image from MariaDB

When to use CTEs?

CTEs are particularly useful to:

  • Break down complex operations into simpler steps
  • Handle hierarchical data structures
  • Implement pagination for large result sets
  • Streamline complex aggregation tasks
  • Have reusable code if you need the same logic at multiple places
  • Improve code readability and maintainability if your query involves subqueries, multiple joins, or intricate filtering conditions

Types of CTEs

Broadly CTEs can be classified into:

  • Non-recursive (Simple) CTEs
  • Recursive CTEs

1. Simple Common Table Expressions

Non-recursive CTEs are straightforward and do not involve self-reference. They are useful for simplifying complex queries, aggregations, and transformations by breaking them into smaller, more manageable steps.

Example: Total Salary by Department

WITH department_salary AS (
  SELECT department_id, SUM(salary) AS total_salary
  FROM employees
  GROUP BY department_id
)
SELECT * FROM department_salary;
Enter fullscreen mode Exit fullscreen mode

Here, the CTE department_salary calculates the total salary for each department by using the SUM and GROUP BY functions. The main query then fetches the results from the CTE.

2. Recursive Table Expressions

Recursive CTEs are used to work with hierarchical or recursive data structures. They allow a query to reference its own output, enabling operations like traversing a tree structure or finding paths in a graph.

Example: Organization Hierarchy

Suppose we have a table named employees with columns employee_id, name, and manager_id, where manager_id refers to the employee_id of the employee's manager.

WITH RECURSIVE org_hierarchy AS (
  SELECT employee_id, name, manager_id, 1 AS level
  FROM employees
  WHERE manager_id IS NULL  -- Root level employees (managers)

  UNION ALL

  SELECT e.employee_id, e.name, e.manager_id, oh.level + 1
  FROM employees AS e
  JOIN org_hierarchy AS oh ON e.manager_id = oh.employee_id
)
SELECT * FROM org_hierarchy;
Enter fullscreen mode Exit fullscreen mode

In this example, we define a recursive CTE named org_hierarchy. The initial query retrieves root-level employees (managers) by selecting those with a NULL manager_id. The recursive part of the CTE uses the UNION ALL clause to join the employees table with the CTE itself, connecting employees to their respective managers using the manager_id.

The recursive CTE is structured as follows:

  • The anchor query selects the root-level employees (managers) and assigns them a level of 1.
  • The recursive query selects employees who report to the managers found in the previous iteration, incrementing the level by 1.
  • The final query retrieves the entire organizational hierarchy, including employees and their respective levels within the hierarchy.

Yes, recursive CTEs are confusing. I myself struggle a lot with them. It takes a long time to understand when to use them and why. 馃檭

Conclusion

In conclusion, Common Table Expressions (CTEs) are powerful for enhancing the readability, maintainability, and efficiency of complex queries.


Find me on GitHub, Twitter

. . . . . . . .