Oracle SQL - Cardinality & Cost
In the context of Oracle SQL, cardinality and cost are important concepts used by the SQL optimizer to evaluate and choose the most efficient execution plan for a query. Here's what each term means:
1. Cardinality:
Definition: Cardinality refers to the estimated number of rows that will be returned by a query or by a specific operation within a query (like a join or filter).
Significance: It helps the optimizer decide how much data is being processed at each step of the query execution. If the cardinality is too high, the optimizer might choose different strategies, like opting for a full table scan instead of using an index.
Example: In a query filtering by a condition like WHERE age = 25, the optimizer estimates how many rows from the table would match this condition. This estimation is based on table statistics (such as histograms and column distributions).
2. Cost:
Definition: Cost is a numerical value that represents the estimated resource consumption (CPU, memory, I/O) required to execute a query. It is derived from the number of operations, the size of data, and other factors involved in the query execution.
Significance: The optimizer uses this cost value to compare different possible execution plans for a query and chooses the one with the lowest cost.
Cost is not always equal to runtime. It represents the estimated relative expense rather than the actual time, but generally, a lower cost implies a faster query execution.
How Cardinality and Cost Work Together:
The optimizer first estimates the cardinality at different steps of the query and then uses these estimates to calculate the total cost for various execution plans.
For example, if a query involves multiple tables, the optimizer will estimate how many rows are returned at each step (joins, filters) and how expensive each join method or operation would be in terms of resource usage.
Practical Example in Oracle SQL:
For a query like:
SELECT * FROM employees WHERE department_id = 10;
Cardinality: The optimizer will estimate how many rows from the employees table have department_id = 10. If the table has 100,000 rows, but only 1,000 rows belong to department 10, the cardinality estimate would be 1,000.
Cost: Based on how the query is executed (e.g., full table scan, index access), the optimizer will calculate the cost. If an index on department_id exists, the cost might be lower than a full table scan.
Importance of Statistics:
Accurate cardinality and cost estimations depend on table statistics. If statistics are outdated or inaccurate, the optimizer's choice of execution plan may be suboptimal.
In summary:
Cardinality = Estimated number of rows affected by the query or operation.
Cost = Estimated resources required to execute the query, based on cardinality and other factors.