Just in CASE

Chidiebere Ogujeiofor - Nov 20 '19 - - Dev Community

I recently implemented a feature that required using the SQL CASE statement. Thus, I decided to write a post on the exciting capabilities of the SQL CASE statement. I would be using sample student result data which you can generate by running the SQL in this gist. Note that PostgreSQL is RDBMS used in this post but the CASE statement is similar in other RDBMS

What is the CASE statement

The CASE statement is the SQL way of creating conditional statements(if-else-then) when querying for column values. Here is the general structure of a CASE statement

CASE 
     WHEN {condition1} THEN {value1}
     WHEN {condition2} THEN {value2}
                  .
                  .
                  .
     WHEN {conditionN} THEN {valueN}
END
Enter fullscreen mode Exit fullscreen mode

The CASE statement comes in handy when:

  • Categorizing data
  • Filtering data
  • Aggregating data
  • Updating data

I would explain each use case in subsequent sections using the student results data

Categorising data

Let's say we have a table results that contains student results in several courses as follows

student_id subject score
1 Arts 49
1 Biology 0
1 Chemistry 0
1 Computer Science 74
1 Economics 73
1 English Language 63
1 History 72
1 Literature 0
1 Mathematics 0
1 Physical Education 88
1 Physics 71
1 Statistics 14
2 Arts 76
2 Biology 78
2 Chemistry 73
2 Commerce 76
2 Computer Science 77
2 Economics 81
2 English Language 0
2 History 71
2 Literature 73
2 Mathematics 66
2 Physical Education 70
2 Physics 73
2 Statistics 91
4 Arts 70
4 Biology 77
4 Chemistry 71
4 Commerce 66
4 Computer Science 52
4 Economics 70
4 English Language 0
4 History 78
4 Literature 77
4 Mathematics 71
4 Physical Education 70
4 Physics 80
. . .
. . .
. . .
. . .

If the pass mark for each course is 60 then we can use the CASE statement to categorize this data by displaying "Passed" when the student passes and "Failed" when the student fails. Here is how it looks like:

SELECT  subject, student_id, score, 
    CASE 
        WHEN score >= 60 THEN 'Passed'
        ELSE 'Failed'
    END AS status
FROM results 

ORDER BY subject, score, status;
Enter fullscreen mode Exit fullscreen mode

This would produce the following results:

subject student_id score status
Arts 12 45 Failed
Arts 21 47 Failed
Arts 1 49 Failed
Arts 20 50 Failed
Arts 23 67 Passed
Arts 16 67 Passed
Arts 6 69 Passed
Arts 17 70 Passed
Arts 11 70 Passed
Arts 4 70 Passed
Arts 22 70 Passed
Arts 19 71 Passed
Arts 15 71 Passed
Arts 8 71 Passed
Arts 5 72 Passed
Arts 7 75 Passed
Arts 2 76 Passed
Arts 18 76 Passed
Arts 13 77 Passed
Arts 10 79 Passed
Arts 14 80 Passed
Arts 9 80 Passed
Arts 24 83 Passed
Biology 23 0 Failed
Biology 21 0 Failed
Biology 12 0 Failed
Biology 1 0 Failed
Biology 15 0 Failed
Biology 20 55 Failed
Biology 22 56 Failed
Biology 11 57 Failed
Biology 6 63 Passed
Biology 5 64 Passed
Biology 8 65 Passed
Biology 13 65 Passed
Biology 10 66 Passed
Biology 16 70 Passed
Biology 19 71 Passed
Biology 18 73 Passed
Biology 14 74 Passed
Biology 17 77 Passed
Biology 4 77 Passed
Biology 2 78 Passed
Biology 7 81 Passed
Biology 9 84 Passed
Biology 24 85 Passed
Chemistry 21 0 Failed
Chemistry 1 0 Failed
Chemistry 23 36 Failed
Chemistry 20 40 Failed
Chemistry 12 40 Failed
Chemistry 22 41 Failed
Chemistry 15 45 Failed
Chemistry 10 51 Failed
Chemistry 11 56 Failed
Chemistry 17 60 Passed
Chemistry 5 61 Passed
Chemistry 8 61 Passed
Chemistry 13 62 Passed
Chemistry 6 63 Passed
Chemistry 16 67 Passed
Chemistry 14 70 Passed
Chemistry 4 71 Passed
Chemistry 18 71 Passed
Chemistry 2 73 Passed
Chemistry 7 74 Passed
Chemistry 19 75 Passed
Chemistry 9 81 Passed
Chemistry 24 82 Passed
Commerce 12 44 Failed
Commerce 22 55 Failed
Commerce 15 60 Passed
Commerce 11 61 Passed
Commerce 21 62 Passed
. . . .
. . . .
. . . .

Now we can see which student passed or failed each subject.

The CASE statement does not affect our other parts of our select statement above(although it can be used in other parts), thus we can decide to view the people that have passed/failed Statistics as follows:

SELECT student_id, subject, score, 
    CASE 
        WHEN score >= 60 THEN 'Passed'
        ELSE 'Failed'
    END AS status
FROM results 

WHERE subject = 'Statistics'
ORDER BY score
;

Enter fullscreen mode Exit fullscreen mode

We get the following results

student_id subject score status
1 Statistics 14 Failed
21 Statistics 48 Failed
22 Statistics 49 Failed
12 Statistics 52 Failed
6 Statistics 53 Failed
20 Statistics 54 Failed
15 Statistics 57 Failed
23 Statistics 57 Failed
13 Statistics 58 Failed
8 Statistics 60 Passed
14 Statistics 61 Passed
11 Statistics 62 Passed
18 Statistics 62 Passed
10 Statistics 65 Passed
5 Statistics 65 Passed
16 Statistics 66 Passed
4 Statistics 71 Passed
17 Statistics 76 Passed
7 Statistics 80 Passed
19 Statistics 80 Passed
9 Statistics 85 Passed
24 Statistics 88 Passed
2 Statistics 91 Passed

Now it is easy to see that there are quite a number of students that failed Statistics.

Adding more conditions to the CASE statement

Let's say we want to retrieve the grades(A, B, C,D, F) for each student in the Statistics course such that

an A is 90-100, a B is 80-89, a C is 70-79, a D is 60-69 and an F is <60. We can achieve that by using AND operator in the conditional part of our CASE statement as follows:

SELECT student_id, subject, score,  
    CASE 
        WHEN score >= 70 AND  score <= 79 THEN 'C'
        WHEN score >= 60 AND score <= 69 THEN 'D'
        WHEN score >= 90 THEN 'A'
        WHEN score >= 80 AND score <= 89  THEN 'B'

        ELSE 'F'
    END AS grade
FROM results
WHERE subject = 'Statistics'
ORDER BY score DESC
;
;
Enter fullscreen mode Exit fullscreen mode
student_id subject score grade
2 Statistics 91 A
24 Statistics 88 B
9 Statistics 85 B
7 Statistics 80 B
19 Statistics 80 B
17 Statistics 76 C
4 Statistics 71 C
16 Statistics 66 D
10 Statistics 65 D
5 Statistics 65 D
11 Statistics 62 D
18 Statistics 62 D
14 Statistics 61 D
8 Statistics 60 D
13 Statistics 58 F
15 Statistics 57 F
23 Statistics 57 F
20 Statistics 54 F
6 Statistics 53 F
12 Statistics 52 F
22 Statistics 49 F
21 Statistics 48 F
1 Statistics 14 F

You could also use the OR operator within the CASE statement. Also, note that the CASE statement checks the conditions in the order in which they were created thus the following CASE statement does the same thing as before:

CASE 
    WHEN score >= 90 THEN 'A'
    WHEN score >= 80 THEN 'B'
    WHEN score >= 70 THEN 'C'
    WHEN score >= 60 THEN 'D'


    ELSE 'F'
END AS grade
Enter fullscreen mode Exit fullscreen mode

Aggregating data

COUNT

Let us say we want to retrieve the number of students that passed and the number that failed each subject. We can do that using the following query:


SELECT subject,

    COUNT(CASE 
        WHEN score >= 60 THEN 'Passed'
    END ) as number_of_success,

    COUNT(CASE 
        WHEN score < 60 THEN 'Failed'
    END ) as number_of_failure
FROM results 


GROUP BY subject
ORDER BY subject;
Enter fullscreen mode Exit fullscreen mode
subject number_of_success number_of_failure
Arts 19 4
Biology 15 8
Chemistry 14 9
Commerce 20 2
Computer Science 19 4
Economics 22 1
English Language 16 6
History 21 2
Literature 21 2
Mathematics 21 2
Physical Education 23 0
Physics 23 0
Statistics 14 9
SUM

We can also get a similar result using the SUM function. This time we would be returning numbers (in this case 1 when the student passes and 0 otherwise) to the SUM function so that it can add all the numbers together.



SELECT subject,

    SUM(
        CASE 
            WHEN score >= 60 THEN 1
            ELSE 0
        END ) as number_of_success,

    SUM(
        CASE 
            WHEN score < 60 THEN 1
            ELSE 0
        END ) as number_of_failure
FROM results 


GROUP BY subject
ORDER BY subject;


Enter fullscreen mode Exit fullscreen mode
subject number_of_success number_of_failure
Arts 19 4
Biology 15 8
Chemistry 14 9
Commerce 20 2
Computer Science 19 4
Economics 22 1
English Language 16 6
History 21 2
Literature 21 2
Mathematics 21 2
Physical Education 23 0
Physics 23 0
Statistics 14 9

Gives exactly the same result as before.

AVG

The CASE statement can be used along with the AVG function to get the percentage of a set of values. For example, we can use the following query to get what percentage of students passed and failed each subject:

SELECT subject,

    AVG(
        CASE 
            WHEN score >= 60 THEN 1
            ELSE 0
        END )* 100 AS percentage_passed,

    AVG(
        CASE 
            WHEN score < 60 THEN 1
            ELSE 0
        END )* 100 AS percentage_failed
FROM results 


GROUP BY subject;


Enter fullscreen mode Exit fullscreen mode
subject percentage_passed percentage_failed
Arts 82.60869565217391304300 17.39130434782608695700
Biology 65.21739130434782608700 34.78260869565217391300
Chemistry 60.86956521739130434800 39.13043478260869565200
Commerce 90.90909090909090909100 9.09090909090909090900
Computer Science 82.60869565217391304300 17.39130434782608695700
Economics 95.65217391304347826100 4.34782608695652173900
English Language 72.72727272727272727300 27.27272727272727272700
History 91.30434782608695652200 8.69565217391304347800
Literature 91.30434782608695652200 8.69565217391304347800
Mathematics 91.30434782608695652200 8.69565217391304347800
Physical Education 100.00000000000000000000 0.00000000000000000000
Physics 100.00000000000000000000 0.00000000000000000000
Statistics 60.86956521739130434800 39.13043478260869565200

To make the result more readable let us round to 2 d.p:

SELECT subject,

    ROUND(AVG(
        CASE 
            WHEN score >= 60 THEN 1
            ELSE 0
        END )* 100 , 2) AS percentage_passed,

    ROUND(AVG(
        CASE 
            WHEN score < 60 THEN 1
            ELSE 0
        END )* 100 , 2) AS percentage_failed
FROM results 


GROUP BY subject
ORDER BY subject;

Enter fullscreen mode Exit fullscreen mode
subject percentage_passed percentage_failed
Arts 82.61 17.39
Biology 65.22 34.78
Chemistry 60.87 39.13
Commerce 90.91 9.09
Computer Science 82.61 17.39
Economics 95.65 4.35
English Language 72.73 27.27
History 91.30 8.70
Literature 91.30 8.70
Mathematics 91.30 8.70
Physical Education 100.00 0.00
Physics 100.00 0.00
Statistics 60.87 39.13
Any other aggregation function

You can use any other aggregation function with the CASE statement to get any result that requires some form of conditional operation.

Updating Data

Let's take a look at the students that passed/failed 'Chemistry' once again.

SELECT student_id, subject, score, 
    CASE 
        WHEN score >= 60 THEN 'Passed'
        ELSE 'Failed'
    END AS status
FROM results 

WHERE subject = 'Chemistry'
ORDER BY score
;
Enter fullscreen mode Exit fullscreen mode
student_id subject score status
1 Chemistry 0 Failed
21 Chemistry 0 Failed
23 Chemistry 36 Failed
12 Chemistry 40 Failed
20 Chemistry 40 Failed
22 Chemistry 41 Failed
15 Chemistry 45 Failed
10 Chemistry 51 Failed
11 Chemistry 56 Failed
17 Chemistry 60 Passed
8 Chemistry 61 Passed
5 Chemistry 61 Passed
13 Chemistry 62 Passed
6 Chemistry 63 Passed
16 Chemistry 67 Passed
14 Chemistry 70 Passed
18 Chemistry 71 Passed
4 Chemistry 71 Passed
2 Chemistry 73 Passed
7 Chemistry 74 Passed
19 Chemistry 75 Passed
9 Chemistry 81 Passed
24 Chemistry 82 Passed

If the students with ids 12, 10, 11 and 15 re-took the course and had the following scores

student_id score
12 70
10 80
11 85
15 90

Now we have to update the Chemistry table with this new result. One way we can do this is to use multiple UPDATE statements like so:


UPDATE results
    SET score = 70
    WHERE student_id = 12 AND subject = 'Chemistry';

UPDATE results
    SET score = 80
    WHERE student_id = 10 AND subject = 'Chemistry';

         .
         .
         .

Enter fullscreen mode Exit fullscreen mode

But this can be quite inefficient and might be a little difficult to maintain. You can perform multiple updates with different values using the CASE statement to specify a value for each student_id and an IN clause to select all the student_ids you want like so:


UPDATE results
    SET score = 
        CASE 
            WHEN student_id = 12 THEN 70
            WHEN student_id = 10 THEN 80
            WHEN student_id = 11 THEN 85
            WHEN student_id = 15 THEN 90
        END
    WHERE student_id IN (12, 10, 11, 15) AND subject = 'Chemistry';

Enter fullscreen mode Exit fullscreen mode

Now when we retrieve the data for 'Chemistry'

student_id subject score status
1 Chemistry 0 Failed
21 Chemistry 0 Failed
23 Chemistry 36 Failed
20 Chemistry 40 Failed
22 Chemistry 41 Failed
17 Chemistry 60 Passed
8 Chemistry 61 Passed
5 Chemistry 61 Passed
13 Chemistry 62 Passed
6 Chemistry 63 Passed
16 Chemistry 67 Passed
14 Chemistry 70 Passed
12 Chemistry 70 Passed
4 Chemistry 71 Passed
18 Chemistry 71 Passed
2 Chemistry 73 Passed
7 Chemistry 74 Passed
19 Chemistry 75 Passed
10 Chemistry 80 Passed
9 Chemistry 81 Passed
24 Chemistry 82 Passed
11 Chemistry 85 Passed
15 Chemistry 90 Passed

We can see that the students we specified were updated successfully.

Note that while doing an update and the data type returned in the CASE statement does not match the data type of the column, we would need to do an explicit cast. For example, if we had returned strings in our case statement previously

UPDATE results
    SET score = 
        CASE 
            WHEN student_id = 12 THEN '70'
            WHEN student_id = 10 THEN '80'
            WHEN student_id = 11 THEN '85'
            WHEN student_id = 15 THEN '90'
        END
    WHERE student_id IN (12, 10, 11, 15) AND subject = 'Chemistry';
Enter fullscreen mode Exit fullscreen mode

Then Postgresql would have thrown the following error

ERROR:  syntax error at or near "score"
LINE 2: SET SET score = 
Enter fullscreen mode Exit fullscreen mode

To solve this we just cast the values in the case to INT like so:

UPDATE results
    SET score = 
        CASE 
            WHEN student_id = 12 THEN '70'
            WHEN student_id = 10 THEN '80'
            WHEN student_id = 11 THEN '85'
            WHEN student_id = 15 THEN '90'
        END::int
    WHERE student_id IN (12, 10, 11, 15) AND subject = 'Chemistry';
Enter fullscreen mode Exit fullscreen mode

Works just like before!

Conclusion

So we have seen that the CASE statement can come in handy for categorizing, aggregating and updating data. So in case you ever want to perform a conditional statement in SQL, always remember to use the CASE statement

Dismissed img

. . . . . . . . . . . . . . .