A PRIMER ON POSTGRESQL STORED FUNCTIONS (PL/pgSQL)

Samuyi - Jan 7 '19 - - Dev Community

Postgresql functions extends the SQL language; it adds features that are normally found in programming languages such as control statements and loops to make an application developer’s life easier; they are stored and executed completely on a database server. Using functions means that you don’t have write ineffective code that would be a bottle neck in your application. Say for instance you need to fetch some data from a database for some computation and based on the result of the computation you need to fetch some extra data do some more computation and store the results in the database. This will require several calls to the database, worse off if the database server exists on a separate host as the application server then the network call adds to the execution time of the process. Complex logic like previously described can be placed in a function and executed all at once on the database server, removing all the unnecessary intermediate network calls.

PL/pgSQL

Postgresql is quite flexible when it comes to defining functions it lets you define functions in almost any of your favorite programming languages. There’s a module for python, R, javascript including but not least Java. The modules for this languages don’t come installed by default on the database. Another module for writing functions is the Pl/pgSQl module for writing functions, it’s based on the SQL language and it comes installed on the database by default since version 9. PL/pgSQL functions look like this:

  CREATE FUNCTION test_func(integer, text) RETURNS integer
   AS $$
     /* function body text  goes here */
   $$
   LANGUAGE plpgsql;

Basically we define a function called test_func with the CREATE FUNCTION syntax. The test function takes in two arguments, an integer and a text, we specify that we are returning an integer with RETURNS. We indicate the start of the body of the function with the AS, the $$ represents quoting of the beginning and end of the function body. The quoting of the body need not be $$ it could be regular single or double quotes, but if any of the latter two are used within the function body; they would have to be escaped.Lastly we define the procedural language we’re using with keyword LANGUAGE; in our case it is plpgsql.

The function body takes the form of a block; the block is defined as such:

       [ << label >> ]
       [ DECLARE
         declarations ]
       BEGIN
        statements
       END [ label ];

The ‘ << label >> ‘ represents a handle for a block and they’re optional. It is possible to nest blocks so labels act as a reference to the blocks. DECLARE as the name implies is for declaring variables used within the block. The BEGIN and END wrap the main logic of the function. The general DDL syntax for a function declaration has many more options you can check it out here.

As an example, suppose we have an application that manages a car sales organization; tracking new vehicles, employees and sales. The tables for the application are given below.

For our first function suppose we want to insert values into our sales table, we could go about it this way:

Our function takes two arguments, an employee and a car id, both being integers. Our function returns a ‘SETOF sales’; which means we’re returning a set of the type ‘sales’, every table in PostgreSQL is a composite type consisting of the types of it’s individual columns. Next we ‘DECLARE’ the variables for use within our function. To execute dynamic SQL commands we use the ‘EXECUTE’ keyword, the output of the command is put into the variables following the ‘INTO’ statement, and the variables used within the SQL statement follow the ‘USING’ keyword. The ‘RETURN QUERY’ keyword is used to return the type ‘SETOF sales’, since we’re returning a set of records we execute a select statement to return the necessary records. Notice that if we were to carry out this logic without stored functions we would have to make several round trips to the server to achieve our goal. To execute our function all we need do is run it like any other built in database function passing in the necessary arguments:

SELECT sales_func(1, 2);
SELECT sales_func(2, 3);
SELECT sales_func(6, 2);
SELECT sales_func(1, 3);
SELECT sales_func(4, 1);
SELECT sales_func(3, 1);
SELECT sales_func(5, 3);
SELECT sales_func(4, 2);
SELECT sales_func(6, 2);
SELECT sales_func(5, 2);
SELECT sales_func(2, 2);
SELECT sales_func(3, 2);

Our next function is for populating our sales summary table. It contains values for total sales and bonus figures for each employee for a quarter.

This function is a little more involved; it uses a ‘LOOP’. The loop, loops through every employee from the staff table and summarizes their sales figures for each quarter. The ‘LOOP’ used here is similar to the loops found in modern programming languages. It loops through the results of the SQL statement and assigns it to the employee variable. The employee variable is of record type from our declarations. To get a value from our employee variable we use a dot to access the columns of the result. The ‘RAISE NOTICE’ key word serves as a sort of print statement within the function and can be useful for debugging. After our loop we delete the data that has been processed within our loop. Our function returns a custom table, with column types similar to our final ‘SELECT’ statement. Returning a table is a way of returning a custom record if we don’t want to return every column from a table. The argument for the function has a default value; it is possible to use default values just like in we would for defining relations. Our argument defaults to the ‘CURRENT_TIMESTAMP’ value. The ‘end_date’ variable corresponds to a date of ‘3 months’ from current ‘start_date’, we add a period of three months to the start_date variable using the ‘interval 3 months’ key word.

Our final function is used for updating the values of the bonus for a car.

First we check that the car id exists. When we execute an SQL statement that returns rows and we want to discard the resulting rows; we need to use the ‘PERFORM’ keyword, if not there would be an error. In our case since we’re ‘selecting’ from a table we replace the ‘SELECT’ with a ‘PERFORM’. If our query returns any rows then it sets the special variable FOUND to true. We check if the car id doesn’t exist and raise an exception if true, we also check if the bonus is greater than thirteen percent. Our function returns a record of type cars, so we create a variable car of composite type cars and insert the the updated car into it.

CONCLUSION

Using plpgsql functions can make for more concise and efficient applications. This is especially true for data intensive applications especially were security is paramount. That being said using plpgsql for all application logic would probably be a bad idea. For non complex logic it’s probably best to stick with an ORM.

This features shown in this article are just the tip of the iceberg as regards what can be done with stored functions visit the postgres documentation for more.

. . . . . . .