Database Video Notes

Felice Forby - Oct 25 - - Dev Community

SQL Basics (M04 L01)

The Basic SELECT (M04 L02)

DATES, Order By, NULLS (M04 L03)

SQL statements can be divided into 2 categories:

  • Data definition language (DDL) statements: used for creating tables, relationships, and other structures.
  • Data manipulation language (DML) statements: used for queries, inserting, and data modification.

  • SQL is not procedural: it does not execute one statement at a time in a sequence, rather, it executes the entire query all at once.

  • All inputs to a SQL query are tables and the output from a query is another table called the Answer Set.

  • Some queries can produce interim/temporary answer sets. Note that these aren't seen but they consume resources

Optimizer calculates the optimal execution plan for a query.

  • Combine conditions with AND and OR, use parentheses when needed

Group Functions (M04 L04)

  • https://www.youtube.com/watch?v=v-ZUPvNAwuk
  • https://drive.google.com/file/d/18QiIO69xnHsYOgkXutHZFxQtIeRBQTlf/view?usp=drive_link

  • When you combine a GROUP function w/ a WHERE clause, the WHERE clause reduces the number of rows in the interim answer set before the GROUP function does its calculation.

  • A GROUP function w/o a GROUP BY returns only one row in the answer set

  • A GROUP function w/ a GROUP BY returns multiple summary rows in the answer set showing a subtotal across all rows from the interim answer set summed by the column on which we GROUP BY

  • RULE: When using a GROUP BY, every column in the SELECT statement must either be:

    • a column to which you apply the group function (note this column does not have to be in the group by)
    • a column that you are grouping by
    • (Strict mode will fail with errors)
select @@sql_mode;
-- strict mode
SET sql_mode = 'ONLY_FULL_GROUP_BY';
-- relaxed mode
SET sql_mode = '';
Enter fullscreen mode Exit fullscreen mode

HAVING

  • Like a WHERE clause against the interim answer set when you use a GROUP BY
  • Returns a subset of rows from the interim answer set

5 group functions:

  • SUM (only numeric)
  • AVG (only numeric)
  • COUNT (any column, or * to count rows)
  • MIN (work with any type of column)
  • MAX (work with any type of column)

SQL Subqueries (M05 L01)

Query within a query:

  • Subquery in WHERE: answer set to the inner query is used as a predicate in a WHERE clause in the outer query
    • Can also be used to get list for IN clause
  • Subquery in SELECT: answer set to the inner query is used as column in a SELECT
    • Need to name the subquery column
  • Subquery in FROM: answer set to the inner query is used as "virtual" table in a FROM clause
  • Need to name the subquery alias name

  • Subquery must always be enclosed in parentheses!

  • Can select from a different table

Co-Related SubQuery

  • Inner query references a value from outer query
  • Inner query is re-executed w/ each row returned by outer query
  • Rare, can be very slow performance!
Enter fullscreen mode Exit fullscreen mode

SQL JOINs (M05 L02)

Used for getting data from multiple tables.

  • Joining two tables requires that they have a common key (usually FK relationship) that appears in both tables. Keys must be same length and data type, but don't need to have the same name.
  • Joins consume extra resources
  • Need to prefix identical column names with the table name
SELECT LastName, FirstName, COUNT(OrderID) AS 'Orders'
FROM Employees, Orders
WHERE Employees.EmployeeId = Orders.EmployeeID
GROUP BY LastName, FirstName
ORDER BY COUNT(OrderID) DESC;
Enter fullscreen mode Exit fullscreen mode

The Cartesian Product (M05 L03)

More SQL JOINs (M05 L04)

Using the JOIN keyword as an alternate. The below are equivalent:

SELECT LastName, FirstName, COUNT(OrderID) AS 'Orders'
FROM Employees, Orders
WHERE Employees.EmployeeID = Orders.EmployeeID
GROUP BY LastName, FirstName
ORDER BY COUNT(OrderID) DESC;

SELECT LastName, FirstName, COUNT(OrderID) AS 'Orders'
FROM Employees 
INNER JOIN Orders ON Employees.EmployeeID = Orders.EmployeeID
GROUP BY LastName, FirstName
ORDER BY COUNT(OrderID) DESC;
Enter fullscreen mode Exit fullscreen mode
  • Every pair of tables being joined must have a common key
  • Every pair of common keys must have a condition stated in a WHERE clause or in the ON clause of the JOIN (if not could get Cartesian product)
  • Note that joins are expensive! If joining multiple tables, it could be better to do multiple queries and have the output of a join query saved as a temporary table, then bring that temp table into the next join query.

Joining multiple tables

SELECT LastName, FirstName, SUM(UnitPrice * Quantity) AS 'OrderValue'
FROM Employees 
JOIN Orders ON Employees.EmployeeID = Orders.EmployeeID
JOIN OrderDetails ON Orders.OrderID = OrderDetails.OrderID
GROUP BY LastName, FirstName;

-- With WHERE
WHERE Employees.EmployeeID = Orders.EmployeeID
  AND Orders.OrderID = OrderDetails.OrderID
Enter fullscreen mode Exit fullscreen mode
  • By default, a JOIN is an INNER JOIN
  • Explicit JOIN: when the JOIN keyword is used
  • Implicit JOIN: when WHERE is used to define the join

SQL Outer Join (M05 L05)

Image description

Outer JOIN is useful for analyzing data

Left outer join

Left table is the one mentioned first in the SELECT FROM. Below, Employees is left and Department is right. All rows from Employees (left) and only matching rows from Department (right) are shown.

SELECT *
FROM Employees E
LEFT OUTER JOIN Department D
ON E.DepartmentID = D.DepartmentID
Enter fullscreen mode Exit fullscreen mode

Right outer join

All rows from Department (right) and only matching rows from Employees (left)

SELECT *
FROM Employees E
RIGHT OUTER JOIN Department D
ON E.DepartmentID = D.DepartmentID
Enter fullscreen mode Exit fullscreen mode

Note on UNION

  • Assemble multiple queries connected by a UNION
  • Combine multiple answer sets
  • Each answer set must have same # of columns
  • Each column in all answer sets must have same domain (same set of data values)

Execution Plans

With any join, the DB engine optimizer must calculate the most efficient query execution plan. There are 3 basic methods in MySQL:

  • Nested Loop Join: when one join input table has a small number of rows and the other input table is large and indexed on the join key. Most efficient.
  • Merge Join: when tables being joined are both sorted on the join key. Next efficient.
  • Hash Join: when large, unsorted, non-indexed inputs are joined with an inner join with an "=" condition. Least efficient -> creates a product

Explain Plan option will show you an analysis of the execution plan calculation by the query optimizer

SQL DDL - CREATE (M06 L01)

Data Definition Language: Statements that CREATE and MODIFY database structures (tables, etc.)

  • Create, Alter, Drop

Create

Creates a new table or view

  • defines table name and its columns
  • defines data type, length for each column
  • defines constraints for each column
  • Usually drop table (if it exists) before creating
CREATE TABLE <table_name> (
  <column_name> <DATATYPE(L)>,
  <column_name> <DATATYPE(L)> NOT NULL,
  <column_name> <DATATYPE(L)> CONSTRAINT <constraint_name> <TYPE>
)
Enter fullscreen mode Exit fullscreen mode

Example

CREATE TABLE shippers (
  ShipperID   INT NOT NULL,
  CompanyName VARCHAR(40) NOT NULL,
  Phone       VARCHAR(20) NOT NULL DEFAULT '0'
)
Enter fullscreen mode Exit fullscreen mode

NOT NULL: Prevent inserting a row when the column has no value, can provide a default value.

DROP

Drops named DB table

DROP TABLE IF EXISTS <table_name>;
Enter fullscreen mode Exit fullscreen mode

DATA TYPES available depend on the DB engine (MySQL, Oracle, Posgres, etc.)

  • CHAR: fixed width string, shorter strings get padded
  • VARCHAR: variable width string, longer strings get truncated, shorter strings does NOT store blanks
  • FLOAT: floating-point data type that stores approx. values for real numbers
  • DECIMAL: fixed-precision data type that stores exact values, often for dollar cents. DECIMAL(9, 2) means 9 total digits with 2 digits to the right of the decimal point.
  • INT: number w/o decimal places (also TINYINT, SMALLINT, etc)
  • BLOB: binary large object (binary image, etc)
  • DATE: Just year, month, day
  • DATETIME: Year, month, day, with time
  • TIMESTAMP
  • ENUM: like a small array
  • BOOLEAN: actually tinyint(1)

SQL DDL - ALTER (M06 L02)

ALTER allows you to change different characteristics of database structures like tables, columns, and views

Alter a table:

  • ADD or DROP columns
  • ADD or DROP constraints

Alter a column:

  • SET or DROP a default
  • SET a DATA_TYPE
-- Change table name
ALTER TABLE items RENAME TO products;

-- Change name of column
ALTER TABLE products
RENAME COLUMN itemCode TO itemRef;

-- Add column
ALTER TABLE products
ADD COLUMN scanCode VARCHAR(5) NULL;

-- Change column data type
ALTER TABLE products
MODIFY COLUMN itemCode CHAR(4);

-- Drop column
ALTER TABLE products
DROP COLUMN itemCode;
Enter fullscreen mode Exit fullscreen mode

DML - Insert, Update, Delete (M06 L03)

Data Manipulation Language - Populates and modify the data

  • Insert, Update, Delete

Insert

Two formats:

-- if no column & value are specified, NULL or default is assigned)
INSERT INTO <table_name> (column, column, column)
VALUES (value, value, value);

-- must have a value or NULL for every column in table
INSERT INTO <table_name>
VALUES (value, value, value, value);
Enter fullscreen mode Exit fullscreen mode

Update

UPDATE statement assigns new values to specific columns based on a condition

UPDATE <table_name>
SET <column> = <new_value>
WHERE condition;
Enter fullscreen mode Exit fullscreen mode

If no condition is given, it updates all rows! (check if this is correct)

Example

UPDATE Employees
SET firstName = "timmy",
    title = "Mr."
WHERE employeeID = 21;
Enter fullscreen mode Exit fullscreen mode

TRUNCATE

TRUNCATE statement removes all rows, but keeps structure of table

TRUNCATE TABLE <table_name>;
Enter fullscreen mode Exit fullscreen mode

DROP

DROP statement removes all rows and the table structure (completely removes table)

DROP TABLE <table_name>;
Enter fullscreen mode Exit fullscreen mode

DELETE

Removes rows in a table based on WHERE clause

DELETE FROM <table_name>
WHERE <condition>;
Enter fullscreen mode Exit fullscreen mode
DELETE FROM Employees
WHERE employeeID IN (10, 20, 30);
Enter fullscreen mode Exit fullscreen mode

NOTE: a DELETE without a WHERE is the same as TRUNCATE!

SQL Constraints (M07 L01)

Five types of constraints:

  • PRIMARY KEY
  • NULL/NOT NULL
  • UNIQUE
  • CHECK
  • FOREIGN KEY

Primary key constraint

  • can be defined at column level if one column
    • comes with unique constraint, not null constraint
  • Must be defined at table level if more than one column
CREATE TABLE persons (
  ID INT PRIMARY KEY,
  LastName VARCHAR(255) NOT NULL,
  FirstName VARCHAR(255) NOT NULL
);

CREATE TABLE persons (
  ID INT,
  LastName VARCHAR(255) NOT NULL,
  FirstName VARCHAR(255) NOT NULL, 
  Phone CHAR(10) NOT NULL,
  PRIMARY KEY (LastName, Phone)
);
Enter fullscreen mode Exit fullscreen mode

To make a constraint modifiable (via ALTER), you must give it a name:

CREATE TABLE persons (
  ID INT,
  LastName VARCHAR(255) NOT NULL,
  FirstName VARCHAR(255) NOT NULL, 
  Phone CHAR(10) NOT NULL,
  CONSTRAINT PK_Person PRIMARY KEY (ID, LastName)
);

ALTER TABLE Persons DROP CONSTRAINT PK_Person;

ALTER TABLE Persons 
ADD CONSTRAINT PK_Person PRIMARY KEY (ID, LastName);
Enter fullscreen mode Exit fullscreen mode
  • most shops always name constaints and have standards for naming them.

UNIQUE

May be defined at column level if one column

CREATE TABLE persons (
  ID INT PRIMARY KEY,
  LastName VARCHAR(255) NOT NULL,
  FirstName VARCHAR(255) NOT NULL,
  Email VARCHAR(255) UNIQUE
);

CREATE TABLE persons (
  ID INT PRIMARY KEY,
  LastName VARCHAR(255) NOT NULL,
  FirstName VARCHAR(255) NOT NULL, 
  Phone CHAR(10) NOT NULL,
  UNIQUE (LastName, FirstName)
  -- or with name
  CONSTRAINT UN_Name UNIQUE KEY (LastName, FirstName)
);
Enter fullscreen mode Exit fullscreen mode

CHECK

Applies a condition to a column. Rules for conditions same as WHERE clause

CREATE TABLE persons (
  ID INT PRIMARY KEY,
  LastName VARCHAR(255) NOT NULL,
  FirstName VARCHAR(255) NOT NULL, 
  Age INT,
  CONSTRAINT CK_Age CHECK (Age > 18)
);
Enter fullscreen mode Exit fullscreen mode

Will only insert if age is > 18

FOREIGN KEY

Defined at column level

CREATE TABLE persons (
  ID INT PRIMARY KEY,
  LastName VARCHAR(255) NOT NULL,
  FirstName VARCHAR(255) NOT NULL, 
  DepartmentNumber INT FOREIGN KEY
     REFERENCES Department (DeptID) NOT NULL,
  Age INT,
);
Enter fullscreen mode Exit fullscreen mode

Table level: allows you to give name

CREATE TABLE persons (
  ID INT PRIMARY KEY,
  LastName VARCHAR(255) NOT NULL,
  FirstName VARCHAR(255) NOT NULL, 
  DepartmentNumber INT,
  Age INT,
  CONSTRAINT FK_Department FOREIGN KEY (DepartmentNumber)
     REFERENCES Department (DeptID)
);
Enter fullscreen mode Exit fullscreen mode

Can add later with an Alter:

ALTER TABLE Persons
ADD CONSTRAINT FK_Department
FOREIGN KEY (Department)
REFERENCES Department (DeptID);
Enter fullscreen mode Exit fullscreen mode

Foreign key constraints used to maintain Referential Integrity (RI). There are rules that can be set on Parent tables for when an update or delete is done, for what to do with the child row:

  • SET NULL: sets child value NULL
  • SET DEFAULT: sets child value to column default
  • CASCADE: deletes or updates the child (if parent is deleted, deletes child, or if parent id is updated it will update child ref id)
  • NO ACTION/RESTRICT: prevents action on table
CREATE TABLE persons (
  ID INT PRIMARY KEY,
  LastName VARCHAR(255) NOT NULL,
  FirstName VARCHAR(255) NOT NULL, 
  DepartmentNumber INT,
  Age INT,
  CONSTRAINT FK_Department FOREIGN KEY (DepartmentNumber)
     REFERENCES Department (DeptID)
     ON DELETE SET NULL
     ON UPDATE CASCADE
);
Enter fullscreen mode Exit fullscreen mode

Depending on the data model and the optionality, the foreign key can be NULL.

SQL Triggers (MO7 L02)

Triggers are:

  • a pieces of SQL code stored with a table
  • associated with a table
  • associated with an event
  • the code fires when the event happens

Why use triggers

  • enforce business rules
  • moves code from an application program to the DB
  • Performance improvement - all work done on server
  • convenient for DB people to store these rules in the DB and not the application, helps to enforce business rules. Advantagious from the DB perspective to store code related to the DB or effects the DB in the database itself instead of the app.
  • Gives DB more control over DB
  • Note: trigger is stored with the table, so when table is dropped trigger is dropped

Trigger events:

  • Insert, update, delete

Trigger timer

  • Before, after, instead of
  • Depends on the DB on what is available

Instead of: E.g. Instead of doing an insert, fire the trigger

CREATE TRIGGER trigger_name [BEFORE|AFTER] event(DELETE, etc)
ON table_name trigger_type
BEGIN
  -- trigger_logic
END;
Enter fullscreen mode Exit fullscreen mode
  • First, specify the name of the trigger after the CREATE TRIGGER clause.
  • Next, use either BEFORE or AFTER keyword to determine when to the trigger should occur in response to a specific event e.g., INSERT, UPDATE, or DELETE.
  • Then, specify the name of the table to which the trigger binds.
  • After, specify the type of trigger using either FOR EACH ROW or FOR EACH STATEMENT. We will discuss more on this in the next section.
  • Finally, put the logic of the trigger in the BEGIN ... END block.

Row level trigger vs. statement level trigger

There are two types of triggers: row and statement level triggers.

A row level trigger executes each time a row is affected by an UPDATE statement. If the UPDATE statement affects 10 rows, the row level trigger would execute 10 times, each time per row. If the UPDATE statement does not affect any row, the row level trigger is not executed at all.

Different from the row level trigger, a statement level trigger is called once regardless of how many rows affect by the UPDATE statement. Note that if the UPDATE statement did not affect any rows, the trigger will still be executed.

When creating a trigger, you can specify whether a trigger is row or statement level by using the FOR EACH ROW or FOR EACH STATEMENTrespectively.

Special features

When trigger is fired, DBMS supplies:

  • OLD and NEW values for the update
  • NEW values for inserts
  • OLD values for deletes

You can reference OLD and/or NEW value of a column within the code of the trigger

Example:

  • Archival data before a delete
  • E.g. move data to an archival orders table
-- Gets fired when an Orders record is deleted
CREATE TRIGGER before_order_delete
  BEFORE DELETE ON Orders
  FOR EACH ROW
BEGIN
  INSERT INTO OrdersArchive
  SET OrderID = OLD.OrderID,
      CustomerID = OLD.CustomerID,
      EmployeeID = OLD.EmployeeID,
      OrderDate = OLD.OrderDate,
      ArchiveDate = NOW();
END;
Enter fullscreen mode Exit fullscreen mode

Example from Lab:

CREATE TRIGGER update_student_gpa
AFTER INSERT ON ClassGrade
FOR EACH ROW
BEGIN
  UPDATE Student
  -- Use subquery to calculate the new GPA and then use result to set the student's gpa column.
  SET gpa = (SELECT SUM(Class.units * ClassGrade.grade)/SUM(units)
             FROM ClassGrade
             LEFT JOIN Class ON ClassGrade.cID = Class.cID
             -- New.sID is the student ID from the newly inserted ClassGrade
             WHERE ClassGrade.sID = NEW.sID)
  -- Make sure to update the student referenced in the newly inserted ClassGrade
  WHERE Student.sID = NEW.sID;
END;
Enter fullscreen mode Exit fullscreen mode

SQL Case and View (MO7 L03)

CASE

CASE expression allows SQL to process conditionals like IF/THEN/ELSE in other programming languages (there is no if/then/else in SQL)

The CASE is an expression that can be used anywhere in teh SQL where an expression is allowed

CASE
  WHEN <condition_1> THEN <result_1>
  WHEN <condition_2> THEN <result_2>
  WHEN <condition_3> THEN <result_3>
  ELSE <else_result>
END
Enter fullscreen mode Exit fullscreen mode
  • If a WHEN condition is true, the WHEN ends with that result
  • If the WHEN condition is false, control passes to next WHEN

Example:

-- adds "category" to answer set
SELECT productID, productName, unitPrice
CASE
  WHEN unitPrice < 20 THEN "Economy"
  WHEN unitPrice < 80 THEN "Standard"
  ELSE "Premium"
END category
FROM Products;
Enter fullscreen mode Exit fullscreen mode

Views

  • VIEW is an empty shell of a table definition
  • The view contains no data until it's queried
  • Kind of like a "Virtual Table"
  • Each time view is queried, the underlying query that populates the view is re-executed
  • Data in the view table does not persist

Why use views?

  • Security, maintaining confidentiality
    • View can prevent unauthorized users from seeing data that must be kept secure or confidential
  • Complexity
    • The view can hide complex SQL from unauthorized or inexperienced users
    • Can instruct people to use views
-- Hiding salary from an employees table
CREATE VIEW EmployeeSecure AS
  SELECT EmployeeID, LastName, FirstName, HireDate
  FROM Employees;

-- Using a view
SELECT * from Employees;
Enter fullscreen mode Exit fullscreen mode
CREATE VIEW <view_name> AS
  <view_query>;
Enter fullscreen mode Exit fullscreen mode

Auto-increment and UNION (MO7 L04)

Auto-increment

  • Can assign the AUTO_INCREMENT "constraint" to a column when creating a table
  • Auto increment typically is used as surrogate key when no other keys are available
  • Auto-increment columns are ints that automatically increment by 1 when a new row is inserted
  • Created and managed by DB
  • Guarantees a unique key value for each new row
CREATE TABLE Shoppers (
  ShopperID INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
  ShopperName VARCHAR(40) NOT NULL
);
Enter fullscreen mode Exit fullscreen mode
  • Auto-increment keys are not reused when a row is deleted and a new one is inserted. It goes up from the last used key

UNION

SQL UNION operator allows SQL to combine multiple answer sets into a single answer set.

When combining multiple answer sets into one, you must be sure that both sets contain the same number of columns and that the columns in the sets have the same data types.

-- First query
SELECT categoryName, SUM(unitsInStock) AS "Total units"
FROM Categories C
JOIN Products P ON C.categoryID = P.categoryID
GROUP BY categoryName;

-- Second query
SELECT "Total", SUM(unitsInStock) AS "Total units"
FROM Products;

-- Unioned query
-- Both have same number columns and they are the same datatype
SELECT categoryName, SUM(unitsInStock) AS "Total units"
FROM Categories C
JOIN Products P ON C.categoryID = P.categoryID
GROUP BY categoryName
UNION
SELECT "Total", SUM(unitsInStock) AS "Total units"
FROM Products;
Enter fullscreen mode Exit fullscreen mode
. . . . . . . . . . . . . . . . . . . . . . . . . .