SQL 101: Introduction to Structured Query Language

Gichuki Edwin - Sep 29 - - Dev Community

What is SQL?

  • SQL is a widely used programming language used for managing and manipulating relational databases. SQL allows you to define and query databases. It gives you the benefits to manage database objects as well as create, modify, explore and summarize data stored in a database. These opertions are often referred to as CRUD operations. SQL is widely used in software applications and data analysis due to its simplicity and efficiency in handling structured data.

Key Components of SQL

  • SQL consists of various commands, categorized into different groups based on their purpose. Here’s a breakdown of the primary categories:

Data Definition Language (DDL)

DDL commands are used to define the structure of the database, including creating, altering, and deleting tables.

  • CREATE: Used to create a new table or database.
CREATE TABLE customers (
  customer_id INT PRIMARY KEY,
  name VARCHAR(50),
  email VARCHAR(50)
);
Enter fullscreen mode Exit fullscreen mode
  • ALTER: Modifies an existing table, like adding or removing a column.
ALTER TABLE customers ADD phone_number VARCHAR(15);
Enter fullscreen mode Exit fullscreen mode
  • DROP: Deletes a table or database.
DROP TABLE customers;
Enter fullscreen mode Exit fullscreen mode

Data Manipulation Language (DML)

  • DML is used to manage data within tables, focusing on data retrieval, insertion, updates, and deletion.

SELECT: Retrieves data from one or more tables.

SELECT name,
email
FROM customers;
Enter fullscreen mode Exit fullscreen mode
  • INSERT: Adds new rows of data into a table.
INSERT INTO customers (customer_id, name, email) 
VALUES (1, 'John Doe', 'john@example.com');
Enter fullscreen mode Exit fullscreen mode
  • UPDATE: Modifies existing data in a table.
UPDATE customers 
SET email = 'newemail@example.com'
WHERE customer_id = 1;
Enter fullscreen mode Exit fullscreen mode
  • DELETE: Removes rows of data from a table.
DELETE FROM customers WHERE customer_id = 1;
Enter fullscreen mode Exit fullscreen mode
  1. Transaction Control Language (TCL)
  2. TCL manages transactions within a database, ensuring data consistency.
  • COMMIT: Saves all changes made during the current transaction.
COMMIT;
Enter fullscreen mode Exit fullscreen mode
  • ROLLBACK: Reverts the changes made during the current transaction.
ROLLBACK;
Enter fullscreen mode Exit fullscreen mode

Basic SQL Syntax and Structure

  • SQL syntax is designed to be easy to read and write. Each query follows a structured flow:

  • SELECT Clause – Specifies the columns to retrieve.

SELECT
name,
email
FROM customers;
Enter fullscreen mode Exit fullscreen mode
  • FROM Clause – Specifies the table from which data is retrieved.
FROM customers;
Enter fullscreen mode Exit fullscreen mode
  • WHERE Clause – Applies conditions to filter the data.
WHERE email LIKE '%gmail.com';
Enter fullscreen mode Exit fullscreen mode
  • ORDER BY Clause – Sorts the results by one or more columns.
ORDER BY name ASC;
Enter fullscreen mode Exit fullscreen mode
  • GROUP BY Clause – Groups rows with the same values into summary rows.
SELECT
state,
COUNT(*)
FROM customers
GROUP BY state;
Enter fullscreen mode Exit fullscreen mode

SQL Joins

  • SQL joins combine rows from two or more tables based on a related column between them. The most common types of joins are:

  • INNER JOIN: Returns rows that have matching values in both tables.

SELECT orders.order_id, customers.name
FROM orders
INNER JOIN customers ON orders.customer_id = customers.customer_id;
Enter fullscreen mode Exit fullscreen mode
  • LEFT JOIN: Returns all rows from the left table and the matched rows from the right table. If there's no match, NULL values are returned.
SELECT orders.order_id, customers.name
FROM orders
LEFT JOIN customers ON orders.customer_id = customers.customer_id;
Enter fullscreen mode Exit fullscreen mode
  • RIGHT JOIN: Similar to the LEFT JOIN but returns all rows from the right table.
SELECT orders.order_id, customers.name
FROM orders
RIGHT JOIN customers ON orders.customer_id = customers.customer_id;
Enter fullscreen mode Exit fullscreen mode
  • FULL OUTER JOIN: Returns rows when there is a match in one of the tables.
SELECT orders.order_id, customers.name
FROM orders
FULL OUTER JOIN customers ON orders.customer_id = customers.customer_id;
Enter fullscreen mode Exit fullscreen mode

Common SQL Functions

  • SQL includes built-in functions that help with data manipulation and aggregation:

Aggregate Functions: These include SUM, COUNT, AVG, MIN, and MAX, which are used to perform calculations on sets of rows.

SELECT
SUM(amount)
FROM sales;
Enter fullscreen mode Exit fullscreen mode
  • String Functions: These functions manipulate string values, such as UPPER(), LOWER(), CONCAT(), etc.
SELECT
UPPER(name)
FROM customers;
Enter fullscreen mode Exit fullscreen mode
  • Date Functions: SQL provides various functions to manipulate and format dates, like NOW(), DATEADD(), and DATE_TRUNC().
SELECT
DATE_TRUNC('month',sale_date) AS month,
SUM(amount) AS monthly_total_sale
FROM sales
GROUP BY month
ORDER BY month;
Enter fullscreen mode Exit fullscreen mode

Best Practices for Writing SQL Queries

  • Use Proper Naming Rules and Conventions: Use clear and meaningful names for tables and columns. There are several SQL style patterns to follow when naming tables and columns and other objects. They include,
    1. Camel case - the first letter of each word is capitalised except for the first word eg establishmentNumber
    2. Pascal case - the first letter of each word is capitalised eg SolarGenaration
    3. Snake case - all the words are lowercased and separated by underscores eg employee_table
  • Limit Data with WHERE Clause: Always filter data as much as possible to avoid retrieving unnecessary rows.
  • Control columns with constraints: The column data types already defines the kind of data that it accepts but SQL provides several additional constraints that let us further specify acceptable values for the columns based on rules and logic tests. Constraints help maintain the quality of data and ensure integrity of the relationships between tables. Commonly used constarints include,
    1. PRIMARY KEY constraints - Ensures that each column that makes up the key has a unique value for each row and that no column in the key can have a missing value
    2. FOREIGN KEY constraint - Ensures that data in related tables does not end up unrelated to the primary key
    3. CHECK - Evaluates whether the data falls within the values we specify
    4. UNIQUE - Ensures that values in a column are unique in each row of the table
    5. NOT NULL - Prevents NULL values in a column
  • Use Indexes: Indexes can drastically speed up the performance of queries, especially on large datasets.
  • Test Queries: Test your queries on a smaller subset of data to avoid performance issues.
  • Avoid SELECT *: Always specify the columns you need, as retrieving all columns can slow down performance.

Why use SQL

  • SQL is a fundamental skill for anyone working with data—whether in software development, business intelligence, or data analysis.
  • Using a robust SQL database system allows you to work with terabytes of data, multiple related tables, and thousands of columns and gives you improved programmatic control over the structure of your data, leading to efficiency, speed, and accuracy.
  • SQL is also an excellent adjuct to programming languages used in data science such as R and Python

Conclusion

  • SQL provides the building blocks for interacting with relational databases. Whether you're querying data, performing complex joins, or managing database structures, SQL enables powerful and efficient data management. Understanding the basics of SQL is a crucial first step towards mastering database management and data analysis.
. . . . . . . .