SQL · Expert

Real-World SQL

You now know how to read, write, design, and optimise SQL. This final chapter bridges the gap between “works in a playground” and “runs safely in production”: schema migrations without downtime, defending against SQL injection, enforcing least-privilege access, formatting conventions, and a capstone project that exercises every technique from the tutorial.

01
Section One · Evolving schemas safely

Schema Migrations

Production schemas change constantly — adding columns, renaming tables, creating indexes. The challenge is making these changes without downtime and without losing data. A migration is a versioned SQL script that moves the schema from state N to state N+1.

ALTER TABLE safely in production

-- Add a column (safe: instant in PostgreSQL, fast in MySQL 8+ with INSTANT)
ALTER TABLE users ADD COLUMN phone TEXT;

-- Add a NOT NULL column with a default (safe in PG 11+: no rewrite)
ALTER TABLE users ADD COLUMN is_active BOOLEAN NOT NULL DEFAULT true;

-- Rename a column (PG 9.4+: instant metadata change)
ALTER TABLE users RENAME COLUMN full_name TO display_name;

-- Drop a column (safe but irreversible)
ALTER TABLE users DROP COLUMN phone;
Safe (instant/fast)

ADD COLUMN (nullable or with DEFAULT)

PostgreSQL 11+ writes the default in metadata only — no full-table rewrite. MySQL 8.0.12+ has INSTANT for adding columns at the end.

Safe (instant)

RENAME COLUMN / RENAME TABLE

Metadata-only change. But your application code must be updated simultaneously or queries will break.

Dangerous (locks table)

ADD COLUMN ... NOT NULL (no default, PG <11)

Requires a full-table rewrite to fill in the value. Blocks all writes on the table for the duration. Never do this on a large table in production.

Dangerous (locks table)

CREATE INDEX (without CONCURRENTLY)

A standard CREATE INDEX holds a write lock. Use CREATE INDEX CONCURRENTLY in PostgreSQL (slower but non-blocking).

Zero-downtime migration strategies

When a change can’t be done in a single instant step, use a multi-phase approach:

  1. Expand: add the new column/table alongside the old one. Write to both; read from old.
  2. Migrate: backfill the new column with data from the old one (in batches to avoid long locks).
  3. Switch: change the application to read from the new column. Verify.
  4. Contract: drop the old column once all reads have moved.
Example — rename a column safely: don’t just ALTER TABLE RENAME COLUMN (instant but breaks running code). Instead: (1) add the new column, (2) deploy code that writes to both, (3) backfill old rows, (4) deploy code that reads from new, (5) drop old column. Total: 3 deploys, zero downtime.

Migration tooling

Flyway

Java / JVM ecosystem

SQL-file based migrations with a version table. Supports undo migrations. Integrates with Spring Boot, Maven, Gradle.

Liquibase

XML/YAML/SQL changelogs

Database-agnostic changelog format. Supports rollbacks, diff generation, and multiple target databases from one changelog.

Alembic

Python / SQLAlchemy

Generates migration scripts from model diffs. Auto-detects new columns, dropped tables, index changes. Pairs with Flask/FastAPI.

Rails Migrations

Ruby on Rails

Ruby DSL for schema changes. Built-in reversibility (change method). The original “migration” pattern that inspired all others.

02
Section Two · Defending your data

Security

SQL injection — how it works

SQL injection happens when user input is concatenated directly into a query string. The attacker provides input that changes the meaning of the SQL.

-- VULNERABLE (string concatenation in application code):
-- query = "SELECT * FROM users WHERE email = '" + userInput + "'"
-- If userInput = "' OR '1'='1" then the query becomes:
SELECT * FROM users WHERE email = '' OR '1'='1';
-- This returns ALL users. The attacker just bypassed authentication.

-- Even worse: "'; DROP TABLE users; --"
SELECT * FROM users WHERE email = ''; DROP TABLE users; --';
-- Table deleted. Game over.

Parameterised queries — the fix

The only reliable defence: pass user input as a parameter, never inline it into the SQL string. The database treats parameters as values, not as code — they can never change the query structure.

-- Python (psycopg2)
cursor.execute("SELECT * FROM users WHERE email = %s", (user_input,))

-- Node.js (pg)
client.query('SELECT * FROM users WHERE email = $1', [userInput])

-- Java (JDBC PreparedStatement)
PreparedStatement ps = conn.prepareStatement("SELECT * FROM users WHERE email = ?");
ps.setString(1, userInput);

-- The input "' OR '1'='1" is now treated as a literal string value.
-- It searches for a user whose email literally contains those characters.
-- No injection possible.
Rule: never build SQL strings with concatenation or string interpolation. Every language has parameterised query support. If your ORM generates queries, verify it’s using parameters for user-supplied values (most do by default).

Least-privilege roles: GRANT / REVOKE

Don’t connect your application to the database as a superuser. Create a role with only the permissions it needs.

-- Create an application role with minimal access
CREATE ROLE app_readonly LOGIN PASSWORD 'strong_random_password';
GRANT CONNECT ON DATABASE myapp TO app_readonly;
GRANT USAGE ON SCHEMA public TO app_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_readonly;

-- A write role for the backend API
CREATE ROLE app_writer LOGIN PASSWORD 'another_strong_password';
GRANT CONNECT ON DATABASE myapp TO app_writer;
GRANT USAGE ON SCHEMA public TO app_writer;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_writer;
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO app_writer;

-- Revoke access to sensitive tables from the analytics role
REVOKE ALL ON TABLE users FROM app_analytics;

Row-Level Security (RLS) — PostgreSQL

RLS lets you define per-row access rules. Each query automatically has an invisible WHERE clause appended — users can only see/modify rows they’re allowed to.

-- Enable RLS on orders table
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;

-- Policy: users can only see their own orders
CREATE POLICY user_orders ON orders
  FOR SELECT
  USING (user_id = current_setting('app.current_user_id')::integer);

-- The application sets the user context before queries:
SET app.current_user_id = '1';
SELECT * FROM orders;  -- only sees orders where user_id = 1

-- Admin bypass: superusers and table owners skip RLS by default
-- Force RLS even for the table owner:
ALTER TABLE orders FORCE ROW LEVEL SECURITY;
RLS advantage

Defence in depth

Even if application code has a bug (missing WHERE), RLS ensures users never see other users’ data. The policy is enforced at the database level.

RLS caveat

Performance impact

The USING clause is appended to every query. Complex policies on large tables can prevent index usage. Always EXPLAIN ANALYZE with RLS enabled.

03
Section Three · Readable, maintainable SQL

Style & Conventions

SQL has no official style guide, but consistent formatting makes queries dramatically easier to review, debug, and modify. Here are the conventions used throughout this tutorial:

Formatting standards

-- Keywords uppercase, identifiers lowercase
SELECT u.full_name,
       o.placed_at,
       SUM(oi.quantity * oi.unit_price) AS order_total
FROM   users u
JOIN   orders o      ON o.user_id = u.id
JOIN   order_items oi ON oi.order_id = o.id
WHERE  o.status = 'paid'
  AND  o.placed_at >= '2025-01-01'
GROUP BY u.full_name, o.placed_at
HAVING SUM(oi.quantity * oi.unit_price) > 50
ORDER BY order_total DESC
LIMIT  10;
Keywords

UPPERCASE

SELECT, FROM, WHERE, JOIN, GROUP BY, ORDER BY, INSERT, UPDATE, DELETE. Makes structure visible at a glance.

Identifiers

lowercase_snake_case

Table and column names: order_items, placed_at, user_id. Never use spaces or mixed case.

Alignment

River style

Align keywords to the right of a vertical “river” so the clause content starts at the same column. Makes scanning easy.

Aliases

Short but meaningful

u for users, o for orders, oi for order_items. Always alias when joining ≥2 tables.

Naming conventions

  • Tables: plural nouns (users, orders, order_items).
  • Columns: singular (email, placed_at, user_id). Foreign keys = referenced_table_singular_id.
  • Indexes: idx_tablename_col1_col2.
  • Constraints: pk_tablename, uq_tablename_col, fk_tablename_col, chk_tablename_rule.
  • Boolean columns: prefix with is_ or has_ (is_active, has_verified_email).
  • Timestamps: suffix with _at (created_at, updated_at, deleted_at).

Linting with SQLFluff

SQLFluff is an open-source SQL linter and formatter. It enforces consistent style across a team and catches common issues (trailing commas, implicit aliases, ambiguous column references).

-- Install
pip install sqlfluff

-- Lint a file
sqlfluff lint my_query.sql --dialect postgres

-- Auto-fix formatting
sqlfluff fix my_query.sql --dialect postgres
Team tip: add SQLFluff to your CI pipeline as a pre-commit hook or GitHub Action. Like code formatting, SQL style arguments disappear once the linter makes the decision automatically.
04
Section Four · Putting it all together

Capstone Project

Design a schema and write 10 queries that exercise multiple techniques from across the tutorial. Use the spec below as your requirements document.

The spec: Online Bookstore

You are building the database for a small online bookstore. Requirements:

  • Customers can register with an email and name.
  • Books have a title, author, ISBN (unique), price, and one or more genres (many-to-many).
  • Customers can place orders containing one or more books, each with a quantity.
  • Orders have a status lifecycle: pending → paid → shipped → delivered.
  • Customers can leave reviews (1–5 stars + text) on books they have purchased.

Task 1: Design the schema

Create tables normalised to 3NF. Include appropriate PKs, FKs (with ON DELETE actions), NOT NULL, CHECK, and DEFAULT constraints. Create indexes on FK columns and common query patterns.

Task 2: Write 10 queries

Each query should use at least one technique from the tutorial. Aim to cover:

  1. Basic SELECT + WHERE + ORDER BY — list all books in a genre sorted by price.
  2. Aggregation + GROUP BY + HAVING — find genres with more than 5 books.
  3. JOIN (multi-table) — show each customer’s most recent order with total amount.
  4. LEFT JOIN + NULL check — find customers who have never placed an order.
  5. Subquery / EXISTS — find books that have never been ordered.
  6. CTE + window function — rank books by total revenue within each genre.
  7. Running total — monthly cumulative revenue for the past year.
  8. UPSERT — insert or update a book’s price by ISBN.
  9. Recursive CTE — if genres have a parent_genre_id, list all sub-genres of “Fiction”.
  10. LATERAL / top-N per group — top 3 best-selling books per genre.

Self-review checklist

Schema

Check these

  • Every table has a PK
  • FK columns have indexes (Ch 08)
  • NOT NULL on all required columns
  • CHECK on price ≥ 0, rating 1–5
  • Junction table for books ↔ genres
  • No 1NF/2NF/3NF violations
Queries

Check these

  • No SELECT * in production queries
  • Joins use explicit syntax (not comma)
  • NOT IN replaced with NOT EXISTS (NULLs)
  • Window frame explicit where needed
  • EXPLAIN ANALYZE on the 3 most complex
  • Formatting follows the style guide (§3)
Congratulations! You’ve completed all 12 chapters of the SQL tutorial. You now have the vocabulary, mental models, and practical patterns to:
  • Design normalised schemas with proper constraints (Ch 07)
  • Write queries ranging from simple SELECTs to recursive CTEs and window functions (Ch 01–10)
  • Optimise with indexes and EXPLAIN (Ch 08)
  • Write safely with transactions, parameterised queries, and least-privilege roles (Ch 06, 12)
  • Use advanced patterns — upsert, JSON, pivot, LATERAL — in production (Ch 11)
  • Evolve schemas without downtime (Ch 12)

The capstone above is your final exam. Build it. Run it. Break it. Fix it. That’s how the knowledge sticks.

Chapter 11 — Advanced SQL Patterns  ·  Back to Chapter 01 →