SQL ยท Beginner

What Is a Database?

A database is a place that answers questions about your data. This chapter introduces the mental model โ€” tables, rows, columns โ€” runs your first SELECT, explains what NULL really means, and shows the three SQL dialects you will see throughout this tutorial.

01 Database ยท 02 WHERE ยท 03 Sort ยท 04 Aggregate ยท 05 Joins ยท 06 Write ยท 07 Schema ยท 08 Indexes ยท 09 CTEs ยท 10 Window ยท 11 Advanced ยท 12 Production · 13 PL/SQL
01
Section One ยท Mental Model

Why Databases Exist

SPREADSHEET one file ยท one user RELATIONAL DATABASE users orders products many tables ยท many users ยท queries

A spreadsheet is fine when one person edits a few hundred rows. It stops working the moment two people need to update it at the same time, or the moment you have a million rows, or the moment you want to answer a question like "which customers bought item X in the last 7 days?" A database solves all three problems: it stores data on disk in a structured format, lets many users read and write concurrently without corruption, and exposes a query language โ€” SQL โ€” to ask questions of the data.

A relational database stores data in tables. A table has named columns (each with a type โ€” text, number, date) and contains rows (each row is one record). The relational model โ€” invented by E. F. Codd in 1970 โ€” is just this: data lives in tables, tables can reference each other, and you query them with a declarative language. The software that manages these tables is called a relational database management system, or RDBMS: PostgreSQL, MySQL, SQLite, SQL Server, and Oracle are all RDBMSes.

Vocabulary โ€” the parts of a table
users id email full_name 1 ada@example.com Ada Lovelace 2 linus@example.com Linus Torvalds 3 grace@example.com Grace Hopper table row (record) column (field) cell (one value) header (column names)
Table

A named collection of rows

Picture a sheet with a name (users), a fixed set of columns, and an unordered set of rows. The order of rows in storage is an implementation detail โ€” never rely on it without ORDER BY.

Row

One record

A single instance of whatever the table represents โ€” one user, one order, one product. Rows are uniquely identified by a primary key (covered in Chapter 07).

Column

A named, typed field

Every column has a name (email) and a data type (TEXT, INTEGER, TIMESTAMP). The type determines what values are legal and how comparisons behave.

RDBMS

The software

The engine that stores tables on disk, runs queries, enforces types and constraints, and coordinates concurrent access. PostgreSQL, MySQL, SQLite are common open-source RDBMSes.

E-commerce mental picture: Imagine an online store. You have a users table (one row per customer), a products table (one row per item for sale), an orders table (one row per checkout), and an order_items table (one row per product within an order). These four tables โ€” wired together with relationships โ€” power every report the business will ever ask for. We will use this exact schema for the rest of the tutorial.
02
Section Two ยท Reading Data

Your First Query

SQL is a declarative language: you describe what you want, not how to get it. The most basic question you can ask a database is "show me everything in this table." The answer is a result set โ€” itself a table of rows and columns, returned to your client.

Read every row, every column

The * means "all columns." Useful while exploring; avoid in production code (more on that in Chapter 03).

SELECT * FROM users;

Result:

id  | email                | full_name      | created_at
----+----------------------+----------------+---------------------
1   | ada@example.com      | Ada Lovelace   | 2025-01-04 10:22:00
2   | linus@example.com    | Linus Torvalds | 2025-01-09 14:05:33
3   | grace@example.com    | Grace Hopper   | NULL

Pick the columns you actually need

Naming columns explicitly is faster (less data over the wire), self-documenting, and survives schema changes.

SELECT email, full_name
FROM   users;
email                | full_name
---------------------+---------------
ada@example.com      | Ada Lovelace
linus@example.com    | Linus Torvalds
grace@example.com    | Grace Hopper

What does NULL mean?

NULL is not zero, not empty string, not false โ€” it means unknown or missing. Grace Hopper's row above has created_at = NULL, meaning the system does not know when her account was created. This is critical because any comparison against NULL yields NULL, not true or false:

-- โš  This returns ZERO rows, even if some created_at are NULL:
SELECT * FROM users WHERE created_at = NULL;

-- โœ… Use IS NULL / IS NOT NULL instead:
SELECT * FROM users WHERE created_at IS NULL;
Common mistake: Writing = NULL or != NULL. Both silently return no rows. Always use IS NULL / IS NOT NULL. We cover this in detail in Chapter 02.
Why = NULL never matches โ€” a small truth table
expression result row matches? 5 = 5 TRUE โœ“ yes 5 = 6 FALSE โœ— no NULL = NULL NULL โœ— no (!) NULL = 5 NULL โœ— no NULL IS NULL TRUE โœ“ yes WHERE only keeps rows whose expression is TRUE โ€” NULL is filtered out, same as FALSE.

Anatomy of a SELECT โ€” write order vs execution order

SQL is read top-to-bottom โ€” but the database does not execute the clauses in that order. You write SELECT first, but the engine evaluates FROM first (get the rows), then WHERE (filter), then SELECT (pick columns), then ORDER BY (sort). This single fact explains many beginner gotchas in later chapters โ€” for example, why you can use a column alias in ORDER BY but not in WHERE.

You write it like this โ€ฆ but the database runs it like that
โ‘  WRITE ORDER (what you type) SELECT email, full_name FROM users WHERE created_at IS NOT NULL ORDER BY created_at DESC; columns first โ€” the question "what columns do I want?" engine reorders โ‘ก EXECUTION ORDER (what runs) 1 FROM users โ†’ load all rows 2 WHERE โ†’ drop NULL rows 3 SELECT โ†’ project columns 4 ORDER BY โ†’ sort result data flows top โ†’ bottom; each step feeds the next โš  Column aliases defined in SELECT (step 3) are NOT visible inside WHERE (step 2).
Why this matters later: when Chapter 04 introduces GROUP BY, the full processing order becomes FROM โ†’ WHERE โ†’ GROUP BY โ†’ HAVING โ†’ SELECT โ†’ ORDER BY โ†’ LIMIT. Every "I can't use X here" error you'll hit in SQL comes down to this order โ€” the clause you're using runs before the thing it depends on.
03
Section Three ยท The Ecosystem

SQL Dialects

SQL is a standard (ISO/IEC 9075), but every database vendor extends it. The core โ€” SELECT, WHERE, JOIN, GROUP BY โ€” is identical everywhere. The edges โ€” string functions, date handling, JSON support, upsert syntax โ€” differ. This tutorial uses PostgreSQL syntax as the default and calls out where MySQL and SQLite diverge.

PostgreSQL

The reference dialect here

Most standards-compliant of the popular OSS engines. Rich type system (JSONB, arrays, ranges), strong window functions, recursive CTEs, and excellent EXPLAIN ANALYZE. Default for this tutorial.

MySQL

Ubiquitous in web apps

Powers a huge fraction of LAMP-stack sites. Slightly looser on the standard โ€” historically permissive about GROUP BY (see Chapter 04). Added CTEs and window functions in 8.0.

SQLite

The zero-setup playground

A single-file embedded database, no server required. Dynamic typing (column "type" is a hint, not a constraint). Perfect for learning โ€” every browser-based playground uses it under the hood.

Others

SQL Server, Oracle

Used heavily in enterprise. SQL Server uses TOP n instead of LIMIT n; Oracle uses ROWNUM / FETCH FIRST n ROWS. The mental model is identical โ€” only the keywords change.

Free playgrounds โ€” pick one and follow along

  • sqliteonline.com โ€” runs in your browser, supports SQLite / PostgreSQL / MySQL with one click. Recommended for this tutorial.
  • db-fiddle.com โ€” paste a schema, share a link to a query. Great for asking for help.
  • Local PostgreSQL โ€” brew install postgresql or apt install postgresql, then psql. Closest to a real production setup.
How to follow along: Open a playground, paste the sample schema from ยง4 below, and run every query as you read it. Then change one thing โ€” a column, a filter, a value โ€” and run it again. Typing the query yourself is the difference between recognising SQL and being able to write it.
04
Section Four ยท The Schema We Will Use

The Tutorial's Sample Schema

Every chapter from here on uses the same four-table e-commerce schema. Run this block once in your playground; later chapters will refer back to it. Don't worry about understanding every keyword โ€” CREATE TABLE is covered properly in Chapter 07.

-- Customers
CREATE TABLE users (
  id         INTEGER PRIMARY KEY,
  email      TEXT NOT NULL UNIQUE,
  full_name  TEXT NOT NULL,
  created_at TIMESTAMP
);

-- Catalog
CREATE TABLE products (
  id         INTEGER PRIMARY KEY,
  sku        TEXT NOT NULL UNIQUE,
  name       TEXT NOT NULL,
  price      NUMERIC(10,2) NOT NULL,
  category   TEXT
);

-- One row per checkout
CREATE TABLE orders (
  id         INTEGER PRIMARY KEY,
  user_id    INTEGER NOT NULL REFERENCES users(id),
  status     TEXT NOT NULL,             -- 'pending' | 'paid' | 'shipped' | 'cancelled'
  placed_at  TIMESTAMP NOT NULL
);

-- One row per product within an order
CREATE TABLE order_items (
  order_id    INTEGER NOT NULL REFERENCES orders(id),
  product_id  INTEGER NOT NULL REFERENCES products(id),
  quantity    INTEGER NOT NULL,
  unit_price  NUMERIC(10,2) NOT NULL,
  PRIMARY KEY (order_id, product_id)
);

-- Seed data
INSERT INTO users (id, email, full_name, created_at) VALUES
  (1, 'ada@example.com',   'Ada Lovelace',   '2025-01-04 10:22:00'),
  (2, 'linus@example.com', 'Linus Torvalds', '2025-01-09 14:05:33'),
  (3, 'grace@example.com', 'Grace Hopper',   NULL);

INSERT INTO products (id, sku, name, price, category) VALUES
  (101, 'BK-001', 'SQL Cookbook',    34.99, 'books'),
  (102, 'BK-002', 'Designing Data',  42.00, 'books'),
  (103, 'MG-001', 'Database Mug',     9.50, 'merch');

INSERT INTO orders (id, user_id, status, placed_at) VALUES
  (1001, 1, 'paid',     '2025-02-10 09:00:00'),
  (1002, 2, 'paid',     '2025-02-11 11:30:00'),
  (1003, 1, 'shipped',  '2025-02-14 16:45:00'),
  (1004, 3, 'cancelled','2025-02-15 08:10:00');

INSERT INTO order_items (order_id, product_id, quantity, unit_price) VALUES
  (1001, 101, 1, 34.99),
  (1001, 103, 2,  9.50),
  (1002, 102, 1, 42.00),
  (1003, 101, 1, 34.99),
  (1003, 102, 1, 42.00),
  (1004, 103, 1,  9.50);

Sanity-check it loaded

SELECT COUNT(*) AS user_count    FROM users;       -- expect 3
SELECT COUNT(*) AS product_count FROM products;   -- expect 3
SELECT COUNT(*) AS order_count   FROM orders;     -- expect 4
SELECT COUNT(*) AS item_count    FROM order_items;-- expect 6
Next up โ€” Chapter 02: we will start asking real questions of this data. Which orders are still pending? Which users joined in January? Which products cost more than $20? All of that is the WHERE clause.

๐Ÿ‘‰ Continue to Chapter 02 โ€” Filtering with WHERE โ†’