SQL ยท Beginner

Filtering with WHERE

The WHERE clause is how a database becomes useful โ€” instead of "give me everything", you ask "give me the rows that match these conditions". This chapter covers comparison operators, combining predicates with AND/OR/NOT, pattern matching with LIKE, range/list shortcuts, and the three-valued logic that makes NULL surprising.

01 Database ยท 02 WHERE ยท 03 Sort ยท 04 Aggregate ยท 05 Joins ยท 06 Write ยท 07 Schema ยท 08 Indexes ยท 09 CTEs ยท 10 Window ยท 11 Advanced ยท 12 Production · 13 PL/SQL
01
Section One ยท Comparison Operators

WHERE Basics

all rows WHERE predicate matching rows one row at a time: predicate TRUE โ†’ keep, FALSE/NULL โ†’ drop

Think of WHERE as a row-by-row test. The database walks the table, evaluates your predicate against each row, and keeps the row only if the predicate evaluates to TRUE. Both FALSE and NULL rows are discarded โ€” that's a fact we'll come back to in ยง4.

The six comparison operators

These work on numbers, strings, and dates. SQL is whitespace-tolerant โ€” formatting is up to you.

-- Equality and inequality
SELECT * FROM products WHERE price = 42.00;
SELECT * FROM products WHERE price != 42.00;   -- also written <> in standard SQL
SELECT * FROM products WHERE price <> 42.00;   -- identical to !=

-- Range
SELECT * FROM products WHERE price <  20;
SELECT * FROM products WHERE price <= 20;
SELECT * FROM products WHERE price >  30;
SELECT * FROM products WHERE price >= 30;
id  | sku    | name           | price | category
----+--------+----------------+-------+----------
103 | MG-001 | Database Mug   |  9.50 | merch

Numbers vs strings โ€” the quoting rule

String and date literals must be wrapped in single quotes. Numbers must not. Double quotes mean something else in most dialects (an identifier โ€” a column or table name), so don't use them for values.

-- โœ… Number: no quotes
SELECT * FROM products WHERE price = 42;

-- โœ… String: single quotes
SELECT * FROM products WHERE category = 'books';

-- โœ… Date / timestamp: single quotes, ISO-8601 format
SELECT * FROM orders WHERE placed_at >= '2025-02-12';

-- โš  This is wrong in PostgreSQL: "books" is interpreted as a column name
SELECT * FROM products WHERE category = "books";
-- ERROR: column "books" does not exist

Case sensitivity โ€” a dialect minefield

Whether 'Books' matches 'books' depends on three things: the database engine, the column's collation, and (for PostgreSQL) which operator you use. This trips up beginners more than any other quirk.

PostgreSQL

String comparisons are case-sensitive

'books' = 'Books' is FALSE. Use ILIKE (case-insensitive LIKE) or LOWER(col) = LOWER('Books') when you need to match either case.

MySQL

Depends on the collation

Default collations like utf8mb4_general_ci are case-insensitive โ€” 'books' = 'Books' is TRUE. Switch to a _bin collation for case-sensitive matching.

SQLite

Case-sensitive by default for =

But LIKE is case-insensitive for ASCII by default โ€” the opposite of =. Set PRAGMA case_sensitive_like = ON to change it.

Identifiers

Table & column names

PostgreSQL folds unquoted names to lowercase; SQL Server is case-insensitive; quoted identifiers ("MyColumn") are case-sensitive everywhere. Stick to snake_case and you'll never hit this.

Common mistake: writing WHERE category = "books" with double quotes. It works in MySQL (which is non-standard here) but breaks in PostgreSQL with a confusing "column does not exist" error. Use single quotes for values, every time.
02
Section Two ยท Logical Operators

Combining Conditions

Real questions are rarely a single comparison. "Show me paid orders placed in the last week from users who joined this year" is three conditions glued together. SQL gives you AND, OR, and NOT for that โ€” same three connectives as every programming language, with one twist around precedence.

-- AND: both must be true
SELECT * FROM orders
WHERE  status = 'paid'
  AND  placed_at >= '2025-02-11';

-- OR: at least one must be true
SELECT * FROM orders
WHERE  status = 'paid'
   OR  status = 'shipped';

-- NOT: invert a condition
SELECT * FROM orders
WHERE  NOT status = 'cancelled';

Precedence โ€” AND binds tighter than OR

Multiply-before-add. Without parentheses, SQL evaluates AND before OR, exactly like 2 + 3 ร— 4 = 14, not 20. This is the single most common logical bug beginners write โ€” and it returns answers that look plausible, so it's easy to miss.

Same words, different tree โ€” parentheses change the result
โ‘  WITHOUT parens โ€” surprising status='paid' OR status='shipped' AND price > 50 SQL parses this as: OR status='paid' AND status='shipped' price > 50 = all paid (any price) + shipped only over $50 โ‘ก WITH parens โ€” what you meant (status='paid' OR status='shipped') AND price > 50 SQL parses this as: AND OR status='paid' status='shipped' price > 50 = paid or shipped, only over $50

A worked example

Using the sample schema, find every order that was either paid or shipped and was placed on or after Feb 12.

-- โš  Without parens โ€” probably not what you meant
SELECT id, status, placed_at
FROM   orders
WHERE  status = 'paid'
   OR  status = 'shipped'
  AND  placed_at >= '2025-02-12';
-- Returns orders 1001, 1002 (any 'paid', regardless of date)
-- and 1003 (shipped AND late enough). Order 1004 cancelled is out.

-- โœ… With parens โ€” the question we actually asked
SELECT id, status, placed_at
FROM   orders
WHERE  (status = 'paid' OR status = 'shipped')
  AND  placed_at >= '2025-02-12';
-- Returns only 1003 โ€” the only paid-or-shipped order on/after Feb 12.
Rule of thumb: any time you mix AND and OR in the same query, add the parentheses anyway. Even when they're not strictly required, they make intent obvious to the next reader (often: you, six months from now).

De Morgan's laws โ€” distributing NOT

This matters whenever you want to exclude a combination of conditions. Suppose a manager asks: "Show me every order that is not a paid order from February." That's a negation of an AND โ€” and rewriting it as separate conditions is often easier to read and easier to index (Chapter 08). Negating a compound predicate flips the connective:

-- NOT (A AND B)   โ‰ก   (NOT A) OR  (NOT B)
-- NOT (A OR  B)   โ‰ก   (NOT A) AND (NOT B)

-- "Orders that are NOT (paid AND placed in Feb)":
SELECT * FROM orders
WHERE  NOT (status = 'paid' AND placed_at >= '2025-02-01');

-- Equivalent โ€” distributed NOT:
SELECT * FROM orders
WHERE  status != 'paid'
   OR  placed_at < '2025-02-01';
03
Section Three ยท Beyond Equality

Patterns, Ranges & Lists

Equality covers a lot, but real questions need more. "Emails ending in @example.com", "orders placed between Feb 10 and Feb 14", "products in books or merch but not electronics" โ€” each of these has a clean shortcut.

LIKE & wildcards

LIKE compares against a pattern with two wildcards:

  • % โ€” matches zero or more of any character
  • _ โ€” matches exactly one character
Wildcard cheat sheet โ€” same input, different patterns
pattern 'ada@example.com' tested against result 'ada%' starts with "ada" โœ“ TRUE '%@example.com' ends with "@example.com" โœ“ TRUE '%example%' "example" anywhere inside โœ“ TRUE '_da@example.com' 1 char, then "da@example.com" โœ“ TRUE 'ada' no wildcards โ€” must equal exactly โœ— FALSE why: with no % or _, LIKE collapses to "=", and 'ada' โ‰  'ada@example.com' '__@example.com' exactly 2 chars before @ โœ— FALSE why: each _ matches exactly 1 char โ€” "ada" has 3, so the count is off
-- Emails ending in @example.com
SELECT * FROM users WHERE email LIKE '%@example.com';

-- Names starting with "L"
SELECT * FROM users WHERE full_name LIKE 'L%';

-- 6-character SKU starting with 'BK-'
SELECT * FROM products WHERE sku LIKE 'BK-___';

-- Negation
SELECT * FROM users WHERE email NOT LIKE '%@example.com';
Dialect note โ€” case-insensitive matching:
  • PostgreSQL: use ILIKE โ€” email ILIKE '%@EXAMPLE.com'.
  • MySQL: LIKE is already case-insensitive under default collations.
  • SQLite: LIKE is case-insensitive for ASCII; case-sensitive for Unicode. PRAGMA case_sensitive_like = ON flips it.
Common mistake: writing WHERE name LIKE 'Ada' with no wildcards. It's just a slower =. Use = for exact matches.

BETWEEN โ€ฆ AND โ€” inclusive ranges

BETWEEN x AND y is shorthand for col >= x AND col <= y โ€” both endpoints are included. It works on numbers, dates, and even strings (alphabetic range).

-- Orders placed Feb 10 through Feb 14 (inclusive)
SELECT * FROM orders
WHERE  placed_at BETWEEN '2025-02-10' AND '2025-02-14';

-- Products priced $10 through $40 (inclusive)
SELECT * FROM products WHERE price BETWEEN 10 AND 40;

-- Negation
SELECT * FROM products WHERE price NOT BETWEEN 10 AND 40;
Date gotcha: placed_at BETWEEN '2025-02-10' AND '2025-02-14' includes Feb 14 โ€” but only the instant 2025-02-14 00:00:00. An order at 2025-02-14 16:45:00 is excluded! Prefer half-open ranges for timestamps:
placed_at >= '2025-02-10' AND placed_at < '2025-02-15'.

IN (list) โ€” multiple equality checks

IN is shorthand for a chain of OR col = value. Cleaner to write, faster to read, and the engine can usually evaluate it more efficiently than the equivalent OR chain.

-- Long form
SELECT * FROM orders
WHERE  status = 'paid' OR status = 'shipped' OR status = 'pending';

-- Same query, idiomatic
SELECT * FROM orders WHERE status IN ('paid', 'shipped', 'pending');

-- Negation
SELECT * FROM orders WHERE status NOT IN ('cancelled');
NULL trap in NOT IN: if the list contains NULL, NOT IN returns zero rows. Why? NOT IN (1, 2, NULL) expands to x != 1 AND x != 2 AND x != NULL, and that last comparison is always NULL, dragging the whole AND down to NULL โ†’ row dropped. Filter NULLs out of the list, or use NOT EXISTS (covered in Chapter 09).
04
Section Four ยท The NULL Problem

Three-Valued Logic & IS NULL

Chapter 01 introduced the rule: NULL means "unknown", and any comparison against NULL yields NULL, not TRUE or FALSE. Now we have to deal with what happens when NULL meets AND, OR, and NOT. This is the part of SQL where pure boolean intuition stops working โ€” but a single truth table covers all of it.

Three-valued logic โ€” every predicate is one of TRUE ยท FALSE ยท NULL
AND A \ B TRUE FALSE NULL TRUE TRUE FALSE NULL FALSE FALSE FALSE FALSE NULL NULL FALSE NULL OR A \ B TRUE FALSE NULL TRUE TRUE TRUE TRUE FALSE TRUE FALSE NULL NULL TRUE NULL NULL NOT A NOT A TRUE FALSE FALSE TRUE NULL NULL The WHERE rule A row is kept only when the predicate is TRUE. Both FALSE and NULL drop the row. Note: NOT NULL stays NULL โ€” that's why NOT col = 'x' silently excludes rows where col IS NULL.

Two practical consequences

Both of these look "obviously equivalent" โ€” they aren't.

-- โ‘  WHERE x = 5 OR x != 5
-- You might think this matches every row. It does NOT match rows where x IS NULL,
-- because both halves evaluate to NULL and OR(NULL, NULL) = NULL.

-- โ‘ก WHERE NOT status = 'paid'
-- This drops rows where status IS NULL โ€” because NOT NULL is still NULL.
-- If you want "everything except paid, including unknown", you need:
SELECT * FROM orders WHERE status != 'paid' OR status IS NULL;

IS NULL and IS NOT NULL โ€” the only safe checks

IS NULL never returns NULL โ€” it returns TRUE or FALSE, every time. It's the single escape hatch out of three-valued logic.

-- Find users with no recorded signup date
SELECT * FROM users WHERE created_at IS NULL;

-- Same, negated
SELECT * FROM users WHERE created_at IS NOT NULL;

-- Combine with other predicates as usual
SELECT * FROM users
WHERE  created_at IS NOT NULL
  AND  created_at >= '2025-01-01';
Preview โ€” COALESCE: SQL has a helper called COALESCE(a, b, โ€ฆ) that returns the first non-NULL argument, so you can "substitute" a value for NULL. It's most useful when you're shaping the SELECT list around aggregates โ€” we'll introduce it properly in Chapter 04, where NULLs in COUNT/SUM/AVG create real-world reasons to use it. For now, just remember the name.
Mental shortcut for the whole chapter:
  • Single quotes for values, double quotes (or none) for identifiers.
  • Parentheses any time you mix AND and OR.
  • Half-open ranges (>= start AND < end) for timestamps, BETWEEN for integers.
  • NULL drops the row in WHERE, in NOT, and inside NOT IN. IS NULL is the only safe check.

โ† Chapter 01 โ€” What Is a Database  ยท  ๐Ÿ‘‰ Continue to Chapter 03 โ€” Sorting & Limiting โ†’