Sorting & Limiting
By default a relational table is an unordered set β the order you see rows is whatever the engine finds convenient. ORDER BY imposes a sort. LIMIT and OFFSET let you slice that sorted output into pages. This chapter explains how the sort actually works (including the NULL-placement trap), why OFFSET is fine for a "page 2" link but lethal for "page 1000", and how to use aliases and DISTINCT without surprising yourself.
Ordering Results
A table without ORDER BY is an unordered set. The engine returns rows in whatever order is cheapest at this moment β could be insertion order, could be index order, could change after a vacuum or an upgrade. If row order matters in your output, you must say so explicitly.
Basic syntax β one column, ascending or descending
ORDER BY is the last clause to run, after SELECT projects the result. The default direction is ASC (ascending) β you don't have to write it, but doing so makes intent obvious.
-- Cheapest products first (ASC is the default; explicit is clearer)
SELECT name, price
FROM products
ORDER BY price ASC;
-- Most expensive first
SELECT name, price
FROM products
ORDER BY price DESC; name | price
---------------+------
Database Mug | 9.50
SQL Cookbook | 34.99
Designing Data | 42.00
Multi-column sort β primary, tie-breaker, tie-breakerβ¦
You can list more than one sort key. The first key is the primary sort; subsequent keys break ties only when the previous key is equal. Each key gets its own direction.
ORDER BY category ASC, price DESC-- "books" first (alphabetical), then within each category, highest price first
SELECT name, category, price
FROM products
ORDER BY category ASC, price DESC; The mental model: imagine bucketing rows by the first key, then sorting each bucket by the second key, then concatenating the buckets. Add a third key and you sort within each (category, price) pair, and so on.
Sorting by position or alias
ORDER BY can refer to columns three ways: by name, by 1-based position in the SELECT list, or by an alias defined in the SELECT list. Aliases work here because ORDER BY runs after SELECT in the logical processing order (Chapter 01, Β§2).
-- By name
SELECT name, price FROM products ORDER BY price DESC;
-- By position β "2" means the second column in the SELECT list
SELECT name, price FROM products ORDER BY 2 DESC;
-- By alias defined in SELECT
SELECT name, price * 1.1 AS price_with_tax
FROM products
ORDER BY price_with_tax DESC; WHERE. WHERE price_with_tax > 40 fails because WHERE runs before SELECT. The same alias works fine in ORDER BY, which runs after. (Chapter 01, Β§2 has the full pipeline.)
Style note:
ORDER BY 2 is convenient when poking around but fragile β adding a column to the SELECT list silently breaks it. Prefer names or aliases in committed code.
Where do NULLs go?
A sort key that contains NULL raises a question the SQL standard does not answer: should NULLs sort first or last? Every dialect picks a default β and the defaults disagree. This is one of the few places where the same query gives a different row order on different engines.
ORDER BY created_at ASC on the same dataForcing NULL placement (portable enough)
PostgreSQL and SQLite (3.30+) accept the standard NULLS FIRST / NULLS LAST qualifier. MySQL does not β there you sort on a synthetic key that pushes NULLs to whichever end you want.
-- β
PostgreSQL, SQLite 3.30+
SELECT full_name, created_at
FROM users
ORDER BY created_at ASC NULLS LAST;
-- β
MySQL workaround β sort by "is NULL" first
SELECT full_name, created_at
FROM users
ORDER BY (created_at IS NULL) ASC, -- 0 (not null) before 1 (null)
created_at ASC;
-- "(created_at IS NULL)" returns 0 or 1; 0 sorts first β NULLs end up LAST. Pagination
LIMIT n says "give me at most n rows." OFFSET k says "skip the first k of them." Together they're the textbook recipe for "page 2 of search results" β and they work fine until your data grows. Then the cost of OFFSET starts hurting in a way that's worth understanding before you ship it.
LIMIT basics
LIMIT almost always appears with ORDER BY β "top 5" only makes sense if you've decided what "top" means. Without an ORDER BY, LIMIT just hands back some n rows, with no guarantee about which n.
-- Three cheapest products
SELECT name, price
FROM products
ORDER BY price ASC
LIMIT 3;
-- Most recent paid order
SELECT id, user_id, placed_at
FROM orders
WHERE status = 'paid'
ORDER BY placed_at DESC
LIMIT 1; OFFSET for "page N"
OFFSET k LIMIT n means "discard the first k rows of the ordered result, then return the next n." For a 20-per-page list, page 1 is OFFSET 0 LIMIT 20, page 2 is OFFSET 20 LIMIT 20, page 47 is OFFSET 920 LIMIT 20.
-- Page 2 of products, 2 per page, cheapest first
SELECT name, price
FROM products
ORDER BY price ASC
LIMIT 2 OFFSET 2; - PostgreSQL Β· MySQL Β· SQLite:
LIMIT n OFFSET k(or the olderLIMIT k, nin MySQL β note the reversed argument order; avoid it). - SQL Server:
OFFSET k ROWS FETCH NEXT n ROWS ONLY(requires anORDER BY). - Oracle 12c+:
OFFSET k ROWS FETCH FIRST n ROWS ONLYβ same standard syntax.
The OFFSET performance cliff
Here's the part most tutorials skip. OFFSET 1000 doesn't magically jump to row 1001. The database has to produce the first 1000 rows in sort order, then throw them away. Doubling the page number roughly doubles the work. By page 5000 the query is doing 100,000Γ the work of page 1 β and the user only ever sees 20 rows of output.
OFFSET gets slower as you page deeperKeyset pagination β paginate by a unique anchor
Instead of "skip the first 19,980 rows", keyset pagination says "give me the next 20 rows after this specific sort-key value." It needs the same ordering, but the database can use an index to jump directly to the cursor β constant time, regardless of how deep the user has paged.
-- First page: most recent orders (no cursor yet)
SELECT id, placed_at
FROM orders
ORDER BY placed_at DESC, id DESC -- secondary key breaks ties
LIMIT 20;
-- Subsequent pages β pass back the last row's (placed_at, id) from the previous page.
-- Suppose the last row of page 1 was placed_at = '2025-02-11 11:30:00', id = 1002.
SELECT id, placed_at
FROM orders
WHERE (placed_at, id) < ('2025-02-11 11:30:00', 1002) -- row-tuple comparison
ORDER BY placed_at DESC, id DESC
LIMIT 20; Two important details:
- Tie-breaking column. If two rows have the same
placed_at, comparing on that alone could skip rows or repeat them. Includingidguarantees a total order. - Row-tuple comparison.
(a, b) < (x, y)means "a < x, or (a = x and b < y)" β exactly the lexicographic test you'd write by hand. PostgreSQL, MySQL 8+, and SQLite support it. On older engines, expand it manually.
When it's fine
Small tables, shallow pages (page 1β10), admin UIs, or pre-aggregated reports. The simplicity is worth it.
When it bites
Large tables with deep paging, "next" buttons on infinite scroll, anything user-facing where p99 latency matters.
When to reach for it
Activity feeds, search-result paging, API "since this id" endpoints β any pattern where the next page anchors on the previous one.
Trade-offs
You give up "jump to page 47" β keyset is sequential by design. And the URL must carry the cursor, not just a page number.
ORDER BY. Page 1 might return rows 1β20, page 2 might return rows 5, 14, 22, 28β¦ because the engine is free to scan in any order both times. Always sort, even if the sort key feels obvious.
Controlling the SELECT List
The list of expressions between SELECT and FROM is called the projection. It decides which columns are in the result, what they're named, and whether duplicates are kept. Three habits separate exploratory queries from production-ready ones.
Specific columns vs SELECT *
In an interactive shell, SELECT * is a quick way to see what a table looks like. In application code it's a footgun for three reasons:
- Bandwidth. A row with 30 columns might be 4 KB even when your code only reads 2. Multiply by every query, every user, every day.
- Schema drift. Tomorrow someone adds a
password_hashcolumn tousers. Your "innocent"SELECT *is now leaking secrets into your API response. - Order assumptions.
SELECT *hands back columns in whatever order the table defines them. Code that destructures by position breaks when a column is added or reordered.
-- β Exploratory only
SELECT * FROM users;
-- β
Production: spell it out
SELECT id, email, full_name FROM users; Column aliases with AS
An alias renames a column in the result set. The underlying table is untouched. Aliases serve three purposes: hide ugly expressions, give computed columns meaningful names, and disambiguate when two joined tables have a column called id (Chapter 05).
-- Computed column with a clear name
SELECT id,
name,
price * 1.1 AS price_with_tax,
price * 0.9 AS sale_price
FROM products;
-- The AS keyword is optional β these two are identical:
SELECT name AS product_name FROM products;
SELECT name product_name FROM products; -- legal but harder to read
-- Aliases with spaces or special chars need double quotes
SELECT name AS "Product Name", price AS "Price (USD)"
FROM products; - SELECT list β yes (you defined it here, so subsequent expressions in some dialects can use it; portable code shouldn't rely on this).
- WHERE / JOIN / GROUP BY β no. These all run before SELECT. Repeat the expression or wrap in a subquery / CTE (Chapter 09).
- HAVING β no in standard SQL; yes in MySQL. Don't rely on it.
- ORDER BY β yes everywhere. ORDER BY is the last clause.
DISTINCT β remove duplicate rows
DISTINCT applies to the whole row the SELECT produces β not just the first column. This is the part beginners regularly get wrong.
DISTINCT applies to the projected row, not to a single column-- 4 rows β unique orders that have at least one line item
SELECT DISTINCT order_id FROM order_items;
-- 6 rows β every (order, product) combination is unique already, so DISTINCT does nothing useful
SELECT DISTINCT order_id, product_id FROM order_items;
-- β A common confusion: this counts unique categories, NOT unique products in each category
SELECT DISTINCT category FROM products; -- 2 rows: 'books', 'merch'
-- β This counts unique (category, name) pairs β basically every product, because names are already unique
SELECT DISTINCT category, name FROM products; -- one row per product DISTINCT to "fix" a join that returned duplicate rows. It hides the symptom but not the cause β the join is doing extra work for every duplicate, then you throw the duplicates away. The proper fix is usually GROUP BY (Chapter 04) or a properly-scoped join condition (Chapter 05).
Putting the full clause order to use
A query that pulls the three cheapest non-cancelled orders, with computed totals and a friendly column name, exercises everything in this chapter and Chapter 02:
SELECT id AS order_id,
status,
placed_at,
(SELECT SUM(quantity * unit_price)
FROM order_items oi
WHERE oi.order_id = orders.id) AS total_usd
FROM orders
WHERE status != 'cancelled' -- Β§2 β filter rows
ORDER BY placed_at DESC NULLS LAST, -- Β§3.1 β sort, with explicit NULL placement
id DESC -- tie-breaker
LIMIT 3; -- Β§3.2 β only the first three - No
ORDER BY= no order. Don't trust what you see. - Multi-column sort = primary key, then tie-breaker, then tie-breaker.
- NULL placement differs by dialect β spell out
NULLS FIRST/NULLS LASTwhen it matters. OFFSETis fine for shallow pages; switch to keyset pagination for deep ones.DISTINCTde-duplicates the whole row. Adding columns to the SELECT list makes rows less likely to collapse.- Aliases work in
ORDER BYbut not inWHERE/GROUP BYβ same reason as Chapter 01's execution-order diagram.
β Chapter 02 β Filtering with WHERE Β· π Continue to Chapter 04 β Aggregation β