Back to Notes

SQL UNION & Set Operations

SQL UNION & Set Operations

Combine results from multiple SELECT statements. All SELECTs must have the same number of columns with compatible data types.


UNION — Remove Duplicates

-- All cities from customers AND suppliers (deduped)
SELECT city FROM customers
UNION
SELECT city FROM suppliers;

Performance: UNION does a sort/hash to remove duplicates — expensive on large sets.


UNION ALL — Keep Duplicates (Faster)

-- All transactions from two tables, including duplicates
SELECT id, amount, 'payment' AS type FROM payments
UNION ALL
SELECT id, amount, 'refund'  AS type FROM refunds
ORDER BY id;

Always prefer UNION ALL unless deduplication is required. No sort step = faster.


Practical Use Cases

Combine Data from Multiple Tables (Same Schema)

-- Active users from two regions
SELECT user_id, name, 'US' AS region FROM users_us
UNION ALL
SELECT user_id, name, 'EU' AS region FROM users_eu;

Historical + Current Data

SELECT order_id, total, created_at FROM orders_archive WHERE YEAR(created_at) < 2025
UNION ALL
SELECT order_id, total, created_at FROM orders WHERE YEAR(created_at) >= 2025
ORDER BY created_at;

Combine Aggregated and Raw Data

-- Detail rows + summary row (manual ROLLUP alternative)
SELECT department, name, salary FROM employees
UNION ALL
SELECT 'TOTAL', NULL, SUM(salary) FROM employees;

INTERSECT — Rows in Both

-- Products ordered by BOTH user 1 and user 2
SELECT product_id FROM orders WHERE user_id = 1
INTERSECT
SELECT product_id FROM orders WHERE user_id = 2;

Equivalent with JOIN:

SELECT DISTINCT o1.product_id
FROM orders o1
JOIN orders o2 ON o1.product_id = o2.product_id
WHERE o1.user_id = 1 AND o2.user_id = 2;

EXCEPT (MINUS in Oracle) — Rows in First but Not Second

-- Products ordered by user 1 but NOT user 2
SELECT product_id FROM orders WHERE user_id = 1
EXCEPT
SELECT product_id FROM orders WHERE user_id = 2;

Equivalent anti-join:

SELECT DISTINCT product_id FROM orders WHERE user_id = 1
AND product_id NOT IN (SELECT product_id FROM orders WHERE user_id = 2);

Column Aliases — Use First SELECT's Names

SELECT name AS person_name, email FROM employees
UNION ALL
SELECT name, email FROM contractors;
-- Column headers come from FIRST SELECT

ORDER BY — Applies to Final Result

-- ORDER BY on combined result, not per-SELECT
SELECT name, 'employee'   AS type FROM employees
UNION ALL
SELECT name, 'contractor' AS type FROM contractors
ORDER BY name;  -- sorts combined result

-- ORDER BY column index
SELECT name, salary FROM employees
UNION ALL
SELECT name, hourly_rate * 2080 FROM contractors
ORDER BY 2 DESC;  -- order by 2nd column

UNION vs JOIN — Different Operations

UNIONJOIN
DirectionStacks rows verticallyCombines columns horizontally
RequirementSame column count/typesMatching keys
UseCombine same-shape datasetsCombine related datasets

Gotchas

-- NULL: UNION treats NULL = NULL for deduplication
SELECT NULL UNION SELECT NULL;  -- returns 1 row (not 2)

-- Data types must be compatible, not identical
SELECT 1 UNION SELECT 1.5;  -- works, promotes to DECIMAL

-- Column count mismatch = error
SELECT a, b FROM t1 UNION SELECT a FROM t2;  -- ❌ error
-- Fix: pad with NULL
SELECT a, b FROM t1 UNION SELECT a, NULL FROM t2;  -- ✅

Related

  • [[JOINs]] — horizontal combination instead of vertical stacking
  • [[Basics]] — SELECT, WHERE