Recursive SQL
From Null-pointer
Contents |
MySQL
Oracle
Setting up the example
CREATE TABLE emp(empid INTEGER NOT NULL PRIMARY KEY,
name VARCHAR(10),
salary DECIMAL(9, 2),
mgrid INTEGER);
INSERT INTO emp
VALUES ( 1, 'Jones', 30000, 10),
( 2, 'Hall', 35000, 10),
( 3, 'Kim', 40000, 10),
( 4, 'Lindsay', 38000, 10),
( 5, 'McKeough', 42000, 11),
( 6, 'Barnes', 41000, 11),
( 7, 'O Neil', 36000, 12),
( 8, 'Smith', 34000, 12),
( 9, 'Shoeman', 33000, 12),
(10, 'Monroe', 50000, 15),
(11, 'Zander', 52000, 16),
(12, 'Henry', 51000, 16),
(13, 'Aaron', 54000, 15),
(14, 'Scott', 53000, 16),
(15, 'Mills', 70000, 17),
(16, 'Goyal', 80000, 17),
(17, 'Urbassek', 95000, NULL);
A simple recursive query
SELECT name FROM emp START WITH name = 'Goyal' CONNECT BY PRIOR empid = mgrid
START WITH denotes the seed of the recursion while CONNECT BY describes the recursive step. That is how to get from step n to step (n + 1). Since it is important to distinguigh between the nth and the (n + 1)th step during name resolution, PRIOR is used to show that empid belongs to the nth step while mgrid belongs to step (n + 1)th. So with empid being 16 for step 1, mgrid must be 16 as well and hence step 2 produces 'Scott,' 'Henry,' and 'Zander.' Their empids will now serve as PRIOR to step 3, and so on and so forth.
The Oracle syntax is very concise. The SQL standard and DB2 syntax does not use any recursion-specific keywords and uses regular SQL to describe the exact same relationships. As you will see, it is more verbose, but equally straightforward:
WITH n(empid, name) AS
(SELECT empid, name
FROM emp
WHERE name = 'Goyal'
UNION ALL
SELECT nplus1.empid, nplus1.name
FROM emp as nplus1, n
WHERE n.empid = nplus1.mgrid)
SELECT name FROM n;
The WITH clause allows for the definition of a named query within a statement, which can be referred to at a later point in the same statement. Its technical term is a common table expression (CTE). CTEs are by now supported in most major SQL-based DBMS, including Oracle 9i, SQL Server 2005, and, of course, in DB2.
What makes this CTE special is that it is referred to within its very own defintion. This is what distinguishes a regular CTE from a recursive CTE. Here I named the CTE n to correlate to the recursive steps. A recursive CTE consists of two parts combined with a UNION ALL:
- The seed or step 1 of the recursion. This is what is described in Oracle using START WITH. In a recursive CTE, it is simply any query providing a set of rows. In this case we query the emp table and filter for 'Goyal.' We select the name of course and also the empid, because we need it for the recursive step.
- The recursive step going from n to (n + 1). Here we refer to step n (the CTE n) and join in step (n + 1) using the same predicate used in CONNECT BY. Instead of PRIOR, regular correlation names are used to distinguish n from (n + 1).

