What is a Subquery?
A subquery (also known as an inner query or nested query) is a SQL query that is embedded within the WHERE
or HAVING
clause of another SQL query (called the outer query or main query).
Its primary purpose is to return data that will be used by the main query as a condition to further restrict the data to be retrieved. Think of it as asking a question to answer a bigger question.
Key Characteristics:
- A subquery is always enclosed in parentheses
()
. - It can be used with operators like
=
,<
,>
,IN
,NOT IN
,EXISTS
,ANY
,ALL
, etc. - It can be placed in the
SELECT
,FROM
,WHERE
, orHAVING
clause, but is most common in theWHERE
clause. - Subqueries can be nested within other subqueries (though this can get complex and impact performance).
Example Scenario: The Database
Let's imagine a simple database with two tables to illustrate our examples:
1. employees
table:
employee_id | name | department_id | salary |
---|---|---|---|
101 | Alice | 1 | 75000 |
102 | Bob | 2 | 50000 |
103 | Charlie | 1 | 60000 |
104 | David | 3 | 90000 |
2. departments
table:
department_id | department_name |
---|---|
1 | HR |
2 | Marketing |
3 | IT |
Types of Subqueries with Examples
1. Subquery with a Scalar Result (Single Value)
This is the simplest type. The inner query returns exactly one row and one column. It's often used with comparison operators like =
, >
, <
.
Question: "Find all employees who earn more than the average salary."
SELECT name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
How it works:
- The subquery runs first:
(SELECT AVG(salary) FROM employees)
calculates the average salary from the entire table (e.g.,(75000+50000+60000+90000)/4 = 68750
). - The result is passed to the main query: The main query now effectively becomes:
SELECT name, salary FROM employees WHERE salary > 68750
. - The main query runs: It returns Alice (75000) and David (90000).
2. Subquery with a List of Values (Multiple Rows)
The inner query returns a single column but multiple rows. You must use operators like IN
or NOT IN
to handle the list of results.
Question: "Find the names of all employees who work in either 'HR' or 'IT'."
SELECT name
FROM employees
WHERE department_id IN (
SELECT department_id
FROM departments
WHERE department_name IN ('HR', 'IT')
);
How it works:
- The subquery runs first:
(SELECT department_id FROM departments WHERE department_name IN ('HR', 'IT'))
returns a list of IDs:[1, 3]
. - The result is passed to the main query: The main query becomes:
SELECT name FROM employees WHERE department_id IN (1, 3)
. - The main query runs: It returns Alice (HR/1), Charlie (HR/1), and David (IT/3).
3. Correlated Subquery
This is a more advanced type where the inner query depends on the outer query for its values. The inner query is executed once for each row processed by the outer query.
Question: "Find all employees whose salary is greater than the average salary of their own department."
SELECT e1.name, e1.salary, e1.department_id
FROM employees e1
WHERE salary > (
SELECT AVG(salary)
FROM employees e2
WHERE e2.department_id = e1.department_id -- Correlation
);
How it works:
For each employee (e1
) in the outer query:
- The subquery takes the current employee's
department_id
(e.g.,1
for Alice). - It calculates the average salary only for that specific department (e.g., the average for HR is
(75000 + 60000)/2 = 67500
). - The outer query then checks if Alice's salary (75000) is greater than her department's average (67500). If yes, the row is included.
- This process repeats for Bob, Charlie, and David.
4. Subquery in the SELECT
Clause
You can also use a subquery to calculate a value for each row in the output.
Question: "Show each employee's name, salary, and the average salary of their department."
SELECT
name,
salary,
department_id,
(SELECT AVG(salary) FROM employees e2 WHERE e2.department_id = e1.department_id) AS dept_avg_salary
FROM employees e1;
Result:
name | salary | department_id | dept_avg_salary |
---|---|---|---|
Alice | 75000 | 1 | 67500 |
Bob | 50000 | 2 | 50000 |
Charlie | 60000 | 1 | 67500 |
David | 90000 | 3 | 90000 |
Subquery vs. JOIN
Many problems solved with subqueries can also be solved with JOIN
clauses. For example, the "HR or IT" question could be written with a JOIN
:
SELECT e.name
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE d.department_name IN ('HR', 'IT');
Which one to use?
- Readability: Often a matter of preference. Subqueries can be more intuitive for some problems (e.g., "find employees above average").
- Performance: This depends heavily on the database system and the specific query. Modern database optimizers are very good at converting subqueries to joins and vice versa. It's always best to test both versions if performance is critical.
- Correlated Subqueries: Can be less performant than
JOIN
s because they execute repeatedly for each row. In such cases, aJOIN
might be a more efficient choice.
At Online Learner, we're on a mission to ignite a passion for learning and empower individuals to reach their full potential. Founded by a team of dedicated educators and industry experts, our platform is designed to provide accessible and engaging educational resources for learners of all ages and backgrounds.
Terms Disclaimer About Us Contact Us
Copyright 2023-2025 © All rights reserved.