If you are new to SQL, this query can look perfectly reasonable at first:
SELECT name
FROM users
WHERE phone = NULL;
You have a users table.
Some users did not enter their phone number.
The phone column looks empty.
So you try to find those rows with phone = NULL.
But the result comes back empty.
The table is not necessarily wrong.
The column name may be correct.
The query may even look logical.
The problem is that NULL is not a normal value.
To understand why this query returns nothing, we need to understand what NULL really means.
NULL is not an empty value
At first, NULL looks like an empty cell.
It feels similar to 0 or an empty string.
But in SQL, these are different things:
0 -> a number
'' -> an empty string
NULL -> an unknown or unrecorded state
0 is a value.
It means the number zero.
'' is also a value.
It means a string with no characters.
But NULL is different.
NULL does not mean “empty” in the same way.
It means the value is unknown, missing, or not recorded.
That difference looks small, but it affects comparisons, calculations, and aggregate functions.
Why = NULL does not work
Look again at the original query:
SELECT name
FROM users
WHERE phone = NULL;
This condition asks:
Is phone equal to NULL?
But NULL means unknown.
SQL cannot say that one unknown value is equal to another unknown value.
Even this expression is not true:
NULL = NULL
The result is not true.
It is unknown.
And WHERE only keeps rows where the condition is true.
So when you write:
WHERE phone = NULL
SQL does not find the rows where phone is NULL.
The comparison itself does not become true.
That is why the correct query is:
SELECT name
FROM users
WHERE phone IS NULL;
And if you want rows where phone has a value:
SELECT name
FROM users
WHERE phone IS NOT NULL;
IS NULL is not just another style of writing = NULL.
It asks a different question:
Is this value in an unknown or unrecorded state?
NULL spreads through calculations
NULL also affects calculations.
If an unknown value is part of a calculation, the result is usually unknown too.
SELECT 100 + NULL; -- NULL
SELECT NULL * 5; -- NULL
This makes sense if you think about what NULL means.
If you do not know one part of the calculation, you cannot know the final result.
For example:
SELECT item_id,
price * quantity AS total
FROM items;
Imagine this table:
| item_id | price | quantity | total |
|---|---|---|---|
| 1 | 100 | 2 | 200 |
| 2 | NULL | 3 | NULL |
| 3 | 200 | NULL | NULL |
| 4 | NULL | NULL | NULL |
If price is unknown, the total is unknown.
If quantity is unknown, the total is also unknown.
The calculation is not broken.
SQL is preserving the fact that something is unknown.
Sometimes you may want to replace NULL with another value.
For that, SQL provides COALESCE.
SELECT item_id,
COALESCE(price, 0) * COALESCE(quantity, 0) AS total
FROM items;
COALESCE(price, 0) means:
Use price if it is not NULL.
If price is NULL, use 0 instead.
But this should be done carefully.
Replacing NULL with 0 is not just a technical fix.
It changes the meaning of the data.
There is a difference between:
The value is zero.
and:
We do not know the value.
SQL keeps that difference visible.
Aggregate functions treat NULL differently
There is another place where NULL often surprises beginners.
Aggregate functions usually skip NULL values.
For example, imagine a score column with these values:
80, NULL, 90, NULL, 70
Now look at this query:
SELECT SUM(score),
AVG(score),
MAX(score),
MIN(score)
FROM scores;
SUM(score) returns:
240
It does not treat NULL as 0.
It skips the unknown values.
AVG(score) returns:
80
Because it calculates:
(80 + 90 + 70) / 3
It does not divide by all five rows.
That is important.
NULL is not zero.
It is ignored by many aggregate functions.
COUNT(*) and COUNT(column) are different
The most common mistake is with COUNT.
Look at this query:
SELECT COUNT(*),
COUNT(score)
FROM scores;
These two expressions do not ask the same question.
COUNT(*) asks:
How many rows are there?
COUNT(score) asks:
How many rows have a known value in the score column?
So if the table has five rows, but only three rows have a score:
COUNT(*) -> 5
COUNT(score) -> 3
Both answers are correct.
They are just answering different questions.
This matters in real data.
The number of users is not always the same as the number of users with phone numbers.
The number of orders is not always the same as the number of orders with payment dates.
The number of rows is not always the number of known values.
NULL is a record of uncertainty
NULL is not just an empty space.
It is a mark left by uncertainty.
A user did not enter a phone number.
A measurement was not recorded.
A delivery address was not decided yet.
A value was not known when the row was stored.
That is why this does not work:
WHERE phone = NULL
That is why this becomes NULL:
NULL + 1
That is why these two counts are different:
COUNT(*)
COUNT(column)
They all come from the same idea:
Unknown values remain unknown until we decide how to handle them.
Sometimes we replace them.
Sometimes we exclude them.
Sometimes we leave them as they are.
The database is honest about what it does not know.
How we read that uncertainty is up to us.
Originally published at Dechive:
https://dechive.dev/en/archive/what-null-leaves-behind
Top comments (0)