DEV Community

Sreekar Reddy
Sreekar Reddy

Posted on β€’ Originally published at sreekarreddy.com

🌸 Bloom Filters Explained Like You're 5

Probably yes or definitely no

Day 139 of 149

πŸ‘‰ Full deep-dive with code examples


The Nightclub Bouncer Analogy

Imagine a nightclub bouncer with a list:

  • "Is this person on the banned list?"
  • Instead of checking the whole list, they have a quick mental system
  • Sometimes they might say "maybe on the list" when they're not
  • In a standard Bloom filter (with consistent hashing and only additions), they won't say "definitely not" when someone actually is in the set

Bloom Filters work like this bouncer!

Fast to check, might have false positives, and (in the standard version) no false negatives.

More precisely: in standard use (only adding items, consistent hashing, no corruption), Bloom filters are designed to avoid false negatives.


The Problem They Solve

Checking if something is in a huge set can be expensive (especially if the data lives on disk or across the network):

  • "Has this user already seen this article?"
  • "Is this email address in our database?"
  • "Is this URL malicious?"

Bloom filters help you quickly rule out "definitely not present" so you can skip unnecessary expensive lookups.


How Bloom Filters Work

Simple idea:

  1. Create a bit array (lots of zeros and ones)
  2. When adding item: run it through hash functions, set those bits to 1
  3. When checking item: run through same hashes, are all bits 1?
    • All ones? β†’ Probably in the set
    • Any zeros? β†’ Definitely NOT in the set

The Trade-Off

What you give up:

  • Can't be completely certain something IS in the set
  • Standard bloom filters don't support reliable deletion without extra bookkeeping

What you get:

  • Incredibly fast lookups
  • Very memory efficient
  • Certain when something is NOT in the set (as long as you only add items)

Note: Bloom filters are typically used as a fast pre-check. If it says "maybe", you still verify with the real data.


Real Uses

  • Web browsers β†’ Is this URL in the malware list?
  • Databases β†’ Is this key probably in this file?
  • Spell checkers β†’ Is this probably a real word?
  • Social media β†’ Has user already seen this post?

The Key Insight

Bloom filters are about speed vs certainty:

  • Traditional: Slow but exact
  • Bloom filter: Super fast, false positives possible, no false negatives

Perfect when "probably yes" is good enough!


In One Sentence

Bloom Filters quickly tell you if something is definitely not in a set, or maybe (but not certainly) is.


πŸ”— Enjoying these? Follow for daily ELI5 explanations!

Making complex tech concepts simple, one day at a time.

Top comments (0)