Python Generators and yield: Lazy Sequences That Scale


๐ŸŽ Free: AI Publishing Checklist โ€” 7 steps in Python ยท Full pipeline: germy5.gumroad.com/l/xhxkzz (pay what you want, min $9.99)


The Problem: Loading Everything Into Memory

Imagine you need to process a 2 GB log file. The instinctive approach:

# Don't do this with large files
def read_all_lines(path):
    with open(path) as f:
        return f.readlines()   # entire file loaded into RAM

for line in read_all_lines("huge.log"):
    process(line)

That works until your file is bigger than available RAM: then it crashes, or slows to a crawl swapping to disk. The same problem shows up with API responses, database result sets, and LLM output streams: you're building the whole thing before touching a single item.

Generators solve this by producing values one at a time, on demand.


What Generators Are

A generator is a function that uses yield instead of return. When called, it doesn't run any code; it returns a generator object. The code only runs when you ask for the next value.

# Regular function โ€” runs immediately, returns a list
def make_squares_list(n):
    result = []
    for x in range(n):
        result.append(x ** 2)
    return result   # all n values computed right now

# Generator function โ€” runs lazily, one value at a time
def make_squares_gen(n):
    for x in range(n):
        yield x ** 2   # pauses here, resumes on next()

Calling the generator function doesn't run a single line of the body:

gen = make_squares_gen(5)
print(gen)  # <generator object make_squares_gen at 0x...>

Values are pulled out with next() or by iterating with a for loop:

gen = make_squares_gen(5)

print(next(gen))  # 0
print(next(gen))  # 1
print(next(gen))  # 4

# Or just consume it entirely
for value in make_squares_gen(5):
    print(value)
# 0
# 1
# 4
# 9
# 16

Each call to next() runs the function body until the next yield, suspends, and hands the value back. When the function returns (or falls off the end), the generator raises StopIteration, which for loops handle automatically.
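Once a generator is exhausted, calling next() again raises StopIteration. You can also pass a default as the second argument to next() to get a fallback value instead of the exception:

gen = make_squares_gen(2)

print(next(gen))          # 0
print(next(gen))          # 1
# next(gen)               # would raise StopIteration here
print(next(gen, "done"))  # "done" -- the default suppresses the exception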


Generator Expressions

Just as list comprehensions have dict and set counterparts, there is a lazy counterpart too: a generator expression, written with parentheses instead of square brackets:

Before (list: eager, all in memory):

squares = [x ** 2 for x in range(1_000_000)]
# builds list of 1M integers right now
total = sum(squares)

After (generator: lazy, one item at a time):

squares = (x ** 2 for x in range(1_000_000))
# no values computed yet
total = sum(squares)  # computes each square as sum() pulls them

The result is identical, but the generator version never holds more than one value in memory at a time. For sum(), max(), min(), any(), all(), or any loop that consumes the sequence once, prefer a generator expression.

# Count lines matching a pattern without loading the file
count = sum(1 for line in open("huge.log") if "ERROR" in line)
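A quick way to see the size difference is sys.getsizeof, which reports the size of the container object itself (exact numbers vary by Python version and platform):

import sys

eager = [x ** 2 for x in range(1_000_000)]   # one million results, all stored
lazy = (x ** 2 for x in range(1_000_000))    # just a small generator object

print(sys.getsizeof(eager))  # a few megabytes for the list structure alone
print(sys.getsizeof(lazy))   # a couple hundred bytes, regardless of range size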

Real Use Case: Reading Large Files

The canonical generator pattern for files:

Before:

def get_error_lines(path):
    with open(path) as f:
        lines = f.readlines()        # entire file in RAM
    return [l for l in lines if "ERROR" in l]

for line in get_error_lines("server.log"):
    print(line.strip())

After:

def get_error_lines(path):
    with open(path) as f:
        for line in f:               # iterating a file yields one line at a time, lazily
            if "ERROR" in line:
                yield line.strip()   # one line at a time

for line in get_error_lines("server.log"):
    print(line)

Memory usage goes from O(file size) to O(one line). A 2 GB log file that crashed your script now processes in constant memory.
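If you want to verify that on your own machine, the standard library's tracemalloc module can report peak memory while each version runs. A rough sketch; peak_mb is a hypothetical helper and the numbers depend on your log file:

import tracemalloc

def peak_mb(make_iterable):
    """Consume whatever make_iterable() returns and report peak traced memory in MB."""
    tracemalloc.start()
    for _ in make_iterable():
        pass
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak / 1_000_000

print(peak_mb(lambda: get_error_lines("server.log")))
# The list-building version peaks near the file size; the generator version stays tiny.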


Chaining Generators: The Pipeline Pattern

Generators compose naturally. You can chain them so each stage pulls from the previous one, forming a data pipeline where nothing is fully materialized:

def read_lines(path):
    """Yield lines from a file one at a time."""
    with open(path) as f:
        yield from f                        # yield from delegates to another iterable

def filter_errors(lines):
    """Pass through only ERROR lines."""
    for line in lines:
        if "ERROR" in line:
            yield line.strip()

def parse_timestamp(lines):
    """Extract (timestamp, message) from each line."""
    for line in lines:
        parts = line.split(" ", 2)
        if len(parts) >= 3:
            yield parts[0], parts[2]       # (timestamp, message)

# Wire them together; nothing runs until we iterate
path = "server.log"
pipeline = parse_timestamp(filter_errors(read_lines(path)))

for timestamp, message in pipeline:
    print(f"{timestamp}: {message}")

Each stage is independent and testable. The whole pipeline runs in constant memory regardless of file size. This is the same pattern that powers Unix pipes, and it's the model the standard library's itertools module is built around.
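Because every stage is just an iterable, standard-library tools slot straight into the chain. For example, itertools.islice can cap how many records you pull without changing any stage (a small sketch reusing the functions above):

from itertools import islice

first_ten = islice(parse_timestamp(filter_errors(read_lines(path))), 10)
for timestamp, message in first_ten:
    print(f"{timestamp}: {message}")
# Only the work needed for those ten records is ever done.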


send(): Two-Way Communication

Generators can also receive values back through send(). This turns a generator into a coroutine, which is useful for stateful streaming pipelines.

def running_average():
    """Receive values via send(), yield the running average."""
    total = 0
    count = 0
    value = yield          # first next() primes the generator
    while True:
        total += value
        count += 1
        value = yield total / count

avg = running_average()
next(avg)                  # prime it (advance to first yield)

print(avg.send(10))        # 10.0
print(avg.send(20))        # 15.0
print(avg.send(30))        # 20.0

send() is a niche feature; most code never needs it. But generator-based coroutines built on it were the foundation of Python's asyncio before async/await existed, and it's useful when you need to pass control information back into a running generator (e.g., signaling it to flush or reset).
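As a sketch of that reset idea, here is a variant of running_average that treats None as a reset signal (the sentinel choice is just an illustration):

def resettable_average():
    """Running average that starts over when sent None."""
    total = 0.0
    count = 0
    value = yield            # first next() primes the generator
    while True:
        if value is None:    # control signal: reset the state
            total, count = 0.0, 0
            value = yield    # wait for the next real value
            continue
        total += value
        count += 1
        value = yield total / count

avg = resettable_average()
next(avg)                    # prime it

print(avg.send(10))          # 10.0
print(avg.send(20))          # 15.0
avg.send(None)               # reset; this send returns None
print(avg.send(100))         # 100.0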


When to Use Generators vs Lists

Situation                           | Use
------------------------------------|----------------------
Iterate once, large data            | Generator
Need random access by index         | List
Pass to sum(), max(), any()         | Generator expression
Need len()                          | List
Streaming API / file processing     | Generator
Build result for multiple callers   | List
Pipeline of transformations         | Chained generators
Small data, used multiple times     | List

The rule of thumb: if you only need to walk through the values once, in order, a generator will use less memory and start producing results sooner (there is no upfront build cost).
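One caveat behind the "build result for multiple callers" row: a generator is single-use. Once consumed, iterating it again silently produces nothing, which is a common source of bugs:

squares = (x ** 2 for x in range(5))

print(sum(squares))   # 30
print(sum(squares))   # 0  -- already exhausted
print(list(squares))  # []

# If several consumers need the values, materialize them once:
squares = [x ** 2 for x in range(5)]
print(sum(squares), max(squares))  # 30 16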


Real Pipeline Example: Streaming LLM Output

In the ebook automation pipeline, chapters are generated by calling an LLM API that returns output as a stream. Generators are the natural fit:

Before (buffer everything, then write):

def generate_chapter(outline: dict) -> str:
    chunks = []
    for chunk in llm_stream(outline["prompt"]):
        chunks.append(chunk)          # buffer entire response
    return "".join(chunks)            # then return it all

content = generate_chapter(outline)
with open("ch01.md", "w") as f:
    f.write(content)                  # write all at once

After (stream directly to disk):

def stream_chapter(outline: dict):
    """Yield text chunks as the LLM produces them."""
    for chunk in llm_stream(outline["prompt"]):
        yield chunk

def write_chapter(path: str, chunks):
    """Write chunks to file as they arrive."""
    with open(path, "w", encoding="utf-8") as f:
        for chunk in chunks:
            f.write(chunk)
            f.flush()                 # visible on disk immediately

write_chapter("ch01.md", stream_chapter(outline))

The second version shows progress in real time, uses constant memory regardless of chapter length, and can be interrupted cleanly at any yield point. Add a word-count filter stage between stream_chapter and write_chapter and you never have to change either function.
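A sketch of what such a stage might look like; cap_words and max_words are illustrative names, not part of the pipeline above, and the cut-off rule is deliberately crude:

def cap_words(chunks, max_words=2_000):
    """Pass chunks through until a rough word budget is reached."""
    words_seen = 0
    for chunk in chunks:
        words_seen += len(chunk.split())
        yield chunk
        if words_seen >= max_words:
            break            # stop pulling from the upstream generator

# Slots between the existing stages without changing either of them
write_chapter("ch01.md", cap_words(stream_chapter(outline), max_words=2_000))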


Quick Recap

# Generator function
def count_up(n):
    for i in range(n):
        yield i

# Generator expression
gen = (i * 2 for i in range(10))

# yield from (delegate to another iterable)
def chain(*iterables):
    for it in iterables:
        yield from it

# Consuming
for val in count_up(5):
    print(val)

total = sum(i ** 2 for i in range(1000))  # no list built

Generators are one of the most underused features in everyday Python. Once you train yourself to reach for yield whenever you're building a sequence to iterate over once, you'll write cleaner, more memory-efficient code by default.


The full pipeline uses generators for streaming chapter output (less memory, faster processing): germy5.gumroad.com/l/xhxkzz (pay what you want, min $9.99).

If this saved you time, the ❤️ button helps other developers find it.

