DEV Community

Jangwook Kim

Posted on • Originally published at effloow.com

Cloudflare Project Think: Durable Agent Runtime Guide

Most AI agents on serverless platforms share the same fatal flaw: they can't survive a restart. If the underlying worker crashes or cold-starts mid-task, the agent's progress disappears. The typical workaround is to keep tasks short and stateless — which means you cannot run a 10-minute research loop, a multi-file refactor, or an autonomous investigation that makes 50 external calls.

Cloudflare's Project Think, announced during Agents Week 2026 (April 2026), is a direct answer to that constraint. It ships a set of primitives — fiber checkpointing, sub-agents, a persistent Session API, and a 5-tier execution ladder — all wired into an opinionated base class (@cloudflare/think) that runs on Durable Objects.

Effloow Lab inspected the SDK packages, confirmed installability, and traced the API surface from official docs and the open-source cloudflare/agents repository. The following is a source-based guide to how Project Think works and when to use it. See data/lab-runs/cloudflare-project-think-durable-agent-runtime-2026.md for the full evidence note.


Why Serverless Agents Break — and Why Project Think Fixes It

A standard Cloudflare Worker is a request handler: it starts, does work, returns a response, and dies. Cloudflare Workflows added durable multi-step execution, but the state machine is managed outside your code and requires a separate infrastructure primitive.

Project Think takes a different approach. Each agent runs inside a Durable Object — a stateful micro-server with its own SQLite database, WebSocket connections, and scheduling. That alone gives agents persistence. But Project Think goes further by introducing fibers: durable invocations that can checkpoint their own instruction pointer directly into the co-located SQLite database.

The practical result: an agent can run a 30-step task, checkpoint after each step, survive a server restart, and resume exactly where it left off — without any external workflow orchestrator.

This is the critical architectural distinction from Cloudflare Dynamic Workers (covered in an earlier Effloow article on Dynamic Workers), which handle sandboxed code execution but are stateless by design. Project Think layers durable execution on top of the full Cloudflare platform stack.


The Five Primitives of Project Think

1. Fibers — Checkpointed Execution

The fiber is the foundational primitive. Unlike a regular async function, a fiber can call ctx.stash() to serialize the current state of its local variables into SQLite. If the Durable Object restarts, runFiber rehydrates from the last stash point.

import { Think, runFiber } from "@cloudflare/think";

// searchWeb, summarize, and synthesize are placeholder helpers for this example.
export class ResearchAgent extends Think<Env, unknown> {
  async onTask(query: string) {
    return runFiber(this.ctx, async (ctx) => {
      const sources = await searchWeb(query);
      await ctx.stash({ sources });            // checkpoint 1

      const summaries = await summarize(sources);
      await ctx.stash({ sources, summaries }); // checkpoint 2

      return synthesize(summaries);
    });
  }
}

Each ctx.stash() call writes to the Durable Object's SQLite database. On resume, the fiber fast-forwards to the last stash point. For long-horizon tasks — multi-file code reviews, iterative search loops, automated report generation — this removes the "start over" failure mode entirely.

Fibers also include automatic keepalive for long-running operations and handle non-deterministic workloads that would time out in a standard Worker context.

2. Sub-Agents (Facets) — Isolated Child Agents with Typed RPC

Project Think supports spawning child agents as facets — child Durable Objects colocated with the parent on the same machine. Each facet has:

  • Its own isolated SQLite database (no shared state)
  • Its own execution context and fiber support
  • A typed RPC stub returned to the parent for method calls

// Parent agent spawning a specialist sub-agent
const extractor = await this.spawnFacet("data-extractor", DataExtractorAgent);
const structured = await extractor.parseDocument(rawText);

const validator = await this.spawnFacet("validator", ValidationAgent);
const result = await validator.check(structured);

This pattern is more predictable than passing messages through a shared queue. Because the facet RPC is typed, TypeScript catches mismatches at compile time. And because facets are colocated, the latency for inter-agent calls is dramatically lower than network-based agent-to-agent communication.

Facets are useful when you need to decompose a task into specialist roles — a researcher, a writer, a fact-checker — without those roles sharing any mutable state.
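Since the spawnFacet surface is still lightly documented, here is a self-contained TypeScript sketch of the pattern it enables (the class and method names are ours, not the SDK's): two specialist "facets" each own a private store standing in for their per-facet SQLite database, and the parent composes them only through typed method calls.

```typescript
// Illustrative sketch, not the real SDK API: each facet's state is private,
// so specialists cannot see each other's data. A Map stands in for the
// facet-local SQLite database Project Think would provide.
class DataExtractorFacet {
  private store = new Map<string, unknown>(); // stand-in for the facet's own SQLite

  parseDocument(rawText: string): string[] {
    const fields = rawText.split(";").map((f) => f.trim());
    this.store.set("lastParse", fields);
    return fields;
  }
}

class ValidationFacet {
  private store = new Map<string, unknown>();

  check(fields: string[]): { ok: boolean; errors: string[] } {
    const errors = fields.filter((f) => f.length === 0).map(() => "empty field");
    this.store.set("lastCheck", errors);
    return { ok: errors.length === 0, errors };
  }
}

// The parent chains specialists; TypeScript checks every call at compile time.
const extractor = new DataExtractorFacet();
const validator = new ValidationFacet();
const result = validator.check(extractor.parseDocument("name: Ada; role: eng"));
```

Because each call site is typed, renaming or re-signaturing a facet method is a compile error in the parent rather than a runtime surprise.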

3. The Session API — Relational Conversation Trees

Standard chat agent implementations append messages to a flat array. That works for simple Q&A but breaks when you need to explore alternatives without polluting the main reasoning path.

Project Think's Session API stores messages as a relational tree, with each message carrying a parent_id. This enables three capabilities that flat-list approaches cannot support:

Forking: The agent can branch off a conversation node to explore an alternative without modifying the main path. If the alternative fails, the original path is untouched.

Non-destructive compaction: Rather than truncating context when the window fills, the Session API creates a compaction overlay — a summary that sits beside the original messages without replacing them. The full history is still queryable.

Full-text search: FTS5 indexing over all stored messages lets the agent retrieve relevant earlier context without re-reading the entire history into the LLM context window.

import { Think } from "@cloudflare/think";

export class LongHorizonAgent extends Think<Env, unknown> {
  configureSession() {
    return {
      systemPrompt: "You are a thorough technical researcher.",
      contextBlocks: [
        { type: "text", content: this.env.DOMAIN_KNOWLEDGE }
      ]
    };
  }
}

All session storage runs on the Durable Object's local SQLite — no external vector database required for the conversation layer.
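The parent_id tree is easy to picture with a minimal, self-contained model (Message, append, and pathTo are our illustrative names, not the Session API's): forking is just attaching a new child to an earlier node, and any reasoning path is recovered by walking parent links back to the root.

```typescript
// Toy model of a relational conversation tree. The real Session API stores
// this in SQLite with FTS5 indexing; an in-memory array stands in here.
interface Message { id: number; parentId: number | null; text: string }

const messages: Message[] = [];
let nextId = 1;

function append(parentId: number | null, text: string): number {
  const id = nextId++;
  messages.push({ id, parentId, text });
  return id;
}

// Walk parent links from a leaf back to the root to recover one path.
function pathTo(id: number): string[] {
  const byId = new Map(messages.map((m) => [m.id, m]));
  const path: string[] = [];
  for (let cur = byId.get(id); cur; cur = cur.parentId ? byId.get(cur.parentId) : undefined) {
    path.unshift(cur.text);
  }
  return path;
}

const root = append(null, "user: compare approach A and B");
const a = append(root, "agent: exploring approach A");
// Fork: branch off the root to try B without touching the A branch.
const b = append(root, "agent: exploring approach B");
```

Note that abandoning branch B requires no mutation at all: the A path never contained B's messages in the first place.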

4. The Execution Ladder — Graduated Code Trust

One of Project Think's most distinctive ideas is the execution ladder: a tiered system of code execution environments that agents escalate through based on the trust level required by a task.

  • Tier 0 — Workspace (@cloudflare/shell): durable filesystem backed by SQLite + R2. Trust level: fully trusted.
  • Tier 1 — Dynamic Worker (@cloudflare/codemode): sandboxed V8 isolate, no network. Trust level: LLM-generated code.
  • Tier 2 — npm (@cloudflare/worker-bundler): fetches npm packages, bundles with esbuild, loads into a Dynamic Worker. Trust level: third-party packages.
  • Tier 3 — Browser (Cloudflare Browser Run): navigate, click, extract. Trust level: web content.
  • Tier 4 — Sandbox (cloudflare/sandbox-sdk): full Linux environment with git, cargo, and npm test. Trust level: untrusted workloads.

Agents do not jump directly to Tier 4 for every task. A simple data transformation can run in Tier 1 (a sandboxed V8 isolate that starts in milliseconds). A task requiring npm packages escalates to Tier 2. A task that needs to test a full Rust codebase goes to Tier 4.

The ladder enforces the principle of least privilege: agents operate at the lowest tier that can handle the task, escalating only when needed. This keeps the security surface small and execution fast for common cases.
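The escalation rule can be sketched as a pure function. TaskNeeds and selectTier are illustrative names, not SDK API; the point is that a task starts at the lowest tier that satisfies its requirements rather than defaulting to the full sandbox.

```typescript
// Least-privilege tier selection, as described above (illustrative only).
interface TaskNeeds {
  npmPackages?: boolean; // requires third-party packages (Tier 2)
  browser?: boolean;     // requires navigating real web pages (Tier 3)
  linux?: boolean;       // requires git, compilers, or test runners (Tier 4)
}

function selectTier(needs: TaskNeeds): number {
  if (needs.linux) return 4;       // full Linux sandbox
  if (needs.browser) return 3;     // browser automation
  if (needs.npmPackages) return 2; // bundled npm packages in a Dynamic Worker
  return 1;                        // plain sandboxed V8 isolate, milliseconds to start
}
```

(Tier 0, the trusted workspace, is the agent's own filesystem rather than a place to run generated code, so it does not appear in the escalation path.)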

5. Self-Authored Extensions — Agents Writing Their Own Tools

The final primitive is the most experimental: agents can write their own tools at runtime. An agent inspects a task, decides it needs a capability it doesn't have, generates a tool implementation, and loads it into a Dynamic Worker for execution — all within the same session.

This is not the same as calling an external tool-use API. The agent generates actual TypeScript code, bundles it with @cloudflare/worker-bundler, and executes it in a Tier 1 or Tier 2 environment. The generated tool becomes part of the agent's toolkit for the duration of the session.

In practice, this is useful for tasks where the required transformation or extraction logic cannot be fully specified in advance — for example, parsing a novel API response format or implementing a domain-specific calculation that varies per client.
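As a toy illustration of that loop — generate source for a missing capability, load it into a separate scope, register it as a tool — the sketch below uses new Function as a crude stand-in for a Tier 1 Dynamic Worker. This is emphatically not a sandbox; the real flow would bundle the code with @cloudflare/worker-bundler and execute it in an isolated V8 environment.

```typescript
// Toy version of the self-authored-tool loop. new Function provides scope
// separation only, NOT security isolation — it stands in for the Dynamic
// Worker the real system would use.
type Tool = (input: unknown) => unknown;

const registry = new Map<string, Tool>();

function registerGeneratedTool(name: string, source: string): Tool {
  // source must evaluate to a function expression, e.g. "(x) => ..."
  const tool = new Function(`return (${source});`)() as Tool;
  registry.set(name, tool);
  return tool;
}

// e.g. the agent decided mid-session that it needs a CSV-row parser:
const parseRow = registerGeneratedTool(
  "parse-csv-row",
  `(line) => String(line).split(",").map((s) => s.trim())`
);
```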


The @cloudflare/think Base Class

All five primitives are exposed through the Think base class, which handles the full chat lifecycle: agentic loop, message persistence, streaming, tool execution, stream resumption, and extensions.

Installation:

npm install @cloudflare/think agents ai @cloudflare/shell zod workers-ai-provider

Minimal example:

import { Think } from "@cloudflare/think";
import { createWorkersAI } from "workers-ai-provider";

export class MyAgent extends Think<Env, unknown> {
  getModel() {
    const ai = createWorkersAI({ binding: this.env.AI });
    // Workers AI free tier includes @cf/meta/llama-3.3-70b-instruct
    return ai("@cf/meta/llama-3.3-70b-instruct");
  }

  configureSession() {
    return {
      systemPrompt: "You are a helpful assistant.",
    };
  }
}

The wrangler.toml binding wires the Durable Object:

[[durable_objects.bindings]]
name = "MY_AGENT"
class_name = "MyAgent"

[[migrations]]
tag = "v1"
new_sqlite_classes = ["MyAgent"]
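To make that binding reachable, a Worker entry point routes incoming requests to the Durable Object. The sketch below uses the standard Durable Objects namespace API (idFromName/get); the simplified local interfaces are stand-ins for the real runtime types so the routing logic can run on its own.

```typescript
// Minimal routing sketch for the MY_AGENT binding from wrangler.toml.
// AgentStub/AgentNamespace are simplified stand-ins for the Workers runtime types.
type AgentStub = { fetch(req: Request): Promise<Response> };
interface AgentNamespace {
  idFromName(name: string): string;
  get(id: string): AgentStub;
}
interface Env { MY_AGENT: AgentNamespace }

const worker = {
  // In a real Worker this object would be the module's default export.
  async fetch(request: Request, env: Env): Promise<Response> {
    // One durable agent per user: the same name always maps to the same
    // Durable Object instance, so its SQLite state persists across requests.
    const user = new URL(request.url).searchParams.get("user") ?? "anonymous";
    const id = env.MY_AGENT.idFromName(user);
    return env.MY_AGENT.get(id).fetch(request);
  },
};
```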

The cloudflare/agents GitHub repository contains 30+ self-contained example agents demonstrating fibers, facets, sessions, and execution ladder integration. The docs/think/index.md file in that repository is the most complete reference beyond the official documentation.


Project Think vs. Dynamic Workers vs. Cloudflare Workflows

Developers familiar with Cloudflare's existing primitives will have one question: how does this fit alongside Dynamic Workers and Workflows?

Dynamic Workers (covered in Effloow's Dynamic Workers guide) are stateless sandboxed V8 isolates for executing LLM-generated code. They correspond to Tier 1 of Project Think's execution ladder. They are not durable.

Cloudflare Workflows provide durable multi-step execution, but the state machine lives outside your Worker. Steps are defined declaratively, and Cloudflare's infrastructure manages replay. This is powerful for ETL pipelines and scheduled jobs, but the agent has no access to its own state between steps.

Project Think puts the state machine inside the agent itself via fibers and the co-located SQLite database. The agent is both the executor and the state store. This gives more flexibility for agentic patterns where the next step depends on reasoning about the previous step's output — not just a declared execution graph.

The right choice depends on your workload:

  • Stateless code execution only → Dynamic Workers
  • Declarative multi-step pipeline with retry guarantees → Cloudflare Workflows
  • Autonomous agents with reasoning-driven state transitions → Project Think

Common Mistakes When Building Durable Agents

Checkpoint too infrequently. If you only call ctx.stash() at the end of a multi-minute operation, a crash at minute 8 means re-running 8 minutes of work. Checkpoint after each meaningful unit — after a web request, after a parsing step, after a tool call returns.
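The difference is easy to see in a toy model of checkpointed execution: an in-memory Map stands in for the Durable Object's SQLite, and on a re-run ("after a restart") completed steps are skipped instead of re-executed. runWithCheckpoints is our illustrative name, not SDK code.

```typescript
// Toy model of fiber-style fast-forward: checkpoint after each unit of work,
// skip any unit whose result is already stashed.
type Step = { name: string; run: () => string };

function runWithCheckpoints(
  steps: Step[],
  stash: Map<string, string>, // survives "restarts", standing in for SQLite
): string[] {
  const results: string[] = [];
  for (const step of steps) {
    const cached = stash.get(step.name);
    if (cached !== undefined) {
      results.push(cached);    // fast-forward: step already checkpointed
      continue;
    }
    const out = step.run();
    stash.set(step.name, out); // checkpoint after each meaningful unit
    results.push(out);
  }
  return results;
}

let expensiveRuns = 0;
const steps: Step[] = [
  { name: "fetch", run: () => { expensiveRuns++; return "sources"; } },
  { name: "summarize", run: () => { expensiveRuns++; return "summaries"; } },
];

const stash = new Map<string, string>();
runWithCheckpoints(steps, stash); // first run executes both steps
runWithCheckpoints(steps, stash); // "after a restart": both steps skipped
```

Checkpointing once at the end would make the stash useless for any crash before the final step; per-unit stashes bound the re-work to a single step.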

Share state through the parent's SQLite instead of facet isolation. Facets exist precisely so specialist sub-agents do not see each other's state. Routing everything through the parent's database re-introduces the coupling you were trying to avoid.

Escalate to Tier 4 for every code execution task. Cloudflare Sandbox (Tier 4) has more overhead than Dynamic Workers (Tier 1). Use Tier 4 only when the task genuinely needs a Linux environment — git operations, compiled languages, or full test runners.

Ignore compaction until the context window overflows. Plan compaction as a regular scheduled step, not an emergency measure. The Session API's non-destructive overlay lets you compact early and often without losing history.

Treat @cloudflare/think as production-stable. As of May 2026, Project Think is in experimental preview. The package version is 0.0.1-experimental.x. The API surface is intended to be stable, but Cloudflare explicitly says it will continue to evolve. Treat it as early-adopter infrastructure.


Practical Application: When to Choose Project Think

Project Think is well-suited to agent workloads that:

  • Exceed the standard CPU time limits of a Cloudflare Worker
  • Require specialist sub-tasks that should not share state
  • Need to explore multiple reasoning paths without forking the entire agent
  • Generate and execute code as part of their reasoning loop
  • Must maintain conversation history across days or weeks for personalization

It is less well-suited to:

  • Simple request/response pipelines (standard Worker is simpler)
  • Batch jobs without agent reasoning (Cloudflare Workflows is more appropriate)
  • Workloads requiring GPUs or dedicated compute (no GPU support on Workers)

FAQ

Q: Does Project Think work with any LLM or only Workers AI?

Project Think's Think base class is model-agnostic — getModel() can return any model compatible with the Vercel AI SDK's provider interface. Workers AI (workers-ai-provider) is the zero-egress option for Cloudflare-hosted models, but you can wire in OpenAI, Anthropic, or any other provider via the AI SDK.

Q: What's the cost of fiber checkpointing?

Each ctx.stash() writes to the Durable Object's SQLite database — a local write, not a network call. The overhead is the same as any SQLite write on the same machine. Cloudflare does not charge extra for SQLite writes beyond the standard Durable Object storage pricing. For most agents, checkpointing 10–50 times per session adds negligible cost.

Q: Can sub-agents (facets) span multiple geographic regions?

Facets are colocated with the parent Durable Object on the same machine by design — this is what makes their typed RPC low-latency. They do not span regions. If you need geographically distributed agent coordination, that requires a different architecture (message queues or service bindings across Workers).

Q: Is Project Think production-ready in May 2026?

No. It is in experimental preview. Cloudflare describes the API surface as stable but explicitly notes it will evolve. For production workloads, monitor the cloudflare/agents GitHub repository and the Cloudflare changelog for GA announcements.

Q: How does the Session API relate to a vector database?

The Session API is not a semantic search layer — it is a relational message store with FTS5 full-text search. It handles conversation history, forking, and compaction well. For semantic retrieval over large external knowledge bases, you still need a vector database (Cloudflare Vectorize, Pinecone, etc.). They are complementary, not alternatives.


Key Takeaways

  • Project Think solves the fundamental durability problem in serverless AI agents: agents can now checkpoint progress and survive restarts without re-running from the beginning.
  • The five core primitives — fibers, sub-agents (facets), the Session API, the execution ladder, and self-authored extensions — address distinct failure modes in long-horizon agentic workloads.
  • @cloudflare/think is the opinionated base class that wires all primitives together; it is model-agnostic and works with any Vercel AI SDK provider.
  • The 5-tier execution ladder enforces least-privilege code execution, keeping fast tasks in lightweight V8 isolates and escalating to full Linux environments only when necessary.
  • As of May 2026, Project Think is in experimental preview. The API is intended to be stable but will continue to evolve — suitable for early adoption and evaluation, not yet for production-critical deployments.

Bottom Line

Project Think is the most complete answer Cloudflare has given to "how do I run an AI agent that lasts longer than a serverless function?" The fiber + facet + session combination solves real architectural problems, not theoretical ones. Get familiar with it now — when it reaches GA, it will become the default pattern for serious agent infrastructure on the Workers platform.
