Andrey Kolkov

gogpu/ui v0.1.21: Enterprise Render Pipeline — Layer Tree, Damage Tracking, 0% GPU Idle

Two months ago we released gogpu/ui v0.1.0 — 22 widgets, 3 design systems, ~150K lines of pure Go. Since then we shipped 21 patch releases, and the rendering pipeline is unrecognizable.

This post covers what changed and why it matters.

The Problem

v0.1.0 re-rendered the entire widget tree every frame. A 48×48 spinner in one corner caused the GPU to redraw 800×600 of static content. Hover over a button? Full tree walk. Open a dropdown? Full tree walk. This was fine for demos, not for production.

We studied how five frameworks solve this — Flutter, Chrome, Qt6, Android HWUI, Skia — and found the same architecture everywhere: Layer Tree + boundary isolation + damage tracking.

What We Built (v0.1.14 → v0.1.21)

Layer Tree Compositor

Every RepaintBoundary widget now owns a node in a persistent Layer Tree:

OffsetLayer (root)
├── PictureLayer (toolbar — clean, reuse texture)
├── PictureLayer (sidebar — clean, reuse texture)
├── ClipRectLayer (scrollview viewport)
│   └── PictureLayer (content — dirty, re-record)
└── PictureLayer (spinner — dirty, re-record 48×48)

Four layer types — OffsetLayer, PictureLayer, ClipRectLayer, OpacityLayer — compose the frame. Clean layers reuse their GPU texture from the previous frame. Only dirty layers re-render.

This is the same pattern Flutter calls flushPaint + compositeFrame. We validated it against all five reference frameworks before writing a line of code.
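
To make the clean/dirty split concrete, here is a minimal sketch of a retained layer tree. It is not the gogpu/ui implementation: the `Layer` struct, its fields, and the `Composite` function are hypothetical names chosen for illustration, and "re-recording" is reduced to a counter.

```go
package main

import "fmt"

// Layer is a hypothetical node in a retained layer tree.
// It is NOT the gogpu/ui type; names and fields are illustrative only.
type Layer struct {
	Name     string
	Dirty    bool // content changed since the last frame
	Children []*Layer
}

// Composite walks the tree once per frame. Dirty layers are re-recorded
// (here: just counted and marked clean); clean layers reuse their cached
// texture. It returns how many layers fell into each group.
func Composite(l *Layer) (recorded, reused int) {
	if l.Dirty {
		recorded++
		l.Dirty = false // the cached texture is valid again
	} else {
		reused++
	}
	for _, c := range l.Children {
		r, u := Composite(c)
		recorded += r
		reused += u
	}
	return recorded, reused
}

func main() {
	// Mirrors the tree drawn above: content and spinner are dirty.
	root := &Layer{Name: "root", Children: []*Layer{
		{Name: "toolbar"},
		{Name: "sidebar"},
		{Name: "viewport", Children: []*Layer{{Name: "content", Dirty: true}}},
		{Name: "spinner", Dirty: true},
	}}
	recorded, reused := Composite(root)
	fmt.Println(recorded, reused) // 2 4
}
```

Running `Composite` a second time without marking anything dirty reports zero re-recorded layers, which is the property the rest of the pipeline builds on.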

0% GPU When Idle

The frame loop checks a flat dirty set, which is an O(1) check rather than an O(n) tree walk:

if !w.HasDirtyBoundaries() && !w.NeedsRedraw() && !w.NeedsAnimationFrame() {
    return // nothing changed, skip frame entirely
}

When the UI is idle, the GPU does zero work. Measured: 0% GPU across all six examples (hello, signals, taskmanager, gallery, ide, modular-compositor).

The previous approach walked the entire widget tree every frame to check whether anything needed a redraw. For 200 boundaries, the new check is 45× faster.
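
The difference between the two checks can be sketched as follows. This is an illustration under assumptions, not the library's code: `BoundaryID`, `DirtySet`, and its methods are hypothetical names. The point is that invalidation writes into a flat set, so the idle check only has to look at the set's size.

```go
package main

import "fmt"

// BoundaryID identifies a repaint boundary. Hypothetical type.
type BoundaryID int

// DirtySet is a flat set of boundaries that need repainting.
// Hypothetical; it only illustrates the O(1) idle check.
type DirtySet struct {
	ids map[BoundaryID]struct{}
}

func NewDirtySet() *DirtySet {
	return &DirtySet{ids: make(map[BoundaryID]struct{})}
}

// Mark records that a boundary changed. Marking twice is harmless.
func (s *DirtySet) Mark(id BoundaryID) { s.ids[id] = struct{}{} }

// HasDirty is the idle check: it reads the set size, never the tree.
func (s *DirtySet) HasDirty() bool { return len(s.ids) > 0 }

// Drain returns the number of dirty boundaries and empties the set,
// as a frame would after repainting them.
func (s *DirtySet) Drain() int {
	n := len(s.ids)
	for id := range s.ids {
		delete(s.ids, id)
	}
	return n
}

func main() {
	s := NewDirtySet()
	fmt.Println(s.HasDirty()) // false: idle, skip the frame
	s.Mark(7)
	s.Mark(7)
	fmt.Println(s.HasDirty()) // true
	fmt.Println(s.Drain())    // 1
	fmt.Println(s.HasDirty()) // false
}
```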

Per-Boundary GPU Textures

Each RepaintBoundary renders into its own offscreen MSAA texture. When a child boundary becomes dirty, only that boundary's texture is re-rendered. The compositor blits all textures in a single non-MSAA pass.

A 48×48 spinner touching 2,304 pixels no longer forces the GPU to process 480,000 pixels of unchanged content.

Multi-Rect Damage

When multiple widgets are dirty in different screen regions, we don't union them into one giant rect. Each dirty rect gets its own GPU scissor:

Frame N: spinner (48×48) + status bar (800×24)
→ Two scissor rects, not one 800×600 rect
→ Zero pixel waste

The damage pipeline flows through the full stack: ui → gg RenderDirectWithDamageRects → wgpu PresentWithDamage. A ring buffer stores rect lists for N-buffer swapchains. Once the list reaches a threshold of 16 rects, it is merged into a single union rect (the GDK/Sway pattern).
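
The threshold rule can be sketched like this. The `Rect` and `Damage` types are hypothetical, and the exact trigger is an assumption: this sketch collapses the list once more than 16 rects accumulate, which may differ from the boundary condition the library uses. The ring buffer for N-buffer swapchains is left out.

```go
package main

import "fmt"

// Rect is an axis-aligned rectangle in pixels. Hypothetical type.
type Rect struct{ X, Y, W, H int }

func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// Union returns the smallest rectangle containing both r and o.
func (r Rect) Union(o Rect) Rect {
	x0, y0 := minInt(r.X, o.X), minInt(r.Y, o.Y)
	x1, y1 := maxInt(r.X+r.W, o.X+o.W), maxInt(r.Y+r.H, o.Y+o.H)
	return Rect{x0, y0, x1 - x0, y1 - y0}
}

// maxDamageRects is the merge threshold from the post. Treating it as
// "more than 16" is an assumption of this sketch.
const maxDamageRects = 16

// Damage collects the dirty rects of one frame.
type Damage struct{ rects []Rect }

// Add appends a dirty rect. Past the threshold the list collapses into
// one bounding rect, so the number of scissor rects stays bounded.
func (d *Damage) Add(r Rect) {
	d.rects = append(d.rects, r)
	if len(d.rects) > maxDamageRects {
		u := d.rects[0]
		for _, x := range d.rects[1:] {
			u = u.Union(x)
		}
		d.rects = []Rect{u}
	}
}

// Rects returns the rects to use as GPU scissors for this frame.
func (d *Damage) Rects() []Rect { return d.rects }

func main() {
	var d Damage
	d.Add(Rect{0, 0, 48, 48})    // spinner
	d.Add(Rect{0, 576, 800, 24}) // status bar
	fmt.Println(len(d.Rects()))  // 2: one scissor per rect

	var busy Damage
	for i := 0; i < 17; i++ {
		busy.Add(Rect{i * 10, 0, 10, 10})
	}
	fmt.Println(busy.Rects()) // [{0 0 170 10}]
}
```

The trade-off is the usual one: a handful of small scissors avoids touching unchanged pixels, while an unbounded list would cost more in state changes than it saves in fill.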

Persistent Layer Tree

UpdateLayerTree() reuses layer objects across frames instead of rebuilding the tree:

| Metric | Before | After |
|---|---|---|
| Allocs per frame (200 boundaries) | 613 | 13 |
| Reduction | | 97.9% |

Flutter calls this addRetained. Android calls it RenderNode reuse. We measured allocation profiles against both and matched their patterns.

The Numbers

| Metric | v0.1.0 | v0.1.21 |
|---|---|---|
| Lines (total / code) | 150K / 105K | 195K / 141K |
| Tests | 6,000 | 7,200+ |
| Coverage | 97% | 97%+ |
| Packages | 56 | 56 |
| GPU idle (static UI) | 5-18% | 0% |
| Frame skip check | O(n) tree walk | O(1) flat set |
| Allocs/frame (200 boundaries) | 613 | 13 |
| Spinner GPU work | full window | 48×48 scissor |

Ecosystem Update

The rendering pipeline required changes across four repositories. Here's where the ecosystem stands:

| Repository | Version | Lines | Code | What It Does |
|---|---|---|---|---|
| naga | v0.17.13 | 323K | 240K | Shader compiler: WGSL → SPIR-V, MSL, GLSL, HLSL, DXIL |
| gg | v0.46.8 | 240K | 171K | 2D graphics: Skia-class rasterizer, GPU SDF, scene compositor |
| wgpu | v0.27.3 | 211K | 164K | Pure Go WebGPU: Vulkan, DX12, Metal, GLES, Software |
| ui | v0.1.21 | 195K | 141K | GUI toolkit: 22 widgets, 4 themes, Layer Tree pipeline |
| gogpu | v0.34.3 | 61K | 45K | App framework: windowing, input, three-mode render loop |
| + gpucontext, gputypes, systray, audio | | 19K | 13K | Shared interfaces, system tray, audio engine |
| Total | | 1,049K | 774K | 3,140 files across 9 repositories |

1M+ total lines. 774K lines of code. Zero CGO. Zero Rust. Zero C.

Recent ecosystem highlights since the v0.1.0 article:

  • First Pure Go DXIL generator — naga compiles WGSL shaders directly to DXIL bytecode, eliminating the HLSL→FXC/DXC dependency. 161/170 IDxcValidator pass rate. Article.
  • Born ML v0.8.0 migrated to gogpu/wgpu — production ML framework running on our GPU stack. 105 GPU tests pass, HRM model trained 20 epochs. Article.
  • CJK text rendering — script-aware hinting, exact-size rasterization, Tier 6 routing for Chinese/Japanese/Korean glyphs.
  • LCD ClearType auto-detection — Windows SPI + registry, macOS None, Linux Xft/Wayland. Per-platform subpixel layout.
  • Software backend for CI — deterministic GPU without GPU hardware. Pixel-exact e2e tests prove scissor rects at HAL level.
  • Community deep-dive — independent technical analysis of gogpu/wgpu (Chinese) covering the zero-CGO syscall architecture, Snatchable resource lifecycle, and buffer state tracking internals. Always good to see the community dig into the implementation.

The Foundation Is Ready

This is the release where we stopped rebuilding and started building on top.

For the past two months every release was infrastructure: retained-mode rendering, scene composition, Layer Tree, damage tracking, boundary isolation. The kind of plumbing that's invisible to users but determines whether a framework can scale to real applications.

That plumbing is now in place. The render pipeline follows the same architectural patterns as Flutter, Chrome, and Qt6 — not because we copied them, but because we studied all five independently and arrived at the same conclusions. Layer Tree composition, per-boundary GPU textures, multi-rect damage, persistent allocation — these are industry-proven patterns, and they're production-ready in gogpu/ui.

The ecosystem has stabilized around this architecture. naga (shader compiler), wgpu (WebGPU HAL), gg (2D graphics), and gogpu (windowing) all reached the point where API churn is minimal and releases are incremental improvements, not rewrites. Nine repositories, 1M+ lines, and the dependency chain holds.

What this means going forward: the pipeline will be optimized, not rebuilt. Future releases will focus on:

  • New widgets — the 22 we ship today cover most use cases, but enterprise apps need more (color picker, date picker, rich text editor, tree grid)
  • Performance polish — reducing GPU usage for animated widgets from 10% to <3%, ListView recycling, texture GC
  • Platform accessibility — UIA on Windows, AT-SPI2 on Linux, NSAccessibility on macOS
  • Developer experience — better docs, more examples, smoother onboarding

The hard part is behind us. The interesting part is ahead.

Try It

git clone https://github.com/gogpu/ui.git
cd ui/examples/gallery
go run .

Four design systems ship out of the box: Material Design 3, JetBrains DevTools, Microsoft Fluent, Apple Cupertino. Switch between them at runtime in the gallery example.

Backend selection via environment variable:

GOGPU_GRAPHICS_API=vulkan   go run ./examples/ide/
GOGPU_GRAPHICS_API=dx12     go run ./examples/ide/
GOGPU_GRAPHICS_API=gles     go run ./examples/ide/
GOGPU_GRAPHICS_API=software go run ./examples/ide/

No code changes needed.

Help Us Get There

gogpu/ui is at the stage where the architecture is proven but the user base is small. We need real-world testing to catch edge cases that no amount of 97% coverage will find.

Test it. Clone a repo, run an example, try building something with it. If it breaks — that's valuable. File an issue, and we'll fix it. If it works — that's valuable too. Tell us what you built.

Spread the word. Most Go developers don't know this exists yet. A post on Reddit, a tweet, a mention in your team's Slack — it all helps. The project grows through people who try it and talk about it, not through marketing.

Write about it. Tutorials, experience reports, comparisons, critiques — all welcome. If you build something interesting with gogpu/ui, write about the process. The ecosystem needs content from people other than us.

Contribute. You don't need to touch the render pipeline. Documentation improvements, new examples, widget ideas, accessibility testing, CI on different hardware — there's work at every level. Check CONTRIBUTING.md or just open a discussion.

The codebase is 1M+ lines of pure Go with zero CGO. The foundation is solid. What it needs now is people building on it.


GitHub · Discussions · CHANGELOG · Reddit r/golang · X/Twitter
