HYUN SOO LEE

Posted on May 11

How I Built an Automated Korean Saju (四柱) Content Pipeline with Claude Vision + Python

#bazi #kpop #saju #korea

The Problem: Structured Data Trapped in Images

Korean Saju (四柱推命) content is booming. Platforms like Naver, Kakao, and a growing wave of indie astrology apps serve millions of readings per month. But the production bottleneck is brutal: every reading starts with a manse calendar screenshot — a dense image containing the four pillars (年柱/月柱/日柱/時柱), ten-god labels (十星), twelve-stage cycle markers (十二運星), and special stars (神殺) — all rendered as overlapping hanja circles and Korean tags.

If you want to publish that reading as a long-form blog post, a YouTube script, a dev.to technical explainer, or a Naver blog entry, you have to manually re-read the image, re-type the data, and re-format for each channel. For a small team running 20–30 subjects per month, that is a full-time job before any writing starts.

This post documents the pipeline I built to fix that — using Claude Vision for structured extraction, a multi-stage prompt chain for content generation, and channel-specific formatters for final output. The Saju reading for SEVENTEEN member Mingyu (戊土 day master, 1997-04-06, male, unknown birth hour) served as the live QA case for the entire system.

Pipeline Overview

[Input Form Screenshot]
↓
[Claude Vision: Structured Extraction]
↓
[Validation Layer: Pillar Consistency Check]
↓
[Prompt Chain: Analysis Blocks]
↓
[Channel Formatter: dev.to / Naver / YouTube Script]
↓
[QA Gate: Guardrail Checks]
↓
[Published Output]

Five stages. Each stage has a defined input schema, output schema, and failure mode. Let me walk through each.

Stage 1 — Claude Vision Extraction

The manse calendar image ( 02_manse_calendar_complete.png) contains the following data classes:

Subject metadata: name, gender, solar date, lunar date, birth hour
Four pillars: 年柱(丁丑), 月柱(甲辰), 日柱(戊寅), 時柱(戊午) — each with hanja characters rendered inside colored circles
Ten-god labels per pillar: 年柱 = 正印(정인) / 劫財(겁재), 月柱 = 偏官(편관) / 比肩(비견), 日柱 = 比肩(비견) / 偏官(편관), 時柱 = 比肩(비견) / 正印(정인)
Twelve-stage cycle markers: 時柱 = 帝旺(제왕), 日柱 = 長生(장생), 月柱 = 冠帶(관대), 年柱 = 養(양)
Special stars (神殺): 年柱 천을귀인(天乙貴人), 백호살(白虎殺), 화개살(華蓋殺); 月柱 백호살(白虎殺); 時柱 양인살(羊刃殺), 도화살(桃花殺); 日柱 편관(장생)
Current Daewoon (大運): 庚子(경자) — 天干庚 = 食神(식신), 地支子 = 正財(정재)
2026 Sewoon (歲運): 丙午(병오) — age 30, 天干丙 = 食神(식신) marker visible, 地支午 = 正財(정재) marker visible

The extraction prompt is a single-shot structured JSON request:

Extract ALL visible hanja characters, Korean ten-god labels,
twelve-stage cycle names, and special star names from this
manse calendar image. Return as JSON with keys:
subject, four_pillars, daewoon, sewoon.
Do NOT infer. Only extract what is visually present.
Flag any character you are less than 90% confident about.

Key engineering decision: the Flag uncertainty instruction. Without it, Claude will confidently hallucinate a ten-god label for a pillar where the image is partially obscured. In the Mingyu test case, the birth hour was unknown (모름), so the 時柱天干 ten-god label was rendered with a placeholder — the extractor correctly flagged this rather than guessing.

Stage 2 — Validation Layer

Before any content generation runs, a Python validation script checks internal consistency:

def validate_pillars(extracted: dict) -> list[str]:
errors = []
# Day master must appear in day pillar heavenly stem
if extracted["day_pillar"]["stem"] != extracted["day_master"]:
errors.append("Day master mismatch")
# Ten-god labels must be consistent with day master
# (simplified: check Friend/比肩 only appears when stem == day master element)
for pillar in ["year", "month", "hour"]:
stem = extracted[pillar]["stem"]
label = extracted[pillar]["stem_god"]
if not ten_god_consistent(extracted["day_master"], stem, label):
errors.append(f"{pillar} stem ten-god inconsistent: {stem} / {label}")
return errors

In the Mingyu case, the validation caught one edge case: the 2026 Sewoon image shows 庚子 in the Daewoon column and 丙午 in the Sewoon column, but the ten-god markers in the Sewoon column are rendered at a smaller font size and partially overlap the Daewoon column. The extractor initially assigned 食神(식신) to 丙 relative to the 庚 Daewoon stem rather than relative to the 戊 day master. The validator caught the mismatch and re-ran the extraction with an explicit anchor instruction: always compute ten-god labels relative to the day master stem (日干), not relative to the Daewoon stem.

This is the single most common error in automated Saju extraction. Build the check before you build anything else.

Stage 3 — Prompt Chain: Analysis Blocks

Once the structured JSON is validated, the content generation runs as a sequential block chain, not a single mega-prompt. Each block receives the validated JSON plus the output of the previous block as context.

Block	Input	Output
Core structure analysis	Four pillars JSON	Five-element ratio, body strength (身强/身弱), yongshin (用神)
Daewoon × Sewoon interaction	Above + Daewoon/Sewoon JSON	Clash (冲), combination (合), ten-god activation summary
Domain readings	Above + domain list	Career, wealth, relationships, health — each 150–200 tokens
Risk signals	Above	2–3 specific risk patterns with structural basis
Quarterly strategy	All above	Q1–Q4 action framing

The Mingyu pipeline produced the following core structural read: 戊土(무토) day master, 土 at 50% of the five-element distribution, 木 25%, 火 25%, 金·水 at 0%. Strong body (身强) structure. 金·水 is the practical yongshin (用神) because 食傷 and 財星 are absent from the main stems and branch primary qi. The current Daewoon 庚子 brings 金·水 energy — structurally favorable — but the 2026 Sewoon 丙午 adds 火·印星 energy that reinforces the already-heavy 土, requiring output regulation rather than expansion.

The key prompt engineering rule here: each block prompt ends with Return only the analysis for this block. Do not summarize previous blocks. Do not add caveats not supported by the extracted data. This keeps token usage predictable and prevents the model from hallucinating structural details it did not extract.

Stage 4 — Channel Formatter

The same analysis JSON feeds three formatters in parallel:

dev.to formatter (this article): technical framing, English, markdown headers, code blocks for pipeline logic, manse calendar data cited as structured input rather than mystical oracle.

Naver Blog formatter: Korean, conversational tone, HTML-compatible line breaks, domain-specific vocabulary (운세, 사주, 대운), no code blocks, CTA pointing to the full reading platform.

YouTube Script formatter: spoken Korean, 8–12 minute target length, hook in first 30 seconds referencing the specific structural pattern (e.g., "土가 절반인 사주에서 2026년에 무슨 일이 생기는지"), chapter markers at each domain block.

Each formatter has a channel-specific guardrail list appended to the system prompt:

GUARDRAILS (dev.to):

No fortune-telling framing
No certainty language (must/will/definitely)
No privacy speculation
Cite image source for all data claims
CTA = runartree.com

Stage 5 — QA Gate

Before any output is published, an automated QA pass checks:

Certainty language scan: regex for "반드시", "무조건", "절대", "확실히", "will definitely", "guaranteed" — any match blocks publication
Data consistency check: every hanja character in the output must appear in the validated extraction JSON — no hallucinated pillars
Tone check: sentiment classifier flags outputs that score above threshold on anxiety-inducing language
Word count gate: channel-specific minimum/maximum
CTA presence: output must contain the CTA URL exactly once

The Mingyu test case passed all five gates on the second run. The first run failed the certainty language scan because the career block used the phrase "성과가 반드시 쌓입니다" — caught, revised to "성과가 쌓이는 흐름입니다."

[INFO_GRAPHIC] Mingyu 2026 — Structural Snapshot

Day Master: 戊土 (Strong Earth) | Body Strength: 身强
Five Elements: 土50% / 木25% / 火25% / 金·水 0%
Yongshin (用神): 金·Water

四柱 (Four Pillars):
年柱: 丁(天乙貴人·白虎殺·華蓋殺) / 丑(劫財·養)
月柱: 甲(偏官·★) / 辰(比肩·冠帶·白虎殺)
日柱: 戊 / 寅(偏官·長生)
時柱: 戊(比肩·羊刃殺·桃花殺) / 午(正印·帝旺)

Daewoon: 庚子 | 食神 / 正財 | 子↔午冲, 子↔丑合
2026 Sewoon: 丙午 | 偏印 / 正印 | 午+午 double-trigger

Score: 68/100 | Flow: Upper-Mid

The Reversal: 天乙貴人 in 年支

The structural story of this chart is heavy 土, absent 金·Water, strong body — sounds like a closed, self-sufficient system that resists external input. That reading is mostly correct. But the 年支丑(축) carries 天乙貴人(Heavenly Noble Star), which is the single most reliable "unexpected door opens" marker in classical Saju.

Zi Ping Zhen Quan (子平真詮, Chapter on Useful Gods) notes that 貴人 stars activate most visibly when the chart is otherwise under pressure — precisely the condition created by the 子↔午 clash in the current Daewoon. The practical pipeline implication: the QA system flags 天乙貴人 as a positive anomaly that must appear in the output, because omitting it would produce a structurally incomplete and tonally misleading read.

Lessons Learned

1. Anchor ten-god labels to the day master, always. The most common extraction error. Build the validator before the generator.

2. Block-chain prompts outperform single mega-prompts. Smaller, sequential blocks produce more consistent structural logic and are easier to QA.

3. Unknown birth hour is not a blocker. The 時柱 is flagged as uncertain, the analysis runs on the remaining three pillars, and the output notes the limitation explicitly. Do not hallucinate a birth hour.

4. Channel formatters need their own guardrail lists. What is appropriate framing for a technical dev.to post is wrong for a Naver Blog post aimed at a general Korean audience, and vice versa.

5. The 午↔午 double-trigger is a real extraction challenge. When the 時柱地支 and the 2026 Sewoon 地支 are identical (both 午), some extractors collapse them into a single entry. The validator must check for this explicitly.

Summary

Claude Vision extracts structured four-pillar data from manse calendar screenshots; validation anchors all ten-god labels to the day master stem before any content generation runs
A sequential block-chain prompt architecture produces domain-specific analysis (career, wealth, relationships, health) with traceable structural basis — for Mingyu's 戊土 chart, the 2026 丙午 Sewoon adds 火·印星 pressure to an already 土-heavy structure, making output regulation the core strategic frame
Channel-specific formatters and a five-gate QA pass convert the same validated JSON into dev.to, Naver Blog, and YouTube Script outputs without manual reformatting

Want to explore this pipeline or run a reading for your own chart?
Full manse calendar generation, structured analysis, and long-form content output are available at runartree.com.

This article uses a published Saju reading as a technical test case for pipeline documentation purposes. Saju analysis reflects structural pattern interpretation based on classical Chinese metaphysics and does not constitute prediction, guarantee, or professional advice of any kind. All outputs should be read as probabilistic framing, not certainty.

Project link

This article is based on an automated content workflow for a Korean Saju platform.

Website: https://runartree.com?utm_source=devto&utm_medium=article&utm_campaign=saju_automation
Stack: Python, Claude Vision, channel-specific formatting, content QA
Domain: Korean Saju / Bazi content automation

The key lesson is simple: generation alone is not enough. A useful publishing pipeline also needs formatting, QA, tracking links, and channel-specific editorial rules.

Bazi interpretation. Not medical, legal, or investment advice.

DEV Community