HYUN SOO LEE
Automating Korean Saju (四柱) Content at Scale: A Python + Claude Vision Pipeline


Three facts that broke our first prototype: a single Saju chart image contains six distinct data layers, every channel wants a different voice, and one wrong Heavenly Stem (天干) ruins reader trust permanently.


The Problem

Korean Saju (사주, 四柱命理) content is exploding across social platforms. Creators publish chart analyses tied to celebrities, cultural moments, and trending dates. The bottleneck is not demand — it is the extraction-to-publish pipeline.

A standard Manse chart image (만세력) packs the following into one screenshot:

  • Four pillars (年柱·月柱·日柱·時柱) each with a Heavenly Stem (天干) and Earthly Branch (地支)
  • Ten-God relationships (十神) for every stem and branch
  • Twelve Growth Stages (十二運星) per branch
  • Spirit Killings (神殺) attached to specific pillars
  • A current Great Fortune cycle (大運) with its own stem, branch, and Ten-God
  • An Annual Fortune (歲運) overlay for the target year

Manual extraction by a trained analyst takes 15–25 minutes per chart. At volume — say, a weekly content calendar covering 10 subjects — that is up to 250 minutes of specialist labor before a single word is written.

Our goal: reduce extraction + first-draft time to under 90 seconds per chart while keeping accuracy high enough to survive expert review.


Pipeline Overview

The pipeline has six stages:

[Image Input] → [Vision Extraction] → [Structured Validation] → [Prompt Assembly] → [Channel Formatter] → [QA Gate] → [Output]

Each stage is a discrete Python module. Stages 3 and 6 are the ones that saved us from shipping garbage.


Stage 1 — Image Ingestion

We accept PNG/JPG uploads via a small FastAPI endpoint. Images are resized to a maximum of 1600px on the long edge before being base64-encoded for the Vision API call. Larger images did not improve extraction accuracy in our tests and added latency.

from PIL import Image
import base64, io

def prepare_image(path: str, max_long_edge: int = 1600) -> str:
    """Resize to at most max_long_edge on the long side, return base64 PNG."""
    img = Image.open(path)
    # thumbnail() preserves aspect ratio and never upscales.
    img.thumbnail((max_long_edge, max_long_edge))
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return base64.b64encode(buf.getvalue()).decode()


Stage 2 — Claude Vision Extraction

This is the core of the system. We pass the encoded image to Claude with a strict extraction prompt. The prompt instructs the model to return a JSON object with exactly the fields we expect — no inference, no interpretation, only what is visually present in the image.

Key prompt rules we enforced after painful iteration:

  1. Read the label, do not infer. If the image shows 偏財 (Indirect Wealth) on the Hour Pillar Heavenly Stem, output "偏財". Do not substitute 正財 (Direct Wealth) because the underlying stem calculation suggests it.
  2. Preserve hanja. All Ten-God labels, pillar characters, and spirit killing names must be returned in their original Chinese characters alongside the Korean hangul label shown in the image.
  3. Null over guess. If a field is not visible — birth hour unknown, for example — return null. Do not hallucinate a value.
  4. Separate layers. The Annual Fortune (歲運) overlay must be stored in its own object, not merged into the pillar array.
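
With those rules pinned down, the call itself stays small. Here is a minimal sketch, assuming the official anthropic Python SDK and the prepare_image helper from Stage 1; the model name, token limit, and extraction_prompt contents are illustrative:

import json
import anthropic  # assumes the official Anthropic Python SDK

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def extract_chart(image_b64: str, extraction_prompt: str) -> dict:
    # One single-turn call: the encoded chart image plus the strict prompt.
    message = client.messages.create(
        model="claude-sonnet-4-20250514",  # illustrative model name
        max_tokens=2048,
        messages=[{
            "role": "user",
            "content": [
                {"type": "image",
                 "source": {"type": "base64",
                            "media_type": "image/png",
                            "data": image_b64}},
                {"type": "text", "text": extraction_prompt},
            ],
        }],
    )
    # The prompt demands JSON only, so parse the first text block directly.
    return json.loads(message.content[0].text)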

Sample extraction schema (abbreviated):

{
  "subject": "BLACKPINK",
  "gender": "female",
  "solar_date": "2016-08-08",
  "lunar_date": "2016-07-06",
  "pillars": {
    "year":  {"stem": "丙", "branch": "申", "stem_god": "偏財", "branch_god": "偏印", "stage": "長生", "killings": ["文昌貴人","暗錄","月空","驛馬殺"]},
    "month": {"stem": "丙", "branch": "申", "stem_god": "偏財", "branch_god": "偏印", "stage": "長生", "killings": ["文昌貴人","暗錄","月空","驛馬殺"]},
    "day":   {"stem": "壬", "branch": "戌", "stem_god": "比肩", "branch_god": "偏官", "stage": "冠帶", "killings": ["月德貴人","白虎殺","火蓋殺"]},
    "hour":  {"stem": "丙", "branch": "午", "stem_god": "偏財", "branch_god": "正財", "stage": "胎", "killings": ["月空","羊刃殺"]}
  },
  "great_fortune": {"age": 10, "stem": "甲", "branch": "午", "stem_god": "食神", "branch_god": "正財", "stage": "胎"},
  "annual_fortune_2026": {"stem": "丙", "branch": "午", "stem_god": "偏財", "branch_god": "正財", "killing": null}
}


Stage 3 — Structured Validation

Before any content is generated, a Pydantic model validates the extracted JSON. This catches the most common Vision errors:

  • Stem/branch combinations that are calendrically impossible
  • Ten-God labels that contradict the Day Master (日主) stem
  • Missing required fields for the target content type

from pydantic import BaseModel, validator

VALID_STEMS = ["甲","乙","丙","丁","戊","己","庚","辛","壬","癸"]
VALID_BRANCHES = ["子","丑","寅","卯","辰","巳","午","未","申","酉","戌","亥"]

class Pillar(BaseModel):
    stem: str
    branch: str
    stem_god: str
    branch_god: str
    stage: str
    killings: list[str] = []

    # Membership checks: reject any character that is not one of the
    # ten Heavenly Stems or twelve Earthly Branches.
    @validator("stem")
    def stem_must_be_valid(cls, v):
        assert v in VALID_STEMS, f"Invalid stem: {v}"
        return v

    @validator("branch")
    def branch_must_be_valid(cls, v):
        assert v in VALID_BRANCHES, f"Invalid branch: {v}"
        return v
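
The first bullet above (calendrically impossible stem/branch combinations) needs a pair-level rule on top of the per-field membership checks. A minimal sketch, assuming the standard sexagenary parity constraint (yang stems only ever pair with yang branches, yin with yin); PillarStrict is a hypothetical extension of the Pillar model:

from pydantic import root_validator

class PillarStrict(Pillar):
    # In the 60-pair sexagenary cycle, stem and branch indices always share
    # parity: 甲子 (indices 0, 0) exists, 甲丑 (indices 0, 1) does not.
    @root_validator
    def pair_must_be_calendrical(cls, values):
        s, b = values.get("stem"), values.get("branch")
        if s in VALID_STEMS and b in VALID_BRANCHES:
            assert (VALID_STEMS.index(s) - VALID_BRANCHES.index(b)) % 2 == 0, \
                f"Calendrically impossible pillar: {s}{b}"
        return values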

Validation failures route to a human review queue rather than silently proceeding. Our false-positive rate on valid charts is under 3%.


Stage 4 — Prompt Assembly

Once the structured data is validated, a prompt builder assembles the generation prompt. This is where channel strategy diverges.

We maintain a ChannelConfig object per destination:

from dataclasses import dataclass

@dataclass
class ChannelConfig:
    channel: str               # "devto" | "instagram" | "threads" | "youtube_desc"
    word_target: int
    tone: str                  # "technical" | "casual" | "narrative"
    required_blocks: list[str]
    forbidden_patterns: list[str]
    language: str              # "en" | "ko" | "mixed"

For dev.to, the tone is technical, the language is en, and forbidden patterns include absolute certainty language ("will definitely", "100%", "guaranteed"). For Instagram, the tone is casual and the word target drops to 150.
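
For example, a hypothetical dev.to config; the word target and required block names here are illustrative, not the production values:

devto_config = ChannelConfig(
    channel="devto",
    word_target=900,  # illustrative; tune per channel analytics
    tone="technical",
    required_blocks=["INFO_GRAPHIC", "CTA"],
    forbidden_patterns=[r"will definitely", r"100%", r"guaranteed"],
    language="en",
)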

The prompt builder injects the validated chart JSON, the channel config, and a block template list. It does not pass raw image bytes to the generation call — only the already-validated structured data. This separation means the generation model never has to do double duty as both extractor and writer.
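
A minimal sketch of that builder, assuming a hypothetical block_templates registry; the real prompt wording differs, but the point stands that only validated JSON and config fields ever reach it:

import json

def build_prompt(chart: dict, cfg: ChannelConfig,
                 block_templates: dict[str, str]) -> str:
    # Channel rules come from config, chart facts from validated JSON.
    blocks = "\n".join(block_templates[name] for name in cfg.required_blocks)
    return (
        f"Write a {cfg.tone} piece of about {cfg.word_target} words "
        f"in language '{cfg.language}'.\n"
        f"Avoid these patterns entirely: {cfg.forbidden_patterns}\n"
        f"Include these blocks:\n{blocks}\n"
        f"Chart data (validated; treat as ground truth, do not recalculate):\n"
        f"{json.dumps(chart, ensure_ascii=False, indent=2)}"
    )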


Stage 5 — Channel Formatter

The formatter takes the raw generation output and applies channel-specific post-processing:

  • Dev.to: injects YAML frontmatter, converts section headers to ##, ensures code blocks use proper fencing, appends CTA link
  • Instagram: strips all markdown, enforces emoji density cap (max 1 per 30 chars), truncates to 2200 chars
  • Threads: max 500 chars, single hook sentence + link
  • YouTube description: timestamps placeholder, keyword density check

The formatter also handles the INFO_GRAPHIC block — a markdown table summarizing the four pillars in a scannable format for dev.to readers:

| Pillar | Stem (天干) | Branch (地支) | Stem God (十神) | Branch God | Stage (運星) |
| --- | --- | --- | --- | --- | --- |
| Year 年柱 | 丙 | 申 | Indirect Wealth 偏財 | Indirect Seal 偏印 | Growth 長生 |
| Month 月柱 | 丙 | 申 | Indirect Wealth 偏財 | Indirect Seal 偏印 | Growth 長生 |
| Day 日柱 | 壬 | 戌 | Friend 比肩 | Seven Killings 偏官 | Officer Belt 冠帶 |
| Hour 時柱 | 丙 | 午 | Indirect Wealth 偏財 | Direct Wealth 正財 | Embryo 胎 |
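
To make the per-channel rules concrete, a minimal sketch of the Instagram branch; the markdown-stripping regexes are deliberately simplified (real markdown needs a proper parser), and the emoji-density cap is omitted for brevity:

import re

def format_instagram(text: str, max_chars: int = 2200) -> str:
    # Strip the most common markdown constructs for a plain-text caption.
    text = re.sub(r"^#+\s*", "", text, flags=re.MULTILINE)   # headers
    text = re.sub(r"[*_`]+", "", text)                       # emphasis / code
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)     # links -> label
    return text[:max_chars]                                  # hard length cap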

Stage 6 — QA Gate

This is the stage most pipelines skip and then regret. Our QA gate runs four checks:

1. Guardrail scan — regex pass for absolute-certainty language, gossip triggers (relationship status assertions, health diagnoses), and any hangul characters that leaked into an English-target output.

2. Ten-God consistency check — re-derives the expected Ten-God for each stem from the Day Master using a lookup table (see the sketch at the end of this section) and flags mismatches between derived and extracted values. This catches the single most common Vision hallucination: swapping 偏財 (Indirect Wealth) and 正財 (Direct Wealth) when the image label is small.

3. Word count band — flags outputs outside ±15% of the channel word target.

4. CTA presence — confirms the destination URL appears exactly once.

Outputs that fail any check are either auto-corrected (word count, CTA) or routed to human review (Ten-God mismatch, guardrail hit).
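
To make check 2 concrete, here is a minimal sketch of the re-derivation, assuming the standard Five-Element production and control cycles; the table layout and function name are ours, not a library API:

STEM_INFO = {  # stem -> (element, is_yang)
    "甲": ("wood", True),  "乙": ("wood", False),
    "丙": ("fire", True),  "丁": ("fire", False),
    "戊": ("earth", True), "己": ("earth", False),
    "庚": ("metal", True), "辛": ("metal", False),
    "壬": ("water", True), "癸": ("water", False),
}
PRODUCES = {"wood": "fire", "fire": "earth", "earth": "metal",
            "metal": "water", "water": "wood"}
CONTROLS = {"wood": "earth", "earth": "water", "water": "fire",
            "fire": "metal", "metal": "wood"}

def derive_ten_god(day_master: str, other: str) -> str:
    dm_elem, dm_yang = STEM_INFO[day_master]
    o_elem, o_yang = STEM_INFO[other]
    same_polarity = dm_yang == o_yang
    if o_elem == dm_elem:                        # same element
        return "比肩" if same_polarity else "劫財"
    if PRODUCES[dm_elem] == o_elem:              # Day Master produces it
        return "食神" if same_polarity else "傷官"
    if CONTROLS[dm_elem] == o_elem:              # Day Master controls it
        return "偏財" if same_polarity else "正財"
    if CONTROLS[o_elem] == dm_elem:              # it controls the Day Master
        return "偏官" if same_polarity else "正官"
    return "偏印" if same_polarity else "正印"     # it produces the Day Master

# Sanity check against the sample chart (Day Master 壬, year stem 丙):
assert derive_ten_god("壬", "丙") == "偏財"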


Lessons Learned

Extraction and generation must be separate API calls. Asking one model to both read the image and write the article in a single prompt produces confident-sounding errors. Split the jobs.

Spirit Killings (神殺) are the hardest layer. Names like Goat Blade (羊刃殺), Travel Horse (驛馬殺), and Empty Void (空亡) appear in small font with inconsistent spacing. We added a dedicated second Vision pass focused only on the killing labels after the main extraction, then merged results. Accuracy on killings went from 71% to 94%.

The Annual Fortune (歲運) overlay is visually ambiguous. In many Manse images, the 2026 歲運 column uses the same color scheme as the Great Fortune (大運) column. Explicit prompt instruction — "the Annual Fortune column is the leftmost column labeled with a year, not an age" — was necessary to prevent confusion.

Pydantic validation pays for itself in the first week. Before we added it, approximately one in eight generated articles contained an impossible stem/branch pair that any trained reader would immediately distrust. Validation eliminated that category of error entirely.

Channel config as code, not as prompt text. Early versions embedded channel instructions directly in the generation prompt. This made A/B testing formats expensive. Moving channel rules into a ChannelConfig dataclass and keeping the generation prompt channel-agnostic made iteration dramatically faster.


Summary

  • A six-stage pipeline (ingest → extract → validate → assemble → format → QA) reduces Saju chart analysis time from 15–25 minutes to under 90 seconds while maintaining expert-reviewable accuracy.
  • Separating Vision extraction from text generation, and validating the structured intermediate, is the single highest-leverage architectural decision.
  • Channel-specific formatting belongs in config objects and post-processors, not in generation prompts.

This article describes a technical content automation pipeline. Chart data used for pipeline illustration is drawn from publicly available Manse chart images. No claim is made about any individual's future, career, relationships, or outcomes. Saju analysis is a traditional interpretive framework; all outputs should be treated as exploratory content, not as predictive fact.

Explore the tooling at runartree.com


Project link

This article is based on an automated content workflow for a Korean Saju platform.

The key lesson is simple: generation alone is not enough. A useful publishing pipeline also needs formatting, QA, tracking links, and channel-specific editorial rules.


Saju (Bazi) interpretation is not medical, legal, or investment advice.
