
Hariharan Sharma


Building a JSON Schema Validator from Scratch (Part 1)

The motivation

I wanted to go beyond a surface-level understanding and study JSON Schema and validator architecture from first principles.

My goal is not just to use JSON Schema, but to understand how validators work internally so I can:

  • Build a JSON Schema validator from scratch
  • Develop stronger schema design intuition
  • Participate more meaningfully in JSON Schema community discussions
  • Contribute to the JSON Schema ecosystem with actual technical depth instead of vague theory

This path was also inspired by advice from experienced community members, and honestly, it made perfect sense. If you want to contribute seriously, reading documentation casually is not enough. You need to understand the specification, implementation trade-offs, and architectural reasoning behind it.

What I have covered so far

I started by revisiting the official JSON Schema getting started documentation to strengthen foundational concepts, then moved directly into the validation specification itself.

Some of the more challenging but important concepts I explored include:

Interoperability considerations

Understanding how validators must behave consistently across programming languages, especially around:

  • Arbitrary precision numbers
  • String validation edge cases
  • Regular expression portability

This is critical because building a validator means thinking beyond language-specific limitations.
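Two of these concerns are easy to see in a few lines. The following is a minimal sketch, assuming Python as the illustration language; the helper names are hypothetical and not taken from any existing validator. It compares a `multipleOf`-style check done with binary floats against one done with exact decimals, and shows that the "length" of a string depends on what is being counted.

```python
from decimal import Decimal


def is_multiple_float(value: float, divisor: float) -> bool:
    # Naive check using binary floating point.
    return value % divisor == 0


def is_multiple_decimal(value: str, divisor: str) -> bool:
    # Same check on exact decimal values built from the number's JSON text.
    return Decimal(value) % Decimal(divisor) == 0


def length_in_code_points(text: str) -> int:
    # Python's len() counts Unicode code points.
    return len(text)


def length_in_utf16_units(text: str) -> int:
    # What a UTF-16 based runtime would report for the same string.
    return len(text.encode("utf-16-le")) // 2


print(is_multiple_float(0.3, 0.1))        # False
print(is_multiple_decimal("0.3", "0.1"))  # True
print(length_in_code_points("😀"))         # 1
print(length_in_utf16_units("😀"))         # 2
```

A validator that parses numbers into floats and one that keeps them as decimals can therefore disagree on the same schema and instance, which is exactly the kind of cross-language inconsistency the specification warns about.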

Meta-schema and dialect bootstrapping

One of the most important concepts was understanding that before validating JSON instances, a validator first has to work out how to interpret, and ideally validate, the schema itself. That involves:

  • $schema
  • $vocabulary
  • Draft compatibility
  • Meta-schema validation

This completely changed how I view schema loading and validator initialization.
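The first step of that initialization can be sketched as a dialect lookup keyed on `$schema`. This is only an illustration under stated assumptions: the function name, the exception class, and the choice of 2020-12 as the fallback dialect are hypothetical, and the subsequent step of validating the schema against its meta-schema is not shown.

```python
from typing import Any

# Dialect identifiers published by the JSON Schema project.
KNOWN_DIALECTS = {
    "https://json-schema.org/draft/2020-12/schema": "2020-12",
    "https://json-schema.org/draft/2019-09/schema": "2019-09",
    "http://json-schema.org/draft-07/schema": "draft-07",
}

# Assumption: the dialect used when a schema does not declare $schema.
DEFAULT_DIALECT = "2020-12"


class UnsupportedDialect(Exception):
    """Raised when $schema names a dialect this sketch does not know."""


def detect_dialect(schema: Any) -> str:
    # Boolean schemas (true / false) and objects without $schema
    # fall back to the default dialect.
    if not isinstance(schema, dict) or "$schema" not in schema:
        return DEFAULT_DIALECT
    uri = schema["$schema"]
    if not isinstance(uri, str):
        raise UnsupportedDialect(repr(uri))
    # Drop an empty trailing fragment so ".../schema#" matches ".../schema".
    key = uri[:-1] if uri.endswith("#") else uri
    if key not in KNOWN_DIALECTS:
        raise UnsupportedDialect(uri)
    return KNOWN_DIALECTS[key]
```

Calling `detect_dialect({"$schema": "http://json-schema.org/draft-07/schema#"})` returns `"draft-07"`, and an unknown identifier raises instead of silently guessing, so keyword semantics are never applied under the wrong draft.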

Validation keyword mechanics

I explored deeper validation behavior for:

  • minimum
  • maximum
  • exclusiveMinimum
  • exclusiveMaximum
  • multipleOf
  • contains
  • minContains
  • maxContains

Array counting logic and numeric correctness in particular are far more nuanced than they initially appear.
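As a rough illustration of those nuances, here is a minimal sketch of the range keywords and the `contains` counting rules, assuming the 2019-09 / 2020-12 semantics (numeric `exclusiveMinimum` / `exclusiveMaximum`, and `minContains` defaulting to 1). The function names and the `matches` callback, which stands in for full subschema evaluation, are hypothetical.

```python
from typing import Any, Callable, Dict


def is_number(value: Any) -> bool:
    # In Python, bool is a subclass of int, but JSON true/false are not numbers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_range(instance: Any, schema: Dict[str, Any]) -> bool:
    # Numeric keywords do not constrain non-numeric instances.
    if not is_number(instance):
        return True
    if "minimum" in schema and instance < schema["minimum"]:
        return False
    if "maximum" in schema and instance > schema["maximum"]:
        return False
    # Numeric form; in draft-04 these two keywords were booleans instead.
    if "exclusiveMinimum" in schema and instance <= schema["exclusiveMinimum"]:
        return False
    if "exclusiveMaximum" in schema and instance >= schema["exclusiveMaximum"]:
        return False
    return True


def check_contains(
    instance: Any,
    schema: Dict[str, Any],
    matches: Callable[[Any, Any], bool],
) -> bool:
    # minContains / maxContains have no effect without contains.
    if not isinstance(instance, list) or "contains" not in schema:
        return True
    count = sum(1 for item in instance if matches(item, schema["contains"]))
    if count < schema.get("minContains", 1):
        return False
    if "maxContains" in schema and count > schema["maxContains"]:
        return False
    return True
```

Two details are easy to miss: `minContains: 0` lets an array with no matching items pass `contains`, and the count covers every matching item, so evaluation cannot stop at the first match when `maxContains` is present.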

Specification architecture

Rather than treating the documentation as reference material, I am approaching it as an implementation blueprint.

This means understanding:

  • Why certain design decisions exist
  • Where interoperability breaks
  • Practical compromises used by existing validators like Ajv
  • How vocabulary systems future-proof the specification
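The vocabulary point can be made concrete with a small sketch. In the 2019-09 and 2020-12 drafts, a meta-schema's `$vocabulary` object maps vocabulary URIs to a boolean: `true` means the vocabulary is required, `false` means it is optional. The function below is a hypothetical illustration of that rule, and the supported set is an assumption about what a small validator might implement.

```python
from typing import Dict, Set

# Assumption: the vocabularies this hypothetical validator implements.
SUPPORTED_VOCABULARIES = frozenset({
    "https://json-schema.org/draft/2020-12/vocab/core",
    "https://json-schema.org/draft/2020-12/vocab/applicator",
    "https://json-schema.org/draft/2020-12/vocab/validation",
})


def usable_vocabularies(
    declared: Dict[str, bool],
    supported: frozenset = SUPPORTED_VOCABULARIES,
) -> Set[str]:
    # A required (true) vocabulary that is not understood means the
    # validator must refuse to process schemas using this meta-schema.
    missing = sorted(
        uri for uri, required in declared.items()
        if required and uri not in supported
    )
    if missing:
        raise ValueError("required vocabularies not supported: " + ", ".join(missing))
    # Optional (false) vocabularies that are not understood are skipped.
    return {uri for uri in declared if uri in supported}
```

New keywords can ship as a new vocabulary without breaking older validators: those either ignore it when it is optional or refuse cleanly when it is required.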

Why this matters for building a validator from scratch

This process is helping me create a proper roadmap for implementation, including:

  • Parsing schemas
  • Meta-schema validation
  • Vocabulary support
  • Accurate numeric validation
  • Regex safety
  • Schema compilation strategies
  • Standards compliance

Instead of blindly coding features, I am building an architectural understanding first.
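One item on that roadmap, schema compilation, can be sketched briefly. The idea is to walk the schema once and turn each keyword into a small check function, so that validating many instances does not re-read the schema every time. This is a deliberately tiny sketch with hypothetical names; it covers three keywords, skips unknown keywords, and does not handle boolean schemas, subschemas, or references.

```python
from typing import Any, Callable, Dict, List

Check = Callable[[Any], bool]


def is_number(value: Any) -> bool:
    # JSON true/false must not be treated as numbers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# Keyword name -> factory that builds a check from the keyword's value.
KEYWORD_COMPILERS: Dict[str, Callable[[Any], Check]] = {
    "minimum": lambda limit: (lambda v: not is_number(v) or v >= limit),
    "maximum": lambda limit: (lambda v: not is_number(v) or v <= limit),
    "maxLength": lambda limit: (lambda v: not isinstance(v, str) or len(v) <= limit),
}


def compile_schema(schema: Dict[str, Any]) -> Check:
    # Build the list of checks once, at compile time.
    checks: List[Check] = [
        KEYWORD_COMPILERS[keyword](value)
        for keyword, value in schema.items()
        if keyword in KEYWORD_COMPILERS  # unknown keywords are skipped
    ]
    # The returned closure is the "compiled" validator.
    return lambda instance: all(check(instance) for check in checks)


validate = compile_schema({"minimum": 0, "maximum": 10})
print(validate(5))       # True
print(validate(11))      # False
print(validate("text"))  # True
```

The trade-off is more work up front in exchange for cheaper repeated validation; an interpreter that walks the schema on every call is simpler to write and debug.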

Long-term vision

The specification documents will serve as ongoing reference points throughout development.

I plan to repeatedly return to them while implementing each validator component so that my project aligns as closely as possible with real specification behavior.

Beyond the project itself, this journey is preparing me to:

  • Contribute to JSON Schema discussions
  • Understand validator ecosystem limitations
  • Build better tooling
  • Potentially contribute directly to JSON Schema implementations or specification conversations

Final thought

It is slower, more technical, and occasionally absurdly dense, but it is also one of the best ways to develop genuine expertise rather than shallow framework familiarity.
