The number of LLM providers keeps growing, and so does the confusion around pricing, availability, and compatibility. OpenModels is an open-source project that brings structure to this landscape: a single registry where models, providers, and their relationships are documented, validated, and queryable.
Why This Matters
The AI inference ecosystem is fragmented in ways that cost teams real time and money:
- The same model can cost 10x more depending on which provider you use
- Latency between providers varies radically even for identical models
- Uptime is unknown until you experience an outage yourself
- Pricing pages change without notice, and there's no structured way to track it
- Choosing a provider still means opening 15 tabs and building a spreadsheet
The AI ecosystem already has excellent tooling for training and inference. What is still missing is standardized infrastructure visibility across providers.
OpenModels focuses on that operational layer.
What Is OpenModels
OpenModels is an open infrastructure project for discovering, validating, and comparing LLM models and inference providers.
The project combines:
- an open registry of model and provider metadata
- structured JSON schemas with automated validation
- provider normalization (pricing, rate limits, regions)
- real-time telemetry collection (health, latency, uptime)
- a searchable web interface and REST API
- operational intelligence for choosing the right provider
The goal is simple: make the modern AI inference ecosystem observable and comparable.
Current State
The registry currently tracks:
- 62 models — from OpenAI, Anthropic, Google, Meta, DeepSeek, Mistral, xAI and others
- 30 providers — including Together AI, Groq, DeepInfra, Cerebras, Fireworks, SambaNova and more
- 90+ provider-model mappings — each with pricing, rate limits, and region data
The public registry is community-maintained through YAML definitions on GitHub:
github.com/openmodelsrun/openmodels
Architecture
The project is intentionally split into two layers.
1. Open Registry
The public repository contains normalized ecosystem data:
```
openmodels/
├── models/      # Canonical model definitions
├── providers/   # Inference provider definitions
├── mappings/    # Provider-model links with pricing
│   ├── anthropic/
│   ├── openai/
│   ├── together-ai/
│   └── ...
└── schemas/     # JSON Schema (Draft 7) for validation
```
Every YAML file is validated on pull request via GitHub Actions:
- YAML syntax check
- JSON Schema conformance
- Referential integrity (mappings must reference existing models and providers)
- Duplicate ID detection
Nothing merges without passing all four checks.
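For illustration, a stripped-down version of those four checks might look like the sketch below. The authoritative logic lives in validate_registry.py in the repository; the schema filename and exact file layout here are assumptions.

```python
import sys
from pathlib import Path

import yaml                      # pip install pyyaml
from jsonschema import validate  # pip install jsonschema

def load_yaml_dir(path: str) -> dict:
    """Check 1: YAML syntax. safe_load raises on any malformed file."""
    return {p: yaml.safe_load(p.read_text()) for p in Path(path).rglob("*.yaml")}

models = load_yaml_dir("models")
providers = load_yaml_dir("providers")
mappings = load_yaml_dir("mappings")

# Check 2: JSON Schema (Draft 7) conformance. Schema filename is assumed.
model_schema = yaml.safe_load(Path("schemas/model.schema.json").read_text())
for doc in models.values():
    validate(instance=doc, schema=model_schema)

# Check 3: referential integrity. Mappings may only reference known ids.
model_ids = [doc["id"] for doc in models.values()]
provider_ids = [doc["id"] for doc in providers.values()]
for path, doc in mappings.items():
    if doc["model_id"] not in model_ids or doc["provider_id"] not in provider_ids:
        sys.exit(f"{path}: dangling model_id or provider_id")

# Check 4: duplicate ID detection.
if len(set(model_ids)) != len(model_ids) or len(set(provider_ids)) != len(provider_ids):
    sys.exit("duplicate id detected")

print("registry OK")
```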
2. Platform Layer
The platform consumes registry data and adds operational intelligence:
- REST API (api.openmodels.run) — model discovery, provider comparison, telemetry endpoints
- Web Interface (openmodels.run) — search, browse, compare with Command Palette and ecosystem graph
- Telemetry Workers — health probes every 5 minutes, latency probes every 15 minutes
Core Concepts
The registry is built around three entities:
```
┌─────────┐         ┌───────────┐         ┌──────────┐
│  Model  │◄────────│  Mapping  │────────►│ Provider │
└─────────┘         └───────────┘         └──────────┘
  (what)         (pricing, limits)          (where)
```
Model — a canonical LLM definition (e.g., DeepSeek V3, Claude Opus 4.6). Describes capabilities, modalities, context window, and licensing. Vendor-neutral.
Provider — an inference service (e.g., Together AI, Groq). Describes API endpoint, auth type, regions, and compatibility format.
Mapping — the glue. Connects a model to a provider with specific pricing, rate limits, and available regions.
This creates a many-to-many relationship: one model can be served by multiple providers at different price points, and one provider can serve dozens of models.
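A toy sketch of that relationship (the ids are illustrative; the real ones live in the registry):

```python
# Hypothetical in-memory view: each mapping links one model to one provider.
mappings = [
    ("deepseek-v3", "together-ai"),
    ("llama-4-scout", "deepinfra"),
    ("llama-4-scout", "groq"),
]

def providers_for(model_id: str) -> list[str]:
    """All providers serving a given model."""
    return [p for m, p in mappings if m == model_id]

def models_on(provider_id: str) -> list[str]:
    """All models served by a given provider."""
    return [m for m, p in mappings if p == provider_id]

print(providers_for("llama-4-scout"))  # ['deepinfra', 'groq']
```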
Example: Llama 4 Scout across providers
| Provider | Input (per 1M tokens) | Output (per 1M tokens) | RPM |
|---|---|---|---|
| DeepInfra | $0.06 | $0.18 | 600 |
| Groq | $0.11 | $0.34 | 30 |
| Cerebras | $0.60 | $0.60 | 30 |
Same model, three providers, 10x price difference. That's the kind of visibility the registry provides.
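To make that concrete, here is what a hypothetical monthly workload of 100M input and 20M output tokens would cost at each of the three providers above:

```python
# (input $/1M tokens, output $/1M tokens), taken from the table above
pricing = {"DeepInfra": (0.06, 0.18), "Groq": (0.11, 0.34), "Cerebras": (0.60, 0.60)}

input_m, output_m = 100, 20  # hypothetical monthly volume, in millions of tokens
for provider, (inp, out) in pricing.items():
    print(f"{provider}: ${input_m * inp + output_m * out:,.2f}")
# DeepInfra: $9.60   Groq: $17.80   Cerebras: $72.00
```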
What a Model Definition Looks Like
```yaml
id: deepseek-v3
name: DeepSeek V3
description: DeepSeek's third-generation large language model with mixture-of-experts architecture.
capabilities:
  - chat
  - completion
  - function-calling
  - code-generation
  - reasoning
modalities:
  - text
  - code
context_window: 128000
licensing: other
created_at: "2024-12-01T00:00:00.000Z"
updated_at: "2025-01-15T00:00:00.000Z"
```
What a Mapping Looks Like
```yaml
model_id: deepseek-v3
provider_id: together-ai
provider_model_name: deepseek-ai/DeepSeek-V3
pricing:
  input_per_million: 0.90
  output_per_million: 0.90
  currency: USD
rate_limits:
  requests_per_minute: 600
  tokens_per_minute: 1000000
context_window_override: null
available_regions:
  - us-east-1
created_at: "2025-01-01T00:00:00.000Z"
updated_at: "2025-01-01T00:00:00.000Z"
```
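The context_window_override field is null here; a reasonable reading (assumed, not taken from the schema docs) is that a non-null value replaces the model's canonical context window when a provider serves a truncated variant:

```python
def effective_context_window(model: dict, mapping: dict) -> int:
    # Assumed semantics: a non-null override on the mapping wins;
    # otherwise fall back to the model's canonical context_window.
    override = mapping.get("context_window_override")
    return override if override is not None else model["context_window"]

# effective_context_window({"context_window": 128000},
#                          {"context_window_override": None})  -> 128000
```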
Telemetry
The platform continuously monitors provider health and performance:
| Metric | Interval | Retention |
|---|---|---|
| Health status (up/down) | Every 5 min | 30 days |
| Time to first token (TTFT) | Every 15 min | 30 days |
| Total response time | Every 15 min | 30 days |
| Availability (uptime %) | Computed | Rolling 7 days |
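Conceptually, a TTFT probe just streams a tiny completion and times the first byte. A sketch, not the project's actual worker code (the endpoint, payload, and model id are illustrative):

```python
import time
import requests  # pip install requests

def probe_ttft(url: str, headers: dict) -> float | None:
    """Seconds from request start to first streamed byte; None on failure."""
    body = {"model": "deepseek-v3", "stream": True, "max_tokens": 1,
            "messages": [{"role": "user", "content": "ping"}]}
    start = time.monotonic()
    try:
        with requests.post(url, json=body, headers=headers,
                           stream=True, timeout=30) as r:
            r.raise_for_status()
            for _ in r.iter_content(chunk_size=1):
                return time.monotonic() - start  # first byte arrived: that's TTFT
    except requests.RequestException:
        return None  # counts against health/uptime
```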
When a user queries ranked providers for a model, the API computes a composite score:
| Factor | Weight |
|---|---|
| Uptime (7-day rolling) | 40% |
| Median latency (TTFT) | 30% |
| Price per million tokens | 20% |
| Median total response time | 10% |
This means the API can answer: "Which provider should I use for DeepSeek V3 right now?" — factoring in live performance, not just listed specs.
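With each factor pre-normalized to a 0..1 scale where 1 is best (so latency and price are inverted first; the normalization scheme is an assumption, only the weights come from the table above), the score reduces to a weighted sum:

```python
WEIGHTS = {"uptime": 0.40, "ttft": 0.30, "price": 0.20, "total_time": 0.10}

def composite_score(uptime: float, ttft: float,
                    price: float, total_time: float) -> float:
    """All inputs pre-normalized to 0..1, where 1 is best
    (latency and price already inverted)."""
    factors = {"uptime": uptime, "ttft": ttft,
               "price": price, "total_time": total_time}
    return sum(WEIGHTS[k] * factors[k] for k in WEIGHTS)
```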
API
The public API at api.openmodels.run provides model discovery, provider comparison, and telemetry data. Key endpoints:
```
GET /api/models                        # Search and list models
GET /api/models/:id/providers          # Providers for a model with pricing
GET /api/models/:id/compare            # Side-by-side provider comparison
GET /api/telemetry/ranked/:model_id    # Ranked providers by live performance
GET /api/search                        # Unified search across the registry
```
Full API reference: docs.openmodels.run/api-reference
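As a quick sketch of consuming the API (response field names are assumptions; the API reference above documents the real shape):

```python
import requests

resp = requests.get(
    "https://api.openmodels.run/api/telemetry/ranked/deepseek-v3", timeout=10)
resp.raise_for_status()
for entry in resp.json():
    print(entry)  # per-provider ranking entries; see the API reference for fields
```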
Web Interface
The web app at openmodels.run provides:
- Model search with sort by name, recency, context window, or provider count
- Provider comparison view (pricing, latency, uptime side-by-side)
- Command Palette (Cmd+K / Ctrl+K) for instant global search
- Interactive ecosystem graph (node visualization of model-provider relationships)
- Popular models section with relevance-based ranking
- Category navigation by capability, modality, and license
- Dark/light mode
- Mobile-responsive layout
Coverage
The registry covers models from OpenAI, Anthropic, Google, Meta, DeepSeek, Mistral, xAI, Alibaba, Microsoft, NVIDIA, Cohere, Moonshot, Zhipu, MiniMax, and others — across 30 inference providers including first-party APIs and third-party platforms like Together AI, Groq, DeepInfra, Fireworks, SambaNova, Nebius and Scaleway.
Full model and provider lists are available in the registry repository.
Contributing
Adding a model, provider, or mapping is a pull request:
- Fork the repository
- Create a YAML file following the schema
- Run `python validate_registry.py` locally
- Open a PR — CI validates automatically
- Merge after review
All IDs use kebab-case (deepseek-v3, together-ai). All timestamps are ISO 8601. The schemas enforce structure, so invalid data never enters the registry.
Current Focus
- Telemetry reliability and probe coverage
- Provider ranking accuracy
- OpenAPI specification for the public API
- Expanding ecosystem coverage (models, providers, regions)
Links
- Registry: github.com/openmodelsrun/openmodels
- Documentation: docs.openmodels.run
- Web Interface: openmodels.run
- API: api.openmodels.run
OpenModels is open source. Contributions welcome.