Subscription inference API

permute.sh

OpenAI-compatible text and image inference with API keys, monthly plans, and one customer console.

Platform

One API, one account, one console.

Permute keeps model access, billing, and API keys in one place.

OpenAI-Compatible

Keep your existing OpenAI SDKs and change only the base URL.

Subscription Quota

Plans include text and image capacity each month.

Unified Account

Use one account for direct API access and connected products.

Unified Console

Status, billing, and API keys stay in one place.

Overview

Built for production use.

Permute is built for teams that want one API, one billing flow, and one account.

Independent service

Permute runs as its own inference product with its own accounts, billing, and support.

Production-first

Use stable model names, one billing flow, and one console.

Focused product

The public product stays focused on keys, plans, models, billing, and status.

Principles

What the product is built to do.

  • OpenAI-compatible API contracts
  • Monthly plans with included quota
  • Stable model names with clear pricing
  • Status, billing, and keys in one console

Drop-In Endpoint

Same SDKs, controlled quota.

Use one OpenAI-compatible endpoint for text and image requests, with plans, API keys, and account access managed in the same console.

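inference.py
from openai import OpenAI

# Point the standard OpenAI SDK at Permute by changing only the base URL.
client = OpenAI(
    base_url="https://api.permute.sh/v1",
    api_key="pm_...",  # your Permute API key
)

# Send a chat completion request using a Permute model name.
response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[{"role": "user", "content": "Ship it"}],
)

# Print the text of the first returned choice.
print(response.choices[0].message.content)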
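
For teams that do not use the SDK, the same call can be written as a plain HTTP request. The sketch below is illustrative, not an official client. It assumes the endpoint follows the usual OpenAI conventions that the SDK relies on: a POST to /chat/completions under the base URL, with the API key sent as a Bearer token. The helper name build_chat_request is hypothetical. The request is built but not sent, so the snippet runs without network access.

```python
import json
import urllib.request

BASE_URL = "https://api.permute.sh/v1"  # base URL from the SDK example above


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request.

    Assumption: the path and Bearer auth mirror what the OpenAI SDK sends.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same model and message as the SDK example; the key is a placeholder.
request = build_chat_request("pm_...", "deepseek-v4-flash", "Ship it")
print(request.get_method(), request.full_url)
```

To actually send it, pass the request to urllib.request.urlopen and decode the JSON response body.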

Model Catalog

Text and image routes from day one.

Text and image generation ship together in v1, with service health surfaced through the console and status page.

Text Models

  • GLM-5.1
  • GLM-5
  • Kimi K2.6
  • Kimi K2.5
  • DeepSeek V4
  • DeepSeek V4 Flash
  • DeepSeek V3.2

Image Models

  • Lumina
  • Anima
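
An image request might look like the following sketch. It rests on several assumptions that are not confirmed here: that image generation uses the OpenAI-style /images/generations path, that the model ID for Lumina is the lowercase string "lumina", and that the size parameter is accepted. The helper name build_image_request is hypothetical. The request is only constructed, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.permute.sh/v1"  # base URL from the Drop-In Endpoint example
IMAGE_MODEL = "lumina"  # hypothetical model ID; confirm the real name before use


def build_image_request(
    api_key: str, prompt: str, size: str = "1024x1024"
) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style image generation request.

    Assumption: the endpoint accepts the same fields as OpenAI's
    /images/generations route (model, prompt, n, size).
    """
    payload = {"model": IMAGE_MODEL, "prompt": prompt, "n": 1, "size": size}
    return urllib.request.Request(
        url=f"{BASE_URL}/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# The key is a placeholder; the prompt is an arbitrary example.
image_request = build_image_request("pm_...", "A lighthouse at dusk")
print(image_request.get_method(), image_request.full_url)
```

With the OpenAI SDK, the equivalent call would be client.images.generate(model=..., prompt=...) on the same client used for text.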

Enterprise

Custom model hosting for large-volume teams.

Need a model that is not listed? For enterprise volume, Permute can host requested models on dedicated infrastructure, handle maintenance, and provide an API endpoint for your workload.

For enterprise plans or custom model hosting, contact our sales team.

  • Unlisted models, added on request for large-volume use
  • Dedicated hosting and maintenance on Permute infrastructure
  • Private API endpoint with custom pricing and plan terms