Documentation
Technical Resources
Full technical documentation, deployment guides, and configuration references are provided during evaluation.
Architecture: System design, memory tiers, policy engine
Deployment: Installation, configuration, production setup
API & CLI: Commands, metrics, telemetry endpoints
Hardware Platforms
NVIDIA CUDA: RTX series, A-series, H100, H200
AMD ROCm: MI series accelerators
Intel Gaudi / Xeon: AI accelerators, CPU
Google TPU: v4, v5 pods
Qualcomm AI: Edge accelerators
Apple Silicon: M-series processors
CPU Servers: x86, ARM architectures
Custom Hardware: Additional platforms on request
Inference Backends
PyTorch (Supported): Native inference
vLLM (Supported): PagedAttention optimization
llama.cpp (Supported): GGUF models, CPU/GPU
TensorRT-LLM (Roadmap): NVIDIA optimization
Triton (Roadmap): NVIDIA inference server
Ollama (Roadmap): Developer tooling
HuggingFace Transformers (Roadmap): Direct library integration
SGLang (Roadmap): Structured generation
MLC LLM (Roadmap): Universal deployment
ExLlamaV2 (Roadmap): GPTQ inference
Custom Backends (On Request): Additional engines and integrations