Fig. 4.0 · Deployment

Your environment.
Your terms.

Cloud, on-prem, edge, air-gapped. Install over a clean network or an empty one.

Talk to the team Explore Runtime

[ Models ]

Three ways to deploy.

Pick the model that matches how your team works. Switch later if it changes.

Model 01

your-server.local

$ curl get.sector88.co | sh

→ runtime installed

$ s88 serve

Self-hosted

You install it. We support it.

Runtime + Hub binaries
Docs & reference architectures
Email support

Cloud · On-prem

Model 02

hub.sector88.co

Fleet healthy

node-prod-01

node-prod-02

node-prod-03 · syncing

Managed

We run Hub. Runtime stays with you.

Hosted Hub with SLA
Runtime on your hardware
Dedicated support

On-prem · Edge

Model 03

deployment · ENG-2847

On-site Phase 4 / 5

AuditInstallBenchHardenLive

Forward-deployed

We come on site. We stay until live.

On-site hardware audit
Benchmarks & hardening
Knowledge transfer

Air-gapped · Classified

[ Environments ]

Anywhere you can put a server. And some places you cannot.

Cloud

AWS, Azure, GCP. Your VPC, your account, your billing.

On-premise

Your data centre, your hardware, your network. Full sovereignty.

Edge

Remote sites, vehicles, ships, ground stations. Intermittent links.

Air-gapped

Zero outbound connectivity. Install via offline media. No phone-home.

Classified

SCIF, sovereign clouds, IRAP, ITAR-friendly. Cleared engineers.

[ Process ]

From kickoff to live in production.

Every deployment runs through the same five phases. Self-hosted teams run them themselves. Forward-deployed sites get our engineers on site for each one.

Audit

Hardware probe

We inventory the hardware, network constraints, and security posture.

Install

Runtime + Hub

Drop the binary, point Runtime at Hub, models load. Hours, not weeks.

Benchmark

Real workloads

Throughput, latency, cost measured against your traffic. Reports exported.

Harden

Security review

Egress firewalls, encryption at rest, RBAC, audit pipelines, red-team probes.

Live

Production

Cutover. Knowledge transfer. We step back. The platform stays.

Hardware Agnostic

Any GPU, any backend, any model, anywhere.

Hardware Platforms

NVIDIA CUDA

Popular

AMD ROCm

Intel Gaudi / Xeon

Google TPU

Qualcomm AI

Apple Silicon

CPU Servers

View all supported hardware →

Inference Backends

PyTorch Supported

Native inference

vLLM Supported

PagedAttention optimization

llama.cpp Supported

GGUF models, CPU/GPU

TensorRT-LLM Roadmap

NVIDIA optimization

Triton Roadmap

NVIDIA inference server

Ollama Roadmap

Developer tooling

View all supported backends →

[ Licensing ]

Simple licensing.

Per-node, per-year. No usage metering. No token counting. No surprise bills.

Runtime

Per node, per year. A node is any machine running Sector88 Runtime.

Hub

Self-hosted Hub included with Enterprise. Managed Hub priced by fleet size.

Forward-deployed engagements

Scoped per site. We quote after the hardware audit, not before.

Deploy where the cloud cannot reach.

Self-hosted, managed, or forward-deployed. We pick the model with you, then make it live.

Talk to the team

Your environment. Your terms.