Fig. 4.0 · Deployment

Your environment. Your terms.

Cloud, on-prem, edge, air-gapped. Install over a clean network or an empty one.

[ Models ]

Three ways to deploy.

Pick the model that matches how your team works. Switch later if it changes.

Model 01

your-server.local
$ curl get.sector88.co | sh
→ runtime installed
$ s88 serve

Self-hosted

You install it. We support it.

  • Runtime + Hub binaries
  • Docs & reference architectures
  • Email support

Cloud · On-prem

Model 02

hub.sector88.co
Fleet healthy
node-prod-01
node-prod-02
node-prod-03 · syncing

Managed

We run Hub. Runtime stays with you.

  • Hosted Hub with SLA
  • Runtime on your hardware
  • Dedicated support

On-prem · Edge

Model 03

deployment · ENG-2847
On-site Phase 4 / 5
AuditInstallBenchHardenLive

Forward-deployed

We come on site. We stay until live.

  • On-site hardware audit
  • Benchmarks & hardening
  • Knowledge transfer

Air-gapped · Classified

Forward-deployed environment

[ Environments ]

Anywhere you can put a server. And some places you cannot.

Cloud

AWS, Azure, GCP. Your VPC, your account, your billing.

On-premise

Your data centre, your hardware, your network. Full sovereignty.

Edge

Remote sites, vehicles, ships, ground stations. Intermittent links.

Air-gapped

Zero outbound connectivity. Install via offline media. No phone-home.

Classified

SCIF, sovereign clouds, IRAP, ITAR-friendly. Cleared engineers.

[ Process ]

From kickoff to live in production.

Every deployment runs through the same five phases. Self-hosted teams run them themselves. Forward-deployed sites get our engineers on site for each one.

01

Audit

Hardware probe

We inventory the hardware, network constraints, and security posture.

02

Install

Runtime + Hub

Drop the binary, point Runtime at Hub, models load. Hours, not weeks.

03

Benchmark

Real workloads

Throughput, latency, cost measured against your traffic. Reports exported.

04

Harden

Security review

Egress firewalls, encryption at rest, RBAC, audit pipelines, red-team probes.

05

Live

Production

Cutover. Knowledge transfer. We step back. The platform stays.

Hardware Agnostic

Any GPU, any backend, any model, anywhere.

Hardware Platforms

NVIDIA CUDA
Popular
AMD ROCm
Intel Gaudi / Xeon
Google TPU
Qualcomm AI
Apple Silicon
CPU Servers

Inference Backends

PyTorch Supported
Native inference
vLLM Supported
PagedAttention optimization
llama.cpp Supported
GGUF models, CPU/GPU
TensorRT-LLM Roadmap
NVIDIA optimization
Triton Roadmap
NVIDIA inference server
Ollama Roadmap
Developer tooling

[ Licensing ]

Simple licensing.

Per-node, per-year. No usage metering. No token counting. No surprise bills.

Runtime

Per node, per year. A node is any machine running Sector88 Runtime.

Hub

Self-hosted Hub included with Enterprise. Managed Hub priced by fleet size.

Forward-deployed engagements

Scoped per site. We quote after the hardware audit, not before.

Deploy where the cloud cannot reach.

Self-hosted, managed, or forward-deployed. We pick the model with you, then make it live.