Frequently Asked Questions
Common questions about S88 Runtime and Hub.
Sector88 builds production-grade inference infrastructure for constrained hardware. S88 Runtime is our memory-efficient inference engine that prevents crashes and maximizes GPU utilization. S88 Hub is the management platform providing real-time monitoring, performance analytics, and fleet control. Both are purpose-built for air-gapped, edge, and sovereign deployments.
S88 Runtime is our reliability layer for inference on constrained hardware. It focuses on crash-avoidance, safe deployment defaults, and production telemetry. Advanced memory tiering and predictive policies are on the roadmap.
S88 Hub is the operational control plane. It provides real-time monitoring of VRAM, RAM, power, and thermal metrics; performance analysis with detailed reports and raw data exports; fleet management for multi-node deployments; and a web dashboard for visualization and control. Prometheus integration is built in.
Sector88 works with these engines, not against them. While vLLM, Triton, Ollama, and llama.cpp optimize for throughput, Sector88 focuses on the operational layer: safe defaults, telemetry, and reliability controls for constrained/regulated environments. Roadmap items include deeper memory orchestration and policy-driven optimization.
We partner with integrators, OEMs, hardware vendors, and enterprise resellers. Pilot programs include technical evaluation on your infrastructure, baseline performance analysis, deployment support, and access to S88 for testing. Contact partnerships@sector88.co to discuss opportunities.
OOM (Out Of Memory) crashes occur when GPU VRAM is exhausted during inference, causing abrupt process termination. This results in service interruption, failed requests, and unreliable deployments. Sector88 prevents OOM crashes through intelligent memory orchestration and graceful degradation, ensuring continuous operation under resource constraints.
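To make the failure mode concrete, here is a rough sketch of how inference memory grows with context length and batch size. It is a back-of-the-envelope estimate, not a description of how S88 accounts for memory. The function names, the 10% headroom, and the model shape (32 layers, 8 KV heads, head dimension 128, 16-bit values, 8 billion parameters) are assumptions chosen for illustration.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch=1, bytes_per_elem=2):
    """Estimate KV-cache size for a transformer decoder.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem


def fits_in_vram(weight_bytes, kv_bytes, vram_bytes, headroom=0.1):
    """Return True if weights plus KV cache fit while keeping some headroom free.

    The headroom fraction is an illustrative assumption; activations and
    allocator fragmentation are ignored here.
    """
    return weight_bytes + kv_bytes <= vram_bytes * (1 - headroom)


GIB = 2**30
weights = 8_000_000_000 * 2  # hypothetical 8B-parameter model in 16-bit precision

one_request = kv_cache_bytes(32, 8, 128, seq_len=8192)
sixteen_requests = kv_cache_bytes(32, 8, 128, seq_len=8192, batch=16)

print(one_request / GIB)       # KV cache in GiB for a single 8192-token request
print(sixteen_requests / GIB)  # KV cache in GiB for sixteen such requests
print(fits_in_vram(weights, one_request, 24 * GIB))
print(fits_in_vram(weights, sixteen_requests, 24 * GIB))
```

Under these assumptions a single long request fits on a 24 GiB card, while sixteen concurrent ones do not. That is the kind of load-dependent exhaustion that ends in an OOM crash when nothing limits admission.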
S88 is hardware-agnostic. We support NVIDIA CUDA, AMD ROCm, Intel Gaudi and Xeon, Google TPU, Qualcomm AI, Apple Silicon, and CPU-only servers. In short, it runs on any platform that can run PyTorch or one of the supported inference engines.
Yes. S88 is production-ready and shipping to select partners. Access is currently controlled so that we can provide proper deployment support. Request access through our contact page.
Yes. S88 never terminates on OOM. Graceful degradation through back-pressure queuing, context clipping, and intelligent eviction maintains service continuity under memory pressure. The system remains responsive and operational, ensuring zero downtime from memory constraints.
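The general idea behind back-pressure queuing and context clipping can be sketched in a few lines. This is a minimal illustration under stated assumptions, not S88's implementation: the `AdmissionController` class, its token-based budget, and the "keep the most recent tokens" clipping rule are all hypothetical, and eviction is left out entirely.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    request_id: str
    prompt_tokens: list


class AdmissionController:
    """Hypothetical admission control: clip long prompts, queue when over budget."""

    def __init__(self, token_budget, max_context, max_queue):
        self.token_budget = token_budget  # total tokens allowed in flight
        self.max_context = max_context    # per-request token limit (must be > 0)
        self.max_queue = max_queue        # bound on the waiting queue
        self.in_flight = {}               # request_id -> tokens reserved
        self.waiting = deque()

    def used(self):
        return sum(self.in_flight.values())

    def submit(self, req):
        # Context clipping: keep only the most recent max_context tokens.
        if len(req.prompt_tokens) > self.max_context:
            req.prompt_tokens = req.prompt_tokens[-self.max_context:]
        cost = len(req.prompt_tokens)
        # Admit only if the budget allows it and nobody is already waiting (FIFO).
        if not self.waiting and self.used() + cost <= self.token_budget:
            self.in_flight[req.request_id] = cost
            return "admitted"
        # Back-pressure: hold the request in a bounded queue instead of failing hard.
        if len(self.waiting) < self.max_queue:
            self.waiting.append(req)
            return "queued"
        return "rejected"

    def finish(self, request_id):
        """Release a request's reservation and admit waiting requests that now fit."""
        self.in_flight.pop(request_id, None)
        admitted = []
        while self.waiting:
            cost = len(self.waiting[0].prompt_tokens)
            if self.used() + cost > self.token_budget:
                break
            req = self.waiting.popleft()
            self.in_flight[req.request_id] = cost
            admitted.append(req.request_id)
        return admitted
```

In this sketch a request that would exceed the budget waits rather than triggering an allocation failure, and a full queue produces an explicit rejection that the caller can retry, so the serving process itself keeps running.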
Yes. S88 is purpose-built for air-gapped and sovereign deployments. It has zero external dependencies at runtime. Models load from the local filesystem, telemetry stays local via Prometheus, and all inference data remains on-premises. It is designed for SCIF, classified, and offline environments.
Any LLM supported by PyTorch, vLLM, or llama.cpp. This includes Llama, Mistral, Qwen, DeepSeek, and thousands of HuggingFace models. Both HuggingFace format and GGUF models are supported.
Infrastructure assessment, baseline benchmarks showing current performance, deployment runbook with production configuration, and access to S88 Runtime and Hub for testing on your hardware.
Initial deployment takes minutes: clone the repository, install dependencies, and configure S88 for your model and hardware. Full production deployment is typically completed within days, depending on infrastructure complexity.
S88 is built for regulated environments. Zero prompt logging by default, audit-ready telemetry without content exposure, configurable retention policies, and complete offline operation. Designed for HIPAA, PCI-DSS, FedRAMP, and defense compliance requirements.
S88 scales from single GPUs to distributed clusters. Multi-GPU tensor parallelism and multi-node deployments are supported. Fleet management capabilities are available through S88 Hub.
Built-in Prometheus metrics, structured event logs, performance tracking (tokens/sec, latency, TTFT), memory metrics (VRAM, RAM, SSD), GPU metrics (temperature, power, utilization), and SLO tracking. S88 Hub provides real-time dashboards and professional scorecard generation.
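For readers unfamiliar with Prometheus, the sketch below shows what gauge metrics look like in the Prometheus text exposition format, which a scraper reads over HTTP. It uses only the Python standard library. The metric names (`demo_vram_used_bytes`, `demo_gpu_temperature_celsius`) and the `gpu` label are made up for illustration and are not S88's actual metric names.

```python
def render_gauges(gauges, labels=None):
    """Render gauges in the Prometheus text exposition format.

    gauges: dict mapping metric name -> (help text, numeric value)
    labels: optional dict of label name -> label value applied to every sample
    """
    labels = labels or {}
    label_str = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    suffix = "{" + label_str + "}" if label_str else ""
    lines = []
    for name, (help_text, value) in gauges.items():
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name}{suffix} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical sample values for one GPU.
print(render_gauges(
    {
        "demo_vram_used_bytes": ("VRAM currently in use", 1024),
        "demo_gpu_temperature_celsius": ("GPU core temperature", 61.5),
    },
    labels={"gpu": "0"},
))
```

Because the format is plain text served from a local endpoint, it suits offline environments: a Prometheus server on the same network can scrape it without any external service.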
Yes. We provide infrastructure evaluation and baseline benchmarks to help you assess fit before deployment. This includes testing on your hardware, performance scorecards, and technical consultation. Contact us to discuss evaluation options.