> services / scale

Infrastructure engineering and migrations for Web3, AI and ZK

Greenfield builds, cloud → bare-metal migrations, regional splits, hard-fork cutovers, burst capacity. We move running systems. Reversibly. Without losing state or signatures.

> what's included

Scale covers any project that reshapes infrastructure: growth, moves, new regions, hard forks. The agreed scope is captured in a migration plan.

Scope | Owned by us | Owned by you
Migration plan: target architecture, dry-runs, rollback | ✓ Owned | -
Sourcing new regions and bare-metal partners | ✓ Owned | -
Multi-region Kubernetes topology, mesh, DNS cutover | ✓ Owned | -
State transfer (chain data, model weights, prover state) | ✓ Owned | -
Key ceremony for validator key migration | ✓ Coordinated (we direct) | ✓ Co-owned (keys stay with you)
Burst scenarios: 100 nodes in 72h | ✓ Owned | -
Business decision to migrate / trade-offs | - (we analyse, recommend) | ✓ Owned
Billing for old and new infra during transition window | - | ✓ Owned

> migration scenarios

Every migration is planned as a reversible process. First a dry-run in an isolated copy of the target environment, then a canary rollout of a subset, then cutover with a rollback that fires within an hour. No "Friday-evening prod jumps".
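As an illustration only, the dry-run → canary → cutover sequence can be sketched as an ordered list of phases, each paired with its own rollback. The `Phase`, `MigrationResult` and `run_migration` names are hypothetical and are not taken from any real tooling; the sketch assumes each phase reports success or failure through its health checks.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Phase:
    name: str
    apply: Callable[[], bool]     # returns True if the phase's health checks pass
    rollback: Callable[[], None]  # undoes whatever apply() changed


@dataclass
class MigrationResult:
    completed: List[str] = field(default_factory=list)
    rolled_back: List[str] = field(default_factory=list)
    succeeded: bool = False


def run_migration(phases: List[Phase]) -> MigrationResult:
    """Run phases in order; on the first failure, roll back newest-first."""
    result = MigrationResult()
    applied: List[Phase] = []
    for phase in phases:
        applied.append(phase)
        if phase.apply():
            result.completed.append(phase.name)
            continue
        # Failure: undo the failing phase and every phase before it.
        for done in reversed(applied):
            done.rollback()
            result.rolled_back.append(done.name)
        return result
    result.succeeded = True
    return result
```

The point of the structure is that no phase is added to the plan without a matching rollback, so a failed canary never leaves the system half-moved.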

Cloud → bare-metal

Cost optimisation

Moving GPU workloads or node fleets from public cloud to bare-metal providers. Typical savings: 40-60% on cost-per-token / cost-per-block.

Multi-region split

Latency & resilience

Splitting single-region infra across 3+ regions. Survives regional outages, drops p95 latency for global traffic.

Hard-fork cutover

Chain upgrades

Coordinated validator upgrades for hard-fork blocks. Pre-flight dry-runs, version pinning, rollback playbook in case the majority doesn't fork.

Burst engagement

100 nodes in 72h

Incentivized testnets, fresh network launches, GPU farms for incentive seasons. Sourcing, deployment, monitoring: all in parallel. Aim: top-tier rewards.

Each scenario gets its own runbook, rollback plan and checkpoints. All migrations include a key ceremony for signing systems: HSM/KMS workflow, multi-party approval, audit trail in Git.
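A minimal sketch of what multi-party approval with an audit trail might look like, assuming a simple M-of-N threshold per ceremony step. The `CeremonyStep` class and the approver names are hypothetical; here the audit trail is an in-memory list, whereas the process described above keeps it in Git.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class CeremonyStep:
    description: str
    approvers: Set[str]  # people allowed to approve (hypothetical names)
    threshold: int       # approvals required before the step may run
    approvals: Set[str] = field(default_factory=set)
    audit_log: List[str] = field(default_factory=list)

    def approve(self, who: str) -> None:
        if who not in self.approvers:
            self.audit_log.append(f"REJECTED approval from {who}: not an approver")
            raise PermissionError(f"{who} is not an approver for this step")
        self.approvals.add(who)  # a set, so repeat approvals do not count twice
        self.audit_log.append(
            f"APPROVED by {who} ({len(self.approvals)}/{self.threshold})"
        )

    def can_execute(self) -> bool:
        return len(self.approvals) >= self.threshold
```

Every approval attempt, including rejected ones, is logged, so the record shows who tried to authorise a step as well as who succeeded.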

> stack we ship with

Web3: Cosmos SDK, Geth, Reth, OP Stack, Arbitrum Orbit, Polygon CDK, EigenDA, Celestia
AI / LLM: vLLM, Triton, TensorRT-LLM, NVIDIA H100 / A100, Ray, Kubeflow
ZK: SP1, RISC Zero, Boundless, Brevis, Jolt, Halo2
DePIN: Filecoin, Akash, Render, io.net, Gensyn
Platform: Kubernetes, Terraform, Ansible, Prometheus, Grafana, Loki, OpenTelemetry, PagerDuty

> engagement models

Scale projects are usually time-boxed engagements with fixed milestones.

> security baseline for migrations

Migration is the moment of highest risk for key material. These are the controls we enforce on every scale project.

Control | What we ensure
Key ceremony | Multi-party-approved procedure for all validator keys. Audit trail. Keys never leave your HSM/KMS.
HSM / KMS workflow | YubiHSM, AWS KMS, GCP KMS, Vault supported. Signing by process, not by handing over material.
Dry-runs | Every migration runs first in an isolated copy of the target environment.
Rollback playbook | Pre-built rollback plan with <60 min RTO at every cutover step.
Distributed locks | Guarantees single-instance signing per chain throughout the transition window.
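To make the single-instance signing guarantee concrete, here is a sketch of one common way to build such a lock: a per-chain lease with an expiry and a strictly increasing fencing token. This is an assumption about the general technique, not a description of the actual implementation. `SigningLock` is a hypothetical in-memory stand-in for a real distributed lock service, and the chain and node names in any usage are illustrative.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Lease:
    holder: str
    token: int        # fencing token, strictly increasing per chain
    expires_at: float


class SigningLock:
    """In-memory stand-in for a distributed lock service (assumption)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._leases: Dict[str, Lease] = {}
        self._last_token: Dict[str, int] = {}

    def acquire(self, chain: str, holder: str, ttl: float) -> Optional[Lease]:
        now = self._clock()
        current = self._leases.get(chain)
        if current is not None and current.expires_at > now and current.holder != holder:
            return None  # another signer still holds a live lease
        token = self._last_token.get(chain, 0) + 1
        self._last_token[chain] = token
        lease = Lease(holder, token, now + ttl)
        self._leases[chain] = lease
        return lease

    def may_sign(self, chain: str, lease: Lease) -> bool:
        # Only the newest, unexpired lease for a chain is allowed to sign.
        current = self._leases.get(chain)
        return (
            current is not None
            and current.token == lease.token
            and current.expires_at > self._clock()
        )

    def release(self, chain: str, lease: Lease) -> None:
        current = self._leases.get(chain)
        if current is not None and current.token == lease.token:
            del self._leases[chain]
```

During a cutover the old node holds the lease, the new node's `acquire` returns `None` until that lease is released or expires, and each signer checks `may_sign` before every signature. The fencing token means a stale lease is rejected even if its holder has not noticed the handover.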

> what we'd build for you

Real migrations and scale projects across our four ideal customer profiles (ICPs): validators, AI inference, ZK provers and DePIN.

Web3 / Validators

Testnet → mainnet migration with key ceremony and zero-loss state transfer. Coordinated cutover, signing monitoring before and after, rollback playbook in case of chain split.

AI / LLM Inference

Spot-fleet to dedicated H100 cluster, cost-per-token cut by 60%. Model-weight migration, cache warm-up, blue-green cutover for production traffic.

ZK / Prover Farms

Prover capacity ramped 5x for a proving-incentive season. GPU sourcing in <2 weeks, scheduler expansion, proof-throughput monitoring across the network.

DePIN / Distributed Networks

Node fleet rebalanced across 12 regions in 72h to hit reward thresholds. Payout-data analysis, regional sourcing, phased rollout, results reconciliation after one week.

> SLA tiers

After the scale phase, infra can move into the operate phase, which is available at three coverage levels.

Tier | Response p95 (Sev-1) | Coverage | Incident report | Engineer hours / mo
Bronze | 30 min | Business hours, 5×8 | Within 48h | 40
Silver | 15 min | 24/7 on-call rotation | Within 24h | 80
Gold | 5 min | 24/7 with dedicated engineer | Within 12h | 160+


> FAQ

Yes, with a proper key ceremony and distributed locks. The cutover window is usually one missed epoch, with no double-sign and no slashed stake. Specifics depend on the chain.

Cloud → bare-metal for a single application: 4-6 weeks. Multi-region split with DNS cutover: 3-4 weeks. 100-node burst for a testnet: 72h. Hard-fork cutover: a planned window, usually 2-4 weeks of prep.

Every step has a rollback playbook with <60 min RTO. The old stack stays alive until the new one is fully validated. That doubles billing during the transition window, but that's the explicit price of reversibility.

We direct it: we write the procedure, run dry-runs, coordinate participants. Keys stay with you (HSM/KMS); signing happens through your processes. Audit trail in Git.

Yes. Often the brief is "we already have infra, move it for us". If you want the operate phase after migration, we roll into a retainer. If not, we hand off with runbooks and your team takes it from there.

> ready to move the infra?

Tell us about the workload. We reply within 24 hours.