Auto scaling
Auto scaling for game servers is the ability to start new server instances on demand as player load increases, and to release them when demand drops, without manual intervention. When a matchmaker requests a new session, the orchestrator finds available capacity, starts a container, and returns a connection address to players. On modern platforms, the whole sequence completes in under a second.
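The request flow described above can be sketched in a few lines of Python. This is a minimal illustration, not Gameye's API: the `Host` and `Orchestrator` classes, the `request_session` method, and the port numbering are all hypothetical names chosen for the example, and "starting a container" is reduced to reserving a slot and handing out a port.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional


@dataclass
class Host:
    """A machine with the game server image already pulled (hypothetical model)."""
    address: str
    free_slots: int


@dataclass
class Orchestrator:
    """Toy orchestrator: finds capacity and hands back a connection address."""
    hosts: list
    _ports: count = field(default_factory=lambda: count(7000))

    def request_session(self) -> Optional[str]:
        # Step 1: find a host with available capacity.
        for host in self.hosts:
            if host.free_slots > 0:
                # Step 2: reserve the slot (stand-in for starting a container).
                host.free_slots -= 1
                port = next(self._ports)
                # Step 3: return the address the matchmaker passes to players.
                return f"{host.address}:{port}"
        # No capacity anywhere: the caller has to queue or trigger a scale-out.
        return None
```

A matchmaker would call `request_session()` once per match and forward the returned address to the players; a `None` result is the signal that more capacity is needed.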
Why scaling speed matters for multiplayer games
For most software, scaling happens over minutes and nobody notices. For multiplayer games, scaling speed is player-facing. If a matchmaker requests a server and has to wait 60 seconds for it to provision, that delay is added directly to matchmaking queue times. During a launch spike, when thousands of players arrive simultaneously, slow scaling means players either wait in the queue or can’t connect at all.
Container-based orchestration makes fast scaling possible. Because the game server image is pre-pulled onto the host machines and containers start with near-zero overhead, new sessions can be ready in under a second. VM-based scaling, by contrast, requires provisioning a full virtual machine, which typically takes one to several minutes even on optimised cloud configurations.
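A back-of-envelope calculation shows why the difference in startup time is player-facing. The sketch below assumes, for simplicity, that every arriving player needs newly started capacity; under that assumption the number of players waiting at any moment is roughly the arrival rate multiplied by the startup time. The function name and the arrival rate of 500 players per second are illustrative assumptions, not measured figures.

```python
def players_waiting(arrivals_per_second: float, startup_seconds: float) -> float:
    """Approximate queue size while new capacity is starting.

    Assumes a steady arrival rate and that each arrival waits for one
    full server startup (a deliberately pessimistic simplification).
    """
    return arrivals_per_second * startup_seconds


# Hypothetical launch spike: 500 players arriving per second.
vm_queue = players_waiting(500, 60)          # VM provisioning taking one minute
container_queue = players_waiting(500, 0.5)  # container starting in half a second
```

Under these assumptions the one-minute startup leaves tens of thousands of players queued, while the half-second startup leaves a few hundred.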
Gameye starts new containers in 0.5 seconds on average. During Chivalry 2’s launch, that speed allowed the platform to absorb a player surge nearly twice the expected peak in the first 30 minutes, with no downtime.
Regional and time-zone scaling
Player load isn’t uniform — it follows regional time zones, content release windows, and free weekend spikes. A well-designed auto scaling system accounts for this by tracking active session counts per region and adjusting capacity accordingly. Sessions in one region can be wound down as another region’s peak begins, keeping costs tied to actual demand rather than worst-case provisioning.
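One way to picture per-region tracking is a function that turns active session counts into capacity targets. The following is a sketch under stated assumptions: the 20% headroom, the minimum of two warm slots, the region names, and the function names are all hypothetical, and a production system would likely also use forecasts rather than current counts alone.

```python
def target_capacity(active_sessions: dict, headroom_pct: int = 20,
                    min_warm: int = 2) -> dict:
    """Capacity target per region: active sessions plus headroom, rounded up."""
    targets = {}
    for region, active in active_sessions.items():
        # Integer ceiling of active * (1 + headroom_pct / 100).
        with_headroom = (active * (100 + headroom_pct) + 99) // 100
        targets[region] = max(min_warm, with_headroom)
    return targets


def scaling_actions(current_capacity: dict, active_sessions: dict) -> dict:
    """Positive values mean scale up, negative values mean wind down."""
    targets = target_capacity(active_sessions)
    return {region: target - current_capacity.get(region, 0)
            for region, target in targets.items()}
```

For example, with 100 active sessions in one region and 10 in another that still holds 40 slots from its earlier peak, the first region is scaled up while the second is wound down.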
Burst capacity
For sustained baseline load, bare metal infrastructure offers the best cost-per-session. But bare metal capacity is finite. Auto scaling systems that can burst into cloud capacity — spinning up additional sessions on cloud instances when bare metal is saturated — can handle demand spikes without over-provisioning permanent hardware. Gameye manages both bare metal and cloud capacity under the same API, scaling across providers automatically when demand requires it.
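The bursting rule can be expressed as a simple placement policy: fill bare metal first, then overflow into cloud. This is a minimal sketch of that idea rather than a description of how Gameye schedules sessions; the function name and the two-tier model are assumptions made for the example.

```python
def place_sessions(requested: int, bare_metal_free: int) -> dict:
    """Split requested sessions between bare metal and cloud burst capacity."""
    if requested < 0 or bare_metal_free < 0:
        raise ValueError("session counts must be non-negative")
    # Prefer bare metal, which has the lower cost per session.
    on_metal = min(requested, bare_metal_free)
    # Anything bare metal cannot hold bursts into cloud instances.
    return {"bare_metal": on_metal, "cloud": requested - on_metal}
```

With 100 free bare metal slots, a request for 150 sessions places 100 on bare metal and 50 in the cloud, while a request for 40 never touches the cloud at all.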
The cost dimension
Auto scaling removes the cost of idle servers, but the pricing model still matters. On platforms that charge per GB for data transfer, costs scale with player count regardless of how efficiently sessions are managed. Capacity-based pricing with no egress fees means scaling up doesn’t introduce unexpected costs: only the compute reserved is billed, per second.
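The difference between the two pricing models can be made concrete with a small cost model. Every number here is a placeholder: the per-GB price, the traffic per player-hour, and the per-second compute rate are hypothetical values chosen for illustration and do not reflect any provider's actual pricing.

```python
def egress_cost(player_hours: float, gb_per_player_hour: float,
                price_per_gb: float) -> float:
    """Data transfer bill: grows with player count, however sessions are packed."""
    return player_hours * gb_per_player_hour * price_per_gb


def compute_cost(reserved_seconds: float, price_per_second: float) -> float:
    """Capacity-based bill: only the compute reserved, metered per second."""
    return reserved_seconds * price_per_second


# Hypothetical evening: 10,000 player-hours at 0.05 GB per player-hour.
transfer_bill = egress_cost(10_000, 0.05, price_per_gb=0.09)
# Hypothetical reservation: 1,000 sessions of 30 minutes each.
capacity_bill = compute_cost(1_000 * 30 * 60, price_per_second=0.00002)
```

On a platform with egress fees the total is the sum of both bills, and the transfer portion doubles whenever the player count doubles. With no egress fees, only `capacity_bill` applies.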
See also: How Gameye orchestration works · Chivalry 2 case study — scaling to 250,000 concurrent players at launch · Gameye vs. AWS GameLift — scaling speed comparison · Game server orchestration