Load Balancing

Load balancing is the process of distributing work — player connections, session requests, API calls — across multiple servers so that no single machine carries more traffic than it can handle. When one server is busy, the load balancer routes new requests to a less busy one.

Load balancing vs. auto-scaling

These two concepts are often confused but do different things:

  • Load balancing distributes traffic across the servers you already have running.
  • Auto-scaling adds or removes servers in response to overall demand.

A production game infrastructure typically uses both: the orchestrator auto-scales to match player concurrency, and a load balancer distributes incoming connections across the available pool.
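To make the split concrete, here is a minimal sketch of the two decisions side by side. The function names, the fixed players-per-server capacity, and the server labels are assumptions made for illustration; they do not describe any particular orchestrator's API.

```python
import math


def desired_server_count(players: int, players_per_server: int, min_servers: int = 1) -> int:
    """Auto-scaling decision: how many servers should be running for this demand.

    Assumes every server holds the same number of players (a simplification).
    """
    return max(min_servers, math.ceil(players / players_per_server))


def pick_server(active_sessions: dict[str, int]) -> str:
    """Load-balancing decision: choose among the servers that are already running."""
    return min(active_sessions, key=active_sessions.get)


# Hypothetical pool: server name -> active sessions.
pool = {"server-a": 40, "server-b": 12}

target = desired_server_count(players=130, players_per_server=50)
servers_to_add = max(0, target - len(pool))  # the auto-scaler's job
next_server = pick_server(pool)              # the load balancer's job
```

The auto-scaler changes the size of the pool; the load balancer only ever chooses within it.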

Types of load balancing

  • Round-robin sends each new request to the next server in sequence. It is simple, but it ignores how busy each server actually is.
  • Least-connections routes to whichever server has the fewest active sessions, which works better for uneven workloads.
  • Geographic routing sends players to the nearest server cluster, which lowers latency by shortening the network path between player and server.
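The three strategies can be sketched in a few lines each. This is an illustrative sketch, not a production balancer: the class and function names are invented here, and "nearest" is approximated by the lowest measured latency, which is an assumption about how proximity would be determined.

```python
from itertools import cycle


class RoundRobin:
    """Hands out servers in a fixed rotation, ignoring their load."""

    def __init__(self, servers: list[str]):
        self._rotation = cycle(servers)

    def pick(self) -> str:
        return next(self._rotation)


class LeastConnections:
    """Tracks active sessions per server and picks the least loaded one."""

    def __init__(self, servers: list[str]):
        self.active = {server: 0 for server in servers}

    def pick(self) -> str:
        server = min(self.active, key=self.active.get)
        self.active[server] += 1  # the new session now counts against it
        return server

    def release(self, server: str) -> None:
        self.active[server] -= 1  # call when a session ends


def nearest_region(latency_ms: dict[str, float]) -> str:
    """Geographic routing, approximated here by lowest measured latency."""
    return min(latency_ms, key=latency_ms.get)
```

For example, `RoundRobin(["a", "b", "c"])` returns `a`, `b`, `c`, `a` on four successive `pick()` calls, while `LeastConnections` will keep favouring a server whose sessions have been released.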

Load balancing for dedicated game servers

In game server infrastructure, “load balancing” often means session placement — deciding which physical machine in which region should run a new match. An orchestrator like Gameye handles this automatically: when a matchmaker requests a session, Gameye evaluates available capacity across providers and regions, places the session on the optimal host, and returns a connection address to the client. The placement decision happens in milliseconds.
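As a rough illustration of what session placement involves, the sketch below picks a host for a new match from a capacity list. It is a hypothetical model, not Gameye's API or algorithm: the `Host` fields, the `place_session` function, the region names, and the "most free slots in the first preferred region" rule are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    provider: str    # hypothetical provider label
    region: str      # hypothetical region name
    address: str     # what a client would connect to
    free_slots: int  # sessions this host can still accept


def place_session(hosts: list[Host], region_preference: list[str]) -> Optional[str]:
    """Return a connection address for a new session, or None if nothing fits.

    Regions are tried in preference order; within a region the host with the
    most free capacity wins. A real orchestrator may weigh other factors too.
    """
    for region in region_preference:
        candidates = [h for h in hosts if h.region == region and h.free_slots > 0]
        if candidates:
            best = max(candidates, key=lambda h: h.free_slots)
            best.free_slots -= 1  # reserve the slot for this session
            return best.address
    return None


hosts = [
    Host("provider-1", "eu-west", "10.0.0.1:7777", free_slots=0),
    Host("provider-2", "eu-west", "10.0.0.2:7777", free_slots=3),
    Host("provider-1", "us-east", "10.0.1.1:7777", free_slots=5),
]
address = place_session(hosts, ["eu-west", "us-east"])
```

Here the full host in the preferred region is skipped and the session lands on the second `eu-west` host, even though a different provider runs it.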

See also: Auto-scaling · Scalability · How Gameye game server orchestration works
