[System Design] Distributed System & Load Balancing

Key Characteristics

SREAM: Scalability, Reliability, Efficiency, Availability, Manageability

Scalability

Horizontal Scaling Horizontal scaling means that you scale by adding more servers to your pool of resources.

Example: MongoDB, Cassandra.

Vertical Scaling Vertical scaling means that you scale by adding more power (CPU, RAM, Storage, etc.) to an existing server.

Example: MySQL.
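A minimal sketch of the difference, with hypothetical node names and a simple hash-modulo mapping as an assumption: horizontal scaling grows the pool of nodes, while vertical scaling would upgrade a single node in place.

import hashlib

def pick_node(key: str, nodes: list[str]) -> str:
    """Map a key to one node in the pool by hashing it."""
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return nodes[digest % len(nodes)]

nodes = ["node-1", "node-2", "node-3"]          # current pool
print(pick_node("user:42", nodes))

# Horizontal scaling: add another server to the pool.
nodes.append("node-4")
print(pick_node("user:42", nodes))              # some keys now land on node-4

# Vertical scaling would instead mean upgrading an existing node
# (more CPU/RAM/storage) without changing the pool size.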

Reliability

Reliability is the probability that a system will perform its intended function without failure in a given period.

Example: At Amazon, one of the primary requirements is that a user transaction should never be canceled due to a failure of the machine that is running that transaction.

A reliable distributed system achieves this through redundancy of both the software components and data. If the server carrying the user’s shopping cart fails, another server that has the exact replica of the shopping cart should replace it.

Redundancy has a cost.
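A minimal sketch of such redundancy, with made-up server names: every cart write is performed on both copies, so a replica can serve the cart when the primary fails.

carts = {
    "server-A": {},   # primary copy
    "server-B": {},   # replica
}

def add_to_cart(user: str, item: str) -> None:
    # Redundancy has a cost: every write is performed on every copy.
    for server in carts.values():
        server.setdefault(user, []).append(item)

def read_cart(user: str, failed: set[str]) -> list[str]:
    # Fail over to any healthy server that holds a replica of the cart.
    for name, server in carts.items():
        if name not in failed:
            return server.get(user, [])
    raise RuntimeError("no healthy replica available")

add_to_cart("alice", "book")
print(read_cart("alice", failed={"server-A"}))  # the replica serves the cart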

Efficiency

Two measurements:

Latency (response time): the delay to obtain the first item.

Bandwidth (throughput): the number of items delivered in a given time unit.

These two measures correspond to the following unit costs:

  • Number of messages globally sent by the nodes of the system regardless of the message size.
  • Size of messages representing the volume of data exchanges.
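A minimal sketch of measuring both, against a simulated item stream (the per-item delay and item count are made up):

import time

def fetch_items(n: int):
    """Stand-in for a remote call that yields items one by one."""
    for i in range(n):
        time.sleep(0.01)   # simulated per-item delay
        yield i

start = time.perf_counter()
count = 0
first_item_at = None
for _ in fetch_items(100):
    if first_item_at is None:
        first_item_at = time.perf_counter()      # time of the first item
    count += 1
elapsed = time.perf_counter() - start

print(f"latency    : {first_item_at - start:.3f} s (time to first item)")
print(f"throughput : {count / elapsed:.1f} items/s")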

Availability

Availability is the time a system remains operational to perform its required function in a specific period. Availability takes into account maintainability, repair time, spares availability and other logistics considerations.

Reliability is availability over time, considering the full range of possible real-world conditions that can occur.
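A minimal sketch of the availability calculation, with an assumed downtime figure: availability is the fraction of a period during which the system was operational.

period_hours   = 24 * 365          # one year
downtime_hours = 8.76              # assumed total downtime for the year

availability = (period_hours - downtime_hours) / period_hours
print(f"availability: {availability:.4%}")   # ~99.9000% ("three nines")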

Manageability

How easy the system is to operate and maintain.

Load Balancing

  • Helps spread the traffic across a cluster of servers.
  • Keeps track of the availability of all the resources.

We can put load balancers in three places:

  • Between the user and the web server
  • Between web servers and an internal platform layer, like application servers or cache servers
  • Between the internal platform layer and the database.

Load Balancing Algorithm

Before distributing requests, the load balancer performs a Health Check on all the servers.

  • Least Connection Method directs traffic to the server with the fewest active connections
  • Least Response Time Method directs traffic to the server with the fewest active connections and the lowest average response time
  • Least Bandwidth Method selects the server that is currently serving the least amount of traffic measured in megabits per second (Mbps)
  • Round Robin Method cycles through a list of servers and sends each new request to the next server. When it reaches the end of the list, it starts over at the beginning. It is most useful when the servers are of equal specification and there are not many persistent connections.
  • Weighted Round Robin Method is designed to better handle servers with different processing capacities. Each server is assigned a weight. Servers with higher weights receive new connections before those with lower weights, and servers with higher weights get more connections than those with lower weights.
  • IP Hash Method calculates a hash of the client's IP address to redirect the request to a server.
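A minimal sketch, with hypothetical server addresses and connection counts, of a few of the strategies above; the health check simply filters out servers marked as down.

import hashlib
import itertools

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
healthy = {"10.0.0.1": True, "10.0.0.2": True, "10.0.0.3": True}
active_connections = {"10.0.0.1": 4, "10.0.0.2": 1, "10.0.0.3": 7}

def health_check(pool):
    # Only servers that passed the last health check receive traffic.
    return [s for s in pool if healthy[s]]

round_robin = itertools.cycle(servers)

def pick_round_robin():
    # Cycle through the list, skipping servers that are currently unhealthy.
    for _ in range(len(servers)):
        s = next(round_robin)
        if healthy[s]:
            return s
    raise RuntimeError("no healthy server")

def pick_least_connections():
    # Direct traffic to the server with the fewest active connections.
    return min(health_check(servers), key=lambda s: active_connections[s])

def pick_ip_hash(client_ip: str):
    # Hash the client's IP so the same client keeps hitting the same server.
    pool = health_check(servers)
    digest = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return pool[digest % len(pool)]

print(pick_round_robin())
print(pick_least_connections())        # 10.0.0.2 (1 active connection)
print(pick_ip_hash("203.0.113.9"))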

Types of Load Balancer

  1. SDN (Software-Defined Networking) separates the control plane from the data plane, which allows the control of multiple load balancers.
  2. UDP (User Datagram Protocol) load balancing is often used for live broadcasts and online games, when speed is important and there is little need for error correction. UDP has low latency because it does not provide time-consuming health checks.
  3. TCP (Transmission Control Protocol) load balancing provides a reliable and error-checked stream of packets to IP addresses, which can otherwise easily be lost or corrupted.
  4. SLB (Server Load Balancing) prioritizes responses to the specific requests from clients over the network.

Redundant Load Balancer

To prevent the load balancer from becoming a single point of failure, a second load balancer can be connected to the first to form a cluster. Each load balancer monitors the health of the other and, since both are equally capable of serving traffic and detecting failures, the second load balancer takes over if the main one fails.
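A minimal sketch of such an active/passive pair, with assumed timings: each load balancer records heartbeats from its peer, and the standby takes over when the peer's heartbeat goes stale.

import time

HEARTBEAT_TIMEOUT = 3.0   # seconds without a heartbeat before failover

class LoadBalancer:
    def __init__(self, name: str, active: bool):
        self.name = name
        self.active = active
        self.peer_last_seen = time.monotonic()

    def receive_heartbeat(self) -> None:
        self.peer_last_seen = time.monotonic()

    def check_peer(self) -> None:
        # If the peer has been silent too long, this LB starts serving traffic.
        if time.monotonic() - self.peer_last_seen > HEARTBEAT_TIMEOUT:
            self.active = True
            print(f"{self.name}: peer failed, taking over traffic")

primary = LoadBalancer("lb-1", active=True)
standby = LoadBalancer("lb-2", active=False)

# Simulate the primary going silent for longer than the timeout.
standby.peer_last_seen -= HEARTBEAT_TIMEOUT + 1
standby.check_peer()          # lb-2: peer failed, taking over traffic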

Ways to Implement a Load Balancer

  • Smart Clients: a smart client takes a pool of service hosts and balances load across them, detecting failed hosts and avoiding them (it also has to detect recovered hosts, deal with the addition of new hosts, etc., which makes it fun to get working decently and a terror to set up). A minimal sketch follows this list.
  • Hardware load balancers: the most expensive option, but they offer very high performance.
  • Software load balancers: if you want to avoid the pain of creating a smart client, and purchasing dedicated hardware is excessive, the universe has been kind enough to provide a hybrid: software load balancers.
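A minimal sketch of a smart client, with hypothetical host names: the client keeps its own host pool, rotates through it, and skips hosts it has marked as down.

import itertools

class SmartClient:
    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.down = set()
        self._cycle = itertools.cycle(self.hosts)

    def mark_down(self, host):
        self.down.add(host)

    def mark_recovered(self, host):
        self.down.discard(host)

    def pick_host(self):
        # Rotate through the pool, avoiding hosts marked as down.
        for _ in range(len(self.hosts)):
            host = next(self._cycle)
            if host not in self.down:
                return host
        raise RuntimeError("all hosts are down")

client = SmartClient(["app-1:8080", "app-2:8080", "app-3:8080"])
client.mark_down("app-2:8080")
print(client.pick_host())    # app-1:8080
print(client.pick_host())    # app-3:8080 (app-2 is skipped as down)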