Careful!

You are browsing documentation for a version of Kuma that is not the latest release.

MeshLoadBalancingStrategy

This policy uses the new policy matching algorithm.

This policy enables Kuma to configure the load balancing strategy for traffic between services in the mesh. When using this policy, the localityAwareLoadBalancing flag is ignored.

TargetRef support matrix

TargetRef type	top level	to	from
Mesh	✅	✅	❌
MeshSubset	✅	❌	❌
MeshService	✅	✅	❌
MeshServiceSubset	✅	❌	❌

To learn more about the information in this table, see the matching docs.

Configuration

LocalityAwareness

Locality-aware load balancing is enabled by default, unlike its predecessor localityAwareLoadBalancing.

disabled - (optional) disables locality-aware load balancing. When it is set to true, requests are distributed across all endpoints regardless of locality.

LoadBalancer

type - available values are RoundRobin, LeastRequest, RingHash, Random, Maglev.

RoundRobin

RoundRobin is a load balancing algorithm that distributes requests across available upstream hosts in round-robin order.

LeastRequest

LeastRequest selects N random available hosts, where N is set by choiceCount (2 by default), and picks the host that has the fewest active requests.

choiceCount - (optional) is the number of random healthy hosts from which the host with the fewest active requests will be chosen. Defaults to 2 so that Envoy performs two-choice selection if the field is not set.
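As a rough illustration, a policy that raises choiceCount for traffic from web to backend might look like the sketch below (Universal format). The policy name, the service names, and the value 4 are illustrative, and the leastRequest block name is an assumption made by analogy with the ringHash block used in the examples on this page.

```yaml
type: MeshLoadBalancingStrategy
name: least-request-to-backend  # hypothetical policy name
mesh: default
spec:
  targetRef:
    kind: MeshSubset
    tags:
      kuma.io/service: web
  to:
  - targetRef:
      kind: MeshService
      name: backend
    default:
      loadBalancer:
        type: LeastRequest
        leastRequest:  # assumed block name, by analogy with ringHash
          choiceCount: 4  # sample 4 random healthy hosts instead of the default 2
```

A larger choiceCount makes the selection closer to a true "fewest active requests" pick, at the cost of inspecting more hosts per request.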

RingHash

RingHash implements consistent hashing to upstream hosts. Each host is mapped onto a circle (the “ring”) by hashing its address; each request is then routed to a host by hashing some property of the request, and finding the nearest corresponding host clockwise around the ring.

hashFunction - (optional) available values are XX_HASH, MURMUR_HASH_2. Default is XX_HASH.
minRingSize - (optional) minimum hash ring size. The larger the ring is (that is, the more hashes there are for each provided host) the better the request distribution will reflect the desired weights. Defaults to 1024 entries, and limited to 8M entries.
maxRingSize - (optional) maximum hash ring size. Defaults to 8M entries, and limited to 8M entries, but can be lowered to further constrain resource use.
hashPolicies - (optional) specify a list of request/connection properties that are used to calculate a hash. These hash policies are executed in the specified order. If a hash policy has the “terminal” attribute set to true, and there is already a hash generated, the hash is returned immediately, ignoring the rest of the hash policy list.
- type - available values are Header, Cookie, Connection, QueryParameter, FilterState
- terminal - a flag that short-circuits hash computation. It provides a "fallback" style of configuration: if a terminal policy does not produce a hash, evaluation falls back to the rest of the policy list, and when it does produce one, skipping the rest saves time. If true, and a hash has already been computed, the rest of the hash policy list is ignored.
- header:
  - name - the name of the request header that will be used to obtain the hash key.
- cookie:
  - name - the name of the cookie that will be used to obtain the hash key.
  - ttl - (optional) if specified, a cookie with this time to live will be generated if the cookie is not present.
  - path - (optional) the name of the path for the cookie.
- connection:
  - sourceIP - if true, then hashing is based on a source IP address.
- queryParameter:
  - name - the name of the URL query parameter that will be used to obtain the hash key. If the parameter is not present, no hash will be produced. Query parameter names are case-sensitive.
- filterState:
  - key - the name of the Object in the per-request filterState, which is an Envoy::Hashable object. If there is no data associated with the key, or the stored object is not Envoy::Hashable, no hash will be produced.
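The fields above can be combined as in the following sketch (Universal format), which hashes on a request header and falls back to the source IP when the header is missing. The policy name, the header name x-user-id, and the ring sizes are illustrative values, not recommendations.

```yaml
type: MeshLoadBalancingStrategy
name: ring-hash-with-fallback  # hypothetical policy name
mesh: default
spec:
  targetRef:
    kind: MeshSubset
    tags:
      kuma.io/service: web
  to:
  - targetRef:
      kind: MeshService
      name: backend
    default:
      loadBalancer:
        type: RingHash
        ringHash:
          hashFunction: MURMUR_HASH_2  # default is XX_HASH
          minRingSize: 2048            # illustrative; default is 1024
          maxRingSize: 65536           # illustrative; lowered from the 8M default
          hashPolicies:
          - type: Header
            terminal: true             # stop here if the header produced a hash
            header:
              name: x-user-id          # hypothetical header name
          - type: Connection           # evaluated only if no hash exists yet
            connection:
              sourceIP: true
```

Because the first policy is terminal, requests that carry the header are hashed on it alone, while requests without it are hashed on the client address.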

Random

Random selects a random available host. The random load balancer generally performs better than round-robin if no health checking policy is configured. Random selection avoids bias towards the host in the set that comes after a failed host.

Maglev

Maglev implements consistent hashing to upstream hosts. Maglev can be used as a drop-in replacement for the ring hash load balancer anywhere consistent hashing is desired.

tableSize - (optional) the table size for Maglev hashing. Maglev aims for "minimal disruption" rather than an absolute guarantee. Minimal disruption means that when the set of upstream hosts changes, a connection will likely be sent to the same upstream as it was before. Increasing the table size reduces the amount of disruption. The table size must be a prime number no greater than 5000011. If it is not specified, the default is 65537.
hashPolicies - (optional) specify a list of request/connection properties that are used to calculate a hash. These hash policies are executed in the specified order. If a hash policy has the “terminal” attribute set to true, and there is already a hash generated, the hash is returned immediately, ignoring the rest of the hash policy list.
- type - available values are Header, Cookie, Connection, QueryParameter, FilterState
- terminal - a flag that short-circuits hash computation. It provides a "fallback" style of configuration: if a terminal policy does not produce a hash, evaluation falls back to the rest of the policy list, and when it does produce one, skipping the rest saves time. If true, and a hash has already been computed, the rest of the hash policy list is ignored.
- header:
  - name - the name of the request header that will be used to obtain the hash key.
- cookie:
  - name - the name of the cookie that will be used to obtain the hash key.
  - ttl - (optional) if specified, a cookie with this time to live will be generated if the cookie is not present.
  - path - (optional) the name of the path for the cookie.
- connection:
  - sourceIP - if true, then hashing is based on a source IP address.
- queryParameter:
  - name - the name of the URL query parameter that will be used to obtain the hash key. If the parameter is not present, no hash will be produced. Query parameter names are case-sensitive.
- filterState:
  - key - the name of the Object in the per-request filterState, which is an Envoy::Hashable object. If there is no data associated with the key, or the stored object is not Envoy::Hashable, no hash will be produced.
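A Maglev configuration could be sketched as follows (Universal format). The policy name, the query parameter name, and the table size are illustrative, and the maglev block name is an assumption made by analogy with the ringHash block used in the examples on this page.

```yaml
type: MeshLoadBalancingStrategy
name: maglev-to-backend  # hypothetical policy name
mesh: default
spec:
  targetRef:
    kind: MeshSubset
    tags:
      kuma.io/service: web
  to:
  - targetRef:
      kind: MeshService
      name: backend
    default:
      loadBalancer:
        type: Maglev
        maglev:  # assumed block name, by analogy with ringHash
          tableSize: 131071  # a prime below the 5000011 limit; default is 65537
          hashPolicies:
          - type: QueryParameter
            queryParameter:
              name: user_id  # hypothetical, case-sensitive parameter name
```

Requests that lack the query parameter produce no hash from this policy, so adding a second policy (for example Connection with sourceIP) is one way to cover them.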

Examples

RingHash load balancing from web to backend

Load balance requests from web to backend based on the HTTP header x-header:

        
Kubernetes:

apiVersion: kuma.io/v1alpha1
kind: MeshLoadBalancingStrategy
metadata:
  name: ring-hash
  namespace: kuma-system
  labels:
    kuma.io/mesh: default
spec:
  targetRef:
    kind: MeshSubset
    tags:
      kuma.io/service: web
  to:
  - targetRef:
      kind: MeshService
      name: backend_kuma-demo_svc_8080
    default:
      loadBalancer:
        type: RingHash
        ringHash:
          hashPolicies:
          - type: Header
            header:
              name: x-header

Universal:

type: MeshLoadBalancingStrategy
name: ring-hash
mesh: default
spec:
  targetRef:
    kind: MeshSubset
    tags:
      kuma.io/service: web
  to:
  - targetRef:
      kind: MeshService
      name: backend
    default:
      loadBalancer:
        type: RingHash
        ringHash:
          hashPolicies:
          - type: Header
            header:
              name: x-header

    
  

Disable locality-aware load balancing for backend

Requests to backend will be spread evenly across all zones where backend is deployed.

        
Kubernetes:

apiVersion: kuma.io/v1alpha1
kind: MeshLoadBalancingStrategy
metadata:
  name: disable-la-to-backend
  namespace: kuma-system
  labels:
    kuma.io/mesh: default
spec:
  targetRef:
    kind: Mesh
  to:
  - targetRef:
      kind: MeshService
      name: backend_kuma-demo_svc_8080
    default:
      localityAwareness:
        disabled: true

Universal:

type: MeshLoadBalancingStrategy
name: disable-la-to-backend
mesh: default
spec:
  targetRef:
    kind: Mesh
  to:
  - targetRef:
      kind: MeshService
      name: backend
    default:
      localityAwareness:
        disabled: true

    
  

Load balancing HTTP traffic through zone proxies

If you proxy HTTP traffic through zone proxies (zone ingress/egress), you may notice that the traffic does not reach every instance of the destination service. In the case of in-zone traffic (without zone proxies on a request path), the client is aware of all server endpoints, so if you have 10 server endpoints the traffic goes to all of them. In the case of cross-zone traffic, the client is only aware of zone ingress endpoints, so if you have 10 server endpoints and 1 zone ingress, the client only sees one zone ingress endpoint. Because zone ingress is just a TCP passthrough proxy (it does not terminate TLS), it only load balances TCP connections over server endpoints.

HTTP traffic between Envoys is upgraded to HTTP/2 automatically for performance benefits. The client’s Envoy leverages HTTP/2 multiplexing, so it opens only a few TCP connections. Because zone ingress balances connections rather than individual requests, only a few server endpoints end up receiving traffic.

You can mitigate this problem by adjusting the max_requests_per_connection setting on the Envoy Cluster. For example:

        
Kubernetes:

apiVersion: kuma.io/v1alpha1
kind: MeshProxyPatch
metadata:
  name: max-requests-per-conn
  namespace: kuma-system
  labels:
    kuma.io/mesh: default
spec:
  targetRef:
    kind: Mesh
  default:
    appendModifications:
    - cluster:
        operation: Patch
        match:
          name: demo-app_kuma-demo_svc_5000
          origin: outbound
        value: |
          max_requests_per_connection: 1

Universal:

type: MeshProxyPatch
name: max-requests-per-conn
mesh: default
spec:
  targetRef:
    kind: Mesh
  default:
    appendModifications:
    - cluster:
        operation: Patch
        match:
          name: demo-app_kuma-demo_svc_5000
          origin: outbound
        value: |
          max_requests_per_connection: 1

    
  

This way, each TCP connection carries only one request, so requests are no longer multiplexed over a few connections. Consequently, the client opens more TCP connections, which leads to fairer load balancing. The downside is that more TCP connections have to be established and maintained. Keep this in mind as you adjust the value to suit your needs.