Careful!
You are browsing documentation for a version of Kuma that is not the latest release.
MeshCircuitBreaker
This policy uses the new policy matching algorithm. Do not combine it with CircuitBreaker.
This policy looks for errors in the live traffic exchanged between our data plane proxies. It marks a data plane proxy as unhealthy if certain conditions are met, and it ensures that no additional traffic reaches an unhealthy data plane proxy until it is healthy again.
Unlike active MeshHealthChecks, circuit breakers do not send additional traffic to our data plane proxies; they inspect the existing service traffic instead. They are also commonly used to prevent cascading failures.
Like a real-world circuit breaker, when the circuit is closed, traffic between a source and a destination data plane proxy flows freely through it. When it is open, the traffic is interrupted.
The conditions that determine whether a circuit breaker is closed or open are configured through connection limits or outlier detection. For outlier detection to open the circuit breaker, you configure what we call detectors. This policy provides 5 different types of detectors, each triggered by a specific deviation in the upstream service behavior. All detectors can coexist on the same outbound interface.
Once one of the detectors has been triggered, the corresponding data plane proxy is ejected from the load balancer set for a period equal to baseEjectionTime. Each further ejection of the same data plane proxy lasts longer: the ejection period equals baseEjectionTime multiplied by the number of ejections. For example, the fourth ejection lasts for 4 * baseEjectionTime.
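To make the ejection arithmetic concrete, here is a minimal sketch of an outlier detection fragment. The 30s base time and the detector threshold are assumed values chosen for illustration, not recommendations; the durations in the comments only apply the multiplication rule described above.

```yaml
# Fragment of a MeshCircuitBreaker `default` section (illustrative values).
outlierDetection:
  baseEjectionTime: 30s   # assumed value for illustration
  detectors:
    totalFailures:
      consecutive: 10     # assumed threshold
# With baseEjectionTime = 30s, the rule above gives:
#   1st ejection: 1 * 30s = 30s
#   2nd ejection: 2 * 30s = 60s
#   4th ejection: 4 * 30s = 120s
```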
This policy provides passive checks. If you want to configure active checks, use the MeshHealthCheck policy. Data plane proxies with passive checks won’t explicitly send requests to other data plane proxies to determine whether the target proxies are healthy.
TargetRef support matrix
| targetRef | Allowed kinds |
|---|---|
| targetRef.kind | Mesh, MeshSubset, MeshService, MeshServiceSubset |
| to[].targetRef.kind | Mesh, MeshService |
| from[].targetRef.kind | Mesh |
To learn more about the information in this table, see the matching docs.
Configuration
Connection limits
- maxConnections - (optional) The maximum number of connections allowed to be made to the upstream Envoy Cluster. If not specified, defaults to 1024.
- maxConnectionPools - (optional) The maximum number of connection pools per Envoy Cluster that are concurrently supported at once. Set this for Envoy Clusters which create a large number of connection pools. If not specified, the default is unlimited.
- maxPendingRequests - (optional) The maximum number of pending requests that are allowed to the upstream Envoy Cluster. This limit is applied as a connection limit for non-HTTP traffic. If not specified, defaults to 1024.
- maxRetries - (optional) The maximum number of parallel retries that will be allowed to the upstream Envoy Cluster. If not specified, defaults to 3.
- maxRequests - (optional) The maximum number of parallel requests that are allowed to be made to the upstream Envoy Cluster. This limit does not apply to non-HTTP traffic. If not specified, defaults to 1024.
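As a sketch, the limits above map onto a connectionLimits block like the following. The values for maxConnections, maxPendingRequests, maxRetries and maxRequests restate the defaults listed above; the maxConnectionPools value is an arbitrary illustration, since the documented default is unlimited.

```yaml
# Fragment of a MeshCircuitBreaker `default` section.
connectionLimits:
  maxConnections: 1024      # documented default
  maxConnectionPools: 8     # illustrative only; the default is unlimited
  maxPendingRequests: 1024  # documented default
  maxRetries: 3             # documented default
  maxRequests: 1024         # documented default
```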
Outlier detection
Outlier detection can be configured for HTTP, TCP or gRPC traffic.
For gRPC requests, the outlier detection will use the HTTP status mapped from the grpc-status response header.
- disabled - (optional) When set to true, outlierDetection configuration won’t take any effect.
- interval - (optional) The time interval between ejection analysis sweeps. This can result in both new ejections and hosts being returned to service.
- baseEjectionTime - (optional) The base time that a host is ejected for. The real time is equal to the base time multiplied by the number of times the host has been ejected.
- maxEjectionPercent - (optional) The maximum % of an upstream Envoy Cluster that can be ejected due to outlier detection. Defaults to 10% but will eject at least one host regardless of the value.
- splitExternalAndLocalErrors - (optional) Determines whether to distinguish local origin failures from external errors. If set to true, the following configuration parameter is taken into account: detectors.localOriginFailures.consecutive.
- detectors - Contains configuration for supported outlier detectors. At least one detector needs to be configured when the policy is configured for outlier detection.
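The options above could be combined as in the following sketch. All values are assumptions picked for illustration; only the field names come from the list above. Because splitExternalAndLocalErrors is set to true here, the localOriginFailures detector is taken into account.

```yaml
# Fragment of a MeshCircuitBreaker `default` section (illustrative values).
outlierDetection:
  disabled: false
  interval: 5s                       # time between ejection analysis sweeps
  baseEjectionTime: 30s
  maxEjectionPercent: 20             # at most 20% of hosts ejected at once
  splitExternalAndLocalErrors: true  # enables split mode
  detectors:
    localOriginFailures:
      consecutive: 7                 # only honored in split mode
```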
Detectors configuration
Configuration for supported outlier detectors. At least one detector needs to be configured when policy is configured for outlier detection.
Depending on the mode, outlier detection can take into account either all errors or only externally originated (transaction) errors.
The default mode applies when splitExternalAndLocalErrors is not set or is equal to false.
In this mode, the totalFailures detection type takes into account all generated errors: locally originated and externally originated (transaction) errors.
Configuration
totalFailures.consecutive
- The number of consecutive server-side error responses (for HTTP traffic, 5xx responses; for TCP traffic, connection failures; etc.) before a consecutive total failure ejection occurs.
Example
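A minimal sketch of this detector in the default mode. The threshold of 10 is an assumed value chosen for illustration.

```yaml
# Fragment of a MeshCircuitBreaker `default` section.
outlierDetection:
  splitExternalAndLocalErrors: false  # default mode; may also be omitted
  detectors:
    totalFailures:
      consecutive: 10  # eject after 10 consecutive errors (illustrative)
```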
Examples
Basic circuit breaker for outbound traffic from web, to backend service
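A possible shape for this policy is sketched below. It assumes the Universal resource format, a mesh named default, and services named web and backend; the policy name and all limit values are hypothetical. The kinds used follow the TargetRef support matrix above.

```yaml
type: MeshCircuitBreaker
name: web-to-backend-circuit-breaker  # hypothetical policy name
mesh: default                         # assumed mesh name
spec:
  targetRef:
    kind: MeshService
    name: web                         # assumed source service
  to:
    - targetRef:
        kind: MeshService
        name: backend                 # assumed destination service
      default:
        connectionLimits:             # illustrative limits
          maxConnections: 2
          maxPendingRequests: 8
          maxRetries: 2
          maxRequests: 2
```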
Outlier detection for inbound traffic to backend service
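A possible shape for this policy is sketched below, again assuming the Universal resource format, a mesh named default, and a service named backend. The policy name and every threshold are hypothetical. Note that from[].targetRef.kind only allows Mesh, per the TargetRef support matrix above.

```yaml
type: MeshCircuitBreaker
name: backend-inbound-outlier-detection  # hypothetical policy name
mesh: default                            # assumed mesh name
spec:
  targetRef:
    kind: MeshService
    name: backend                        # assumed destination service
  from:
    - targetRef:
        kind: Mesh
      default:
        outlierDetection:                # illustrative values throughout
          interval: 5s
          baseEjectionTime: 30s
          maxEjectionPercent: 20
          splitExternalAndLocalErrors: true
          detectors:
            totalFailures:
              consecutive: 10
            gatewayFailures:
              consecutive: 10
            localOriginFailures:
              consecutive: 10
            successRate:
              minimumHosts: 5
              requestVolume: 10
              standardDeviationFactor: "1.9"  # int or decimal as string
            failurePercentage:
              minimumHosts: 5
              requestVolume: 10
              threshold: 85
```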
All policy options
Spec is the specification of the Kuma MeshCircuitBreaker resource.
Type: object
Properties
- from
  - From list makes a match between clients and corresponding configurations
  - Type: array
  - Items
    - Type: object
    - Properties
      - default
        - Default is a configuration specific to the group of destinations referenced in 'targetRef'
        - Type: object
        - Properties
          - connectionLimits
            - ConnectionLimits contains configuration of each circuit breaking limit, which when exceeded makes the circuit breaker become open (no traffic is allowed, like no current is allowed in the circuit when a physical circuit breaker is open)
            - Type: object
            - Properties
              - maxConnectionPools
                - The maximum number of connection pools per cluster that are concurrently supported at once. Set this for clusters which create a large number of connection pools.
                - Type: integer
              - maxConnections
                - The maximum number of connections allowed to be made to the upstream cluster.
                - Type: integer
              - maxPendingRequests
                - The maximum number of pending requests that are allowed to the upstream cluster. This limit is applied as a connection limit for non-HTTP traffic.
                - Type: integer
              - maxRequests
                - The maximum number of parallel requests that are allowed to be made to the upstream cluster. This limit does not apply to non-HTTP traffic.
                - Type: integer
              - maxRetries
                - The maximum number of parallel retries that will be allowed to the upstream cluster.
                - Type: integer
          - outlierDetection
            - OutlierDetection contains the configuration of the process of dynamically determining whether some number of hosts in an upstream cluster are performing unlike the others and removing them from the healthy load balancing set. Performance might be along different axes such as consecutive failures, temporal success rate, temporal latency, etc. Outlier detection is a form of passive health checking.
            - Type: object
            - Properties
              - baseEjectionTime
                - The base time that a host is ejected for. The real time is equal to the base time multiplied by the number of times the host has been ejected.
                - Type: string
              - detectors
                - Contains configuration for supported outlier detectors
                - Type: object
                - Properties
                  - failurePercentage
                    - Failure Percentage based outlier detection functions similarly to success rate detection, in that it relies on success rate data from each host in a cluster. However, rather than compare those values to the mean success rate of the cluster as a whole, they are compared to a flat user-configured threshold. This threshold is configured via the outlierDetection.detectors.failurePercentage.threshold field. The other configuration fields for failure percentage based detection are similar to the fields for success rate detection. As with success rate detection, detection will not be performed for a host if its request volume over the aggregation interval is less than the outlierDetection.detectors.failurePercentage.requestVolume value. Detection also will not be performed for a cluster if the number of hosts with the minimum required request volume in an interval is less than the outlierDetection.detectors.failurePercentage.minimumHosts value.
                    - Type: object
                    - Properties
                      - minimumHosts
                        - The minimum number of hosts in a cluster in order to perform failure percentage-based ejection. If the total number of hosts in the cluster is less than this value, failure percentage-based ejection will not be performed.
                        - Type: integer
                      - requestVolume
                        - The minimum number of total requests that must be collected in one interval (as defined by outlierDetection.interval) to perform failure percentage-based ejection for this host. If the volume is lower than this setting, failure percentage-based ejection will not be performed for this host.
                        - Type: integer
                      - threshold
                        - The failure percentage to use when determining failure percentage-based outlier detection. If the failure percentage of a given host is greater than or equal to this value, it will be ejected.
                        - Type: integer
                  - gatewayFailures
                    - In the default mode (outlierDetection.splitExternalAndLocalErrors is false) this detection type takes into account a subset of 5xx errors, called "gateway errors" (502, 503 or 504 status code), and local origin failures, such as timeout, TCP reset, etc. In split mode (outlierDetection.splitExternalAndLocalErrors is true) this detection type takes into account a subset of 5xx errors, called "gateway errors" (502, 503 or 504 status code), and is supported only by the http router.
                    - Type: object
                    - Properties
                      - consecutive
                        - The number of consecutive gateway failures (502, 503, 504 status codes) before a consecutive gateway failure ejection occurs.
                        - Type: integer
                  - localOriginFailures
                    - This detection type is enabled only when outlierDetection.splitExternalAndLocalErrors is true and takes into account only locally originated errors (timeout, reset, etc). If Envoy repeatedly cannot connect to an upstream host or communication with the upstream host is repeatedly interrupted, it will be ejected. Various locally originated problems are detected: timeout, TCP reset, ICMP errors, etc. This detection type is supported by http router and tcp proxy.
                    - Type: object
                    - Properties
                      - consecutive
                        - The number of consecutive locally originated failures before ejection occurs. Parameter takes effect only when splitExternalAndLocalErrors is set to true.
                        - Type: integer
                  - successRate
                    - Success Rate based outlier detection aggregates success rate data from every host in a cluster. Then at given intervals ejects hosts based on statistical outlier detection. Success Rate outlier detection will not be calculated for a host if its request volume over the aggregation interval is less than the outlierDetection.detectors.successRate.requestVolume value. Moreover, detection will not be performed for a cluster if the number of hosts with the minimum required request volume in an interval is less than the outlierDetection.detectors.successRate.minimumHosts value. In the default configuration mode (outlierDetection.splitExternalAndLocalErrors is false) this detection type takes into account all types of errors: locally and externally originated. In split mode (outlierDetection.splitExternalAndLocalErrors is true), locally originated errors and externally originated (transaction) errors are counted and treated separately.
                    - Type: object
                    - Properties
                      - minimumHosts
                        - The number of hosts in a cluster that must have enough request volume to detect success rate outliers. If the number of hosts is less than this setting, outlier detection via success rate statistics is not performed for any host in the cluster.
                        - Type: integer
                      - requestVolume
                        - The minimum number of total requests that must be collected in one interval (as defined by the interval duration configured in the outlierDetection section) to include this host in success rate based outlier detection. If the volume is lower than this setting, outlier detection via success rate statistics is not performed for that host.
                        - Type: integer
                      - standardDeviationFactor
                        - This factor is used to determine the ejection threshold for success rate outlier ejection. The ejection threshold is the difference between the mean success rate and the product of this factor and the standard deviation of the mean success rate: mean - (standard deviation * standardDeviationFactor). Either int or decimal represented as string.
                  - totalFailures
                    - In the default mode (outlierDetection.splitExternalAndLocalErrors is false) this detection type takes into account all generated errors: locally originated and externally originated (transaction) errors. In split mode (outlierDetection.splitExternalAndLocalErrors is true) this detection type takes into account only externally originated (transaction) errors, ignoring locally originated errors. If an upstream host is an HTTP server, only 5xx types of error are taken into account (see Consecutive Gateway Failure for exceptions). Properly formatted responses, even when they carry an operational error (like index not found, access denied), are not taken into account.
                    - Type: object
                    - Properties
                      - consecutive
                        - The number of consecutive server-side error responses (for HTTP traffic, 5xx responses; for TCP traffic, connection failures; for Redis, failure to respond PONG; etc.) before a consecutive total failure ejection occurs.
                        - Type: integer
              - disabled
                - When set to true, outlierDetection configuration won't take any effect
                - Type: boolean
              - interval
                - The time interval between ejection analysis sweeps. This can result in both new ejections and hosts being returned to service.
                - Type: string
              - maxEjectionPercent
                - The maximum % of an upstream cluster that can be ejected due to outlier detection. Defaults to 10% but will eject at least one host regardless of the value.
                - Type: integer
              - splitExternalAndLocalErrors
                - Determines whether to distinguish local origin failures from external errors. If set to true, the following configuration parameter is taken into account: detectors.localOriginFailures.consecutive
                - Type: boolean
      - targetRef (required)
        - TargetRef is a reference to the resource that represents a group of destinations.
        - Type: object
        - Properties
          - kind
            - Kind of the referenced resource
            - Type: string
            - The value is restricted to the following:
              - "Mesh"
              - "MeshSubset"
              - "MeshGateway"
              - "MeshService"
              - "MeshServiceSubset"
              - "MeshHTTPRoute"
          - mesh
            - Mesh is reserved for future use to identify cross mesh resources.
            - Type: string
          - name
            - Name of the referenced resource. Can only be used with kinds: MeshService, MeshServiceSubset and MeshGatewayRoute
            - Type: string
          - proxyTypes
            - ProxyTypes specifies the data plane types that are subject to the policy. When not specified, all data plane types are targeted by the policy.
            - Type: array
            - Item Count: ≥ 1
            - Items
              - Type: string
              - The value is restricted to the following:
                - "Sidecar"
                - "Gateway"
          - tags
            - Tags used to select a subset of proxies by tags. Can only be used with kinds MeshSubset and MeshServiceSubset
            - Type: object
            - This schema accepts additional properties.
- targetRef (required)
  - TargetRef is a reference to the resource the policy takes an effect on. The resource could be either a real store object or virtual resource defined in place.
  - Type: object
  - Properties
    - kind
      - Kind of the referenced resource
      - Type: string
      - The value is restricted to the following:
        - "Mesh"
        - "MeshSubset"
        - "MeshGateway"
        - "MeshService"
        - "MeshServiceSubset"
        - "MeshHTTPRoute"
    - mesh
      - Mesh is reserved for future use to identify cross mesh resources.
      - Type: string
    - name
      - Name of the referenced resource. Can only be used with kinds: MeshService, MeshServiceSubset and MeshGatewayRoute
      - Type: string
    - proxyTypes
      - ProxyTypes specifies the data plane types that are subject to the policy. When not specified, all data plane types are targeted by the policy.
      - Type: array
      - Item Count: ≥ 1
      - Items
        - Type: string
        - The value is restricted to the following:
          - "Sidecar"
          - "Gateway"
    - tags
      - Tags used to select a subset of proxies by tags. Can only be used with kinds MeshSubset and MeshServiceSubset
      - Type: object
      - This schema accepts additional properties.
- to
  - To list makes a match between the consumed services and corresponding configurations
  - Type: array
  - Items
    - Type: object
    - Properties
      - default
        - Default is a configuration specific to the group of destinations referenced in 'targetRef'
        - Type: object
        - Properties
          - connectionLimits
            - ConnectionLimits contains configuration of each circuit breaking limit, which when exceeded makes the circuit breaker become open (no traffic is allowed, like no current is allowed in the circuit when a physical circuit breaker is open)
            - Type: object
            - Properties
              - maxConnectionPools
                - The maximum number of connection pools per cluster that are concurrently supported at once. Set this for clusters which create a large number of connection pools.
                - Type: integer
              - maxConnections
                - The maximum number of connections allowed to be made to the upstream cluster.
                - Type: integer
              - maxPendingRequests
                - The maximum number of pending requests that are allowed to the upstream cluster. This limit is applied as a connection limit for non-HTTP traffic.
                - Type: integer
              - maxRequests
                - The maximum number of parallel requests that are allowed to be made to the upstream cluster. This limit does not apply to non-HTTP traffic.
                - Type: integer
              - maxRetries
                - The maximum number of parallel retries that will be allowed to the upstream cluster.
                - Type: integer
          - outlierDetection
            - OutlierDetection contains the configuration of the process of dynamically determining whether some number of hosts in an upstream cluster are performing unlike the others and removing them from the healthy load balancing set. Performance might be along different axes such as consecutive failures, temporal success rate, temporal latency, etc. Outlier detection is a form of passive health checking.
            - Type: object
            - Properties
              - baseEjectionTime
                - The base time that a host is ejected for. The real time is equal to the base time multiplied by the number of times the host has been ejected.
                - Type: string
              - detectors
                - Contains configuration for supported outlier detectors
                - Type: object
                - Properties
                  - failurePercentage
                    - Failure Percentage based outlier detection functions similarly to success rate detection, in that it relies on success rate data from each host in a cluster. However, rather than compare those values to the mean success rate of the cluster as a whole, they are compared to a flat user-configured threshold. This threshold is configured via the outlierDetection.detectors.failurePercentage.threshold field. The other configuration fields for failure percentage based detection are similar to the fields for success rate detection. As with success rate detection, detection will not be performed for a host if its request volume over the aggregation interval is less than the outlierDetection.detectors.failurePercentage.requestVolume value. Detection also will not be performed for a cluster if the number of hosts with the minimum required request volume in an interval is less than the outlierDetection.detectors.failurePercentage.minimumHosts value.
                    - Type: object
                    - Properties
                      - minimumHosts
                        - The minimum number of hosts in a cluster in order to perform failure percentage-based ejection. If the total number of hosts in the cluster is less than this value, failure percentage-based ejection will not be performed.
                        - Type: integer
                      - requestVolume
                        - The minimum number of total requests that must be collected in one interval (as defined by outlierDetection.interval) to perform failure percentage-based ejection for this host. If the volume is lower than this setting, failure percentage-based ejection will not be performed for this host.
                        - Type: integer
                      - threshold
                        - The failure percentage to use when determining failure percentage-based outlier detection. If the failure percentage of a given host is greater than or equal to this value, it will be ejected.
                        - Type: integer
                  - gatewayFailures
                    - In the default mode (outlierDetection.splitExternalAndLocalErrors is false) this detection type takes into account a subset of 5xx errors, called "gateway errors" (502, 503 or 504 status code), and local origin failures, such as timeout, TCP reset, etc. In split mode (outlierDetection.splitExternalAndLocalErrors is true) this detection type takes into account a subset of 5xx errors, called "gateway errors" (502, 503 or 504 status code), and is supported only by the http router.
                    - Type: object
                    - Properties
                      - consecutive
                        - The number of consecutive gateway failures (502, 503, 504 status codes) before a consecutive gateway failure ejection occurs.
                        - Type: integer
                  - localOriginFailures
                    - This detection type is enabled only when outlierDetection.splitExternalAndLocalErrors is true and takes into account only locally originated errors (timeout, reset, etc). If Envoy repeatedly cannot connect to an upstream host or communication with the upstream host is repeatedly interrupted, it will be ejected. Various locally originated problems are detected: timeout, TCP reset, ICMP errors, etc. This detection type is supported by http router and tcp proxy.
                    - Type: object
                    - Properties
                      - consecutive
                        - The number of consecutive locally originated failures before ejection occurs. Parameter takes effect only when splitExternalAndLocalErrors is set to true.
                        - Type: integer
                  - successRate
                    - Success Rate based outlier detection aggregates success rate data from every host in a cluster. Then at given intervals ejects hosts based on statistical outlier detection. Success Rate outlier detection will not be calculated for a host if its request volume over the aggregation interval is less than the outlierDetection.detectors.successRate.requestVolume value. Moreover, detection will not be performed for a cluster if the number of hosts with the minimum required request volume in an interval is less than the outlierDetection.detectors.successRate.minimumHosts value. In the default configuration mode (outlierDetection.splitExternalAndLocalErrors is false) this detection type takes into account all types of errors: locally and externally originated. In split mode (outlierDetection.splitExternalAndLocalErrors is true), locally originated errors and externally originated (transaction) errors are counted and treated separately.
                    - Type: object
                    - Properties
                      - minimumHosts
                        - The number of hosts in a cluster that must have enough request volume to detect success rate outliers. If the number of hosts is less than this setting, outlier detection via success rate statistics is not performed for any host in the cluster.
                        - Type: integer
                      - requestVolume
                        - The minimum number of total requests that must be collected in one interval (as defined by the interval duration configured in the outlierDetection section) to include this host in success rate based outlier detection. If the volume is lower than this setting, outlier detection via success rate statistics is not performed for that host.
                        - Type: integer
                      - standardDeviationFactor
                        - This factor is used to determine the ejection threshold for success rate outlier ejection. The ejection threshold is the difference between the mean success rate and the product of this factor and the standard deviation of the mean success rate: mean - (standard deviation * standardDeviationFactor). Either int or decimal represented as string.
                  - totalFailures
                    - In the default mode (outlierDetection.splitExternalAndLocalErrors is false) this detection type takes into account all generated errors: locally originated and externally originated (transaction) errors. In split mode (outlierDetection.splitExternalAndLocalErrors is true) this detection type takes into account only externally originated (transaction) errors, ignoring locally originated errors. If an upstream host is an HTTP server, only 5xx types of error are taken into account (see Consecutive Gateway Failure for exceptions). Properly formatted responses, even when they carry an operational error (like index not found, access denied), are not taken into account.
                    - Type: object
                    - Properties
                      - consecutive
                        - The number of consecutive server-side error responses (for HTTP traffic, 5xx responses; for TCP traffic, connection failures; for Redis, failure to respond PONG; etc.) before a consecutive total failure ejection occurs.
                        - Type: integer
              - disabled
                - When set to true, outlierDetection configuration won't take any effect
                - Type: boolean
              - interval
                - The time interval between ejection analysis sweeps. This can result in both new ejections and hosts being returned to service.
                - Type: string
              - maxEjectionPercent
                - The maximum % of an upstream cluster that can be ejected due to outlier detection. Defaults to 10% but will eject at least one host regardless of the value.
                - Type: integer
              - splitExternalAndLocalErrors
                - Determines whether to distinguish local origin failures from external errors. If set to true, the following configuration parameter is taken into account: detectors.localOriginFailures.consecutive
                - Type: boolean
      - targetRef (required)
        - TargetRef is a reference to the resource that represents a group of destinations.
        - Type: object
        - Properties
          - kind
            - Kind of the referenced resource
            - Type: string
            - The value is restricted to the following:
              - "Mesh"
              - "MeshSubset"
              - "MeshGateway"
              - "MeshService"
              - "MeshServiceSubset"
              - "MeshHTTPRoute"
          - mesh
            - Mesh is reserved for future use to identify cross mesh resources.
            - Type: string
          - name
            - Name of the referenced resource. Can only be used with kinds: MeshService, MeshServiceSubset and MeshGatewayRoute
            - Type: string
          - proxyTypes
            - ProxyTypes specifies the data plane types that are subject to the policy. When not specified, all data plane types are targeted by the policy.
            - Type: array
            - Item Count: ≥ 1
            - Items
              - Type: string
              - The value is restricted to the following:
                - "Sidecar"
                - "Gateway"
          - tags
            - Tags used to select a subset of proxies by tags. Can only be used with kinds MeshSubset and MeshServiceSubset
            - Type: object
            - This schema accepts additional properties.