Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loadbalancer: Add ConnectTracker and make HealthIndicator extend it #2818

Merged

Conversation

bryce-anderson
Copy link
Contributor

@bryce-anderson bryce-anderson commented Jan 30, 2024

Motivation:

We can't rely on the request path to pass in connection failures because if we can't get a connection we can never add the observers. That means we need the Host implementation to pass it in directly.

Modifications:

Add a new interface, ConnectTracker, that mirrors RequestTracker in purpose but with the intent of monitoring session establishment related events and have HealthIndicator extend it.

Result:

We should be able to track connection failures now and factor those into the HealthTracker implementations.

Note

See #2817 which is substantially the same but doesn't introduce a new interface.

Motivation:

We can't rely on the request path to pass in connection
failures because if we can't get a connection we can never
add the observers. That means we need the `Host` implementation
to pass it in directly.

Modifications:

Add two new methods to `HealthIndicator` that provide the current
time and and a callback for submitting connection failures.

Result:

We should be able to track connection failures now and factor
those into the `HealthTracker` implementations.
Copy link
Contributor

@tkountis tkountis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the separate interface. A few things for your consideration

  1. Consider renaming the RequestTracker callbacks to include the request in their name for better readability for use-cases like the health-indicator that implements both.
  2. onConnectionSuccess is noop, but I still think this needs to be transactional. If we observed a connection failure we should also track the connection success to negate that effect. In my minds eye, the data path is a separate concern.
  3. Consider influencing the score for the connection path too. IMO connection latency is a health indicator too.
  4. If you do the above, consider separating the two scoring mechanisms (ie. latency-trackers), in the past we have noted better behavior when the connection path had different decay configuration (possibly even penalties).

@bryce-anderson
Copy link
Contributor Author

bryce-anderson commented Feb 1, 2024

I like the separate interface. A few things for your consideration

  1. Consider renaming the RequestTracker callbacks to include the request in their name for better readability for use-cases like the health-indicator that implements both.

👍 done.

  1. onConnectionSuccess is noop, but I still think this needs to be transactional. If we observed a connection failure we should also track the connection success to negate that effect. In my minds eye, the data path is a separate concern.
  2. Consider influencing the score for the connection path too. IMO connection latency is a health indicator too.
  3. If you do the above, consider separating the two scoring mechanisms (ie. latency-trackers), in the past we have noted better behavior when the connection path had different decay configuration (possibly even penalties).

I think these are worth considering but I'd prefer to leave them for later. The only implementation right now is the XDS style health checks and at least the envoy version is pretty similar to what is is here. That isn't to say we shouldn't try to do better but I'd prefer to deliberately examine that in a separate change whereas here I'd prefer to just wire it through.

@bryce-anderson bryce-anderson merged commit 29657c9 into apple:main Feb 2, 2024
15 checks passed
@bryce-anderson bryce-anderson deleted the bl_anderson/connection_failures_opt2 branch February 2, 2024 19:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants