added copy for infra healthcheck rule
lukecookssw committed Feb 12, 2025
1 parent 12a8898 commit 97a1dde
Showing 4 changed files with 39 additions and 8 deletions.
@@ -19,7 +19,7 @@ index:
- isolate-your-logic-and-remove-dependencies-on-instances-of-objects
- microsoft-recommended-frameworks-for-automated-ui-driven-functional-testing
- have-tests-for-performance
- have-a-healthcheck-page-to-test-your-website-dependencies
- infrastructure-healthchecks
- isolate-your-logic-from-your-io-to-increase-the-testability
- when-adding-a-unit-test-for-an-edge-case-the-naming-convention-should-be-the-issue-id
- test-your-javascript
@@ -1,25 +1,56 @@
---
seoDescription: Test your website's dependencies and functionality with a comprehensive health check page.
seoDescription: Website dependency and infrastructure healthchecks
type: rule
title: Do you have a /HealthCheck (was /zsValidate) page to test your website
dependencies?
uri: have-a-healthcheck-page-to-test-your-website-dependencies
title: Do you Healthcheck your Infrastructure?
uri: infrastructure-healthchecks
authors:
- title: Adam Cogan
url: https://ssw.com.au/people/adam-cogan
- title: Lewis Toh
url: https://ssw.com.au/people/lewis-toh
- title: Luke Cook
url: https://ssw.com.au/people/luke-cook

related: []
redirects:
- do-you-have-a-healthcheck-was-zsvalidate-page-to-test-your-website-dependencies
- do-you-have-a-healthcheck-(was-zsvalidate)-page-to-test-your-website-dependencies
- have-a-healthcheck-page-to-test-your-website-dependencies
created: 2020-03-12T20:57:37.000Z
archivedreason: null
guid: 015fcac3-c2c2-4d25-a6cd-1317eed69fc6
---

There are two kinds of errors, coding errors and system health errors. Coding errors should ideally be found during development (by compiling, debugging, or running unit tests), while system health errors should be found by application health checks.
Most developers include [healthchecks for their own applications](/have-a-healthcheck-page-to-make-sure-your-website-is-healthy/), but modern solutions are often highly dependent on external cloud infrastructure. When a critical service goes down, your app could become unresponsive or fail entirely. Keeping your infrastructure healthy is just as important as keeping your app healthy.

<!--endintro-->

Refer to the following rules for details:
## Your app is only as healthy as its infrastructure
Enterprise applications typically rely on a large number of cloud services: databases, caches, message queues, and more recently LLMs and other cloud-only AI services. These pieces of infrastructure are crucial to the health of your application, so they deserve the same monitoring care and attention as your own code. If any component fails, your app may not function as expected, leading to outages, performance issues, or a degraded user experience. Monitoring the health of infrastructure services is not just a technical task; it protects the continuity of business operations and user satisfaction.

- See SSW Rules - [Do you have a HealthCheck page to make sure your website is healthy?](/have-a-healthcheck-page-to-make-sure-your-website-is-healthy)
`youtube: https://www.youtube.com/watch?v=4abSfjdzqms`
**Figure: How to add Healthchecks in ASP.NET Core (11 min)**

## Adding a custom healthcheck
In ASP.NET Core, a custom healthcheck is a class that implements `IHealthCheck`. You can use one to check an external API and report its status alongside your app's own checks.
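
As a minimal sketch (not necessarily the approach shown in the video), a custom check for an external API could look like the following. The class name `ExternalApiHealthCheck`, the base address `https://api.example.com`, the `/status` path, and the `/health` route are all placeholders chosen for illustration; swap in whatever your dependency actually exposes.

```csharp
// Program.cs in an ASP.NET Core web project (for example, one created with `dotnet new web`).
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);

// Typed HttpClient for the check. The base address is a placeholder, not a real service.
builder.Services.AddHttpClient<ExternalApiHealthCheck>(client =>
{
    client.BaseAddress = new Uri("https://api.example.com");
    client.Timeout = TimeSpan.FromSeconds(5); // fail fast so the health endpoint stays responsive
});

// Register the custom check under a name that will appear in the health report.
builder.Services.AddHealthChecks()
    .AddCheck<ExternalApiHealthCheck>("external-api");

var app = builder.Build();
app.MapHealthChecks("/health");
app.Run();

public sealed class ExternalApiHealthCheck : IHealthCheck
{
    private readonly HttpClient _httpClient;

    public ExternalApiHealthCheck(HttpClient httpClient) => _httpClient = httpClient;

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            // "/status" is a hypothetical lightweight endpoint on the external API.
            using var response = await _httpClient.GetAsync("/status", cancellationToken);

            return response.IsSuccessStatusCode
                ? HealthCheckResult.Healthy("External API responded successfully.")
                : new HealthCheckResult(
                    context.Registration.FailureStatus,
                    $"External API returned HTTP {(int)response.StatusCode}.");
        }
        catch (Exception ex) when (ex is HttpRequestException or TaskCanceledException)
        {
            // Network failures and timeouts are reported with the registration's failure status.
            return new HealthCheckResult(
                context.Registration.FailureStatus,
                "External API could not be reached.",
                ex);
        }
    }
}
```

Using `context.Registration.FailureStatus` rather than hard-coding `Unhealthy` lets you later register a non-critical dependency as `Degraded` without touching the check itself.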



## Alerts and responses
Adding comprehensive healthchecks is great, but if no one is told when a check fails, what's the point? There are awesome tools available to notify Site Reliability Engineers (SREs) or SysAdmins when something is offline, so make sure your app is set up to use them! For instance, [Azure Monitor Alerts](https://learn.microsoft.com/en-us/azure/azure-monitor/alerts/alerts-overview) and [Amazon CloudWatch](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/WhatIsCloudWatch.html) let you configure who is alerted, about what, when, and how.

## Healthcheck UIs
Depending on your needs, you may want to bake a healthcheck UI directly into your app. Packages like [AspNetCore.HealthChecks.UI](https://www.nuget.org/packages/AspNetCore.HealthChecks.UI/) make this a breeze, and the UI can often act as your canary in the coal mine. Cloud providers' native status/health pages can take a while to update, so having your own can be a huge timesaver.
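
A rough sketch of wiring up the UI is shown below. It assumes the companion packages `AspNetCore.HealthChecks.UI.Client` and `AspNetCore.HealthChecks.UI.InMemory.Storage` are installed alongside the main package, and that the health endpoint lives at `/health`; the endpoint name "This app" is just a label. Treat it as a starting point and check the package README for the options that match your installed version.

```csharp
// Program.cs: assumes the NuGet packages AspNetCore.HealthChecks.UI,
// AspNetCore.HealthChecks.UI.Client and AspNetCore.HealthChecks.UI.InMemory.Storage.
using HealthChecks.UI.Client;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);

// Register your infrastructure checks here (none are added in this sketch).
builder.Services.AddHealthChecks();

// The UI polls the configured endpoint and keeps recent results in memory.
// The relative address is assumed to resolve against this app;
// use an absolute URL if it does not in your hosting setup.
builder.Services
    .AddHealthChecksUI(setup => setup.AddHealthCheckEndpoint("This app", "/health"))
    .AddInMemoryStorage();

var app = builder.Build();

// Emit the JSON shape that the UI expects instead of the default plain-text response.
app.MapHealthChecks("/health", new HealthCheckOptions
{
    ResponseWriter = UIResponseWriter.WriteHealthCheckUIResponse
});

// Serves the dashboard at the package's default path.
app.MapHealthChecksUI();

app.Run();
```

In-memory storage loses its history on restart, which is usually fine for a quick dashboard; the package also offers persistent storage providers if you need more.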


## Handle offline infrastructure gracefully
When using non-critical infrastructure like an LLM-powered chatbot, make sure to implement graceful degradation strategies. Instead of failing completely, your app can then respond intelligently to infrastructure outages through fallback logic, informative user messages, or retries once the service is back online.


::: bad
![Figure: Bad example – The user is given the chance to interact with a feature that is currently unavailable.](infra-bad-example.png)
:::

::: good
![Figure: Good example – The user is pre-emptively shown a message that shows this feature is currently unavailable.](infra-good-example.png)
:::
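
One way to sketch the fallback idea in code is a thin wrapper around the call to the non-critical service. Everything named here (`ChatFallback`, `ChatReply`, the `askLlm` delegate, and the message text) is hypothetical and only illustrates the pattern; a real app would plug in its own client and wording.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Simulate an outage: the hypothetical LLM call throws as if the service were down.
Func<string, CancellationToken, Task<string>> brokenLlm =
    (_, _) => throw new HttpRequestException("LLM service unavailable");

var reply = await ChatFallback.AskAsync("How do I reset my password?", brokenLlm);
Console.WriteLine($"{reply.Text} (degraded: {reply.IsDegraded})");

// Result returned to the UI; IsDegraded lets it show a banner or disable the chat box.
public sealed record ChatReply(string Text, bool IsDegraded);

public static class ChatFallback
{
    public const string UnavailableMessage =
        "The assistant is temporarily unavailable. Please try again later.";

    public static async Task<ChatReply> AskAsync(
        string question,
        Func<string, CancellationToken, Task<string>> askLlm,
        CancellationToken cancellationToken = default)
    {
        try
        {
            var answer = await askLlm(question, cancellationToken);
            return new ChatReply(answer, IsDegraded: false);
        }
        catch (Exception ex) when (ex is HttpRequestException or TimeoutException)
        {
            // The dependency is non-critical, so return a friendly message instead of failing the request.
            return new ChatReply(UnavailableMessage, IsDegraded: true);
        }
    }
}
```

The same `IsDegraded` signal could instead be driven by the result of your infrastructure healthcheck, so the UI can warn users before they try the feature rather than after a failed call.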
