Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
added copy for infra healthcheck rule
1 parent 12a8898, commit 97a1dde
Showing 4 changed files with 39 additions and 8 deletions.
Binary file added
BIN
+42.1 KB
...have-a-healthcheck-page-to-test-your-website-dependencies/infra-bad-example.png
Binary file added
BIN
+25 KB
...ave-a-healthcheck-page-to-test-your-website-dependencies/infra-good-example.png
45 changes: 38 additions & 7 deletions
rules/have-a-healthcheck-page-to-test-your-website-dependencies/rule.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,25 +1,56 @@ | ||
--- | ||
seoDescription: Test your website's dependencies and functionality with a comprehensive health check page. | ||
seoDescription: Website dependency and infrastructure healthchecks | ||
type: rule | ||
title: Do you have a /HealthCheck (was /zsValidate) page to test your website | ||
dependencies? | ||
uri: have-a-healthcheck-page-to-test-your-website-dependencies | ||
title: Do you Healthcheck your Infrastructure? | ||
uri: infrastructure-healthchecks | ||
authors: | ||
- title: Adam Cogan | ||
url: https://ssw.com.au/people/adam-cogan | ||
- title: Lewis Toh | ||
url: https://ssw.com.au/people/lewis-toh | ||
- title: Luke Cook | ||
url: https://ssw.com.au/people/luke-cook | ||
|
||
related: [] | ||
redirects: | ||
- do-you-have-a-healthcheck-was-zsvalidate-page-to-test-your-website-dependencies | ||
- do-you-have-a-healthcheck-(was-zsvalidate)-page-to-test-your-website-dependencies | ||
- have-a-healthcheck-page-to-test-your-website-dependencies | ||
created: 2020-03-12T20:57:37.000Z | ||
archivedreason: null | ||
guid: 015fcac3-c2c2-4d25-a6cd-1317eed69fc6 | ||
--- | ||
|
||
There are two kinds of errors, coding errors and system health errors. Coding errors should ideally be found during development (by compiling, debugging, or running unit tests), while system health errors should be found by application health checks. | ||
Most developers include [healthchecks for their own applications](/have-a-healthcheck-page-to-make-sure-your-website-is-healthy/), but modern solutions are often highly dependent on external cloud infrastructure. When critical services go down, your app could become unresponsive or fail entirely. Ensuring your infrastructure is healthy is just as important as ensuring your app is healthy. | ||
|
||
<!--endintro--> | ||
|
||
Refer to the following rules for details: | ||
## Your app is only as healthy as its infrastructure | ||
Enterprise applications typically rely on a large number of cloud services: databases, caches, message queues, and more recently LLMs and other cloud-only AI services. These pieces of infrastructure are crucial to the health of your application, so they should be monitored with the same care and attention as your own code. If any component of your infrastructure fails, your app may not function as expected, potentially leading to outages, performance issues, or a degraded user experience. Monitoring the health of infrastructure services is not just a technical task; it ensures the continuity of business operations and user satisfaction. | ||
|
||
- See SSW Rules - [Do you have a HealthCheck page to make sure your website is healthy?](/have-a-healthcheck-page-to-make-sure-your-website-is-healthy) | ||
`youtube: https://www.youtube.com/watch?v=4abSfjdzqms` | ||
**Figure: How to add Healthchecks in ASP.NET Core (11 min)** | ||
|
||
## Adding a custom healthcheck | ||
Here's a quick code snippet that can be used to add a healthcheck to an external API. | ||
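
As a minimal sketch of such a check (not necessarily the snippet this rule shipped with), assuming ASP.NET Core's built-in `IHealthCheck` abstraction: the class name `ExternalApiHealthCheck` and the URL `https://api.example.com/status` are placeholders introduced purely for illustration.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical check: Healthy when the external API answers with a 2xx status.
public sealed class ExternalApiHealthCheck : IHealthCheck
{
    private readonly HttpClient _httpClient;

    public ExternalApiHealthCheck(HttpClient httpClient) => _httpClient = httpClient;

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            // Placeholder URL (assumption); point this at the dependency's real status endpoint.
            using var response = await _httpClient.GetAsync(
                "https://api.example.com/status", cancellationToken);

            return response.IsSuccessStatusCode
                ? HealthCheckResult.Healthy("External API responded successfully.")
                : HealthCheckResult.Unhealthy(
                    $"External API returned status {(int)response.StatusCode}.");
        }
        catch (Exception ex) when (ex is HttpRequestException or TaskCanceledException)
        {
            // Network failures and timeouts are reported as an unhealthy dependency.
            return HealthCheckResult.Unhealthy("External API is unreachable.", ex);
        }
    }
}
```

Registering it in `Program.cs` might look like this; the check name `external-api` and the `/health` route are also assumptions:

```csharp
builder.Services.AddHttpClient<ExternalApiHealthCheck>();
builder.Services.AddHealthChecks()
    .AddCheck<ExternalApiHealthCheck>("external-api");

app.MapHealthChecks("/health");
```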
|
||
|
||
|
||
## Alerts and responses | ||
Adding comprehensive healthchecks is great, but if no one is told when they fail, what's the point? There are awesome tools available to notify Site Reliability Engineers (SREs) or SysAdmins when something is offline, so make sure your app is set up to use them! For instance, Azure's [Azure Monitor Alerts](https://learn.microsoft.com/en-us/azure/azure-monitor/alerts/alerts-overview) and AWS' [CloudWatch](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/WhatIsCloudWatch.html) provide a suite of configurable options for who, what, when, and how alerts should be fired. | ||
|
||
## Healthcheck UIs | ||
Depending on your needs, you may want to bake a healthcheck UI directly into your app. Packages like [AspNetCore.HealthChecks.UI](https://www.nuget.org/packages/AspNetCore.HealthChecks.UI/) make this a breeze, and can often act as your canary in the coal mine. Cloud providers' native status/health pages can take a while to update, so having your own can be a huge timesaver. | ||
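
A rough sketch of wiring up that package in a minimal `Program.cs`, assuming the `AspNetCore.HealthChecks.UI`, `AspNetCore.HealthChecks.UI.Client`, and `AspNetCore.HealthChecks.UI.InMemory.Storage` packages are installed. The endpoint name `My app`, the URL `https://localhost:5001/health`, and the `/health` route are illustrative assumptions, not values from this rule.

```csharp
using HealthChecks.UI.Client;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

// Register your healthchecks here (for example, infrastructure checks).
builder.Services.AddHealthChecks();

// The UI polls the configured endpoint and keeps its history in memory.
builder.Services
    .AddHealthChecksUI(setup =>
        setup.AddHealthCheckEndpoint("My app", "https://localhost:5001/health")) // placeholder URL
    .AddInMemoryStorage();

var app = builder.Build();

// Expose results in the JSON format the UI expects.
app.MapHealthChecks("/health", new HealthCheckOptions
{
    ResponseWriter = UIResponseWriter.WriteHealthCheckUIResponse
});

// Serves the dashboard (by default at /healthchecks-ui).
app.MapHealthChecksUI();

app.Run();
```

In-memory storage suits a quick start; the package also offers persistent storage providers if the history needs to survive restarts.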
|
||
|
||
## Handle offline infrastructure gracefully | ||
When using non-critical infrastructure like an LLM-powered chatbot, make sure to implement graceful degradation strategies. Instead of failing completely, this allows your app to respond intelligently to infrastructure outages, whether through fallback logic, informative user messages, or retry mechanisms when the service is back online. | ||
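
One way to picture the fallback approach is a thin wrapper around the non-critical service. This is only a sketch under stated assumptions: `IChatbotClient`, `ChatReply`, and `ResilientChatbot` are hypothetical names invented for illustration, and the set of exceptions treated as an outage would depend on the real client.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical abstraction over an LLM-backed chatbot.
public interface IChatbotClient
{
    Task<string> AskAsync(string question, CancellationToken cancellationToken = default);
}

// Result the UI can render whether or not the chatbot is reachable.
public sealed record ChatReply(bool IsAvailable, string Message);

public sealed class ResilientChatbot
{
    private readonly IChatbotClient _inner;

    public ResilientChatbot(IChatbotClient inner) => _inner = inner;

    public async Task<ChatReply> AskAsync(
        string question,
        CancellationToken cancellationToken = default)
    {
        try
        {
            var answer = await _inner.AskAsync(question, cancellationToken);
            return new ChatReply(true, answer);
        }
        catch (Exception ex) when (ex is HttpRequestException or TimeoutException)
        {
            // Degrade gracefully: return an informative message instead of failing the page.
            return new ChatReply(
                false,
                "The assistant is temporarily unavailable. Please try again later.");
        }
    }
}
```

The UI can then check `IsAvailable` and disable the chat input up front, rather than letting the user hit an error after typing a question.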
|
||
|
||
::: bad | ||
![Figure: Bad example – The user is given the chance to interact with a feature that is currently unavailable.](infra-bad-example.png) | ||
::: | ||
|
||
::: good | ||
![Figure: Good example – The user is pre-emptively shown a message that shows this feature is currently unavailable.](infra-good-example.png) | ||
::: |