Add support for OpenTelemetry and loadtesting via Locust #65

Merged 4 commits on Jul 23, 2024
13 changes: 12 additions & 1 deletion README.md
@@ -13,6 +13,7 @@ This project is designed for deployment to Azure using [the Azure Developer CLI]
* [Local development](#local-development)
* [Costs](#costs)
* [Security guidelines](#security-guidelines)
* [Guidance](#guidance)
* [Resources](#resources)

## Features
@@ -176,12 +177,22 @@ You may try the [Azure pricing calculator](https://azure.microsoft.com/pricing/c
* Azure PostgreSQL Flexible Server: Burstable Tier with 1 CPU core, 32GB storage. Pricing is hourly. [Pricing](https://azure.microsoft.com/pricing/details/postgresql/flexible-server/)
* Azure Monitor: Pay-as-you-go tier. Costs based on data ingested. [Pricing](https://azure.microsoft.com/pricing/details/monitor/)

## Security Guidelines
## Security guidelines

This template uses [Managed Identity](https://learn.microsoft.com/entra/identity/managed-identities-azure-resources/overview) for authenticating to the Azure services used (Azure OpenAI, Azure PostgreSQL Flexible Server).

Additionally, we have added a [GitHub Action](https://github.com/microsoft/security-devops-action) that scans the infrastructure-as-code files and generates a report containing any detected issues. To ensure continued best practices in your own repository, we recommend that anyone creating solutions based on our templates ensure that the [Github secret scanning](https://docs.github.com/code-security/secret-scanning/about-secret-scanning) setting is enabled.

## Guidance

Further documentation is available in the `docs/` folder:

* [Deploying with existing resources](docs/deploy_existing.md)
* [Monitoring with Azure Monitor](docs/monitoring.md)
* [Load testing](docs/loadtesting.md)

Please post in the issue tracker with any questions or issues.

## Resources

* [RAG chat with Azure AI Search + Python](https://github.com/Azure-Samples/azure-search-openai-demo/)
Binary file added docs/images/appinsights_trace.png
Binary file added docs/images/locust_loadtest.png
38 changes: 38 additions & 0 deletions docs/loadtesting.md
@@ -0,0 +1,38 @@
## Load testing

We recommend running a load test for your expected number of users.
You can use the [Locust tool](https://docs.locust.io/) with the `locustfile.py` in this sample,
or set up a load test with Azure Load Testing.

To use Locust, first install the dev requirements, which include Locust:

```shell
python -m pip install -r requirements-dev.txt
```

Or install Locust manually:

```shell
python -m pip install locust
```

Then run the `locust` command, specifying the name of the user class to use from `locustfile.py` (if you don't specify one, all user classes are used). We've provided a `ChatUser` class that simulates a user asking questions and receiving answers.

```shell
locust
```

Open the Locust web UI at [http://localhost:8089/](http://localhost:8089/), the URL displayed in the terminal.

Start a new test with the URI of your website, e.g. `https://my-chat-app.containerapps.io`.
Do *not* end the URI with a slash. You can start by pointing at your localhost if you're concerned
more about load on OpenAI than on Azure Container Apps and PostgreSQL Flexible Server.

For the number of users and spawn rate, we recommend starting with 20 users and a spawn rate of 1 user per second.
From there, you can keep increasing the number of users to simulate your expected load.
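To translate a user count into an approximate request rate, note that each simulated user completes roughly one request per (wait time + response time) cycle. The wait and response values below are assumptions for illustration, not measurements from this sample:

```python
def estimated_rps(users: int, avg_wait_s: float, avg_response_s: float) -> float:
    """Rough steady-state requests/second: each simulated user completes
    one request per (wait + response) seconds."""
    return users / (avg_wait_s + avg_response_s)

# Example: 20 users with an assumed ~2.5 s average wait and ~1.5 s
# average response time per request.
print(estimated_rps(20, 2.5, 1.5))  # → 5.0
```

Inverting the same arithmetic tells you how many users to configure for a target request rate.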

Here's an example loadtest for 20 users and a spawn rate of 1 per second:

![Screenshot of Locust charts showing 5 requests per second](images/locust_loadtest.png)

After each test, check the local or Azure Container Apps logs to see if there are any errors.
17 changes: 17 additions & 0 deletions docs/monitoring.md
@@ -0,0 +1,17 @@

# Monitoring with Azure Monitor

By default, deployed apps use Application Insights to trace each request and log any errors.

To see the performance data, go to the Application Insights resource in your resource group, click on the "Investigate -> Performance" blade and navigate to any HTTP request to see the timing data.
To inspect the performance of chat requests, use the "Drill into Samples" button to see end-to-end traces of all the API calls made for any chat request:

![Tracing screenshot](images/appinsights_trace.png)

To see any exceptions and server errors, navigate to the "Investigate -> Failures" blade and use the filtering tools to locate a specific exception. You can see Python stack traces on the right-hand side.

You can also see chart summaries on a dashboard by running the following command:

```shell
azd monitor
```