kubernetes-sigs · bingikarthik · May 13, 2025
diff --git a/README.md b/README.md
@@ -1,142 +1,157 @@
 # Usage Metrics Collector
 
-The usage-metrics-collector is a Prometheus metrics collector optimized for collecting kube usage and
-capacity metrics.
+A Prometheus metrics collector optimized for Kubernetes usage and capacity metrics.
 
-## Motivation
+## Overview
 
-Why not just use promql and recording rules?
+The usage-metrics-collector provides enhanced metrics collection for Kubernetes environments, focusing on resource usage and capacity monitoring with improved scalability and insights.
 
-- Scale
-  - Aggregate at collection time to reduce prometheus work
-  - Export aggregated metrics without the raw metrics to reduce prometheus storage
-- Insight
-  - Join labels at collection time (e.g. set the priority class on pod metrics)
-  - Set hard to resolve labels (e.g. set the workload kind on pod metrics)
-  - View node-level cgroup utilization (e.g. kubepods vs system.slice metrics)
-- Fidelity
-  - Scrape utilization at 1s intervals as raw metrics
-  - Perform aggregations on the 1s interval metrics (e.g. get the p95 1s utilization sample for all replicas of a workload)
+## Why Use This Instead of Standard Prometheus?
 
-### Example
+### Scale Optimization
+- Aggregates metrics at collection time to reduce Prometheus workload
+- Exports only aggregated metrics to reduce storage requirements
 
-[collector.yaml](config/metrics-prometheus-collector/configmaps/collector.yaml)
+### Enhanced Insights
+- Joins labels at collection time (e.g., adds priority class to pod metrics)
+- Sets hard-to-resolve labels (e.g., workload kind on pod metrics)
+- Provides node-level cgroup utilization metrics (kubepods vs system.slice)
+
+### High Fidelity
+- Scrapes utilization at 1s intervals as raw metrics
+- Performs aggregations on high-frequency metrics (e.g., p95 utilization across workload replicas)
+
+### Configuration Example
+See [collector.yaml](config/metrics-prometheus-collector/configmaps/collector.yaml) for a sample configuration.
 
 ## Architecture
 
-### considerations
-
-- Metrics must be highly configurable
-  - The metrics labels derived from objects
-  - The aggregations (sum, average, median, max, p95, histograms, etc) must be configurable
-- Metrics should be able to be pushed to additional sources such as cloud storage buckets, BigQuery, etc
-- Metric computation must scale to large clusters with lots of churn.
-  - Run aggregations in parallel
-  - Re-use previous results as much as possible
-- Scrapes should always immediately get a result, even when complex aggregations in large clusters take minutes to compute.
-- Utilization metrics should not be published until the data is present from a sufficient number of nodes.
-  This is to prevent showing "low" utilization numbers before all nodes have sent results.
-- There should be no graphs in data.  There must always be at least 1 ready and healthy replica which can be scraped by prometheus.
-- Sampler pods which become unhealthy due to issues on a node should be continuously recreated until they are functional again.
-- All cluster objects and utilization samples are cached in the collector so memory must be optimized.
-- It is difficult to horizontally scale the collector.  Offloading computations to the samplers is preferred.
-
-### metrics-node-sampler
+### Design Considerations
+
+- **Configurability**: Highly configurable metrics with customizable labels and aggregations
+- **Extensibility**: Push metrics to additional sources (cloud storage, BigQuery)
+- **Scalability**: Parallel aggregations and result caching for large clusters
+- **Responsiveness**: Immediate results for Prometheus scrapes, even for complex calculations
+- **Accuracy**: Only publishes metrics after collecting sufficient data from nodes
+- **Reliability**: Maintains healthy replicas for consistent scraping
+- **Resilience**: Auto-recreates unhealthy sampler pods
+- **Efficiency**: Optimized memory usage for cached objects and samples
+- **Performance**: Offloads computation to samplers where possible
+
+### Components
 
+#### metrics-node-sampler
 - Runs as a DaemonSet on each Node
-- Periodically reads utilization data for cpu and memory and stores in ring buffer
-  - Period and number of samples is configurable
-- Reads host metrics directly from cgroups psuedo filesystem (e.g. cpu usage for all of kubepods cgroup)
-- Reads container metrics from containerd (e.g. cpu usage for an individual container)
-- Periodically pushes metrics to collectors
-  - After time period
-  - After new pod starts running
-  - Before shutting down
-- Finds collectors to push to via DNS
-- Collectors can register manually with each sampler
-- Performs some precomputations such as averages.
-
-### metrics-prometheus-collector
-
-- Runs as a Deployment with multiple replicas for HA
-- Highly configurable metrics
-  - Metric labels may be derived from annotations / labels on other objects (e.g. pod metrics should have metric labels pulled from node conditions)
-  - Metrics may be pre-aggregated prior to being scraped by prometheus (e.g. reduce cardinality, produce quantiles and histograms)
-- Periodically get the metrics and cache them to be scraped (i.e. minimize scrape time by eagerly computing results)
-- Registers itself and all collector replicas with each sampler
-- Recieves metrics from node-samplers as a utilization source
-- Waits until has sufficient samples before providing results
-- Waits until results have been scraped before marking self as Ready
-- Can write additional metrics to local files
-
-### collector side-cars
-
-- May expose additional metrics read from external sources
-- May write local files to persistent storage for futher analysis
-
-## Exposed Metrics
+- Collects CPU and memory utilization data in a configurable ring buffer
+- Reads host metrics from cgroups filesystem and container metrics from containerd
+- Pushes metrics to collectors on schedule, pod creation, and before shutdown
+- Discovers collectors via DNS and accepts manual collector registration
+- Performs initial aggregations like averages
+
+#### metrics-prometheus-collector
+- Runs as a high-availability Deployment
+- Configures metrics with rich label derivation capabilities
+- Pre-aggregates metrics to reduce cardinality and compute histograms/quantiles
+- Caches computed results to minimize scrape time
+- Registers with all node samplers to receive utilization data
+- Waits for sufficient samples before exposing metrics
+- Signals readiness after initial scrape
+- Supports writing additional metrics to local files
+
+#### collector side-cars
+- Expose metrics from external sources
+- Write metrics to persistent storage for further analysis
+
+## Metrics
 
 A sample of the exposed metrics is available in [METRICS.md](METRICS.md).
 
-In addition to these metrics, a series of performance related metrics are published for the collection process.
-These metrics are documented in [performance analysis document](docs/performance-analysis.md).
-
-## Getting started
-
-**Note**: No usage-metrics-collector container image is publicly hosted.  Folks will need to build and publish
-this own until this is resolved.
-
-### Installing into a cluster
-
-#### Kind cluster
-
-**Important**: requires using cgroups v1.
-
-- Must set for Docker on Mac using [these docs](https://docs.docker.com/desktop/release-notes/#for-mac-28)
-- Must set for GKE for 1.26+ clusters
-
-1. Create a kind cluster
-  - `kind create cluster`
-2. Build the image
-  - `docker build . -t usage-metrics-collector:v0.0.0`
-3. Load the image into kind
-  - `kind load docker-image usage-metrics-collector:v0.0.0`
-4. Install the config
-  - `kustomize build config | kubectl apply -f -`
-5. Update your context to use the usage-metrics-collector namespace by default
-  - `kubectl config set-context --current --namespace=usage-metrics-collector`
-
-### Kicking the tires
-
-1. Make sure the pods are healthy
-  - `kubectl get pods`
-2. Make sure the services have endpoints
-  - `kubectl describe services`
-3. Get the metrics from the collector itself
-  - `kubectl exec -t -i $(kubectl get pods -o name -l app=metrics-prometheus-collector) -- curl localhost:8080/metrics`
-  - wait for service to be ready
-  - `kubectl port-forward service/metrics-prometheus-collector 8080:8080`
-  - visit `localhost:8080/metrics` in your browser
-4. Get the metrics from prometheus
-  - `kubectl port-forward $(kubectl get pods -o name -l app=prometheus) 9090:9090`
-  - visit `localhost:9090/` in your browser
-5. View the metrics in Grafana
-  - `kubectl port-forward service/grafana 3000:3000`
-  - visit `localhost:3000` in your browser
-  - enter `admin` for the username and password
-  - go to "Explore"
-  - change the source to "prometheus"
-  - enter `kube_usage_` into the metric field
-  - remove the label filters
-  - click "Run Query"
-
-### Specifying aggregation rules
+Performance-related metrics for the collection process are documented in the [performance analysis document](docs/performance-analysis.md).
+
+## Getting Started
+
+**Note**: No usage-metrics-collector container image is publicly hosted. You'll need to build and publish your own.
+
+### Installation with Kind
+
+**Important**: Requires cgroups v1
+- For Docker on Mac: Configure using [these docs](https://docs.docker.com/desktop/release-notes/#for-mac-28)
+- For GKE 1.26+ clusters: Enable cgroups v1
+
+1. Create a kind cluster:
+   ```
+   kind create cluster
+   ```
+
+2. Build the image:
+   ```
+   docker build . -t usage-metrics-collector:v0.0.0
+   ```
+
+3. Load the image into kind:
+   ```
+   kind load docker-image usage-metrics-collector:v0.0.0
+   ```
+
+4. Install the configuration:
+   ```
+   kustomize build config | kubectl apply -f -
+   ```
+
+5. Set your context to the collector namespace:
+   ```
+   kubectl config set-context --current --namespace=usage-metrics-collector
+   ```
+
+### Verification Steps
+
+1. Check pod health:
+   ```
+   kubectl get pods
+   ```
+
+2. Verify service endpoints:
+   ```
+   kubectl describe services
+   ```
+
+3. View collector metrics directly:
+   ```
+   kubectl exec -t -i $(kubectl get pods -o name -l app=metrics-prometheus-collector) -- curl localhost:8080/metrics
+   ```
+
+   Or using port forwarding:
+   ```
+   kubectl port-forward service/metrics-prometheus-collector 8080:8080
+   ```
+   Then visit `localhost:8080/metrics` in your browser.
+
+4. Access Prometheus metrics:
+   ```
+   kubectl port-forward $(kubectl get pods -o name -l app=prometheus) 9090:9090
+   ```
+   Then visit `localhost:9090/` in your browser.
+
+5. View metrics in Grafana:
+   ```
+   kubectl port-forward service/grafana 3000:3000
+   ```
+   Then:
+   - Visit `localhost:3000` in your browser
+   - Login with username/password: `admin`/`admin`
+   - Go to "Explore"
+   - Select "prometheus" as the source
+   - Enter `kube_usage_` in the metric field
+   - Remove any label filters
+   - Click "Run Query"
+
+### Customizing Aggregation Rules
 
 1. Edit [config/metrics-prometheus-collector/configmaps/collector.yaml](config/metrics-prometheus-collector/configmaps/collector.yaml)
 2. Run `make run-local`
-3. View the updated metrics in grafana
+3. View the updated metrics in Grafana
 
-## Code of conduct
+## Code of Conduct
 
 Participation in the Kubernetes community is governed by the [Kubernetes Code of Conduct](code-of-conduct.md).