Add prometheus #148

Merged
merged 1 commit into from
Feb 5, 2024

Conversation

RMcVelia
Collaborator

@RMcVelia RMcVelia commented Jan 22, 2024

Context

Add Prometheus monitoring for non-production clusters.

It currently scrapes the following metrics:

  • Prometheus service
    /metrics from the prometheus svc endpoint
  • cAdvisor
    /api/v1/nodes/${node}/proxy/metrics/cadvisor
  • Nodes
    /api/v1/nodes/${node}/proxy/metrics
  • API server
    /metrics from the prometheus api endpoint
    TLS verification had to be disabled for this to work.

We may need to consider whether we want all of these metrics.
The node API metrics can be checked with kubectl get --raw /api/v1/nodes/${node}/proxy/metrics (or /api/v1/nodes/${node}/proxy/metrics/cadvisor for the cAdvisor metrics).
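
As a rough sketch of the kubectl check above, the two node proxy paths can be built per node and fetched in a loop. The helper names node_metrics_path and check_all_nodes are hypothetical and not part of this PR, and the sketch assumes kubectl is already pointed at the target cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of this PR) for spot-checking node metrics.

# Print the API proxy path for a node's metrics endpoint.
# Pass "cadvisor" as the second argument for the cAdvisor endpoint.
node_metrics_path() {
  local node="$1"
  local kind="${2:-}"
  if [ "$kind" = "cadvisor" ]; then
    printf '/api/v1/nodes/%s/proxy/metrics/cadvisor\n' "$node"
  else
    printf '/api/v1/nodes/%s/proxy/metrics\n' "$node"
  fi
}

# Print the first few metric lines from every node (requires cluster access).
# Usage: check_all_nodes            # node metrics
#        check_all_nodes cadvisor   # cAdvisor metrics
check_all_nodes() {
  local node
  for node in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do
    echo "== ${node}"
    kubectl get --raw "$(node_metrics_path "$node" "${1:-}")" | head -n 5
  done
}
```

Sourcing the file only defines the functions; nothing contacts the cluster until check_all_nodes is called.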

Alert rules, scrape configs, etc. will be improved in later PRs.

Changes proposed in this pull request

Add the required Terraform resources
Add initial documentation

Guidance to review

make development terraform-plan ENVIRONMENT=cluster6
scripts/pfwd.sh
http://localhost:8080/
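
Once the port-forward is running, the scrape targets can also be checked from the command line rather than the web UI. The following is a minimal sketch, assuming scripts/pfwd.sh exposes Prometheus on http://localhost:8080 and that the /api/v1/targets response is compact JSON; the helper names and the PROM_URL variable are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of this PR) for checking scrape targets.

# Count targets reported as healthy in /api/v1/targets JSON read from stdin.
# This is a plain text match on "health":"up", not a full JSON parse.
count_up_targets() {
  grep -o '"health":"up"' | wc -l | tr -d ' '
}

# Query the port-forwarded Prometheus and print the number of healthy targets.
# PROM_URL is an assumed override; it defaults to the URL from the review steps.
show_up_targets() {
  curl -s "${PROM_URL:-http://localhost:8080}/api/v1/targets" | count_up_targets
}
```

A count lower than the number of expected targets suggests that one of the scrape jobs is failing and is worth inspecting in the Prometheus targets page.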

Before merging

Remove "--storage.tsdb.retention.time=12h" from the prometheus deployment
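
One way to confirm the flag is gone before merging is to inspect the container arguments of the deployed Prometheus. This is only a sketch: the helper name is hypothetical, and the namespace and deployment name in the usage comment are placeholders that are not taken from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR).

# Succeeds when the arguments read from stdin contain no retention-time override.
retention_flag_absent() {
  ! grep -qF -- '--storage.tsdb.retention.time='
}

# Example usage (placeholder namespace and deployment name):
# kubectl -n <namespace> get deployment <prometheus-deployment> \
#   -o jsonpath='{.spec.template.spec.containers[*].args}' \
#   | retention_flag_absent && echo "retention override removed"
```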

Checklist

  • I have performed a self-review of my code, including formatting and typos
  • I have cleaned the commit history
  • I have added the Devops label
  • I have attached the pull request to the trello card

@RMcVelia RMcVelia force-pushed the 911-create-prometheus-service branch from e0b1417 to abbf9d4 Compare January 22, 2024 15:20
Contributor

@temitope777 temitope777 left a comment

This is quite interesting @RMcVelia
good job

@RMcVelia RMcVelia force-pushed the 911-create-prometheus-service branch 2 times, most recently from b6bc56c to 3d34871 Compare January 23, 2024 15:57
@RMcVelia RMcVelia marked this pull request as ready for review January 24, 2024 16:14
@RMcVelia RMcVelia force-pushed the 911-create-prometheus-service branch 2 times, most recently from ca1776d to 0ea4656 Compare January 25, 2024 10:39
@RMcVelia RMcVelia force-pushed the 911-create-prometheus-service branch from 0ea4656 to 434fb5f Compare January 26, 2024 08:18
@RMcVelia RMcVelia force-pushed the 911-create-prometheus-service branch 2 times, most recently from dbcb11f to 8d047ef Compare January 31, 2024 19:19
@RMcVelia RMcVelia requested a review from saliceti February 1, 2024 09:08
@RMcVelia RMcVelia force-pushed the 911-create-prometheus-service branch from 8d047ef to ea71c0b Compare February 1, 2024 10:33
@RMcVelia RMcVelia force-pushed the 911-create-prometheus-service branch from ea71c0b to e2e0733 Compare February 5, 2024 11:32
@RMcVelia RMcVelia force-pushed the 911-create-prometheus-service branch from e2e0733 to e333753 Compare February 5, 2024 12:43
@RMcVelia RMcVelia merged commit 7677964 into main Feb 5, 2024
4 checks passed
@RMcVelia RMcVelia deleted the 911-create-prometheus-service branch February 5, 2024 16:05
3 participants