
feat(alerts): New KubePdbNotEnoughHealthyPods alert #1045

Draft
wants to merge 3 commits into
base: master
from

Conversation

skl
Collaborator

@skl skl commented Mar 25, 2025

Fixes #1028

@skl skl added the enhancement New feature or request label Mar 25, 2025
@skl skl self-assigned this Mar 25, 2025
@skl skl added the keepalive Use to prevent automatic closing label Mar 25, 2025

@paulajulve paulajulve left a comment

:shipit:

@skl
Collaborator Author

skl commented Mar 25, 2025

On hold, pending internal trial (~30 days).

@skl
Collaborator Author

skl commented Mar 25, 2025

FYI test deployment shows that I may need to revert part of #1013, which is causing an error with our internal pipeline:

time="2025-03-25T14:28:28Z" level=error msg="bad recording rule name" error="recording rule name does not match level:metric:operation format, must contain at least one colon" file=output/mimir/dev-us-central-0.yaml rule=windows_pod_container_available ruleGroup=windows.pod.rules
time="2025-03-25T14:28:28Z" level=error msg="bad recording rule name" error="recording rule name does not match level:metric:operation format, must contain at least one colon" file=output/mimir/dev-us-central-0.yaml rule=windows_container_total_runtime ruleGroup=windows.pod.rules
time="2025-03-25T14:28:28Z" level=error msg="bad recording rule name" error="recording rule name does not match level:metric:operation format, must contain at least one colon" file=output/mimir/dev-us-central-0.yaml rule=windows_container_memory_usage ruleGroup=windows.pod.rules
time="2025-03-25T14:28:28Z" level=error msg="bad recording rule name" error="recording rule name does not match level:metric:operation format, must contain at least one colon" file=output/mimir/dev-us-central-0.yaml rule=windows_container_private_working_set_usage ruleGroup=windows.pod.rules
time="2025-03-25T14:28:28Z" level=error msg="bad recording rule name" error="recording rule name does not match level:metric:operation format, must contain at least one colon" file=output/mimir/dev-us-central-0.yaml rule=windows_container_network_received_bytes_total ruleGroup=windows.pod.rules
time="2025-03-25T14:28:28Z" level=error msg="bad recording rule name" error="recording rule name does not match level:metric:operation format, must contain at least one colon" file=output/mimir/dev-us-central-0.yaml rule=windows_container_network_transmitted_bytes_total ruleGroup=windows.pod.rules
time="2025-03-25T14:28:28Z" level=error msg="bad recording rule name" error="recording rule name does not match level:metric:operation format, must contain at least one colon" file=output/mimir/dev-us-central-0.yaml rule=kube_pod_windows_container_resource_memory_request ruleGroup=windows.pod.rules
time="2025-03-25T14:28:28Z" level=error msg="bad recording rule name" error="recording rule name does not match level:metric:operation format, must contain at least one colon" file=output/mimir/dev-us-central-0.yaml rule=kube_pod_windows_container_resource_memory_limit ruleGroup=windows.pod.rules
time="2025-03-25T14:28:28Z" level=error msg="bad recording rule name" error="recording rule name does not match level:metric:operation format, must contain at least one colon" file=output/mimir/dev-us-central-0.yaml rule=kube_pod_windows_container_resource_cpu_cores_request ruleGroup=windows.pod.rules
time="2025-03-25T14:28:28Z" level=error msg="bad recording rule name" error="recording rule name does not match level:metric:operation format, must contain at least one colon" file=output/mimir/dev-us-central-0.yaml rule=kube_pod_windows_container_resource_cpu_cores_limit ruleGroup=windows.pod.rules
mimirtool: error: 10 erroneous recording rule names, try --help

The Windows recording rules do not follow the required naming convention: each name must match the `level:metric:operation` format and contain at least one colon.
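As a rough illustration of the check that is failing, here is a minimal Python sketch that flags rule names without a colon. It mirrors only the "must contain at least one colon" condition quoted in the error message and is not mimirtool's actual implementation. The function name `find_bad_rule_names` and the colon-containing sample name are hypothetical.

```python
from typing import Iterable, List


def find_bad_rule_names(rule_names: Iterable[str]) -> List[str]:
    """Return the names that contain no colon.

    Assumption: this reproduces only the 'at least one colon' condition
    from the error message, not the full level:metric:operation check.
    """
    return [name for name in rule_names if ":" not in name]


if __name__ == "__main__":
    names = [
        # Two of the names reported in the log above.
        "windows_pod_container_available",
        "windows_container_total_runtime",
        # Hypothetical name that would satisfy the colon condition.
        "namespace:example_metric:sum",
    ]
    for bad in find_bad_rule_names(names):
        print(f"bad recording rule name: {bad}")
```

A check like this could run locally before the pipeline does, so that naming problems surface before deployment.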

@@ -2,5 +2,4 @@
 (import 'apps.libsonnet') +
 (import 'kube_scheduler.libsonnet') +
 (import 'node.libsonnet') +
-(import 'kubelet.libsonnet') +
-(import 'windows.libsonnet')
+(import 'kubelet.libsonnet')
Collaborator Author

This removes the Windows rules from the default build. See this comment, where I previously raised concerns about this change.

Labels
enhancement New feature or request keepalive Use to prevent automatic closing
Development

Successfully merging this pull request may close these issues.

[Enhancement]: Add alerts for PodDisruptionBudgets
2 participants