Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(helm): Allow configurable severity and thresholds for alerts #13730

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

schahal
Copy link
Contributor

@schahal schahal commented Aug 1, 2024

What this PR does / why we need it:

The Loki helm chart comes with hardcoded PrometheusRules where users are unable to change the e.g., severity or threshold to best fit their environment's behavior.

This PR allows users to set some overrides.

For example, in my environment, a critical severity may mean a phone call in the middle of the night. So I have a values.yaml with the following override:

monitoring:
  rules:
    configs:
      LokiRequestPanics:
        # warn me if there are more than 3 panics over the last hour
        lookbackPeriod: 60m
        severity: warning
        threshold: 3

Which issue(s) this PR fixes:
Fixes #11219

Special notes for your reviewer:

  • This is simply pulling the default attributes of the PrometheusRule expressions into values.yaml
  • This is backwards compatible with the monitoring.rules.disabled keys (marking that as deprecated and to use configs for enabling/disabling). But we keep the deprecated keys for backwards compatibility
  • Testing: Ran helm template test --set loki.useTestSchema=true --set monitoring.rules.enabled=true --debug . --validate and confirmed not rendered diff before and after.

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
    • Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • For Helm chart changes bump the Helm chart version in production/helm/loki/Chart.yaml and update production/helm/loki/CHANGELOG.md and production/helm/loki/README.md. Example PR
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

@schahal schahal requested a review from a team as a code owner August 1, 2024 00:23
@CLAassistant
Copy link

CLAassistant commented Aug 1, 2024

CLA assistant check
All committers have signed the CLA.

@schahal schahal force-pushed the feat/make-alert-threshold-severity-configurable branch 2 times, most recently from 4182fae to 6e2e41e Compare August 2, 2024 22:58
@schahal schahal force-pushed the feat/make-alert-threshold-severity-configurable branch 3 times, most recently from d215185 to b72825d Compare August 5, 2024 22:44
@schahal schahal force-pushed the feat/make-alert-threshold-severity-configurable branch from b72825d to 75f35e0 Compare August 28, 2024 14:08
@schahal
Copy link
Contributor Author

schahal commented Aug 28, 2024

Hi @JStickler @DylanGuedes @chaudum @vlad-diachenko : I'm wondering how we can get a reviewer/approval for this? I don't see any guidance in the Contributors doc or guidelines.

Tagged you as I saw the last couple feat(helm) approvals came from you (apologies in advance if it should be someone else, or if I missed something in CONTRIBUTING.md).

This is to address/resolve #11219

@schahal schahal force-pushed the feat/make-alert-threshold-severity-configurable branch from 75f35e0 to d35288c Compare September 3, 2024 20:18
@schahal schahal force-pushed the feat/make-alert-threshold-severity-configurable branch from d35288c to a262d62 Compare September 5, 2024 15:28
@schahal
Copy link
Contributor Author

schahal commented Sep 5, 2024

Adding @trevorwhitney and @paul1r also for approval of the CI workflows and possible review/approve/merge?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The "production" Loki chart doesn't allow customizing the Alert PrometheusRules
2 participants