[8.18](backport #42714) [metricbeat] Refactor kubernetes bearer token authentication #42783

Merged
merged 1 commit into from
Feb 19, 2025

Conversation

mergify[bot]
Contributor

@mergify mergify bot commented Feb 19, 2025

Proposed commit message

[metricbeat] Refactor kubernetes bearer token authentication

Instead of doing retries on 401 errors, use a mechanism from client-go which simply reloads the token periodically in the background.

Also, don't stop logging errors after the first 401. These errors, if present, need to be addressed by the cluster operator, so we should make them more prominent.

We have a report of the current mechanism running into race conditions in some OpenShift clusters. The exact root cause is unknown, but this change should address it.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

After this change, we will log an error every time we get a 401 from the API Server or kubelet, whereas until now we'd only log the first one.

How to test this PR locally

  1. Build a local metricbeat docker image using mage package.
  2. Start a kind cluster.
  3. Upload the docker image to the kind cluster.
  4. Install metricbeat in the cluster using the official manifests.
  5. Wait for an hour until the auth token gets rotated.
  6. Look at records coming from the kubernetes module for the non-state metricsets in Kibana:

[Screenshot: Screenshot_20250217_120534]

Related issues


This is an automatic backport of pull request #42714 done by [Mergify](https://mergify.com).

Instead of doing retries on 401 errors, use a mechanism from client-go
which simply reloads the token periodically in the background.

Also, don't stop logging errors after the first 401. These errors, if
present, need to be addressed by the cluster operator, so we should make
them more prominent.

(cherry picked from commit c61c0fe)
@mergify mergify bot added the backport label Feb 19, 2025
@mergify mergify bot requested review from a team as code owners February 19, 2025 13:09
@mergify mergify bot requested review from rdner and faec and removed request for a team February 19, 2025 13:09
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Feb 19, 2025
@github-actions github-actions bot added enhancement cleanup Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team bugfix labels Feb 19, 2025
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Feb 19, 2025
@pierrehilbert pierrehilbert added Team:obs-ds-hosted-services Label for the Observability Hosted Services team Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team labels Feb 19, 2025
@elasticmachine
Collaborator

Pinging @elastic/obs-ds-hosted-services (Team:obs-ds-hosted-services)

@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@pierrehilbert pierrehilbert requested review from swiatekm and removed request for rdner and faec February 19, 2025 14:35
@swiatekm swiatekm merged commit 908e19f into 8.18 Feb 19, 2025
40 checks passed
@swiatekm swiatekm deleted the mergify/bp/8.18/pr-42714 branch February 19, 2025 18:16
Labels
backport bugfix cleanup enhancement Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team Team:obs-ds-hosted-services Label for the Observability Hosted Services team

3 participants