Freezes if kube-apiserver isn't available at startup #35

Open
iblackman opened this issue Feb 24, 2025 · 0 comments


iblackman commented Feb 24, 2025

Description

When the kube-apiserver is unavailable or unstable at startup, the controller fails to list secrets, namespaces and configmaps and then stops doing any work. It neither fails hard enough for the liveness probe to kick in nor retries.

Example log output when it gets into that state:

{"level":"info","timestamp":"2025-02-24T15:53:32Z","msg":"Starting Keess. Running on local cluster: app-beta-gm"}
{"level":"debug","timestamp":"2025-02-24T15:53:32Z","msg":"Namespace polling interval: 60 seconds"}
{"level":"debug","timestamp":"2025-02-24T15:53:32Z","msg":"Polling interval: 60 seconds"}
{"level":"debug","timestamp":"2025-02-24T15:53:32Z","msg":"Housekeeping interval: 60 seconds"}
{"level":"debug","timestamp":"2025-02-24T15:53:32Z","msg":"Log level: debug"}
{"level":"debug","timestamp":"2025-02-24T15:53:32Z","msg":"Kubeconfig path: /root/.kube/config"}
{"level":"info","timestamp":"2025-02-24T15:53:32Z","msg":"Remote clusters: [app-beta-hq app-beta-px app-prod-hq app-prod-gm]"}
{"level":"error","timestamp":"2025-02-24T15:54:02Z","msg":"Failed to list namespaces: Get \"https://10.64.192.1:443/api/v1/namespaces\": dial tcp 10.64.192.1:443: i/o timeout"}
{"level":"error","timestamp":"2025-02-24T15:54:02Z","msg":"Failed to list secrets: Get \"https://10.64.192.1:443/api/v1/secrets?labelSelector=keess.powerhrg.com%2Fmanaged\": dial tcp 10.64.192.1:443: i/o timeout"}
{"level":"error","timestamp":"2025-02-24T15:54:02Z","msg":"Failed to list configMaps: Get \"https://10.64.192.1:443/api/v1/configmaps?labelSelector=keess.powerhrg.com%2Fmanaged\": dial tcp 10.64.192.1:443: i/o timeout"}
{"level":"error","timestamp":"2025-02-24T15:54:02Z","msg":"Failed to list configMaps: Get \"https://10.64.192.1:443/api/v1/configmaps?labelSelector=keess.powerhrg.com%2Fsync\": dial tcp 10.64.192.1:443: i/o timeout"}
{"level":"error","timestamp":"2025-02-24T15:54:02Z","msg":"Failed to list secrets: Get \"https://10.64.192.1:443/api/v1/secrets?labelSelector=keess.powerhrg.com%2Fsync\": dial tcp 10.64.192.1:443: i/o timeout"}

Expected behavior

The controller should retry or restart when it can't list the resources, so that it keeps trying until the apiserver is available and can then continue without manual intervention.
