Pods with listener volumes can not be restarted after their node is restarted #262

Open
nightkr opened this issue Dec 12, 2024 · 0 comments

Affected Stackable version

nightly (but it's an old issue)

Current and expected behavior

Repro:

  1. Install SDP
  2. Create a pod that mounts a listener-operator volume (such as examples/)
  3. Wait for it to start up
  4. Restart the node that the pod is running on. (k3d note: run docker kill k3d-node-name && docker restart k3d-node-name. A plain docker restart k3d-node-name deletes all pods on the node first, which bypasses the issue. Other distributions may behave similarly.)

Expected:
The pod is eventually restarted and comes back online once the node is running again.

Current behaviour:
The pod gets stuck in the Unknown state. Running kubectl describe on the pod shows a series of Events reporting failed file writes:

MountVolume.SetUp failed for volume "pvc-438df3de-141b-4dbe-a97f-065ba704c3b2" : rpc error: code = Internal desc = failed to prepare pod dir at "/var/lib/kubelet/pods/5784b379-b5cb-4d80-9bf5-64f40bb551ca/volumes/kubernetes.io~csi/pvc-438df3de-141b-4dbe-a97f-065ba704c3b2/mount": failed to write content: File exists (os error 17)
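
The "File exists (os error 17)" suffix is how Rust's standard library renders an EEXIST error on Linux. As a minimal sketch of how such an error can arise, assuming the volume files are written with create-only semantics (the issue does not show the actual code, and the helper name write_new_file is hypothetical):

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper, not taken from the operator's source:
/// writes `content` to `path` but refuses to touch an existing file.
fn write_new_file(path: &Path, content: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        // create_new fails with ErrorKind::AlreadyExists (EEXIST)
        // if anything already exists at `path`.
        .create_new(true)
        .open(path)?;
    file.write_all(content)
}
```

Calling write_new_file twice on the same path succeeds the first time and returns an AlreadyExists error the second time. That would match a pod directory that survived the node restart and is then prepared again for the same pod ID.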

Possible solution

Workaround: Delete the pod. It should then be recreated with a new ID, which can start without conflicts.

Possible fixes:

  1. Allow overwriting volume files (this is currently disallowed as a safeguard, but the safeguard could be disabled)
  2. Mount volumes as a tmpfs (as secret-operator currently does, though this requires the operator to always run with root privileges)
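
To illustrate the first option, here is a rough sketch of a write that tolerates an existing file. It is one possible approach under stated assumptions, not the operator's implementation: the helper name write_or_replace and the ".tmp" suffix are hypothetical, and the temp-file-plus-rename step is an added precaution so that readers never observe a half-written file.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: replaces the file at `path` with `content`,
/// whether or not a file already exists there.
fn write_or_replace(path: &Path, content: &[u8]) -> io::Result<()> {
    // Write the new content to a sibling temporary file first
    // (assumed naming: the target path with a ".tmp" extension).
    let tmp = path.with_extension("tmp");
    fs::write(&tmp, content)?;
    // rename replaces an existing destination file, so leftovers from
    // before the node restart no longer cause a "File exists" error.
    fs::rename(&tmp, path)
}
```

With this variant, preparing the same pod directory a second time would overwrite the stale files instead of failing, at the cost of losing the safeguard mentioned above.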

Additional context

No response

Environment

Client Version: v1.30.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.31.0+k3s1

Would you like to work on fixing this bug?

yes
