Pods with listener volumes can not be restarted after their node is restarted #262

nightkr · 2024-12-12T08:53:01Z

Affected Stackable version

nightly (but it's an old issue)

Current and expected behavior

Repro:

1. Install SDP
2. Create a pod that mounts a list-op volume (such as examples/)
3. Wait for it to start up
4. Restart the node that the pod is running on

k3d note: run docker kill k3d-node-name && docker restart k3d-node-name. A plain docker restart k3d-node-name deletes all pods on the node first, which bypasses the issue. Other distributions may do similar things.

Expected:
The pod is eventually restarted and comes back online once the node is running again.

Current behaviour:
The pod gets stuck in the Unknown state. Running kubectl describe on the pod shows a series of Events with error messages about failed file writes:

MountVolume.SetUp failed for volume "pvc-438df3de-141b-4dbe-a97f-065ba704c3b2" : rpc error: code = Internal desc = failed to prepare pod dir at "/var/lib/kubelet/pods/5784b379-b5cb-4d80-9bf5-64f40bb551ca/volumes/kubernetes.io~csi/pvc-438df3de-141b-4dbe-a97f-065ba704c3b2/mount": failed to write content: File exists (os error 17)

Possible solution

Workaround: Deleting the pod should let it be recreated with a new ID, which is able to start without conflicts.

- Allow overwriting volume files (currently this is disallowed as a safeguard, but that could be disabled)
- Mount volumes as a tmpfs (like secret-operator currently does, but this requires the operator to always run with root privileges)
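
To illustrate the first option, here is a minimal Rust sketch of the difference between a "create only if absent" write and an overwriting write. It is not the operator's actual code: the function names, the temporary directory, and the file name address are all hypothetical, and it only assumes that the safeguard behaves like the standard library's create_new flag, which on Linux reports an existing file as "File exists (os error 17)".

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Strict write (hypothetical helper): fails with AlreadyExists if the
/// file is already present, similar in spirit to the safeguard described above.
fn write_new(path: &Path, content: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create_new(true)
        .open(path)?;
    file.write_all(content)
}

/// Lenient write (hypothetical helper): creates the file or replaces
/// its contents if it already exists.
fn write_overwrite(path: &Path, content: &[u8]) -> io::Result<()> {
    fs::write(path, content)
}

fn main() -> io::Result<()> {
    // Stand-in for the pod's volume directory; the real path is managed by the kubelet.
    let dir = std::env::temp_dir().join(format!("listener-volume-sketch-{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let file = dir.join("address");

    // First mount: the file does not exist yet, so the strict write succeeds.
    write_new(&file, b"10.0.0.1")?;

    // Second mount into the same directory (as after a node restart):
    // the leftover file makes the strict write fail.
    let err = write_new(&file, b"10.0.0.1").unwrap_err();
    assert_eq!(err.kind(), io::ErrorKind::AlreadyExists);
    println!("strict write failed: {err}");

    // Allowing overwrites lets the second mount proceed.
    write_overwrite(&file, b"10.0.0.2")?;
    assert_eq!(fs::read_to_string(&file)?, "10.0.0.2");

    fs::remove_dir_all(&dir)?;
    Ok(())
}
```

The trade-off is the one noted in the list: the strict variant catches unexpected leftovers, while the overwriting variant tolerates files that survive a node restart.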

Additional context

No response

Environment

Client Version: v1.30.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.31.0+k3s1

Would you like to work on fixing this bug?

yes


nightkr added the type/bug label Dec 12, 2024