Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue 329 #331

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Fix issue 329 #331

wants to merge 5 commits into from

Conversation

ashug06
Copy link

@ashug06 ashug06 commented Mar 4, 2025

Summary

This PR fixes issue #329 by making necessary updates to local.rs and formatting source.wdl.

Changes

  • Modified wdl-engine/src/backend/local.rs: (describe the change if you understand it)
  • Formatted wdl-lint/tests/lints/inconsistent-newlines/source.wdl to maintain consistency.

Notes

  • Let me know if any further changes are needed!

@a-frantz
Copy link
Member

a-frantz commented Mar 5, 2025

please rewrite your PR description according to the PR template we wrote and ensure all the tests and checks are working locally

@adthrasher adthrasher requested review from adthrasher, a-frantz, peterhuene and claymcleod and removed request for adthrasher and a-frantz March 5, 2025 14:38
@peterhuene
Copy link
Collaborator

Hi @ashug06.

Unfortunately your changes don't address #329. The issue is that for containerized backends (such as the Docker backend), we need a way for it to chown the files inside of the output directory without elevating privileges, ideally only once as a "cleanup" step and not on every task execution.

The way miniwdl handles this is to start a minimal container and perform the chown operation from within that container, which also runs as root and therefore the user running miniwdl doesn't need to sudo.

let status = status.with_context(|| {
format!("failed to wait for termination of task child process {id}")
})?;
let status = status.with_context(|| {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be no changes to the local backend to fix #329.

@@ -2,7 +2,7 @@
## Note that due an inexact path separator replacement in the tests,
## error messages in the baseline will show `/<escape>` instead of `\<escape>`.

version 1.1

version 1.1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, this file should not have any changes; I suspect your git or editor are automatically changing the line endings in this file, but it is intentionally using a mix of new lines to test a lint.

If this is coming from git, I suspect we are missing a .gitattributes in the test directory to ignore it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it happened to me as well

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put up #338 to exclude the test from autocrlf behavior, which I suspect is the problem here.

@ashug06
Copy link
Author

ashug06 commented Mar 6, 2025

Hi @peterhuene,

Thank you for your feedback! I now understand that the required fix should involve handling chown inside the containerized backend rather than modifying local.rs. I appreciate the reference to miniwdl's approach, and I'll look into implementing a similar solution.
For now, I’ll revert the unintended changes to source.wdl and start working on the correct implementation. It might take me some time to complete, but I'll keep you updated on my progress.
Let me know if there are any additional suggestions or resources that could help!

@ashug06
Copy link
Author

ashug06 commented Mar 6, 2025

Hey @peterhuene, I'm trying to build the project, but I'm getting an error because the nodes-api branch is missing in bollard. Could you clarify which version or branch I should use?
Screenshot 2025-03-06 124709

@peterhuene
Copy link
Collaborator

peterhuene commented Mar 6, 2025

Hi @ashug06.

That branch was deleted in favor of updating the patch in Cargo.toml to the upstream repository now that the fix was merged.

To get things building again, rebase your branch off of stjude-rust-labs/wdl's main branch.

@ashug06
Copy link
Author

ashug06 commented Mar 8, 2025

Hi @peterhuene, based on your feedback, I have implemented a chown_output_directory function to change the ownership of the output directory inside the container before removal. I plan to call this function before remove_container() in the relevant places.

Does this approach align with what you had in mind, or would you suggest any modifications?

@ashug06 ashug06 closed this Mar 8, 2025
@ashug06 ashug06 reopened this Mar 11, 2025
@ashug06
Copy link
Author

ashug06 commented Mar 17, 2025

Hi @peterhuene, I have updated remove_container to call chown_output_directory before removal. Let me know if any further modifications are needed!

Copy link
Collaborator

@peterhuene peterhuene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ashug06. I don't see your changes to the docker backend that I was expecting. Is your local branch pushed to your remote repo?

bollard Outdated
Copy link
Collaborator

@peterhuene peterhuene Mar 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not have a bollard submodule.

What patches to bollard did you need to implement this PR? bollard itself should have everything we need to make this PR work (we also only use bollard transitively through crankshaft).

My expectations around an implementation here is that the docker backend in wdl-engine gets called once at the end of evaluation to perform a "cleanup" operation, and the cleanup operation for the docker backend would be to run a task that does a recursive chown on the entire output directory (thus we only ever do this once per run).

git Outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR shouldn't have a git file.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Peter,
Thanks for the review! I’ll check why my changes to the Docker backend aren’t visible and ensure they are pushed properly.

Regarding the bollard submodule, I didn’t intend to introduce it as a submodule—I'll remove it if it's unnecessary. My modifications were focused on implementing the cleanup operation, but I’ll revisit how bollard is used via crankshaft.

For the cleanup operation, I now understand that it should be executed once at the end of evaluation to recursively chown the entire output directory. I’ll adjust my implementation accordingly.

I also noticed the accidental inclusion of a .git file—I'll remove it in the next update.

Thanks for the guidance! I’ll update the PR soon.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed there's no existing Docker backend in wdl-engine/src/backend/. I created docker.rs with a cleanup function that runs chown -R inside a minimal container at the end of execution. Is this approach correct?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicating from my slack message:

The docker backend is implemented via src/crankshaft.rs (crankshaft is the library that's abstracting task execution and will support more than just docker soon). We should probably add a cleanup method to the TaskExecutionBackend trait that is called whenever the engine finishes top-level evaluation (upon success or failure) and the crankshaft backend, if configured to use Docker, should perform the chown

@claymcleod claymcleod added the S-awaiting-revisions State: awaiting revisions based on code review feedback label Mar 19, 2025
@ashug06 ashug06 requested a review from peterhuene March 20, 2025 18:09
@claymcleod claymcleod force-pushed the main branch 2 times, most recently from 136d955 to 3dc8137 Compare March 26, 2025 01:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-awaiting-revisions State: awaiting revisions based on code review feedback
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants