Determine if signac can be used for workflow management #13
Comments
I wonder if Parsl could be a good fit for this? We already have a fair amount of experience with Parsl executors on Blue Waters, since they are shared with funcX, and it's nice that you can use Python as your workflow definition language.
Does Parsl also keep track of the provenance of the data produced during the workflow?
From the Parsl help Slack channel:
I'm going to the signac office hours today to discuss with them whether it would be a reasonable solution here, but it seems like a more complete workflow solution than Parsl in that it is able to handle the entire workflow and provenance end to end. The less data management I need to invent for recombining the hundreds/thousands of jobs per stage, the better. 👍
So after attending the signac office hours, the good news is that they think that even with all of the containerization this workflow should be well suited to using signac. @BenGalewsky, @bdice and I are also going to try to do some pair programming next week, once I've gone through the docs and intro workflow tutorial and attempted to implement some of the workflow. If you'd like to join as well, you're welcome to!
Just a note to self: given the refactoring in PRs #15, #19, #20, and #21, the simulation pipeline (stages 1 through 3) is now fully parallelized, so that each stage operates on a slice of the total number of events simulated, and the slices are then recombined (cf. PR #21) into an event-level ROOT file at the end of the preprocessing stage (stage 3).
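For reference, the slice-then-recombine pattern described above can be sketched roughly as follows. This is only an illustration and not the code from the PRs: the names `event_slices` and `recombine`, and the half-open `(start, stop)` convention, are assumptions made for the example.

```python
def event_slices(total_events, n_jobs):
    """Split total_events into n_jobs contiguous half-open (start, stop) ranges.

    Hypothetical helper: the name and the slicing convention are illustrative
    assumptions, not taken from the pipeline itself.
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    if total_events < 0:
        raise ValueError("total_events must be non-negative")
    base, extra = divmod(total_events, n_jobs)
    slices = []
    start = 0
    for job_index in range(n_jobs):
        # The first `extra` jobs take one additional event so nothing is dropped.
        stop = start + base + (1 if job_index < extra else 0)
        slices.append((start, stop))
        start = stop
    return slices


def recombine(per_slice_results):
    """Concatenate per-slice event lists into one event-level list, in slice order."""
    combined = []
    for events in per_slice_results:
        combined.extend(events)
    return combined
```

With 10 events and 3 jobs, `event_slices(10, 3)` gives `[(0, 4), (4, 7), (7, 10)]`, and recombining the per-slice lists in that order restores the original event ordering.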
At the moment everything controlling the workflows on Blue Waters is driven by Bash scripts that need user configuration. These submit Torque/PBS jobs with qsub, which in turn submit files to Shifter containers using aprun, and the whole mess needs to be tied together with yet more Bash scripts to guide it. This is pretty ugly and it would be nicer to use some sort of workflow system if possible.

From SciPy 2019, 2020, and 2021 I've seen @bdice and co discuss using signac to control workflows on HPCs that deal with the automation of thousands of datasets. So this might be an interesting channel to look at as a way to escape Bash-everything.

Relevant links:
signac on GitHub
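To make the per-stage, per-slice qsub bookkeeping concrete, here is a minimal Python sketch of the kind of submission loop the Bash scripts currently handle. It is a dry run that only builds and prints commands, and it is not signac or the existing scripts: the script name `run_stage.pbs` and the `STAGE` / `SLICE` variable names are hypothetical, while `-N` (job name) and `-v` (environment variable list) are standard Torque qsub options.

```python
import shlex


def build_qsub_command(stage, slice_index, script="run_stage.pbs"):
    """Return the qsub argument list for one slice of one stage (dry run only).

    Assumptions: the script name and the STAGE / SLICE variable names are
    hypothetical placeholders for whatever the real job scripts expect.
    """
    job_name = f"stage{stage}_slice{slice_index}"
    variables = f"STAGE={stage},SLICE={slice_index}"
    return ["qsub", "-N", job_name, "-v", variables, script]


def plan_submissions(n_stages, n_slices):
    """List the qsub commands for every slice of every stage, stage by stage."""
    commands = []
    for stage in range(1, n_stages + 1):
        for slice_index in range(n_slices):
            commands.append(build_qsub_command(stage, slice_index))
    return commands


if __name__ == "__main__":
    # Print the planned commands instead of submitting anything.
    for command in plan_submissions(n_stages=3, n_slices=2):
        print(shlex.join(command))
```

Note that this sketch does nothing about dependencies between stages or about tracking which slices have finished, which is exactly the part a workflow tool would be expected to take over.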