Determine if signac can be used for workflow management #13

Open
matthewfeickert opened this issue Aug 12, 2021 · 6 comments

@matthewfeickert

matthewfeickert commented Aug 12, 2021

At the moment the workflows on Blue Waters are controlled entirely through Bash scripts that need user configuration. Those scripts submit Torque/PBS jobs with qsub, which in turn submit files to Shifter containers using aprun, and the whole mess needs to be tied together with yet more Bash scripts to guide it. This is pretty ugly and it would be nicer to use some sort of workflow system if possible.

At SciPy 2019, 2020, and 2021 I saw @bdice and co. discuss using signac to control workflows on HPC systems that automate the processing of thousands of datasets. So this might be an interesting avenue to look at as a way to escape Bash-everything.

Relevant links:

@BenGalewsky

I wonder if Parsl could be a good fit for this? We already have a fair amount of experience with Parsl executors on Blue Waters since they are shared with funcX, and it's nice that you can use Python as your workflow definition language.

@matthewfeickert

Does Parsl also keep track of the data provenance produced during the workflow?

@BenGalewsky

From the Parsl help slack channel:

"Most ways that I've seen people use Parsl, they aren't telling Parsl about the data, in the sense of 'here are the files' or 'here are my databases', so Parsl doesn't usually know anything about that at all. But there is a reasonable collection of information in the monitoring db, if you turn it on, about which tasks depended on which other tasks.

It doesn't use the word 'provenance' at all, but you can ask questions like 'what tasks were run as pre-reqs to the task I am pointing at' and for all of those tasks get info like where/when it ran."
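To make the "what tasks were run as pre-reqs" question concrete, here is a minimal sketch in plain Python. It does not use Parsl or its monitoring database schema; the `depends_on` mapping and the task names are hypothetical stand-ins for whatever dependency records a workflow tool exposes.

```python
from collections import deque


def prerequisites(task, depends_on):
    """Return every task that had to run before `task`, directly or transitively."""
    seen = set()
    queue = deque(depends_on.get(task, ()))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        # Walk one level further up the dependency chain.
        queue.extend(depends_on.get(current, ()))
    return seen


# Hypothetical records: task name -> tasks it directly depended on.
depends_on = {
    "combine": ["preprocess_0", "preprocess_1"],
    "preprocess_0": ["simulate_0"],
    "preprocess_1": ["simulate_1"],
}

upstream = prerequisites("combine", depends_on)
```

This only answers the task-lineage question; as the quote says, it knows nothing about which files or datasets each task touched.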

@matthewfeickert

I'm going to the signac office hours today to discuss with them if it would be a reasonable solution here, but it seems like a more complete workflow solution than Parsl in that it is able to handle the entire workflow and provenance end to end. The less inventing of data management for recombination of hundreds/thousands of jobs per stage that I need to do the better. 👍

@matthewfeickert matthewfeickert self-assigned this Aug 12, 2021
@matthewfeickert

matthewfeickert commented Aug 12, 2021

So after attending the signac office hours today (thanks for a very welcoming time @bdice and @atravitz!) I walked the team through the basics of the workflow on Blue Waters at the moment

[Image: BlueWaters_workflow]

and the good news is that they think that even with all of the containerization this workflow should be well suited to using signac. Another good thing that @bdice mentioned is that to move from Blue Waters to another HPC system, like Delta, the workflow would stay the same and the only thing that I would need to change would be the machine-specific template. Having to change only ~1 file to port the whole workflow seems awesome! ✨
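A rough sketch of the "only the machine-specific template changes" idea, using only the standard library. This is not signac-flow's actual template mechanism; the `TEMPLATES` dictionary, the `render_job_script` helper, and the scheduler headers chosen for each machine are assumptions made for illustration.

```python
from string import Template

# Hypothetical per-machine submission templates. The workflow logic never
# changes; only the entry selected here differs between systems.
TEMPLATES = {
    # Assumed Torque/PBS-style header for Blue Waters.
    "blue_waters": Template(
        "#!/bin/bash\n"
        "#PBS -l nodes=$nodes\n"
        "#PBS -l walltime=$walltime\n"
        "$command\n"
    ),
    # Assumed Slurm-style header for Delta.
    "delta": Template(
        "#!/bin/bash\n"
        "#SBATCH --nodes=$nodes\n"
        "#SBATCH --time=$walltime\n"
        "$command\n"
    ),
}


def render_job_script(machine, command, nodes=1, walltime="01:00:00"):
    """Fill the template for `machine` with the job's resources and command."""
    return TEMPLATES[machine].substitute(
        nodes=nodes, walltime=walltime, command=command
    )


script = render_job_script("blue_waters", "python run_stage.py", nodes=2)
```

Porting to a new machine then means adding one more template entry rather than editing every submission script.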


@BenGalewsky, @bdice and I are also going to try to do some pair programming next week once I've gone through the docs and intro workflow tutorial and attempted to implement some of the workflow. If you'd like to join as well you're welcome too!

@matthewfeickert

Just a note to self: given the refactoring in PRs #15, #19, #20, and #21, the simulation pipeline (stages 1 through 3) is now fully parallelized, so that each stage operates on a slice of the total number of events simulated and the slices are then recombined (cf. PR #21) into an event-level ROOT file at the end of the preprocessing stage (stage 3).
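As an illustration of the slice-then-recombine pattern described in this note, here is a minimal sketch in plain Python. It is not the code from those PRs; `event_slices` and `recombine` are hypothetical helpers, and lists of event indices stand in for the per-slice ROOT files.

```python
def event_slices(total_events, n_jobs):
    """Split range(total_events) into n_jobs contiguous (start, stop) slices."""
    if n_jobs <= 0:
        raise ValueError("n_jobs must be positive")
    base, extra = divmod(total_events, n_jobs)
    slices = []
    start = 0
    for job_index in range(n_jobs):
        # The first `extra` jobs each take one additional event.
        stop = start + base + (1 if job_index < extra else 0)
        slices.append((start, stop))
        start = stop
    return slices


def recombine(per_slice_outputs):
    """Concatenate per-slice outputs, in slice order, into one event-level list."""
    combined = []
    for output in per_slice_outputs:
        combined.extend(output)
    return combined


slices = event_slices(total_events=10, n_jobs=3)
# Each "job" here just emits its event indices as a stand-in for real output.
outputs = [list(range(start, stop)) for start, stop in slices]
all_events = recombine(outputs)
```

Keeping the slices contiguous and ordered means the recombined output has every event exactly once, in the original order.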
