Manually restarting an experiment after job completion #449

Open
EverettGrethel opened this issue Jan 9, 2025 · 1 comment
@EverettGrethel

Suppose I run an experiment that dispatches N jobs. Each job restarts automatically up to restart_limit times, using the code in the "restart" section of the YAML script. In addition to these automatic restarts, I would like to be able to restart the experiment manually at any later point, for example if all of the original restarts fail or if I want more runs than restart_limit allows. The manual restarts should run inside the existing experiment folder rather than in a new one, which is what running "maestro run experiment.yaml" creates. Does such a feature exist?

@jwhite242
Collaborator

That is currently in progress. As the first part of this, the ability to update the restart limit, throttle, and sleep settings of a running study has been added on develop. I am now working on getting the conductor to wake up on a completed study and read those new limits, followed by some options for selecting steps to restart manually (potentially overriding their state too, in case more than the restart limit is the problem) and for rerunning the rest of their children. The main limitation of the initial version is that a restart will likely still need to be launched from the same maestro install that launched the study, because part of the graph uses pickle to serialize its state. Follow-on work will move that state into a more portable format that is independent of the maestro version.
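To make the "selected steps plus the rest of their children" idea concrete, here is a minimal sketch in plain Python. It is not Maestro's implementation: the `steps_to_rerun` function, the `children` mapping, and the step names are all hypothetical and assume only that the study is a directed acyclic graph of steps.

```python
from collections import deque


def steps_to_rerun(children, selected):
    """Return the selected steps plus every step downstream of them.

    `children` maps a step name to the steps that depend on it
    (a hypothetical stand-in for the study graph).
    """
    to_rerun = set()
    queue = deque(selected)
    while queue:
        step = queue.popleft()
        if step in to_rerun:
            continue  # already scheduled via another parent
        to_rerun.add(step)
        # Everything that depends on this step has to run again too.
        queue.extend(children.get(step, ()))
    return to_rerun


# Hypothetical study graph: each key maps to the steps that depend on it.
children = {
    "setup": ["simulate"],
    "simulate": ["post-process", "archive"],
    "post-process": ["report"],
    "archive": [],
    "report": [],
}

rerun = steps_to_rerun(children, ["simulate"])
print(sorted(rerun))  # ['archive', 'post-process', 'report', 'simulate']
```

In this sketch, manually restarting "simulate" leaves "setup" untouched and reruns everything that consumes the output of "simulate".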
