[FEATURE] Support HA configuration #149

Open
cartalla opened this issue Sep 21, 2023 · 0 comments


Is your feature request related to a problem? Please describe.
Slurm supports multiple controllers for high availability (HA).
Add support for multiple controllers, each in a separate AZ.

Describe the solution you'd like
Currently the Slurm config, binaries, and state save location are stored on the EBS volumes of the controller.
These would need to be moved to a multi-AZ file system such as EFS or FSxN.
EFS may not have the IOPS required to keep the state save location from becoming a bottleneck.
Add a configuration parameter for the head node to allow 1-3 controllers.
Create the specified number of controllers and update slurm.conf appropriately.
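
A minimal sketch of what "update slurm.conf appropriately" could look like, assuming the controller hostnames and the shared state directory are already known. Slurm accepts multiple `SlurmctldHost` entries, treating the first as the primary and the rest as backups in order, and all controllers must see the same `StateSaveLocation`. The function name, hostnames, and mount path below are hypothetical and not part of this project.

```python
def render_slurmctld_hosts(hostnames, state_save_location):
    """Return slurm.conf lines for 1-3 controllers sharing one state directory.

    Hypothetical helper for illustration only; the 1-3 limit mirrors the
    range proposed in this issue.
    """
    if not 1 <= len(hostnames) <= 3:
        raise ValueError("expected between 1 and 3 controllers")
    if len(set(hostnames)) != len(hostnames):
        raise ValueError("controller hostnames must be unique")

    # The first SlurmctldHost entry is the primary controller;
    # the following entries are backups, tried in the order listed.
    lines = [f"SlurmctldHost={name}" for name in hostnames]

    # Every controller must be able to read and write this directory,
    # so it has to live on the shared multi-AZ file system.
    lines.append(f"StateSaveLocation={state_save_location}")
    return lines


if __name__ == "__main__":
    # Hypothetical hostnames and mount point.
    for line in render_slurmctld_hosts(
        ["slurmctl1", "slurmctl2"], "/shared/slurm/var/spool"
    ):
        print(line)
```

Keeping the list order explicit matters because it determines which controller takes over first on failure.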
