Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for randomness / hashed substitution in a trigger schedule #1421

Open
adampetrovic opened this issue Oct 12, 2024 · 15 comments
Open
Labels
enhancement New feature or request

Comments

@adampetrovic
Copy link

Describe the feature you'd like to have.
Currently the trigger schedule directive on ReplicationSource supports a crontab style schedule or a static directive such as daily, yearly etc.

For those that are using a templatized ReplicationSource, many PVCs will inherit the schedule from a single template and therefore have the same trigger time, leading to many backup operations being triggered at the same time.

See example of such a template here: https://github.com/adampetrovic/home-ops/blob/main/kubernetes/templates/volsync/minio.yaml#L34

Some cron implementations use H as a hashed substitution for a value:

'H' is used in the Jenkins continuous integration system to indicate that a "hashed" value is substituted. Thus instead of a fixed number such as '20 * * * *' which means at 20 minutes after the hour every hour, 'H * * * *' indicates that the task is performed every hour at an unspecified but invariant time for each task. This allows spreading out tasks over time, rather than having all of them start at the same time and compete for resources

With this approach, volsync could take the name of the job as the hash value to evaluate to say a minute within the hour.

What is the value to the end user? (why is it a priority?)

Better load spreading of backups

How will we know we have a good solution? (acceptance criteria)

A user is able to defer random execution / jitter of backup times to the volsync operator instead of having to manage it themselves.

Additional context

@adampetrovic adampetrovic added the enhancement New feature or request label Oct 12, 2024
@tesshuflower
Copy link
Contributor

This is an interesting one, potentially something could be done to add a simplified version of this - most likely by:

  • replace H values using a hash based on the replicationsource/dest name
  • put the new trigger schedule into the status
  • if there is a trigger schedule in the status, that will be used (hash will not get computed again)

I don't want to promise this, but sounds like a neat idea if there is time to work on it.

@adampetrovic It looks like the Jenkins implementation may allow for more complex substitutions such as H/10 or H(1-3)?

@adampetrovic
Copy link
Author

Ideally this is something delegated to the cron library and all volsync needs to worry about is passing through the job name for hashing.

Looking at https://github.com/robfig/cron, the last PR merged was 4 years ago, so I'm not going to blindly assume that route will be a fruitful one.

@adampetrovic It looks like the Jenkins implementation may allow for more complex substitutions such as H/10 or H(1-3)?

Yes it does, again it feels off though to have to think about implementing custom cron parsing in volsync.

If volsync were open to using a fork of the cron library, I could have a go at implementing this.

@adampetrovic
Copy link
Author

adampetrovic commented Oct 17, 2024

I have implemented basic support (just standard H and steps H/20) in my fork, as a reference.

I have extended the API to add a function ParseWithJobName which allows for a job name to be parsed, which is used for consistent hashing to the same value.

adampetrovic/cron@4c91fea

@tesshuflower
Copy link
Contributor

I think my initial reaction is I'd rather not point at a fork, but let me think about it a bit.

With a fork:

  • someone needs to maintain it (maybe not much since the original cron project doesn't seem to be getting updates), what happens when fixes are needed etc.
  • if the fork goes away, then I need to get this hash functionality back to maintain behaviour (once hashed cron schedules are in the project)

If we put hashing code into volsync, we could also show the computed schedule in the status, so a user would be able to see what the hash resolved to which might be kind of nice. Disadvantage is it's not so clean and contained in cron parsing code.

@bo0tzz
Copy link
Contributor

bo0tzz commented Oct 18, 2024

Rather than forking, what about switching to a different cron library that's more actively maintained? If they are willing to add hash support, of course.

@onedr0p
Copy link
Contributor

onedr0p commented Oct 18, 2024

I couldn't find an existing library that supports setting a jitter in the cron format. I have to agreed that the jenkins cron format would be ideal.

The author of robfig/cron was asking about to adding H support but closed the issue 😢 Judging by the authors Github activity they might not be coming back to support the project.

@tesshuflower Another option could be (if upstream is truly abandoned) that backube could fork robfig/cron and support what's requested here in the library. I get that adding another project to your org might stretch your resources a bit so I totally get why you wouldn't want to; however, if adding this jitter / randomness / hashed substitution helps with both volsync and snapscheduler getting this feature maybe it can be considered?

@adampetrovic
Copy link
Author

Another option could be (if upstream is truly abandoned) that backube could fork robfig/cron and support what's requested here in the library

I was going to suggest the same thing.

Given it's 4 years since the last commit I can't imagine the maintenance overhead being large. I'd be happy to volunteer my time to help with this, if needed

@tesshuflower
Copy link
Contributor

Thanks for the offer @adampetrovic but perhaps we just put the specific hashing code into VolSync itself - this gives us more control and we can still keep it separate from the actual cron implementation.

Steps:

  • parse cron schedule with hashing (e.g. H * * * *). I was wrong above, I think we still need to compute this every time, in case a user changes the schedule
  • put the computed schedule into the status (e.g. 20 * * * *)
  • use the computed schedule from the status (same code as current)

This is something I can help with when I have some time - I'll take a look at your hashing function at some point and can perhaps use that (or I can get you to contribute it directly here if you prefer).

Questions:

  • What range of use-cases would we want to support when it comes to H?

@adampetrovic
Copy link
Author

adampetrovic commented Oct 22, 2024

What range of use-cases would we want to support when it comes to H?

I think the primary set of features that cron already has like:

  • Ranges H(0-30) to support the ability to limit the range within a specific time period (e.g. all backups within the first 30 minutes of the hour)
  • Steps H/15 or H(0-30)/15 - which would allow for say, 15 minute intervals 3, 18, 33, 48 or 3, 18 (ranged)
  • Basic value separators perhaps? H(0-30),45 - 3, 18, 45. This would allow for a combination of randomness (within a range) and a deterministic value statically provided.

@adampetrovic
Copy link
Author

adampetrovic commented Oct 22, 2024

Are you sure you want to do this in volsync? There's a tonne of parsing / repacking unrelated to the Hash feature that would need to be implemented from scratch that already exists in the cron implementation.

For what it's worth, I've now also added support for hashes with a range in my fork.

@tesshuflower
Copy link
Contributor

I probably wasn't clear - if we support hashing, we'd just parse the hash values out, compute and put into the status - then scheduling would look at that resolved cron schedule (e.g H * * * * resolved to 20 * * * *) and figure out the next scheduled time using the existing cron parsing code.

@adampetrovic
Copy link
Author

adampetrovic commented Oct 22, 2024

Yes, I am aware :)

Volsync would need to basically parse all possible permutations of a cron expression into individual fields and sub-fields within those fields as well as worry about the specifics of the Hash expression. Not impossible by any means, just a lot of edge cases to deal with for something that volsync shouldn't be directly concerned with (IMO).

For example:

0,1,H(5-10)/2,30 * * * * * involves parsing each field (split by <space>), then within each field split by , and then the mechanics of the hash implementation, then replacing the H(5-10)/2 into a comma separated list of values coming out of the hash calculation.

@adampetrovic
Copy link
Author

I also forgot that the parsing function would need to know which field it is parsing (seconds, minutes, hours etc) in order to generate a valid set of values within a range.

@tesshuflower
Copy link
Contributor

agreed, it's not super simple, which I think is why we'd want to keep the set of use-cases we support for H to be constrained to something that would be useful to you, at least at first.

@adampetrovic
Copy link
Author

For me just a simple H would suffice. For others I can see a use case for wanting to restrict the output to a certain range (e.g first 30 minutes in an hour) with H(0-30)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: No status
Development

No branches or pull requests

4 participants