Avoid scheduling jobs if not all parallel jobs are ready #6049

Martchus · 2024-11-06T19:03:11Z

Some jobs of a parallel cluster can be blocked (by a chained parent or by a pending Gru task) while others can be scheduled. Before this change, the scheduler assigned the jobs that could be scheduled, which created a half-scheduled parallel cluster.

This is particularly problematic when `PARALLEL_ONE_HOST_ONLY=1` and `git_auto_update = yes` are used, because repairing half-scheduled clusters is then more challenging and a cluster is more likely to be partially blocked. In theory, this is also problematic when jobs within the parallel cluster depend on different asset downloads.

With this change, the scheduler skips the whole cluster until all of its jobs can be scheduled, which avoids half-scheduled clusters. Judging by the previous code and its comments, this was already the intent, but the code did not actually work correctly.
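The all-or-nothing rule described above can be illustrated with a minimal sketch. This is not the openQA implementation (which is Perl, in `lib/OpenQA/Scheduler/Model/Jobs.pm`); it is written in Python, and the function name `schedulable_jobs`, the cluster names, and the job IDs are hypothetical, chosen only to show the idea.

```python
from typing import Dict, List, Set


def schedulable_jobs(clusters: Dict[str, List[int]], blocked: Set[int]) -> List[int]:
    """Return the job IDs to schedule, skipping any cluster with a blocked member.

    Hypothetical helper for illustration only; not part of openQA.
    """
    result: List[int] = []
    for job_ids in clusters.values():
        # All-or-nothing: a single blocked job keeps the whole cluster waiting,
        # so no cluster is ever handed out half-scheduled.
        if any(job_id in blocked for job_id in job_ids):
            continue
        result.extend(job_ids)
    return result


# Job 2 stands in for a job blocked by a chained parent or a pending Gru task.
clusters = {"cluster_a": [1, 2, 3], "cluster_b": [4, 5]}
print(schedulable_jobs(clusters, blocked={2}))  # prints [4, 5]
```

In this sketch, jobs 1 and 3 stay unscheduled even though they are not blocked themselves, because their cluster sibling (job 2) is. Once job 2 becomes unblocked, the whole cluster is returned together.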

I tested the rewritten code by manually creating a half-blocked cluster and observing the scheduler's behavior before and after this commit.

Related ticket: https://progress.opensuse.org/issues/169342

t/05-scheduler-dependencies.t

codecov · 2024-11-06T19:42:31Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.98%. Comparing base (a1c8732) to head (6045c27).
Report is 14 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #6049   +/-   ##
=======================================
  Coverage   98.98%   98.98%           
=======================================
  Files         395      395           
  Lines       39436    39441    +5     
=======================================
+ Hits        39036    39041    +5     
  Misses        400      400

b10n1k · 2024-11-08T11:26:04Z

lib/OpenQA/Scheduler/Model/Jobs.pm

        my $tobescheduled = _to_be_scheduled($j, $scheduled_jobs);
+        if (!defined $tobescheduled) {


Correct me if I'm wrong, but I think `_to_be_scheduled_recurse` doesn't return undef. Shouldn't that be just `!$tobescheduled`?

We call `_to_be_scheduled` here, and that returns undef.

okurz approved these changes Nov 6, 2024

okurz reviewed Nov 6, 2024

t/05-scheduler-dependencies.t (outdated, resolved)

Martchus force-pushed the fix-scheduler branch from acd1633 to 6045c27 November 6, 2024 19:22

okurz approved these changes Nov 6, 2024

b10n1k reviewed Nov 8, 2024

b10n1k approved these changes Nov 8, 2024

mergify bot merged commit 2c4a234 into os-autoinst:master Nov 8, 2024
46 checks passed

Martchus deleted the fix-scheduler branch November 8, 2024 14:18
