
Releases: DataBiosphere/toil

8.1.0b1

12 Mar 19:00
8.1.0b1 Pre-release

Note: this is a beta release with new functionality immediately needed by Cactus.

Highlighted Features Added

  • Toil can now publish workflow execution metrics to Dockstore for workflows where it knows the TRS ID (i.e. workflows run by TRS ID or Dockstore URL) (#5159).
    • Toil now saves workflow execution history in ~/.toil/history.sqlite
    • Toil will prompt the user to decide whether to publish workflow metrics and, if it thinks it can reach the user, will not proceed until they decide or a timeout elapses.
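
To see what has been recorded locally, the history file can be opened with any SQLite client. The following is a minimal Python sketch, assuming only that ~/.toil/history.sqlite is an ordinary SQLite database. These notes do not describe its schema, so the sketch lists table names instead of querying specific columns. `list_history_tables` is a hypothetical helper name, not part of Toil.

```python
import os
import sqlite3
from pathlib import Path

# Location taken from the release note above.
DEFAULT_HISTORY_DB = Path(os.path.expanduser("~/.toil/history.sqlite"))


def list_history_tables(db_path=DEFAULT_HISTORY_DB):
    """Return the names of the tables in the database, sorted alphabetically.

    Hypothetical helper: it assumes nothing about Toil's actual schema.
    """
    # Open read-only so that inspecting the file can never modify it.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    if DEFAULT_HISTORY_DB.exists():
        print(list_history_tables())
    else:
        print(f"No history database at {DEFAULT_HISTORY_DB}")
```

Opening the file in read-only mode is a deliberate choice here: a running workflow may be writing to the same database.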

CWL

  • Add caching to toil-cwl-runner. Use --cachedir [dir] to enable, and avoid rerunning previously cached jobs. (#4298)
  • Add CWL badges to documentation (#5183)
  • Improve error message when a file is not found in CWL (#5174)

Misc

  • Toil will no longer select GPU or high-priority Slurm partitions when it does not need to. (#5223)
  • Allow handling unlimited number of jobs on Slurm (#5064)
  • Add hack to automatically un-stick jobs that can be un-stuck by lsof. (#5214)
  • Re-added support for --time in TOIL_SLURM_ARGS (#5230)
  • Added --slurmPartition and --slurmGPUPartition options for manual Slurm partition selection. (#5231)
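
As an illustration of manual partition selection, the sketch below assembles a toil-cwl-runner command line in Python without running it. Only --slurmPartition and --slurmGPUPartition come from these notes. The partition names, file names, and job store path are placeholders, and `build_slurm_command` is a hypothetical helper.

```python
import shlex


def build_slurm_command(workflow, inputs, partition, gpu_partition,
                        job_store="./jobstore"):
    """Build (but do not run) a toil-cwl-runner command line for Slurm.

    All argument values are placeholders chosen by the caller.
    """
    return [
        "toil-cwl-runner",
        "--batchSystem", "slurm",
        "--jobStore", job_store,
        "--slurmPartition", partition,         # partition for ordinary jobs
        "--slurmGPUPartition", gpu_partition,  # partition for GPU jobs
        workflow,
        inputs,
    ]


if __name__ == "__main__":
    argv = build_slurm_command("workflow.cwl", "inputs.yml", "general", "gpu")
    # shlex.join renders the list as a copy-pasteable shell command.
    print(shlex.join(argv))
```

Keeping the command as a list until the last moment avoids quoting mistakes if it is later passed to `subprocess.run`.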

Thank you to our contributors:
@stxue1 @mr-c @adamnovak

8.0.0

12 Feb 18:34

Highlighted Features Added

  • toil debug-job now has --retrieveTaskDirectory <dir> which will set up a job's downloaded files under <dir> and try to stop the job after doing the downloads. Jobs can call self.files_downloaded_hook() to provide a stopping point for this mode. (#4815)
  • toil debug-job can now reconstruct the inside-the-container environment for CWL and WDL tasks. (#4815)
  • Added support for caching on Slurm and other HPC schedulers (#4775)
  • Replace all instances of boto2 with boto3 for all Toil AWS code (#4718)
  • Add support for Python 3.12 (#4718)
  • Add support for Python 3.13 (#5145)
  • Ceph input/output errors from file locking functions are now tolerated. (#4874)
  • Toil now uses flock to enable directory locks to work properly (#4924)
  • Added support to get Slurm partitions and automatically send jobs to GPUs on Slurm (#4833) (supports both CWL and WDL)
  • New --symlinkJobStoreReads=False option lets you force local-node copies (possibly in the cache) even when reading directly from a FileJobStore is possible, potentially reducing shared filesystem IO. (#4673)
  • Toil now supports reading and writing MiniWDL's call cache. (#4797)
  • Toil now supports running CWL and WDL workflows from Dockstore, by using either a Dockstore page URL or TRS ID as the URL/filename of the workflow to run. Since these often contain ? or #, remember to quote them on the command line! (#5049)
  • Add support for parallel file imports (#5114)
    • New argument --importWorkersThreshold. This specifies the threshold where files will begin to be imported on individual jobs. Small files will be batched into the same import job up to this threshold.
    • --importWorkersDisk defaults to 1 MiB. Should be increased when download streaming is not possible on a worker.
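
The quoting advice for Dockstore URLs and TRS IDs matters because most shells treat `#` as the start of a comment and `?` as a glob character. A small Python sketch of safe quoting follows. The TRS ID is a made-up placeholder, and `shell_command_for` is a hypothetical helper.

```python
import shlex

# Hypothetical TRS ID, for illustration only.
TRS_ID = "#workflow/github.com/example-org/example-repo:main"


def shell_command_for(workflow_ref, runner="toil-wdl-runner"):
    """Return a shell-safe command string for running a workflow reference."""
    # shlex.quote wraps the argument in single quotes when it contains
    # characters the shell would otherwise interpret, such as '#' or '?'.
    return f"{runner} {shlex.quote(workflow_ref)}"


if __name__ == "__main__":
    print(shell_command_for(TRS_ID))
```

When typing the command by hand, wrapping the URL or TRS ID in single quotes has the same effect.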

Breaking Changes

CWL

  • Prevent simultaneous Singularity container pulls in toil-cwl-runner (#4990)
  • Added support to import files on workers for toil-cwl-runner (#5025)
    • --runImportsOnWorkers to enable importing files on workers
    • --importWorkersDisk to control how much disk space the import worker will use
  • Don't error when passing through input as the output (#5138)
  • CWL jobs with dynamic requirements now have input type checking properly protected by their conditionals. (#4930)
  • Fixed a LoadListing bug with CWL workflows (#5149)
  • Fix CWL Workflow Slurm memory test (#5151)
  • workDir and jobStore now default to tmp-outdir-prefix (#5154)
  • CWL container prepull: no reason to if extensions are enabled, they are now supported by cwl-utils 0.36+ (#5188)
  • CWL container prepull: skip if --no-containers is specified (#5188)

WDL

  • Update WDL conformance tests on CI (#4875)
  • Added support to run task only WDL files (#4960)
  • Added support for the gpu field in WDL (#4949)
  • Support passing inputs into toil-wdl-runner for task only WDLs (#4977)
  • toil-wdl-runner will now carry through task exit codes (#4978)
  • toil-wdl-runner will respect explicit null values for optional inputs (#4981)
  • toil-wdl-runner will not immediately error on nonexistent coerced files until outputted (#4994)
    • File? type for string to file coercion is now supported (will be nullified)
  • WDL output files will now live in directories named after their tasks instead of UUID directories (#5008)
  • Fixed a bug with conditional statements inside a WDL scatter (#5055)
  • toil-wdl-runner now correctly finds and returns outputs from tasks in scatters and conditionals when a WDL workflow lacks an output section. (#5094)
  • toil-wdl-runner has a new --allCallOutputs option to allow including all calls' outputs in a workflow's output. (#5093)
  • toil-wdl-runner can now detect and try not to delete the outputs of a workflow that is meant to use the Cromwell Output Organizer (croo). Note that croo still can't actually work on the output of toil-wdl-runner. (#5093)
  • --allCallOutputs no longer discards WDL workflow outputs section outputs. (#5106)
  • File virtualization in toil-wdl-runner now only happens at task boundaries (#5028)
    • File to String coercion should be supported
  • Added support to import files on workers for toil-wdl-runner (#5103)
  • Support WDL 1.1 disk specification as per spec (#5001)
  • Fixed a bug with WDL file imports (#5121)

Kubernetes

Dependencies

  • Toil can now use connexion 4 (#5196)
  • Toil now uses htcondor 23.6 or 24, which are still on PyPI

Misc

  • Makefile: use isolated builds, add dist target (sdist+wheel) and deprecate the sdist target. (#4820) (#4826)
  • Toil will now wait --jobStoreTimeout seconds (default: 30) to see an update to/removal of a job that was run, and will not let the job succeed unless it is seen to make progress. (#3814)
  • Toil job descriptions no longer have a command field, and we track the link to the job body and the command to invoke the Toil worker separately. (#4811)
  • Several typos in the docs were fixed (#4889)
  • Add a test to ensure batchsystem plugins are installable (#4879)
  • Fix Toil utils to work without the AWS extra (#4953)
  • Print commit hash with toil --version when installed from source. Before: 7.1.0a1. After: 7.1.0a1-ccf57e6071e32675daabdcbacb91988e871745a9 (#4954)
  • Fixed a broken URL and an omitted variable in CI tests (#4974)
  • Generate default config correctly (#5014)
  • Use the latest setuptools when running cactus. (#5017)
  • Toil will refuse to proceed if it detects that its coordination directory or a Singularity cache directory it needs to lock is on Ceph, to prevent hanging the Ceph MDS (#4972)
  • Fix a NotImplementedError in the Grid Engine batchsystem (#5061)
    • Added basic Grid Engine CI tests
  • Update Cactus on CI to 2.9.0 (#5062)
  • Separate out create/delete iam role functions into lib.
  • Remove deprecated pipes module (#5122)
  • New --slurmTime/TOIL_SLURM_TIME setting to set the time limit on Slurm jobs in a way Toil itself understands. (#5010)
  • New --slurmPE argument to allow setting a parallel-job Slurm partition without using TOIL_SLURM_PE (#5010)
  • New --slurmArgs argument to allow specifying extra Slurm submission arguments without using TOIL_SLURM_ARGS (#5010)
  • For non-GPU jobs on Slurm, Toil will submit the job to a partition with a time limit long enough to accommodate the configured runtime (from --slurmTime). (For GPU jobs, the lowest-priority GPU partition is still always used.) (#5010)
  • Toil now has a --slurmDefaultAllMem option to run jobs lacking their own memory requirements with Slurm's --mem=0, so they get a whole node's memory. (#4971)
  • toil-cwl-runner now has --no-cwl-default-ram (and --cwl-default-ram) to control whether the CWL spec's default ramMin is applied, or Toil's own default memory logic is used. (#4971)
  • The --dont_allocate_mem and --allocate_mem options have been deprecated and replaced with --slurmAllocateMem, which can be True or False. (#4971)
  • Added WDL unit tests to CI (#5110)
  • Mesos build updated. (#5049)
  • CWL and WDL argument parsing revised for Python 3.12. (#5049)
  • Organize stats and logging files into stats/inbox and stats/archive and avoid a circular rename. (#1727)
  • Added proper FTP support for jobstores (#5134)
  • URL existence and size gets/checks are now done with HEAD requests (#5134)
  • Dependabot configuration should now pass schema validation and is itself under CI (#5175)
  • Toil now tests a version of Cactus that ought to run on Python 3.13. (#5184)
  • WDL conformance tests on Kubernetes may now run for 30 minutes. (#5185)
  • When importing files on workers, fall back to importing on the leader when file sizes are not obtainable (#5135)
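
For readers unfamiliar with HEAD-based checks, the sketch below shows the general technique with Python's standard library. It is not Toil's implementation, and `build_head_request` and `remote_size` are hypothetical names. A HEAD request returns only headers, so existence and size can be checked without downloading the body. A result of `None` corresponds to the case where the size is not obtainable.

```python
import urllib.request


def build_head_request(url):
    """Create a request that asks only for headers, not the body."""
    return urllib.request.Request(url, method="HEAD")


def remote_size(url, timeout=10):
    """Return the Content-Length of a URL in bytes, or None if not reported.

    Raises urllib.error.URLError (or its subclass HTTPError) if the URL
    cannot be reached or the server reports an error such as 404.
    """
    with urllib.request.urlopen(build_head_request(url), timeout=timeout) as response:
        length = response.headers.get("Content-Length")
    return int(length) if length is not None else None
```

Servers are not required to send Content-Length, which is why a caller needs a fallback path for an unknown size.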

Thank you to our contributors: @stxue1, @DailyDreaming, @adamnovak, @mr-c, @gmloose, @davidjsherman!

7.0.0

21 May 22:25

What's Changed

6.1.0

08 May 18:55
3f9cba3

Highlighted Features Added

  • WDL and CWL task standard output and standard error logs that are not captured by the workflow will now be logged at INFO level and stored in the --writeLogs/--writeLogsGzip directory. (#4657)
  • Use a default log limit of 100MiB (#4788)

Breaking Changes

  • Stats and logging system again uses job display name (#4755)
  • --disableProgress is once again a flag that doesn't take an argument (#4758)

CWL

  • Don't clear out user-provided values for the --default-container option (#4730)

WDL

  • WDL job names now include numbers for scatters (#4755)
  • Multi-line WDL placeholder substitutions no longer interfere with de-indenting WDL command blocks (chanzuckerberg/miniwdl#665)
  • Standard error for failed tasks is now always logged to the worker log somewhere (#4781)

Kubernetes

Dependencies

  • Deps: removed the ruamel.yaml.string plugin dependency for a simpler solution (#4760)

Misc

  • Toil will no longer warn about a missing XDG_RUNTIME_DIR (#4769)
  • Read the Docs and CI docs builds should have Graphviz installed (pending CI image rebuild) (#4734)
  • Improve Python 3.12 compatibility by replacing strtobool(), the one function from distutils that we use. (#4765)
  • Set default cache folders to be accessible between toil-wdl-runner workflows (Same as MiniWDL/Singularity defaults) (#4761)
  • Set toil-wdl-runner cache folders on Toil managed clusters to be at /var/lib/toil (#4761)
  • Fall back to assuming machine has 1 core when CPU count is unavailable. (#4545)
  • FileJobStore now supports filenames that get modified when percent-encoded (#4779)
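
Since distutils was removed from the standard library in Python 3.12, code that relied on `distutils.util.strtobool()` needs its own replacement. The following is a sketch of one, not Toil's actual code. It accepts the same spellings distutils did, but returns a `bool` instead of 1 or 0 and tolerates surrounding whitespace.

```python
def strtobool(value):
    """Convert a truth-like string to a bool.

    Accepts the spellings distutils accepted; raises ValueError otherwise.
    """
    normalized = value.strip().lower()
    if normalized in {"y", "yes", "t", "true", "on", "1"}:
        return True
    if normalized in {"n", "no", "f", "false", "off", "0"}:
        return False
    raise ValueError(f"invalid truth value {value!r}")
```

Raising on unrecognized input, instead of defaulting to False, keeps a misspelled setting from being silently ignored.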

Thank you to our contributors:

@DailyDreaming @mr-c @stxue1 @adamnovak @app/dependabot

Full Changelog: releases/6.0.0...releases/6.1.0

6.0.0

16 Jan 19:40

NOTE!

We now have a config file! https://toil.readthedocs.io/en/latest/running/cliOptions.html#the-config-file

Breaking Changes

  • Removed the parasol batch system
  • Removed the TES batch system (this is now a plugin)
  • Removed our WDL compiler in favor of an interpreter (we still support WDL, we just do it differently now)
  • We no longer support python3.7

CWL

  • Support CWL 1.2.1 (#4682)
  • CWL Pipefish compatibility (#4636)
  • Support per-task preemptibility in CWL (#4551)
  • Fix configargparse in CWL (#4618)
  • cwl: use the latest commit from the proposed CWL v1.2.1 branch (#4565)
  • Upgrade cwltool to avoid broken galaxy-tool-util release. (#4639)
  • Implement a better config file system for CWL/WDL options (#4666)
  • Allow working with remote files in CWL and WDL workflows (#4690)
  • Make cwl mutually exclusive groups exist only when cwl is not suppressed (#4725)
  • Log more usefully for CWL workflows (#4736)

WDL

  • Simplify WDL Toil job graphs (#4524)
  • More WDL and Slurm documentation (#4558)
  • Improve WDL documentation (#4732)
  • Add String to File functionality into toil-wdl-runner (#4589)
  • Run WDL output through Toil export system to support URIs (#4579)
  • Allow the WDL output section to reference itself (#4592)
  • Ensure sibling files in toil-wdl-runner (#4610)
  • Make WDLOutputJob collect all task outputs (#4602)
  • Report errors in WDL using MiniWDL's error location printer (#4637)
  • Remove the WDL compiler. (#4679)
  • Implement a better config file system for CWL/WDL options (#4666)
  • Allow working with remote files in CWL and WDL workflows (#4690)
  • Strip leading whitespace from WDL commands (#4720)

Misc

  • Add config file support (#4569)
  • Support Python3.11 and drop Python 3.7 (#4646)
  • Move TES batch system to a plugin (#4650)
  • Turn batch system tests back on (#4649)
  • Separate out integration tests to run on a schedule (#4612)
  • Avoid concurrent modification in cluster scaler tests (#4600)
  • Remove old buckets from AWS (#4588)
  • Tests: only request a single core (#4572)
  • Reduce the number of assert statements (#4590)
  • Treat any nvidia-smi exception as not having a GPU (#4611)
  • More resiliency (#4395)
  • Remove usage of the deprecated pkg_resources (#4701)
  • Make sure cwltool always knows we have an outdir to fix #4698 (#4699)
  • AWS jobStoreTest: re-use delete_s3_bucket from toil.lib.aws (#4700)
  • Only count output file usage when using the file store (#4692)
  • Remove the parasol batch system. (#4678)
  • Move around reqs and move aws dev libraries to aws (#4664)
  • Make sure the --batchLogsDir exists if it is set (#4635)
  • Update EC2 instances and EC2 update script. (#4745)
  • remove extraneous dependency on old 'mock' (#4739)
  • Point CI at the new public URLs for stuff we host
  • Add __init__.py to options folder (#4723)

Bug Fixes

  • Lower redirect log level to fix #4526 (#4578)
  • Fix mypy from being broken by new boto types (#4577)
  • Fix CI on local Gitlab runners (#4571)
  • Banish ghost jobs (#4563)
  • Stop deleting chained-to jobs which fail as orphaned jobs (#4557)
  • Fix pickling error when jobstate file doesn't exist and fix threading error when lock file exists then disappears (#4575)
  • Fix #3867 and try to explain but not crash when bad things happen to our mutex file (#4656)
  • Fix CI Appliance Builds (#4655)
  • Tolerate a failed AMI polling attempt (#4727)
  • Add pure Python fallback for getDirSizeRecursively() (#4753)
  • Don't mark inputs (or outputs) executable for no reason (#4728)
  • Fix scheduled CI tests (#4742)
  • Fix --printJobInfo (#4709)
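
To illustrate what a pure Python directory-size fallback can look like, here is a minimal sketch using `os.walk`. It does not reproduce Toil's getDirSizeRecursively(); `dir_size_recursively` is a hypothetical name, and the sketch sums apparent file sizes, not disk blocks actually used.

```python
import os


def dir_size_recursively(path):
    """Sum the sizes in bytes of all regular files under path.

    Symbolic links are skipped, so linked files are not counted.
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.lstat(full).st_size
            except OSError:
                # The file may vanish while we walk; skip it.
                continue
    return total
```

Tolerating files that disappear mid-walk matters for working directories that other processes are still cleaning up.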

Thank you to our contributors: @stxue1 , @w-gao, @DailyDreaming , @mr-c , @adamnovak , @glennhickey, @misterbrandonwalker, and @a-detiste !

5.12.0

27 Jul 03:19
6d5a5b8

WDL

  • Virtualize filenames as in-container paths from point of view of WDL command (#4527)
  • Add WDL conformance tests to CI (#4530)
  • Use less memory in the Giraffe WDL test (#4541)

Version Upgrades

  • Upgrade to cwltool 3.1.20230601100705 (#4500)
  • Update mock requirement from <5,>=4.0.3 to >=4.0.3,<6 (#4366)

Misc

  • Anonymous access to Google Storage (#4518)
  • Reorder config so that default settings are applied first (#4528)
  • Add a way to forward accelerators to Docker containers (#4492)

Bug Fixes

  • Fix test failures without docker installed (#4544)
  • Prevent certain tests from being run twice in CI (#4529)
  • Drop external Docker builder (#4523)
  • Fix CI lint test (#4533)
  • Grab AWS group policies on top of user (#4505)
  • Grab accelerator set off the end of the list instead of by index (#4506)
  • Fix RtD build (#4491)
  • Include tests (#4499)

Thank you to our contributors: @stxue1 , @DailyDreaming , @mr-c , @adamnovak , and @tjni !

5.11.0

15 Jun 15:17

Breaking Changes

  • Imported files will be symlinked by default, unless the user sets --noLinkImports or the workflow imports with symlink=False. (#3949)

WDL

  • Toil will now stop if it encounters an error polling a possible import URL for a WDL workflow input file. (#4479)
  • WDL workflows will be protected against imported files with no basenames. (#4477)

Misc

  • Toil batch system ID numbers for issued jobs now start at 1. (#4482)
  • Attempts to import files from URLs when the implementing job store is missing an extra are now better reported. (#4479)
  • Include tests in the source distribution that gets published to PyPI (#4499)

Bug Fixes

  • Toil should no longer crash when a delete wins a race against a load in FileJobStore (#4484)
  • Prevent local root jobs (such as WDLRootJob) from being run twice. (#4482)
  • Slurm and other grid batch system jobs will now have more informative names (#4472)
  • WDL workflows can no longer import "" as a File. (#4477)

Thank you to our contributors: @stxue1, @DailyDreaming, @mr-c, @adamnovak

5.10.0

18 May 09:03
21422a3

Changelog

Highlighted Features Added

  • Add a --caching option which explicitly states whether to use caching with a workflow. Uses a default value depending on whether or not we are using the file job store if not specified. (#4218)
  • New prototype WDL runner python -m toil.wdl.wdltoil using MiniWDL (#3468)
  • MiniWDL-based WDL implementation can now run the vg Giraffe WDL workflow (#4353)
  • Toil now tests against our own tiny set of WDL conformance tests (#4351)
  • Toil can run the HPRC assembly WDL workflows (#4435)
  • Toil can now use Mesos roles (#4455)

Breaking Changes

  • Replace "preemptable" with "preemptible", add example of using --defaultPreemptible flag to Preemptibility documentation (#1951)

CWL

  • CWL: run all ExpressionTools on the Leader node, instead of submitting separate jobs (#4157)

Kubernetes

  • Kubernetes batch system: Delete jobs individually when batch delete fails (#3403)
  • Documentation for running a Toil leader for a Kubernetes workflow outside Kubernetes now covers examples and common problems for running CWL workflows (document toil-cwl-runner + "Running the Leader Outside Kubernetes" #3422)
  • Kubernetes batch system: support --maxCores, --maxDisk, and --maxMemory (#2864)
  • Add tutorial for Kubernetes launch cluster (#3743)

Dependencies

  • Require htcondor 10 exactly (#4315)
  • Toil jobs now have a local parameter which determines if they should run on the leader. (#4388)

Misc

  • The offline tests can now be run in parallel (#3493)
  • Code updated to be more idiomatic for Python3.7 (#4295)
  • Support for a --network for toil launch-cluster for Google cloud (#4196)
  • Support for a --use_private_ip for toil launch-cluster to dial nodes by private IP instead of public IP (#4196)
  • GPU scheduling should now be supported on Slurm (#4308)
  • Toil now supports a --batchLogsDir option and TOIL_BATCH_LOGS_DIR environment variable, to provide a directory other than the work dir where Toil will instruct HPC batch systems to save their captured job logs.
  • htcondor batch system should now work again, and will retry connections
  • Updated the --coalesceStatusCalls help documentation to reflect the current state of #4431 (#4437)
  • Toil no longer trusts XDG_RUNTIME_DIR under Slurm (fixes some of the issues behind #4395 when Slurm is configured not to follow the XDG spec) (#4435)
  • Toil now puts its lock files for Singularity cache directories for WDL in those directories (#4435)
  • Toil's WDL interpreter can now use local-to-the-leader jobs for evaluating WDL code that doesn't need appreciable resources (#4388)
  • Toil now tolerates more possible exceptions related to the panasas network file system (#4440)
  • Type hinting to functions in resource.py (#938)
  • Added return type to inVirtualEnv() in __init__.py (#938)
  • Added None checks to some function bodies (#938)

Bug Fixes

  • Stop crashing when predefined batch job exit reasons are used and need to go into the message bus log file (#4321)
  • Added import subprocess to restore the behavior of #588. (#4429)
  • Toil will no longer use the stored message bus path from an old execution of a workflow when deciding where to save the message bus log when restarting a workflow (#4438)
  • Fix --custom-net mutual exclusivity bug. (#4458)

Thank you to our contributors: @stxue1 , @DailyDreaming , @mr-c , @adamnovak , @jfennick , @misterbrandonwalker , @w-gao , @stephanaime , @glennhickey , @Hexotical , @manabuishii @gmloose , @boukn , and @thiagogenez !

5.9.2

04 Feb 05:38

Changelog

Bug Fixes

  • Change build tag import (#4329)

Thank you to our contributors: @adamnovak , @Hexotical !

5.9.0

03 Feb 06:04
8155e0a

Changelog

Bug Fixes

  • Fix --provisioner and --metrics together (#4328)
  • Ignore incorrect type hint from boto3, remove json.loads (#4330)
  • Warn about missing --bypass-file-store with in-place update (#4337)
  • Replace prepareHTSubmission with prepareSubmission in HTCondor (#4319)
  • Merge "Google fixes" (#4293)
  • Support (only) current htcondor (#4320)
  • Delete k8s jobs individually when batch delete fails (#4306)

Misc

  • Update aws spot documentation (#4310)
  • Enable parallel testing (#3493)
  • Add documentation for running CWL workflows on non-Toil-managed Kubernetes clusters (#4332)
  • Export all slurm args by default (#4237)
  • Allow for subclasses of base types in messages (#4322)
  • Non cache default (#4299)

Dependencies

  • Bump mypy from 0.982 to 0.991 (#4345)
  • Bump schema-salad>=8.4.20230128170514,<9 to schema-salad>=8.3.20220913105718,<8.4 (#4342) (#4341)
  • Bump cwltool from 3.1.20221008225030 to 3.1.20221201130942 (#4338)
  • Bump pyupgrade to 3.7 (#4295)

Thank you to our contributors: @adamnovak , @Hexotical , @w-gao, @mr-c , @gmloose , @boukn , and @thiagogenez !