Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: improve handling of URLs in file functions. #369

Merged
merged 5 commits into from
Mar 26, 2025

Conversation

peterhuene
Copy link
Collaborator

@peterhuene peterhuene commented Mar 26, 2025

  • allow glob to operate off of file:// URLs.
  • allow size to operate off of file:// URLs.
  • allow basename to operate off of any URLs.

This PR also removes custom diagnostics from the stdlib functions so that all of the standard library functions now use function_called_failed diagnostics for errors. It gives a more consistent diagnostic for function calls.

Before submitting this PR, please make sure:

  • You have added a few sentences describing the PR here.
  • You have added yourself or the appropriate individual as the assignee.
  • You have added at least one relevant code reviewer to the PR.
  • Your code builds clean without any errors or warnings.
  • You have added tests (when appropriate).
  • You have updated the README or other documentation to account for these
    changes (when appropriate).
  • You have added an entry to the relevant CHANGELOG.md (see
    "keep a changelog" for more information).
  • Your commit messages follow the conventional commit style.
  • Your PR title follows the conventional commit style.

* allow `glob` to operate off of `file://` URLs.
* allow `size` to operate off of `file://` URLs.
* allow `basename` to operate off of any URLs.

This commit also removes custom diagnostics from the stdlib
functions so that all of the standard library functions now use
`function_called_failed` diagnostics for errors. It gives a more
consistent diagnostic for function calls.
@peterhuene peterhuene self-assigned this Mar 26, 2025
Copy link
Member

@a-frantz a-frantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two questions about using fewer hardcoded strings in the error messages. I'm not sure how feasible either request is, so maybe they can be ignored. Just trying to think of easing future maintenance.

Otherwise, LGTM

@peterhuene peterhuene requested a review from claymcleod March 26, 2025 19:29
@peterhuene
Copy link
Collaborator Author

peterhuene commented Mar 26, 2025

CI is failing due to docker hub rate limiting.

We need to fix the pull image functionality in Crankshaft to gate to a single request if there are a number of concurrent tasks all requesting to pull the same image; I think that may be contributing to the problem.

@peterhuene
Copy link
Collaborator Author

As a workaround, I think we could do what bollard does in CI and use a local docker hub container that we've pushed any images we need for the tests, which will limit the pull requests to Docker Hub to only the number of images needed to run the tests.

@a-frantz
Copy link
Member

all that sounds good, but a perhaps simpler workaround is to pull from quay.io instead of docker hub to avoid rate limits entirely 🤷

@peterhuene
Copy link
Collaborator Author

CI is failing on windows for the join_paths tests as absolute file paths on Windows are being successfully parsed as URLs (sigh).

@peterhuene
Copy link
Collaborator Author

peterhuene commented Mar 26, 2025

all that sounds good, but a perhaps simpler workaround is to pull from quay.io instead of docker hub to avoid rate limits entirely 🤷

As far as I know, there's no way to tell the local docker daemon to use an alternative default namespace for the image names, so we'd have to run the workflow/task evaluation tests with an engine config that overrides the default container and also update any explicit container names in the test WDLs (some of which are 1:1 with the example WDL in the spec).

Not terribly difficult to do, though. Alternatively, I think if we pulled ubuntu:latest at the start of CI, that'd also do the trick (the daemon would already see it locally and not fetch it from hub).

@peterhuene peterhuene merged commit e441e1e into stjude-rust-labs:main Mar 26, 2025
17 checks passed
@peterhuene peterhuene deleted the urls branch March 26, 2025 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants