forked from bazelbuild/rules_docker
-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[bre] Use a local registry #13
Open
gabrielrussoc
wants to merge
20
commits into
databricks
Choose a base branch
from
gabrielrussoc/local-registry-instead-of-load
base: databricks
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
[bre] Use a local registry #13
gabrielrussoc
wants to merge
20
commits into
databricks
from
gabrielrussoc/local-registry-instead-of-load
+344
−2
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
gabrielrussoc
changed the title
Gabrielrussoc/local registry instead of load
[bre] Use a local registry
Nov 8, 2024
gabrielrussoc
force-pushed
the
gabrielrussoc/local-registry-instead-of-load
branch
from
November 13, 2024 16:18
5a6bb1e
to
f7ca769
Compare
gabrielrussoc
commented
Nov 14, 2024
@@ -201,7 +201,7 @@ EOF | |||
|
|||
# On macOS, clean all xattrs from the files we're going to load. | |||
if [ "$(uname)" == "Darwin" ]; then | |||
echo "Cleaning xattrs from files on macOS..." | |||
echo "Cleaning xattrs from files on macOS..." >&2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kevinqian-db this log line actually broke macOS as I explained on Slack since it polluted the output
I'm just logging it to stderr to fix it
terencesun-db
approved these changes
Nov 14, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The main motivation of this pull request is to use
docker pull
instead ofdocker load
for loading images.Despite the name, we are not pulling anything from the network but instead we pull from a local binary we spin up with our images loaded.
This makes a lot of sense because
docker pull
is a really smart and optimised command that can pull only the missing layers and will avoid doing a lot of extra work.docker load
on the other hand is very simple and it requires tar ball with ALL the layers and it will always write them to the data directory regardless if the directory already has it or not. That's exactly why the code was so complicated and tried to optimize this tarball by only selecting the missing layers. All that is gone withdocker pull
.At Databricks, we even tried our best to not
docker load
in universe by trying to check if the image was already in the daemon before calling these rules, and that's also now obsolete.docker pull
is also much better to work with RBE because it acknowledges the snapshotter and can do a better job with storage (see comments in the code).Of course, the downside is that we have to maintain a local registry binary. However, it's a very small and straightforward binary that implements a battle tested API. It's not long lived and all it does is to store all the layers and serve them to
docker pull
when it asks. It can only serve the layers from a particular target.macOS notice: Unfortunately this does not work well in macOS because we can't easily pull from local registries. It won't allow HTTP and it won't trust self signed HTTPS certs, unless users do a lot of manual configuration on their docker desktop. So we keep the original behaviour for macOS. See https://stackoverflow.com/questions/76034521/docker-log-in-to-local-registry-with-docker-desktop-for-mac.
tests
I ran multiple docker tests on both linux and macOS