-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fs: utils: copyfile: reset permissions on macos reflinks #245
Conversation
@@ -189,6 +191,9 @@ def copyfile( | |||
|
|||
try: | |||
system.reflink(src, dest) | |||
if sys.platform == "darwin": | |||
# NOTE: reflink on macos also clones src permissions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So to be clear, what pyreflink is doing is correct, then, right? It is always preserving permissions. It is not doing any chmod operations on macOS because clonefile does that already, but pyreflink is copying permissions on Linux as ficlone does not do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, to be a 1:1 replacement with pyreflink, it has to be:
reflink(src, dst)
if sys.platform == "linux":
shutil.copymode(src, dst)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I would say that cloning permissions is correct, as it seems inconsistent, a bit wasteful and also kinda unfinished, since it only copies mode but not other stuff (like shutil.copystats). E.g. for us the fact that pyreflink does that on linux is a complete waste, as we need to reset permissions ourselves anyway.
So yeah, we are chmodding different perms and for different reasons. And pyreflink is just trying to be consistent between linux and macos (thouhg I'm not sure if macos clone file clones stats in addition to just mode).
Did you test on Linux with CoW filesystems btw? |
@skshetry Nope, just mac for now. |
@efiop, looks like we need to chmod for linux too. About 10 tests are failing for me when I run with: TMPDIR=$(realpath ../tmp); mkdir $TMPDIR
pytest -nlogical Most of those are import related and have |
For me,
creates a file with executable bit set. Looks like this is a problem that I introduced with that optimization. |
@skshetry Thanks for testing it! I'll send a fix ASAP. |
@skshetry what filesystem do you use? I have a suspicion that it depends on the filesystem really (we tend to mixup platform and filesystem a lot when we discuss this, but it is really up to the fs on how it wants to handle it). |
I am not sure what the fix should be. The One way might be to set |
I use |
I see https://github.com/python/cpython/blob/cc7e45cc572dd818412a649970fdee579417701f/Lib/_pyio.py#L1550 |
@skshetry We just need to normalize the perms, that's what we've done before. Could you try with dvc-objects 1.4.8, please? I don't have linux COW fs handy right now 🙁 |
I'm pretty sure it was me, I removed chmod that we used to have completely and only kept it for macos after that. |
hmm, I'm a bit worried about world-readable file that we are creating with this. It could have some security implications. I see that Maybe we should do the same here? |
@skshetry We just use the umask, so the file we create is the same as what the user would create from scratch. Same as shutil.copyfile too. The normal way to go about it is to use umask anyway, so I don't think we need to do anything special here and pyreflink's approach actually looks wrong to me. |
yeah. I'd still sleep better today if we merge this #248. |
@skshetry So you confirm everything works fine now for you with reflink on btrfs, right? |
yes, everything works, all the tests pass when I change tmpdir to btrfs. |
Thank you, @skshetry ! 🙏 |
This is likely because, when you are copying file with a restricted permission, you'll end up creating a world-readable file, exposing contents to others. So the approach feels correct to me. Regarding dvc, with dvc's cache, I am not sure it is relevant. At least it's hard to say it is. |
@skshetry Didn't realise it was setting that mode before cloning. Indeed, you are correct about that. |
Related #238