uv run repeatedly compiles bytecode with UV_COMPILE_BYTECODE=1 #12202

Open
mwaskom opened this issue Mar 16, 2025 · 10 comments
Labels
bug Something isn't working

Comments

@mwaskom

mwaskom commented Mar 16, 2025

Summary

Starting with uv 0.5.5, uv run will compile bytecode files on every invocation when UV_COMPILE_BYTECODE=1 is set:

$ uv init
Initialized project `pyc-woe`
$ UV_COMPILE_BYTECODE=1 uv run python -c ''
Using CPython 3.13.0
Creating virtual environment at: .venv
Bytecode compiled 1 file in 39ms
$ UV_COMPILE_BYTECODE=1 uv run python -c ''
Bytecode compiled 1 file in 36ms
$ UV_COMPILE_BYTECODE=1 uv run python -c ''
Bytecode compiled 1 file in 35ms

While not shown here, all packages in the venv get their bytecode recompiled, so this can add a lot of latency for large environments.

My understanding is that UV_COMPILE_BYTECODE should be telling uv to compile bytecode when installing packages:

Equivalent to the --compile-bytecode command-line argument. If set, uv will compile Python source files to bytecode after installation

Instead, it seems to be effectively setting "ignore the pyc cache and recompile on every invocation of uv run". Maybe that's expected on your end, but we found it very surprising.

Note that this behavior was introduced in 0.5.5. I'm guessing it's a result of this PR.

Platform

Linux 5.15.0-101.103.2.1.el9uek.x86_64 x86_64 GNU/Linux

Version

uv 0.6.6 (as mentioned, the behavior was introduced in 0.5.5)

Python version

Python 3.9.18 when making the repro, but it's not Python version dependent

@mwaskom mwaskom added the bug Something isn't working label Mar 16, 2025
@zanieb
Member

zanieb commented Mar 16, 2025

cc @konstin

@charliermarsh
Member

We do recompile the entire environment, as opposed to only compiling newly-installed packages. However, IIUC, it's not quite correct to say that the behavior is "ignore the pyc cache and recompile on every invocation of uv run", since bytecode compilation only actually runs if a file's timestamp or size changes. (See, e.g., https://docs.python.org/3/library/py_compile.html#py_compile.PycInvalidationMode.)
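
For readers unfamiliar with the check referenced above: a timestamp-based .pyc file records the source file's modification time and size in its 16-byte header, and a mismatch with the source marks the .pyc as stale. Below is a minimal sketch of that comparison using only the standard library. The helper name `pyc_is_fresh` is made up for illustration; it is not part of uv or CPython, and it does not claim to mirror how uv performs the check.

```python
import importlib.util
import os
import py_compile
import struct
import tempfile


def pyc_is_fresh(source_path: str, pyc_path: str) -> bool:
    """Hypothetical helper: does a timestamp-based .pyc still match its source?"""
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    # Header layout: magic number, flags, source mtime, source size.
    magic, flags, mtime, size = struct.unpack("<4sLLL", header)
    if magic != importlib.util.MAGIC_NUMBER or flags != 0:
        return False  # written by another interpreter, or a hash-based .pyc
    st = os.stat(source_path)
    return (mtime == int(st.st_mtime) & 0xFFFFFFFF
            and size == st.st_size & 0xFFFFFFFF)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "mod.py")
    with open(src, "w") as f:
        f.write("x = 1\n")
    # Force timestamp mode so the header carries mtime and size.
    pyc = py_compile.compile(
        src, invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP
    )
    fresh_before = pyc_is_fresh(src, pyc)
    with open(src, "a") as f:
        f.write("y = 2\n")  # the source size changes, so the .pyc goes stale
    fresh_after = pyc_is_fresh(src, pyc)

print(fresh_before, fresh_after)
```

The second check fails on size alone, even if the edit lands within the same second as the original write.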

@mwaskom
Author

mwaskom commented Mar 17, 2025

However, IIUC, it's not quite correct to say that the behavior is "ignore the pyc cache and recompile on every invocation of uv run", since bytecode compilation only actually runs if a file's timestamp or size changes.

What's causing the timestamp or size to change here? E.g. if I expand my environment a little bit

$ uv pip install fastapi[standard]
Resolved 34 packages in 155ms
Prepared 34 packages in 172ms
Installed 34 packages in 19ms
 ...

Then it sure looks like the entire site-packages is getting recompiled on every uv run:

$ UV_COMPILE_BYTECODE=1 uv run python -c ''
Bytecode compiled 1178 files in 522ms
$ UV_COMPILE_BYTECODE=1 uv run python -c ''
Bytecode compiled 1178 files in 40ms
$ UV_COMPILE_BYTECODE=1 uv run python -c ''
Bytecode compiled 1178 files in 44ms

@charliermarsh
Member

Look at the difference in time though? 522ms to 40ms. The subsequent runs short-circuit (on a per-file basis) if the timestamp hasn't changed. This is a feature that's built-in to CPython's bytecode compiler, not uv.
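
The per-file short-circuit can be observed with the standard library's `compileall`, used here as a stand-in because the thread does not show how uv invokes the compiler. In this sketch, the existing .pyc is tagged with an arbitrary old modification time (`OLD`, an illustrative constant) before each run, so a rewrite is detectable. The function name `compile_and_report` is likewise an assumption made for the example.

```python
import compileall
import importlib.util
import os
import py_compile
import tempfile

MODE = py_compile.PycInvalidationMode.TIMESTAMP
OLD = 1_000_000_000  # arbitrary old timestamp used to tag the existing .pyc


def compile_and_report(src: str, pyc: str) -> bool:
    """Run compileall on src; return True if the .pyc file was rewritten."""
    os.utime(pyc, (OLD, OLD))  # changes only the .pyc's own mtime, not its contents
    compileall.compile_file(src, quiet=2, invalidation_mode=MODE)
    return int(os.stat(pyc).st_mtime) != OLD


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "mod.py")
    with open(src, "w") as f:
        f.write("x = 1\n")
    compileall.compile_file(src, quiet=2, invalidation_mode=MODE)  # initial compile
    pyc = importlib.util.cache_from_source(src, optimization="")

    # Source untouched: compileall sees a matching header and skips the file.
    rewritten_unchanged = compile_and_report(src, pyc)

    # Simulate an edit: new contents and a clearly different source mtime.
    with open(src, "w") as f:
        f.write("x = 2\n")
    st = os.stat(src)
    os.utime(src, (st.st_atime, st.st_mtime + 10))
    rewritten_changed = compile_and_report(src, pyc)

print(rewritten_unchanged, rewritten_changed)
```

Even when a file is skipped, the compiler still has to stat the source and read the .pyc header, which is consistent with the residual per-run cost reported in this thread.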

@mwaskom
Author

mwaskom commented Mar 17, 2025

40 ms of extra latency for each uv run is still not ideal though? And that's just for one requested package + its dependencies. Copying this log from one of our users:

root@modal:/code/crafty$ UV_COMPILE_BYTECODE=1 uv run python -c ""
Bytecode compiled 11165 files in 9.90s
root@modal:/code/crafty$ UV_COMPILE_BYTECODE=1 uv run python -c ""
Bytecode compiled 11165 files in 1.77s
root@modal:/code/crafty$ UV_COMPILE_BYTECODE=1 uv run python -c ""
Bytecode compiled 11165 files in 1.76s
root@modal:/code/crafty$ UV_COMPILE_BYTECODE=1 uv run python -c ""
Bytecode compiled 11165 files in 1.75s

So, OK fair enough that it's not ignoring the cache, but the point remains that it's a surprising and harmful behavior.

(I guess a separate issue is whether the log should say "Bytecode compiled n files" if that's not what it's actually doing).

@charliermarsh
Member

Why not enable bytecode compilation at install time, then, rather than globally?

(I'm not sure that there are other great options here. E.g., if we didn't re-check the bytecode compilation in uv run even when enabled, then uv run --compile-bytecode python -c "" wouldn't compile bytecode at all. IIRC it actually worked that way in the past and we (justifiably) received issues.)

@mwaskom
Author

mwaskom commented Mar 17, 2025

Why not enable bytecode compilation at install time, then, rather than globally?

We enable it globally because uv's default behavior w/r/t not compiling during the install is very harmful for containerized workflows. We are trying to help disarm this footgun for our users. And there is not always a clear "install time" distinction from the platform's perspective. Still, we will need to work on that if this is expected behavior for uv, because this outcome is probably worse.

FWIW, I am not a sophisticated user of uv (just passing on issues our users are having with it), but I don't think I really understand the rationale here. My mental model is that uv run can install/update packages on each invocation. If it does that, then I would expect it to also recompile bytecode. (It still feels like it should only scan the new packages, but maybe that's not possible, I don't know.) The surprising behavior here is forcing bytecode compilation (or at least a scan of the pyc cache or something) even though nothing about the environment has changed and nothing needed to be installed. Maybe that's the wrong model though.

@mwaskom
Author

mwaskom commented Mar 17, 2025

then uv run --compile-bytecode python -c "" wouldn't compile bytecode at all

Do you mean that Python itself won't invalidate the pyc cache / write new pyc files when the program executes? Or just that uv won't do "just in time" pre-compilation in this case?

It's not obvious to me that pre-compilation is very meaningful in the uv run context, since there's no separation from the user's perspective between "install time" and "run time" there. But again, I haven't thought much about exactly what happens when you uv run.

@charliermarsh
Member

We enable it globally because uv's default behavior w/r/t not compiling during the install is very harmful for containerized workflows.

That makes sense. We typically recommend setting this in (e.g.) the install commands within the Dockerfile.

Do you mean that Python itself won't invalidate the pyc cache / write new pyc files when the program executes? Or just that uv won't do "just in time" pre-compilation in this case?

The latter.

This is good feedback, we'll think on it, thanks for engaging. We should consider skipping bytecode compilation in uv run if no packages are installed. I just know we've received issues about that in the past, which I believe prompted this change. The other option is to change the bytecode compilation model more broadly to only compile newly-installed packages. cc @konstin
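
As a rough illustration of the second option (compiling only newly installed packages), here is a sketch built on the standard library's `compileall`. It is not uv's implementation. The function `compile_new_packages` and the directory names are hypothetical, and it assumes each new package maps to one top-level directory in site-packages, which real distributions do not always satisfy.

```python
import compileall
import os
import tempfile


def compile_new_packages(site_packages: str, new_packages: list[str]) -> bool:
    """Hypothetical: compile only the listed top-level package directories."""
    ok = True
    for name in new_packages:
        pkg_dir = os.path.join(site_packages, name)
        # compile_dir returns a false value if any file failed to compile.
        ok = bool(compileall.compile_dir(pkg_dir, quiet=2)) and ok
    return ok


with tempfile.TemporaryDirectory() as site:
    # Two fake packages; only "new_pkg" is treated as freshly installed.
    for name in ("old_pkg", "new_pkg"):
        os.makedirs(os.path.join(site, name))
        with open(os.path.join(site, name, "__init__.py"), "w") as f:
            f.write("VALUE = 1\n")

    succeeded = compile_new_packages(site, ["new_pkg"])
    new_has_cache = os.path.isdir(os.path.join(site, "new_pkg", "__pycache__"))
    old_has_cache = os.path.isdir(os.path.join(site, "old_pkg", "__pycache__"))

print(succeeded, new_has_cache, old_has_cache)
```

With this approach, the cost of a run scales with what was installed rather than with the size of the whole environment, at the price of never repairing stale bytecode in packages that were installed earlier.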

@mwaskom
Author

mwaskom commented Mar 17, 2025

That makes sense. We typically recommend setting this in (e.g.) the install commands within the Dockerfile.

Yeah, everything would be easier if we could assume users would carefully read all the documentation for their tools :)

In our case (providing a platform for containerized applications), users can experience significantly degraded performance if they don't realize that uv pip install behaves very differently from pip install in this respect. And it's very hard for users to correctly attribute slow container startup to the true cause; they just assume our system is not performant. So we have opted to be conservative and set the provided environment variable, since it's always the desired behavior at install time, and we did not expect that there would be any runtime consequences.
