Improve import time of various stdlib modules #118761

Open
layday opened this issue May 8, 2024 · 67 comments
Labels
performance (Performance or resource usage), stdlib (Python modules in the Lib dir), topic-importlib, type-feature (A feature request or enhancement)

Comments

@layday

layday commented May 8, 2024

Feature or enhancement

Proposal:

Following on from #109653, further improvements can be made to import times.

Links to previous discussion of this feature:

https://discuss.python.org/t/deferred-computation-evalution-for-toplevels-imports-and-dataclasses/34173

For example:

importlib.metadata is often used for tasks that need to happen at import, e.g. to enumerate/load entry point plug-ins, so it might be worth seeing if we can cut down its own import time a bit more.

importlib.metadata imports zipfile at the top for a function that won't be called in the vast majority of cases. It also imports importlib.abc, which in turn imports importlib.resources, to subclass an ABC with a single, non-abstract method - I assume redefining the method in importlib.metadata would be harmless. Some other less frequently-used imports which are only accessed once or twice, such as json, could also be tucked away in their calling functions.
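The deferral suggested above can be sketched as follows. This is only an illustration of the pattern, not the actual importlib.metadata code; `read_direct_url` is a hypothetical helper name standing in for any rarely used code path.

```python
import sys


def read_direct_url(text):
    """Parse a JSON document, importing json only when this is called.

    Hypothetical helper: it stands in for any rarely used code path.
    """
    # Function-level import: the cost is paid on the first call instead of
    # when the enclosing module is imported. Later calls find the module
    # already cached in sys.modules, so the statement becomes cheap.
    import json

    return json.loads(text)


if __name__ == "__main__":
    print("json loaded before call:", "json" in sys.modules)
    print(read_direct_url('{"url": "https://example.invalid/pkg.whl"}'))
    print("json loaded after call:", "json" in sys.modules)
```

The trade-off is that the first call becomes slower and the import statement is re-executed (cheaply) on every call, so this fits best for functions that are called rarely.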


@AlexWaygood added the topic-importlib, performance, and type-feature labels May 8, 2024
@hugovk
Member

hugovk commented Aug 5, 2024

@layday Is it okay if I repurpose this issue as an "Improve import time of various stdlib modules" umbrella issue, like #109653 but for 3.14?

I've got some pprint improvements, and if we have importlib.metadata and some others, we can group them under the same umbrella issue like last time.

@layday
Author

layday commented Aug 5, 2024

Sure!

@danielhollas
Contributor

danielhollas commented Aug 6, 2024

I've opened a PR over at the importlib_metadata repo that avoids importing inspect. python/importlib_metadata#499

> importlib.metadata imports zipfile at the top for a function that won't be called in the vast majority of cases.
> Some other less frequently-used imports which are only accessed once or twice, such as json, could also be tucked away in their calling functions.

@layday were you planning on tackling these?

> It also imports importlib.abc, which in turn imports importlib.resources, to subclass an ABC with a single, non-abstract method

This seems to be solved on main, importlib.abc no longer imports importlib.resources.

@hugovk changed the title from "Further improve import time of importlib.metadata" to "Improve import time of various stdlib modules" Aug 6, 2024
@hugovk added the 3.14 (new features, bugs and security fixes) label Aug 6, 2024
hugovk added a commit that referenced this issue Aug 7, 2024
blhsing pushed a commit to blhsing/cpython that referenced this issue Aug 22, 2024
@danielhollas
Contributor

> I've opened a PR over at the importlib_metadata repo that avoids importing inspect. python/importlib_metadata#499

This has been merged and released in version 8.4 of importlib_metadata 🎉

> importlib.metadata imports zipfile at the top for a function that won't be called in the vast majority of cases. It also imports importlib.abc, which in turn imports importlib.resources, to subclass an ABC with a single, non-abstract method - I assume redefining the method in importlib.metadata would be harmless. Some other less frequently-used imports which are only accessed once or twice, such as json, could also be tucked away in their calling functions.

I've submitted python/importlib_metadata#502 that defers zip import, and python/importlib_metadata#503 which defers json and platform.

@picnixz removed the 3.14 (new features, bugs and security fixes) label Aug 31, 2024
@picnixz
Member

picnixz commented Aug 31, 2024

(removing the 3.14 label since features always target the main branch)

@hugovk
Member

hugovk commented Jan 7, 2025

Note for when documenting this in What's New in Python 3.14, can also include #128559 / #128560.

ebonnal pushed a commit to ebonnal/cpython that referenced this issue Jan 12, 2025
picnixz added a commit that referenced this issue Jan 14, 2025
Importing `pickle` is now roughly 25% faster.

Importing the `re` module is no longer needed, and
thus `re` is no longer implicitly exposed as `pickle.re`.

---------

Co-authored-by: Adam Turner <[email protected]>
@vstinner
Member

@picnixz: I suggest closing this issue.

@AA-Turner
Member

I've created a PR for ast (#131953) and a draft PR using self-overwriting descriptors to optimise textwrap (#131956).
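For readers unfamiliar with the term, a self-overwriting descriptor might look roughly like the sketch below. This is an assumption about the general technique, not the code in #131956; `LazyPattern` and `Wrapper` are hypothetical names, and `Wrapper` is not `textwrap.TextWrapper`.

```python
class LazyPattern:
    """Non-data descriptor that compiles a regex on first access.

    On that first access it replaces itself on the defining class with the
    compiled pattern, so later lookups are plain attribute reads.
    """

    def __init__(self, source):
        self.source = source

    def __set_name__(self, owner, name):
        # Remember where the descriptor lives so it can overwrite itself.
        self.owner = owner
        self.name = name

    def __get__(self, instance, owner=None):
        # Deferred import: re is only loaded once a pattern is needed.
        import re

        compiled = re.compile(self.source)
        # Overwrite the descriptor on the defining class; after this,
        # __get__ is never invoked again for this attribute.
        setattr(self.owner, self.name, compiled)
        return compiled


class Wrapper:  # hypothetical stand-in, not textwrap.TextWrapper
    whitespace_re = LazyPattern(r"\s+")

    def split(self, text):
        return self.whitespace_re.split(text)
```

Because the descriptor defines only `__get__`, it does not shadow instance attributes, and once it has replaced itself the class attribute is an ordinary compiled pattern with no per-access overhead.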

A

@TeamSpen210

Since typing defines a lot of different things, many of them unrelated to each other, I wonder if it'd be worth either breaking it up internally or putting code into functions, then defining a module-level __getattr__() to lazily construct groups of objects as necessary, since modules often want just one or two things.
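A module-level `__getattr__` (PEP 562) of the kind described here could be sketched as below. The module is built in memory so the example is self-contained; `lazy_mod`, `_FACTORIES`, and `Point` are hypothetical names, and nothing here reflects how typing is actually organised.

```python
import types

# Hypothetical module built in memory so the sketch is self-contained; in a
# real module, _FACTORIES and __getattr__ would simply live at top level.
lazy_mod = types.ModuleType("lazy_mod")


def _make_point_class():
    # The class body only executes when someone first asks for Point.
    class Point:
        def __init__(self, x, y):
            self.x, self.y = x, y

    return Point


_FACTORIES = {"Point": _make_point_class}


def _module_getattr(name):
    # Called only when normal attribute lookup on the module fails.
    try:
        factory = _FACTORIES[name]
    except KeyError:
        raise AttributeError(
            f"module 'lazy_mod' has no attribute {name!r}"
        ) from None
    value = factory()
    # Cache on the module so __getattr__ is not consulted again.
    setattr(lazy_mod, name, value)
    return value


# Equivalent to writing `def __getattr__(name): ...` in the module source.
lazy_mod.__getattr__ = _module_getattr
```

After the first `lazy_mod.Point` access the class is stored in the module namespace, so only the first lookup pays for the indirection.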

@ofek
Contributor

ofek commented Mar 31, 2025

I haven't looked at the code but it would be a big win if that is possible since basically everything imports that nowadays and there is no way to avoid it.

@JelleZijlstra
Member

@TeamSpen210 something like that could be worthwhile but the maintainability cost might be pretty high. Also, there's a risk that programs that do use a lot of things from typing get slower because of the extra indirection.

There might also be interpreter-level things we can do to speed up specific operations that often happen at import time. For example, creating an empty class is about 70x slower than creating an empty function—maybe we can improve that.
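A rough way to compare the two costs locally is a `timeit` micro-benchmark like the one below. It is a sketch only: the ratio depends on the interpreter version and machine, so it will not necessarily reproduce the 70x figure. `time_creation` is a hypothetical helper name.

```python
import timeit


def time_creation(number=20_000):
    """Return (seconds per empty class, seconds per empty function)."""
    # Each statement is executed `number` times; both create a fresh object.
    class_time = timeit.timeit("class C: pass", number=number)
    func_time = timeit.timeit("def f(): pass", number=number)
    return class_time / number, func_time / number


if __name__ == "__main__":
    per_class, per_func = time_creation()
    print(f"empty class:    {per_class * 1e9:10.1f} ns")
    print(f"empty function: {per_func * 1e9:10.1f} ns")
    print(f"ratio:          {per_class / per_func:10.1f}x")
```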

@AA-Turner
Member

Continuing on ast optimisations, I get a ~10% speed-up [1] by moving the deprecated classes (ast.slice onwards) into a __getattr__. Class creation is slow!

This does feel borderline, though. Any thoughts?

Footnotes

  1. From 2273.2µs to 2052.1µs (cumulative); from 683.0µs to 539.8µs (self).
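Self and cumulative timings of this kind are what CPython's `-X importtime` option reports; assuming that is how the figures were obtained, a small wrapper such as the following can collect them for a given module. `import_times` is a hypothetical helper name, and the parsing relies on the `import time: self | cumulative | name` layout that the option writes to stderr.

```python
import subprocess
import sys


def import_times(module):
    """Return {name: (self_us, cumulative_us)} from `-X importtime`."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    times = {}
    prefix = "import time:"
    for line in proc.stderr.splitlines():
        if not line.startswith(prefix):
            continue
        parts = line[len(prefix):].split("|")
        if len(parts) != 3:
            continue
        try:
            self_us, cumulative_us = int(parts[0]), int(parts[1])
        except ValueError:
            continue  # header row: "self [us] | cumulative | imported package"
        # Nested imports are indented in the name column; strip that.
        times[parts[2].strip()] = (self_us, cumulative_us)
    return times


if __name__ == "__main__":
    self_us, cumulative_us = import_times("ast").get("ast", (None, None))
    print("ast self [us]:", self_us, "cumulative [us]:", cumulative_us)
```

Single runs are noisy, so comparing the best of several runs before and after a change gives a more trustworthy picture.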

@JelleZijlstra
Member

I looked at some profiling data and feel there are good opportunities for optimizing class creation if someone is interested in hacking on typeobject.c: see #132042 for details. Making all class creation faster should help a lot with importing large modules like typing.

bell-sw pushed a commit to bell-sw/alpaquita-aports that referenced this issue Apr 3, 2025
JelleZijlstra added a commit to JelleZijlstra/cpython that referenced this issue Apr 4, 2025
annotationlib is used quite a few times in typing.py, but I think the
usages are just rare enough that this makes sense.

The import would get triggered by:
- Using get_type_hints(), evaluate_forward_ref(), and similar introspection
  functions
- Using a string annotation anywhere that goes through _type_convert (e.g.,
  "Final['x']" will trigger an annotationlib import in order to access the
  ForwardRef class).
- Creating a TypedDict or NamedTuple (unless it's empty or PEP 563 is on).

Lots of programs will want to use typing without any of these, so the tradeoff
seems worth it.
JelleZijlstra added a commit that referenced this issue Apr 4, 2025
@JelleZijlstra
Member

For typing, most of the remaining cost is now attributable to the imports of collections and functools. functools is unfortunately hard to avoid, because we need it in the _tp_cache decorator that gets executed at import time. collections is doable (by using the private _collections_abc module and some __getattr__ hacks), but functools also imports collections, so we don't currently gain anything by deferring the import in typing.
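Whether one module pulls in another can be checked on a given interpreter by importing it in a fresh subprocess and inspecting `sys.modules`. The helper below is a sketch with a hypothetical name (`modules_after_import`); results vary between Python versions, and modules already loaded at interpreter startup show up regardless of the import being tested.

```python
import subprocess
import sys


def modules_after_import(module):
    """Return the names in sys.modules after importing `module` in a fresh process."""
    code = (
        f"import {module}\n"
        "import sys\n"
        "for name in sorted(sys.modules): print(name)\n"
    )
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return set(proc.stdout.split())


if __name__ == "__main__":
    loaded = modules_after_import("functools")
    print("collections loaded via functools:", "collections" in loaded)
```

Running the same check for `typing` shows which of its dependencies are still imported eagerly on the interpreter at hand.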
