gh-118761: Improve import time of annotationlib #132028

Open
wants to merge 2 commits into
base: main

Contributor

@DavidCEllis DavidCEllis commented Apr 2, 2025

This PR converts annotationlib.py into a package with annotationlib/__init__.py and annotationlib/_stringifier.py.

Discussed here: https://discuss.python.org/t/pep-749-implementing-pep-649/54974/63

The goal is to move the definition of _Stringifier into a new submodule so that the main module can defer importing ast.

The outcome of this is that ast should only be imported if any of the following occur:

  • get_annotations(obj, format=Format.STRING) [1] is used and the output is not an empty dict
  • get_annotations(obj, format=Format.FORWARDREF) is used and there are actually forward references
  • forwardref.__forward_arg__ is accessed and forwardref.__ast_node__ is not None

Note: I've used a class with a __getattr__ method as a way of deferring imports in the current PR but I'd be happy to change that to something else if there's a more standard pattern.
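For readers unfamiliar with the pattern mentioned in the note, here is a minimal sketch of a `__getattr__`-based lazy importer. It is not the code from this PR: the class name `LazyModules` and the instance name `lazy` are hypothetical, and the sketch only assumes the standard `importlib.import_module` function.

```python
import importlib


class LazyModules:
    """Hypothetical namespace whose attributes are modules imported on first access."""

    def __init__(self, *module_names):
        self._module_names = frozenset(module_names)

    def __getattr__(self, name):
        # __getattr__ only runs when normal attribute lookup fails,
        # i.e. before the module has been cached on the instance.
        if name.startswith("_") or name not in self._module_names:
            raise AttributeError(name)
        module = importlib.import_module(name)
        # Cache on the instance so later lookups never reach __getattr__ again.
        setattr(self, name, module)
        return module


# Declaring the names up front keeps the possible imports visible in one place.
lazy = LazyModules("ast", "functools")

# The import of ast happens here, on first attribute access.
tree = lazy.ast.parse("x + 1")
```

Because the module is stored in the instance dictionary after the first access, the import cost is paid once and later lookups are ordinary attribute reads.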

My machine isn't a super stable benchmarking machine so take these only as rough estimates (they're slightly different to those posted in the discuss thread as it's a different run).

This branch:

import time: self [us] | cumulative | imported package
...
import time:       688 |        688 |     types
import time:      3373 |       4061 |   enum
import time:       306 |        306 |   keyword
import time:       906 |       5272 | annotationlib

Main:

import time: self [us] | cumulative | imported package
...
import time:      4032 |       4032 |     _ast
import time:      1605 |       5637 |   ast
import time:       432 |        432 |     types
import time:      2027 |       2459 |   enum
import time:       126 |        126 |       itertools
import time:       205 |        205 |       keyword
import time:        77 |         77 |         _operator
import time:       341 |        418 |       operator
import time:       210 |        210 |       reprlib
import time:       357 |        357 |       _collections
import time:      1415 |       2729 |     collections
import time:       217 |        217 |     _functools
import time:      1396 |       4341 |   functools
import time:      2198 |      14634 | annotationlib
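
The tables above have the format printed by CPython's `-X importtime` option. A minimal sketch for collecting comparable numbers programmatically follows; the helper name `import_time_report` is hypothetical, and the figures will vary between machines and runs.

```python
import subprocess
import sys


def import_time_report(module_name):
    """Return (self_us, cumulative_us, name) rows reported by -X importtime."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    # -X importtime writes its report to stderr, one line per imported module.
    for line in result.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        fields = line.removeprefix("import time:").split("|")
        if len(fields) != 3 or not fields[0].strip().isdigit():
            continue  # skip the header row
        rows.append((int(fields[0]), int(fields[1]), fields[2].strip()))
    return rows
```

Calling `import_time_report("annotationlib")` on Python 3.14 should list the same modules as the tables, including interpreter startup imports that the tables elide with "...".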

Footnotes

  1. Or any similar function, such as call_annotate_function

@AA-Turner
Member

AA-Turner commented Apr 2, 2025

@DavidCEllis what do you think of AA-Turner@opt-annotationlib?

Instead of the lazy object, it uses self-replacing functions, so that the 'lazy' cost is only paid once. I think for ast.unparse you could probably just get away with a local import, but I haven't benchmarked/tested this.

def _Stringifier(*args, **kwds):
    # This function replaces itself with the real class when first called
    global _Stringifier
    from annotationlib._stringifier import Stringifier as _Stringifier
    return _Stringifier(*args, **kwds)

A

@AA-Turner AA-Turner changed the title from "gh-118761 - Improve import time of annotationlib by deferring imports of ast and functools" to "gh-118761: Improve import time of annotationlib" on Apr 2, 2025
@DavidCEllis
Contributor Author

DavidCEllis commented Apr 2, 2025

I don't mind the self-replacing functions if that's a more recognisable pattern. I personally prefer the class, as there's no observable placeholder object that needs to be replaced. Inspecting will only give you the actual function/module, and you can't end up holding a reference to a placeholder that never gets replaced.

[Edit: I also like that it puts information about all of the imports at the top of the module, so you can see that the module may import functools without having to search for the inline import statement.]

The lazy object also pays the cost only once, by assigning to the object after the first import. It's basically a hand-written version of an instance of my general lazy importer module.
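
The concern about references to the placeholder can be illustrated with a small sketch. It applies the self-replacing pattern to `json.dumps` purely as a stand-in; the names `dumps` and `early_ref` are hypothetical and not part of either branch.

```python
def dumps(*args, **kwargs):
    # Placeholder: rebinds the module-level name to json.dumps on first call.
    global dumps
    from json import dumps
    return dumps(*args, **kwargs)


early_ref = dumps  # reference taken before the first call
dumps([1, 2])      # first call performs the import and rebinds the global

print(dumps is early_ref)  # False: the global now names json.dumps
print(early_ref([3]))      # still works, but goes through the placeholder again
```

Any reference captured before the first call, such as `early_ref`, keeps pointing at the placeholder function, and introspecting it shows the wrapper rather than the real callable.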

@AA-Turner
Member

Benchmarking again with the recent changes to ast, the vast majority of the improvement comes just from deferring the functools import. We can gain another ~5ms by splitting the module and deferring the ast import.

I'll leave it up to @JelleZijlstra to decide which he prefers.

Current HEAD

import time: self [us] | cumulative | imported package
import time:      1537 |       1537 |     _ast
import time:      3552 |       5089 |   ast
import time:      4438 |       4438 |     types
import time:      5144 |       9581 |   enum
import time:       133 |        133 |       itertools
import time:      3329 |       3329 |       keyword
import time:        79 |         79 |         _operator
import time:      3214 |       3293 |       operator
import time:      3374 |       3374 |       reprlib
import time:       273 |        273 |       _collections
import time:      6617 |      17016 |     collections
import time:        72 |         72 |     _functools
import time:      3908 |      20995 |   functools
import time:      3612 |      39275 | annotationlib

This PR ('lazy')

import time: self [us] | cumulative | imported package
import time:      3438 |       3438 |     types
import time:      4468 |       7906 |   enum
import time:      3768 |       3768 |   keyword
import time:      5120 |      16793 | annotationlib_lazy

Self replacing functions

import time: self [us] | cumulative | imported package
import time:      3419 |       3419 |     types
import time:      4419 |       7838 |   enum
import time:      2701 |       2701 |   keyword
import time:      5530 |      16067 | annotationlib_self_replacing

Use a local import for functools

import time: self [us] | cumulative | imported package
import time:      1687 |       1687 |     _ast
import time:      3630 |       5316 |   ast
import time:      3850 |       3850 |     types
import time:      4770 |       8619 |   enum
import time:      3127 |       3127 |   keyword
import time:      3694 |      20755 | annotationlib_defer_functools

@JelleZijlstra
Member

I'm not sure this is worth it any more:

  • import ast is much faster with the recent changes.
  • collections and functools are already imported by typing.py, so deferring them here doesn't feel that valuable.

@DavidCEllis
Contributor Author

Probably fair on the scale of things, although perhaps you could skip the functools import altogether with something similar to how dataclasses avoids importing typing?

cpython/Lib/dataclasses.py

Lines 808 to 814 in d30052a

    typing = sys.modules.get('typing')
    if typing:
        if (_is_classvar(a_type, typing)
            or (isinstance(f.type, str)
                and _is_type(f.type, cls, typing, typing.ClassVar,
                             _is_classvar))):
            f._field_type = _FIELD_CLASSVAR

I'll note that needing annotationlib doesn't mean you also need typing. I use annotations for a dataclasses-like project where I've been very careful to keep import time down and so avoid importing typing at runtime. With Python 3.14, annotationlib on the other hand is somewhat unavoidable in the case that FORWARDREF is needed.
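
Applied to functools, the same idea could look roughly like the sketch below. This is an illustration of the `sys.modules.get` technique, not the actual change to annotationlib; the helper name `is_partial` is hypothetical.

```python
import sys


def is_partial(obj):
    """Check for functools.partial without importing functools ourselves."""
    # If functools was never imported, no partial object can normally exist,
    # so the check can be skipped without triggering the import.
    functools = sys.modules.get("functools")
    return functools is not None and isinstance(obj, functools.partial)


import functools  # stands in for user code that already imported functools

print(is_partial(functools.partial(print)))  # True
print(is_partial(print))                     # False
```

The check costs a dictionary lookup when functools is absent, and the module itself is only ever imported by code that actually creates partial objects.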

@JelleZijlstra
Member

Good idea regarding functools, I just did that myself in #132059 since the change is so simple.

I feel deferring the ast import in annotationlib goes too far. We need ast for both the FORWARDREF and STRING formats in get_annotations(), and if you're using annotationlib, you probably want one of those.

However, I do think it's realistic to defer the annotationlib import in typing.py; I'll see if I can make that happen.

@DavidCEllis
Contributor Author

With the STRING format you do need ast, and that's unavoidable, as the fake globals always use _Stringifier in that case.

With FORWARDREF however, _Stringifier will only be used if there are actual forward references that need to be replaced, so the ast import is only needed in this case.

Example with this PR:

import sys
from annotationlib import get_annotations, Format

class NoForwardRef:
    a: int

class YesForwardRef:
    # Any is deliberately never imported, so it is an unresolved forward reference
    a: Any

print(get_annotations(NoForwardRef, format=Format.FORWARDREF))
print("ast" in sys.modules)

print(get_annotations(YesForwardRef, format=Format.FORWARDREF))
print("ast" in sys.modules)

Output:

{'a': <class 'int'>}
False
{'a': ForwardRef('Any')}
True

I think it's likely that many tools that currently get annotations directly will need to switch to using annotationlib.get_annotations(obj, format=Format.FORWARDREF) to avoid the potential NameError or AttributeError results. This doesn't necessarily mean they want or need ForwardRef in all cases; they just need to 'work' as they did with 3.13 and not raise new exceptions if someone removes from __future__ import annotations.

Whether this goes too far is still arguable given the improvement to ast's own import time, but I just wanted to point out that with 3.14, using annotationlib doesn't necessarily mean you want the elements that require ast.

@python-cla-bot

python-cla-bot bot commented Apr 6, 2025

All commit authors signed the Contributor License Agreement.

