diff --git a/README.rst b/README.rst index 95714df..58b6460 100644 --- a/README.rst +++ b/README.rst @@ -15,9 +15,9 @@ defer-imports :alt: PyPI supported Python versions -A library that implements `PEP 690 `_–esque lazy imports in pure Python. +A library that implements `PEP 690 `_–esque lazy imports in pure Python. -**NOTE: This is still in development.** +**Note: This is still in development.** .. contents:: @@ -28,7 +28,7 @@ A library that implements `PEP 690 `_–esque Installation ============ -**Requires Python 3.9+** +**Note: Requires Python 3.9+** This can be installed via pip:: @@ -43,7 +43,7 @@ See the docstrings and comments in the codebase for more details. Setup ----- -To do its work, ``defer-imports`` must hook into the Python import system in multiple ways. Its path hook, for instance, should be registered before other user code is parsed. To do that, include the following somewhere such that it will be executed before your code: +To do its work, ``defer-imports`` must hook into the Python import system. To do that, include the following call somewhere such that it will be executed before your code: .. code-block:: python @@ -57,7 +57,7 @@ To do its work, ``defer-imports`` must hook into the Python import system in mul import your_code -It can also be used as a context manager, which makes sense when passing in arguments to adjust the hook: +The function call's result can be used as a context manager, which makes sense when passing in configuration arguments. That way, unrelated code or usage of this library isn't polluted: .. code-block:: python @@ -67,17 +67,17 @@ It can also be used as a context manager, which makes sense when passing in argu import your_code -Making this call without arguments allows user code with imports contained within the ``defer_imports.until_use`` context manager to be deferred until referenced. 
However, it provides several configuration parameters for toggling global instrumentation (affecting all import statements) and for adjusting the granularity of that global instrumentation. +Making this call without arguments allows user code with imports contained within the ``defer_imports.until_use`` context manager to be deferred until referenced. However, its several configuration parameters allow toggling global instrumentation (affecting all import statements) and adjusting the granularity of that global instrumentation. -**WARNING: Consider using the hook as a context manager form when using these configuration parameters; otherwise, the explicit (or implicit) configuration will persist and may cause other packages using ``defer_imports`` to behave differently than expected.** +**WARNING: Avoid using the hook as anything other than a context manager when passing in configuration; otherwise, the explicit (or default) configuration will persist and may cause other packages using ``defer_imports`` to behave differently than expected.** .. code-block:: python import defer_imports # Ex 1. Henceforth, instrument all import statements in other pure-Python modules - # so that they are deferred. Off by default. If on, it has priority over the other - # kwargs. + # so that they are deferred. Off by default. If on, it has priority over any other + # configuration passed in alongside it. # # Better suited for applications. defer_imports.install_import_hook(apply_all=True) @@ -118,78 +118,22 @@ Assuming the path hook was registered normally (i.e. without providing any confi **WARNING: If the context manager is not used as ``defer_imports.until_use``, it will not be instrumented properly. 
``until_use`` by itself, aliases of it, and the like are currently not supported.** -If the path hook *was* registered with configuration, then within the affected modules, all global import statements will be instrumented with two exceptions: if they are within ``try-except-else-finally`` blocks, and if they are within non- ``defer_imports.until_use`` ``with`` blocks. Such imports are still performed eagerly. These "escape hatches" mostly match those described in PEP 690. +If the path hook *was* registered with configuration, then within the affected modules, most module-level import statements will be instrumented. There are two supported exceptions: import statements within ``try-except-else-finally`` blocks and within non- ``defer_imports.until_use`` ``with`` blocks. Such imports are still performed eagerly. These "escape hatches" mostly match those described in PEP 690. Use Cases --------- -- If imports are necessary to get symbols that are only used within annotations, but such imports would cause import chains. - - - The current workaround for this is to perform the problematic imports within ``if typing.TYPE_CHECKING: ...`` blocks and then stringify the fake-imported, nonexistent symbols to prevent NameErrors at runtime; however, the resulting annotations raise errors on introspection. Using ``with defer_imports.until_use: ...`` instead would ensure that the symbols will be imported and saved in the local namespace, but only upon introspection, making the imports non-circular and almost free in most circumstances. - -- If expensive imports are only necessary for certain code paths that won't always be taken, e.g. in subcommands in CLI tools. - - -Extra: Console --------------- - -``defer-imports`` works while within a regular Python REPL, as long as that work is being done in a package being imported and not with direct usage of the ``defer_imports.until_use`` context manager. 
To directly use the context manager in a REPL, use the included interactive console. - -You can start it from the command line:: - - > python -m defer_imports - Python 3.11.9 (tags/v3.11.9:de54cf5, Apr 2 2024, 10:12:12) [MSC v.1938 64 bit (AMD64)] on win32 - Type "help", "copyright", "credits" or "license" for more information. - (DeferredInteractiveConsole) - >>> import defer_imports - >>> with defer_imports.until_use: - ... import typing - ... - >>> import sys - >>> "typing" in sys.modules - False - >>> typing - - >>> "typing" in sys.modules - True - -You can also start it while within a standard Python REPL: - -.. code-block:: pycon - - >>> from defer_imports import interact - >>> interact() - Python 3.11.9 (tags/v3.11.9:de54cf5, Apr 2 2024, 10:12:12) [MSC v.1938 64 bit (AMD64)] on win32 - Type "help", "copyright", "credits" or "license" for more information. - (DeferredInteractiveConsole) - >>> import defer_imports - >>> with defer_imports.until_use: - ... import typing - ... - >>> import sys - >>> "typing" in sys.modules - False - >>> typing - - >>> "typing" in sys.modules - True - -Additionally, if you're using IPython in a terminal or Jupyter environment, there is a separate function you can call to ensure the context manager works there as well: - -.. code-block:: ipython - - In [1]: import defer_imports - In [2]: defer_imports.instrument_ipython() - In [3]: with defer_imports.until_use: - ...: import numpy - ...: - In [4]: import sys - In [5]: print("numpy" in sys.modules) - False - In [6]: numpy - In [7]: print("numpy" in sys.modules) - True +- Anything that could benefit from overall decreased startup/import time if the symbols resulting from imports aren't used at import time. + + - If module-level, expensive imports aren't used in commonly run code paths. + + - A good fit for this is a CLI tool and its subcommands. + + - If imports are necessary to get symbols that are only used within annotations. 
+ + - Such imports can be unnecessarily expensive or cause import chains depending on how one's code is organized. + - The current workaround for this is to perform the problematic imports within ``if typing.TYPE_CHECKING: ...`` blocks and then stringify the fake-imported, nonexistent symbols to prevent NameErrors at runtime; however, the resulting annotations will raise errors if ever introspected. Using ``with defer_imports.until_use: ...`` instead would ensure that the symbols will be imported and saved in the local namespace, but only upon introspection, making the imports non-circular and almost free in most circumstances. Features @@ -197,10 +141,10 @@ Features - Supports multiple Python runtimes/implementations. - Supports all syntactically valid Python import statements. -- Doesn't break type-checkers like pyright and mypy. +- Cooperates with type-checkers like pyright and mypy. - Has an API for automatically instrumenting all valid import statements, not just those used within the provided context manager. - - Allows escape hatches for eager importing via ``try-except-else-finally`` and ``with`` blocks. + - Has escape hatches for eager importing: ``try-except-else-finally`` and ``with`` blocks. Caveats @@ -209,10 +153,13 @@ Caveats - Intentionally doesn't support deferred importing within class or function scope. - Eagerly loads wildcard imports. - May clash with other import hooks. -- Can have a (relatively) hefty one-time setup cost from invalidating caches in Python's import system. -- Can't automatically resolve deferred imports when a namespace is being iterated over, leaving a hole in the abstraction. - - This library tries to hide its implementation details to avoid changing the developer/user experience. 
However, there is one leak in its abstraction: when using dictionary iteration methods on a dictionary or namespace that contains a deferred import key/proxy pair, the members of that pair will be visible, mutable, and will not resolve automatically. PEP 690 specifically addresses this by modifying the builtin ``dict``, allowing each instance to know if it contains proxies and then resolve them automatically during iteration (see the second half of its `"Implementation" section `_ for more details). Note that qualifying ``dict`` iteration methods include ``dict.items()``, ``dict.values()``, etc., but outside of that, the builtin ``dir()`` also qualifies since it can see the keys for objects' internal dictionaries. + - Examples of popular packages using clashing import hooks: |typeguard|_, |beartype|_, |jaxtyping|_, |torchtyping|_, |pyximport|_ + - It's possible to work around this by reaching into ``defer-imports``'s internals and combining its instrumentation machinery with that of another package's, but it's currently not supported well beyond the presence of a ``loader_class`` parameter in ``defer_imports.install_import_hook()``'s signature. + +- Can't automatically resolve deferred imports in a namespace when that namespace is being iterated over, leaving a hole in its abstraction. + + - When using dictionary iteration methods on a dictionary or namespace that contains a deferred import key/proxy pair, the members of that pair will be visible, mutable, and will not resolve automatically. PEP 690 specifically addresses this by modifying the builtin ``dict``, allowing each instance to know if it contains proxies and then resolve them automatically during iteration (see the second half of its `"Implementation" section `_ for more details). Note that qualifying ``dict`` iteration methods include ``dict.items()``, ``dict.values()``, etc., but outside of that, the builtin ``dir()`` also qualifies since it can see the keys for objects' internal dictionaries. 
As of right now, nothing can be done about this using pure Python without massively slowing down ``dict``. Accordingly, users should try to avoid interacting with deferred import keys/proxies if encountered while iterating over module dictionaries; the result of doing so is not guaranteed. @@ -220,11 +167,11 @@ Caveats Why? ==== -Lazy imports alleviate several of Python's current pain points. Because of that, `PEP 690 `_ was put forth to integrate lazy imports into CPython; see that proposal and the surrounding discussions for more information about the history, implementations, benefits, and costs of lazy imports. +Lazy imports alleviate several of Python's current pain points. Because of that, `PEP 690 `_ was put forth to integrate lazy imports into CPython; see that proposal and the surrounding discussions for more information about the history, implementations, benefits, and costs of lazy imports. Though that proposal was rejected, there are well-established third-party libraries that provide lazy import mechanisms, albeit with more constraints. Most do not have APIs as integrated or ergonomic as PEP 690's, but that makes sense; most predate the PEP and were not created with that goal in mind. -Existing libraries that do intentionally inject or emulate PEP 690's semantics in some form don't fill my needs for one reason or another. For example, `slothy `_ (currently) limits itself to specific Python implementations by relying on the existence of call stack frames. I wanted to create something similar that doesn't rely on implementation-specific APIs, is still more ergonomic than the status quo, and will be easier to maintain as Python (and its various implementations) continues evolving. +Existing libraries that do intentionally inject or emulate PEP 690's semantics and API don't fill my needs for one reason or another. For example, |slothy|_ (currently) limits itself to specific Python implementations by relying on the existence of call stack frames. 
I wanted to create something similar that relies on public implementation-agnostic APIs as much as possible. How? @@ -236,7 +183,7 @@ The ``defer_imports.until_use`` context manager is what causes the proxies to be Those proxies don't use those stored ``__import__`` arguments themselves, though; the aforementioned special keys are what use the proxy's stored arguments to trigger the late import. These keys are aware of the namespace, the *dictionary*, they live in, are aware of the proxy they are the key for, and have overridden their ``__eq__`` and ``__hash__`` methods so that they know when they've been queried. In a sense, they're like descriptors, but instead of "owning the dot", they're "owning the brackets". Once such a key has been matched (i.e. someone uses the name of the import), it can use its corresponding proxy's stored arguments to execute the late import and *replace itself and the proxy* in the local namespace. That way, as soon as the name of the deferred import is referenced, all a user sees in the local namespace is a normal string key and the result of the resolved import. -The missing intermediate step is making sure these special proxies are stored with these special keys in the namespace. After all, Python name binding semantics only allow regular strings to be used as variable names/namespace keys; how can this be bypassed? ``defer-imports``'s answer is a little compile-time instrumentation. When a user calls ``defer_imports.install_deferred_import_hook()`` to set up the library machinery (see "Setup" above), what they are actually doing is installing an import hook that will modify the code of any given Python file that uses the ``defer_imports.until_use`` context manager. Using AST transformation, it adds a few lines of code around imports within that context manager to reassign the returned proxies to special keys in the local namespace (via ``locals()``). 
+The missing intermediate step is making sure these special proxies are stored with these special keys in the namespace. After all, Python name binding semantics only allow regular strings to be used as variable names/namespace keys; how can this be bypassed? ``defer-imports``'s answer is a little compile-time instrumentation. When a user calls ``defer_imports.install_import_hook()`` to set up the library machinery (see "Setup" above), what they are doing is installing an import hook that will modify the code of any given Python file that uses the ``defer_imports.until_use`` context manager. Using AST transformation, it adds a few lines of code around imports within that context manager to reassign the returned proxies to special keys in the local namespace (via ``locals()``). With this methodology, we can avoid using implementation-specific hacks like frame manipulation to modify the locals. We can even avoid changing the contract of ``builtins.__import__``, which specifically says it does not modify the global or local namespaces that are passed into it. We may modify and replace members of it, but at no point do we change its size while within ``__import__`` by removing or adding anything. @@ -244,14 +191,14 @@ With this methodology, we can avoid using implementation-specific hacks like fra Benchmarks ========== -A bit rough, but there are currently two ways of measuring activation and/or import time: +There are currently a few ways of measuring activation and/or import time: -- A local benchmark script, invokable with ``python -m bench.bench_samples`` (run with ``--help`` to see more information). +- A local benchmark script for timing the import of a significant portion of the standard library. - - To prevent bytecode caching from impacting the benchmark, run with `python -B `_, which will set ``sys.dont_write_bytecode`` to ``True`` and cause the benchmark script to purge all existing ``__pycache__`` folders in the project directory. 
- - PyPy is excluded from the benchmark since it takes time to ramp up. - - The cost of registering ``defer-imports``'s import hook is ignored since that is a one-time startup cost that will hopefully be reduced in time. - - A sample run across versions using ``hatch run bench:bench``: + - Invokable with ``python -m bench.bench_samples`` or ``hatch run bench:stdlib``. + - To prevent bytecode caching from impacting the benchmark, run with |python -B|_, which will set ``sys.dont_write_bytecode`` to ``True`` and cause the benchmark script to purge all existing ``__pycache__`` folders in the project directory. + - PyPy is excluded from the benchmark since it takes time to ramp up. + - A sample run across versions using ``hatch``: (Run once with ``__pycache__`` folders removed and ``sys.dont_write_bytecode=True``): @@ -279,15 +226,17 @@ A bit rough, but there are currently two ways of measuring activation and/or imp CPython 3.13 defer-imports 0.00253s (1.00x) ============== ======= ============= =================== -- Built-in Python timing tools, such as ``timeit`` and ``-X importtime``. +- Commands for only measuring import time of the library, using built-in Python timing tools like |timeit|_ and |python -X importtime|_. - - Examples: + - Examples:: - - ``python -m timeit -n 1 -r 1 -- "import defer_imports"`` - - ``python -X importtime -c "import defer_imports"`` + python -m timeit -n 1 -r 1 -- "import defer_imports" + hatch run bench:simple-import-time defer_imports + python -X importtime -c "import defer_imports" + hatch run bench:import-time defer_imports - - Substitute ``defer_imports`` in the above commands with other modules, e.g. ``slothy``, to compare. - - The results can vary greatly between runs, so if possible, only compare the resulting time(s) when collected from the same process. + - Substitute ``defer_imports`` in the above commands with other modules, e.g. ``slothy``, to compare. + - The results can vary greatly between runs. 
If possible, only compare the resulting time(s) when collected from the same process. Acknowledgements @@ -295,15 +244,67 @@ Acknowledgements The design of this library was inspired by the following: -- `demandimport `_ -- `apipkg `_ -- `metamodule `_ -- `modutil `_ -- `SPEC 1 `_ / `lazy-loader `_ -- `PEP 690 and its authors `_ +- |demandimport|_ +- |apipkg|_ +- |metamodule|_ +- |modutil|_ +- `SPEC 1 `_ / |lazy-loader|_ +- `PEP 690 and its authors `_ - `Jelle Zijlstra's pure-Python proof of concept `_ -- `slothy `_ -- `ideas `_ +- |slothy|_ +- |ideas|_ - `Sinbad `_'s feedback Without them, this would not exist. + + +.. + Common/formatted hyperlinks + + +.. _pep_690_text: https://peps.python.org/pep-0690/ + +.. |timeit| replace:: ``timeit`` +.. _timeit: https://docs.python.org/3/library/timeit.html + +.. |python -B| replace:: ``python -B`` +.. _python -B: https://docs.python.org/3/using/cmdline.html#cmdoption-B + +.. |python -X importtime| replace:: ``python -X importtime`` +.. _python -X importtime: https://docs.python.org/3/using/cmdline.html#cmdoption-X + +.. |typeguard| replace:: ``typeguard`` +.. _typeguard: https://github.com/agronholm/typeguard + +.. |beartype| replace:: ``beartype`` +.. _beartype: https://github.com/beartype/beartype + +.. |jaxtyping| replace:: ``jaxtyping`` +.. _jaxtyping: https://github.com/patrick-kidger/jaxtyping + +.. |torchtyping| replace:: ``torchtyping`` +.. _torchtyping: https://github.com/patrick-kidger/torchtyping + +.. |pyximport| replace:: ``pyximport`` +.. _pyximport: https://github.com/cython/cython/tree/master/pyximport + +.. |demandimport| replace:: ``demandimport`` +.. _demandimport: https://github.com/bwesterb/py-demandimport + +.. |apipkg| replace:: ``apipkg`` +.. _apipkg: https://github.com/pytest-dev/apipkg + +.. |metamodule| replace:: ``metamodule`` +.. _metamodule: https://github.com/njsmith/metamodule + +.. |modutil| replace:: ``modutil`` +.. _modutil: https://github.com/brettcannon/modutil + +.. 
|lazy-loader| replace:: ``lazy-loader`` +.. _lazy-loader: https://github.com/scientific-python/lazy-loader + +.. |slothy| replace:: ``slothy`` +.. _slothy: https://github.com/bswck/slothy + +.. |ideas| replace:: ``ideas`` +.. _ideas: https://github.com/aroberge/ideas diff --git a/bench/bench_samples.py b/bench/bench_samples.py index cbbad8c..0c390c4 100644 --- a/bench/bench_samples.py +++ b/bench/bench_samples.py @@ -2,10 +2,11 @@ """Simple benchmark script for comparing the import time of the Python standard library when using regular imports, defer_imports-influenced imports, and slothy-influenced imports. -The sample scripts being imported are generated with benchmark/generate_samples.py. +The sample scripts being imported are generated with bench/generate_samples.py. """ import platform +import shutil import sys import time from pathlib import Path @@ -24,20 +25,6 @@ def __exit__(self, *exc_info: object): self.elapsed = time.perf_counter() - self.elapsed -def remove_pycaches() -> None: - """Remove all cached Python bytecode files from the current working directory.""" - - for file in Path().rglob("*.py[co]"): - file.unlink() - - for dir_ in Path().rglob("__pycache__"): - # Sometimes, files with atypical names are still in these. 
- for file in dir_.iterdir(): - if file.is_file(): - file.unlink() - dir_.rmdir() - - def bench_regular() -> float: with CatchTime() as ct: import bench.sample_regular @@ -51,51 +38,32 @@ def bench_slothy() -> float: def bench_defer_imports_local() -> float: - with defer_imports.install_import_hook(), CatchTime() as ct: - import bench.sample_defer_local + with CatchTime() as ct: # noqa: SIM117 + with defer_imports.install_import_hook(uninstall_after=True): + import bench.sample_defer_local return ct.elapsed def bench_defer_imports_global() -> float: - with CatchTime() as ct, defer_imports.install_import_hook(apply_all=True): - import bench.sample_defer_global + with CatchTime() as ct: # noqa: SIM117 + with defer_imports.install_import_hook(uninstall_after=True, apply_all=True): + import bench.sample_defer_global return ct.elapsed -BENCH_FUNCS = { - "regular": bench_regular, - "slothy": bench_slothy, - "defer_imports (local)": bench_defer_imports_local, - "defer_imports (global)": bench_defer_imports_global, -} - - -def main() -> None: - import argparse +def remove_pycaches() -> None: + """Remove all cached Python bytecode files from the current working directory.""" - # Get arguments from user. - parser = argparse.ArgumentParser() - parser.add_argument( - "--exec-order", - action="extend", - nargs=4, - choices=BENCH_FUNCS.keys(), - type=str, - help="The order in which the influenced (or not influenced) imports are run", - ) - args = parser.parse_args() + for dir_ in Path().rglob("__pycache__"): + shutil.rmtree(dir_) - # Do any remaining setup. - if sys.dont_write_bytecode: - remove_pycaches() + for file in Path().rglob("*.py[co]"): + file.unlink() - exec_order = args.exec_order or list(BENCH_FUNCS) - # Perform benchmarking. 
- results = {type_: BENCH_FUNCS[type_]() for type_ in exec_order} - minimum = min(results.values()) +def pretty_print_results(results: dict[str, float], minimum: float) -> None: + """Format and print results as an reST-style list table.""" - # Format and print results as an reST-style list table. impl_header = "Implementation" impl_len = len(impl_header) impl_divider = "=" * impl_len @@ -136,5 +104,38 @@ def main() -> None: print(divider) +BENCH_FUNCS = { + "regular": bench_regular, + "slothy": bench_slothy, + "defer_imports (local)": bench_defer_imports_local, + "defer_imports (global)": bench_defer_imports_global, +} + + +def main() -> None: + import argparse + + parser = argparse.ArgumentParser() + parser.add_argument( + "--exec-order", + action="extend", + nargs=4, + choices=BENCH_FUNCS.keys(), + type=str, + help="The order in which the influenced (or not influenced) imports are run", + ) + args = parser.parse_args() + + if sys.dont_write_bytecode: + remove_pycaches() + + exec_order: list[str] = args.exec_order or list(BENCH_FUNCS) + + results = {type_: BENCH_FUNCS[type_]() for type_ in exec_order} + minimum = min(results.values()) + + pretty_print_results(results, minimum) + + if __name__ == "__main__": raise SystemExit(main()) diff --git a/bench/generate_samples.py b/bench/generate_samples.py index 70c6a94..6ecf9c7 100644 --- a/bench/generate_samples.py +++ b/bench/generate_samples.py @@ -4,8 +4,8 @@ from pathlib import Path -# Mostly sourced from https://gist.github.com/indygreg/be1c229fa41ced5c76d912f7073f9de6. -STDLIB_IMPORTS = """\ +# Modified from https://gist.github.com/indygreg/be1c229fa41ced5c76d912f7073f9de6. 
+_STDLIB_IMPORTS = """\ import __future__ # import _bootlocale # Doesn't exist on 3.11 on Windows @@ -549,47 +549,47 @@ import test """ -INDENTED_STDLIB_IMPORTS = "".join( - (f" {line}" if line.strip() else line) for line in STDLIB_IMPORTS.splitlines(keepends=True) +_INDENTED_STDLIB_IMPORTS = "".join( + (f'{" " * 4}{line}' if line.strip() else line) for line in _STDLIB_IMPORTS.splitlines(keepends=True) ) -PYRIGHT_IGNORE_DIRECTIVES = "# pyright: reportUnusedImport=none, reportMissingTypeStubs=none" -GENERATED_BY_COMMENT = "# Generated by benchmark/generate_samples.py" +_PYRIGHT_IGNORE_DIRECTIVES = "# pyright: reportUnusedImport=none, reportMissingTypeStubs=none" +_GENERATED_BY_COMMENT = "# Generated by bench/generate_samples.py" -CONTEXT_MANAGER_TEMPLATE = f"""\ -{PYRIGHT_IGNORE_DIRECTIVES} -{GENERATED_BY_COMMENT} +_CONTEXT_MANAGER_TEMPLATE = f"""\ +{_PYRIGHT_IGNORE_DIRECTIVES} +{_GENERATED_BY_COMMENT} {{import_stmt}} {{ctx_manager}} -{INDENTED_STDLIB_IMPORTS}\ +{_INDENTED_STDLIB_IMPORTS}\ """ def main() -> None: bench_path = Path("bench").resolve() - # regular imports - regular_contents = "\n".join((PYRIGHT_IGNORE_DIRECTIVES, GENERATED_BY_COMMENT, STDLIB_IMPORTS)) + # ---- regular imports + regular_contents = "\n".join((_PYRIGHT_IGNORE_DIRECTIVES, _GENERATED_BY_COMMENT, _STDLIB_IMPORTS)) regular_path = bench_path / "sample_regular.py" regular_path.write_text(regular_contents, encoding="utf-8") - # defer_imports-instrumented and defer_imports-hooked imports (global) + # ---- defer_imports-instrumented and defer_imports-hooked imports (global) shutil.copy(regular_path, regular_path.with_name("sample_defer_global.py")) - # defer_imports-instrumented and defer_imports-hooked imports (local) - defer_imports_contents = CONTEXT_MANAGER_TEMPLATE.format( + # ---- defer_imports-instrumented and defer_imports-hooked imports (local) + defer_imports_contents = _CONTEXT_MANAGER_TEMPLATE.format( import_stmt="import defer_imports", ctx_manager="with defer_imports.until_use:", ) 
defer_imports_path = bench_path / "sample_defer_local.py" defer_imports_path.write_text(defer_imports_contents, encoding="utf-8") - # defer_imports-influenced imports (local), but for a test in the tests directory - shutil.copy(defer_imports_path, bench_path.parent / "tests" / "sample_stdlib_imports.py") + # ---- defer_imports-influenced imports (local), but for a test in the tests directory + shutil.copy(defer_imports_path, bench_path.with_name("tests") / "sample_stdlib_imports.py") - # slothy-hooked imports - slothy_contents = CONTEXT_MANAGER_TEMPLATE.format( + # ---- slothy-hooked imports + slothy_contents = _CONTEXT_MANAGER_TEMPLATE.format( import_stmt="from slothy import lazy_importing", ctx_manager="with lazy_importing():", ) diff --git a/bench/sample_defer_global.py b/bench/sample_defer_global.py index 43029e5..bcaa2f7 100644 --- a/bench/sample_defer_global.py +++ b/bench/sample_defer_global.py @@ -1,5 +1,5 @@ # pyright: reportUnusedImport=none, reportMissingTypeStubs=none -# Generated by benchmark/generate_samples.py +# Generated by bench/generate_samples.py import __future__ # import _bootlocale # Doesn't exist on 3.11 on Windows diff --git a/bench/sample_defer_local.py b/bench/sample_defer_local.py index 2163528..67a2953 100644 --- a/bench/sample_defer_local.py +++ b/bench/sample_defer_local.py @@ -1,5 +1,5 @@ # pyright: reportUnusedImport=none, reportMissingTypeStubs=none -# Generated by benchmark/generate_samples.py +# Generated by bench/generate_samples.py import defer_imports diff --git a/bench/sample_regular.py b/bench/sample_regular.py index 43029e5..bcaa2f7 100644 --- a/bench/sample_regular.py +++ b/bench/sample_regular.py @@ -1,5 +1,5 @@ # pyright: reportUnusedImport=none, reportMissingTypeStubs=none -# Generated by benchmark/generate_samples.py +# Generated by bench/generate_samples.py import __future__ # import _bootlocale # Doesn't exist on 3.11 on Windows diff --git a/bench/sample_slothy.py b/bench/sample_slothy.py index 740dd7e..764121c 
100644 --- a/bench/sample_slothy.py +++ b/bench/sample_slothy.py @@ -1,5 +1,5 @@ # pyright: reportUnusedImport=none, reportMissingTypeStubs=none -# Generated by benchmark/generate_samples.py +# Generated by bench/generate_samples.py from slothy import lazy_importing diff --git a/pyproject.toml b/pyproject.toml index 7b0bce9..60476b5 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -56,8 +56,9 @@ features = ["bench"] python = ["3.9", "3.10", "3.11", "3.12", "3.13"] [tool.hatch.envs.bench.scripts] -bench = "python -m bench.bench_samples" -raw_import = 'python -X importtime -c "import defer_imports"' +stdlib = "python -m bench.bench_samples" +import-time = 'python -X importtime -c "import {args:defer_imports}"' +simple-import-time = 'python -m timeit -n 1 -r 1 -- "import {args:defer_imports}"' # -------- Test config @@ -78,12 +79,10 @@ defer_imports = ["src"] [tool.coverage.run] plugins = ["covdefaults"] source = ["defer_imports", "tests"] -omit = [ - "src/defer_imports/_typing_compat.py", # Has a module-level __getattr__ that isn't invoked at runtime. -] [tool.coverage.report] fail_under = 90 +exclude_lines = ["^\\s*(?:el)?if TYPE_CHECKING:$"] # -------- Linter config @@ -140,6 +139,7 @@ extend-ignore = [ "PD011", # Erroneous issue that triggers for any .values attribute access at all. "PLR2004", # "Magic number" depends on the use case. "RUF002", # "Ambiguous character" depends on the use case. + "RUF003", # "Ambiguous character" depends on the use case. # ---- Recommended by Ruff when using Ruff format "E111", @@ -152,13 +152,12 @@ extend-ignore = [ "ISC002", # ---- Project-specific rules - "RET505", # Returns in both parts of if-else are fine. - "SIM108", # if-else instead of a ternary is fine. + "RET505", # Returns in both parts of if-else can be more readable. + "SIM108", # if-else instead of a ternary can be more readable. ] unfixable = [ "ERA", # Prevent unlikely erroneous deletion. 
] -typing-modules = ["defer_imports._typing_compat"] [tool.ruff.lint.isort] lines-after-imports = 2 @@ -170,15 +169,14 @@ keep-runtime-typing = true [tool.ruff.lint.per-file-ignores] # ---- Package code "src/defer_imports/*.py" = [ - "A002", # Allow some shadowing of builtins by parameter names. - "PLW0603", # "global" is used to update variables at global scope. + "A002", # Allow some shadowing of builtins by parameter names. ] # ---- Test code "tests/**/test_*.py" = [ "T201", # Printing is fine. - "T203", # Pretty printing is fine. - # Don't need some annotations in tests. + "T203", # Pretty-printing is fine. + # Don't need return annotations in tests. "ANN201", "ANN202", "S102", # exec is used to test for NameError within a module's namespace. diff --git a/src/defer_imports/__init__.py b/src/defer_imports/__init__.py index a13e694..9e61d18 100644 --- a/src/defer_imports/__init__.py +++ b/src/defer_imports/__init__.py @@ -5,59 +5,202 @@ """A library that implements PEP 690–esque lazy imports in pure Python.""" from __future__ import annotations -import __future__ -import ast import builtins import contextvars -import io +import importlib.util import sys -import tokenize -import warnings import zipimport from collections import deque from importlib.machinery import BYTECODE_SUFFIXES, SOURCE_SUFFIXES, FileFinder, ModuleSpec, PathFinder, SourceFileLoader from itertools import islice, takewhile from threading import RLock -from . 
import _typing_compat as _tp - -__version__ = "0.0.3.dev0" +__version__ = "0.0.3.dev1" __all__ = ( - # -- Compile-time hook "install_import_hook", "ImportHookContext", - # -- Runtime hook "until_use", "DeferredContext", - # -- Console helpers - "instrument_ipython", - "DeferredInteractiveConsole", - "interact", ) +TYPE_CHECKING = False + + +# ============================================================================ +# region -------- Lazy import bootstrapping -------- +# ============================================================================ + + +def _lazy_import_module(name: str, package: typing.Optional[str] = None) -> types.ModuleType: + """Lazily import a module. Has the same signature as ``importlib.import_module()``. + + This is purely for limited internal usage, especially since it has not been evaluated for thread safety. + + Notes + ----- + Based on importlib code as well as recipes found in the Python 3.12 importlib docs. + """ + + absolute_name = importlib.util.resolve_name(name, package) + if absolute_name in sys.modules: + return sys.modules[absolute_name] + + path = None + if "." in absolute_name: + parent_name, _, child_name = absolute_name.rpartition(".") + # No point delaying the load of the parent when we need to access one of its attributes immediately. 
+ parent_module = importlib.import_module(parent_name) + assert parent_module.__spec__ is not None + path = parent_module.__spec__.submodule_search_locations + + for finder in sys.meta_path: + spec = finder.find_spec(absolute_name, path) + if spec is not None: + break + else: + msg = f"No module named {absolute_name!r}" + raise ModuleNotFoundError(msg, name=absolute_name) + + if spec.loader is None: + msg = "missing loader" + raise ImportError(msg, name=spec.name) + + spec.loader = loader = importlib.util.LazyLoader(spec.loader) + module = importlib.util.module_from_spec(spec) + sys.modules[absolute_name] = module + loader.exec_module(module) + + if path is not None: + setattr(parent_module, child_name, module) # pyright: ignore [reportPossiblyUnboundVariable] + + return sys.modules[absolute_name] + + +if TYPE_CHECKING: + import ast + import collections.abc as coll_abc + import importlib.abc as imp_abc + import io + import os + import tokenize + import types + import typing + import warnings +else: + # fmt: off + ast = _lazy_import_module("ast") + coll_abc = _lazy_import_module("collections.abc") + imp_abc = _lazy_import_module("importlib.abc") + io = _lazy_import_module("io") + os = _lazy_import_module("os") + tokenize = _lazy_import_module("tokenize") + types = _lazy_import_module("types") + typing = _lazy_import_module("typing") + warnings = _lazy_import_module("warnings") + # fmt: on + + +# endregion + + +# ============================================================================ +# region -------- Shims for typing and annotation symbols -------- +# ============================================================================ + + +if TYPE_CHECKING: + _final = typing.final +else: + + def _final(f: object) -> object: + """Decorator to indicate final methods and final classes. + + Slightly modified version of typing.final to avoid importing from typing at runtime. 
+ """ + + try: + f.__final__ = True # pyright: ignore # Runtime attribute assignment + except (AttributeError, TypeError): # pragma: no cover + # Skip the attribute silently if it is not writable. + # AttributeError happens if the object has __slots__ or a + # read-only property, TypeError if it's a builtin class. + pass + return f + + +if sys.version_info >= (3, 10): # pragma: >=3.10 cover + if TYPE_CHECKING: + from typing import TypeAlias as _TypeAlias, TypeGuard as _TypeGuard + else: + _TypeAlias: typing.TypeAlias = "typing.TypeAlias" + _TypeGuard: typing.TypeAlias = "typing.TypeGuard" +elif TYPE_CHECKING: + from typing_extensions import TypeAlias as _TypeAlias, TypeGuard as _TypeGuard +else: # pragma: <3.10 cover + _TypeAlias = type("TypeAlias", (), {"__doc__": "Placeholder for typing.TypeAlias."}) + + class _PlaceholderGenericAlias(type(list[int])): + def __repr__(self) -> str: + return f"" + + class _PlaceholderMeta(type): + def __getitem__(self, item: object) -> _PlaceholderGenericAlias: + return _PlaceholderGenericAlias(self, item) + + def __repr__(self) -> str: + return f"" + + class _TypeGuard(metaclass=_PlaceholderMeta): + """Placeholder for typing.TypeGuard.""" + + +if sys.version_info >= (3, 11): # pragma: >=3.11 cover + if TYPE_CHECKING: + from typing import Self as _Self + else: + _Self: _TypeAlias = "typing.Self" +elif TYPE_CHECKING: + from typing_extensions import Self as _Self +else: # pragma: <3.11 cover + _Self = type("Self", (), {"__doc__": "Placeholder for typing.Self."}) + + +if sys.version_info >= (3, 12): # pragma: >=3.12 cover + _ReadableBuffer: _TypeAlias = "coll_abc.Buffer" +elif TYPE_CHECKING: + from typing_extensions import Buffer as _ReadableBuffer +else: # pragma: <3.12 cover + _ReadableBuffer: _TypeAlias = "typing.Union[bytes, bytearray, memoryview]" + + +# endregion + # ============================================================================ # region -------- Vendored helpers -------- # -# Helper functions vendored from CPython 
in some way to avoid actually -# importing them. +# Helper functions vendored from CPython in some way. # ============================================================================ -def _sliding_window(iterable: _tp.Iterable[_tp.T], n: int) -> _tp.Generator[tuple[_tp.T, ...]]: - """Collect data into overlapping fixed-length chunks or blocks. +def _sliding_window( + iterable: coll_abc.Iterable[tokenize.TokenInfo], + n: int, +) -> coll_abc.Generator[tuple[tokenize.TokenInfo, ...]]: + """Collect tokens into overlapping fixed-length chunks or blocks. Notes ----- - Slightly modified version of a recipe in the Python 3.12 itertools docs. + Slightly modified version of the sliding_window recipe found in the Python 3.12 itertools docs. Examples -------- - >>> ["".join(window) for window in sliding_window('ABCDEFG', 4)] - ['ABCD', 'BCDE', 'CDEF', 'DEFG'] + >>> tokens = list(tokenize.generate_tokens(io.StringIO("def func(): ...").readline)) + >>> [" ".join(item.string for item in window) for window in _sliding_window(tokens, 2)] + ['def func', 'func (', '( )', ') :', ': ...', '... ', ' '] """ iterator = iter(iterable) @@ -67,12 +210,12 @@ def _sliding_window(iterable: _tp.Iterable[_tp.T], n: int) -> _tp.Generator[tupl yield tuple(window) -def _sanity_check(name: str, package: _tp.Optional[str], level: int) -> None: +def _sanity_check(name: str, package: typing.Optional[str], level: int) -> None: """Verify arguments are "sane". Notes ----- - Slightly modified version of importlib._bootstrap._sanity_check to avoid depending on an an implementation detail + Slightly modified version of importlib._bootstrap._sanity_check to avoid depending on an implementation detail module at runtime. 
""" @@ -94,11 +237,10 @@ def _sanity_check(name: str, package: _tp.Optional[str], level: int) -> None: raise ValueError(msg) -def _calc___package__(globals: _tp.MutableMapping[str, _tp.Any]) -> _tp.Optional[str]: +def _calc___package__(globals: coll_abc.MutableMapping[str, typing.Any]) -> typing.Optional[str]: """Calculate what __package__ should be. - __package__ is not guaranteed to be defined or could be set to None - to represent that its proper value is unknown. + __package__ is not guaranteed to be defined or could be set to None to represent that its proper value is unknown. Notes ----- @@ -111,44 +253,25 @@ def _calc___package__(globals: _tp.MutableMapping[str, _tp.Any]) -> _tp.Optional if package is not None: if spec is not None and package != spec.parent: - category = DeprecationWarning if sys.version_info >= (3, 12) else ImportWarning - warnings.warn( - f"__package__ != __spec__.parent ({package!r} != {spec.parent!r})", - category, - stacklevel=3, - ) - return package + if sys.version_info >= (3, 12): # pragma: >=3.12 cover + category = DeprecationWarning + else: # pragma: <3.12 cover + category = ImportWarning - if spec is not None: + msg = f"__package__ != __spec__.parent ({package!r} != {spec.parent!r})" + warnings.warn(msg, category, stacklevel=3) + return package + elif spec is not None: return spec.parent + else: + msg = "can't resolve package from __spec__ or __package__, falling back on __name__ and __path__" + warnings.warn(msg, ImportWarning, stacklevel=3) - warnings.warn( - "can't resolve package from __spec__ or __package__, falling back on __name__ and __path__", - ImportWarning, - stacklevel=3, - ) - package = globals["__name__"] - if "__path__" not in globals: - package = package.rpartition(".")[0] # pyright: ignore [reportOptionalMemberAccess] - - return package - - -def _resolve_name(name: str, package: str, level: int) -> str: - """Resolve a relative module name to an absolute one. 
- - Notes - ----- - Slightly modified version of importlib._bootstrap._resolve_name to avoid depending on an implementation detail - module at runtime. - """ + package = globals["__name__"] + if "__path__" not in globals: + package = package.rpartition(".")[0] # pyright: ignore [reportOptionalMemberAccess] - bits = package.rsplit(".", level - 1) - if len(bits) < level: - msg = "attempted relative import beyond top-level package" - raise ImportError(msg) - base = bits[0] - return f"{base}.{name}" if name else base + return package # endregion @@ -161,29 +284,29 @@ def _resolve_name(name: str, package: str, level: int) -> str: # ============================================================================ -_StrPath: _tp.TypeAlias = "_tp.Union[str, _tp.PathLike[str]]" -_ModulePath: _tp.TypeAlias = "_tp.Union[_StrPath, _tp.ReadableBuffer]" -_SourceData: _tp.TypeAlias = "_tp.Union[_tp.ReadableBuffer, str, ast.Module, ast.Expression, ast.Interactive]" +_StrPath: _TypeAlias = "typing.Union[str, os.PathLike[str]]" +_ModulePath: _TypeAlias = "typing.Union[_StrPath, _ReadableBuffer]" +_SourceData: _TypeAlias = "typing.Union[_ReadableBuffer, str, ast.Module, ast.Expression, ast.Interactive]" -_TOK_NAME, _TOK_OP = tokenize.NAME, tokenize.OP - _BYTECODE_HEADER = f"defer_imports{__version__}".encode() """Custom header for defer_imports-instrumented bytecode files. Should be updated with every version release.""" -class _DeferredInstrumenter(ast.NodeTransformer): +class _DeferredInstrumenter: """AST transformer that instruments imports within "with defer_imports.until_use: ..." blocks so that their results are assigned to custom keys in the global namespace. Notes ----- - The transformer assumes the module is not empty and "with defer_imports.until_use" is used somewhere in it. + The transformer doesn't subclass ast.NodeTransformer but instead vendors its logic to avoid the upfront import cost. 
+ Additionally, it assumes the AST being instrumented is not empty and "with defer_imports.until_use" is used + somewhere in it. """ def __init__( self, - data: _tp.Union[_tp.ReadableBuffer, str, ast.AST], + data: typing.Union[_ReadableBuffer, str, ast.AST], filepath: _ModulePath = "", encoding: str = "utf-8", *, @@ -197,6 +320,13 @@ def __init__( self.scope_depth = 0 self.escape_hatch_depth = 0 + def visit(self, node: ast.AST) -> typing.Any: + """Visit a node.""" + + method = f"visit_{node.__class__.__name__}" + visitor = getattr(self, method, self.generic_visit) + return visitor(node) + def _visit_scope(self, node: ast.AST) -> ast.AST: """Track Python scope changes. Used to determine if a use of defer_imports.until_use is global.""" @@ -228,7 +358,7 @@ def _visit_eager_import_block(self, node: ast.AST) -> ast.AST: visit_Try = _visit_eager_import_block - if sys.version_info >= (3, 11): + if sys.version_info >= (3, 11): # pragma: >=3.11 cover visit_TryStar = _visit_eager_import_block def _decode_source(self) -> str: @@ -240,7 +370,7 @@ def _decode_source(self) -> str: elif isinstance(self.data, str): return self.data else: - # This is based on importlib.util.decode_source(). + # Do the same thing as importlib.util.decode_source(). newline_decoder = io.IncrementalNewlineDecoder(None, translate=True) # Expected buffer types (bytes, bytearray, memoryview) are known to have a decode method. return newline_decoder.decode(self.data.decode(self.encoding)) # pyright: ignore @@ -268,7 +398,7 @@ def _get_node_context(self, node: ast.stmt): # noqa: ANN202 # Version-dependent def _create_import_name_replacement(name: str) -> ast.If: """Create an AST for changing the name of a variable in locals if the variable is a defer_imports proxy. 
- If the node is unparsed, the resulting code is almost equivalent to the following:: + The resulting node is almost equivalent to the following code:: if type(name) is _DeferredImportProxy: @temp_proxy = @local_ns.pop("name") @@ -446,13 +576,13 @@ def visit_Module(self, node: ast.Module) -> ast.AST: return self.generic_visit(node) @staticmethod - def _is_non_wildcard_import(obj: object) -> _tp.TypeGuard[_tp.Union[ast.Import, ast.ImportFrom]]: + def _is_non_wildcard_import(obj: object) -> _TypeGuard[typing.Union[ast.Import, ast.ImportFrom]]: """Check if a given object is an import AST without wildcards.""" return isinstance(obj, (ast.Import, ast.ImportFrom)) and obj.names[0].name != "*" @staticmethod - def _is_defer_imports_import(node: _tp.Union[ast.Import, ast.ImportFrom]) -> bool: + def _is_defer_imports_import(node: typing.Union[ast.Import, ast.ImportFrom]) -> bool: """Check if the given import node imports from defer_imports.""" if isinstance(node, ast.Import): @@ -460,11 +590,11 @@ def _is_defer_imports_import(node: _tp.Union[ast.Import, ast.ImportFrom]) -> boo else: return node.module is not None and node.module.partition(".")[0] == "defer_imports" - def _wrap_import_stmts(self, nodes: list[_tp.Any], start: int) -> ast.With: - """Wrap a list of consecutive import nodes from a list of statements using a "defer_imports.until_use" block and + def _wrap_import_stmts(self, nodes: list[typing.Any], start: int) -> ast.With: + """Wrap consecutive import nodes within a list of statements using a "defer_imports.until_use" block and instrument them. - The first node must be guaranteed to be an import node. + The first node must be an import node. """ import_range = tuple(takewhile(lambda i: self._is_non_wildcard_import(nodes[i]), range(start, len(nodes)))) @@ -486,7 +616,7 @@ def _is_import_to_instrument(self, value: ast.AST) -> bool: self.module_level # Only at global scope. and self.scope_depth == 0 - # Only with import nodes without wildcards. 
+ # Only for import nodes without wildcards. and self._is_non_wildcard_import(value) # Only outside of escape hatch blocks. and (self.escape_hatch_depth == 0 and not self._is_defer_imports_import(value)) @@ -495,13 +625,13 @@ def _is_import_to_instrument(self, value: ast.AST) -> bool: def generic_visit(self, node: ast.AST) -> ast.AST: """Called if no explicit visitor function exists for a node. - Almost a copy of ast.NodeVisitor.generic_visit, but we intercept global sequences of import statements to wrap - them in a "with defer_imports.until_use" block and instrument them. + In addition to regular functionality, conditionally intercept global sequences of import statements to wrap them + in "with defer_imports.until_use" blocks. """ for field, old_value in ast.iter_fields(node): if isinstance(old_value, list): - new_values: list[_tp.Any] = [] + new_values: list[typing.Any] = [] for i, value in enumerate(old_value): # pyright: ignore [reportUnknownArgumentType, reportUnknownVariableType] if self._is_import_to_instrument(value): # pyright: ignore [reportUnknownArgumentType] value = self._wrap_import_stmts(old_value, i) # noqa: PLW2901 # pyright: ignore [reportUnknownArgumentType] @@ -523,9 +653,11 @@ def generic_visit(self, node: ast.AST) -> ast.AST: return node -def _check_source_for_defer_usage(data: _tp.Union[_tp.ReadableBuffer, str]) -> tuple[str, bool]: +def _check_source_for_defer_usage(data: typing.Union[_ReadableBuffer, str]) -> tuple[str, bool]: """Get the encoding of the given code and also check if it uses "with defer_imports.until_use".""" + _TOK_NAME, _TOK_OP = tokenize.NAME, tokenize.OP + if isinstance(data, str): token_stream = tokenize.generate_tokens(io.StringIO(data).readline) encoding = "utf-8" @@ -555,7 +687,7 @@ def _check_ast_for_defer_usage(data: ast.AST) -> tuple[str, bool]: class _DeferredFileLoader(SourceFileLoader): """A file loader that instruments .py files which use "with defer_imports.until_use: ...".""" - def __init__(self, *args: 
_tp.Any, **kwargs: _tp.Any) -> None: + def __init__(self, *args: typing.Any, **kwargs: typing.Any) -> None: super().__init__(*args, **kwargs) self.defer_module_level: bool = False @@ -591,7 +723,7 @@ def get_data(self, path: str) -> bytes: return data[len(_BYTECODE_HEADER) :] - def set_data(self, path: str, data: _tp.ReadableBuffer, *, _mode: int = 0o666) -> None: + def set_data(self, path: str, data: _ReadableBuffer, *, _mode: int = 0o666) -> None: """Write bytes data to a file. Notes @@ -609,8 +741,8 @@ def set_data(self, path: str, data: _tp.ReadableBuffer, *, _mode: int = 0o666) - return super().set_data(path, data, _mode=_mode) - # NOTE: Signature of SourceFileLoader.source_to_code at runtime and in typeshed aren't consistent. - def source_to_code(self, data: _SourceData, path: _ModulePath, *, _optimize: int = -1) -> _tp.CodeType: # pyright: ignore [reportIncompatibleMethodOverride] + # NOTE: Signature of SourceFileLoader.source_to_code at runtime isn't consistent with signature in typeshed. + def source_to_code(self, data: _SourceData, path: _ModulePath, *, _optimize: int = -1) -> types.CodeType: # pyright: ignore [reportIncompatibleMethodOverride] """Compile "data" into a code object, but not before potentially instrumenting it. Parameters @@ -618,11 +750,11 @@ def source_to_code(self, data: _SourceData, path: _ModulePath, *, _optimize: int data: _SourceData Anything that compile() can handle. path: _ModulePath: - Where the data was retrieved (when applicable). + Where the data was retrieved from (when applicable). Returns ------- - _tp.CodeType + types.CodeType The compiled code object. """ @@ -647,7 +779,7 @@ def source_to_code(self, data: _SourceData, path: _ModulePath, *, _optimize: int return super().source_to_code(new_tree, path, _optimize=_optimize) # pyright: ignore # See note above. 
- def exec_module(self, module: _tp.ModuleType) -> None: + def exec_module(self, module: types.ModuleType) -> None: """Execute the module, but only after getting state from module.__spec__.loader_state if present.""" if (spec := module.__spec__) and spec.loader_state is not None: @@ -660,7 +792,7 @@ class _DeferredFileFinder(FileFinder): def __repr__(self) -> str: return f"{type(self).__name__}({self.path!r})" - def find_spec(self, fullname: str, target: _tp.Optional[_tp.ModuleType] = None) -> _tp.Optional[ModuleSpec]: + def find_spec(self, fullname: str, target: typing.Optional[types.ModuleType] = None) -> typing.Optional[ModuleSpec]: """Try to find a spec for "fullname" on sys.path or "path", with some deferral state attached. Notes @@ -677,13 +809,14 @@ def find_spec(self, fullname: str, target: _tp.Optional[_tp.ModuleType] = None) spec = super().find_spec(fullname, target) if spec is not None and isinstance(spec.loader, _DeferredFileLoader): - # Check defer configuration after finding succeeds, but before loading starts. + # NOTE: We're locking in defer_imports configuration for this module between finding it and loading it. + # However, it's possible to delay getting the configuration until module execution. Not sure what's + # best. config = _current_defer_config.get(None) if config is None: defer_module_level = False else: - # NOTE: The configuration precedence order should match what's documented for install_import_hook(). 
defer_module_level = config.apply_all or bool( config.module_names and ( @@ -692,6 +825,9 @@ def find_spec(self, fullname: str, target: _tp.Optional[_tp.ModuleType] = None) ) ) + if config.loader_class is not None: + spec.loader = config.loader_class(fullname, spec.loader.path) # pyright: ignore [reportCallIssue] + spec.loader_state = {"defer_module_level": defer_module_level} return spec @@ -708,30 +844,24 @@ def find_spec(self, fullname: str, target: _tp.Optional[_tp.ModuleType] = None) class _DeferConfig: """Configuration container whose contents are used to determine how a module should be instrumented.""" - def __init__(self, apply_all: bool, module_names: _tp.Sequence[str], recursive: bool) -> None: + def __init__( + self, + apply_all: bool, + module_names: coll_abc.Sequence[str], + recursive: bool, + loader_class: typing.Optional[type[imp_abc.Loader]], + ) -> None: self.apply_all = apply_all self.module_names = module_names self.recursive = recursive + self.loader_class = loader_class def __repr__(self) -> str: - attrs = ("apply_all", "module_names", "recursive") - return f"{type(self).__name__}({', '.join(f'{attr}={getattr(self, attr)}' for attr in attrs)})" - - -def _invalidate_path_entry_caches() -> None: - """Invalidate import-related path entry caches in some way. - - Notes - ----- - sys.path_importer_cache.clear() seems to break everything. Pathfinder.invalidate_caches() doesn't, but it has a - greater upfront cost if performed early in an application's lifetime. That's because it imports importlib.metadata. - However, it doesn't nuke as much, thereby preventing an increase in import time later. - """ + attrs = ("apply_all", "module_names", "recursive", "loader_class") + return f'{type(self).__name__}({", ".join(f"{attr}={getattr(self, attr)!r}" for attr in attrs)})' - PathFinder.invalidate_caches() - -@_tp.final +@_final class ImportHookContext: """The context manager returned by install_import_hook(). 
Can reset defer_imports's configuration to its previous state and uninstall defer_import's import path hook. @@ -741,7 +871,7 @@ def __init__(self, _config_ctx_tok: contextvars.Token[_DeferConfig], _uninstall_ self._tok = _config_ctx_tok self._uninstall_after = _uninstall_after - def __enter__(self) -> _tp.Self: + def __enter__(self) -> _Self: return self def __exit__(self, *exc_info: object) -> None: @@ -774,20 +904,22 @@ def uninstall(self) -> None: except ValueError: pass else: - _invalidate_path_entry_caches() + PathFinder.invalidate_caches() def install_import_hook( *, uninstall_after: bool = False, apply_all: bool = False, - module_names: _tp.Sequence[str] = (), + module_names: coll_abc.Sequence[str] = (), recursive: bool = False, + loader_class: typing.Optional[type[imp_abc.Loader]] = None, ) -> ImportHookContext: r"""Install defer_imports's import hook if it isn't already installed, and optionally configure it. Must be called before using defer_imports.until_use. - The configuration is for instrumenting ALL import statements, not only ones wrapped by defer_imports.until_use. + The configuration knobs are for instrumenting any global import statements, not only ones wrapped by + defer_imports.until_use. This should be run before the code it is meant to affect is executed. One place to put do that is __init__.py of a package or app. @@ -803,8 +935,11 @@ def install_import_hook( A set of modules to apply module-level import deferral to. Has lower priority than apply_all. More suitable for use in libraries. recursive: bool, default=False - Whether module-level import deferral should apply recursively the submodules of the given module_names. If no - module names are given, this has no effect. + Whether module-level import deferral should apply recursively to the submodules of the given module_names. Has the + same priority as module_names. If no module names are given, this has no effect. 
+ loader_class: type[importlib.abc.Loader] | None, optional + An import loader class for defer_imports to use instead of the default machinery. If supplied, it is assumed to + have an initialization signature matching ``(fullname: str, path: str) -> None``. Returns ------- @@ -815,15 +950,14 @@ def install_import_hook( if _DEFER_PATH_HOOK not in sys.path_hooks: try: - # zipimporter doesn't provide find_spec until 3.10. + # zipimporter doesn't provide find_spec until 3.10, so it technically doesn't meet the protocol. hook_insert_index = sys.path_hooks.index(zipimport.zipimporter) + 1 # pyright: ignore [reportArgumentType] except ValueError: hook_insert_index = 0 - _invalidate_path_entry_caches() sys.path_hooks.insert(hook_insert_index, _DEFER_PATH_HOOK) - config = _DeferConfig(apply_all, module_names, recursive) + config = _DeferConfig(apply_all, module_names, recursive, loader_class) config_ctx_tok = _current_defer_config.set(config) return ImportHookContext(config_ctx_tok, uninstall_after) @@ -851,9 +985,9 @@ class _DeferredImportProxy: def __init__( self, name: str, - global_ns: _tp.MutableMapping[str, object], - local_ns: _tp.MutableMapping[str, object], - fromlist: _tp.Sequence[str], + global_ns: coll_abc.MutableMapping[str, object], + local_ns: coll_abc.MutableMapping[str, object], + fromlist: coll_abc.Sequence[str], level: int = 0, ) -> None: self.defer_proxy_name = name @@ -887,7 +1021,7 @@ def __repr__(self) -> str: return f"" - def __getattr__(self, name: str, /) -> _tp.Self: + def __getattr__(self, name: str, /) -> _Self: if name in self.defer_proxy_fromlist: from_proxy = type(self)(*self.defer_proxy_import_args) from_proxy.defer_proxy_fromlist = (name,) @@ -911,7 +1045,7 @@ class _DeferredImportKey(str): __slots__ = ("defer_key_proxy", "is_resolving", "lock") - def __new__(cls, key: str, proxy: _DeferredImportProxy, /) -> _tp.Self: + def __new__(cls, key: str, proxy: _DeferredImportProxy, /) -> _Self: return super().__new__(cls, key) def __init__(self, 
key: str, proxy: _DeferredImportProxy, /) -> None: @@ -946,7 +1080,7 @@ def _resolve(self) -> None: proxy = self.defer_key_proxy # 1. Perform the original __import__ and pray. - module: _tp.ModuleType = _original_import.get()(*proxy.defer_proxy_import_args) + module: types.ModuleType = _original_import.get()(*proxy.defer_proxy_import_args) # 2. Transfer nested proxies over to the resolved module. module_vars = vars(module) @@ -983,21 +1117,22 @@ def _resolve(self) -> None: def _deferred___import__( name: str, - globals: _tp.MutableMapping[str, object], - locals: _tp.MutableMapping[str, object], - fromlist: _tp.Optional[_tp.Sequence[str]] = None, + globals: coll_abc.MutableMapping[str, object], + locals: coll_abc.MutableMapping[str, object], + fromlist: typing.Optional[coll_abc.Sequence[str]] = None, level: int = 0, -) -> _tp.Any: +) -> typing.Any: """An limited replacement for __import__ that supports deferred imports by returning proxies.""" fromlist = fromlist or () - package = _calc___package__(globals) + package = _calc___package__(globals) if (level != 0) else None _sanity_check(name, package, level) - # Resolve the names of relative imports. + # NOTE: This technically repeats work since it recalculates level internally, but it's better for maintenance than + # keeping a copy of importlib._bootstrap._resolve_name() around. if level > 0: - name = _resolve_name(name, package, level) # pyright: ignore [reportArgumentType] + name = importlib.util.resolve_name(f'{"." * level}{name}', package) level = 0 # Handle submodule imports if relevant top-level imports already occurred in the call site's module. @@ -1026,7 +1161,7 @@ def _deferred___import__( return _DeferredImportProxy(name, globals, locals, fromlist, level) -@_tp.final +@_final class DeferredContext: """A context manager within which imports occur lazily. Not reentrant. Use via defer_imports.until_use. 
@@ -1045,221 +1180,22 @@ class DeferredContext: As part of its implementation, this temporarily replaces builtins.__import__. """ - __slots__ = ("is_active", "_import_ctx_token", "_defer_ctx_token") + __slots__ = ("_import_ctx_token", "_defer_ctx_token") + + # TODO: Have this turn into a no-op when not being executed with a defer_imports loader. def __enter__(self) -> None: - self.is_active: bool = bool(_current_defer_config.get(False)) - if self.is_active: - self._defer_ctx_token = _is_deferred.set(True) - self._import_ctx_token = _original_import.set(builtins.__import__) - builtins.__import__ = _deferred___import__ + self._defer_ctx_token = _is_deferred.set(True) + self._import_ctx_token = _original_import.set(builtins.__import__) + builtins.__import__ = _deferred___import__ def __exit__(self, *exc_info: object) -> None: - if self.is_active: - _original_import.reset(self._import_ctx_token) - _is_deferred.reset(self._defer_ctx_token) - builtins.__import__ = _original_import.get() - - -until_use: _tp.Final[DeferredContext] = DeferredContext() - - -# endregion - - -# ============================================================================ -# region -------- Console helpers -------- -# -# Helpers for using defer_imports in various consoles, such as the built-in -# CPython REPL and IPython. -# -# TODO: Add tests for these. -# ============================================================================ - - -class _DeferredIPythonInstrumenter(ast.NodeTransformer): - """An AST transformer that wraps defer_imports's AST instrumentation to fit IPython's AST hook interface.""" - - def __init__(self): - # The wrapped transformer's initial data is an empty string because we only get the actual data within visit(). - self.actual_transformer = _DeferredInstrumenter("") - - def visit(self, node: ast.AST) -> _tp.Any: - # Reset the wrapped transformer before (re)use. 
- self.actual_transformer.data = node - self.actual_transformer.scope_depth = 0 - return ast.fix_missing_locations(self.actual_transformer.visit(node)) - - -def instrument_ipython() -> None: - """Add defer_import's compile-time AST transformer to a currently running IPython environment. - - This will ensure that defer_imports.until_use works as intended when used directly in a IPython console. - - Raises - ------ - RuntimeError - If called in a non-IPython environment. - """ - - try: - ipython_shell: _tp.Any = get_ipython() # pyright: ignore [reportUndefinedVariable] # We guard this. - except NameError: - msg = "Not currently in an IPython environment." - raise RuntimeError(msg) from None - - ipython_shell.ast_transformers.append(_DeferredIPythonInstrumenter()) - - -_features = [getattr(__future__, feat_name) for feat_name in __future__.all_feature_names] - -_delayed_console_names = frozenset( - {"code", "codeop", "os", "_DeferredCompile", "DeferredInteractiveConsole", "interact"} -) - - -def __getattr__(name: str) -> _tp.Any: # pragma: no cover - # Hack to delay executing expensive console-related functionality until requested. 
- - if name in _delayed_console_names: - global code, codeop, os, _DeferredCompile, DeferredInteractiveConsole, interact - - import code - import codeop - import os - - class _DeferredCompile(codeop.Compile): - """A subclass of codeop.Compile that alters the compilation process via defer_imports's AST transformer.""" - - def __call__(self, source: str, filename: str, symbol: str, **kwargs: object) -> _tp.CodeType: - flags = self.flags - if kwargs.get("incomplete_input", True) is False: - flags &= ~codeop.PyCF_DONT_IMPLY_DEDENT # pyright: ignore - flags &= ~codeop.PyCF_ALLOW_INCOMPLETE_INPUT # pyright: ignore - assert isinstance(flags, int) - - codeob = self._instrumented_compile(source, filename, symbol, flags) - - for feature in _features: - if codeob.co_flags & feature.compiler_flag: - self.flags |= feature.compiler_flag - return codeob - - @staticmethod - def _instrumented_compile(source: str, filename: str, symbol: str, flags: int) -> _tp.CodeType: - orig_tree = compile(source, filename, symbol, flags | ast.PyCF_ONLY_AST, True) - transformer = _DeferredInstrumenter(source, filename) - new_tree = ast.fix_missing_locations(transformer.visit(orig_tree)) - return compile(new_tree, filename, symbol, flags, True) - - class DeferredInteractiveConsole(code.InteractiveConsole): - """An emulator of the interactive Python interpreter, but with defer_imports's transformation baked in. - - This ensures that defer_imports.until_use works as intended when used directly in an instance of this - console. 
-            """
-
-            def __init__(
-                self,
-                locals: _tp.Optional[_tp.MutableMapping[str, _tp.Any]] = None,
-                filename: str = "",
-            ) -> None:
-                defer_locals = {f"@{klass.__name__}": klass for klass in (_DeferredImportKey, _DeferredImportProxy)}
-
-                if locals is not None:
-                    locals.update(defer_locals)
-                else:
-                    locals = defer_locals | {"__name__": "__console__", "__doc__": None}  # noqa: A001
-
-                super().__init__(locals, filename)
-                self.compile.compiler = _DeferredCompile()
-
-        def interact(readfunc: _tp.Optional[_tp.AcceptsInput] = None) -> None:
-            r"""Closely emulate the interactive Python console, but instrumented by defer_imports.
-
-            The resulting console supports direct use of the defer_imports.until_use context manager.
-
-            Parameters
-            ----------
-            readfunc: \_tp.Optional[\_tp.AcceptsInput], optional
-                An input function to replace InteractiveConsole.raw_input(). If not given, default to trying to import
-                readline to enable GNU readline if available.
-
-            Notes
-            -----
-            Much of this implementation is based on asyncio.__main__ in CPython 3.14.
-            """
-
-            py_impl_name = sys.implementation.name
-            sys.audit(f"{py_impl_name}.run_stdin")
-
-            repl_locals = {
-                "__name__": __name__,
-                "__package__": __package__,
-                "__loader__": __loader__,
-                "__file__": __file__,
-                "__spec__": __spec__,
-                "__builtins__": __builtins__,
-                "defer_imports": sys.modules[__name__],
-            }
-            console = DeferredInteractiveConsole(repl_locals)
-
-            if readfunc is not None:
-                console.raw_input = readfunc
-            else:
-                try:
-                    import readline
-                except ImportError:
-                    readline = None
-
-                interactive_hook = getattr(sys, "__interactivehook__", None)
-                if interactive_hook is not None:
-                    sys.audit(f"{py_impl_name}.run_interactivehook")
-                    interactive_hook()
-
-                if sys.version_info >= (3, 13):
-                    import site
-
-                    if interactive_hook is site.register_readline:
-                        # Fix the completer function to use the interactive console locals.
-                        try:
-                            import rlcompleter
-                        except ImportError:
-                            pass
-                        else:
-                            if readline is not None:
-                                completer = rlcompleter.Completer(console.locals)
-                                readline.set_completer(completer.complete)
-
-            if startup_path := os.getenv("PYTHONSTARTUP"):
-                sys.audit(f"{py_impl_name}.run_startup", startup_path)
-
-                with tokenize.open(startup_path) as f:
-                    startup_code = compile(f.read(), startup_path, "exec")
-                    exec(startup_code, console.locals)  # noqa: S102  # pyright: ignore [reportArgumentType]
-
-            banner = (
-                f"Python {sys.version} on {sys.platform}\n"
-                'Type "help", "copyright", "credits" or "license" for more information.\n'
-                f"({type(console).__name__})\n"
-            )
-            ps1 = getattr(sys, "ps1", ">>> ")
-
-            console.write(banner)
-            console.write(f"{ps1}import defer_imports\n")
-            console.interact("")
-
-        return globals()[name]
-
-    msg = f"module {__name__!r} has no attribute {name!r}"
-    raise AttributeError(msg)
-
-
-_initial_global_names = tuple(globals())
+        _original_import.reset(self._import_ctx_token)
+        _is_deferred.reset(self._defer_ctx_token)
+        builtins.__import__ = _original_import.get()
-def __dir__() -> list[str]:  # pragma: no cover
-    return list(_delayed_console_names.union(_initial_global_names, __all__))
+until_use: typing.Final[DeferredContext] = DeferredContext()
 # endregion
diff --git a/src/defer_imports/__main__.py b/src/defer_imports/__main__.py
deleted file mode 100644
index dac8914..0000000
--- a/src/defer_imports/__main__.py
+++ /dev/null
@@ -1,8 +0,0 @@
-# SPDX-FileCopyrightText: 2024-present Sachaa-Thanasius
-#
-# SPDX-License-Identifier: MIT
-
-from . import interact
-
-
-raise SystemExit(interact())
diff --git a/src/defer_imports/_typing_compat.py b/src/defer_imports/_typing_compat.py
deleted file mode 100644
index 9a97a70..0000000
--- a/src/defer_imports/_typing_compat.py
+++ /dev/null
@@ -1,188 +0,0 @@
-# SPDX-FileCopyrightText: 2024-present Sachaa-Thanasius
-#
-# SPDX-License-Identifier: MIT
-
-"""A __getattr__-based lazy import shim for typing- and annotation-related symbols."""
-
-from __future__ import annotations
-
-import sys
-from importlib.machinery import ModuleSpec
-
-
-__all__ = (
-    # -- from collections.abc
-    "Callable",
-    "Generator",
-    "Iterable",
-    "MutableMapping",
-    "Sequence",
-    # -- from typing
-    "Any",
-    "Final",
-    "Optional",
-    "Union",
-    # -- from types
-    "CodeType",
-    "ModuleType",
-    # -- from importlib.abc
-    "Loader",
-    # -- from os
-    "PathLike",
-    # -- imported with fallbacks
-    "ReadableBuffer",
-    "Self",
-    "TypeAlias",
-    "TypeGuard",
-    # -- needs import for definition
-    "T",
-    "AcceptsInput",
-    "PathEntryFinderProtocol",
-    # -- pure definition
-    "final",
-)
-
-TYPE_CHECKING = False
-
-if TYPE_CHECKING:
-    from typing import final
-else:
-
-    def final(f: object) -> object:
-        """Decorator to indicate final methods and final classes.
-
-        Slightly modified version of typing.final to avoid importing from typing at runtime.
-        """
-
-        try:
-            f.__final__ = True  # pyright: ignore  # Runtime attribute assignment
-        except (AttributeError, TypeError):  # pragma: no cover
-            # Skip the attribute silently if it is not writable.
-            # AttributeError: if the object has __slots__ or a read-only property
-            # TypeError: if it's a builtin class
-            pass
-        return f
-
-
-def __getattr__(name: str) -> object:  # noqa: PLR0911, PLR0912, PLR0915
-    # ---- Pure imports
-    if name in {"Callable", "Generator", "Iterable", "MutableMapping", "Sequence"}:
-        global Callable, Generator, Iterable, MutableMapping, Sequence
-
-        from collections.abc import Callable, Generator, Iterable, MutableMapping, Sequence
-
-        return globals()[name]
-
-    if name in {"Any", "Final", "Optional", "Union"}:
-        global Any, Final, Optional, Union
-
-        from typing import Any, Final, Optional, Union
-
-        return globals()[name]
-
-    if name in {"CodeType", "ModuleType"}:
-        global CodeType, ModuleType
-
-        from types import CodeType, ModuleType
-
-        return globals()[name]
-
-    if name == "Loader":
-        global Loader
-
-        from importlib.abc import Loader
-
-        return globals()[name]
-
-    if name == "PathLike":
-        global PathLike
-
-        from os import PathLike
-
-        return globals()[name]
-
-    # ---- Imports with fallbacks
-    if name == "ReadableBuffer":
-        global ReadableBuffer
-
-        if sys.version_info >= (3, 12):
-            from collections.abc import Buffer as ReadableBuffer
-        elif TYPE_CHECKING:
-            from typing_extensions import Buffer as ReadableBuffer
-        else:
-            from typing import Union
-
-            ReadableBuffer = Union[bytes, bytearray, memoryview]
-
-        return globals()[name]
-
-    if name == "Self":
-        global Self
-
-        if sys.version_info >= (3, 11):
-            from typing import Self
-        elif TYPE_CHECKING:
-            from typing_extensions import Self
-        else:
-
-            class Self:
-                """Placeholder for typing.Self."""
-
-        return globals()[name]
-
-    if name in {"TypeAlias", "TypeGuard"}:
-        global TypeAlias, TypeGuard
-
-        if sys.version_info >= (3, 10):
-            from typing import TypeAlias, TypeGuard
-        elif TYPE_CHECKING:
-            from typing_extensions import TypeAlias, TypeGuard
-        else:
-
-            class TypeAlias:
-                """Placeholder for typing.TypeAlias."""
-
-            class TypeGuard:
-                """Placeholder for typing.TypeGuard."""
-
-        return globals()[name]
-
-    # ---- Composed types/values with imports involved
-    if name == "T":
-        global T
-
-        from typing import TypeVar
-
-        T = TypeVar("T")
-        return globals()[name]
-
-    if name == "AcceptsInput":
-        global AcceptsInput
-
-        from typing import Protocol
-
-        class AcceptsInput(Protocol):
-            def __call__(self, prompt: str = "") -> str: ...
-
-        return globals()[name]
-
-    if name == "PathEntryFinderProtocol":
-        global PathEntryFinderProtocol
-
-        from typing import Protocol
-
-        class PathEntryFinderProtocol(Protocol):
-            # Copied from _typeshed.importlib.
-            def find_spec(self, fullname: str, target: ModuleType | None = ..., /) -> ModuleSpec | None: ...
-
-        return globals()[name]
-
-    msg = f"module {__name__!r} has no attribute {name!r}"
-    raise AttributeError(msg)
-
-
-_initial_global_names = tuple(globals())
-
-
-def __dir__() -> list[str]:
-    return list(set(_initial_global_names + __all__))
diff --git a/tests/sample_stdlib_imports.py b/tests/sample_stdlib_imports.py
index 2163528..67a2953 100644
--- a/tests/sample_stdlib_imports.py
+++ b/tests/sample_stdlib_imports.py
@@ -1,5 +1,5 @@
 # pyright: reportUnusedImport=none, reportMissingTypeStubs=none
-# Generated by benchmark/generate_samples.py
+# Generated by bench/generate_samples.py
 import defer_imports