Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up import time by deferring inspect #499

Merged
merged 4 commits into from
Aug 20, 2024
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 11 additions & 6 deletions importlib_metadata/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,6 @@
import zipp
import email
import types
import inspect
import pathlib
import operator
import textwrap
Expand Down Expand Up @@ -1071,6 +1070,9 @@ def _topmost(name: PackagePath) -> Optional[str]:
return top if rest else None


inspect = None
Copy link
Contributor Author

@danielhollas danielhollas Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a trick that is used in stdlib. I am not sure it is needed here, but since get_toplevel_name is called in a loop from _top_level_inferred perhaps it is warranted to avoid the overhead of calling import inspect repeatedly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of global variables or the additional complexity. This change introduces enough disruption to the essential logical flow that I'm -1. Do we know how much overhead there is in repeated import inspect? My understanding (which may be incorrect) is that import inspect is essentially a dict lookup if it's already been imported. I'm guessing the overhead is acceptable. Can we try a simple deferral for now?



def _get_toplevel_name(name: PackagePath) -> str:
"""
Infer a possibly importable module name from a name presumed on
Expand All @@ -1089,11 +1091,14 @@ def _get_toplevel_name(name: PackagePath) -> str:
>>> _get_toplevel_name(PackagePath('foo.dist-info'))
'foo.dist-info'
"""
return _topmost(name) or (
# python/typeshed#10328
inspect.getmodulename(name) # type: ignore
or str(name)
)
n = _topmost(name)
if n:
return n

global inspect
if inspect is None:
import inspect
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we don't have a test protecting this behavior, I'd like to see a comment pointing to the issue, so a future someone doesn't refactor this optimization away.

return inspect.getmodulename(name) or str(name)


def _top_level_inferred(dist):
Expand Down
Loading