gh-118761: Improve import time of mimetypes
#126979
Import time improvement looks great!
My question is: how do these import mod lines in a function body affect the function's execution time itself?
IIRC the import statement acquires a lock, doesn't it?
Nice! 🎉
@@ -23,11 +23,6 @@
read_mime_types(file) -- parse one file, return a dictionary or None
"""

import os
I wonder whether lazily importing the os module is worth it here: it is needed three times, and the improvement is small and only shows up in a special case.
Python always imports os and sys at startup. Moving import sys makes sense, since it's only used by _main(). But I'm not sure about moving os.
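The deferral under discussion can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: scheme_of is a hypothetical helper, and the only point is that the import statement moves from module level into the function that needs it, so importing the enclosing module no longer pays for urllib.parse.

```python
import sys


def scheme_of(url):
    """Return the URL scheme. Hypothetical helper, not part of mimetypes."""
    # Deferred import: urllib.parse is loaded on the first call rather than
    # when this module is imported. Later calls find it in sys.modules.
    import urllib.parse

    return urllib.parse.urlsplit(url).scheme


if __name__ == "__main__":
    print(scheme_of("https://example.com/index.html"))
    # After the first call the module is cached for every later import.
    print("urllib.parse" in sys.modules)
```

The trade-off is that the first caller pays the import cost instead of whoever imports the module, which is why it matters most for rarely used code paths such as a command-line entry point.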
In the OP, @hugovk provides timings for when Python is run without the site.py module (python -S), in which case os and sys are not loaded, I believe?
Yes, it makes a difference with python -S: compare the last three images above.
But I'm fine reverting any of these, because I expect running with site.py is by far the most common case.
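One way to check which modules are already loaded at startup, with and without -S, is to ask a fresh interpreter directly. The sketch below assumes nothing beyond the standard library; loaded_at_startup is a hypothetical helper name, and the result for os may differ between Python versions and builds, so no output is claimed here.

```python
import subprocess
import sys


def loaded_at_startup(module, *flags):
    """Report whether `module` is in sys.modules right after startup.

    Runs a fresh interpreter (the one running this script) with the given
    command-line flags, for example "-S" to skip importing site.
    """
    code = f"import sys; print({module!r} in sys.modules)"
    result = subprocess.run(
        [sys.executable, *flags, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    for flags in ((), ("-S",)):
        label = " ".join(flags) or "(default)"
        for module in ("os", "sys"):
            print(label, module, loaded_at_startup(module, *flags))
```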
They do affect execution time but in a much smaller way. Let's time these scripts that call a function a million times:

    # 1.py
    import urllib.parse

    def thing():
        urllib.parse.urlparse("https://example.com")

    for _ in range(1_000_000):
        thing()

    # 2.py
    def thing():
        import urllib.parse
        urllib.parse.urlparse("https://example.com")

    for _ in range(1_000_000):
        thing()

Importing outside the loop is 1.06 times faster, but this is with a million loops:

    ❯ hyperfine --warmup 1 "python3.14 1.py" "python3.14 2.py"
    Benchmark 1: python3.14 1.py
      Time (mean ± σ):     1.403 s ± 0.014 s    [User: 1.380 s, System: 0.019 s]
      Range (min … max):   1.392 s … 1.441 s    10 runs

    Benchmark 2: python3.14 2.py
      Time (mean ± σ):     1.486 s ± 0.006 s    [User: 1.464 s, System: 0.020 s]
      Range (min … max):   1.477 s … 1.498 s    10 runs

    Summary
      python3.14 1.py ran
        1.06 ± 0.01 times faster than python3.14 2.py

Looping 1,000 times shows no difference:

    ❯ hyperfine --warmup 3 "python3.14 1.py" "python3.14 2.py"
    Benchmark 1: python3.14 1.py
      Time (mean ± σ):     17.5 ms ± 1.5 ms    [User: 14.1 ms, System: 2.8 ms]
      Range (min … max):   16.7 ms … 34.1 ms    152 runs

      Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

    Benchmark 2: python3.14 2.py
      Time (mean ± σ):     17.5 ms ± 1.0 ms    [User: 14.1 ms, System: 2.7 ms]
      Range (min … max):   16.9 ms … 28.2 ms    150 runs

      Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

    Summary
      python3.14 2.py ran
        1.00 ± 0.10 times faster than python3.14 1.py
6% slower in the worst case and no difference in a more realistic scenario sounds good to me.
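For a quick in-process check of the same per-call overhead, without hyperfine or separate script files, something like the following could be used. It is only a sketch: module_level, function_level and compare are hypothetical names, and the absolute numbers will vary by machine and Python version, so none are quoted.

```python
import timeit
import urllib.parse


def module_level():
    # Uses the module imported once at the top of the file.
    urllib.parse.urlparse("https://example.com")


def function_level():
    # The import statement runs on every call; after the first call it
    # resolves from the sys.modules cache instead of loading anything.
    import urllib.parse

    urllib.parse.urlparse("https://example.com")


def compare(number=100_000):
    """Return total seconds for `number` calls of each variant."""
    top = timeit.timeit(module_level, number=number)
    inner = timeit.timeit(function_level, number=number)
    return top, inner


if __name__ == "__main__":
    top, inner = compare()
    print(f"module-level import:   {top:.3f} s")
    print(f"function-level import: {inner:.3f} s")
```

Unlike the hyperfine runs above, this excludes interpreter startup, so it isolates the cost of the repeated import statement itself.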
LGTM.
Thanks for the reviews!
Makes import time 11 to 16 times as fast. Measured with a PGO and LTO non-debug build on macOS.

The slowest import in mimetypes is urllib.parse (taking 5,174 of mimetypes' 5,448 μs = 95%).

After deferring urllib.parse, import time is 480 μs. That's 11.35 times as fast.

We could stop here. The other imports are easy enough to defer as well and have some benefit when running with -S to not import site on initialisation: 7,540 μs -> 469 μs = 16.08 times as fast.

-X importtime
python.exe
total import time: 0.010s -> 0.006s -> 0.005s
mimetypes import time: 0.005s -> 0.000s -> 0.000s

python.exe -S
total import time: 0.013s -> 0.009s -> 0.006s
mimetypes import time: 0.013s -> 0.004s -> 0.001s

hyperfine

python.exe: 24.2 ms -> 11.2 ms (main -> PR)
python.exe -S: 15.1 ms -> 8.4 ms (main -> PR)
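Figures like the ones above can be collected by running the interpreter with -X importtime and reading the report it writes to stderr. The sketch below is one possible way to do that, not the method used for this PR: parse_importtime and measure are hypothetical helpers, and the parser assumes the usual "import time: self | cumulative | name" line layout.

```python
import subprocess
import sys


def parse_importtime(stderr_text):
    """Map module name -> (self_us, cumulative_us) from -X importtime output."""
    timings = {}
    prefix = "import time:"
    for line in stderr_text.splitlines():
        if not line.startswith(prefix):
            continue
        parts = [part.strip() for part in line[len(prefix):].split("|")]
        if len(parts) != 3:
            continue
        self_us, cumulative_us, name = parts
        if not (self_us.isdigit() and cumulative_us.isdigit()):
            continue  # skips the header row ("self [us] | cumulative | ...")
        timings[name] = (int(self_us), int(cumulative_us))
    return timings


def measure(module, *flags):
    """Import `module` in a fresh interpreter and return its timings, if any."""
    result = subprocess.run(
        [sys.executable, *flags, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_importtime(result.stderr).get(module)


if __name__ == "__main__":
    print("default:", measure("mimetypes"))
    print("-S:     ", measure("mimetypes", "-S"))
```

A single run is noisy, so repeated runs or a tool such as hyperfine are better suited for comparing before and after.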