Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[UX] minor optimizations for launch and introduce py-spy #4495

Merged
merged 2 commits into from
Jan 13, 2025

Conversation

aylei
Copy link
Contributor

@aylei aylei commented Dec 20, 2024

I've investigated python profiling tools when developing #3159 since python -m cProfile only sees profile on main thread and profiling multi-thread applications using cProfile is tedious. I evaluated yappi and py-spy, it turns out the latter one has better UX and broader adoption. Therefore, I propose changing the profiling approach in dev guide to py-spy.

Also, I found some minor optimization chances and made some quicks in this PR. sky launch speed ​​up 100ms in average:

# PR branch, 26.16s
/usr/bin/time bash -c 'for i in {1..10}; do SKYPILOT_DISABLE_USAGE_COLLECTION=1 python -m sky.cli launch --gpus H100:8> /dev/null; done;'
       26,16 real        16,37 user        25,22 sys

# master branch, 27.26s
/usr/bin/time bash -c 'for i in {1..10}; do SKYPILOT_DISABLE_USAGE_COLLECTION=1 python -m sky.cli launch --gpus H100:8> /dev/null; done;'
       27,26 real        16,26 user        25,34 sys

Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • Any manual or new tests for this PR (see benchmark above)
  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • Backward compatibility tests: conda deactivate; bash -i tests/backward_compatibility_tests.sh

appendix:

  • GIL flamegrpah captured by py-spy

gil

  • Speedscope flamegraph captured by py-sy, which is more user-friendly
Screenshot 2024-12-20 at 18 38 51

@aylei aylei marked this pull request as ready for review December 20, 2024 11:17
@aylei
Copy link
Contributor Author

aylei commented Dec 20, 2024

@Michaelvll please kindly review this PR

@aylei aylei changed the title [UX] minor optimizations for launch and introduce py-spy [UX] minor optimizations for launch, update contributing guide Dec 20, 2024
@aylei aylei changed the title [UX] minor optimizations for launch, update contributing guide [UX] minor optimizations for launch and introduce py-spy Dec 20, 2024
@aylei aylei force-pushed the issue-3159/minor-opts branch from ace3325 to be60a18 Compare December 20, 2024 13:01
Copy link
Collaborator

@Michaelvll Michaelvll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aylei! It looks mostly good to me.

Comment on lines +8 to 10
# slow due to https://github.com/python-pendulum/pendulum/issues/808
# FIXME(aylei): bump pendulum if it get fixed
import pendulum
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can even get rid of this package, as we have also been observing inaccurate translation of the timestamp to readable time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any issue with this problem? If not, I'd like to open one as a followup~

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was observed a few times, but no issue for that yet. Please feel free to file an issue for the slowness. : )

Comment on lines +32 to +35
# requests and inspect cost ~100ms to load, which can be postponed to
# collection phase or skipped if user specifies no collection
requests = adaptors_common.LazyImport('requests')
inspect = adaptors_common.LazyImport('inspect')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, it might be good to have the usage being called asynchronously.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, concurrency using threads/coroutines cannot leverage multiple CPU cores due to GIL and the network IO in usage_lib (_send_to_loki) happens when cli exiting. It might be okay to send the log synchronously now, since we are going to wait for _send_to_loki complete anyway.

@aylei aylei requested a review from Michaelvll January 2, 2025 14:15
@Michaelvll Michaelvll merged commit 38967f6 into skypilot-org:master Jan 13, 2025
38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants