Early version of profiler harness #959

beroy · 2024-01-29T03:05:12Z

Include a basic benchmark as a starting point, along with the scripts needed to run it.

codecov · 2024-01-29T04:47:22Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.94%. Comparing base (4f40690) to head (a85038a).

❗ Current head a85038a differs from pull request most recent head c4b9a26. Consider uploading reports for the commit c4b9a26 to get more accurate results

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #959   +/-   ##
=======================================
  Coverage   76.94%   76.94%           
=======================================
  Files          72       72           
  Lines        5691     5691           
=======================================
  Hits         4379     4379           
  Misses       1312     1312

Flag	Coverage Δ
unittests	`76.94% <ø> (ø)`

atolopko-czi · 2024-01-29T14:21:53Z

.github/workflows/profiler.yml

@@ -0,0 +1,30 @@
+name: Performance check


I would align this name with the workflow file name.

atolopko-czi · 2024-01-29T14:22:27Z

.github/workflows/profiler.yml

+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0


Use fetch-depth: 1 if you just want the HEAD commit.

atolopko-czi · 2024-01-29T14:26:50Z

tools/perf_checker/benchmark1.py

This is testing "export anndata", so why not name the file as such?

Also, maybe put all the benchmarks into a subdir. The perf_checker.sh script could then just iterate through all the files, avoiding the explicit definition (and future updates) of the benchmarks array in the script.

atolopko-czi · 2024-01-29T14:29:14Z

tools/perf_checker/perf_checker.sh

+# Downloading TileDB-SOMA (branch or main once merged)
+git clone https://github.com/single-cell-data/TileDB-SOMA.git
+cd TileDB-SOMA
+git checkout census_profiler


This will be removed before merging to main, right? (I think that's what the comment above is saying)

This was not removed; it is now present in main and has broken the CI.

atolopko-czi · 2024-01-29T14:31:08Z

tools/perf_checker/perf_checker.sh

+
+# Running all benchmark and checking performance changes
+arraylength=${#benchmarks[@]}
+for (( i=0; i<${arraylength}; i++ ))


You can probably just iterate the array directly, like for benchmark in "${benchmarks[@]}", and avoid the indexing.

Also, per another comment, if all benchmark scripts are placed in a subdir, you can just glob and iterate them: for benchmark in benchmarks/*.py

Yes, you are right.
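The two iteration styles discussed in this thread can be sketched as follows. This is a minimal illustration under stated assumptions, not the script from the PR: the benchmarks directory, the file names in it, and the run_benchmark function are hypothetical stand-ins, and the real script would call the profiler where the placeholder only records a name.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical layout: a scratch directory holding one script per benchmark.
workdir=$(mktemp -d)
mkdir "${workdir}/benchmarks"
touch "${workdir}/benchmarks/export_anndata.py" "${workdir}/benchmarks/read_obs.py"

# Placeholder for the real profiler invocation; it only records the file name.
ran=()
run_benchmark() {
  ran+=("$(basename "$1")")
}

# Style 1: iterate an explicit array without indexing.
# Use "${benchmarks[@]}"; a bare ${benchmarks} expands to the first element only.
benchmarks=("${workdir}/benchmarks/export_anndata.py")
for benchmark in "${benchmarks[@]}"; do
  run_benchmark "${benchmark}"
done

# Style 2: glob the directory, so new benchmark files are picked up automatically.
shopt -s nullglob  # an empty directory gives zero iterations, not a literal pattern
for benchmark in "${workdir}"/benchmarks/*.py; do
  run_benchmark "${benchmark}"
done

echo "ran: ${ran[*]}"
rm -rf "${workdir}"
```

With the glob form, adding a benchmark only requires dropping a new .py file into the directory; the array form keeps the list explicit at the cost of editing the script each time.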

atolopko-czi · 2024-01-29T14:31:24Z

tools/perf_checker/perf_checker.sh

+arraylength=${#benchmarks[@]}
+for (( i=0; i<${arraylength}; i++ ))
+do
+  python  ./TileDB-SOMA/profiler "python ${benchmarks[$i]}" $dbpath -t time


Suggested change

python ./TileDB-SOMA/profiler "python ${benchmarks[$i]}" $dbpath -t time

Is -t time going to run GNU time? The profiler requires GNU-time-formatted output. I would install GNU time in profiler.yml if needed.

Good point. I'll fix it; I need to figure out how to enable it on Linux.

I actually noticed that gtime only works on macOS, and I was not able to install it on Ubuntu. LMK if there's a way to get it to work on Ubuntu.

It's usually available as /usr/bin/time. If not, try sudo apt install time
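One way to handle the macOS/Ubuntu difference is to probe for a GNU time binary before running the profiler. The following is a minimal sketch, assuming that GNU time includes the string "GNU" in its --version output and that other time implementations do not; the find_gnu_time function name is hypothetical and not part of the PR.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the path of the first candidate that looks like GNU time.
# Returns non-zero if none is found. The shell keyword `time` is not a
# candidate, because it cannot produce GNU-time-formatted output.
find_gnu_time() {
  local candidate path version
  for candidate in gtime /usr/bin/time; do
    path=$(command -v "${candidate}" 2>/dev/null) || continue
    # Capture stdout and stderr; tolerate a failing exit status from non-GNU tools.
    version=$("${path}" --version 2>&1 || true)
    case "${version}" in
      *GNU*)
        echo "${path}"
        return 0
        ;;
    esac
  done
  return 1
}

if gnu_time=$(find_gnu_time); then
  echo "Using GNU time at ${gnu_time}"
else
  gnu_time=""
  echo "GNU time not found; on Ubuntu try: sudo apt install time" >&2
fi
```

Probing by path rather than by operating system keeps the same script usable on a macOS laptop (where Homebrew typically installs the binary as gtime) and on an Ubuntu CI runner.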

atolopko-czi · 2024-01-29T14:36:54Z

tools/perf_checker/perf_checker.sh

+mount-s3 census-profiler-tests ./mount-s3 --cache ./s3_cache  --metadata-ttl 300
+
+dbpath=`pwd`/mount-s3
+echo "Mount-S path = ${dbpath}"


mount-s3?

echo not needed if using set -x above

mount-s3 mounts an S3 bucket as a FUSE drive.

atolopko-czi · 2024-01-29T14:41:12Z

tools/perf_checker/perf_checker.sh

+sudo apt install -y ./mount-s3.deb
+
+# Setting up mount-s3
+mkdir ./mount-s3


Would rename this to indicate its contents instead? It could be the same as the bucket name: census-profiler-tests. I would also add a comment explaining that this is necessary to persist the data from the profiling runs performed below.

Great ideas both

atolopko-czi · 2024-01-29T14:52:44Z

tools/perf_checker/benchmark1.py

+        ) as query:
+            query.to_anndata(X_name="raw")
+    t2 = perf_counter()
+    print(f"End to end time {t2 - t1}")


Is timing necessary here, since this is being called by the profiler, which is already timing this script?

ebezzi

LGTM overall. Left a few nitpicks.

ebezzi · 2024-03-08T22:42:08Z

.github/workflows/profiler.yml

+        uses: aws-actions/configure-aws-credentials@v1
+        with:
+          aws-region: us-west-2
+          role-to-assume: arn:aws:iam::401986845158:role/MyNewPlayground #arn:aws:iam::401986845158:role/PlaygroundS3 #


Consolidate/remove comments

ebezzi · 2024-03-08T22:47:44Z

tools/perf_checker/perf_checker.sh

+python3.11 -m venv ~/venv
+. ~/venv/bin/activate
+
+pip install psutil


These can go on a single line

ebezzi · 2024-03-08T22:49:56Z

tools/perf_checker/test_anndata_export.py

+import tiledbsoma as soma
+
+print("Starting bm 1", file=stderr)
+census_S3_latest = dict(census_version="2024-01-01")


If you pin a non-LTS version, it will go away after one month. That said, this is a test file so it's probably fine, but leave a comment for posterity.

beroy requested a review from ebezzi January 29, 2024 03:05

beroy force-pushed the perf_checker branch from 727f75f to 685bc51 Compare January 29, 2024 03:06

beroy requested a review from atolopko-czi January 29, 2024 03:14

atolopko-czi reviewed Jan 29, 2024

beroy force-pushed the perf_checker branch 17 times, most recently from c3c5ae1 to 6f765ef Compare February 1, 2024 23:39

beroy force-pushed the perf_checker branch 23 times, most recently from ed3e36d to 1d9701f Compare March 5, 2024 23:38

ebezzi approved these changes Mar 8, 2024

beroy force-pushed the perf_checker branch 2 times, most recently from a85038a to b47fd35 Compare March 8, 2024 23:12

Early version of profiler harness

c4b9a26

Include a basic benchmark as the starting point and needed scripts

beroy force-pushed the perf_checker branch from b47fd35 to c4b9a26 Compare March 8, 2024 23:19

beroy merged commit 311a352 into main Mar 9, 2024
15 checks passed

beroy deleted the perf_checker branch March 9, 2024 00:34
