
Generate charts for Fio benchmark #3549

Merged · 5 commits · Oct 3, 2023

Conversation

MVarshini (Contributor)

PBENCH-1214

Generate charts for Fio benchmark

@MVarshini added the Dashboard label on Sep 7, 2023
@MVarshini self-assigned this on Sep 7, 2023
@dbutenhof (Member) left a comment


You didn't update the pquisby version in server/requirements.txt, which I'd expected you to. What pquisby version are you using?

Aside from that, rather than making specific source comments, I pulled the branch and tried it, and what I'm seeing confuses me:

When I run with the current server lock (pquisby==0.0.17), visualization doesn't work at all for fio: I get errors like Quisby processing failure. Exception: list index out of range on either of the functional test fio runs and the new one we generated the other day.

On the other hand if I change to pquisby==0.0.25, which is the latest, one of the two "canned" functional test FIO runs (fio_rw_2018.02.01T22.40.57) "works" (although the display is a bit odd), but the other canned FIO run and the new one we generated the other day both fail with Quisby processing failure. Exception: not enough values to unpack (expected 2, got 0)

Are you using a different Pquisby version???

And with the one fio dataset that does work, I get
[chart screenshot]

Obviously it's not very "interesting", but that's OK. However, what does it mean that the caption says <>_d-<>_j-<>_iod? Is that intentional, or a formatting issue? (Or missing data?)

@MVarshini (Contributor, Author)

@dbutenhof I was pointing to the latest version of pquisby and didn't check in that file.

Since the disk-job value is unknown, <>_d-<>_j-<>_iod is left empty in the response.

@dbutenhof (Member)

> @dbutenhof I was pointing to the latest version of pquisby and didn't check in that file.

Hmm. So 0.0.25 is working for you? Because it wasn't for me, on two of the three fio datasets, which makes me nervous. Was there something else in your test environment that didn't get onto the PR branch, aside from the requirements.txt change? 😦

> Since the disk-job value is unknown, <>_d-<>_j-<>_iod is left empty in the response.

That's rather ugly, but it sounds like maybe that's some of the extra information Soumya was hoping to get from the server? (Although I'm not sure we know how to find it.)

@webbnh (Member) left a comment


I don't think we should leave the pquisby dependency floating. (I'm not going to block the merge on that basis, but I'm not approving, either....)

Also, I have a concern about the behavior of getChartValues() when the benchmark isn't suitable for visualization...where do we guard against that misbehavior?

Other than those, I have a few minor things for your consideration.

[Several review comments on dashboard/src/actions/comparisonActions.js, since resolved]

Comment on lines 158 to 161
}, Object.create(null));

for (const [key, value] of Object.entries(result)) {
  console.log(key);

  const map = {};
(Member)


Noting (new) line 158, should we be using Object.create(null) at line 161 instead of {}?
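
(For illustration only, not code from this PR: a minimal sketch of why a prototype-less object can be safer as a lookup map when the keys come from data.)

```js
// Minimal sketch: a plain {} inherits from Object.prototype, so keys such
// as "constructor" or "toString" appear to exist before anything has been
// assigned; an object created with Object.create(null) has no prototype,
// so only the keys we actually set are present.
const plain = {};
const bare = Object.create(null);

console.log("constructor" in plain); // true (inherited from the prototype)
console.log("constructor" in bare);  // false

bare.toString = "some data-driven key"; // safe: nothing inherited is shadowed
console.log(Object.keys(bare));         // [ 'toString' ]
```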

Comment on lines +187 to +189
export const toggleCompareSwitch = () => ({
  type: TYPES.TOGGLE_COMPARE_SWITCH,
});
(Member)


If you omit the trailing comma on line 188, I think this will all fit on one line.
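
(Purely as an illustration of that suggestion, the collapsed form would look like the line below; whether Prettier actually keeps it on one line depends on its configured print width.)

```js
export const toggleCompareSwitch = () => ({ type: TYPES.TOGGLE_COMPARE_SWITCH });
```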

-pquisby==0.0.17
+pquisby
(Member)


Are we sure that we want to let this "float"?

Given recent experiences, I think we should peg this at a "known good" version.

(Member)


I wondered at Webb's duplicate of my comment, but when I just started re-reviewing today, I realized that I never actually published that earlier comment. Oops!

So I'll just add here that pquisby is now at 0.0.27, vs 0.0.25 when I wrote that earlier comment, and I have no idea what's changed. Does it still work? I really do think we should be deliberately managing this dependency rather than letting it float.

[Review comment on dashboard/src/actions/comparisonActions.js, since resolved]
@sousinha97

sousinha97 commented Sep 8, 2023

@dbutenhof, in order to analyse Fio benchmark data, knowing the following is apparently important (I am not an expert on FIO):

  1. Number of disks used
  2. IODepth
  3. Numjobs

These are the fields we are adding to the graphs; if it's not possible to fetch this data, we can remove them from the graph. (It would have been great if we were able to extract this data, though, as it saves the user the effort of going back to their runs and manually checking the values of these fields.) I would love to know your views on this; accordingly we can plan and modify the graphs.

I guess these values are provided when running the fio command; if not, default values are used.

@dbutenhof (Member)

dbutenhof commented Sep 8, 2023

> @dbutenhof, in order to analyse Fio benchmark data, knowing the following is apparently important (I am not an expert on FIO):

We're definitely not FIO experts, either. We've pieced together how to run it in trivial configurations, but the data in the result.csv that Quisby consumes was determined years ago as part of a complicated Pbench Agent post-processing step we've barely touched.

>   1. Number of disks used

I assume we could figure out how to extract this information from the more detailed benchmark logs ... but I'm not even sure if it's global configuration or private to each iteration. (Or, for that matter, whether the Pbench "iteration" concept, which is how the Agent organizes the raw data, directly corresponds to the fio "job" configuration...)

Each fio "job" seems to be targeted to a specific filesystem path, which would define the disk used. Since the result.csv lists each Pbench Agent "iteration" (which I think equates directly to a fio "job" in the standard benchmark wrapper, but don't quote me on that), I suspect that each job is a single disk.

But a lot of these constraints are due to the design of the pbench-fio wrapper script. Right now, Quisby relies on the post-processed result.csv file which means we can only visualize/compare the output of pbench-fio. We also have people who run fio directly, for example, using pbench-user-benchmark, often to take advantage of the massive flexibility of the fio command configuration that's not supported by the Pbench Agent wrapper. Right now we have no way to visualize/compare those runs as we don't even know it's "fio" and we don't have the agent post-processing.

>   2. IODepth

This appears to be a global configuration setting on the fio job file, and we can extract it from a fio-generated JSON output file or from the raw text job description file.

>   3. Numjobs

As I mentioned above, I'm not entirely certain how the pbench-fio wrapper maps the fio "job" concept into Pbench Agent "iterations". I suspect (without much proof at this point) that it's one-to-one. However, a fio configuration file can apparently define multiple "jobs". We can read the list from the input file or fio's output to get it.
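
(A rough sketch of that idea, not code from this PR: it assumes fio was run with --output-format=json and that the output exposes a "global options" object and a "jobs" array, both of which may vary by fio version; the file name is hypothetical.)

```js
// Rough sketch only: pull iodepth and a job count out of a fio JSON output
// file. The "global options" / "jobs" layout is an assumption about fio's
// JSON format, and "fio-result.json" is a hypothetical local path.
const fs = require("fs");

const fioOut = JSON.parse(fs.readFileSync("fio-result.json", "utf8"));

const iodepth = (fioOut["global options"] || {}).iodepth || "unknown";
// The number of entries in "jobs" may not equal numjobs (e.g. when
// group_reporting collapses them), so treat this as an estimate.
const jobCount = Array.isArray(fioOut.jobs) ? fioOut.jobs.length : 0;

console.log(`iodepth=${iodepth}, jobs=${jobCount}`);
```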

> These are the fields we are adding to the graphs; if it's not possible to fetch this data, we can remove them from the graph. (It would have been great if we were able to extract this data, though, as it saves the user the effort of going back to their runs and manually checking the values of these fields.) I would love to know your views on this; accordingly we can plan and modify the graphs.

> I guess these values are provided when running the fio command; if not, default values are used.

We can figure this all out, but first we need to figure out the relative importance of figuring it out compared to the other stuff we need to do. 😄

FYI: I've summarized this at PBENCH-1274.

@webbnh (Member)

webbnh commented Sep 8, 2023

Just poking around a randomly selected FIO result, I found the generate-benchmark-summary.cmd file at the top level, which contains

/opt/pbench-agent/bench-scripts/postprocess/generate-benchmark-summary "fio" "--block-sizes=4,1024 --iodepth=8 --numjobs=10 --ramptime=10 --runtime=60 --samples=5 --targets=/fio --test-types=read,write --clients=192.168.122.211" "/var/lib/pbench-agent/fio__2023.09.06T12.05.46"

It looks like the values for iodepth and numjobs are there, and, extending Dave's suggestion above, I expect that the number of disks can be gleaned from the targets value there.

If we really want to dig, we have the fio.job files for each iteration, each of which contains a global section with the iodepth and a job-<dev> section for each device (which we can count to get the number of devices) each of which contains the number of jobs.

However, if (for the moment) we want to restrict ourselves to the information that we already have, I think that the Pbench Dashboard can glean the number of disks from the Pbench Server metadata, dataset.metalog.iterations/<iteration>.dev (i.e., in the metadata.log file, under each [iterations/...] section, there is a dev entry which contains the same value as the targets in the command above). And, I think Quisby can deduce the number of jobs by counting the iops_sec:client_hostname:* columns in the result.csv file. However, I don't see any way to get the I/O depth from the information that we already have on hand.

[I've added this information in a reply to PBENCH-1274.]
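
(As a rough illustration of the column-counting idea above, not code from this PR: the result.csv path and exact header layout are assumptions.)

```js
// Rough sketch only: estimate the number of jobs by counting the
// iops_sec:client_hostname:* columns in a result.csv header, as suggested
// above. "result.csv" is a hypothetical local path for illustration.
const fs = require("fs");

const headerLine = fs.readFileSync("result.csv", "utf8").split("\n")[0];
const jobColumns = headerLine
  .split(",")
  .filter((col) => col.trim().startsWith("iops_sec:client_hostname:"));

console.log(`estimated job count: ${jobColumns.length}`);
```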

@sousinha97

sousinha97 commented Sep 11, 2023

@dbutenhof @webbnh thanks for your input. Yes, this process requires a bit of discussion; we can surely look into other major issues first and keep this in the backlog as an optimisation task. I will contact Varshini and remove this feature for now.

@dbutenhof (Member)

No match for argument: rsyslog-mmjsonparse
Error: Unable to find a match: rsyslog-mmjsonparse

That's distressing. I'm not sure whether this is a transient failure in accessing repos or a real change in the dependencies. I'm going to re-trigger the build to see if it happens again.

@dbutenhof (Member) left a comment


I think Webb has some remaining good suggestions, but I don't believe any of them are critical.

> Also, I have a concern about the behavior of getChartValues() when the benchmark isn't suitable for visualization... where do we guard against that misbehavior?

The server API returns a useful message in this case, which the dashboard displays:

[screenshot of the error message]

I think that's fine.

I'm still slightly concerned about allowing pquisby to float, although the latest 0.0.27 seems to work.

On one hand I'd like to give Webb another chance to re-review, but I'm also tempted to merge it this week and deploy onto staging at least by the ops review on Thursday. I'll have to wrestle with that one; but for now I'm going to approve.

(Incidentally, I also think it's intriguing that Jenkins re-ran the CI with my container build fix even though Varshini's commits weren't rebased on top of that.)

@@ -12,7 +12,7 @@ flask-restful>=0.3.9
 flask-sqlalchemy
 gunicorn
 humanize
-pquisby==0.0.17
+pquisby
(Member)


While this is OK, given the inconsistency we've seen in pquisby PyPi packages, I think I'd prefer to see pquisby==0.0.25 if that's what we're supporting for uperf and fio visualization. We can change the version later as necessary.


@dbutenhof merged commit cc22946 into distributed-system-analysis:main on Oct 3, 2023
3 checks passed
Labels: Dashboard (Of and relating to the Dashboard GUI)
Project status: Done
4 participants