Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalize DiskIOMeter bar by number of disks #1533

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Explorer09
Copy link
Contributor

@Explorer09 Explorer09 commented Sep 6, 2024

The maximum utilization should be "100% * number of disks". Set the bar and graph of the meter to draw with that maximum. Also remove the 100% cap on the utilization percentage display.

Resolves #1374.

@natoscott
Copy link
Member

Something like this patch will provide the PCP value you seek. You might want to consider a similar change for the NetworkIOMeter, in which case there is a "hinv.ninterface" metric that can be used similarly.

pcp/Metric.h
--- /tmp/git-blob-5igpu5/Metric.h	2024-09-06 15:32:13.776523149 +1000
+++ pcp/Metric.h	2024-09-06 15:26:46.052772959 +1000
@@ -26,6 +26,7 @@ typedef enum Metric_ {
    PCP_CONTROL_THREADS,         /* proc.control.perclient.threads */
 
    PCP_HINV_NCPU,               /* hinv.ncpu */
+   PCP_HINV_NDISK,              /* hinv.ndisk */
    PCP_HINV_CPUCLOCK,           /* hinv.cpu.clock */
    PCP_UNAME_SYSNAME,           /* kernel.uname.sysname */
    PCP_UNAME_RELEASE,           /* kernel.uname.release */
pcp/Platform.c
--- /tmp/git-blob-JCaR3p/Platform.c	2024-09-06 15:32:13.956523525 +1000
+++ pcp/Platform.c	2024-09-06 15:30:56.940361942 +1000
@@ -127,6 +127,7 @@ static const char* Platform_metricNames[
    [PCP_CONTROL_THREADS] = "proc.control.perclient.threads",
 
    [PCP_HINV_NCPU] = "hinv.ncpu",
+   [PCP_HINV_NDISK] = "hinv.ndisk",
    [PCP_HINV_CPUCLOCK] = "hinv.cpu.clock",
    [PCP_UNAME_SYSNAME] = "kernel.uname.sysname",
    [PCP_UNAME_RELEASE] = "kernel.uname.release",
@@ -385,6 +386,7 @@ bool Platform_init(void) {
    Metric_enable(PCP_PID_MAX, true);
    Metric_enable(PCP_BOOTTIME, true);
    Metric_enable(PCP_HINV_NCPU, true);
+   Metric_enable(PCP_HINV_NDISK, true);
    Metric_enable(PCP_PERCPU_SYSTEM, true);
    Metric_enable(PCP_UNAME_SYSNAME, true);
    Metric_enable(PCP_UNAME_RELEASE, true);
@@ -753,6 +755,8 @@ bool Platform_getDiskIO(DiskIOData* data
       data->totalBytesWritten = value.ull;
    if (Metric_values(PCP_DISK_ACTIVE, &value, 1, PM_TYPE_U64) != NULL)
       data->totalMsTimeSpend = value.ull;
+   if (Metric_values(PCP_HINV_NDISK, &value, 1, PM_TYPE_U64) != NULL)
+      data->numDisks = value.ull;
    return true;
 }
 

@Explorer09
Copy link
Contributor Author

@natoscott There is no "utilization percentage" for NetworkIOMeter at the moment. So I would consider showing the number of interfaces a feature separate from this pull request.

@natoscott
Copy link
Member

Understood & agreed. FWIW, I think this approach is not ideal (shares the same problem of the original 100% limiting, as mentioned by the reporter of #1374 - the code was fine the way it was originally). We need to be aware here of properties of some hardware described in the iostat(1) man page ...

              %util  Percentage  of elapsed time during which I/O requests were
                     issued to the device (bandwidth utilization  for  the  de‐
                     vice).  Device  saturation occurs when this value is close
                     to 100% for devices serving requests  serially.   But  for
                     devices  serving requests in parallel, such as RAID arrays
                     and modern SSDs, this number does not reflect  their  per‐
                     formance limits.

@Explorer09
Copy link
Contributor Author

@natoscott What was the ideal approach anyway?

@natoscott
Copy link
Member

Just reverting to the original htop behaviour as the issue reporter suggested.

@Explorer09
Copy link
Contributor Author

@natoscott Look in my patch carefully. I separated the utilization value into two. cached_utilisation_diff would revert to original behavior of allowing more than 100% to be shown; the bar would be drawn with cached_utilisation_norm which is the normalized value.

@BenBE BenBE added the enhancement Extension or improvement to existing feature label Sep 6, 2024
@BenBE BenBE added this to the 3.4.0 milestone Sep 6, 2024
@Explorer09 Explorer09 marked this pull request as ready for review September 6, 2024 19:18
@@ -16,6 +16,7 @@ typedef struct DiskIOData_ {
uint64_t totalBytesRead;
uint64_t totalBytesWritten;
uint64_t totalMsTimeSpend;
uint64_t numDisks;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the data type be unsigned int, size_t, uint64_t or unsigned long long?
Should the property be named "numDisks" or "nDisks"?

Other than those questions, this patch is ready.

@natoscott
Copy link
Member

@natoscott Look in my patch carefully. I separated the utilization value into two. cached_utilisation_diff would revert to original behavior of allowing more than 100% to be shown; the bar would be drawn with cached_utilisation_norm which is the normalized value.

Two thumbs up - good stuff, thanks. Regarding your data type question, I think I know what @BenBE is going to suggest: size_t FTW. :) However, systems with more than 4 billion disks are few and far between so I'm sure a 32 bit unsigned value would suffice. If noone has strong opinions, I suggest sticking with what you have now since that's consistent with the surrounding code.

Copy link
Member

@natoscott natoscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

DiskIOMeter.c Outdated Show resolved Hide resolved
@Explorer09
Copy link
Contributor Author

Regarding your data type question, I think I know what @BenBE is going to suggest: size_t FTW. :)

For PCP this would require a downcast from uint64_t and an (numDisks <= SIZE_MAX) check that I don't think it's worth it. That's why I chose a larger data type than size_t for now.

@natoscott
Copy link
Member

Regarding your data type question, I think I know what @BenBE is going to suggest: size_t FTW. :)

For PCP this would require a downcast from uint64_t and...

FWIW, for PCP to use 32-bit types we would change this section:

   if (Metric_values(PCP_HINV_NDISK, &value, 1, PM_TYPE_U64) != NULL)
      data->numDisks = value.ull;

to

   if (Metric_values(PCP_HINV_NDISK, &value, 1, PM_TYPE_U32) != NULL)
      data->numDisks = value.ul;

but, keep it as you have it, it's fine IMO.

The utilization percentage of DiskIOMeter is an accumulated total of
all disks, and for multiple disks, utilization above 100% is possible.
The maximum utilization should be "100% * number of disks". Set the bar
and graph of the meter to draw with that maximum.

Thanks to Nathan Scott for providing the PCP portion of the patch.

Resolves htop-dev#1374.

Co-authored-by: Nathan Scott <[email protected]>
Signed-off-by: Kang-Che Sung <[email protected]>
@ShaiMagal
Copy link

When this PR will be merged? It's "game changer" for overall about disk activity :)

@fasterit
Copy link
Member

@ShaiMagal: Have you tested the PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Extension or improvement to existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow Disk IO meter to be above 100%
5 participants