Use caliper #137
base: develop
…umented; needs CALI_CONFIG env variable for output
…ant, kernelvariant
I had to advance Spack Caliper to 2.5.0 via direct edit of spec to
Pinging @ajpowelsnl, as this is likely to be of interest to her. I'll take a look myself as well, later.
…on across variants
…metric to work with older Hatchet
Created GenericFrame class and CompareVariants function to solve this scenario
15.885 Base_OpenMP
And the other
And attempt gf3 = gf1 / gf2, it produces NaNs
nan Base_Seq
So the GenericFrame class replaces root node with structure
Then I can do something like
Now we can compare across variants (e.g. Base_OpenMP / Base_Seq).
I also created CompareVariants (a free function), which puts into a Pandas matrix a comparison of every variant to each other, actually across a list of CALI_FILES, so we can compare across compilers.
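The GenericFrame and CompareVariants code itself sits on top of Hatchet and Pandas and is not reproduced here. As a rough sketch of the underlying idea only, written in C++ to match the rest of this PR: if each timing is keyed by its full tree path, two variants share no keys until the variant-specific root is replaced by a common label. Everything below (`Timings`, `genericPath`, `relabelRoot`, `ratio`, the `Variant` label, and the slash-separated path format) is hypothetical and chosen for illustration; it is not Hatchet's API.

```cpp
#include <map>
#include <string>

// Hypothetical representation: full tree path -> time in seconds.
using Timings = std::map<std::string, double>;

// Replace the first path component (the variant name) with a generic label,
// e.g. "Base_Seq/Basic/DAXPY" -> "Variant/Basic/DAXPY".
std::string genericPath(const std::string& path,
                        const std::string& label = "Variant") {
    const std::string::size_type slash = path.find('/');
    if (slash == std::string::npos) {
        return label;  // the path is just the root node
    }
    return label + path.substr(slash);
}

// Apply genericPath to every entry so trees from different variants line up.
Timings relabelRoot(const Timings& timings) {
    Timings out;
    for (const auto& entry : timings) {
        out[genericPath(entry.first)] = entry.second;
    }
    return out;
}

// Element-wise a / b over the paths present in both inputs.
// Paths found in only one input, or with a zero denominator, are skipped.
Timings ratio(const Timings& a, const Timings& b) {
    Timings out;
    for (const auto& entry : a) {
        const auto match = b.find(entry.first);
        if (match != b.end() && match->second != 0.0) {
            out[entry.first] = entry.second / match->second;
        }
    }
    return out;
}
```

With this sketch, `ratio(omp, seq)` on the raw paths is empty because no keys match, while `ratio(relabelRoot(omp), relabelRoot(seq))` yields one ratio per kernel present in both variants.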
… exiting full path to executable
…book with ipynb file type
…vary in number of kernels
@jonesholger: just saw your last comment, I've run into the same thing. Have you looked at sharing that back with the Hatchet folks? I think it could be useful.
@DavidPoliakoff: Sorry for the late reply. Was offline building a new machine. It looks like some of my work on GenericFrame is heading upstream, but likely more flexible in its ability to edit any node, vs. just "root". ExtractCommonSubtree is too RAJAPerf-specific with respect to tree depth and the recalculations required when comparing two trees which vary in the number of kernels. What I'm looking for now is a way to detect whether Caliper is using the environment to set up intrinsic manager/service(s), so I can tell the Manager in RAJAPerf to back off. This is for the case where CUPTI only allows one subscriber.
src/RAJAPerfSuiteDriver.cpp
Outdated
@@ -10,6 +10,12 @@

#include <iostream>

#ifdef RAJAPERF_USE_CALIPER
@jonesholger why are these includes here? There is no Caliper usage in this file. It looks like they can be removed.
Yep, this was for an earlier approach. Still need to do some cleanup.
src/common/KernelBase.hpp
Outdated
cali::ConfigManager m;
mgr.insert(std::make_pair(vid,m));
std::string vstr = getVariantName(vid);
std::string profile = "spot(output=" + vstr + ".cali)";
Does it make sense to have the "spot" part of the string be something that a user can specify with a command line option? I've heard that Caliper supports multiple types of output. Maybe that's not correct.
I think having the spot service embedded for now is good. The .cali files generated by the spot service are used in Hatchet too, and we don't have to worry about command line arguments or profile inputs. We also set up the per-variant file output here. I think the future caliper_value_add branch, which will support additional services (e.g. CUpti, CUpti-trace, OMPT, PAPI, and later HipXXX), could use more general configuration. Here's an example with a lot of profile configurations which can be switched in at runtime: https://github.com/LLNL/Caliper/blob/master/examples/configs/sampling.conf. The other output file types are split-json for Hatchet (redundant with spot), or JSON. Future Caliper will also have I/O using databases.
You can also provide configs via Environment variables, but that can lead to conflicts when we use a ConfigManager object. For example CUpti can only have one client, and so it'll generate a runtime conflict if we try to activate the service in both places. It is still difficult to detect environment overrides.
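If the config name were made selectable later, the string construction from the diff could be factored out roughly as below. This is only a sketch: `makeProfile` and its `config` parameter are hypothetical names, not part of this PR, and the sketch assumes the chosen config accepts an `output=` option the way `spot` does in the diff.

```cpp
#include <string>

// Hypothetical helper: build a per-variant Caliper config string.
// With the default config this reproduces the string from the diff,
// e.g. "spot(output=Base_Seq.cali)".
std::string makeProfile(const std::string& variantName,
                        const std::string& config = "spot") {
    return config + "(output=" + variantName + ".cali)";
}
```

The result would then be handed to the per-variant `cali::ConfigManager` in place of the hard-coded `profile` string, with `"spot"` remaining the default so current behavior is unchanged.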
Sorry, missed this. Suppose we were currently setting up our Caliper runs using a ConfigManager. Would we need to worry about conflicts?
@DavidPoliakoff, yep, I asked Boehme about providing flags so we can notice what's happening in the environment. I can then put in some logic where effectively the ConfigManager defers. When I get some time I'll look at the Caliper source a bit more closely to see if there's an angle I can exploit.
== CALIPER: cupti: cuptiSubscribe: error: CUPTI_ERROR_MULTIPLE_SUBSCRIBERS_NOT_SUPPORTED
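Until Caliper exposes something for this, one crude workaround is to look at the process environment directly before starting the ConfigManager. The sketch below is a heuristic, not a Caliper API: `envVarSet` and `caliperConfiguredViaEnv` are hypothetical names, and the variable list is an assumption (`CALI_CONFIG` comes from this PR; the other names are taken from Caliper's runtime configuration and should be checked against the Caliper version in use). It does not see configuration supplied through a config file on disk.

```cpp
#include <cstdlib>

// True if the named environment variable exists and is non-empty.
bool envVarSet(const char* name) {
    const char* value = std::getenv(name);
    return value != nullptr && *value != '\0';
}

// Heuristic: does the environment appear to configure Caliper already?
// The list of variable names is an assumption, not an exhaustive set.
bool caliperConfiguredViaEnv() {
    static const char* const names[] = {
        "CALI_CONFIG",
        "CALI_SERVICES_ENABLE",
        "CALI_CONFIG_PROFILE",
        "CALI_CONFIG_FILE",
    };
    for (const char* name : names) {
        if (envVarSet(name)) {
            return true;
        }
    }
    return false;
}
```

RAJAPerf could skip starting its own ConfigManager when `caliperConfiguredViaEnv()` returns true, which would avoid a second CUPTI subscriber in the common case.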
Hatchet analysis crossvariant
Woptim/caliper integration
Note: The branch from the forked repo for this PR has been pulled into the RAJA Perf repo, and there is another PR (#254) for review, approval, and eventual merging.
Add Caliper support for "by variant" Spot/Hatchet analysis. The tree is Variant->Group->Kernel.
Creates additional Adiak keys for the build configuration via a modified rajaperf_config.hpp.
I created some screenshots below showing Spot/Hatchet analysis for two datasets: clang and g++ respectively.
1: Adiak keys used for selection criteria (compiler_suffix and variant checkmarked)
2: Overall comparison grouped by compiler_suffix
3: Drilldown to Base_Seq/Basic across compilers (note legend change to highlight individual kernels)
4: Hatchet tree for RAJA_Seq and Base_Seq
5: Hatchet speedup for Base_OpenMP (note that trees must be completely identical, so we can compare Base_Seq between gcc and clang, but we cannot compute speedup for Base_Seq vs RAJA_Seq, either in one dataset or across compilers or launch dates). I'm thinking of adding some Hatchet routines to allow this (essentially creating some new pseudo-labeled trees to allow comparison).