Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hypergraph #778

Open
wants to merge 38 commits into
base: dev
Choose a base branch
from
Open

Hypergraph #778

wants to merge 38 commits into from

Conversation

lkotipal
Copy link
Contributor

@lkotipal lkotipal commented Jun 1, 2023

  • Add Zoltan Graph/Hypergraph support to Vlasiator, requires Hypergraph dccrg#27
  • Neighborhood used for hypergraph partitioning is parametrized as loadBalance.partitioning_neighborhood, defaulting to the full neighborhood encompassing extended sysboundaries and Vlasov translation.
  • Add multidimensional weight support, with LB weight in XYZ determined by pencil count in each direction multiplied by the block count.
  • By default, multidimensional weights are summed in dccrg resulting in exactly the same behavior as before. Using the config parameter loadBalance.norm an arbitrary p-norm can be used, or using norm = 0 the infinity/max-norm.
  • RCB and some other methods also support multiple cell weights. Setting weight_dim = 3 enables multi-weight partitioning in Zoltan.

@lkotipal
Copy link
Contributor Author

lkotipal commented Jun 1, 2023

image
So here are some profiling results. HYPERGRAPH seems to outperform RCB but loses to RIB in this test using samples/Magnetosphere3D on 8 Mahti nodes. Let me know if you want more numbers.

EDIT: numbers outdated, but results are still the same

@lkotipal lkotipal force-pushed the hypergraph branch 2 times, most recently from deaaeec to c97f20b Compare February 8, 2024 08:58
grid.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@markusbattarbee markusbattarbee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some less and some more serious questions

@@ -3637,8 +3644,8 @@ void initializeDataReducers(DataReducer * outputReducer, DataReducer * diagnosti
continue;
}
}
if(P::diagnosticWriteAllDROs || lowercase == "lbweight" || lowercase == "vg_lbweight" || lowercase == "vg_loadbalanceweight" || lowercase == "vg_loadbalance_weight" || lowercase == "loadbalance_weight") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P::diagnosticWriteAllDROs lost here

ioread.cpp Outdated
success=readCellParamsVariable(file,fileCells,localCellStartOffset,localCells,"LB_weight",CellParams::LBWEIGHTCOUNTERX,3,mpiGrid);
} else {
success=readCellParamsVariable(file,fileCells,localCellStartOffset,localCells,"LB_weight",CellParams::LBWEIGHTCOUNTERX,1,mpiGrid);
success=success && readCellParamsVariable(file,fileCells,localCellStartOffset,localCells,"LB_weight",CellParams::LBWEIGHTCOUNTERY,1,mpiGrid);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it really preferable to read the file three times instead of copying LBWEIGHTCOUNTERX into Y and Z?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not! It might actually even be a reasonable idea to leave counters Y and Z as zero, this should in principle reduce to using just the X counter even in 3D RCB? It does at least for all the norm reductions

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
Just rember to add a comment so we won't wonder why the other values are left at zero :)

@@ -673,6 +673,15 @@ void SysBoundary::applySysBoundaryVlasovConditions(
SpatialCell::set_mpi_transfer_type(Transfer::CELL_PARAMETERS | Transfer::POP_METADATA | Transfer::CELL_SYSBOUNDARYFLAG, true);
mpiGrid.update_copies_of_remote_neighbors(SYSBOUNDARIES_EXTENDED_NEIGHBORHOOD_ID);

// Mark cells that are communicating velocity blocks on system boundaries
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this in fact now provide incorrect information to Zoltan? This tells it that a sysboundary cell which currently is not at a process boundary won't cost anything to communicate, even though in the end it would.

Copy link
Contributor Author

@lkotipal lkotipal Apr 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not used for determining partitioning neighborhoods (see setPartitioningNeighborhoods in grid.cpp), just for data analysis of supposed communication weight from vlsv files.

vlasiator.cpp Outdated
mpiGrid[id]->parameters[CellParams::LBWEIGHTCOUNTER] *= 8.0;
for (int i = 0; i < 3; ++i) {
mpiGrid[id]->parameters[CellParams::LBWEIGHTCOUNTERX + i] *= 8.0;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should CellParams::SYSBOUNDARIES_COMM also be *8 here?
(and returned afterwards)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, SYSBOUNDARIES_COMM is purely for data analysis, we don't need to set it when we're not writing.

if (Parameters::prepareForRebalance == true) {
// Blocks need to be scaled to time since acceleration uses time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except that currently we don't actually include the time from acceleration in the weight.

And using time here is IMO a Bad Idea(tm) because the thing being timed is calculateSpatialTranslation(), which includes all the MPI barriers, ghost updates, remote neighbour contributions etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right, I can revert that crap. In any case it's just a matter of rescaling so shouldn't affect results.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants