
Update collectionsoa.py to allow user defined MPI partitioning #1414

Merged · erikvansebille merged 25 commits into v3.0 from userPartitionMPI on Sep 2, 2023

Conversation

@JamiePringle (Collaborator) commented on Aug 21, 2023:

This code allows users to change the function that determines which particles are run on which MPI jobs, via the new function setPartitionFunction(). Attached to this pull request is a modified version of example_stommel.py which uses this functionality, and run_example_stommel.py which shows how to implement different partitioning schemes. Using this partitioning scheme on my global runs saves me about 20% of run time, which for my 10-day runs is worth noticing.

To use this feature, a new partitioning function must be created with two arguments, (coords, mpi_size=1). The arguments and output are as follows (a minimal illustrative sketch follows the specification below):

    Input:

    coords: numpy array with rows of [lon, lat] so that
    coords.shape[0] is the number of particles and coords.shape[1] is 2.

    mpi_size=1: the number of MPI processes.

    Output:

    mpiProcs: an integer array with values from 0 to mpi_size-1
    specifying which MPI job will run which particles. len(mpiProcs)
    must equal coords.shape[0]
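As a minimal sketch of that signature, here is a purely illustrative round-robin scheme (not part of this pull request; it ignores particle positions entirely and only shows the expected inputs and output):

import numpy as np

def partitionParticles4MPI_roundrobin(coords, mpi_size=1):
    # Illustrative only: assign particle i to MPI process i % mpi_size,
    # returning one integer label in [0, mpi_size) per row of coords.
    mpiProcs = np.arange(coords.shape[0]) % mpi_size
    return mpiProcs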

The existing default partitioning function, rewritten in this format, is now:

def partitionParticles4MPI_default(coords, mpi_size=1):
    '''Default partitioning: cluster particles over MPI processes with KMeans.'''

    # KMeans is sklearn.cluster.KMeans if sklearn is installed, otherwise None (imported at module level)
    if KMeans:
        kmeans = KMeans(n_clusters=mpi_size, random_state=0).fit(coords)
        mpiProcs = kmeans.labels_
    else:  # assign random labels if KMeans is unavailable (see https://github.com/OceanParcels/parcels/issues/1261)
        logger.warning_once('sklearn needs to be available if MPI is installed. '
                            'See http://oceanparcels.org/#parallel_install for more information')
        mpiProcs = np.random.randint(0, mpi_size, size=coords.shape[0])

    return mpiProcs

One example I have found useful is a function that requires the number of particles in each MPI job to be roughly equal. This prevents the default KMeans algorithm from making small clusters around, for example, the Hawaiian islands. Unequal MPI job sizes lead to unequal allocation of compute resources and to long runs, as some MPI processes take much longer to finish. To make the allocation of particles equal, I use a constrained KMeans algorithm. This can be very slow, so I include an option (nCull) to do the initial clustering on a subset of the particles. It is important to note that this new partitioning function does NOT need to be included in the parcels distribution -- it is entirely created by the user of parcels.

import time
from k_means_constrained import KMeansConstrained  # https://joshlk.github.io/k-means-constrained/

def partitionParticles4MPI_KMeans_constrained(coords, mpi_size=1):
    '''Constrained k-means partitioning: each cluster gets the same number of
    particles to within a multiplicative factor "slop". Because the constrained
    k-means can be slow, the clustering is done on a data set reduced by a factor
    of nCull; for large runs, increase nCull so this runs in a reasonable time.'''
    nCull = 2
    tic = time.time()
    print('Starting constrained k-means for', coords.shape[0], 'particles and nCull', nCull, flush=True)
    coordsTrain = coords[::nCull, :]
    slop = 0.1
    maxSizeCluster = int((1.0 + slop) * (coordsTrain.shape[0] / mpi_size))
    minSizeCluster = int((1.0 - slop) * (coordsTrain.shape[0] / mpi_size))
    kmeans = KMeansConstrained(n_clusters=mpi_size, size_min=minSizeCluster, size_max=maxSizeCluster,
                               random_state=0, n_jobs=-4).fit(coordsTrain)

    # now predict where all particles go from the k-means calculated on the partial data set
    mpiProcs = kmeans.predict(coords, size_min=minSizeCluster * nCull, size_max=maxSizeCluster * nCull)
    print('   done with constrained k-means in', time.time() - tic, 'seconds', flush=True)
    return mpiProcs

This code, and the following example of its use, come from the attached example_stommel.py. To use this function, we must import setPartitionFunction() with "from parcels.collection.collectionsoa import setPartitionFunction", and BEFORE making the particle set we must select the new function with setPartitionFunction(partitionParticles4MPI_KMeans_constrained).
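For reference, a minimal sketch of that setup, assuming partitionParticles4MPI_KMeans_constrained is defined as above; the fieldset and release positions below are simple placeholders standing in for the ones built in example_stommel.py:

import numpy as np
from parcels import FieldSet, ParticleSet, JITParticle
from parcels.collection.collectionsoa import setPartitionFunction

# register the custom partitioning function BEFORE the particle set is created
setPartitionFunction(partitionParticles4MPI_KMeans_constrained)

# placeholder fieldset and release positions; example_stommel.py builds its own
fieldset = FieldSet.from_data({'U': np.zeros((2, 2)), 'V': np.zeros((2, 2))},
                              {'lon': [0., 1.], 'lat': [0., 1.]})
pset = ParticleSet(fieldset=fieldset, pclass=JITParticle,
                   lon=np.random.uniform(0., 1., 1000),
                   lat=np.random.uniform(0., 1., 1000))

When the script is launched under MPI with more than one process, the particles are then distributed over the processes by the registered function.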

I have attached figures for an example in which the initial particle positions are 4 clumps of particles with greatly different numbers of particles. The default KMeans code works correctly, in the sense that it successfully identifies the spatially separate clumps, and so creates MPI jobs with very different numbers of particles.
[Figure: Partition_by_default]
The constrained KMeans breaks the particles into less compact but more equally sized groups.
[Figure: Partition_by_KMeans_constrained]
Now, there are clear trade-offs between equal-size MPI jobs and locality of particles, but in my case I have found the equal split to be a big win.

If y'all like where this is going, I can write up some documentation for it.

codeForStommelExample.zip

JamiePringle and others added 3 commits August 21, 2023 11:06
This code allows users to change the function which determines which particles are run on which MPI jobs using the function setPartitionFunction().
@JamiePringle (Collaborator, Author) commented:

Oops. I created this branch only a few days ago, but apparently many changes have been made to collectionsoa.py in that time. The diff shows many changes in the code, but only two blocks are really mine -- in the new code, lines 29-83 and 131-137. I am afraid that my unfamiliarity with GitHub is showing...

[Review threads on parcels/collection/collectionsoa.py — resolved]
@JamiePringle (Collaborator, Author) commented:

@erikvansebille I was trying to do this with as little change to the code as possible, since I had not originally thought of putting it into the master branch. What you say makes sense, but I need a little time to dive into the code to make sure I understand how particle set creation integrates with functions like .from_line() and .from_list().

I will try to add a unit test, if it seems straightforward.

I will be able to get back to this once I get some reviews off my desk...

@erikvansebille erikvansebille changed the base branch from master to v3.0 August 28, 2023 06:53
@erikvansebille erikvansebille merged commit cb44378 into v3.0 Sep 2, 2023
10 checks passed
@erikvansebille erikvansebille deleted the userPartitionMPI branch September 2, 2023 14:13
@erikvansebille erikvansebille restored the userPartitionMPI branch September 2, 2023 14:25
@erikvansebille erikvansebille deleted the userPartitionMPI branch September 2, 2023 14:28
erikvansebille added a commit that referenced this pull request Sep 4, 2023