
add optional argument for specifying factorization strategy #31

Open · wants to merge 1 commit into main
Conversation

samuelpmish

Hi,

After reading about the significant performance difference between simplicial and supernodal factorization (#25), I thought it would be beneficial to expose that as an option to the user. This PR adds a new argument to the CholeskySolver class to specify which option to use. CHOLMOD also apparently has an "AUTO" option which uses some heuristic to decide which strategy is appropriate for the provided matrix, so I use that as the default argument.

Since the new argument is defaulted, this change shouldn't break existing users' code, although it may change CHOLMOD's default behavior, which could change the runtime of existing code.

The performance of two test problems (as measured on an M2 MacBook) is summarized below:

  1. get_icosphere(7) (2D connectivity): supernodal was a 1.7x speedup over simplicial
  2. a tetrahedron mesh of a sphere (3D connectivity): supernodal was a 4.1x speedup over simplicial

In both cases, the default "CHOLMOD_AUTO" option picked the faster (supernodal) approach.

The small performance test script (and relevant data for the tetrahedron mesh) is available here.

@samuelpmish
Author

Oh, also: when I ran pytest test_cholesky.py, only a small fraction of the tests were not skipped. The tests that did run passed, but I can't verify that the remaining tests are working correctly.

@wjakob
Contributor

wjakob commented Jan 5, 2024

A first question: enabling a supernodal decomposition on the CPU makes sense, but what does this mean for the CUDA implementation? I suppose that the kernels would need to be updated to deal with this case. Does it even make sense to use such a strategy on the GPU?

@samuelpmish
Author

> Enabling a supernodal decomposition on the CPU makes sense, but what does this mean for the CUDA implementation? ... Does it even make sense to use such a strategy on the GPU?

From the original talk about GPU-accelerated CHOLMOD, it sounds like supernodal was the only supported factorization strategy. I'm not sure if that's still true today. If so, I wonder what CHOLMOD does if you tell it to use the GPU and also specify the simplicial strategy.

> I suppose that the kernels would need to be updated to deal with this [supernodal option].

Which kernels need updating?

@samuelpmish
Author

samuelpmish commented Jan 6, 2024

When I run pytest test_cholesky.py on a machine with an NVIDIA GPU (where 17 of the 20 tests are not skipped, and all of them pass), nvprof reveals that none of the linear algebra in cholespy uses the GPU, except for the triangular solves.

CHOLMOD's recent user guide says that only cholmod_l_factorize (the 64-bit-integer variant) supports GPU acceleration, but cholespy uses the 32-bit-integer variant, which doesn't seem to take advantage of GPU hardware:

// Compute the Cholesky factorization
m_factor = cholmod_analyze(A, &m_common);
cholmod_factorize(A, m_factor, &m_common);

Am I goofing something up?
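If the 64-bit path really is required for GPU acceleration, the change being pointed at would look roughly like the following sketch. This is untested and not part of this PR; it also assumes the matrix A and workspace are created with the corresponding cholmod_l_* / CHOLMOD_LONG routines, which is a larger change than just the two calls shown.

```
// Sketch only: opt into the GPU and switch to the 64-bit-integer
// CHOLMOD variants, which the user guide says the GPU module requires.
m_common.useGPU = 1;
m_factor = cholmod_l_analyze(A, &m_common);
cholmod_l_factorize(A, m_factor, &m_common);
```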

@bathal1 (Collaborator) left a comment
Hi, thanks for this PR!

It looks good to me, besides a small typo in the docstrings.

Regarding GPU CHOLMOD: its implementation uses the NVIDIA runtime API, which is a tedious dependency, since not all users have it installed, and different projects may require different versions of it.

Our implementation of the sparse triangular solver instead uses the NVIDIA driver API, which is guaranteed to be present on any system with a GPU, since it ships with the driver. It is also backwards compatible, which makes it easier to maintain.

In the application we initially designed this package for, we typically compute the decomposition once and then solve the associated system many times before the matrix needs updating, so relying on the CPU for the decomposition is not a major bottleneck.
