Update C API graph creation function signatures (#3982)
Updating the C API graph creation functions to support the following:
* Add support for isolated vertices
* Add MG optimization to support multiple device arrays per rank as input and concatenate them internally
* Add MG optimization to internally compute the number of edges via allreduce rather than requiring it as an input parameter (this can be expensive to compute in python)

This PR implements these features. Some simple tests exist to check for isolated vertices (by running PageRank, which generates a different result if the graph has isolated vertices). A simple test for multiple input arrays exists for the MG case.
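
The edge-count optimization described in the third bullet can be pictured with a small single-process sketch. The names `local_edge_count` and `simulated_allreduce_sum` are hypothetical and are not part of cuGraph; the "ranks" here are just entries in a vector, whereas a real MG run would presumably perform the sum through the communicator's allreduce.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: number of edges one rank holds, summed over all of
// the input arrays that rank supplied.
std::size_t local_edge_count(const std::vector<std::vector<int>>& src_arrays)
{
  std::size_t count = 0;
  for (const auto& a : src_arrays) { count += a.size(); }
  return count;
}

// Stand-in for an allreduce(SUM): every rank contributes its local count and
// every rank receives the same global total. This is a simulation only.
std::vector<std::size_t> simulated_allreduce_sum(const std::vector<std::size_t>& per_rank)
{
  std::size_t total = std::accumulate(per_rank.begin(), per_rank.end(), std::size_t{0});
  return std::vector<std::size_t>(per_rank.size(), total);
}
```

With this shape the caller never has to supply a global edge count; each rank only needs to know the sizes of its own arrays.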

Closes #3947 
Closes #3974

Authors:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Naim (https://github.com/naimnv)

Approvers:
  - Naim (https://github.com/naimnv)
  - Joseph Nke (https://github.com/jnke2016)
  - Seunghwa Kang (https://github.com/seunghwak)

URL: #3982
ChuckHastings authored Nov 21, 2023
1 parent 6b3d3e3 commit 3f1c7b5
Showing 14 changed files with 2,265 additions and 246 deletions.
2 changes: 2 additions & 0 deletions cpp/CMakeLists.txt
@@ -202,6 +202,8 @@ set(CUGRAPH_SOURCES
src/community/detail/mis_mg.cu
src/detail/utility_wrappers.cu
src/structure/graph_view_mg.cu
src/structure/remove_self_loops.cu
src/structure/remove_multi_edges.cu
src/utilities/path_retrieval.cu
src/structure/legacy/graph.cu
src/linear_assignment/legacy/hungarian.cu
67 changes: 67 additions & 0 deletions cpp/include/cugraph/graph_functions.hpp
@@ -973,4 +973,71 @@ renumber_sampled_edgelist(
label_offsets,
bool do_expensive_check = false);

/**
* @brief Remove self loops from an edge list
*
* @tparam vertex_t Type of vertex identifiers. Needs to be an integral type.
* @tparam edge_t Type of edge identifiers. Needs to be an integral type.
* @tparam weight_t Type of edge weight. Currently float and double are supported.
* @tparam edge_type_t Type of edge type. Needs to be an integral type.
*
* @param handle RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and
* handles to various CUDA libraries) to run graph algorithms.
* @param edgelist_srcs List of source vertex ids
* @param edgelist_dsts List of destination vertex ids
* @param edgelist_weights Optional list of edge weights
* @param edgelist_edge_ids Optional list of edge ids
* @param edgelist_edge_types Optional list of edge types
* @return Tuple of vectors storing edge sources, destinations, optional weights,
* optional edge ids, optional edge types.
*/
template <typename vertex_t, typename edge_t, typename weight_t, typename edge_type_t>
std::tuple<rmm::device_uvector<vertex_t>,
rmm::device_uvector<vertex_t>,
std::optional<rmm::device_uvector<weight_t>>,
std::optional<rmm::device_uvector<edge_t>>,
std::optional<rmm::device_uvector<edge_type_t>>>
remove_self_loops(raft::handle_t const& handle,
rmm::device_uvector<vertex_t>&& edgelist_srcs,
rmm::device_uvector<vertex_t>&& edgelist_dsts,
std::optional<rmm::device_uvector<weight_t>>&& edgelist_weights,
std::optional<rmm::device_uvector<edge_t>>&& edgelist_edge_ids,
std::optional<rmm::device_uvector<edge_type_t>>&& edgelist_edge_types);
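
As a rough picture of what `remove_self_loops` is documented to do, the following host-side sketch filters an edge list held in `std::vector`s. It is an illustration under stated assumptions, not the library's implementation: `remove_self_loops_host` is a hypothetical name, it runs on the CPU, and it carries only the optional weights (edge ids and edge types would follow the same pattern).

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <tuple>
#include <utility>
#include <vector>

// Host-side illustration only: keep every edge whose source differs from its
// destination, carrying the optional weight along with it.
std::tuple<std::vector<int>, std::vector<int>, std::optional<std::vector<float>>>
remove_self_loops_host(std::vector<int> srcs,
                       std::vector<int> dsts,
                       std::optional<std::vector<float>> weights)
{
  assert(srcs.size() == dsts.size());
  assert(!weights || weights->size() == srcs.size());

  std::size_t kept = 0;  // next output slot; always <= the read position i
  for (std::size_t i = 0; i < srcs.size(); ++i) {
    if (srcs[i] != dsts[i]) {
      srcs[kept] = srcs[i];
      dsts[kept] = dsts[i];
      if (weights) { (*weights)[kept] = (*weights)[i]; }
      ++kept;
    }
  }
  srcs.resize(kept);
  dsts.resize(kept);
  if (weights) { weights->resize(kept); }
  return std::make_tuple(std::move(srcs), std::move(dsts), std::move(weights));
}
```

Taking the inputs by value and moving them into the result mirrors the rvalue-reference parameters in the declaration above: the caller hands over ownership and gets the filtered vectors back.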

/**
* @brief Remove all but one edge when a multi-edge exists. Note that this function does not use
* stable methods. When a multi-edge exists, one of the edges will remain; there is no
* guarantee on which one will remain.
*
* In an MG context it is assumed that edges have been shuffled to the proper GPU,
* in which case any multi-edges will be on the same GPU.
*
* @tparam vertex_t Type of vertex identifiers. Needs to be an integral type.
* @tparam edge_t Type of edge identifiers. Needs to be an integral type.
* @tparam weight_t Type of edge weight. Currently float and double are supported.
* @tparam edge_type_t Type of edge type. Needs to be an integral type.
*
* @param handle RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and
* handles to various CUDA libraries) to run graph algorithms.
* @param edgelist_srcs List of source vertex ids
* @param edgelist_dsts List of destination vertex ids
* @param edgelist_weights Optional list of edge weights
* @param edgelist_edge_ids Optional list of edge ids
* @param edgelist_edge_types Optional list of edge types
* @return Tuple of vectors storing edge sources, destinations, optional weights,
* optional edge ids, optional edge types.
*/
template <typename vertex_t, typename edge_t, typename weight_t, typename edge_type_t>
std::tuple<rmm::device_uvector<vertex_t>,
rmm::device_uvector<vertex_t>,
std::optional<rmm::device_uvector<weight_t>>,
std::optional<rmm::device_uvector<edge_t>>,
std::optional<rmm::device_uvector<edge_type_t>>>
remove_multi_edges(raft::handle_t const& handle,
rmm::device_uvector<vertex_t>&& edgelist_srcs,
rmm::device_uvector<vertex_t>&& edgelist_dsts,
std::optional<rmm::device_uvector<weight_t>>&& edgelist_weights,
std::optional<rmm::device_uvector<edge_t>>&& edgelist_edge_ids,
std::optional<rmm::device_uvector<edge_type_t>>&& edgelist_edge_types);
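
The "no guarantee on which one will remain" note can be illustrated with a host-side sketch that sorts edge positions by (src, dst) and keeps one position per distinct pair. This is an assumption-laden illustration rather than the library's algorithm: `surviving_edge_positions` is a hypothetical name, and it uses `std::sort`, which is not stable, to show why the surviving duplicate is unspecified.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Host-side illustration only: sort edge positions by (src, dst), then keep
// one position per distinct pair. std::sort is not stable, so which duplicate
// (and therefore which weight, id, or type) survives is unspecified.
std::vector<std::size_t> surviving_edge_positions(const std::vector<int>& srcs,
                                                  const std::vector<int>& dsts)
{
  assert(srcs.size() == dsts.size());
  std::vector<std::size_t> order(srcs.size());
  std::iota(order.begin(), order.end(), std::size_t{0});

  auto key = [&](std::size_t i) { return std::make_pair(srcs[i], dsts[i]); };
  std::sort(order.begin(), order.end(),
            [&](std::size_t a, std::size_t b) { return key(a) < key(b); });
  order.erase(std::unique(order.begin(), order.end(),
                          [&](std::size_t a, std::size_t b) { return key(a) == key(b); }),
              order.end());
  return order;
}
```

A caller that needs a specific reduction over duplicates, such as summing weights, would have to combine the duplicates before applying a filter like this one.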

} // namespace cugraph
193 changes: 178 additions & 15 deletions cpp/include/cugraph_c/graph.h
@@ -35,10 +35,11 @@ typedef struct {
bool_t is_multigraph;
} cugraph_graph_properties_t;

// FIXME: Add support for specifying isolated vertices
/**
* @brief Construct an SG graph
*
* @deprecated This API will be deleted, use cugraph_graph_create_sg instead
*
* @param [in] handle Handle for accessing resources
* @param [in] properties Properties of the constructed graph
* @param [in] src Device array containing the source vertex ids.
@@ -51,11 +52,11 @@ typedef struct {
argument that can be NULL if edge types are not used.
* @param [in] store_transposed If true create the graph initially in transposed format
* @param [in] renumber If true, renumber vertices to make an efficient data structure.
* If false, do not renumber. Renumbering is required if the vertices are not sequential
* integer values from 0 to num_vertices.
* If false, do not renumber. Renumbering enables some significant optimizations within
* the graph primitives library, so it is strongly encouraged. Renumbering is required if
* the vertices are not sequential integer values from 0 to num_vertices.
* @param [in] do_expensive_check If true, do expensive checks to validate the input data
* is consistent with software assumptions. If false bypass these checks.
* @param [in] properties Properties of the graph
* @param [out] graph A pointer to the graph object
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
@@ -76,9 +77,63 @@ cugraph_error_code_t cugraph_sg_graph_create(
cugraph_graph_t** graph,
cugraph_error_t** error);
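
The `renumber` parameter refers to mapping arbitrary vertex ids onto a compact range starting at 0. The sketch below shows only that basic idea on the host; `build_renumber_map` and `renumber_in_place` are hypothetical helpers, and the sorted-order assignment used here is an assumption for illustration that says nothing about the ordering the library itself picks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the sorted unique vertex ids; the position of an id in this vector
// is its new, compact id. (Illustrative ordering, not the library's.)
std::vector<std::int64_t> build_renumber_map(const std::vector<std::int64_t>& srcs,
                                             const std::vector<std::int64_t>& dsts)
{
  std::vector<std::int64_t> ids(srcs);
  ids.insert(ids.end(), dsts.begin(), dsts.end());
  std::sort(ids.begin(), ids.end());
  ids.erase(std::unique(ids.begin(), ids.end()), ids.end());
  return ids;
}

// Rewrites external ids in place to their compact index in the map.
void renumber_in_place(std::vector<std::int64_t>& ids, const std::vector<std::int64_t>& map)
{
  for (auto& v : ids) {
    auto it = std::lower_bound(map.begin(), map.end(), v);
    assert(it != map.end() && *it == v);  // every id must be present in the map
    v = it - map.begin();
  }
}
```

Keeping the map around lets results be translated back to the caller's original ids afterwards.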

/**
* @brief Construct an SG graph
*
* @param [in] handle Handle for accessing resources
* @param [in] properties Properties of the constructed graph
* @param [in] vertices Optional device array containing a list of vertex ids
* (specify NULL if we should create vertex ids from the
* unique contents of @p src and @p dst)
* @param [in] src Device array containing the source vertex ids.
* @param [in] dst Device array containing the destination vertex ids
* @param [in] weights Device array containing the edge weights. Note that an unweighted
* graph can be created by passing weights == NULL.
* @param [in] edge_ids Device array containing the edge ids for each edge. Optional
argument that can be NULL if edge ids are not used.
* @param [in] edge_type_ids Device array containing the edge types for each edge. Optional
argument that can be NULL if edge types are not used.
* @param [in] store_transposed If true create the graph initially in transposed format
* @param [in] renumber If true, renumber vertices to make an efficient data structure.
* If false, do not renumber. Renumbering enables some significant optimizations within
* the graph primitives library, so it is strongly encouraged. Renumbering is required if
* the vertices are not sequential integer values from 0 to num_vertices.
* @param [in] drop_self_loops If true, drop any self loops that exist in the provided edge list.
* @param [in] drop_multi_edges If true, drop any multi edges that exist in the provided edge list.
* Note that setting this flag will arbitrarily select one instance of a multi edge to be the
* edge that survives. If the edges have properties that should be honored (e.g. sum the
* weights, or take the maximum weight), the caller should do that and not rely on this flag.
* @param [in] do_expensive_check If true, do expensive checks to validate the input data
* is consistent with software assumptions. If false bypass these checks.
* @param [out] graph A pointer to the graph object
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
*
* @return error code
*/
cugraph_error_code_t cugraph_graph_create_sg(
const cugraph_resource_handle_t* handle,
const cugraph_graph_properties_t* properties,
const cugraph_type_erased_device_array_view_t* vertices,
const cugraph_type_erased_device_array_view_t* src,
const cugraph_type_erased_device_array_view_t* dst,
const cugraph_type_erased_device_array_view_t* weights,
const cugraph_type_erased_device_array_view_t* edge_ids,
const cugraph_type_erased_device_array_view_t* edge_type_ids,
bool_t store_transposed,
bool_t renumber,
bool_t drop_self_loops,
bool_t drop_multi_edges,
bool_t do_expensive_check,
cugraph_graph_t** graph,
cugraph_error_t** error);
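
The effect of the new `vertices` argument can be sketched on the host: with no explicit list, the vertex set is whatever appears in `src` and `dst`, so a vertex with no edges cannot be expressed; with an explicit list, it can. `vertex_set` and `isolated_vertices` are hypothetical helpers written for this illustration and are not cuGraph functions.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <optional>
#include <vector>

// If an explicit vertex list is supplied, use it; otherwise fall back to the
// ids appearing in src and dst. The result is sorted and de-duplicated.
std::vector<int> vertex_set(const std::optional<std::vector<int>>& vertices,
                            const std::vector<int>& srcs,
                            const std::vector<int>& dsts)
{
  std::vector<int> result;
  if (vertices) {
    result = *vertices;
  } else {
    result = srcs;
    result.insert(result.end(), dsts.begin(), dsts.end());
  }
  std::sort(result.begin(), result.end());
  result.erase(std::unique(result.begin(), result.end()), result.end());
  return result;
}

// Vertices of a sorted vertex set that never appear as an edge endpoint.
std::vector<int> isolated_vertices(const std::vector<int>& all_vertices,
                                   const std::vector<int>& srcs,
                                   const std::vector<int>& dsts)
{
  assert(std::is_sorted(all_vertices.begin(), all_vertices.end()));
  std::vector<int> endpoints = vertex_set(std::nullopt, srcs, dsts);
  std::vector<int> isolated;
  std::set_difference(all_vertices.begin(), all_vertices.end(),
                      endpoints.begin(), endpoints.end(),
                      std::back_inserter(isolated));
  return isolated;
}
```

This is also why the PageRank-based tests mentioned in the commit message can detect isolated vertices: the vertex count changes even though the edge list does not.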

/**
* @brief Construct an SG graph from a CSR input
*
* @deprecated This API will be deleted, use cugraph_graph_create_sg_from_csr instead
*
* @param [in] handle Handle for accessing resources
* @param [in] properties Properties of the constructed graph
* @param [in] offsets Device array containing the CSR offsets array
@@ -91,11 +146,11 @@ cugraph_error_code_t cugraph_sg_graph_create(
argument that can be NULL if edge types are not used.
* @param [in] store_transposed If true create the graph initially in transposed format
* @param [in] renumber If true, renumber vertices to make an efficient data structure.
* If false, do not renumber. Renumbering is required if the vertices are not sequential
* integer values from 0 to num_vertices.
* If false, do not renumber. Renumbering enables some significant optimizations within
* the graph primitives library, so it is strongly encouraged. Renumbering is required if
* the vertices are not sequential integer values from 0 to num_vertices.
* @param [in] do_expensive_check If true, do expensive checks to validate the input data
* is consistent with software assumptions. If false bypass these checks.
* @param [in] properties Properties of the graph
* @param [out] graph A pointer to the graph object
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
@@ -117,18 +172,50 @@ cugraph_error_code_t cugraph_sg_graph_create_from_csr(
cugraph_error_t** error);

/**
* @brief Destroy an SG graph
* @brief Construct an SG graph from a CSR input
*
* @param [in] graph A pointer to the graph object to destroy
* @param [in] handle Handle for accessing resources
* @param [in] properties Properties of the constructed graph
* @param [in] offsets Device array containing the CSR offsets array
* @param [in] indices Device array containing the destination vertex ids
* @param [in] weights Device array containing the edge weights. Note that an unweighted
* graph can be created by passing weights == NULL.
* @param [in] edge_ids Device array containing the edge ids for each edge. Optional
argument that can be NULL if edge ids are not used.
* @param [in] edge_type_ids Device array containing the edge types for each edge. Optional
argument that can be NULL if edge types are not used.
* @param [in] store_transposed If true create the graph initially in transposed format
* @param [in] renumber If true, renumber vertices to make an efficient data structure.
* If false, do not renumber. Renumbering enables some significant optimizations within
* the graph primitives library, so it is strongly encouraged. Renumbering is required if
* the vertices are not sequential integer values from 0 to num_vertices.
* @param [in] do_expensive_check If true, do expensive checks to validate the input data
* is consistent with software assumptions. If false bypass these checks.
* @param [out] graph A pointer to the graph object
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
*
* @return error code
*/
// FIXME: This should probably just be cugraph_graph_free
// but didn't want to confuse with original cugraph_free_graph
void cugraph_sg_graph_free(cugraph_graph_t* graph);
cugraph_error_code_t cugraph_graph_create_sg_from_csr(
const cugraph_resource_handle_t* handle,
const cugraph_graph_properties_t* properties,
const cugraph_type_erased_device_array_view_t* offsets,
const cugraph_type_erased_device_array_view_t* indices,
const cugraph_type_erased_device_array_view_t* weights,
const cugraph_type_erased_device_array_view_t* edge_ids,
const cugraph_type_erased_device_array_view_t* edge_type_ids,
bool_t store_transposed,
bool_t renumber,
bool_t do_expensive_check,
cugraph_graph_t** graph,
cugraph_error_t** error);
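
For readers less familiar with the CSR inputs (`offsets`, `indices`), this host-side sketch expands them into an explicit (src, dst) edge list. `csr_to_edge_list` is a hypothetical helper used only to show the layout; it makes no claim about how the library consumes CSR internally.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Expand CSR (offsets, indices) into (src, dst) pairs. Vertex v owns the
// entries indices[offsets[v]] .. indices[offsets[v + 1] - 1], so offsets has
// one more element than there are vertices.
std::pair<std::vector<int>, std::vector<int>> csr_to_edge_list(
  const std::vector<std::size_t>& offsets, const std::vector<int>& indices)
{
  assert(!offsets.empty() && offsets.back() == indices.size());
  std::vector<int> srcs;
  srcs.reserve(indices.size());
  for (std::size_t v = 0; v + 1 < offsets.size(); ++v) {
    for (std::size_t e = offsets[v]; e < offsets[v + 1]; ++e) {
      srcs.push_back(static_cast<int>(v));
    }
  }
  return std::make_pair(std::move(srcs), indices);
}
```

Because the length of `offsets` fixes the vertex count, a vertex with an empty row is representable in CSR without a separate vertex list.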

// FIXME: Add support for specifying isolated vertices
/**
* @brief Construct an MG graph
*
* @deprecated This API will be deleted, use cugraph_graph_create_mg instead
*
* @param [in] handle Handle for accessing resources
* @param [in] properties Properties of the constructed graph
* @param [in] src Device array containing the source vertex ids
@@ -165,13 +252,89 @@ cugraph_error_code_t cugraph_mg_graph_create(
cugraph_graph_t** graph,
cugraph_error_t** error);

/**
* @brief Construct an MG graph
*
* @param [in] handle Handle for accessing resources
* @param [in] properties Properties of the constructed graph
* @param [in] vertices List of device arrays containing the unique vertex ids.
* If NULL we will construct this internally using the unique
* entries specified in src and dst.
* All entries in this list will be concatenated on this GPU
* into a single array.
* @param [in] src List of device arrays containing the source vertex ids.
* All entries in this list will be concatenated on this GPU
* into a single array.
* @param [in] dst List of device arrays containing the destination vertex ids.
* All entries in this list will be concatenated on this GPU
* into a single array.
* @param [in] weights List of device arrays containing the edge weights. Note that an
* unweighted graph can be created by passing weights == NULL. If a weighted graph is to be
* created, the weights device array should be created on each rank, but the pointer can be NULL and
* the size 0 if there are no inputs provided by this rank. All entries in this list will be
* concatenated on this GPU into a single array.
* @param [in] edge_ids List of device arrays containing the edge ids for each edge. Optional
* argument that can be NULL if edge ids are not used.
* All entries in this list will be concatenated on this GPU
* into a single array.
* @param [in] edge_type_ids List of device arrays containing the edge types for each edge.
* Optional argument that can be NULL if edge types are not used. All entries in this list will be
* concatenated on this GPU into a single array.
* @param [in] store_transposed If true create the graph initially in transposed format
* @param [in] num_arrays The number of arrays specified in @p vertices, @p src, @p dst, @p
* weights, @p edge_ids and @p edge_type_ids
* @param [in] drop_self_loops If true, drop any self loops that exist in the provided edge list.
* @param [in] drop_multi_edges If true, drop any multi edges that exist in the provided edge list.
* Note that setting this flag will arbitrarily select one instance of a multi edge to be the
* edge that survives. If the edges have properties that should be honored (e.g. sum the
* weights, or take the maximum weight), the caller should do that and not rely on this flag.
* @param [in] do_expensive_check If true, do expensive checks to validate the input data
* is consistent with software assumptions. If false bypass these checks.
* @param [out] graph A pointer to the graph object
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
*/
cugraph_error_code_t cugraph_graph_create_mg(
cugraph_resource_handle_t const* handle,
cugraph_graph_properties_t const* properties,
cugraph_type_erased_device_array_view_t const* const* vertices,
cugraph_type_erased_device_array_view_t const* const* src,
cugraph_type_erased_device_array_view_t const* const* dst,
cugraph_type_erased_device_array_view_t const* const* weights,
cugraph_type_erased_device_array_view_t const* const* edge_ids,
cugraph_type_erased_device_array_view_t const* const* edge_type_ids,
bool_t store_transposed,
size_t num_arrays,
bool_t drop_self_loops,
bool_t drop_multi_edges,
bool_t do_expensive_check,
cugraph_graph_t** graph,
cugraph_error_t** error);
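
The `num_arrays` convention, in which each argument is a list of arrays that is concatenated per GPU, can be sketched on the host as follows. `concatenate_arrays` is a hypothetical helper that works on plain `int` arrays; the real function operates on type-erased device array views, and its internal concatenation is not shown in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Concatenate num_arrays host arrays into one. A null pointer with size 0
// stands for "this slot contributes nothing".
std::vector<int> concatenate_arrays(const int* const* arrays,
                                    const std::size_t* sizes,
                                    std::size_t num_arrays)
{
  std::size_t total = 0;
  for (std::size_t i = 0; i < num_arrays; ++i) { total += sizes[i]; }

  std::vector<int> out;
  out.reserve(total);
  for (std::size_t i = 0; i < num_arrays; ++i) {
    assert(arrays[i] != nullptr || sizes[i] == 0);
    if (sizes[i] > 0) { out.insert(out.end(), arrays[i], arrays[i] + sizes[i]); }
  }
  return out;
}
```

Accepting several arrays per rank spares the caller from building one large contiguous buffer before graph creation.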

/**
* @brief Destroy a graph
*
* @param [in] graph A pointer to the graph object to destroy
*/
void cugraph_graph_free(cugraph_graph_t* graph);

/**
* @brief Destroy an SG graph
*
* @deprecated This API will be deleted, use cugraph_graph_free instead
*
* @param [in] graph A pointer to the graph object to destroy
*/
void cugraph_sg_graph_free(cugraph_graph_t* graph);

/**
* @brief Destroy an MG graph
*
* @deprecated This API will be deleted, use cugraph_graph_free instead
*
* @param [in] graph A pointer to the graph object to destroy
*/
// FIXME: This should probably just be cugraph_graph_free
// but didn't want to confuse with original cugraph_free_graph
void cugraph_mg_graph_free(cugraph_graph_t* graph);

/**
12 changes: 12 additions & 0 deletions cpp/include/cugraph_c/resource_handle.h
@@ -57,6 +57,18 @@ typedef struct cugraph_resource_handle_ {
*/
cugraph_resource_handle_t* cugraph_create_resource_handle(void* raft_handle);

/**
* @brief get comm_size from resource handle
*
* If the resource handle has been configured for multi-gpu, this will return
* the comm_size for this cluster. If the resource handle has not been configured for
* multi-gpu this will always return 1.
*
* @param [in] handle Handle for accessing resources
* @return comm_size
*/
int cugraph_resource_handle_get_comm_size(const cugraph_resource_handle_t* handle);

/**
* @brief get rank from resource handle
*