Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graph Library #1521

Open
wants to merge 13 commits into
base: master
Choose a base branch
from
Open

Conversation

Marsella8
Copy link
Contributor

@Marsella8 Marsella8 commented Oct 10, 2024

Description of changes:

Extending testing functionality for graph library
Bug fixing
General clean up

Related Issues:

Linked Issues:

  • Issue #

Issues closed by this PR:

  • Closes #

This change is Reviewable

@Marsella8 Marsella8 requested a review from lockshaw October 10, 2024 04:25
Copy link

codecov bot commented Oct 10, 2024

Codecov Report

Attention: Patch coverage is 95.67010% with 42 lines in your changes missing coverage. Please review.

Please upload report for BASE (master@93298ed). Learn more about missing BASE report.

Files with missing lines Patch % Lines
...h/series_parallel/series_parallel_decomposition.cc 66.66% 12 Missing ⚠️
.../utils/graph/instances/hashmap_undirected_graph.cc 37.50% 10 Missing ⚠️
lib/utils/include/utils/containers/zip.h 30.00% 7 Missing ⚠️
lib/utils/src/utils/graph/algorithms.cc 0.00% 3 Missing ⚠️
.../graph/instances/unordered_set_undirected_graph.cc 0.00% 2 Missing ⚠️
...rc/utils/graph/undirected/undirected_edge_query.cc 0.00% 2 Missing ⚠️
...raph/multidigraph/algorithms/get_incoming_edges.cc 88.88% 1 Missing ⚠️
...raph/multidigraph/algorithms/get_outgoing_edges.cc 88.88% 1 Missing ⚠️
.../utils/graph/series_parallel/parallel_reduction.cc 96.15% 1 Missing ⚠️
...rc/utils/graph/series_parallel/series_reduction.cc 97.43% 1 Missing ⚠️
... and 2 more
Additional details and impacted files
@@            Coverage Diff            @@
##             master    #1521   +/-   ##
=========================================
  Coverage          ?   79.78%           
=========================================
  Files             ?      880           
  Lines             ?    29673           
  Branches          ?      792           
=========================================
  Hits              ?    23676           
  Misses            ?     5997           
  Partials          ?        0           
Flag Coverage Δ
unittests 79.78% <95.67%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
lib/utils/include/utils/containers/find.h 100.00% <100.00%> (ø)
...nclude/utils/graph/dataflow_graph/dataflow_graph.h 44.44% <ø> (ø)
lib/utils/include/utils/graph/digraph/digraph.h 100.00% <ø> (ø)
...b/utils/include/utils/graph/digraph/digraph_view.h 100.00% <ø> (ø)
...ls/include/utils/graph/multidigraph/multidigraph.h 100.00% <ø> (ø)
lib/utils/include/utils/graph/node/graph_view.h 100.00% <ø> (ø)
.../include/utils/graph/undirected/undirected_graph.h 100.00% <ø> (ø)
...ude/utils/graph/undirected/undirected_graph_view.h 100.00% <ø> (ø)
lib/utils/src/utils/graph/digraph/algorithms.cc 100.00% <100.00%> (ø)
...plete_bipartite_composite/get_cbc_decomposition.cc 97.50% <100.00%> (ø)
... and 33 more

@Marsella8 Marsella8 marked this pull request as ready for review October 18, 2024 22:02
@lockshaw lockshaw changed the base branch from repo-refactor to master December 16, 2024 08:37
Copy link
Contributor Author

@Marsella8 Marsella8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 41 files reviewed, 1 unresolved discussion (waiting on @lockshaw)


lib/utils/src/utils/graph/series_parallel/series_parallel_decomposition.cc line 82 at r2 (raw file):

}

bool is_empty(Node const &node) {

borrowed from #1562, will fix them there if needed

Copy link
Collaborator

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 23 of 32 files at r1, 12 of 16 files at r2, 7 of 9 files at r3, all commit messages.
Reviewable status: 42 of 47 files reviewed, 35 unresolved discussions (waiting on @Marsella8)


lib/utils/test/src/utils/graph/undirected/algorithms/get_connected_components.cc line 10 at r3 (raw file):

using namespace FlexFlow;

TEST_CASE("get_connected_components") {

Suggestion:

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("get_connected_components") {

lib/utils/test/src/utils/graph/undirected/algorithms/get_connected_components.cc line 41 at r3 (raw file):

  }

  SUBCASE("3 components") {

Also test the case where everything is one component

Also test the case where the graph is empty


lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc line 34 at r3 (raw file):

      std::unordered_set<Node> correct = {n.at(0)};
      CHECK(correct == result);
    }

Suggestion:

    SUBCASE("small acyclic graph") {
      std::vector<Node> n = add_nodes(g, 4);
      std::vector<DirectedEdge> e = {
          DirectedEdge{n.at(0), n.at(3)},
          DirectedEdge{n.at(0), n.at(1)},
          DirectedEdge{n.at(0), n.at(2)},
          DirectedEdge{n.at(1), n.at(2)},
      };
      add_edges(g, e);

      SUBCASE("get_dominators(..., Node const &)") {
        Node node = n.at(2);
        std::unordered_set<Node> correct = {n.at(0), n.at(2)};
        std::unordered_set<Node> result = get_dominators(g, node);
        CHECK(correct == result);
      }
  
      SUBCASE("get_dominators(..., std::unordered_set<Node> const &)") {
        std::unordered_set<Node> nodes = {n.at(1), n.at(3)};
        std::unordered_set<Node> result = get_dominators(g, nodes);
        std::unordered_set<Node> correct = {n.at(0)};
        CHECK(correct == result);
      }
    }

lib/utils/include/utils/graph/series_parallel/series_parallel_decomposition.h line 20 at r3 (raw file):

std::unordered_multiset<Node> get_nodes(Node const &);

bool is_empty(Node const &node);

How would you end up with an empty SP decomposition? That doesn't seem like it should be allowed (i.e., might be a good idea to change the container in the splits to be some wrapper type of std::vector that requires itself to be non-empty)


lib/utils/include/utils/graph/series_parallel/series_parallel_decomposition.h line 28 at r3 (raw file):

SeriesParallelDecomposition delete_node(SeriesParallelDecomposition sp,
                                        Node const &node);

How is this a well-behaved operation? Like, what are the semantics of this? Are you sure we want this function as part of the API for SeriesParallelDecomposition?

Code quote:

SeriesParallelDecomposition delete_node(SeriesParallelDecomposition sp,
                                        Node const &node);

lib/utils/include/utils/graph/series_parallel/series_parallel_decomposition.h line 30 at r3 (raw file):

                                        Node const &node);

// duplicate nodes within `sp` are counted multiple times

Doxygen style please 🙂

Code quote:

// duplicate nodes within `sp` are counted multiple times

lib/utils/include/utils/graph/series_parallel/series_parallel_decomposition.h line 33 at r3 (raw file):

size_t num_nodes(SeriesParallelDecomposition const &sp);

SeriesParallelDecomposition serial_composition(

For consistency, never use "serial" in the FF codebase, always use "series"

Suggestion:

SeriesParallelDecomposition series_composition(

lib/utils/include/utils/graph/series_parallel/series_reduction.h line 19 at r3 (raw file):

std::unordered_set<std::vector<MultiDiEdge>>
    find_all_extended_series_reductions(MultiDiGraphView const &g);

What is an "extended series reduction"? Correct me if I'm wrong here but I don't think this is standard terminology(?) in which case it would be good to have a doxygen docstring here with some explanation

Also, it might be nice to create an actual datatype (e.g., ExtendedSeriesReduction) that is a wrapper type around std::vector<MultiDiEdge> rather than just using a raw vector.

Code quote:

    find_all_extended_series_reductions(MultiDiGraphView const &g);

lib/utils/test/src/utils/graph/undirected/undirected.cc line 15 at r3 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE_TEMPLATE(

Why are there two tests here?


lib/utils/test/src/utils/graph/undirected/undirected.cc line 18 at r3 (raw file):

      "UndirectedGraph implementations", T, HashmapUndirectedGraph) {

    RC_SUBCASE("Full", [&]() {

Why is this using rapidcheck?


lib/utils/src/utils/graph/views/views.cc line 2 at r3 (raw file):

#include "utils/graph/views/views.h"
#include "utils/bidict/algorithms/right_entries.h"

I'm not seeing `right_entries used anywhere here in new code. Was this already necessary and just missing, or can this be removed?


lib/utils/src/utils/graph/views/views.cc line 70 at r3 (raw file):

UndirectedEdge to_undirected_edge(DirectedEdge const &e) {
  return UndirectedEdge{{e.src, e.dst}};

Does commutative_pair not have an explicit constructor?

Code quote:

{e.src, e.dst}

lib/utils/src/utils/graph/views/views.cc line 122 at r3 (raw file):

  std::unordered_set<UndirectedEdge> undirected_edges =
      set_union(g.query_edges(UndirectedEdgeQuery{q.srcs}),
                g.query_edges(UndirectedEdgeQuery{q.dsts}));

Can't you just union the query_sets (i.e., q.srcs and q.dsts) and then do the query, rather than unioning the query results?

Code quote:

  std::unordered_set<UndirectedEdge> undirected_edges =
      set_union(g.query_edges(UndirectedEdgeQuery{q.srcs}),
                g.query_edges(UndirectedEdgeQuery{q.dsts}));

lib/utils/test/src/utils/graph/digraph/algorithms/traversal.cc line 12 at r3 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("get_unchecked_dfs_ordering") {

IIRC the semantics of unchecked_dfs_ordering are essentially that it is free to visit nodes multiple times (i.e., in a diamond graph of 4 nodes (a -> b, a -> c, b -> d, c -> d), it would either yield [a, b, d, c, d] or [a, c, d, b, d]--it would be good to add a testcase for this


lib/utils/test/src/utils/graph/digraph/algorithms/traversal.cc line 24 at r3 (raw file):

      std::vector<Node> result = get_unchecked_dfs_ordering(g, {n[0]});
      CHECK(correct == result);
    }

Suggestion:

    SUBCASE("simple path") {
      std::vector<Node> correct = {n[0], n[1], n[2], n[3]};
      std::vector<Node> result = get_unchecked_dfs_ordering(g, {n[0]});
      CHECK(correct == result);
    }
    
    SUBCASE("start from non-initial node") {
      std::vector<Node> correct = {n[1], n[2], n[3]};
      std::vector<Node> result = get_unchecked_dfs_ordering(g, {n[1]});
      CHECK(correct == result);
    }

lib/utils/test/src/utils/graph/digraph/algorithms/algorithms.cc line 11 at r3 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("DiGraph - algorithms.cc") {

Prefer one TEST_CASE per function, rather than one SUBCASE per function


lib/utils/test/src/utils/graph/digraph/algorithms/algorithms.cc line 16 at r3 (raw file):

    std::vector<Node> n = add_nodes(g, 4);
    std::vector<DirectedEdge> e = {
        DirectedEdge{n[0], n[1]},

Prefer .at for bounds checking

Suggestion:

        DirectedEdge{n.at(0), n.at(1)},

lib/utils/test/src/utils/graph/digraph/algorithms/algorithms.cc line 55 at r3 (raw file):

    }

    SUBCASE("get_sinks") {

Potentially rename to get_terminal_nodes/get_initial_nodes following https://reviewable.io/reviews/flexflow/flexflow-train/1565#-OGCWYwtE_rcz9e1WizH


lib/utils/test/src/utils/containers/contains.cc line 19 at r3 (raw file):

      std::unordered_set<int> s = {1, 2, 3, 4, 5};
      CHECK(contains(s, 3));
      CHECK(!contains(s, 6));

Doesn't doctest require this to use CHECK_FALSE? iirc doctest has an issue with CHECK(!...)

Code quote:

      CHECK(!contains(s, 6));

lib/utils/test/src/utils/graph/views/views.cc line 49 at r3 (raw file):

      CHECK(result == expected);
    }
  }

Probably the best way to test these (due to the complexity of the querying interface) would be to use rapidcheck to generate a bunch of operations and then ensure the behaviors are the same between the view and the non-view graph that would be generated. This is going to be annoying to write, but without restructuring things this is probably the correct approach. The good thing is that the equivalence testing code only needs to be written once per graphview type and then can be reused for each of the views.

Code quote:

  TEST_CASE("UndirectedSubgraphView") {
    UndirectedGraph g = UndirectedGraph::create<HashmapUndirectedGraph>();
    std::vector<Node> n = add_nodes(g, 5);
    add_edges(g,
              {UndirectedEdge{{n.at(0), n.at(3)}},
               UndirectedEdge{{n.at(1), n.at(1)}},
               UndirectedEdge{{n.at(1), n.at(2)}},
               UndirectedEdge{{n.at(1), n.at(3)}},
               UndirectedEdge{{n.at(2), n.at(3)}},
               UndirectedEdge{{n.at(2), n.at(4)}}});
    std::unordered_set<Node> sub_nodes = {n.at(0), n.at(1), n.at(3)};
    UndirectedGraphView view = view_subgraph(g, sub_nodes);

    SUBCASE("get_nodes") {
      std::unordered_set<Node> expected = {n.at(0), n.at(1), n.at(3)};

      std::unordered_set<Node> result = get_nodes(view);

      CHECK(result == expected);
    }

    SUBCASE("get_edges") {
      std::unordered_set<UndirectedEdge> expected = {
          UndirectedEdge{{n.at(0), n.at(3)}},
          UndirectedEdge{{n.at(1), n.at(1)}},
          UndirectedEdge{{n.at(1), n.at(3)}},
      };

      std::unordered_set<UndirectedEdge> result = get_edges(view);

      CHECK(result == expected);
    }
  }

lib/utils/include/utils/graph/series_parallel/parallel_reduction.h line 15 at r3 (raw file):

    find_parallel_reduction(MultiDiGraphView const &);

std::unordered_map<DirectedEdge, std::unordered_set<MultiDiEdge>>

What is an "extended parallel reduction"? Correct me if I'm wrong here but I don't think this is standard terminology(?) in which case it would be good to have a doxygen docstring here with some explanation

Also, it might be nice to create an actual datatype (e.g., ExtendedParallelReduction) that is a wrapper type around std::unordered_set<MultiDiEdge> rather than just using a raw unordered set

If this is just for accelerating the SP decomposition algorithm's rewriting step, it might be better to put these functions (and the corresponding ones for "extended series reductions" in that file rather than in the general "parallel_reduction.h" header)


lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc line 35 at r3 (raw file):

    SUBCASE("node has no outgoing edges") {
      std::unordered_set<MultiDiEdge> result = get_outgoing_edges(g, n.at(2));

Would be nice for this node to have some incoming edges to ensure they're not accidentally picked up


lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc line 31 at r3 (raw file):

    SUBCASE("node has no incoming edges") {
      std::unordered_set<MultiDiEdge> result = get_incoming_edges(g, n.at(2));

Would be nice for this node to have some outgoing edges to ensure they're not accidentally picked up


lib/utils/include/utils/graph/node/node.struct.toml line 9 at r3 (raw file):

  "fmt",
  "json",
  "rapidcheck",

How are you using a default Arbitrary instance for Node? I have no issue adding generators for Node, but the default instance seems unlikely to be helpful as Node is only meaningful in the context of a graph?


lib/utils/test/src/utils/graph/series_parallel/series_reduction.cc line 263 at r3 (raw file):

          find_all_extended_series_reductions(g);
      std::unordered_set<std::vector<MultiDiEdge>> correct = {
          {e[0], e[1], e[2]}};

Prefer .at for bounds checking

Suggestion:

          {e.at(0), e.at(1), e.at(2)}};

lib/utils/test/src/utils/graph/series_parallel/series_reduction.cc line 267 at r3 (raw file):

    }

    SUBCASE("2 linear strands") {

Suggestion:

    SUBCASE("2 linear strands that share a common terminal node") {

lib/utils/test/src/utils/graph/series_parallel/series_reduction.cc line 277 at r3 (raw file):

      std::unordered_set<std::vector<MultiDiEdge>> result =
          find_all_extended_series_reductions(g);
      std::unordered_set<std::vector<MultiDiEdge>> correct = {{e[0], e[2]},

Prefer .at for bounds checking


lib/utils/test/src/utils/graph/series_parallel/series_reduction.cc line 300 at r3 (raw file):

          find_all_extended_series_reductions(g);
      std::unordered_set<std::vector<MultiDiEdge>> correct = {
          {e[0], e[2], e[7]}, {e[3], e[6]}, {e[5], e[9]}};

Prefer .at for bounds checking


lib/utils/test/src/utils/graph/series_parallel/series_reduction.cc line 372 at r3 (raw file):

        std::unordered_set<MultiDiEdge> result_edges = get_edges(g);
        std::unordered_set<MultiDiEdge> correct_edges = [&] {
          std::unordered_set<MultiDiEdge> new_edges = unordered_set_of(e);

Why not just set_minus here?


lib/utils/include/utils/graph/multidigraph/algorithms/get_outgoing_edges.h line 12 at r3 (raw file):

std::unordered_map<Node, std::unordered_set<MultiDiEdge>>
    get_outgoing_edges(MultiDiGraphView const &g);

Suggestion:

std::unordered_map<Node, std::unordered_set<MultiDiEdge>>
    get_outgoing_edges_map(MultiDiGraphView const &g);

lib/utils/include/utils/graph/multidigraph/algorithms/get_incoming_edges.h line 12 at r3 (raw file):

std::unordered_map<Node, std::unordered_set<MultiDiEdge>>
    get_incoming_edges(MultiDiGraphView const &g);

Suggestion:

std::unordered_map<Node, std::unordered_set<MultiDiEdge>>
    get_incoming_edges_map(MultiDiGraphView const &g);

lib/utils/test/src/utils/graph/digraph/algorithms/directed_edge_query.cc line 27 at r3 (raw file):

      DirectedEdgeQuery result = directed_edge_query_all();

      CHECK(matches_edge(result, DirectedEdge{n.at(0), n.at(1)}));

Do you really need 5 checks here? It feels like the behavior of directed_edge_query_all could have been tested on a much smaller graph (or even without a graph at all, using rapidcheck). This feels like unnecessary reusing of g, which is why I recommend doing one TEST_CASE per function instead of one SUBCASE per function unless you have very good reason.

In fact, I don't see how g is necessary for any of the cases in this file?


lib/utils/test/src/utils/graph/digraph/algorithms/directed_edge_query.cc line 38 at r3 (raw file):

          DirectedEdgeQuery{query_set{n.at(0)}, query_set{n.at(1)}};

      CHECK(matches_edge(q, DirectedEdge{n.at(0), n.at(1)}));

Check that it doesn't match reversed edges

Code quote:

      CHECK(matches_edge(q, DirectedEdge{n.at(0), n.at(1)}));

lib/utils/test/src/utils/graph/digraph/algorithms/directed_edge_query.cc line 57 at r3 (raw file):

        CHECK(result == correct);
      }
      SUBCASE("intersection with std::nullopt") {

Suggestion:

      SUBCASE("intersection with matchall") {

lib/utils/src/utils/graph/series_parallel/series_reduction.cc line 79 at r3 (raw file):

  }
  return unordered_set_of(values(strands));
}

Newlines please--this is really hard to read

Code quote:

std::unordered_set<std::vector<MultiDiEdge>>
    find_all_extended_series_reductions(MultiDiGraphView const &g) {
  std::unordered_map<Node, std::unordered_set<MultiDiEdge>> incoming_edges =
      get_incoming_edges(g);
  std::unordered_map<Node, std::unordered_set<MultiDiEdge>> outgoing_edges =
      get_outgoing_edges(g);
  std::unordered_map<Node, std::vector<MultiDiEdge>> strands;
  std::unordered_map<Node, Node> node_to_head_of_strand;
  for (Node const &n : get_topological_ordering(g)) {
    if ((incoming_edges.at(n).size() == 1) &&
        (outgoing_edges.at(n).size() == 1)) {
      MultiDiEdge incoming = get_only(incoming_edges.at(n));
      MultiDiEdge outgoing = get_only(outgoing_edges.at(n));
      Node pre = g.get_multidiedge_src(incoming);
      if (contains_key(node_to_head_of_strand, pre)) {
        Node head = node_to_head_of_strand.at(pre);
        node_to_head_of_strand.emplace(n, head);
        strands.at(head).push_back(outgoing);
      } else {
        node_to_head_of_strand.emplace(n, n);
        strands[n].push_back(incoming);
        strands[n].push_back(outgoing);
      }
    }
  }
  return unordered_set_of(values(strands));
}

Copy link
Contributor Author

@Marsella8 Marsella8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 42 of 47 files reviewed, 35 unresolved discussions (waiting on @lockshaw)


lib/utils/include/utils/graph/node/node.struct.toml line 9 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

How are you using a default Arbitrary instance for Node? I have no issue adding generators for Node, but the default instance seems unlikely to be helpful as Node is only meaningful in the context of a graph?

removed, had accidentally added rapidcheck to UndirectedEdge


lib/utils/include/utils/graph/series_parallel/parallel_reduction.h line 15 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

What is an "extended parallel reduction"? Correct me if I'm wrong here but I don't think this is standard terminology(?) in which case it would be good to have a doxygen docstring here with some explanation

Also, it might be nice to create an actual datatype (e.g., ExtendedParallelReduction) that is a wrapper type around std::unordered_set<MultiDiEdge> rather than just using a raw unordered set

If this is just for accelerating the SP decomposition algorithm's rewriting step, it might be better to put these functions (and the corresponding ones for "extended series reductions" in that file rather than in the general "parallel_reduction.h" header)

Have added docstrings and datatypes. (It's for accelerating the algo by batching multiple reduction steps together).
I've kept the extended reductions there, given that already series and parallel reduction were used solely for the SP decomp algorithm (So now they are not really used anymore, but still probably worth keeping around since they are pretty general functions).


lib/utils/include/utils/graph/series_parallel/series_parallel_decomposition.h line 20 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

How would you end up with an empty SP decomposition? That doesn't seem like it should be allowed (i.e., might be a good idea to change the container in the splits to be some wrapper type of std::vector that requires itself to be non-empty)

it can emerge in some of the SP-ization algorithms as a temporary artifact, though the final SP decompositions are always normalized (see normalize_sp_decomposition in #1562 ) so it doesn't show up "externally".


lib/utils/include/utils/graph/series_parallel/series_parallel_decomposition.h line 28 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

How is this a well-behaved operation? Like, what are the semantics of this? Are you sure we want this function as part of the API for SeriesParallelDecomposition?

I've deleted it (in #1562), it was originally used for making a 2 terminal SP graph into an all-to-all SP graph, by deleting the synchronization nodes.


lib/utils/include/utils/graph/series_parallel/series_parallel_decomposition.h line 30 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Doxygen style please 🙂

fixed in #1562


lib/utils/include/utils/graph/series_parallel/series_parallel_decomposition.h line 33 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For consistency, never use "serial" in the FF codebase, always use "series"

Done.


lib/utils/include/utils/graph/series_parallel/series_reduction.h line 19 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

What is an "extended series reduction"? Correct me if I'm wrong here but I don't think this is standard terminology(?) in which case it would be good to have a doxygen docstring here with some explanation

Also, it might be nice to create an actual datatype (e.g., ExtendedSeriesReduction) that is a wrapper type around std::vector<MultiDiEdge> rather than just using a raw vector.

Done.


lib/utils/src/utils/graph/series_parallel/series_reduction.cc line 79 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Newlines please--this is really hard to read

Done.


lib/utils/src/utils/graph/views/views.cc line 2 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I'm not seeing `right_entries used anywhere here in new code. Was this already necessary and just missing, or can this be removed?

removed.


lib/utils/src/utils/graph/views/views.cc line 70 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Does commutative_pair not have an explicit constructor?

No, I can add it, though note that it would make writing test cases for UndirectedGraph pretty verbose, since you now have to do UndirectedEdge{commutative_pair{x,y}}


lib/utils/src/utils/graph/views/views.cc line 122 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Can't you just union the query_sets (i.e., q.srcs and q.dsts) and then do the query, rather than unioning the query results?

Done.


lib/utils/test/src/utils/containers/contains.cc line 19 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Doesn't doctest require this to use CHECK_FALSE? iirc doctest has an issue with CHECK(!...)

Done.


lib/utils/test/src/utils/graph/digraph/algorithms/algorithms.cc line 11 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Prefer one TEST_CASE per function, rather than one SUBCASE per function

Done.


lib/utils/test/src/utils/graph/digraph/algorithms/algorithms.cc line 16 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Prefer .at for bounds checking

Done.


lib/utils/test/src/utils/graph/digraph/algorithms/algorithms.cc line 55 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Potentially rename to get_terminal_nodes/get_initial_nodes following https://reviewable.io/reviews/flexflow/flexflow-train/1565#-OGCWYwtE_rcz9e1WizH

changed to initial and terminal respectively


lib/utils/test/src/utils/graph/digraph/algorithms/directed_edge_query.cc line 27 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do you really need 5 checks here? It feels like the behavior of directed_edge_query_all could have been tested on a much smaller graph (or even without a graph at all, using rapidcheck). This feels like unnecessary reusing of g, which is why I recommend doing one TEST_CASE per function instead of one SUBCASE per function unless you have very good reason.

In fact, I don't see how g is necessary for any of the cases in this file?

Done.


lib/utils/test/src/utils/graph/digraph/algorithms/directed_edge_query.cc line 38 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Check that it doesn't match reversed edges

Done.


lib/utils/test/src/utils/graph/digraph/algorithms/traversal.cc line 12 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

IIRC the semantics of unchecked_dfs_ordering are essentially that it is free to visit nodes multiple times (i.e., in a diamond graph of 4 nodes (a -> b, a -> c, b -> d, c -> d), it would either yield [a, b, d, c, d] or [a, c, d, b, d]--it would be good to add a testcase for this

Done.


lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc line 31 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Would be nice for this node to have some outgoing edges to ensure they're not accidentally picked up

Done.


lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc line 35 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Would be nice for this node to have some incoming edges to ensure they're not accidentally picked up

Done.


lib/utils/test/src/utils/graph/series_parallel/series_reduction.cc line 263 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Prefer .at for bounds checking

Done.


lib/utils/test/src/utils/graph/series_parallel/series_reduction.cc line 277 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Prefer .at for bounds checking

Done.


lib/utils/test/src/utils/graph/series_parallel/series_reduction.cc line 300 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Prefer .at for bounds checking

Done.


lib/utils/test/src/utils/graph/series_parallel/series_reduction.cc line 372 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why not just set_minus here?

Done.


lib/utils/test/src/utils/graph/undirected/undirected.cc line 15 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why are there two tests here?

duplicate, removed


lib/utils/test/src/utils/graph/undirected/undirected.cc line 18 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why is this using rapidcheck?

is this not ok for basic sanity checks? iirc, it was already present, I mostly cleaned it up.


lib/utils/test/src/utils/graph/undirected/algorithms/get_connected_components.cc line 41 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Also test the case where everything is one component

Also test the case where the graph is empty

Done.


lib/utils/test/src/utils/graph/views/views.cc line 49 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Probably the best way to test these (due to the complexity of the querying interface) would be to use rapidcheck to generate a bunch of operations and then ensure the behaviors are the same between the view and the non-view graph that would be generated. This is going to be annoying to write, but without restructuring things this is probably the correct approach. The good thing is that the equivalence testing code only needs to be written once per graphview type and then can be reused for each of the views.

Sorry, I'm not super clear on this. The idea would be to:

  • Have a set graph (which is not RC generated)
  • Use RC to generate a bunch of random edge, node queries and apply them to both the normal graph and the view equivalent.
  • Have some equality checker that checks that the resulting nodes / edges match between the real graph and the view (e.g. for ViewDirectedAsUndirected, ensure that every DiEdge has a corresponding UndirectedEdge and viceversa (1 or 2 DiEdge, more precisely).

I don't see how to make the equivalence testing code to be generic, wouldn't you need to make some assumptions about what specific view you are using?


lib/utils/include/utils/graph/multidigraph/algorithms/get_incoming_edges.h line 12 at r3 (raw file):

std::unordered_map<Node, std::unordered_set<MultiDiEdge>>
    get_incoming_edges(MultiDiGraphView const &g);

Have changed the signature to get_incoming_edges(MultiDiGraphView, std::unordered_set<Node>), this way it's clearer that it returns a map and it matches with get_incoming_edges for DiGraphs


lib/utils/include/utils/graph/multidigraph/algorithms/get_outgoing_edges.h line 12 at r3 (raw file):

std::unordered_map<Node, std::unordered_set<MultiDiEdge>>
    get_outgoing_edges(MultiDiGraphView const &g);

Done.


lib/utils/test/src/utils/graph/digraph/algorithms/directed_edge_query.cc line 57 at r3 (raw file):

        CHECK(result == correct);
      }
      SUBCASE("intersection with std::nullopt") {

Done.


lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc line 34 at r3 (raw file):

      std::unordered_set<Node> correct = {n.at(0)};
      CHECK(correct == result);
    }

Done.


lib/utils/test/src/utils/graph/undirected/algorithms/get_connected_components.cc line 10 at r3 (raw file):

using namespace FlexFlow;

TEST_CASE("get_connected_components") {

Done.


lib/utils/test/src/utils/graph/series_parallel/series_reduction.cc line 267 at r3 (raw file):

    }

    SUBCASE("2 linear strands") {

Done.


lib/utils/test/src/utils/graph/digraph/algorithms/traversal.cc line 24 at r3 (raw file):

      std::vector<Node> result = get_unchecked_dfs_ordering(g, {n[0]});
      CHECK(correct == result);
    }

Done.

Pietro Max Marsella added 2 commits January 11, 2025 15:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants