-
Notifications
You must be signed in to change notification settings - Fork 234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Utils: Refactor and Test Updates #1464
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 16 files reviewed, 2 unresolved discussions (waiting on @lockshaw)
lib/utils/test/src/utils/graph/views/views.cc
line 48 at r1 (raw file):
std::unordered_set<UndirectedEdge> result = get_edges(view); // CHECK(result == expected);
possible bug? currently the get_edges also returns the edges that "cross the cut" (e.g. (1, 2)), though this does not happen for example with DiSubgraphView.
lib/utils/test/src/utils/graph/views/views.cc
line 88 at r1 (raw file):
} }
commented because not yet implemented
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## repo-refactor #1464 +/- ##
=================================================
+ Coverage 76.83% 78.16% +1.33%
=================================================
Files 792 860 +68
Lines 26559 27994 +1435
Branches 743 770 +27
=================================================
+ Hits 20406 21881 +1475
+ Misses 6153 6113 -40
Flags with carried forward coverage won't be shown. Click here to find out more.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 57 files reviewed, 3 unresolved discussions (waiting on @lockshaw)
lib/utils/test/src/utils/containers/get_first.cc
line 10 at r2 (raw file):
TEST_CASE("get_first") { std::unordered_set<int> s = {1, 2, 3}; CHECK(contains(s, get_first(s)));
maybe get_any
? get first is a bit confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 14 of 16 files at r1, 41 of 41 files at r2, 12 of 12 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 100 unresolved discussions (waiting on @Marsella8)
lib/utils/include/utils/graph/README.md
line 0 at r4 (raw file):
Thanks for going out of your way to update the documentation! This is very much appreciated 🙂
lib/utils/include/utils/graph/README.md
line 21 at r4 (raw file):
- `DiGraph`: at most one edge allowed between every ordered pair of nodes, edges are directed (i.e., have a source node and a destination node) - `MultiDiGraph`: arbitrary numbers of directed edges allowed between every pair of nodes. - `DataFlowGraph`: similar to `MultiDiGraph`, but the edges entering, exiting a given nodes now have a well-defined order.
DataflowGraph
(note capitalization) also has some additional restrictions: first of all all DataflowGraph
s are DAGs, and they have an explicit concept of inputs and outputs from a node, with the restriction that exactly one edge (no more and no less) can be going into an individual input. Conceptually DataflowGraph
is intended to represent computation-style graphs, where edges represent value uses and nodes represent functions from tuples of inputs to tuples of outputs.
lib/utils/include/utils/graph/README.md
line 126 at r4 (raw file):
This graph class is particularly useful for processing a sub-graph of a given graph while still maintaining information regarding the edges that cross the cut. ### Labelled Dataflow Variant
The main (and possibly only now?) labelled variant is LabelledDataflowGraph
, which has labels for nodes and for node outputs, similar to the old OutputLabelledMultiDiGraph
, and LabelledOpenDataflowGraph
has labels for nodes and for OpenDataflowValue
s, which are a a variant
of node outputs and open graph inputs. It might be worth addressing these a bit more individually, and making it clear what in them is labelled (the concept and consequences of outputs being labelled rather than edges being labelled is a bit unintuitive, at least if you're approaching it from a more pure graph theory perspective)
lib/utils/include/utils/graph/undirected/undirected_edge.h
line 7 at r4 (raw file):
namespace FlexFlow { struct UndirectedEdge {
FYI this can be moved over to use dtgen
and commutative_pair
lib/utils/test/src/utils/containers.cc
line 13 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("sum") {
Ideally pull these out into their own files in utils/containers/
. The goal is to eventually remove containers.h
be either moving functions into their own files in utils/containers.h
, or by deleting functions that aren't particularly useful
lib/utils/test/src/utils/containers.cc
line 15 at r4 (raw file):
TEST_CASE("sum") { std::vector<int> v = {1, 2, 3, 4, 5}; CHECK(sum(v) == 15);
Check case of an empty list
lib/utils/test/src/utils/containers.cc
line 18 at r4 (raw file):
} TEST_CASE("sum_where") {
Check case of an empty list and/or a predicate that fails on every element
lib/utils/test/src/utils/containers.cc
line 24 at r4 (raw file):
} TEST_CASE("product_where") {
Check case of an empty list and/or a predicate that fails on every element
lib/utils/test/src/utils/containers.cc
line 30 at r4 (raw file):
} TEST_CASE("contains_l and contains_r") {
These should be moved into their own files in utils/bidict/algorithms
lib/utils/test/src/utils/containers.cc
line 38 at r4 (raw file):
CHECK(contains_l(bd, 3) == false); CHECK(contains_r(bd, std::string("one")) == true); CHECK(contains_r(bd, std::string("three")) == false);
Add a testcase where L
and R
are the same type to ensure that nothing gets confused (i.e., check that for bd = {{1, 2}, {3, 4}}
that contains_l(bd, 2)
is true and contains_r(bd, 2)
is false
lib/utils/test/src/utils/containers.cc
line 48 at r4 (raw file):
CHECK(is_submap(m1, m2) == true); CHECK(is_submap(m1, m3) == false);
Check additional edge cases here--what if the keys are a subset but not the values? What if the values are a subset but not the keys?
lib/utils/test/src/utils/containers.cc
line 51 at r4 (raw file):
} TEST_CASE("index_of") {
Check edge case where there are duplicate values (i.e., index_of({1, 2, 3, 4, 3, 5}, 3)
lib/utils/test/src/utils/containers.cc
line 57 at r4 (raw file):
} TEST_CASE("merge_maps") {
Split into separate merge_maps
and merge_bidicts
functions
lib/utils/test/src/utils/containers.cc
line 66 at r4 (raw file):
{"four", 4}}; bidict<int, std::string> lhs{fwd_map1, bwd_map1}; bidict<int, std::string> rhs{fwd_map2, bwd_map2};
Why construct like this, why not just use equate
or take in a list of pairs or something? Constructing fwd_map
and bwd_map
individually seems really error prone (and honestly the API probably shouldn't even allow this)
Code quote:
std::unordered_map<int, std::string> fwd_map1 = {{1, "one"}, {2, "two"}};
std::unordered_map<std::string, int> bwd_map1 = {{"one", 1}, {"two", 2}};
std::unordered_map<int, std::string> fwd_map2 = {{3, "three"},
{4, "four"}};
std::unordered_map<std::string, int> bwd_map2 = {{"three", 3},
{"four", 4}};
bidict<int, std::string> lhs{fwd_map1, bwd_map1};
bidict<int, std::string> rhs{fwd_map2, bwd_map2};
lib/utils/test/src/utils/containers.cc
line 78 at r4 (raw file):
std::unordered_map<int, std::string> rhs = {{3, "three"}, {4, "four"}}; std::unordered_map<int, std::string> result = merge_maps(lhs, rhs); std::unordered_map<int, std::string> expected = {
Add a check that merge_maps
throws an exception if the maps aren't disjoint in the key set
lib/utils/test/src/utils/containers.cc
line 78 at r4 (raw file):
std::unordered_map<int, std::string> rhs = {{3, "three"}, {4, "four"}}; std::unordered_map<int, std::string> result = merge_maps(lhs, rhs); std::unordered_map<int, std::string> expected = {
Add a testcase where the value sets aren't disjoint
lib/utils/test/src/utils/containers.cc
line 87 at r4 (raw file):
auto f = lookup_in(m); CHECK(f(1) == "one"); CHECK(f(2) == "two");
Check that the correct exception type is thrown if an invalid key is used
lib/utils/test/src/utils/containers.cc
line 90 at r4 (raw file):
} TEST_CASE("lookup_in_l") {
Move to its own file in utils/bidict/algorithms
lib/utils/test/src/utils/containers.cc
line 99 at r4 (raw file):
} TEST_CASE("lookup_in_r") {
Move to its own file in utils/bidict/algorithms
lib/utils/test/src/utils/containers.cc
line 114 at r4 (raw file):
CHECK(is_superseteq_of(s1, s2) == true); CHECK(is_superseteq_of(s1, s3) == false);
Check that ever set is a superseteq
of itself (i.e., check that this function is reflexive)
lib/utils/test/src/utils/containers.cc
line 117 at r4 (raw file):
} TEST_CASE("restrict_keys") {
Isn't there already a test file for this (i.e., restrict_keys.cc
)?
lib/utils/test/src/utils/containers.cc
line 127 at r4 (raw file):
TEST_CASE("optional_all_of") { std::vector<int> v = {2, 4, 6, 8};
What if the list is empty?
lib/utils/test/src/utils/containers.cc
line 128 at r4 (raw file):
TEST_CASE("optional_all_of") { std::vector<int> v = {2, 4, 6, 8}; auto f = [](int x) -> std::optional<bool> { return x % 2 == 0; };
Separate subcases
lib/utils/test/src/utils/containers.cc
line 140 at r4 (raw file):
} TEST_CASE("are_all_same") {
What if the list is empty?
lib/utils/test/src/utils/containers.cc
line 161 at r4 (raw file):
std::vector<int> v = {2, 3, 4, 5}; auto result = flatmap<int, decltype(get_factors), int>(v, get_factors);
Fix the type/template signature of flatmap
so that the template arguments can be deduced (i.e., deduce Out
from In
and F
using std::invoke_result_t
)
lib/utils/test/src/utils/containers.cc
line 173 at r4 (raw file):
} TEST_CASE("maximum") {
What's the behavior if the list is empty?
lib/utils/test/src/utils/hash-utils.cc
line 93 at r4 (raw file):
CHECK(hash1 != hash2); tuple1 = tuple2;
For consistency with the other testcases
Suggestion:
gte<0>(tuple1) = 2;
lib/utils/test/src/utils/join_strings.cc
line 12 at r4 (raw file):
std::vector<std::string> const v = {"Hello", "world", "!"}; SUBCASE("iterator") {
Ideally also add a testcase for the join_strings(InputIt, InputIt, std::string, F)
overload
lib/utils/test/src/utils/containers/are_disjoint.cc
line 13 at r4 (raw file):
CHECK(are_disjoint(l, r)); r.insert(3); CHECK_FALSE(are_disjoint(l, r));
Add a test case where one or both sets are empty
lib/utils/test/src/utils/containers/as_vector.cc
line 12 at r4 (raw file):
std::unordered_set<int> s = {1, 2, 3}; std::vector<int> result = as_vector(s); CHECK(result == std::vector<int>({3, 2, 1}));
Ideally don't rely on the ordering of std::unordered_set
lib/utils/test/src/utils/containers/enumerate.cc
line 12 at r4 (raw file):
TEST_CASE("enumerate") { std::unordered_set<int> input_set = {1, 2, 3, 4, 5}; std::unordered_map<size_t, int> result = enumerate(input_set);
Doesn't this return a bidict
?
lib/utils/test/src/utils/containers/enumerate.cc
line 13 at r4 (raw file):
std::unordered_set<int> input_set = {1, 2, 3, 4, 5}; std::unordered_map<size_t, int> result = enumerate(input_set); std::unordered_map<size_t, int> expected = {
Suggestion:
std::unordered_map<size_t, int> correct = {
lib/utils/test/src/utils/containers/enumerate.cc
line 14 at r4 (raw file):
std::unordered_map<size_t, int> result = enumerate(input_set); std::unordered_map<size_t, int> expected = { {1, 4}, {2, 3}, {3, 2}, {4, 1}, {0, 5}};
Ideally don't rely on the ordering of an unordered_set
--better to enumerate
a vector
if you want to check precise ordering, or do different checks that aren't order dependent to test the overload for unordered_set
Code quote:
{1, 4}, {2, 3}, {3, 2}, {4, 1}, {0, 5}};
lib/utils/test/src/utils/containers/filter_keys.cc
line 14 at r4 (raw file):
auto f = [](int x) { return x % 2 == 1; }; // Filtering function std::unordered_map<int, std::string> result = filter_keys(m, f); std::unordered_map<int, std::string> expected = {{1, "one"}, {3, "three"}};
Suggestion:
std::unordered_map<int, std::string> correct = {{1, "one"}, {3, "three"}};
lib/utils/test/src/utils/containers/find.cc
line 7 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("find") {
Since find
is intended to work over a bunch of data structures (vector
, unordered_set
, set
at the very least) add tests of additional datatypes--just because the code of the function is the same doesn't necessarily mean that the APIs used have the same meaning for each of these datatu[es
Code quote:
TEST_CASE("find") {
lib/utils/test/src/utils/containers/find.cc
line 9 at r4 (raw file):
TEST_CASE("find") { std::vector<int> v = {1, 2, 3, 4, 5}; CHECK(find(v, 3) != v.cend());
Suggestion:
CHECK(find(v, 3) == v.find(3));
lib/utils/test/src/utils/containers/find.cc
line 10 at r4 (raw file):
std::vector<int> v = {1, 2, 3, 4, 5}; CHECK(find(v, 3) != v.cend()); CHECK(find(v, 6) == v.cend());
Suggestion:
CHECK(find(v, 6) == v.find(6));
lib/utils/test/src/utils/containers/get_first.cc
line 10 at r2 (raw file):
Previously, Marsella8 wrote…
maybe
get_any
? get first is a bit confusing.
Good point, but I'd probably go with get_one_of
or choose_one_of
lib/utils/test/src/utils/containers/keys.cc
line 14 at r4 (raw file):
{1, "one"}, {2, "two"}, {3, "three"}}; std::unordered_set<int> result = keys(m); std::unordered_set<int> expected = {3, 2, 1};
Why flip this unnecessarily?
Suggestion:
std::unordered_set<int> expected = {1, 2, 3};
lib/utils/test/src/utils/containers/map_keys.cc
line 11 at r4 (raw file):
TEST_CASE("map_keys") { std::unordered_map<int, std::string> m = {{1, "one"}, {2, "two"}}; auto f = [](int x) { return x * x; }; // Mapping function
Suggestion:
auto f = [](int x) { return x * x; };
lib/utils/test/src/utils/containers/map_keys.cc
line 12 at r4 (raw file):
std::unordered_map<int, std::string> m = {{1, "one"}, {2, "two"}}; auto f = [](int x) { return x * x; }; // Mapping function auto result = map_keys(m, f);
Suggestion:
std::unordered_map<int, std::string> result = map_keys(m, f);
lib/utils/test/src/utils/containers/map_keys.cc
line 15 at r4 (raw file):
CHECK(result.size() == 2); CHECK(result[1] == "one"); CHECK(result[4] == "two");
Suggestion:
std::unordered_map<int, std::string> correct = {{1, "one"}, {4, "two"}};
CHECK(result == correct);
lib/utils/test/src/utils/containers/map_values.cc
line 10 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("map_values") { std::unordered_map<int, std::string> m = {{1, "one"}, {2, "two"}};
It would be nice to have not all the resulting sizes be the same
Suggestion:
std::unordered_map<int, std::string> m = {{1, "one"}, {3, "three"}};
lib/utils/test/src/utils/containers/map_values.cc
line 11 at r4 (raw file):
TEST_CASE("map_values") { std::unordered_map<int, std::string> m = {{1, "one"}, {2, "two"}}; auto f = [](std::string const &s) { return s.size(); }; // Mapping function
Suggestion:
auto f = [](std::string const &s) { return s.size(); };
lib/utils/test/src/utils/containers/map_values.cc
line 13 at r4 (raw file):
auto f = [](std::string const &s) { return s.size(); }; // Mapping function std::unordered_map<int, size_t> result = map_values(m, f); std::unordered_map<int, size_t> expected = {{1, 3}, {2, 3}};
Use correct
instead of expected
for consistency
Suggestion:
std::unordered_map<int, size_t> correct = {{1, 3}, {2, 3}};
lib/utils/test/src/utils/containers/restrict_keys.cc
line 15 at r4 (raw file):
std::unordered_set<int> mask = {2, 3, 4}; std::unordered_map<int, std::string> result = restrict_keys(m, mask); std::unordered_map<int, std::string> expected = {{2, "two"}, {3, "three"}};
Suggestion:
std::unordered_map<int, std::string> correct = {{2, "two"}, {3, "three"}};
lib/utils/test/src/utils/containers/reversed.cc
line 14 at r4 (raw file):
// Checking the reversed sequence is as expected CHECK(result == expected);
Suggestion:
std::vector<int> result = reversed(input_vec);
std::vector<int> correct = {5, 4, 3, 2, 1};
CHECK(result == correct);
lib/utils/test/src/utils/containers/set_union.cc
line 12 at r4 (raw file):
std::unordered_set<int> s2 = {2, 3, 4}; std::unordered_set<int> result = set_union(s1, s2); std::unordered_set<int> expected = {1, 2, 3, 4};
Suggestion:
std::unordered_set<int> correct = {1, 2, 3, 4};
lib/utils/test/src/utils/containers/sorted_by.cc
line 9 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("Testing sorted_by function") {
Suggestion:
TEST_CASE("sorted_by") {
lib/utils/test/src/utils/containers/sorted_by.cc
line 12 at r4 (raw file):
std::unordered_set<int> s = {5, 2, 3, 4, 1}; auto sorted_s = sorted_by(s, [](int a, int b) { return a < b; }); CHECK(sorted_s == std::vector<int>({1, 2, 3, 4, 5}));
Suggestion:
SUBCASE("sort increasing") {
std::unordered_set<int> s = {5, 2, 3, 4, 1};
std::vector<int> result = sorted_by(s, [](int a, int b) { return a < b; });
std::vector<int> correct = {1, 2, 3, 4, 5};
CHECK(result == correct);
}
lib/utils/test/src/utils/containers/sorted_by.cc
line 16 at r4 (raw file):
std::unordered_set<int> s2 = {-5, -1, -3, -2, -4}; auto sorted_s2 = sorted_by(s2, [](int a, int b) { return a > b; }); CHECK(sorted_s2 == std::vector<int>({-1, -2, -3, -4, -5}));
Suggestion:
SUBCASE("sort decreasing") {
std::unordered_set<int> input = {-5, -1, -3, -2, -4};
std::vector<int> result = sorted_by(input, [](int a, int b) { return a > b; });
std::vector<int> correct = {-1, -2, -3, -4, -5};
CHECK(result == correct);
}
lib/utils/test/src/utils/containers/subvec.cc
line 8 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("Testing subvec function") {
Suggestion:
TEST_CASE("subvec") {
lib/utils/test/src/utils/containers/subvec.cc
line 15 at r4 (raw file):
auto subvec_v2 = subvec(v, std::nullopt, std::optional<int>(3)); CHECK(subvec_v2 == std::vector<int>({1, 2, 3}));
Break up into subcases, use result
and correct
names, test more edge cases (what if both values are std::nullopt
? what if they are both integers, but the first one is less than the other? What if the integers are negative?)
Code quote:
std::vector<int> v = {1, 2, 3, 4, 5};
auto subvec_v = subvec(v, std::optional<int>(1), std::optional<int>(4));
CHECK(subvec_v == std::vector<int>({2, 3, 4}));
auto subvec_v2 = subvec(v, std::nullopt, std::optional<int>(3));
CHECK(subvec_v2 == std::vector<int>({1, 2, 3}));
lib/utils/test/src/utils/containers/values.cc
line 10 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("values") {
Add a testcase where there are duplicate values
lib/utils/test/src/utils/containers/values.cc
line 14 at r4 (raw file):
{1, "one"}, {2, "two"}, {3, "three"}}; std::vector<std::string> result = values(m); std::vector<std::string> expected = {"three", "two", "one"};
Don't rely on the order of unordered_set
Code quote:
std::vector<std::string> expected = {"three", "two", "one"};
lib/utils/test/src/utils/containers/values.cc
line 14 at r4 (raw file):
{1, "one"}, {2, "two"}, {3, "three"}}; std::vector<std::string> result = values(m); std::vector<std::string> expected = {"three", "two", "one"};
Use correct
instead of expected
for consistency
lib/utils/test/src/utils/containers/vector_split.cc
line 8 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("Testing vector_split function") {
Check boundary cases (what if idx
is 0? what if idx
is 4?)
lib/utils/test/src/utils/containers/vector_split.cc
line 8 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("Testing vector_split function") {
Suggestion:
TEST_CASE("vector_split") {
lib/utils/test/src/utils/graph/traversal.cc
line 8 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("traversal") {
Split into separate TEST_CASE
s for separate functions (i.e., one TEST_CASE
for get_unchecked_dfs_ordering
, one TEST_CASE
for get_bfs_ordering
, one TEST_CASE
for get_dfs_ordering
, each with potentially multiple SUBCASE
s)
Code quote:
TEST_CASE("traversal") {
lib/utils/test/src/utils/graph/traversal.cc
line 34 at r4 (raw file):
CHECK(get_dfs_ordering(g, {n[0]}) == std::vector<Node>{n[0], n[1], n[2], n[3]});
Use result
and correct
to improve readability
Code quote:
CHECK(get_dfs_ordering(g, {n[0]}) ==
std::vector<Node>{n[0], n[1], n[2], n[3]});
lib/utils/test/src/utils/graph/traversal.cc
line 37 at r4 (raw file):
} SUBCASE("nonlinear") { g.add_edge(DirectedEdge{n[1], n[3]});
No check in this SUBCASE
?
lib/utils/test/src/utils/graph/digraph/algorithms.cc
line 11 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("DiGraph - algorithms.cc") {
Break into separate TEST_CASE
s (one for each function) and add tests for some more error cases (graphs with multiple components, cycles, etc. Try to keep them small, you won't need huge graphs to test these properties as long as you're not trying to test them all at once)
lib/utils/test/src/utils/graph/digraph/algorithms.cc
line 19 at r4 (raw file):
DirectedEdge{n[0], n[1]}, DirectedEdge{n[0], n[2]}, DirectedEdge{n[1], n[2]},
Sort for readability
Suggestion:
DirectedEdge{n[0], n[1]},
DirectedEdge{n[0], n[2]},
DirectedEdge{n[0], n[3]},
DirectedEdge{n[1], n[2]},
lib/utils/test/src/utils/graph/digraph/algorithms.cc
line 24 at r4 (raw file):
SUBCASE("get_edges") { std::unordered_set<DirectedEdge> expected_edges = unordered_set_of(e);
Prefer correct
over expected
for consistency
Code quote:
expected_edges
lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc
line 17 at r4 (raw file):
add_edges(g, {DirectedEdge{n[0], n[1]},
Prefer .at
for bounds checking
lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc
line 35 at r4 (raw file):
SUBCASE("matches_edge") { DirectedEdgeQuery q{{n[0]}, {n[1]}};
Suggestion:
DirectedEdgeQuery q = DirectedEdgeQuery{
query_set{n[0]},
query_set{n[1]}
};
lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc
line 41 at r4 (raw file):
} SUBCASE("query_intersection") {
Should also test query_intersection
when some of the query_set
s are std::nullopt
(i.e., they match everything). Then again this function isn't hugely important, so up to you
lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc
line 42 at r4 (raw file):
SUBCASE("query_intersection") { DirectedEdgeQuery q1{{n[0], n[1]}, {n[1], n[2], n[4]}};
More verbose, but probably more readable than trying to parse the nested curly braces
Suggestion:
DirectedEdgeQuery q1 = DirectedQuery{
query_set{n[0], n[1]},
query_set{n[1], n[2], n[4]},
};
lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc
line 51 at r4 (raw file):
CHECK(result.srcs == expected_srcs); CHECK(result.dsts == expected_dsts);
Suggestion:
DirectedEdgeQuery result = query_intersection(q1, q2);
DirectedEdgeQuery correct = DirectedEdgeQuery{
query_set{n.at(1)},
query_set{n.at(2)},
};
CHECK(result == correct);
lib/utils/test/src/utils/graph/digraph/get_connected_components.cc
line 20 at r4 (raw file):
}; CHECK(get_connected_components(g) == expected_components);
Suggestion:
UndirectedGraph g = UndirectedGraph::create<HashmapUndirectedGraph>();
std::vector<Node> n = add_nodes(g, 4);
std::vector<UndirectedEdge> edges = {{n[0], n[1]}, {n[2], n[1]}};
add_edges(g, edges);
std::unordered_set<std::unordered_set<Node>> result = get_connected_components(g);
std::unordered_set<std::unordered_set<Node>> correct = {
{n[0], n[1], n[2]},
{n[3]},
};
CHECK(result == correct););
lib/utils/test/src/utils/graph/digraph/get_topological_ordering.cc
line 13 at r4 (raw file):
DiGraph g = DiGraph::create<AdjacencyDiGraph>(); std::vector<Node> n = add_nodes(g, 6); std::vector<DirectedEdge> edges = {DirectedEdge{n[0], n[1]},
Prefer .at
for bounds checking
lib/utils/test/src/utils/graph/digraph/get_topological_ordering.cc
line 24 at r4 (raw file):
CHECK(index_of(ordering, n[l]).has_value()); CHECK(index_of(ordering, n[r]).has_value()); CHECK(index_of(ordering, n[l]) < index_of(ordering, n[r]));
Comparing the std::optional
s directly feels a bit weird. Also you probably don't need the contains checks if you dereference the values, as dereferencing a std::optional
without a value already throws an error
Suggestion:
CHECK(index_of(ordering, n[l]).value() < index_of(ordering, n[r]).value());
lib/utils/test/src/utils/graph/digraph/get_weakly_connected_components.cc
line 9 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE(FF_TEST_SUITE) {
Test case name?
lib/utils/test/src/utils/graph/digraph/get_weakly_connected_components.cc
line 9 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE(FF_TEST_SUITE) {
Suggestion:
TEST_SUITE(FF_TEST_SUITE) {
TEST_CASE(FF_TEST_SUITE) {
lib/utils/test/src/utils/graph/digraph/get_weakly_connected_components.cc
line 22 at r4 (raw file):
}; CHECK(get_weakly_connected_components(g) == expected_components);
Suggestion:
std::vector<DirectedEdge> edges = {DirectedEdge{n[0], n[1]},
DirectedEdge{n[2], n[1]}};
add_edges(g, edges);
std::unordered_set<std::unordered_set<Node>> result = get_weakly_connected_components(g);
std::unordered_set<std::unordered_set<Node>> correct = {
{n[0], n[1], n[2]},
{n[3]},
};
CHECK(result == correct);
lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc
line 10 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("get_dominators") {
Another good test graph (contains cycles): https://github.com/flexflow/FlexFlow/blob/5e1f349215294ae312ccac9f2891aecacaf95e39/lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators_map.cc#L10-L26
lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc
line 24 at r4 (raw file):
SUBCASE("single node") { std::unordered_set<Node> expected_dominators = {n[0], n[2]}; CHECK(get_dominators(g, n[2]) == expected_dominators);
Prefer consistency where possible, even if it's a bit verbose
Suggestion:
std::unordered_set<Node> result = get_dominators(g, n[2]);
std::unordered_set<Node> correct = {n[0], n[2]};
CHECK(result == correct);
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 23 at r4 (raw file):
MultiDiEdge e1 = g.add_edge(n2, n0); MultiDiEdge e2 = g.add_edge(n0, n2); MultiDiEdge e3 = g.add_edge(n1, n2);
Why no duplicate edges? The whole point of MultiDiGraph
is that you can do g.add_edge(n0, n1); g.add_edge(n0, n1);
, and then you end up with two MultiDiEdge
s having been added. add_edge
should be tested for this.
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 25 at r4 (raw file):
MultiDiEdge e3 = g.add_edge(n1, n2); SUBCASE("add_node") {
Could consider using get_graph_data
as introduced in #1471 for some of these? Up to you, this way of doing things is also fine, just checking all of the graph data might be more complete in some ways
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 28 at r4 (raw file):
Node n3 = g.add_node(); std::unordered_set<Node> nodes = g.query_nodes(NodeQuery{{n3}}); CHECK(contains(nodes, n3));
Suggestion:
std::unordered_set<Node> query_result = g.query_nodes(NodeQuery{{n3}});
std::unordered_set<Node> correct = {n3};
CHECK(query_result == correct);
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 35 at r4 (raw file):
std::unordered_set<MultiDiEdge> edges = g.query_edges(MultiDiEdgeQuery({n2}, {n1})); CHECK(contains(edges, e4));
Suggestion:
MultiDiEdge e4 = g.add_edge(n2, n1);
std::unordered_set<MultiDiEdge> query_result =
g.query_edges(MultiDiEdgeQuery({n2}, {n1}));
std::unordered_set<MultiDiEdge> correct = {e4};
CHECK(query_result == correct);
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 41 at r4 (raw file):
g.remove_node(n0); std::unordered_set<Node> nodes = g.query_nodes(NodeQuery{{n0}}); CHECK_FALSE(contains(nodes, n0));
Suggestion:
std::unordered_set<Node> node_query_result = g.query_nodes(NodeQuery{{n0}});
std::unordered_set<Node> correct_node_query_result = {};
CHECK(node_query_result == correct_node_query_result);
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 47 at r4 (raw file):
CHECK_FALSE(contains(edges_after_removal, e0)); CHECK_FALSE(contains(edges_after_removal, e1)); CHECK_FALSE(contains(edges_after_removal, e2));
This test isn't correct--this query will only return edges from n0 -> n0
, so even if n0
wasn't deleted, edges_after_removal
would still be empty
Code quote:
std::unordered_set<MultiDiEdge> edges_after_removal =
g.query_edges(MultiDiEdgeQuery({n0}, {n0}));
CHECK_FALSE(contains(edges_after_removal, e0));
CHECK_FALSE(contains(edges_after_removal, e1));
CHECK_FALSE(contains(edges_after_removal, e2));
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 54 at r4 (raw file):
std::unordered_set<MultiDiEdge> edges = g.query_edges(MultiDiEdgeQuery({n1}, {n2})); CHECK_FALSE(contains(edges, e3));
Suggestion:
std::unordered_set<MultiDiEdge> query_result =
g.query_edges(MultiDiEdgeQuery({n1}, {n2}));
std::unordered_set<MultiDiEdge> correct = {};
CHECK_FALSE(query_result == correct);
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 59 at r4 (raw file):
SUBCASE("query_nodes") { std::unordered_set<Node> all_nodes = g.query_nodes(NodeQuery{{n0, n1, n2}});
Also test matchall
, and queries for nodes that don't exist?
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 69 at r4 (raw file):
SUBCASE("query_edges") { std::unordered_set<MultiDiEdge> all_edges = g.query_edges(MultiDiEdgeQuery({n0, n1, n2}, {n0, n1, n2}));
Also test matchall
, and for nodes that don't exist?
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc
line 14 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("MultiDiGraph - get_incoming_edges") {
Suggestion:
TEST_CASE("get_incoming_edges(MultiDiGraphView, Node)") {
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc
line 23 at r4 (raw file):
{n.at(1), n.at(1)}, {n.at(0), n.at(0)}, };
Why not the usual std::vector<DirectedEdge>
?
Code quote:
std::vector<std::pair<Node, Node>> input = {
{n.at(0), n.at(1)},
{n.at(0), n.at(1)},
{n.at(1), n.at(1)},
{n.at(0), n.at(0)},
};
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc
line 23 at r4 (raw file):
{n.at(1), n.at(1)}, {n.at(0), n.at(0)}, };
Sort for readability
Suggestion:
std::vector<std::pair<Node, Node>> input = {
{n.at(0), n.at(0)},
{n.at(0), n.at(1)},
{n.at(0), n.at(1)},
{n.at(1), n.at(1)},
};
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc
line 29 at r4 (raw file):
CHECK(get_incoming_edges(g, n[1]) == std::unordered_set<MultiDiEdge>{edges[0], edges[1], edges[2]}); CHECK(get_incoming_edges(g, n[2]) == std::unordered_set<MultiDiEdge>{});
Suggestion:
SUBCASE("node has incoming edges") {
std::unordered_set<MultiDiEdge> result = get_incoming_edges(g, n.at(1));
std::unordered_set<MultiDiEdge> correct = {edges.at(0), edges.at(1), edges.at(2)};
CHECK(result == correct);
}
SUBCASE("node has no incoming edges") {
std::unordered_set<MultiDiEdge> result = get_incoming_edges(g, n.at(2));
std::unordered_set<MultiDiEdge> correct = {};
CHECK(result == correct);
}
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc
line 14 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("MultiDiGraph - get_outgoing_edges") {
Suggestion:
TEST_CASE("get_outgoing_edges(MultiDiGraphView, Node)") {
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc
line 22 at r4 (raw file):
{n.at(0), n.at(1)}, {n.at(1), n.at(1)}, {n.at(0), n.at(0)},
Sort for readability
Suggestion:
{n.at(0), n.at(0)},
{n.at(0), n.at(1)},
{n.at(0), n.at(1)},
{n.at(1), n.at(1)},
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc
line 29 at r4 (raw file):
CHECK(get_outgoing_edges(g, n[0]) == std::unordered_set<MultiDiEdge>{edges[0], edges[1], edges[3]}); CHECK(get_outgoing_edges(g, n[2]) == std::unordered_set<MultiDiEdge>{});
SUBCASE
s as in `get_incoming_edges
Code quote:
CHECK(get_outgoing_edges(g, n[0]) ==
std::unordered_set<MultiDiEdge>{edges[0], edges[1], edges[3]});
CHECK(get_outgoing_edges(g, n[2]) == std::unordered_set<MultiDiEdge>{});
lib/utils/test/src/utils/graph/views/views.cc
line 48 at r1 (raw file):
Previously, Marsella8 wrote…
possible bug? currently the get_edges also returns the edges that "cross the cut" (e.g. (1, 2)), though this does not happen for example with DiSubgraphView.
Good catch, yeah that seems like a bug
lib/utils/test/src/utils/graph/views/views.cc
line 88 at r1 (raw file):
Previously, Marsella8 wrote…
commented because not yet implemented
As in the class isn't implemented or the testcase isn't implemented?
lib/utils/test/src/utils/graph/views/views.cc
line 52 at r4 (raw file):
} TEST_CASE("DiSubgraphView") {
Probably better to just test the corresponding get_subgraph
method instead rather than testing the views directly (and same for the other views tested here)--the view classes themselves are probably better thought of as internal implementation details. Also, get_graph_data
as introduced in #1471 should simply these tests.
It's not quite perfect because it doesn't test the full power of the query
interface (both this and get_graph_data
just query all, so concrete queries don't get tested), but I don't have a good solution for this yet so what's here is totally fine for now
lib/utils/test/src/utils/random_utils.cc
line 5 at r4 (raw file):
#include <algorithm> void checkProbabilities(std::vector<int> const &counts,
Might be better as a lambda since it's only used in one testcase, but not a big deal either way
lib/utils/test/src/utils/random_utils.cc
line 6 at r4 (raw file):
void checkProbabilities(std::vector<int> const &counts, int numIterations,
Why pass numIterations
instead of just computing sum(values(counts))
?
lib/utils/test/src/utils/random_utils.cc
line 8 at r4 (raw file):
int numIterations, std::vector<float> const &weights, float totalWeight) {
Why not just use sum(weights)
instead of passing totalWeight
explicitly?
Code quote:
float totalWeight
lib/utils/test/src/utils/random_utils.cc
line 12 at r4 (raw file):
float expectedProbability = weights[i] / totalWeight; float observedProbability = static_cast<float>(counts[i]) / numIterations; CHECK(observedProbability ==
Prefer a single equality check of the whole map instead of a for loop containing an equality check of each key/value
lib/utils/test/src/utils/random_utils.cc
line 18 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("select_random") {
Suggestion:
TEST_CASE("select_random(std::vector<T>)") {
lib/utils/test/src/utils/random_utils.cc
line 24 at r4 (raw file):
int result = select_random(values); CHECK(std::find(values.begin(), values.end(), result) != values.end());
Feel free to use contains
here
lib/utils/test/src/utils/random_utils.cc
line 29 at r4 (raw file):
SUBCASE("Invalid arguments") { std::vector<float> weights = {0.1f, 0.3f, 0.2f}; CHECK(select_random(values, weights) == 2);
What's going on here? Why should we return 2 if the arguments are invalid? Also, why is this testcase not under the weighted section below?
lib/utils/test/src/utils/random_utils.cc
line 38 at r4 (raw file):
std::vector<float> weights = {1.0f, 1.0f, 1.0f, 1.0f, 1.0f}; std::vector<int> counts(values.size(), 0);
Not sure if this code is exactly right, but something like it
Suggestion:
std::unordered_map<int, int> counts = unordered_map{zip(values, repeat(0)};
lib/utils/test/src/utils/random_utils.cc
line 42 at r4 (raw file):
for (int i = 0; i < numIterations; i++) { int selected = select_random(values, weights); counts[selected - 1]++;
Suggestion:
counts.at(selected)++;
lib/utils/test/src/utils/random_utils.cc
line 57 at r4 (raw file):
int selected = select_random(values, weights); counts[selected - 1]++; }
Pull out into a separate sample
lambda?
Code quote:
std::vector<int> counts(values.size(), 0);
int const numIterations = 10000;
for (int i = 0; i < numIterations; i++) {
int selected = select_random(values, weights);
counts[selected - 1]++;
}
lib/utils/test/src/utils/stack_string.cc
line 85 at r4 (raw file):
} TEST_CASE("Arbitrary<stack_string>") {
Why was this removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 100 unresolved discussions (waiting on @lockshaw)
lib/utils/include/utils/graph/README.md
line 21 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
DataflowGraph
(note capitalization) also has some additional restrictions: first of all allDataflowGraph
s are DAGs, and they have an explicit concept of inputs and outputs from a node, with the restriction that exactly one edge (no more and no less) can be going into an individual input. ConceptuallyDataflowGraph
is intended to represent computation-style graphs, where edges represent value uses and nodes represent functions from tuples of inputs to tuples of outputs.
Done.
lib/utils/include/utils/graph/README.md
line 126 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
The main (and possibly only now?) labelled variant is
LabelledDataflowGraph
, which has labels for nodes and for node outputs, similar to the oldOutputLabelledMultiDiGraph
, andLabelledOpenDataflowGraph
has labels for nodes and forOpenDataflowValue
s, which are a avariant
of node outputs and open graph inputs. It might be worth addressing these a bit more individually, and making it clear what in them is labelled (the concept and consequences of outputs being labelled rather than edges being labelled is a bit unintuitive, at least if you're approaching it from a more pure graph theory perspective)
Done.
lib/utils/include/utils/graph/README.md
line at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Thanks for going out of your way to update the documentation! This is very much appreciated 🙂
np!
lib/utils/include/utils/graph/undirected/undirected_edge.h
line 7 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
FYI this can be moved over to use
dtgen
andcommutative_pair
done (if you see rc adjacent changes it's because of this)
lib/utils/test/src/utils/containers.cc
line 13 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Ideally pull these out into their own files in
utils/containers/
. The goal is to eventually removecontainers.h
be either moving functions into their own files inutils/containers.h
, or by deleting functions that aren't particularly useful
containers.
cc is no more (deleted a few functions which were very specific and not being used anywhere within the codebase, everything else has been moved into their own modules).
lib/utils/test/src/utils/containers.cc
line 15 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Check case of an empty list
Done.
lib/utils/test/src/utils/containers.cc
line 18 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Check case of an empty list and/or a predicate that fails on every element
Done.
lib/utils/test/src/utils/containers.cc
line 24 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Check case of an empty list and/or a predicate that fails on every element
Done.
lib/utils/test/src/utils/containers.cc
line 30 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
These should be moved into their own files in
utils/bidict/algorithms
already in bidict as a member function
lib/utils/test/src/utils/containers.cc
line 38 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Add a testcase where
L
andR
are the same type to ensure that nothing gets confused (i.e., check that forbd = {{1, 2}, {3, 4}}
thatcontains_l(bd, 2)
is true andcontains_r(bd, 2)
is false
Done.
lib/utils/test/src/utils/containers.cc
line 48 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Check additional edge cases here--what if the keys are a subset but not the values? What if the values are a subset but not the keys?
Done.
lib/utils/test/src/utils/containers.cc
line 51 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Check edge case where there are duplicate values (i.e.,
index_of({1, 2, 3, 4, 3, 5}, 3)
Done.
lib/utils/test/src/utils/containers.cc
line 57 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Split into separate
merge_maps
andmerge_bidicts
functions
Done.
lib/utils/test/src/utils/containers.cc
line 66 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why construct like this, why not just use
equate
or take in a list of pairs or something? Constructingfwd_map
andbwd_map
individually seems really error prone (and honestly the API probably shouldn't even allow this)
Done.
lib/utils/test/src/utils/containers.cc
line 78 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Add a check that
merge_maps
throws an exception if the maps aren't disjoint in the key set
Done.
lib/utils/test/src/utils/containers.cc
line 78 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Add a testcase where the value sets aren't disjoint
Done.
lib/utils/test/src/utils/containers.cc
line 87 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Check that the correct exception type is thrown if an invalid key is used
removed the function: seems overly specific (can be done with other functions), and is not used anywhere in the codebase
lib/utils/test/src/utils/containers.cc
line 90 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Move to its own file in
utils/bidict/algorithms
removed the functions (not used anywhere in the codebase, seemed very specific).
lib/utils/test/src/utils/containers.cc
line 99 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Move to its own file in
utils/bidict/algorithms
see above.
lib/utils/test/src/utils/containers.cc
line 114 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Check that ever set is a
superseteq
of itself (i.e., check that this function is reflexive)
Done.
lib/utils/test/src/utils/containers.cc
line 117 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Isn't there already a test file for this (i.e.,
restrict_keys.cc
)?
Done.
lib/utils/test/src/utils/containers.cc
line 127 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
What if the list is empty?
Done.
lib/utils/test/src/utils/containers.cc
line 128 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Separate subcases
Done.
lib/utils/test/src/utils/containers.cc
line 140 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
What if the list is empty?
ended up putting a throw for an empty container (i think for other functions it makes sense to vacuously define how it behaves for an empty container (e.g. sum, optional_all_of), but for are_all_same
I can't really think of a situation where somebody would want to pass an empty container. If you want to be pedantic, you could argue that you can define is_all_same recursively and the empty container would have to evaluate to true, but I feel like in practice in doesn't make much sense.
lib/utils/test/src/utils/containers.cc
line 161 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Fix the type/template signature of
flatmap
so that the template arguments can be deduced (i.e., deduceOut
fromIn
andF
usingstd::invoke_result_t
)
Done.
lib/utils/test/src/utils/containers.cc
line 173 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
What's the behavior if the list is empty?
now throws an exception
lib/utils/test/src/utils/hash-utils.cc
line 93 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
For consistency with the other testcases
Done.
lib/utils/test/src/utils/join_strings.cc
line 12 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Ideally also add a testcase for the
join_strings(InputIt, InputIt, std::string, F)
overload
Done.
lib/utils/test/src/utils/containers/are_disjoint.cc
line 13 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Add a test case where one or both sets are empty
Done.
lib/utils/test/src/utils/containers/as_vector.cc
line 12 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Ideally don't rely on the ordering of
std::unordered_set
Done.
lib/utils/test/src/utils/containers/enumerate.cc
line 12 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Doesn't this return a
bidict
?
File has been overwritten by recent repo-refactor
change
lib/utils/test/src/utils/containers/enumerate.cc
line 13 at r4 (raw file):
std::unordered_set<int> input_set = {1, 2, 3, 4, 5}; std::unordered_map<size_t, int> result = enumerate(input_set); std::unordered_map<size_t, int> expected = {
Done.
lib/utils/test/src/utils/containers/enumerate.cc
line 14 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Ideally don't rely on the ordering of an
unordered_set
--better toenumerate
avector
if you want to check precise ordering, or do different checks that aren't order dependent to test the overload forunordered_set
Done.
lib/utils/test/src/utils/containers/filter_keys.cc
line 14 at r4 (raw file):
auto f = [](int x) { return x % 2 == 1; }; // Filtering function std::unordered_map<int, std::string> result = filter_keys(m, f); std::unordered_map<int, std::string> expected = {{1, "one"}, {3, "three"}};
Done.
lib/utils/test/src/utils/containers/find.cc
line 7 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Since
find
is intended to work over a bunch of data structures (vector
,unordered_set
,set
at the very least) add tests of additional datatypes--just because the code of the function is the same doesn't necessarily mean that the APIs used have the same meaning for each of these datatu[es
Done.
lib/utils/test/src/utils/containers/find.cc
line 9 at r4 (raw file):
TEST_CASE("find") { std::vector<int> v = {1, 2, 3, 4, 5}; CHECK(find(v, 3) != v.cend());
Done.
lib/utils/test/src/utils/containers/find.cc
line 10 at r4 (raw file):
std::vector<int> v = {1, 2, 3, 4, 5}; CHECK(find(v, 3) != v.cend()); CHECK(find(v, 6) == v.cend());
Done.
lib/utils/test/src/utils/containers/get_first.cc
line 10 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Good point, but I'd probably go with
get_one_of
orchoose_one_of
changed to get_one_of
lib/utils/test/src/utils/containers/keys.cc
line 14 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why flip this unnecessarily?
Done.
lib/utils/test/src/utils/containers/map_keys.cc
line 11 at r4 (raw file):
TEST_CASE("map_keys") { std::unordered_map<int, std::string> m = {{1, "one"}, {2, "two"}}; auto f = [](int x) { return x * x; }; // Mapping function
Done.
lib/utils/test/src/utils/containers/map_keys.cc
line 12 at r4 (raw file):
std::unordered_map<int, std::string> m = {{1, "one"}, {2, "two"}}; auto f = [](int x) { return x * x; }; // Mapping function auto result = map_keys(m, f);
Done.
lib/utils/test/src/utils/containers/map_keys.cc
line 15 at r4 (raw file):
CHECK(result.size() == 2); CHECK(result[1] == "one"); CHECK(result[4] == "two");
Done.
lib/utils/test/src/utils/containers/map_values.cc
line 10 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
It would be nice to have not all the resulting sizes be the same
Done.
lib/utils/test/src/utils/containers/map_values.cc
line 11 at r4 (raw file):
TEST_CASE("map_values") { std::unordered_map<int, std::string> m = {{1, "one"}, {2, "two"}}; auto f = [](std::string const &s) { return s.size(); }; // Mapping function
Done.
lib/utils/test/src/utils/containers/map_values.cc
line 13 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Use
correct
instead ofexpected
for consistency
Done.
lib/utils/test/src/utils/containers/restrict_keys.cc
line 15 at r4 (raw file):
std::unordered_set<int> mask = {2, 3, 4}; std::unordered_map<int, std::string> result = restrict_keys(m, mask); std::unordered_map<int, std::string> expected = {{2, "two"}, {3, "three"}};
Done.
lib/utils/test/src/utils/containers/reversed.cc
line 14 at r4 (raw file):
// Checking the reversed sequence is as expected CHECK(result == expected);
Done.
lib/utils/test/src/utils/containers/set_union.cc
line 12 at r4 (raw file):
std::unordered_set<int> s2 = {2, 3, 4}; std::unordered_set<int> result = set_union(s1, s2); std::unordered_set<int> expected = {1, 2, 3, 4};
Done.
lib/utils/test/src/utils/containers/sorted_by.cc
line 9 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("Testing sorted_by function") {
Done.
lib/utils/test/src/utils/containers/sorted_by.cc
line 12 at r4 (raw file):
std::unordered_set<int> s = {5, 2, 3, 4, 1}; auto sorted_s = sorted_by(s, [](int a, int b) { return a < b; }); CHECK(sorted_s == std::vector<int>({1, 2, 3, 4, 5}));
Done.
lib/utils/test/src/utils/containers/sorted_by.cc
line 16 at r4 (raw file):
std::unordered_set<int> s2 = {-5, -1, -3, -2, -4}; auto sorted_s2 = sorted_by(s2, [](int a, int b) { return a > b; }); CHECK(sorted_s2 == std::vector<int>({-1, -2, -3, -4, -5}));
Done.
lib/utils/test/src/utils/containers/subvec.cc
line 8 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("Testing subvec function") {
Done.
lib/utils/test/src/utils/containers/subvec.cc
line 15 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Break up into subcases, use
result
andcorrect
names, test more edge cases (what if both values arestd::nullopt
? what if they are both integers, but the first one is less than the other? What if the integers are negative?)
Done.
lib/utils/test/src/utils/containers/values.cc
line 10 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Add a testcase where there are duplicate values
Done.
lib/utils/test/src/utils/containers/values.cc
line 14 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Don't rely on the order of
unordered_set
Done.
lib/utils/test/src/utils/containers/values.cc
line 14 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Use
correct
instead ofexpected
for consistency
Done.
lib/utils/test/src/utils/containers/vector_split.cc
line 8 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Check boundary cases (what if
idx
is 0? what ifidx
is 4?)
Done.
lib/utils/test/src/utils/containers/vector_split.cc
line 8 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("Testing vector_split function") {
Done.
lib/utils/test/src/utils/graph/traversal.cc
line 8 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Split into separate
TEST_CASE
s for separate functions (i.e., oneTEST_CASE
forget_unchecked_dfs_ordering
, oneTEST_CASE
forget_bfs_ordering
, oneTEST_CASE
forget_dfs_ordering
, each with potentially multipleSUBCASE
s)
Done.
lib/utils/test/src/utils/graph/traversal.cc
line 34 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Use
result
andcorrect
to improve readability
Done.
lib/utils/test/src/utils/graph/traversal.cc
line 37 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
No check in this
SUBCASE
?
Done.
lib/utils/test/src/utils/graph/digraph/algorithms.cc
line 11 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Break into separate
TEST_CASE
s (one for each function) and add tests for some more error cases (graphs with multiple components, cycles, etc. Try to keep them small, you won't need huge graphs to test these properties as long as you're not trying to test them all at once)
Done.
lib/utils/test/src/utils/graph/digraph/algorithms.cc
line 19 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Sort for readability
Done.
lib/utils/test/src/utils/graph/digraph/algorithms.cc
line 24 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Prefer
correct
overexpected
for consistency
Done.
lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc
line 17 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Prefer
.at
for bounds checking
Done.
lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc
line 35 at r4 (raw file):
SUBCASE("matches_edge") { DirectedEdgeQuery q{{n[0]}, {n[1]}};
Done.
lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc
line 42 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
More verbose, but probably more readable than trying to parse the nested curly braces
Done.
lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc
line 51 at r4 (raw file):
CHECK(result.srcs == expected_srcs); CHECK(result.dsts == expected_dsts);
Done.
lib/utils/test/src/utils/graph/digraph/get_connected_components.cc
line 20 at r4 (raw file):
}; CHECK(get_connected_components(g) == expected_components);
Done.
lib/utils/test/src/utils/graph/digraph/get_topological_ordering.cc
line 13 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Prefer
.at
for bounds checking
Done.
lib/utils/test/src/utils/graph/digraph/get_topological_ordering.cc
line 24 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Comparing the
std::optional
s directly feels a bit weird. Also you probably don't need the contains checks if you dereference the values, as dereferencing astd::optional
without a value already throws an error
Done.
lib/utils/test/src/utils/graph/digraph/get_weakly_connected_components.cc
line 9 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Test case name?
Done.
lib/utils/test/src/utils/graph/digraph/get_weakly_connected_components.cc
line 9 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE(FF_TEST_SUITE) {
Done.
lib/utils/test/src/utils/graph/digraph/get_weakly_connected_components.cc
line 22 at r4 (raw file):
}; CHECK(get_weakly_connected_components(g) == expected_components);
Done.
lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc
line 24 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Prefer consistency where possible, even if it's a bit verbose
Done.
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 23 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why no duplicate edges? The whole point of
MultiDiGraph
is that you can dog.add_edge(n0, n1); g.add_edge(n0, n1);
, and then you end up with twoMultiDiEdge
s having been added.add_edge
should be tested for this.
Done.
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 28 at r4 (raw file):
Node n3 = g.add_node(); std::unordered_set<Node> nodes = g.query_nodes(NodeQuery{{n3}}); CHECK(contains(nodes, n3));
Done.
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 35 at r4 (raw file):
std::unordered_set<MultiDiEdge> edges = g.query_edges(MultiDiEdgeQuery({n2}, {n1})); CHECK(contains(edges, e4));
Done.
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 41 at r4 (raw file):
g.remove_node(n0); std::unordered_set<Node> nodes = g.query_nodes(NodeQuery{{n0}}); CHECK_FALSE(contains(nodes, n0));
Done.
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 47 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
This test isn't correct--this query will only return edges from
n0 -> n0
, so even ifn0
wasn't deleted,edges_after_removal
would still be empty
Done.
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 54 at r4 (raw file):
std::unordered_set<MultiDiEdge> edges = g.query_edges(MultiDiEdgeQuery({n1}, {n2})); CHECK_FALSE(contains(edges, e3));
Done.
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 59 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Also test
matchall
, and queries for nodes that don't exist?
Done.
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 69 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Also test
matchall
, and for nodes that don't exist?
Done.
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc
line 14 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("MultiDiGraph - get_incoming_edges") {
Done.
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc
line 23 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why not the usual
std::vector<DirectedEdge>
?
MultiDiEdge
s cannot be constructed directly, since they are essentially wrappers for an uid (and the graph itself does the bookkeping about which edge has which source and sink).
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc
line 23 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Sort for readability
Done.
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc
line 29 at r4 (raw file):
CHECK(get_incoming_edges(g, n[1]) == std::unordered_set<MultiDiEdge>{edges[0], edges[1], edges[2]}); CHECK(get_incoming_edges(g, n[2]) == std::unordered_set<MultiDiEdge>{});
Done.
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc
line 14 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("MultiDiGraph - get_outgoing_edges") {
Done.
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc
line 22 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Sort for readability
Done.
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc
line 29 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
SUBCASE
s as in `get_incoming_edges
Done.
lib/utils/test/src/utils/graph/views/views.cc
line 48 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Good catch, yeah that seems like a bug
Fixed it, issue had to do with the query_edges function which was including all open edges coming out of the subgraph
lib/utils/test/src/utils/graph/views/views.cc
line 88 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
As in the class isn't implemented or the testcase isn't implemented?
std::unordered_set<Node> JoinedNodeView::query_nodes(NodeQuery const &query)
in views.cc
has not yet been implemented.
lib/utils/test/src/utils/graph/views/views.cc
line 52 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Probably better to just test the corresponding
get_subgraph
method instead rather than testing the views directly (and same for the other views tested here)--the view classes themselves are probably better thought of as internal implementation details. Also,get_graph_data
as introduced in #1471 should simply these tests.It's not quite perfect because it doesn't test the full power of the
query
interface (both this andget_graph_data
just query all, so concrete queries don't get tested), but I don't have a good solution for this yet so what's here is totally fine for now
get_graph_data
I think is only implemented for DataflowGraph
s (I can add it for the other classes as well if needed).
lib/utils/test/src/utils/random_utils.cc
line 6 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why pass
numIterations
instead of just computingsum(values(counts))
?
Done.
lib/utils/test/src/utils/random_utils.cc
line 8 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why not just use
sum(weights)
instead of passingtotalWeight
explicitly?
Done.
lib/utils/test/src/utils/random_utils.cc
line 12 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Prefer a single equality check of the whole map instead of a for loop containing an equality check of each key/value
i think in this case it ends up being a bit clunky to compare the 2 vectors (since we are testing for the values being approximately equal), it would be something like in the code snippet.
In general, why is it preferrable to have a single check rather than multiple ones? To have a more defined point of where the check is being done / more awareness of the entire container rather than just a single value?
Code snippet:
CHECK(all_of(zip(observed_probabilities, expected_probabilities), ...)
lib/utils/test/src/utils/random_utils.cc
line 18 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("select_random") {
Done.
lib/utils/test/src/utils/random_utils.cc
line 24 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Feel free to use
contains
here
Done.
lib/utils/test/src/utils/random_utils.cc
line 29 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
What's going on here? Why should we return 2 if the arguments are invalid? Also, why is this testcase not under the weighted section below?
Test case was very odd and outdated, remodernized it, lmk what you think.
lib/utils/test/src/utils/random_utils.cc
line 38 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Not sure if this code is exactly right, but something like it
Done.
lib/utils/test/src/utils/random_utils.cc
line 42 at r4 (raw file):
for (int i = 0; i < numIterations; i++) { int selected = select_random(values, weights); counts[selected - 1]++;
Done.
lib/utils/test/src/utils/random_utils.cc
line 57 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Pull out into a separate
sample
lambda?
See above.
lib/utils/test/src/utils/stack_string.cc
line 85 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why was this removed?
re-added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 20 of 131 files reviewed, 105 unresolved discussions (waiting on @lockshaw)
lib/utils/include/utils/graph/undirected/undirected_edge.struct.toml
line 1 at r6 (raw file):
namespace = "FlexFlow"
Changed undirected edge to use commutative pair and dtgen, if you see some rapidcheck adjacent changes it's because of that.
lib/utils/src/utils/graph/instances/hashmap_undirected_graph.cc
line 41 at r5 (raw file):
} void HashmapUndirectedGraph::remove_edge(UndirectedEdge const &e) {
BUG? Intuitively this should be symmetric, something like:
Code snippet:
std::unordered_set<Node> &max_map = this->adjacency.at(e.endpoints.max());
max_map.erase(e.endpoints.min());
std::unordered_set<Node> &min_map = this->adjacency.at(e.endpoints.max());
min_map.erase(e.endpoints.max());
lib/utils/src/utils/graph/instances/hashmap_undirected_graph.cc
line 53 at r6 (raw file):
keys(query_keys(query.nodes, this->adjacency)); for (auto const &src_kv : query_keys(query.nodes, this->adjacency)) { for (auto const &dst : src_kv.second) {
Fix for the bug in where getting the subgraph view of an undirected graph would result in the edges crossing the cut from inside subgraph to outside subgraph being included in the subgraph.
lib/utils/src/utils/graph/views/views.cc
line 263 at r6 (raw file):
DirectedEdgeQuery const &q) const { std::unordered_set<UndirectedEdge> undirected_edges = set_union(g.query_edges(UndirectedEdgeQuery{q.srcs}),
I think this should be set_union instead of intersection right? After changing the query_edges in HashmapUndirectedGraph
another test using as_digraph
was failing, I'd assume it's due to this.
lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc
line 67 at r5 (raw file):
CHECK(matches_edge(result, DirectedEdge{n.at(1), n.at(3)})); CHECK(matches_edge(result, DirectedEdge{n.at(2), n.at(4)})); // CHECK_FALSE(matches_edge(result, DirectedEdge{n.at(2), n.at(3)}));
Do they match all edges, even if
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 20 of 131 files reviewed, 105 unresolved discussions (waiting on @lockshaw)
lib/utils/src/utils/graph/instances/hashmap_undirected_graph.cc
line 53 at r6 (raw file):
Previously, Marsella8 wrote…
Fix for the bug in where getting the subgraph view of an undirected graph would result in the edges crossing the cut from inside subgraph to outside subgraph being included in the subgraph.
(unsure of correctness, but should make sense)
lib/utils/src/utils/graph/views/views.cc
line 263 at r6 (raw file):
Previously, Marsella8 wrote…
I think this should be set_union instead of intersection right? After changing the query_edges in
HashmapUndirectedGraph
another test usingas_digraph
was failing, I'd assume it's due to this.
(very unsure about this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 107 of 111 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status: 128 of 131 files reviewed, 87 unresolved discussions (waiting on @Marsella8)
lib/substitutions/src/substitutions/unlabelled/multidigraph_pattern_match.cc
line 3 at r6 (raw file):
#include "substitutions/unlabelled/multidigraph_pattern_match.h" // #include "substitutions/unlabelled/edge_splits.h" // #include "substitutions/unlabelled/pattern_edge.h"
Delete (yes I know you didn't add these, but may as well remove them now)
Code quote:
// #include "substitutions/unlabelled/edge_splits.h"
// #include "substitutions/unlabelled/pattern_edge.h"
lib/utils/include/utils/join_strings.h
line 27 at r6 (raw file):
} return oss.str(); }
Suggestion:
template <typename InputIt, typename F>
std::string join_strings(InputIt first,
InputIt last,
std::string const &delimiter,
F const &f) {
std::ostringstream oss;
bool first_iter = true;
for (; first != last; first++) {
if (!first_iter) {
oss << delimiter;
}
oss << f(*first);
first_iter = false;
}
return oss.str();
}
lib/utils/include/utils/bidict/algorithms/merge_bidicts.h
line 16 at r6 (raw file):
if (!are_disjoint(left_entries(lhs), left_entries(rhs))) { throw mk_runtime_error( "Left entries of merge_bidicts parameters are non-disjoint");
Including debugging info in exception messages where possible is nice
Suggestion:
if (!are_disjoint(left_entries(lhs), left_entries(rhs))) {
throw mk_runtime_error(
fmt::format("Left entries of merge_bidicts parameters are non-disjoint: lhs = {}, rhs = {}", lhs, rhs));
lib/utils/include/utils/bidict/algorithms/merge_bidicts.h
line 20 at r6 (raw file):
if (!are_disjoint(right_entries(lhs), right_entries(rhs))) { throw mk_runtime_error( "Right entries of merge_bidicts parameters are non-disjoint");
Suggestion:
throw mk_runtime_error(
fmt::format("Right entries of merge_bidicts parameters are non-disjoint: lhs = {}, rhs = {}", lhs, rhs));
lib/utils/include/utils/containers/are_all_same.h
line 10 at r6 (raw file):
template <typename C> bool are_all_same(C const &c) { if (c.size() == 0) {
Suggestion:
if (c.empty()) {
lib/utils/include/utils/containers/are_all_same.h
line 11 at r6 (raw file):
bool are_all_same(C const &c) { if (c.size() == 0) { throw mk_runtime_error("input to are_all_same must be non-empty container");
Suggestion:
throw mk_runtime_error(fmt::format("are_all_same expected non-empty container but receieved {}", c));
lib/utils/include/utils/containers/index_of.h
line 6 at r6 (raw file):
#include <algorithm> #include <optional> namespace FlexFlow {
Suggestion:
#include <algorithm>
#include <optional>
namespace FlexFlow {
lib/utils/include/utils/containers/index_of.h
line 10 at r6 (raw file):
/** * @details If multiple `e` are present within the container, the function *returns the index of the first appearance
Suggestion:
* @details If multiple `e` are present within the container, the function
* returns the index of the first appearance
lib/utils/include/utils/containers/is_submap.h
line 14 at r6 (raw file):
std::unordered_map<K, V> const &sub) { return restrict_keys(m, keys(sub)) == sub; }
For consistency with is_subseteq_of
Suggestion:
bool is_submapeq_of(
std::unordered_map<K, V> const &sub,
std::unordered_map<K, V> const &m) {
return restrict_keys(m, keys(sub)) == sub;
}
lib/utils/include/utils/containers/map_keys.h
line 13 at r6 (raw file):
* returns the updated map. * * @details Note that if multiple original keys are transformed to the same new
Is this a good behavior or should this throw an error? I'm not sure this is a great property to have, and I don't think we're relying on it anywhere?
lib/utils/include/utils/containers/map_keys.h
line 14 at r6 (raw file):
* * @details Note that if multiple original keys are transformed to the same new * key, which value will be associated to the new key is undefined.
Suggestion:
* @note If multiple original keys are transformed to the same new
* key, which value will be associated to the new key is undefined.
lib/utils/include/utils/containers/maximum.h
line 14 at r6 (raw file):
throw mk_runtime_error( "input to function maximum must be non-empty container"); }
Suggestion:
if (c.empty()) {
throw mk_runtime_error(
fmt::format("maximum expected non-empty container but received {}", c));
}
lib/utils/include/utils/containers/merge_maps.h
line 18 at r6 (raw file):
throw mk_runtime_error( "Key sets of merge_maps parameters are non-disjoint"); }
Suggestion:
if (!are_disjoint(keys(lhs), keys(rhs))) {
throw mk_runtime_error(
fmt::format("Key sets of merge_maps parameters are non-disjoint: lhs = {}, rhs = {}", lhs, rhs));
}
lib/utils/include/utils/containers/optional_all_of.h
line 9 at r6 (raw file):
template <typename Container, typename Function> std::optional<bool> optional_all_of(Container const &container,
Are we using this anywhere? It's kinda a weird function and I wouldn't object to removing it if we're not using it
lib/utils/include/utils/containers/product.h
line 9 at r6 (raw file):
/** * @details An empty container vacuously has product 1
We don't really have a convention set for doxygen yet, but might be good to use @note
for documenting edge cases like this? Or idk, maybe we should only use @note
for really weird/bug-prone edge cases? Thoughts?
Suggestion:
* @note An empty container vacuously has product 1
lib/utils/include/utils/containers/product.h
line 32 at r6 (raw file):
typename ConditionF, typename Element = typename Container::value_type> Element product_where(Container const &container, ConditionF const &condition) {
Move to separate utils/containers/product_where.h
lib/utils/include/utils/containers/subvec.h
line 19 at r6 (raw file):
typename std::vector<T>::iterator::difference_type { if (idx < 0) { return std::max(0, (int)v.size() + idx);
Avoid c-style casts
Suggestion:
static_cast<int>(v.size())
lib/utils/include/utils/containers/subvec.h
line 19 at r6 (raw file):
typename std::vector<T>::iterator::difference_type { if (idx < 0) { return std::max(0, (int)v.size() + idx);
Why std::max(0, ...)
?
lib/utils/include/utils/containers/sum.h
line 21 at r6 (raw file):
typename ConditionF, typename Element = typename Container::value_type> Element sum_where(Container const &container, ConditionF const &condition) {
Move to separate utils/containers/sum_where.h
lib/utils/include/utils/containers/value_all.h
line 15 at r6 (raw file):
return unwrap(element, [] { throw mk_runtime_error( "Encountered element without value in call to value_all");
Suggestion:
fmt::format("value_all expected all elements to have values, but received {}", v));
lib/utils/include/utils/containers/value_all.h
line 25 at r6 (raw file):
return unwrap(element, [] { throw mk_runtime_error( "Encountered element without value in call to value_all");
Suggestion:
fmt::format("value_all expected all elements to have values, but received {}", v));
lib/utils/include/utils/containers/vector_split.h
line 15 at r6 (raw file):
int idx) { if (idx < 0 || idx > static_cast<int>(v.size())) { throw std::out_of_range("Index out of range in vector_split");
Suggestion:
throw std::out_of_range(fmt::format("Index out of range in vector_split: index = {}, vector = {}", idx, v));
lib/utils/include/utils/fmt/optional.h
line 67 at r6 (raw file):
}; template <>
FYI once 1490 merges this will have to be moved to test/utils/doctest/fmt/optional.h
(not changes needed now)
lib/utils/include/utils/graph/README.md
line 21 at r6 (raw file):
- `DiGraph`: at most one edge allowed between every ordered pair of nodes, edges are directed (i.e., have a source node and a destination node) - `MultiDiGraph`: arbitrary numbers of directed edges allowed between every pair of nodes. - `DataflowGraph`: similar to `MultiDiGraph`, but the edges entering, exiting a given nodes now have a well-defined order. Due to the interface used to construct them (where essentially a node can only be added to the graph after all of its predecessor nodes have been added) `DataflowGraph`s are directed acyclic graphs. Each node has an associated ordered sequence of inputs and outputs, with the restriction that one and only one edge can enter an individual input.
Suggestion:
similar to `MultiDiGraph` but with the following differences: ...
lib/utils/include/utils/graph/README.md
line 135 at r6 (raw file):
- `OpenLabelledDataflowGraph` allows for labelling of `Node`s and `OpenDataflowValue`s, which is a variant describing both `DataflowOutput`s and `DataflowGraphInput`s, which represent the open inputs to the graph (i.e. the inputs for which their corresponding output is not present in the graph). While the interfaces of these graphs differ slightly from the core graph variants, they still have the corresponding `add_node`/`add_edge` methods, and `query_nodes`/`query_edges` methods.
DataflowGraph
s don't have add_edge
Code quote:
`add_edge` m
lib/utils/include/utils/graph/digraph/algorithms/get_dominators.h
line 10 at r6 (raw file):
/** * @brief A node `d` is said to dominate a node `n` if every path from the root * node to `n` must go through `d`.
Just link to https://en.wikipedia.org/wiki/Dominator_(graph_theory) as it gives a better explanation than any docstring we write will
Code quote:
* @brief A node `d` is said to dominate a node `n` if every path from the root
* node to `n` must go through `d`.
lib/utils/include/utils/graph/digraph/algorithms/get_dominators.h
line 19 at r6 (raw file):
/** * @brief Returns the intersection of the dominators of the input nodes.
Conceptually I think this is equivalent to merging this set of nodes and then finding the set of dominators of the new merged node? Confirm this is correct and, if so, add to the explanation here
Suggestion:
of the given set of nodes.
lib/utils/src/utils/graph/instances/hashmap_undirected_graph.cc
line 41 at r5 (raw file):
Previously, Marsella8 wrote…
BUG? Intuitively this should be symmetric, something like:
Yeah I think you're right, though FYI your suggested code has a typo (both use .max()
) and in your code assigning references of max_map
and min_map
probably isn't necessary
lib/utils/src/utils/graph/instances/hashmap_undirected_graph.cc
line 53 at r6 (raw file):
Previously, Marsella8 wrote…
(unsure of correctness, but should make sense)
I believe the indented semantics of UndirectedGraph::query_edges
is all edges connected to the a node in the input set, not all edges only connected to the nodes in the input set--maybe this is a fix that should be added in the subgraph code?
lib/utils/src/utils/graph/views/views.cc
line 263 at r6 (raw file):
Previously, Marsella8 wrote…
(very unsure about this)
I don't think so, see https://reviewable.io/reviews/flexflow/FlexFlow/1464#-O6aRRAsCc8-SUiQ2l9V
lib/utils/test/src/utils/join_strings.cc
line 10 at r6 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("join_strings") {
Probably would be good to check the edge case that v
is empty
lib/utils/test/src/utils/join_strings.cc
line 24 at r6 (raw file):
auto add_exclamation = [](std::string const &str) { return str + "!"; }; CHECK(join_strings(v, " ", add_exclamation) == "Hello! world! !!"); }
Suggestion:
SUBCASE("join_strings with transforming function") {
auto add_exclamation = [](std::string const &str) { return str + "!"; };
std::string result = join_strings(v, " ", add_exclamation);
std::string correct = "Hello! world! !!";
CHECK(result == correct);
}
lib/utils/test/src/utils/bidict/bidict.cc
line 19 at r6 (raw file):
SUBCASE("bidict::contains_r") { CHECK(dict.contains_r(std::string("one")));
Why is explicit std::string(...)
needed here?
lib/utils/test/src/utils/bidict/bidict.cc
line 31 at r6 (raw file):
CHECK(bd.contains_r(3)); CHECK_FALSE(bd.contains_r(1)); }
Suggestion:
SUBCASE("bidict::contains_l") {
SUBCASE("L type is not the same as R type") {
CHECK(dict.contains_l(1));
CHECK_FALSE(dict.contains_l(3));
}
SUBCASE("L type is the same as R type") {
bidict<int, int> bd;
bd.equate(1, 3);
CHECK(bd.contains_l(1));
CHECK_FALSE(bd.contains_l(3));
}
}
SUBCASE("bidict::contains_r") {
SUBCASE("L type is not the same as R type") {
CHECK(dict.contains_r(std::string("one")));
CHECK_FALSE(dict.contains_r(std::string("three")));
}
SUBCASE("L type is the same as R type") {
bidict<int, int> bd;
bd.equate(1, 3);
CHECK(bd.contains_r(3));
CHECK_FALSE(bd.contains_r(1));
}
}
lib/utils/test/src/utils/bidict/algorithms/merge_bidicts.cc
line 8 at r6 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("merge_bidicts") {
What happens if they have overlapping keys but all of the kv pairs are the same, i.e., merging {{1, "one"}, {2, "two"}}
and {{2, "two"}, {3, "three"}}
?
lib/utils/test/src/utils/bidict/algorithms/merge_bidicts.cc
line 25 at r6 (raw file):
bidict<int, std::string> bd2 = {{2, "three"}, {3, "four"}}; CHECK_THROWS_AS(merge_bidicts(bd1, bd2), std::runtime_error);
Suggestion:
CHECK_THROWS(merge_bidicts(bd1, bd2));
lib/utils/test/src/utils/bidict/algorithms/merge_bidicts.cc
line 32 at r6 (raw file):
bidict<int, std::string> bd2 = {{3, "two"}, {4, "four"}}; CHECK_THROWS_AS(merge_bidicts(bd1, bd2), std::runtime_error);
Suggestion:
CHECK_THROWS(merge_bidicts(bd1, bd2));
lib/utils/test/src/utils/containers/are_all_same.cc
line 21 at r6 (raw file):
SUBCASE("Empty Container") { std::vector<int> input = {}; CHECK_THROWS_AS(are_all_same(input), std::runtime_error);
Suggestion:
CHECK_THROWS(are_all_same(input));
lib/utils/test/src/utils/containers/as_vector.cc
line 15 at r6 (raw file):
std::vector<int> result = as_vector(input); std::vector<int> correct_sorted = {1, 2, 3}; CHECK(sorted(result) == correct_sorted);
Suggestion:
std::vector<int> correct = {1, 2, 3};
CHECK(sorted(result) == sorted(correct));
lib/utils/test/src/utils/containers/enumerate.cc
line 55 at r6 (raw file):
CHECK(keys(result) == correct_keys); CHECK(unordered_multiset_of(values(result)) == correct_values);
If you're already doing unordered comparison of the keys and values, why not just compare std::unordered_multiset<std::pair<...>>
rather than two separate comparisons?
Code quote:
std::unordered_set<int> correct_keys = {0, 1, 2, 3};
std::unordered_multiset<std::string> correct_values = {
"zero", "one", "two", "three"};
std::map<int, std::string> result = enumerate(input);
CHECK(keys(result) == correct_keys);
CHECK(unordered_multiset_of(values(result)) == correct_values);
lib/utils/test/src/utils/containers/find.cc
line 13 at r6 (raw file):
TEST_CASE("find") { SUBCASE("vector") {
Ideally also add a test where the desired element occurs multiple times in the list
lib/utils/test/src/utils/containers/find.cc
line 15 at r6 (raw file):
SUBCASE("vector") { std::vector<int> v = {1, 2, 3, 4, 5}; CHECK_WITHOUT_STRINGIFY(find(v, 3) == std::find(v.begin(), v.end(), 3));
Add subcase names for these, i.e., SUBCASE("element is in vector")
, SUBCASE("element is not vector")
lib/utils/test/src/utils/containers/flatmap.cc
line 9 at r6 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("Test for flatmap function on vectors") {
Add a check for type changing (e.g., instead of get_factors
having type int -> std::vector<int>
, you'd have int -> std::vector<string>
)
lib/utils/test/src/utils/containers/index_of.cc
line 13 at r6 (raw file):
SUBCASE("unique elements") { std::vector<int> v = {1, 2, 3, 4, 5}; CHECK(index_of(v, 3).value() == 2);
SUBCASE("element is present in vector")
, SUBCASE("element is not present in vector")
lib/utils/test/src/utils/containers/map_keys.cc
line 10 at r6 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("map_keys") {
Add check for duplicate keys (even if the value is undefined, check whether or not it throws an error, or check that it's one of the allowed return values, etc.)
lib/utils/test/src/utils/containers/maximum.cc
line 22 at r6 (raw file):
std::vector<int> input = {}; CHECK_THROWS_AS(maximum(input), std::runtime_error);
Suggestion:
CHECK_THROWS(maximum(input));
lib/utils/test/src/utils/containers/merge_maps.cc
line 27 at r6 (raw file):
std::unordered_map<int, std::string> rhs = {{2, "three"}, {3, "four"}}; CHECK_THROWS_AS(merge_maps(lhs, rhs), std::runtime_error);
Suggestion:
CHECK_THROWS(merge_maps(lhs, rhs));
lib/utils/test/src/utils/containers/product.cc
line 8 at r6 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("product") {
Probably a good idea where not too inconvenient to test the various overloads of these functions, either using TEST_CASE_TEMPLATE
or using multiple TEST_CASE
s (e.g., TEST_CASE("product(std::vector<int>)") { ... }
, TEST_CASE("product(std::unordered_set<int>)") { ... }
, etc.).
Obviously weigh implementation difficulty, how many other tasks you have going on, etc. This is not super-high priority, so if you have more important tasks you could also be working on feel free to not worry about this--my general heuristic with testing code is that if it's an improvement over the current situation and doesn't actively violate a bunch of code style, etc. I'll merge it. I'll probably still leave comments pointing it out on PRs, but in those cases feel free to just mention in the response that you'd prefer not to do those fixes as part of the current PR--this is totally fine, and often a good way to keep PRs from getting too large
lib/utils/test/src/utils/containers/product.cc
line 24 at r6 (raw file):
} TEST_CASE("product_where") {
Move to separate product_where.cc
lib/utils/test/src/utils/containers/reversed.cc
line 9 at r6 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("Testing the 'reversed' function") {
Suggestion:
TEST_CASE("reversed(std::vector<int>)") {
lib/utils/test/src/utils/containers/sorted_by.cc
line 10 at r6 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("sorted_by") {
Might be good to test what happens if there are duplicates in the list?
lib/utils/test/src/utils/containers/sum.cc
line 24 at r6 (raw file):
} TEST_CASE("sum_where") {
Move to separate sum_where.cc
lib/utils/test/src/utils/containers/value_all.cc
line 14 at r6 (raw file):
SUBCASE("With nullopt") { std::vector<std::optional<int>> input = {1, 2, std::nullopt, 4, 5}; CHECK_THROWS_AS(value_all(input), std::runtime_error);
I haven't really put much though into what error types things throw, so for things that throw mk_runtime_error
it's probably better to just check that they throw for now
Suggestion:
CHECK_THROWS(value_all(input));
lib/utils/test/src/utils/containers/vector_split.cc
line 25 at r6 (raw file):
} SUBCASE("Boundary case: idx = 5") {
Suggestion:
SUBCASE("Boundary case: idx is last index in list") {
lib/utils/test/src/utils/containers/vector_split.cc
line 35 at r6 (raw file):
} SUBCASE("Out of bounds case: idx = 6") {
Suggestion:
SUBCASE("Out of bounds case: idx == list size") {
lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc
line 35 at r4 (raw file):
Previously, Marsella8 wrote…
Done.
Just FYI if you leave a trailing comma the formatter will insert a linebreak. Not saying you need to do this here, just making sure you're aware.
For example,
std::pair<int, int> x = {
3,
5
};
->
std::pair<int, int> x = {3, 5};
while
std::pair<int, int> x = { 3, 5, };
->
std::pair<int, int> x = {
3,
5,
};
lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc
line 67 at r5 (raw file):
Previously, Marsella8 wrote…
Do they match all edges, even if
Probably simpler to just check result
equals DirectedEdgeQuery correct = DirectedEdgeQuery{query_set{n.at(1), n.at(2)}, query_set{n.at(3), n.at(4)}};
lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc
line 23 at r6 (raw file):
SUBCASE("single node") { Node node = n[2];
Suggestion:
Node node = n.at(2);
lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc
line 24 at r6 (raw file):
SUBCASE("single node") { Node node = n[2]; std::unordered_set<Node> correct = {n[0], n[2]};
Suggestion:
std::unordered_set<Node> correct = {n.at(0), n.at(2)};
lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc
line 30 at r6 (raw file):
SUBCASE("multiple nodes") { std::unordered_set<Node> nodes = {n[1], n[3]};
Suggestion:
std::unordered_set<Node> nodes = {n.at(1), n.at(3)};
lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc
line 32 at r6 (raw file):
std::unordered_set<Node> nodes = {n[1], n[3]}; std::unordered_set<Node> result = get_dominators(g, nodes); std::unordered_set<Node> correct = {n[0]};
Suggestion:
std::unordered_set<Node> correct = {n.at(0)};
lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc
line 61 at r6 (raw file):
correct = {n.at(0), n.at(1), n.at(3)}; CHECK(result == correct);
You're missing a check (also separate subcases please, at the very least so you have to justify why you want to do two checks here instead of one)
Code quote:
std::unordered_set<Node> result = get_dominators(g, n.at(1));
std::unordered_set<Node> correct = {n.at(0), n.at(1)};
result = get_dominators(g, n.at(3));
correct = {n.at(0), n.at(1), n.at(3)};
CHECK(result == correct);
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 34 at r6 (raw file):
} SUBCASE("add_edge") {
Add a subcase in here to test adding a duplicate edge
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 42 at r6 (raw file):
} SUBCASE("add_edges") {
Why does subcase this exist?
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 60 at r6 (raw file):
std::unordered_set<MultiDiEdge> edge_result = g.query_edges(MultiDiEdgeQuery({n0}, {n0}));
This is only edges from n0 to n0, which I don't think is what you want?
Code quote:
std::unordered_set<MultiDiEdge> edge_result =
g.query_edges(MultiDiEdgeQuery({n0}, {n0}));
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 73 at r6 (raw file):
} SUBCASE("remove_edges") {
What is the point of this SUBCASE
?
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc
line 23 at r4 (raw file):
Previously, Marsella8 wrote…
MultiDiEdge
s cannot be constructed directly, since they are essentially wrappers for an uid (and the graph itself does the bookkeping about which edge has which source and sink).
Yes, but can't you add DirectedEdge
and then get back a list of MultiDiEdge
? DirectedEdge
is just more readable and specific than std::pair<Node, Node>
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc
line 27 at r6 (raw file):
std::vector<MultiDiEdge> edges = add_edges(g, input); SUBCASE("node has incoming edges") {
Suggestion:
SUBCASE("node has outgoing edges") {
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc
line 34 at r6 (raw file):
} SUBCASE("node has no incoming edges") {
Suggestion:
SUBCASE("node has no outgoing edges") {
lib/utils/test/src/utils/graph/views/views.cc
line 48 at r1 (raw file):
Previously, Marsella8 wrote…
Fixed it, issue had to do with the query_edges function which was including all open edges coming out of the subgraph
I don't think your current fix is correct, see my comments on your modifications to HashmapUndirectedGraph
lib/utils/test/src/utils/graph/views/views.cc
line 88 at r1 (raw file):
Previously, Marsella8 wrote…
std::unordered_set<Node> JoinedNodeView::query_nodes(NodeQuery const &query)
inviews.cc
has not yet been implemented.
Interesting, that should be trivial to implement, can you add it?
lib/utils/test/src/utils/graph/views/views.cc
line 52 at r4 (raw file):
Previously, Marsella8 wrote…
get_graph_data
I think is only implemented forDataflowGraph
s (I can add it for the other classes as well if needed).
Adding it for other classes would be fine
lib/utils/test/src/utils/containers.cc
line 140 at r4 (raw file):
Previously, Marsella8 wrote…
ended up putting a throw for an empty container (i think for other functions it makes sense to vacuously define how it behaves for an empty container (e.g. sum, optional_all_of), but for
are_all_same
I can't really think of a situation where somebody would want to pass an empty container. If you want to be pedantic, you could argue that you can define is_all_same recursively and the empty container would have to evaluate to true, but I feel like in practice in doesn't make much sense.
I actually think are_all_same
should be return true if passed an empty list--like, as a user of this function, based on the name I would expect it to behave like (\forall x, y \in L . x == y
). If you want to add an are_all_same1
function that throws on empty list I don't think I would object (though I'm not convinced it's necessary), but I think having are_all_same
fail on an empty input is rather unintuitive.
Note that I think in one of the PRs (I think it was associated with concat, so probably either inceptionv3 or candle-uno) I added a function that is similar but returns the value if they are all the same, which throws an error if the list is empty (as makes sense), so there is support for this case as well
lib/utils/include/utils/containers/get_one_of.h
line 9 at r6 (raw file):
template <typename T> T get_one_of(std::unordered_set<T> const &s) {
Add check that s
is not empty
lib/utils/test/src/utils/disjoint_set.cc
line 27 at r6 (raw file):
element = generate_element<T>(2); CHECK_WITHOUT_STRINGIFY(ds.find(element) == element);
Why no stringify? optional
can be fmt
ed, and so can std::string
and int
lib/utils/test/src/utils/random_utils.cc
line 12 at r4 (raw file):
Previously, Marsella8 wrote…
i think in this case it ends up being a bit clunky to compare the 2 vectors (since we are testing for the values being approximately equal), it would be something like in the code snippet.
In general, why is it preferrable to have a single check rather than multiple ones? To have a more defined point of where the check is being done / more awareness of the entire container rather than just a single value?
The second case--CHECK(a == b)
prints a
and b
in the error message if it fails, so if you have for (int i = 0; ...) { CHECK(a[i] == b[i]); }
then you only get the diverging elements rather than getting to see all of a
and b
which is often helpful
Also, simplicity--just having one check is usually more readable.
That said, all of these are not hard rules--if due to doctest's interface making it a single check would be too inconvenient, then we can do the looped version, I just prefer to avoid it where possible.
lib/utils/test/src/utils/stack_map.cc
line 49 at r6 (raw file):
std::vector<std::pair<int, int>> expected = {{1, 10}, {2, 20}, {3, 30}}; std::vector<std::pair<int, int>> actual = map; CHECK_WITHOUT_STRINGIFY(actual == expected);
Why? std::vector<std::pair<int, int>>
should 100% be fmt
able
lib/utils/test/src/utils/stack_string.cc
line 52 at r6 (raw file):
StackString str3{"abc"}; CHECK_WITHOUT_STRINGIFY(str1 == str1);
stack_string
should be fmt
able, why no stringify?
lib/utils/test/src/utils/stack_vector.cc
line 10 at r6 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE_TEMPLATE("PushBack", T, int, double, char) {
And same for other test cases in this file
Suggestion:
TEST_CASE_TEMPLATE("stack_vector<T, MAXSIZE>::push_back", T, int, double, char) {
lib/utils/test/src/utils/stack_vector.cc
line 67 at r6 (raw file):
vector2.push_back(20); CHECK_WITHOUT_STRINGIFY(vector1 == vector2);
Why no stringify?
lib/utils/test/src/utils/tuple.cc
line 46 at r6 (raw file):
auto result = tuple_prepend(value, t1); std::tuple<int, float, double> expected(42, 3.14f, 2.71828); CHECK(tuple_compare(result, expected));
Suggestion:
std::tuple<int, float, double> result = tuple_prepend(value, t1);
std::tuple<int, float, double> correct = {42, 3.14f, 2.71828};
CHECK(result == correct);
lib/utils/test/src/utils/type_index.cc
line 22 at r6 (raw file):
} TEST_CASE("matches function") {
Suggestion:
TEST_CASE("matches<T>(std::type_index)") {
lib/utils/test/src/utils/containers/get_one_of.cc
line 11 at r6 (raw file):
std::unordered_set<int> s = {1, 2, 3}; CHECK(contains(s, get_one_of(s))); }
What happens if s
is empty?
lib/utils/test/src/utils/containers/is_subset_eq_of.cc
line 0 at r6 (raw file):
Rename to is_subseteq_of.cc
to match the corresponding .h
file (i.e., utils/containers/is_subseteq_of.h
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 72 of 148 files reviewed, 83 unresolved discussions (waiting on @lockshaw)
lib/substitutions/src/substitutions/unlabelled/multidigraph_pattern_match.cc
line 3 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Delete (yes I know you didn't add these, but may as well remove them now)
Done.
lib/utils/include/utils/join_strings.h
line 27 at r6 (raw file):
} return oss.str(); }
Done.
lib/utils/include/utils/bidict/algorithms/merge_bidicts.h
line 16 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Including debugging info in exception messages where possible is nice
see are_all_same.h comment
lib/utils/include/utils/bidict/algorithms/merge_bidicts.h
line 20 at r6 (raw file):
if (!are_disjoint(right_entries(lhs), right_entries(rhs))) { throw mk_runtime_error( "Right entries of merge_bidicts parameters are non-disjoint");
see are_all_same.h comment
lib/utils/include/utils/containers/are_all_same.h
line 10 at r6 (raw file):
template <typename C> bool are_all_same(C const &c) { if (c.size() == 0) {
Done.
lib/utils/include/utils/containers/are_all_same.h
line 11 at r6 (raw file):
bool are_all_same(C const &c) { if (c.size() == 0) { throw mk_runtime_error("input to are_all_same must be non-empty container");
This requires to have all formatting for all possible containers that might be passed, probably not worth it.
lib/utils/include/utils/containers/index_of.h
line 6 at r6 (raw file):
#include <algorithm> #include <optional> namespace FlexFlow {
Done.
lib/utils/include/utils/containers/index_of.h
line 10 at r6 (raw file):
/** * @details If multiple `e` are present within the container, the function *returns the index of the first appearance
Done.
lib/utils/include/utils/containers/map_keys.h
line 13 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Is this a good behavior or should this throw an error? I'm not sure this is a great property to have, and I don't think we're relying on it anywhere?
Updated
lib/utils/include/utils/containers/map_keys.h
line 14 at r6 (raw file):
* * @details Note that if multiple original keys are transformed to the same new * key, which value will be associated to the new key is undefined.
Done.
lib/utils/include/utils/containers/maximum.h
line 14 at r6 (raw file):
throw mk_runtime_error( "input to function maximum must be non-empty container"); }
See are_all_same
comment
lib/utils/include/utils/containers/merge_maps.h
line 18 at r6 (raw file):
throw mk_runtime_error( "Key sets of merge_maps parameters are non-disjoint"); }
Done.
lib/utils/include/utils/containers/product.h
line 9 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
We don't really have a convention set for doxygen yet, but might be good to use
@note
for documenting edge cases like this? Or idk, maybe we should only use@note
for really weird/bug-prone edge cases? Thoughts?
I'd say as a loose standard:
- brief for one line descriptions
- details for more comprehensive descriptions and specific behavior (e.g. how
get_distance_from_root
behaves if there are multiple roots). - note for unlikely edge cases / more unimportant properties which are nice to know (e.g. product is vacuously one)
- example for small examples if the function is complex.
Obviously 2 and 3 are not conceptually well separated.
lib/utils/include/utils/containers/product.h
line 32 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Move to separate
utils/containers/product_where.h
Done.
lib/utils/include/utils/containers/subvec.h
line 19 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Avoid c-style casts
Done.
lib/utils/include/utils/containers/subvec.h
line 19 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why
std::max(0, ...)
?
modelling it after python's splits/slices, so if you have {1,2,3,4} and do [0, 40] you get {1,2,3,4} (before it was just reading junk, I can also change it to throw errors if it's out of bounds. Ditto for std::min(idx, (int)v.size());
lib/utils/include/utils/containers/sum.h
line 21 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Move to separate
utils/containers/sum_where.h
Done.
lib/utils/include/utils/containers/value_all.h
line 15 at r6 (raw file):
return unwrap(element, [] { throw mk_runtime_error( "Encountered element without value in call to value_all");
see are_all_same.h
comment
lib/utils/include/utils/containers/value_all.h
line 25 at r6 (raw file):
return unwrap(element, [] { throw mk_runtime_error( "Encountered element without value in call to value_all");
Done.
lib/utils/include/utils/containers/vector_split.h
line 15 at r6 (raw file):
int idx) { if (idx < 0 || idx > static_cast<int>(v.size())) { throw std::out_of_range("Index out of range in vector_split");
Done.
lib/utils/include/utils/graph/README.md
line 21 at r6 (raw file):
- `DiGraph`: at most one edge allowed between every ordered pair of nodes, edges are directed (i.e., have a source node and a destination node) - `MultiDiGraph`: arbitrary numbers of directed edges allowed between every pair of nodes. - `DataflowGraph`: similar to `MultiDiGraph`, but the edges entering, exiting a given nodes now have a well-defined order. Due to the interface used to construct them (where essentially a node can only be added to the graph after all of its predecessor nodes have been added) `DataflowGraph`s are directed acyclic graphs. Each node has an associated ordered sequence of inputs and outputs, with the restriction that one and only one edge can enter an individual input.
Done.
lib/utils/include/utils/graph/README.md
line 135 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
DataflowGraph
s don't haveadd_edge
Done.
lib/utils/include/utils/graph/digraph/algorithms/get_dominators.h
line 10 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Just link to https://en.wikipedia.org/wiki/Dominator_(graph_theory) as it gives a better explanation than any docstring we write will
Done.
lib/utils/include/utils/graph/digraph/algorithms/get_dominators.h
line 19 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Conceptually I think this is equivalent to merging this set of nodes and then finding the set of dominators of the new merged node? Confirm this is correct and, if so, add to the explanation here
yes correct
lib/utils/src/utils/graph/instances/hashmap_undirected_graph.cc
line 53 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
I believe the indented semantics of
UndirectedGraph::query_edges
is all edges connected to the a node in the input set, not all edges only connected to the nodes in the input set--maybe this is a fix that should be added in the subgraph code?
got it (see the associated fix in std::unordered_set<UndirectedEdge> UndirectedSubgraphView::query_edges(
)
lib/utils/test/src/utils/join_strings.cc
line 10 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Probably would be good to check the edge case that
v
is empty
Done.
lib/utils/test/src/utils/join_strings.cc
line 24 at r6 (raw file):
auto add_exclamation = [](std::string const &str) { return str + "!"; }; CHECK(join_strings(v, " ", add_exclamation) == "Hello! world! !!"); }
Done.
lib/utils/test/src/utils/bidict/bidict.cc
line 19 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why is explicit
std::string(...)
needed here?
Done.
lib/utils/test/src/utils/bidict/bidict.cc
line 31 at r6 (raw file):
CHECK(bd.contains_r(3)); CHECK_FALSE(bd.contains_r(1)); }
Done.
lib/utils/test/src/utils/bidict/algorithms/merge_bidicts.cc
line 8 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
What happens if they have overlapping keys but all of the kv pairs are the same, i.e., merging
{{1, "one"}, {2, "two"}}
and{{2, "two"}, {3, "three"}}
?
Done.
lib/utils/test/src/utils/bidict/algorithms/merge_bidicts.cc
line 25 at r6 (raw file):
bidict<int, std::string> bd2 = {{2, "three"}, {3, "four"}}; CHECK_THROWS_AS(merge_bidicts(bd1, bd2), std::runtime_error);
Done.
lib/utils/test/src/utils/bidict/algorithms/merge_bidicts.cc
line 32 at r6 (raw file):
bidict<int, std::string> bd2 = {{3, "two"}, {4, "four"}}; CHECK_THROWS_AS(merge_bidicts(bd1, bd2), std::runtime_error);
Done.
lib/utils/test/src/utils/containers/are_all_same.cc
line 21 at r6 (raw file):
SUBCASE("Empty Container") { std::vector<int> input = {}; CHECK_THROWS_AS(are_all_same(input), std::runtime_error);
Done.
lib/utils/test/src/utils/containers/as_vector.cc
line 15 at r6 (raw file):
std::vector<int> result = as_vector(input); std::vector<int> correct_sorted = {1, 2, 3}; CHECK(sorted(result) == correct_sorted);
Done.
lib/utils/test/src/utils/containers/enumerate.cc
line 55 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
If you're already doing unordered comparison of the keys and values, why not just compare
std::unordered_multiset<std::pair<...>>
rather than two separate comparisons?
set is unordered, we are not guaranteed to have 0 map to "zero", 1 to "one", ... (changed the string labels, before it was pretty confusing).
lib/utils/test/src/utils/containers/find.cc
line 13 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Ideally also add a test where the desired element occurs multiple times in the list
Done.
lib/utils/test/src/utils/containers/find.cc
line 15 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Add subcase names for these, i.e.,
SUBCASE("element is in vector")
,SUBCASE("element is not vector")
Done.
lib/utils/test/src/utils/containers/flatmap.cc
line 9 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Add a check for type changing (e.g., instead of
get_factors
having typeint -> std::vector<int>
, you'd haveint -> std::vector<string>
)
Done.
lib/utils/test/src/utils/containers/index_of.cc
line 13 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
SUBCASE("element is present in vector")
,SUBCASE("element is not present in vector")
Done.
lib/utils/test/src/utils/containers/map_keys.cc
line 10 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Add check for duplicate keys (even if the value is undefined, check whether or not it throws an error, or check that it's one of the allowed return values, etc.)
Done.
lib/utils/test/src/utils/containers/maximum.cc
line 22 at r6 (raw file):
std::vector<int> input = {}; CHECK_THROWS_AS(maximum(input), std::runtime_error);
Done.
lib/utils/test/src/utils/containers/merge_maps.cc
line 27 at r6 (raw file):
std::unordered_map<int, std::string> rhs = {{2, "three"}, {3, "four"}}; CHECK_THROWS_AS(merge_maps(lhs, rhs), std::runtime_error);
Done.
lib/utils/test/src/utils/containers/product.cc
line 8 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Probably a good idea where not too inconvenient to test the various overloads of these functions, either using
TEST_CASE_TEMPLATE
or using multipleTEST_CASE
s (e.g.,TEST_CASE("product(std::vector<int>)") { ... }
,TEST_CASE("product(std::unordered_set<int>)") { ... }
, etc.).Obviously weigh implementation difficulty, how many other tasks you have going on, etc. This is not super-high priority, so if you have more important tasks you could also be working on feel free to not worry about this--my general heuristic with testing code is that if it's an improvement over the current situation and doesn't actively violate a bunch of code style, etc. I'll merge it. I'll probably still leave comments pointing it out on PRs, but in those cases feel free to just mention in the response that you'd prefer not to do those fixes as part of the current PR--this is totally fine, and often a good way to keep PRs from getting too large
got it thanks! Added it since its relatively easy to add, will keep it in mind for future containers functions.
lib/utils/test/src/utils/containers/product.cc
line 24 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Move to separate
product_where.cc
Done.
lib/utils/test/src/utils/containers/reversed.cc
line 9 at r6 (raw file):
TEST_SUITE(FF_TEST_SUITE) { TEST_CASE("Testing the 'reversed' function") {
Done.
lib/utils/test/src/utils/containers/sorted_by.cc
line 10 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Might be good to test what happens if there are duplicates in the list?
Done.
lib/utils/test/src/utils/containers/sum.cc
line 24 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Move to separate
sum_where.cc
Done.
lib/utils/test/src/utils/containers/value_all.cc
line 14 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
I haven't really put much though into what error types things throw, so for things that throw
mk_runtime_error
it's probably better to just check that they throw for now
Done.
lib/utils/test/src/utils/containers/vector_split.cc
line 25 at r6 (raw file):
} SUBCASE("Boundary case: idx = 5") {
Done.
lib/utils/test/src/utils/containers/vector_split.cc
line 35 at r6 (raw file):
} SUBCASE("Out of bounds case: idx = 6") {
Done.
lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc
line 23 at r6 (raw file):
SUBCASE("single node") { Node node = n[2];
Done.
lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc
line 24 at r6 (raw file):
SUBCASE("single node") { Node node = n[2]; std::unordered_set<Node> correct = {n[0], n[2]};
Done.
lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc
line 30 at r6 (raw file):
SUBCASE("multiple nodes") { std::unordered_set<Node> nodes = {n[1], n[3]};
Done.
lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc
line 32 at r6 (raw file):
std::unordered_set<Node> nodes = {n[1], n[3]}; std::unordered_set<Node> result = get_dominators(g, nodes); std::unordered_set<Node> correct = {n[0]};
Done.
lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc
line 61 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
You're missing a check (also separate subcases please, at the very least so you have to justify why you want to do two checks here instead of one)
Done.
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 34 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Add a subcase in here to test adding a duplicate edge
Done.
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 42 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why does subcase this exist?
removed
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 60 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
This is only edges from n0 to n0, which I don't think is what you want?
fixed
lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc
line 73 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
What is the point of this
SUBCASE
?
removed
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc
line 23 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Yes, but can't you add
DirectedEdge
and then get back a list ofMultiDiEdge
?DirectedEdge
is just more readable and specific thanstd::pair<Node, Node>
as yes sorry, done
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc
line 27 at r6 (raw file):
std::vector<MultiDiEdge> edges = add_edges(g, input); SUBCASE("node has incoming edges") {
Done.
lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc
line 34 at r6 (raw file):
} SUBCASE("node has no incoming edges") {
Done.
lib/utils/test/src/utils/graph/views/views.cc
line 88 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Interesting, that should be trivial to implement, can you add it?
Added a tentative implementation, crashes for get_edges
, not sure what I'm doing wrong, mind looking at it and seeing if anything pops out?
lib/utils/test/src/utils/graph/views/views.cc
line 52 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Adding it for other classes would be fine
I think for now we don't need it for these classes for now, i think it makes tests less atomic.
lib/utils/include/utils/containers/optional_all_of.h
line 9 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Are we using this anywhere? It's kinda a weird function and I wouldn't object to removing it if we're not using it
no it's not being used, agree that it should be removed
lib/utils/test/src/utils/containers.cc
line 140 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
I actually think
are_all_same
should be return true if passed an empty list--like, as a user of this function, based on the name I would expect it to behave like (\forall x, y \in L . x == y
). If you want to add anare_all_same1
function that throws on empty list I don't think I would object (though I'm not convinced it's necessary), but I think havingare_all_same
fail on an empty input is rather unintuitive.Note that I think in one of the PRs (I think it was associated with concat, so probably either inceptionv3 or candle-uno) I added a function that is similar but returns the value if they are all the same, which throws an error if the list is empty (as makes sense), so there is support for this case as well
Yeah makes sense, changed it accordingly.
lib/utils/include/utils/containers/get_one_of.h
line 9 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Add check that
s
is not empty
Done.
lib/utils/include/utils/containers/is_submap.h
line 14 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
For consistency with
is_subseteq_of
Done.
lib/utils/test/src/utils/disjoint_set.cc
line 27 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why no stringify?
optional
can befmt
ed, and so canstd::string
andint
Done.
lib/utils/test/src/utils/random_utils.cc
line 12 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
The second case--
CHECK(a == b)
printsa
andb
in the error message if it fails, so if you havefor (int i = 0; ...) { CHECK(a[i] == b[i]); }
then you only get the diverging elements rather than getting to see all ofa
andb
which is often helpfulAlso, simplicity--just having one check is usually more readable.
That said, all of these are not hard rules--if due to doctest's interface making it a single check would be too inconvenient, then we can do the looped version, I just prefer to avoid it where possible.
got it thanks, I think in this case the looped version is more comprehensible
lib/utils/test/src/utils/stack_map.cc
line 49 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why?
std::vector<std::pair<int, int>>
should 100% befmt
able
Done.
lib/utils/test/src/utils/stack_string.cc
line 52 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
stack_string
should befmt
able, why no stringify?
added docstring support to the stack_ types
lib/utils/test/src/utils/stack_vector.cc
line 10 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
And same for other test cases in this file
Done.
lib/utils/test/src/utils/stack_vector.cc
line 67 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why no stringify?
Done.
lib/utils/test/src/utils/tuple.cc
line 46 at r6 (raw file):
auto result = tuple_prepend(value, t1); std::tuple<int, float, double> expected(42, 3.14f, 2.71828); CHECK(tuple_compare(result, expected));
Done.
lib/utils/test/src/utils/type_index.cc
line 22 at r6 (raw file):
} TEST_CASE("matches function") {
Done.
lib/utils/test/src/utils/containers/get_one_of.cc
line 11 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
What happens if
s
is empty?
now throws
lib/utils/test/src/utils/containers/is_subset_eq_of.cc
line at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Rename to
is_subseteq_of.cc
to match the corresponding.h
file (i.e.,utils/containers/is_subseteq_of.h
)
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 111 files at r5, 61 of 73 files at r7, all commit messages.
Reviewable status: 134 of 148 files reviewed, 51 unresolved discussions (waiting on @Marsella8)
lib/utils/include/utils/stack_map.h
line 142 at r7 (raw file):
}; } // namespace doctest
Shouldn't be necessary as long as stack_map
has a std::ostream &operator<<(std::ostream &, stack_map<K, V, MAXSIZE> const &)
defined, as doctest
will use that if it's defined.
Also, doctest
things shouldn't be in utils
(they were for a bit, but if you look at the current repo-refactor
they've all been moved to test/utils
Code quote:
namespace doctest {
template <typename K, typename V, std::size_t MAXSIZE>
struct StringMaker<FlexFlow::stack_map<K, V, MAXSIZE>> {
static String convert(FlexFlow::stack_map<K, V, MAXSIZE> const &map) {
return toString(fmt::to_string(map));
}
};
} // namespace doctest
lib/utils/include/utils/stack_string.h
line 140 at r7 (raw file):
template <size_t MAXSIZE> struct StringMaker<FlexFlow::stack_string<MAXSIZE>> {
See https://reviewable.io/reviews/flexflow/FlexFlow/1464#-O7kp-4g9JgF07PcV2Y1
lib/utils/include/utils/stack_vector.h
line 374 at r7 (raw file):
namespace doctest { template <typename T, std::size_t MAXSIZE>
See https://reviewable.io/reviews/flexflow/FlexFlow/1464#-O7kp-4g9JgF07PcV2Y1
lib/utils/include/utils/containers/are_all_distinct.h
line 6 at r7 (raw file):
#include "utils/containers/unordered_multiset_of.h" #include "utils/containers/unordered_set_of.h" #include "utils/exception.h"
Remove? Doesn't seem to be used.
Code quote:
#include "utils/exception.h"
lib/utils/include/utils/containers/are_all_same.h
line 11 at r6 (raw file):
Previously, Marsella8 wrote…
This requires to have all formatting for all possible containers that might be passed, probably not worth it.
It's a good concern, though peronally I'd lean toward that being fine as I think it's reasonable to expect that most things should be printable (or that if they're not, that they should be made printable) unless there's some cases we know of where this is impossible. What happens if you change it to require printability--are there any nontrivial cases, or is it just a bunch of #include "utils/fmt/....h
(which I don't have much of an objection to)?
lib/utils/include/utils/containers/map_keys.h
line 24 at r7 (raw file):
F const &f) { std::unordered_multiset<K2> transformed_keys = transform(unordered_multiset_of(keys(m)), f);
Probably better not to evaluate f
more times than necessary if possible, could probably get moved into the loop and use the f
call there?
lib/utils/include/utils/containers/product.h
line 9 at r6 (raw file):
Previously, Marsella8 wrote…
I'd say as a loose standard:
- brief for one line descriptions
- details for more comprehensive descriptions and specific behavior (e.g. how
get_distance_from_root
behaves if there are multiple roots).- note for unlikely edge cases / more unimportant properties which are nice to know (e.g. product is vacuously one)
- example for small examples if the function is complex.
Obviously 2 and 3 are not conceptually well separated.
My only objection is that I think @note
gets highlighted (correct me if I'm wrong), so using it for "more unimportant properties" is a bit odd
lib/utils/include/utils/containers/subvec.h
line 19 at r6 (raw file):
Previously, Marsella8 wrote…
modelling it after python's splits/slices, so if you have {1,2,3,4} and do [0, 40] you get {1,2,3,4} (before it was just reading junk, I can also change it to throw errors if it's out of bounds. Ditto for
std::min(idx, (int)v.size());
Probably better to throw an error if it's still out-of-bounds after the negative-to-positive-index conversion
lib/utils/include/utils/containers/transform.h
line 38 at r7 (raw file):
template <typename F, typename In, typename Out = decltype(std::declval<F>()(std::declval<In>()))>
Suggestion:
typename Out = std::invoke_result_t<F, In>>
lib/utils/include/utils/containers/transform.h
line 50 at r7 (raw file):
template <typename F, typename In, typename Out = decltype(std::declval<F>()(std::declval<In>()))>
Suggestion:
typename Out = std::invoke_result_t<F, In>>
lib/utils/include/utils/fmt/unordered_map.h
line 22 at r7 (raw file):
: formatter<::std::string> { template <typename FormatContext> auto format(::std::unordered_map<K, V> const &m, FormatContext &ctx) const
Can you make sure this is also applied to the other files in utils/fmt
?
lib/utils/test/src/utils/bidict/bidict.cc
line 19 at r6 (raw file):
Previously, Marsella8 wrote…
Done.
I still see the explicit std::string(...)
?
lib/utils/test/src/utils/bidict/bidict.cc
line 41 at r7 (raw file):
CHECK_FALSE(bd.contains_r(1)); } }
Nest these as
SUBCASE("L type is the same as R type") {
SUBCASE("bidict::contains_l") {
...
}
SUBCASE("bidict::contains_r") {
}
}
SUBCASE("L type is not the same as R type") {
SUBCASE("bidict::contains_l") {
...
}
SUBCASE("bidict::contains_r") {
}
}
Code quote:
SUBCASE("bidict::contains_l") {
SUBCASE("L type is not the same as R type") {
CHECK(dict.contains_l(1));
CHECK_FALSE(dict.contains_l(3));
}
SUBCASE("L type is the same as R type") {
bidict<int, int> bd;
bd.equate(1, 3);
CHECK(bd.contains_l(1));
CHECK_FALSE(bd.contains_l(3));
}
}
SUBCASE("bidict::contains_r") {
SUBCASE("L type is not the same as R type") {
CHECK(dict.contains_r(std::string("one")));
CHECK_FALSE(dict.contains_r(std::string("three")));
}
SUBCASE("L type is the same as R type") {
bidict<int, int> bd;
bd.equate(1, 3);
CHECK(bd.contains_r(3));
CHECK_FALSE(bd.contains_r(1));
}
}
lib/utils/test/src/utils/bidict/bidict.cc
line 52 at r7 (raw file):
CHECK(bd.contains_r(3)); CHECK_FALSE(bd.contains_r(1)); }
Delete?
Code quote:
SUBCASE("bidict::contains_r, bidict::contains_r - same type") {
bidict<int, int> bd;
bd.equate(1, 3);
bd.equate(2, 4);
CHECK(bd.contains_l(1));
CHECK_FALSE(bd.contains_l(3));
CHECK(bd.contains_r(3));
CHECK_FALSE(bd.contains_r(1));
}
lib/utils/test/src/utils/bidict/algorithms/merge_bidicts.cc
line 32 at r7 (raw file):
bidict<int, std::string> bd2 = {{2, "two"}, {3, "three"}}; CHECK_THROWS(merge_bidicts(bd1, bd2));
If this is the case, then it should probably be renamed merge_disjoint_bidicts
lib/utils/test/src/utils/containers/enumerate.cc
line 55 at r6 (raw file):
Previously, Marsella8 wrote…
set is unordered, we are not guaranteed to have 0 map to "zero", 1 to "one", ... (changed the string labels, before it was pretty confusing).
std::unordered_multiset<std::pair<int, std::string>> correct = {{0, "A"}, {1, "B"}, {2, "C"}, {3, "D"}};
would check that, right?
lib/utils/test/src/utils/containers/find.cc
line 32 at r7 (raw file):
std::unordered_set<int> s = {1, 2, 3, 4, 5}; SUBCASE("element found") {
Would probably be a better name, but not a big deal
Suggestion:
SUBCASE("element in container") {
lib/utils/test/src/utils/containers/index_of.cc
line 14 at r7 (raw file):
std::vector<int> v = {1, 2, 3, 4, 3, 5}; SUBCASE("unique element") {
Suggestion:
SUBCASE("element occurs once in container") {
lib/utils/test/src/utils/containers/index_of.cc
line 19 at r7 (raw file):
SUBCASE("duplicate elements") { CHECK(index_of(v, 3).value() == 2); // Returns first occurrence }
Suggestion:
SUBCASE("if element appears multiple times, returns the first occurence") {
CHECK(index_of(v, 3).value() == 2);
}
lib/utils/test/src/utils/containers/product_where.cc
line 17 at r7 (raw file):
CHECK(correct == result); } SUBCASE("empty filtered container") {
Add subcase for if the container started empty
lib/utils/test/src/utils/containers/sorted_by.cc
line 27 at r7 (raw file):
} SUBCASE("duplicates") {
Suggestion:
SUBCASE("container contains duplicate elements") {
lib/utils/test/src/utils/containers/sum_where.cc
line 17 at r7 (raw file):
int result = sum_where(input, condition); CHECK(correct == result); }
Add subcase for if the container starts empty
lib/utils/test/src/utils/containers/vector_split.cc
line 35 at r7 (raw file):
} SUBCASE("Out of bounds case: idx == list_size + 1") {
Isn't the first out of bounds index list_size
and not list_size+1
, e.g., if I have a list l
of 1 element, l.at(1)
would throw an error
lib/utils/test/src/utils/random_utils.cc
line 28 at r7 (raw file):
SUBCASE("correct distribution") { auto check_probabilities = [](std::vector<int> values,
Suggestion:
auto check_probabilities = [](std::vector<int> const &values,
lib/utils/test/src/utils/containers/get_one_of.cc
line 16 at r7 (raw file):
SUBCASE("empty set") { std::unordered_set<int> s = {}; CHECK_THROWS_AS(get_one_of(s), std::runtime_error);
Suggestion:
CHECK_THROWS(get_one_of(s));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 73 files at r7.
Reviewable status: 137 of 148 files reviewed, 47 unresolved discussions (waiting on @Marsella8)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 111 files at r5, 1 of 73 files at r7.
Reviewable status: 138 of 148 files reviewed, 47 unresolved discussions (waiting on @lockshaw)
lib/utils/include/utils/stack_map.h
line 142 at r7 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Shouldn't be necessary as long as
stack_map
has astd::ostream &operator<<(std::ostream &, stack_map<K, V, MAXSIZE> const &)
defined, asdoctest
will use that if it's defined.Also,
doctest
things shouldn't be inutils
(they were for a bit, but if you look at the currentrepo-refactor
they've all been moved totest/utils
Done.
lib/utils/include/utils/stack_string.h
line 140 at r7 (raw file):
Previously, lockshaw (Colin Unger) wrote…
See https://reviewable.io/reviews/flexflow/FlexFlow/1464#-O7kp-4g9JgF07PcV2Y1
Done.
lib/utils/include/utils/stack_vector.h
line 374 at r7 (raw file):
Previously, lockshaw (Colin Unger) wrote…
See https://reviewable.io/reviews/flexflow/FlexFlow/1464#-O7kp-4g9JgF07PcV2Y1
Done.
lib/utils/include/utils/containers/are_all_distinct.h
line 6 at r7 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Remove? Doesn't seem to be used.
Done.
lib/utils/include/utils/containers/are_all_same.h
line 11 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
It's a good concern, though peronally I'd lean toward that being fine as I think it's reasonable to expect that most things should be printable (or that if they're not, that they should be made printable) unless there's some cases we know of where this is impossible. What happens if you change it to require printability--are there any nontrivial cases, or is it just a bunch of
#include "utils/fmt/....h
(which I don't have much of an objection to)?
Done.
lib/utils/include/utils/containers/map_keys.h
line 24 at r7 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Probably better not to evaluate
f
more times than necessary if possible, could probably get moved into the loop and use thef
call there?
moved it at the end of the function body
lib/utils/include/utils/containers/product.h
line 9 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
My only objection is that I think
@note
gets highlighted (correct me if I'm wrong), so using it for "more unimportant properties" is a bit odd
yes, but I don't think it matters too much since the user can still see the different categories and assess the importance of information from that (+ the highlighting means that doxygen recognizes it). hope it makes sense
lib/utils/include/utils/containers/subvec.h
line 19 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Probably better to throw an error if it's still out-of-bounds after the negative-to-positive-index conversion
Done.
lib/utils/include/utils/containers/transform.h
line 38 at r7 (raw file):
template <typename F, typename In, typename Out = decltype(std::declval<F>()(std::declval<In>()))>
Done.
lib/utils/include/utils/containers/transform.h
line 50 at r7 (raw file):
template <typename F, typename In, typename Out = decltype(std::declval<F>()(std::declval<In>()))>
Done.
lib/utils/include/utils/fmt/unordered_map.h
line 22 at r7 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Can you make sure this is also applied to the other files in
utils/fmt
?
Done.
lib/utils/test/src/utils/bidict/bidict.cc
line 19 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
I still see the explicit
std::string(...)
?
Done.
lib/utils/test/src/utils/bidict/bidict.cc
line 41 at r7 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Nest these as
SUBCASE("L type is the same as R type") { SUBCASE("bidict::contains_l") { ... } SUBCASE("bidict::contains_r") { } } SUBCASE("L type is not the same as R type") { SUBCASE("bidict::contains_l") { ... } SUBCASE("bidict::contains_r") { } }
Done.
lib/utils/test/src/utils/bidict/bidict.cc
line 52 at r7 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Delete?
Done.
lib/utils/test/src/utils/bidict/algorithms/merge_bidicts.cc
line 32 at r7 (raw file):
Previously, lockshaw (Colin Unger) wrote…
If this is the case, then it should probably be renamed
merge_disjoint_bidicts
Done.
lib/utils/test/src/utils/containers/enumerate.cc
line 55 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
std::unordered_multiset<std::pair<int, std::string>> correct = {{0, "A"}, {1, "B"}, {2, "C"}, {3, "D"}};
would check that, right?
no, since unordered_set
has no guaranteed order and thus we could also have something like {{0, "C"}, {1, "B"}, {2, "A"}, {3, "D"}}
. Our only guarantee is that we get a bijective mapping between 0, 1, ..., n and the elements of the set. (Which is why i consider having enumerate defined on an unordered_set to be a bit odd, though I can perfectly see why in practice you might need something like that).
lib/utils/test/src/utils/containers/index_of.cc
line 14 at r7 (raw file):
std::vector<int> v = {1, 2, 3, 4, 3, 5}; SUBCASE("unique element") {
Done.
lib/utils/test/src/utils/containers/index_of.cc
line 19 at r7 (raw file):
SUBCASE("duplicate elements") { CHECK(index_of(v, 3).value() == 2); // Returns first occurrence }
Done.
lib/utils/test/src/utils/containers/product_where.cc
line 17 at r7 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Add subcase for if the container started empty
Done.
lib/utils/test/src/utils/containers/sorted_by.cc
line 27 at r7 (raw file):
} SUBCASE("duplicates") {
Done.
lib/utils/test/src/utils/containers/sum_where.cc
line 17 at r7 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Add subcase for if the container starts empty
Done.
lib/utils/test/src/utils/containers/vector_split.cc
line 35 at r7 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Isn't the first out of bounds index
list_size
and notlist_size+1
, e.g., if I have a listl
of 1 element,l.at(1)
would throw an error
No because its [0, idx)
, so if you have a l = {1,2}
, with idx=0
you get an empty prefix, with idx=2
you get {1,2}
and with idx=3
you go out of bounds. (If you have n
elements, then you have n+1
possible splits.
lib/utils/test/src/utils/random_utils.cc
line 28 at r7 (raw file):
SUBCASE("correct distribution") { auto check_probabilities = [](std::vector<int> values,
Done.
lib/utils/test/src/utils/containers/get_one_of.cc
line 16 at r7 (raw file):
SUBCASE("empty set") { std::unordered_set<int> s = {}; CHECK_THROWS_AS(get_one_of(s), std::runtime_error);
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 33 of 62 files at r8, all commit messages.
Reviewable status: 131 of 161 files reviewed, 26 unresolved discussions (waiting on @Marsella8)
lib/utils/include/utils/containers/maximum.h
line 14 at r6 (raw file):
Previously, Marsella8 wrote…
See
are_all_same
comment
Update?
lib/utils/include/utils/containers/product.h
line 9 at r6 (raw file):
(+ the highlighting means that doxygen recognizes it)
I mean in the doxygen-rendered html, not in the editor 😂
lib/utils/include/utils/containers/subvec.h
line 25 at r8 (raw file):
new_idx = size + idx; } if (new_idx < 0 || new_idx > size) {
Should this be >
or >=
?
lib/utils/test/src/utils/containers/enumerate.cc
line 55 at r6 (raw file):
Previously, Marsella8 wrote…
no, since
unordered_set
has no guaranteed order and thus we could also have something like{{0, "C"}, {1, "B"}, {2, "A"}, {3, "D"}}
. Our only guarantee is that we get a bijective mapping between 0, 1, ..., n and the elements of the set. (Which is why i consider having enumerate defined on an unordered_set to be a bit odd, though I can perfectly see why in practice you might need something like that).
Somehow missed you were enumerating unordered_set
. Thanks 🙂
lib/utils/test/src/utils/containers/subvec.cc
line 59 at r8 (raw file):
SUBCASE("Out of bounds index from below") { CHECK_THROWS(subvec(v, -100, 2));
Would be better to test the boundary cases rather than extreme cases, as the boundary cases are more likely to have bugs
lib/utils/test/src/utils/containers/vector_split.cc
line 35 at r7 (raw file):
Previously, Marsella8 wrote…
No because its
[0, idx)
, so if you have al = {1,2}
, withidx=0
you get an empty prefix, withidx=2
you get{1,2}
and withidx=3
you go out of bounds. (If you haven
elements, then you haven+1
possible splits.
Makes sense, thanks for clarifying 🙂
In that case, might be nice to clarify SUBCASE("Boundary case: idx is the last index in the list")
since an index of 5 isn't really "in" a list of length 5 (probably better to say "idx is right after the last element in the list" or something like that)
lib/utils/include/utils/bidict/algorithms/merge_disjoint_bidicts.h
line 17 at r8 (raw file):
if (!are_disjoint(left_entries(lhs), left_entries(rhs))) { throw mk_runtime_error( fmt::format("Left entries of {} and {} are non-disjoint"), lhs, rhs);
Suggestion:
throw mk_runtime_error(
fmt::format("Left entries of {} and {} are non-disjoint", lhs, rhs));
lib/utils/include/utils/bidict/algorithms/merge_disjoint_bidicts.h
line 21 at r8 (raw file):
if (!are_disjoint(right_entries(lhs), right_entries(rhs))) { throw mk_runtime_error( fmt::format("Right entries of {} and {} are non-disjoint"), lhs, rhs);
Suggestion:
throw mk_runtime_error(
fmt::format("Right entries of {} and {} are non-disjoint", lhs, rhs));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r9, 1 of 1 files at r10, all commit messages.
Reviewable status: 131 of 161 files reviewed, 20 unresolved discussions (waiting on @Marsella8)
lib/utils/include/utils/containers/maximum.h
line 5 at r9 (raw file):
#include "utils/exception.h" #include "utils/fmt/map.h"
Remove, for any template types the caller file is responsible for including the necessary header file (I just went and fixed this myself, but FYI)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 131 of 161 files reviewed, 20 unresolved discussions (waiting on @lockshaw)
lib/utils/include/utils/containers/maximum.h
line 14 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Update?
Done.
lib/utils/include/utils/containers/product.h
line 9 at r6 (raw file):
Previously, lockshaw (Colin Unger) wrote…
(+ the highlighting means that doxygen recognizes it)
I mean in the doxygen-rendered html, not in the editor 😂
whoops : - (
lib/utils/include/utils/containers/subvec.h
line 25 at r8 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Should this be
>
or>=
?
current one is correct (since range is exclusive for the upper-bound, so size() is the last valid bound)
lib/utils/test/src/utils/containers/subvec.cc
line 59 at r8 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Would be better to test the boundary cases rather than extreme cases, as the boundary cases are more likely to have bugs
fixed
lib/utils/test/src/utils/containers/vector_split.cc
line 35 at r7 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Makes sense, thanks for clarifying 🙂
In that case, might be nice to clarify
SUBCASE("Boundary case: idx is the last index in the list")
since an index of 5 isn't really "in" a list of length 5 (probably better to say "idx is right after the last element in the list" or something like that)
Done.
lib/utils/include/utils/bidict/algorithms/merge_disjoint_bidicts.h
line 17 at r8 (raw file):
if (!are_disjoint(left_entries(lhs), left_entries(rhs))) { throw mk_runtime_error( fmt::format("Left entries of {} and {} are non-disjoint"), lhs, rhs);
Done.
lib/utils/include/utils/bidict/algorithms/merge_disjoint_bidicts.h
line 21 at r8 (raw file):
if (!are_disjoint(right_entries(lhs), right_entries(rhs))) { throw mk_runtime_error( fmt::format("Right entries of {} and {} are non-disjoint"), lhs, rhs);
Done.
Description of changes:
Additional testing for graph utilities
Related Issues:
Linked Issues:
Issues closed by this PR:
This change is