-
Notifications
You must be signed in to change notification settings - Fork 162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
testing: add router benchmark program and use it as integration test #4437
Conversation
Miscellaneous facts: * This load test can be used to benchmark the router. * The test orchestrator is a simplified variant of the end2end_integration test. * The leg-work is done by the end2end binary, which has been extended to play other games than ping-pong. * The test always exercises all available pairs. For the purpose of benchmarking, it relies on a bespoke topology: router_bm.topo, which has just enough to exercise all types of forwarding.
…outine closure. Just let it be captured.
For now the test is not accomplishing much. It runs the benchmark and collects "some" metrics. The metrics are just printed out in raw form. A future version will collect upcoming (separate PR) metrics that are actually relevant to a router benchmark, and compare them with some expectations.
I needed most of the features in end2end_integration so there was no benefit in having a separate executable. It turned out impossible to create a custom topology that would produce only one type of traffic at one router while exercising all possible role pairs. One still needs to chose role pairs to obtain that. Also I couldn't do it with only the existing subsets. I needed to add noncore#nonlocalcore and a slightly extended topology.
Also suppress most pongs in the packetflood game, in order to allow some routers to have only IN or only OUT traffic.
Mainly: * Added some more flexibility in the end2end_integration subset selection. * Make use of samein router_benchmark test. * Suppressed some more tracing under the traces=false option. * Cache the path to the target during during the packetflood game.
Changes: * added flowID to ping messages in the router benchmark program. * fix ping response suppression. Do not suppress one in 256 responses, but allow one in 256. * fixed typoe in log. * fix metric labeling. The proper interface ID for each metric had been lost in a refactoring.
Changes: * Make server methods take a pointer receiver so pongs count isn't reset for each call. * Formatting and so on.
Mainly: * Add mertic to expose number of cores used by Go. * Clarify running/runnable metrics. * Collect the correct performance data from the test run. * Fixed the router_bm topo so that there actually are two ISDs.
Have the test driver supply the best metrics time-window to the test harness.
Rely on wait_connectivity rather than the passage of time.
This is in the faint hope of evading a merge conflict.
...Or may be I had already rebased on top of the trunk's fix. Don't know.
Had to skip the autogenerated docs in rules_openapi/tools as it seems we have no control on those and mdlint doesn't like them anymore.
Checks are only performed during CI. Expectations have been extrapolated from one initial CI run.
That was the problem.
Just so that I can easily see what the CI environment is like.
Delete edundant log. Move core count cpature and output to the end.
Five seconds gave too much variability. It seems that pkt rate and cpu rate don't align very well or that cpu rate is very volatile. That caused pkts/cpu-sec to have large random spikes. With 10s the spikes are gone.
Since switching to a 10s window the average observed performance may have been reduced a bit.
Summary: * In the router, prevent the Go runtime from trying to use more cores than the specified number of processor threads. Reason: * Be consistent with the default where there are as many processor threads as there are cores. * In non-default cases, we want to be able to support multiple routers on the same host. When doing that, Go's assumption that all the cores are fair game distorts the performance as all router instances compete for the same cores and so incurr far more preemption than they would in a normal deployment. * In the test harness, configure the number of cores allowed to each router based on what's available on the host running the test. * In the end2end binary, terminate the pong drain task quickly instead of waiting up to 10s. This allows to report the end-time of the test accurately, thereby removing a lot of variability in the results (before, the metrics capture window could overlap with the end of the active phase of the test). * Because the test goes faster, increase the number of packets to 1.5M, so we are sure we have a reliable 10s metrics capture window in the middle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 14 files at r1.
Reviewable status: 2 of 15 files reviewed, 6 unresolved discussions (waiting on @jiceatscion)
.buildkite/pipeline_lib.sh
line 20 at r2 (raw file):
if [[ "$test" =~ router_benchmark ]]; then args="--test_arg=--ci" fi
There should already be an env variable CI
which could be used instead of this flag. If I understand the plumbum API correctly, we could even set the flag implicitly based on an envvar using cli.Flag(...., envname="CI")
.
acceptance/router_benchmark/BUILD.bazel
line 8 at r2 (raw file):
args = [ "--executable=end2end_integration:$(location //tools/end2end_integration)", ],
I'm actually a bit confused how this can even run the ./bin/end2end
as a subcommand binary. How is this even built and how come this is on the appropriate path to run here? Usually bazel is such a stickler about this...
acceptance/router_benchmark/test.py
line 3 at r2 (raw file):
#!/usr/bin/env python3 # Copyright 2020 Anapaya Systems
Nit copy pasted copyright
pkg/snet/packet_conn.go
line 118 at r2 (raw file):
// FlowID: arbitrary value to distribute connections over router CPUs. // Initialize to 1 if backward compatible (no-distribution) behavior is desired. FlowID uint32
Could we deal with this Flow ID change in a separate PR, so that this gets the appropriate visibility?
IMO, PacketConn
this is not the right API level to expose this. The PacketConn
is sort of a "raw" connection. A user of this API can be expected to set the FlowID directly in the Packet
.
I would add this to the snet.Conn
instead, setting it when creating a packet in scionConnWriter.WriteTo
. We can e.g. set this based on a hash of src/dst ISD, AS, Host address and Port (see e.g. https://www.rfc-editor.org/rfc/rfc6437#appendix-A).
I don't think this requires a backwards compatibility layer. From an application perspective, with using a constant FlowID, we would have possibly had an ordering guarantee between packets sent from different snet.Conn
s. I doubt that this would even have worked in the first place (dispatcher / sending host might not guarantee the send order already). (Ab-)Using this in an application would probably have been tricky enough that I'm sure someone would have noticed that it's a bad idea.
tools/end2end/main.go
line 17 at r2 (raw file):
// This is a general purpose client/server code for end2end tests. It plays // ping-pong with some variantions depending on command line arguments.
I'm not sure about generalizing this end2end thing. It feels to me like this makes things much more complicated than they need to be.
Wouldn't a separate "blaster" be much shorter and simpler than this? We only want run one test at a time, and so we don't seem to benefit much from the end2end_integration
harness. The server side becomes trivial if we don't send any replies and just report received packets out of band.
Note that we could also use the end2end_integration
harness with a different command than end2end -- it just assumes that some flags are supported -- so this could still be used with a simpler "blaster" test.
I don't know, what do you think?
tools/integration/integrationlib/common.go
line 150 at r2 (raw file):
} log.Info("Retrying...") time.Sleep(integration.RetryTimeout)
Nit. This is now printing "Retrying..." after the last attempt.
Observed numbers have about double after improving test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 15 files reviewed, 2 unresolved discussions (waiting on @matzf)
.buildkite/pipeline_lib.sh
line 20 at r2 (raw file):
Previously, matzf (Matthias Frei) wrote…
There should already be an env variable
CI
which could be used instead of this flag. If I understand the plumbum API correctly, we could even set the flag implicitly based on an envvar usingcli.Flag(...., envname="CI")
.
Done.
I wasn't aware. Works like a charm.
acceptance/router_benchmark/BUILD.bazel
line 8 at r2 (raw file):
Previously, matzf (Matthias Frei) wrote…
I'm actually a bit confused how this can even run the
./bin/end2end
as a subcommand binary. How is this even built and how come this is on the appropriate path to run here? Usually bazel is such a stickler about this...
I honestly don't know. I didn't invent this; I cargo-culted it from trc_update/BUILD.bazel. Apparently the value of "--executable" is actually a mapping: "end2end_integration" -> path_of(tools/end2end_integration). The rest is the business of common/base.py (which builds the map) and /test.py that peruses it. Bazel is content because the executable file is listed as a data dependency of the test.
acceptance/router_benchmark/test.py
line 3 at r2 (raw file):
Previously, matzf (Matthias Frei) wrote…
Nit copy pasted copyright
Done
pkg/snet/packet_conn.go
line 118 at r2 (raw file):
Previously, matzf (Matthias Frei) wrote…
Could we deal with this Flow ID change in a separate PR, so that this gets the appropriate visibility?
IMO,
PacketConn
this is not the right API level to expose this. ThePacketConn
is sort of a "raw" connection. A user of this API can be expected to set the FlowID directly in thePacket
.I would add this to the
snet.Conn
instead, setting it when creating a packet inscionConnWriter.WriteTo
. We can e.g. set this based on a hash of src/dst ISD, AS, Host address and Port (see e.g. https://www.rfc-editor.org/rfc/rfc6437#appendix-A).
I don't think this requires a backwards compatibility layer. From an application perspective, with using a constant FlowID, we would have possibly had an ordering guarantee between packets sent from differentsnet.Conn
s. I doubt that this would even have worked in the first place (dispatcher / sending host might not guarantee the send order already). (Ab-)Using this in an application would probably have been tricky enough that I'm sure someone would have noticed that it's a bad idea.
I suspected you'd object to such an ad-hoc API change. I just did a couple of runs with and without... the performance difference is actually unknown because I just discovered while deleting the code that I had forgotten to use the passed flowID value when serializing! So, I guess we can live without that until we had time to add the feature properly.
tools/end2end/main.go
line 17 at r2 (raw file):
Previously, matzf (Matthias Frei) wrote…
I'm not sure about generalizing this end2end thing. It feels to me like this makes things much more complicated than they need to be.
Wouldn't a separate "blaster" be much shorter and simpler than this? We only want run one test at a time, and so we don't seem to benefit much from the
end2end_integration
harness. The server side becomes trivial if we don't send any replies and just report received packets out of band.Note that we could also use the
end2end_integration
harness with a different command than end2end -- it just assumes that some flags are supported -- so this could still be used with a simpler "blaster" test.I don't know, what do you think?
A separate blaster may be a "simpler" solution to execute, but not to code, because end2end is already written. I was about to write a separate tool when I realized that the changes I needed to make to end2end to satisfy my needs where very simple. In the long run I agree with you. We'll eventually make a specialized tool. Not because it's easier but because it can test fewer components at a time (i.e. one), which will give more accurate readings. For now, I'm happy that I could meter all 5 traffic types in two runs with one topology. I find it pretty convenient to get some early results. So, if it's ok with you, I'd prefer to keep this for now and do another round of improvements in a later PR.
tools/integration/integrationlib/common.go
line 150 at r2 (raw file):
Previously, matzf (Matthias Frei) wrote…
Nit. This is now printing "Retrying..." after the last attempt.
Done
Summary: * Stop abusing the num_processors config variable and add a num_cores config for that purpose. (This gives us the flexibility of configuring processors and cores independently. * Adjust the core allocation scheme in the test harness. Gives better results and better balance in presence of other than 12 cores. * Document the new config variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 14 files at r1, 5 of 9 files at r5, all commit messages.
Reviewable status: 7 of 16 files reviewed, 4 unresolved discussions (waiting on @jiceatscion)
acceptance/router_benchmark/BUILD.bazel
line 8 at r2 (raw file):
Previously, jiceatscion wrote…
I honestly don't know. I didn't invent this; I cargo-culted it from trc_update/BUILD.bazel. Apparently the value of "--executable" is actually a mapping: "end2end_integration" -> path_of(tools/end2end_integration). The rest is the business of common/base.py (which builds the map) and /test.py that peruses it. Bazel is content because the executable file is listed as a data dependency of the test.
Yup, I saw that it's the same in trc_update
. I do understand the end2end_integration
, but I'm confused about end2end
which is invoked by end2end_integration
. I think this is escapes the bazel sandbox somehow, which probably means that it only works because we build into bin/
before.
doc/manuals/router.rst
line 182 at r5 (raw file):
The send buffer size in bytes. 0 means use system default. .. option:: router.num_cores = <int> (Default: ``GOMAXPROCS``)
I would prefer not to create a redundant option to control this. The existing GOMAXPROCS
already does what we need. I think it's a great idea to document better that this (and other GO...) environment variables apply.
In the test setup, adding an environment variable is just as easy as editing the toml, it can be set in the docker-compose file.
(Btw, the compose file also has some options to limit cpu usage. AFAIU, setting the cpuset
option would result in an affinity mask which the go runtime does take into account for the default GOMAXPROC value. The go runtime does not, however, take into account CPU quotas with cpus
value.)
pkg/snet/dispatcher.go
line 59 at r5 (raw file):
ia addr.IA, registration *net.UDPAddr, svc addr.SVC) (PacketConn, uint16, error) {
Nit: maybe revert this file and snet/packet_conn.go
entirely to avoid any git blame and or merge issues.
pkg/snet/packet_conn.go
line 118 at r2 (raw file):
Previously, jiceatscion wrote…
I suspected you'd object to such an ad-hoc API change. I just did a couple of runs with and without... the performance difference is actually unknown because I just discovered while deleting the code that I had forgotten to use the passed flowID value when serializing! So, I guess we can live without that until we had time to add the feature properly.
Ok!
tools/end2end/main.go
line 17 at r2 (raw file):
A separate blaster may be a "simpler" solution to execute, but not to code, because end2end is already written.
The end2end tool grew by about 40% lines of code to include the packetflood. My gut feeling is that with a copy-pasted and stream-lined version of end2end that only does the "packetflood", we would end up with not much more code in total.
I don't like that this now conflates two mostly separate code paths just to share some of the boilerplate, but I think I don't feel strongly enough about this end2end test to object. Especially, as you already have plans to replace this. @scionproto/scion-core-team opinions?
topology/router_bm.topo
line 1 at r5 (raw file):
--- # Bespoke Topology for router benchmarking:
Nit. For other tests that expect specific topology, we put this .topo file into the testdata, i.e. acceptance/router_benchmark/testdata/
. Probably a good way to keep things tidy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 7 of 16 files reviewed, 2 unresolved discussions (waiting on @matzf)
acceptance/router_benchmark/BUILD.bazel
line 8 at r2 (raw file):
Previously, matzf (Matthias Frei) wrote…
Yup, I saw that it's the same in
trc_update
. I do understand theend2end_integration
, but I'm confused aboutend2end
which is invoked byend2end_integration
. I think this is escapes the bazel sandbox somehow, which probably means that it only works because we build intobin/
before.
mmm, end2end_integration might need to have bin/end2end as a data dependency too. It seems to be enough. Although I don't know precisely how bazel manages it.
doc/manuals/router.rst
line 182 at r5 (raw file):
Previously, matzf (Matthias Frei) wrote…
I would prefer not to create a redundant option to control this. The existing
GOMAXPROCS
already does what we need. I think it's a great idea to document better that this (and other GO...) environment variables apply.In the test setup, adding an environment variable is just as easy as editing the toml, it can be set in the docker-compose file.
(Btw, the compose file also has some options to limit cpu usage. AFAIU, setting thecpuset
option would result in an affinity mask which the go runtime does take into account for the default GOMAXPROC value. The go runtime does not, however, take into account CPU quotas withcpus
value.)
Well, yes it occured to me that we could use the env var directly. I just loathe env vars. They pop in and out of existence with no dicernable origin and no scope. I thought I was doing us a favor by adding an explicit config... but ok. I used the env var instead.
pkg/snet/dispatcher.go
line 59 at r5 (raw file):
Previously, matzf (Matthias Frei) wrote…
Nit: maybe revert this file and
snet/packet_conn.go
entirely to avoid any git blame and or merge issues.
done
tools/end2end/main.go
line 17 at r2 (raw file):
Previously, matzf (Matthias Frei) wrote…
A separate blaster may be a "simpler" solution to execute, but not to code, because end2end is already written.
The end2end tool grew by about 40% lines of code to include the packetflood. My gut feeling is that with a copy-pasted and stream-lined version of end2end that only does the "packetflood", we would end up with not much more code in total.
I don't like that this now conflates two mostly separate code paths just to share some of the boilerplate, but I think I don't feel strongly enough about this end2end test to object. Especially, as you already have plans to replace this. @scionproto/scion-core-team opinions?
Yeah, in terms of code quantity it ended-up about the same. It was just an easier (seeming) path to follow. Ideally I would just ditch this and replace it with the packet blaster approach, but something tells me it isn't going to be a 2 days job. The nitty-gritty is rather messy. I really don't want to drag this PR on forever. The test itself may not be great but the pattern is established and produces result. So I'd like to submit it. However I have no problem moving the packetflood game into its own binary. So...stay tuned for that one.
topology/router_bm.topo
line 1 at r5 (raw file):
Previously, matzf (Matthias Frei) wrote…
Nit. For other tests that expect specific topology, we put this .topo file into the testdata, i.e.
acceptance/router_benchmark/testdata/
. Probably a good way to keep things tidy.
router_benchmark is like end2end_integration. You can tun it by hand outside the test framework. That's why I put the topo file in the same location. But yeah, I can move the file. Done
Specifically: * Use the GOMAXPROCS environment variable instead of a setting config field for the routers. * Move the router_bm.topo file to router_benchmark/test_data/ * Full restore of some unchanged files. * Add an explicit dependency of end2end_integration to the end2end binary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 4 of 17 files reviewed, 2 unresolved discussions (waiting on @matzf)
doc/manuals/router.rst
line 182 at r5 (raw file):
Previously, jiceatscion wrote…
Well, yes it occured to me that we could use the env var directly. I just loathe env vars. They pop in and out of existence with no dicernable origin and no scope. I thought I was doing us a favor by adding an explicit config... but ok. I used the env var instead.
done
Moved it to a sibling binary named end2endblast. As a result end2end* do not take a "game" argument any more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 4 of 21 files reviewed, 1 unresolved discussion (waiting on @matzf)
tools/end2end/main.go
line 17 at r2 (raw file):
Previously, jiceatscion wrote…
Yeah, in terms of code quantity it ended-up about the same. It was just an easier (seeming) path to follow. Ideally I would just ditch this and replace it with the packet blaster approach, but something tells me it isn't going to be a 2 days job. The nitty-gritty is rather messy. I really don't want to drag this PR on forever. The test itself may not be great but the pattern is established and produces result. So I'd like to submit it. However I have no problem moving the packetflood game into its own binary. So...stay tuned for that one.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 14 files at r1, 7 of 14 files at r6, 1 of 1 files at r7, all commit messages.
Reviewable status: 13 of 21 files reviewed, 5 unresolved discussions (waiting on @jiceatscion)
acceptance/router_benchmark/BUILD.bazel
line 8 at r2 (raw file):
Previously, jiceatscion wrote…
mmm, end2end_integration might need to have bin/end2end as a data dependency too. It seems to be enough. Although I don't know precisely how bazel manages it.
For the record; as you found out, the end2end is baked into the tester docker image, that's how it ends up in the right place.
doc/manuals/router.rst
line 182 at r5 (raw file):
Previously, jiceatscion wrote…
done
Ah, I see. I appreciate that you want to improve the configuration, sorry for rudely shooting it down!
As we cannot get rid of the environment variable even if we wanted, also exposing this as a configuration option means that the settings can conflict and one takes precedence, which, IMO, is even nastier than the environment variable alone. Hope this makes some sense.
doc/manuals/router.rst
line 214 at r7 (raw file):
The number of kernel threads that go creates depends on the number of usable cores, which is controlled by the environment variable ``GOMAXPROCS``. See :ref:`GOMAXPROCS <gomaxprocs>`.
Nit: use the :env:
role, this directly creates a reference and the right formatting. Then we don't need the separate anchor _gomaxprocs
.
Suggestion:
controlled by the environment variable :env:`GOMAXPROCS`.
tools/end2end/main.go
line 17 at r2 (raw file):
Previously, jiceatscion wrote…
done
Ok, got it. Seems good to me.
tools/end2end/main.go
line 360 at r7 (raw file):
} func (c *client) ping(ctx context.Context, n int, path snet.Path, logIfOk bool) error {
Nit: remove the logIfOk
(true
here in end2end
, false
in end2endblast
).
tools/end2endblast/main.go
line 117 at r7 (raw file):
flag.Var(timeout, "timeout", "The timeout for each attempt") flag.BoolVar(&epic, "epic", false, "Enable EPIC") flag.BoolVar(&traces, "traces", true, "Enable Jaeger traces")
Nit: as far as I understand, the traces as they are implemented here never make much sense for the blaster. I think it would make sense to remove this flag here and drop all the related code. In the end2end, on the other hand, we could revert to always enable this.
tools/end2endblast/main.go
line 306 at r7 (raw file):
defer c.sdConn.Close() c.errorPaths = make(map[snet.PathFingerprint]struct{}) pong_out := make(chan int)
Nit: pongOut
(also pingResult
, pongResult
, ...)
tools/end2endblast/main.go
line 359 at r7 (raw file):
return true } c.path = p // struct fields cannot be assigned with :=
Nit: in contrast to end2end
, this does not switch to alternative paths on errors. We could clean this up by moving this getRemote
call out, before the first call of the blindPing
.
We can also remove everything related to errorPaths
Namely: * Doc formating * Simplify end2endblast now that its function is separate from that of end2end. * Restore endend almost completely to its original form for the same reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 13 of 21 files reviewed, 1 unresolved discussion (waiting on @matzf)
doc/manuals/router.rst
line 182 at r5 (raw file):
Previously, matzf (Matthias Frei) wrote…
Ah, I see. I appreciate that you want to improve the configuration, sorry for rudely shooting it down!
As we cannot get rid of the environment variable even if we wanted, also exposing this as a configuration option means that the settings can conflict and one takes precedence, which, IMO, is even nastier than the environment variable alone. Hope this makes some sense.
Well, yeah. For the config approach to be an actual improvement, we would need to also suppress the effect of the env var (which we can, btw: call GOMAXPROC(NumCPU()) by default). But that's added mess for too little benefit. I do find that there are too many sources of configuration in general; not just for this one thing, so if it gets addressed it better be as a separate more comprehensive effort.
tools/end2end/main.go
line 360 at r7 (raw file):
Previously, matzf (Matthias Frei) wrote…
Nit: remove the
logIfOk
(true
here inend2end
,false
inend2endblast
).
done
tools/end2endblast/main.go
line 117 at r7 (raw file):
Previously, matzf (Matthias Frei) wrote…
Nit: as far as I understand, the traces as they are implemented here never make much sense for the blaster. I think it would make sense to remove this flag here and drop all the related code. In the end2end, on the other hand, we could revert to always enable this.
Done. Might as well now.
tools/end2endblast/main.go
line 306 at r7 (raw file):
Previously, matzf (Matthias Frei) wrote…
Nit:
pongOut
(alsopingResult
,pongResult
, ...)
Done. Sorry for letting snakes in the camel pen.
tools/end2endblast/main.go
line 359 at r7 (raw file):
Previously, matzf (Matthias Frei) wrote…
Nit: in contrast to
end2end
, this does not switch to alternative paths on errors. We could clean this up by moving thisgetRemote
call out, before the first call of theblindPing
.
We can also remove everything related toerrorPaths
Done... and then some.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 12 of 21 files reviewed, all discussions resolved (waiting on @matzf)
doc/manuals/router.rst
line 214 at r7 (raw file):
Previously, matzf (Matthias Frei) wrote…
Nit: use the
:env:
role, this directly creates a reference and the right formatting. Then we don't need the separate anchor_gomaxprocs
.
done
That is: * Fix description of end2end.go. * Delete NilCtx(), no-longer used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 14 files at r1, 1 of 14 files at r6, 9 of 9 files at r8, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @jiceatscion)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @jiceatscion)
…cionproto#4437) * Enhance the end2end_integration programs as follows: * end2end_integration has a slightly finer-grain role-based pair selection capability. * end2end_integration outputs an extra result line that provides precise start/stop timestamps for the whole test. * process_metrics exposes the number of cores used by Go. * Add end2endblast. Similar to end2end but plays a game other than pingpong: packetblast. It keeps logging and tracing to a minimum. The server side responds to very few of the pings (1/256). Just enough to prove that the circuit works. * Add an integration test called router_benchmark: * relies on an especially crafted topology: router_bm.topo * uses end2end_integration with command end2endblast over that topology as a load test for the router * extracts and logs relevant performance metrics after the test * compares performances with a set of expectations if executed by CI. As a benchmark program, this is only a first approximation. Extending the end2end suite provided a quick solution but isn't ideal: it uses too many cores at once due to requiring 5 routers actively running for one of the tests. More work is in progress to replace this with some leaner setup where only one router is actually running.
Summary of changes:
As a benchmark program, this is only a first approximation. Extending the end2end suite provided a quick solution but isn't ideal: it uses too many cores at once due to requiring 5 routers actively running for one of the tests. More work is in progress to replace this with some leaner setup where only one router is actually running.
Fixes: #4410
Fixes: #4411
This change is