Broker variant: immutable "data views" with monotonic buffer resources #318

Closed
Neverlord opened this issue Mar 19, 2023 · 2 comments

@Neverlord
Member

I've been thinking about performance recently. That's why I've been revisiting the radix-tree implementation for speeding up filter lookups.

But there's also room for improvement in how we represent messages in memory. We've discussed memory-mappable layouts in the past. With a memory-mappable representation, we would basically read a message from the network and then simply create a sort of wrapper that decodes the bytes on demand. The downside is that creating a memory-mappable format is more complicated and requires dedicated builder APIs. While "deserializing" a value becomes trivial, accessing fields in a memory-mapped data structure can come with some overhead, since data must be decoded on the fly. We would also have to change our network format.
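
To illustrate the decode-on-demand idea (a generic toy, not Broker's format or API): the wrapper keeps a pointer into the received bytes and only decodes a field when it is accessed, which is exactly where the per-access overhead comes from.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy "view" over a buffer holding two native-endian uint32 fields back to
// back. Constructing the view copies nothing; each accessor decodes on the fly.
struct pair_view {
  const std::byte* bytes;

  uint32_t first() const {
    uint32_t value;
    std::memcpy(&value, bytes, sizeof(value));
    return value;
  }

  uint32_t second() const {
    uint32_t value;
    std::memcpy(&value, bytes + sizeof(uint32_t), sizeof(value));
    return value;
  }
};
```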

Instead of going down this road, I think there's also another option that doesn't require us to change the network format. With a monotonic buffer resource and a custom allocator, we can flatten nested data structures like broker::data in memory, reduce the number of heap allocations and skip any destructors (by "winking out" the entire data structure). This is the same technique that makes RapidJSON fast.
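
To make the technique concrete, here is a minimal sketch using standard std::pmr containers and a made-up payload (Broker's actual types would need allocator-aware variants of data, vector, table, etc.):

```cpp
#include <array>
#include <cstddef>
#include <memory_resource>
#include <string>
#include <vector>

int main() {
  // One upfront chunk backs every allocation below instead of one heap
  // allocation per nested node.
  std::array<std::byte, 4096> chunk;
  std::pmr::monotonic_buffer_resource buffer{chunk.data(), chunk.size()};

  // A flattened "list of strings": the vector and any string too large for
  // the small-string optimization draw their storage from the same buffer.
  std::pmr::vector<std::pmr::string> fields{&buffer};
  fields.emplace_back("event_1");
  fields.emplace_back("a string long enough to spill out of the SSO buffer");

  // deallocate() on a monotonic resource is a no-op, so tearing the structure
  // down (or "winking it out" without running destructors at all) boils down
  // to releasing the buffer in one step.
}
```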

To quantify what kind of speedup we could get, I've implemented a small micro-benchmark that uses regular broker::data and a new shallow_data implementation (not fully functional, just the types I've needed for the benchmark). I've picked the name shallow_data because the original idea was that the data would also hold references into the bytes it was deserialized from, to avoid any unnecessary copying overhead. For this benchmark, it made little difference because we only have small strings.

I've picked something small to start with, so I've used a variable called event_1 with this content: (1, 1, (event_1, (42, test))). The benchmark currently only looks at how long it takes to deserialize the data:

----------------------------------------------------------------------
Benchmark                            Time             CPU   Iterations
----------------------------------------------------------------------
broker_data/event_1                556 ns          556 ns      1231354
broker_data/shallow_event_1        185 ns          185 ns      3826998

It's a small data structure, so the runtime is fast either way. However, even for this very small data structure, we see a 3x speedup. Real-world messages will be larger, and when processing thousands of them per second, the performance gain adds up quickly.
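
For reference, a self-contained toy of how such a comparison can be set up with Google Benchmark (the table format above looks like its output). The payload and type names are made up; the real benchmark deserializes broker::data and the shallow_data prototype from serialized bytes.

```cpp
#include <benchmark/benchmark.h>

#include <array>
#include <cstddef>
#include <iterator>
#include <memory_resource>
#include <string>
#include <vector>

// Made-up stand-in for the event_1 payload.
static const char* payload[] = {"1", "1", "event_1", "42", "test"};

static void per_node_heap(benchmark::State& state) {
  for (auto _ : state) {
    // The vector and any non-SSO strings allocate from the global heap.
    std::vector<std::string> value(std::begin(payload), std::end(payload));
    benchmark::DoNotOptimize(value);
  }
}
BENCHMARK(per_node_heap);

static void monotonic_buffer(benchmark::State& state) {
  std::array<std::byte, 1024> chunk;
  for (auto _ : state) {
    // All storage comes from one stack-backed chunk per iteration.
    std::pmr::monotonic_buffer_resource buffer{chunk.data(), chunk.size()};
    std::pmr::vector<std::pmr::string> value{&buffer};
    for (const char* field : payload)
      value.emplace_back(field);
    benchmark::DoNotOptimize(value);
  }
}
BENCHMARK(monotonic_buffer);

BENCHMARK_MAIN();
```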

I would leave broker::data untouched and use the new "flattened" representation for the message types. Of course there'll be faster ways to do things in the new API. We might keep broker::data in for convenience or eventually fade it out. In the transition phase, I think we can keep the API backwards compatible by converting to "regular" broker::data where needed and otherwise keep the migration overhead minimal. We wouldn't touch the network format or the JSON representation. We can also make this transparent to the Python bindings, if we don't remove them before that.

@Neverlord
Member Author

Quick recap of where we are.

In #368, I had a bare-bones integration of the new broker::variant. I measured a significant speedup with it:

(for broker-benchmark), we get an increase of ~40% in events/s (or an additional ~200k events/s in absolute numbers).

Of course that's just a Broker benchmark. I'll port Zeek to the new API next and then see how the full system behaves. We can't really do a maximum-throughput measurement with Zeek, because Zeek doesn't use backpressure. So I'll probably need some help putting together a meaningful benchmark with Zeek. Maybe we can throw a fixed rate at Broker and just compare CPU/memory load?

Aside from switching to broker::variant for messages, we also have a compatibility header for Zeek that describes how Zeek events and misc. other types are "rendered" in Broker. The API for this was rather low-level, unfortunately, requiring users to fiddle with broker::data directly. To make the API more high-level, I've opened #376. This basically makes it easier for Zeek to use the wrapper types while hiding the underlying low-level details (allowing us to swap out broker::data for broker::variant).

A somewhat orthogonal yet related PR is #377. It has been a goal for a while now that the wire protocol and format are "self-hosted". Broker has its own wire protocol now, but still uses CAF to generate the binary representation of objects on the wire. The PR adds a formatting mini-library to Broker that we can re-use for variant.

On the Zeek side, zeek/zeek#3354 cleans up various places where Zeek used broker::data directly (outside of Zeek's Broker Manager).

All of these PRs are refactorings that may change APIs but don't change behavior (or performance). So these are the "low-risk" parts of this whole operation.

With these puzzle pieces in place (assuming we get them merged soon), I can start making an integration branch that adds broker::variant and uses the new envelopes for Broker's internal messaging. This is basically done on the Broker side; it just needs some adjustments. Most of the remaining work is on the Zeek side now, since Zeek needs to learn how to convert between zeek::Val and broker::variant. Most of the heavy lifting for that is already done, though, since we can re-use most of the scaffolding for broker::data. I plan to keep Zeek functional with the non-variant Broker (via an #ifdef switch, roughly as sketched below). This will allow us to switch back to "old Broker" if something goes horribly wrong with the new code.
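
The switch could look roughly like the following; the macro and header names here are assumptions for illustration, not the actual spelling in Zeek or Broker:

```cpp
// Hypothetical compile-time switch between the two Broker value types.
#ifdef ZEEK_USE_BROKER_VARIANT
#  include <broker/variant.hh> // header name assumed
using BrokerValue = broker::variant;
#else
#  include <broker/data.hh>
using BrokerValue = broker::data;
#endif

// Conversion code in Zeek's Broker manager then targets BrokerValue, so
// flipping the macro falls back to the old code path if the new one misbehaves.
```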

Regarding broker::data: it will live alongside broker::variant for at least a while.

Zeek's RecordVal is one of the places where we are going to keep using broker::data for now, because the records are mutable (and broker::variant is immutable). Long-term, we might be able to use a builder here. The tricky part is removing something from a builder again: currently, they are append-only (see the sketch below). The question is whether tuning RecordVal would be worth it for Zeek. I have no numbers on that yet (nor do I know how frequently broker::data-backed record values are used in Zeek scripts).
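
To make the append-only constraint concrete, here is a toy builder in the spirit of the description (not Broker's actual builder API): fields can only be appended, so "removing" one means rebuilding the whole value.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy append-only builder; illustrative only.
class record_builder {
public:
  record_builder& add(std::string field) {
    fields_.push_back(std::move(field));
    return *this;
  }

  // Finalizing consumes the builder; the result is immutable afterwards.
  std::vector<std::string> build() && {
    return std::move(fields_);
  }

  // Deliberately missing: remove() or overwrite. A mutable RecordVal backed
  // by such a builder would have to rebuild on every field deletion.

private:
  std::vector<std::string> fields_;
};
```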

The other part where I didn't integrate broker::variant is the data store API. One reason is that data stores also allow mutating the values. The other reason is that I'm not sure whether it's worth the effort (again), since I don't know how performance-critical data stores are for Zeek users in the first place. If tuning the data stores does not remove (or at least mitigate) a bottleneck for relevant use cases, then we might just continue using broker::data there.

@Neverlord Neverlord moved this to In Progress in Zeek 6.1 Oct 8, 2023
@ckreibich ckreibich removed this from Zeek 6.1 Oct 24, 2023
@Neverlord Neverlord changed the title Consider using "data views" with monotonic buffer resources Broker variant: immutable "data views" with monotonic buffer resources Nov 11, 2023
@bbannier bbannier moved this to In Progress in Zeek 6.2 Jan 30, 2024
timwoj added a commit that referenced this issue Feb 20, 2024
* issue/gh-318:
  Remove obsolete code
  CI: Use a python venv in test.sh to avoid pip errors on newer platforms
  CI: Pin Windows openssl to 3.1.1
  CI: Fix building of alpine image with regards to python packages
  Implement new variant type and builder API
@Neverlord
Member Author

All done.

@github-project-automation github-project-automation bot moved this from In Progress to Done in Zeek 6.2 Feb 23, 2024