-
Notifications
You must be signed in to change notification settings - Fork 1
POC for wasm map implementation
Crafty is a javascript game framework. Recently(-ish) I tried replacing Crafty's existing spatial map implementation with one written in rust and compiled to web assembly (wasm). This is a quick summary of my experience doing so.
A spatial map is used for broad phase collision detection. It makes checking collisions more efficient, at the cost of doing some extra work of tracking entities as they move around. Crafty uses a simple grid-based map written in javascript.
It was an excellent candidate for rewriting as a wasm module for two reasons:
- Profiling reveals that both updating and searching the map were actual performance bottlenecks
- It's a complicated data structure that can own its own data and provides a relatively simple api for interacting with.
So it seemed both practical and worthwhile to rewrite the module in a systems language that could compile to wasm. I chose rust for this purpose because (a) I knew it had good support for web assembly as a compilation target, and (b) it was a good excuse to learn the language!
There were two bits of work I'm not going to discuss in depth here. (I'm also going to pretend for narrative purposes that I did these tasks in a precise order, when really there was a lot of context switching.)
First I had to clean up the way Crafty interacted with the map. There was some unnecessary coupling in a couple of places, but that was relatively easy to fix. I also had to provide some hooks to let Crafty actually load wasm modules more easily, but since this just for a proof-of-concept this was again not too much work.
The second (more involved) task was to actually rewrite the spatial map in rust. Since I was new to the language there was definitely some churn as I learned how lifetimes and other uniquely rustic concepts worked, but overall it was pretty satisfying. It was definitely a better language than javascript for writing efficient data structures!
Once I was done I had a rust module that provided a struct with essentially the same api as Crafty's own spatial map.
So how do we actually invoke this magical rust implementation from javascript? We write wrappers in both rust and javascript:
- The rust wrapper provides ffi methods that allow for external code to manipulate the spatial map. This will involve handing out pointers to this external code. This rust library can be compiled with web assembly as a target.
- The javascript wrapper provides an api identical to Crafty's current spatial map, and knows how to interface with the methods provided by the wasm wrapper. This also includes the shim that actually loads the web assembly module. (Although I just called through to an existing library provided by Mozilla.)
A disclaimer: I started this work directly before the release of wasm-bindgen. As we'll see, there's some boilerplate involved that it might have helped with, but it's probably worthwhile to see a "from-scratch" implementation.
At a fundamental level, the methods provided by a wasm module know about integers, floats, and booleans. That's it -- you can't pass in javascript objects, or return a string from a method. This led to 3 types of problems I had to deal with in writing these wrappers:
- How to maintain a persistent spatial map and call methods on it
- How to handle the return results of search operations, which in Crafty's current api return an array of entities
- How to handle the ray traversal api, which in Crafty's current api accepted a javascript callback.
The brief summary here is:
- You can pass references to (and ownership of) objects out to javascript, then use pass the pointers back to the rust wrapper to invoke methods
- The javascript wrapper can obtain a reference to an array within the wasm module's memory, which the rust wrapper will use to store the resuts of a search method
- The javascript wrapper can provide methods to the wasm module as imports. The rust library can use such an import to indirectly invoke a js callback. We store the callbacks in an array, letting us pass in the array index through to the rust library and back out to the imported invocation method.
Let's look at these in more detail!
Let's look at the rust struct's methods, as defined by the SpatialMap
trait:
pub trait SpatialMap {
fn update_map<'a, 'b>(&mut self, id:i32, x:f32, y:f32, w:f32, h:f32);
fn remove(&mut self, id:i32);
fn search(&mut self, x: f32, y: f32, w: f32, h: f32) -> HashSet<i32>;
}
Probably the most fundamental operation we can do is to update an entities position within the spatial map.
The equivalent method exposed by the javascript api is: update(id, x, y, w, h)
,
where we pass in the identity of the entity and it's current position and dimensions.
The method doesn't return anything. Essentially a direct match for the rust signature, if you ignore &self
.
But of course, we actually can't ignore that. To call a method on the spatial map, we have to have a reference to it. Let's proceed with the assumption that we do have such a reference, and see how we could provide a rust wrapper that lets us invoke update on a map:
#[no_mangle]
pub extern "C" fn update_map(
map: *mut map::GridSpatialMap,
id: i32, x: f32, y:f32, w: f32, h: f32) {
unsafe {
(*map).update_map(id, x, y, w, h);
}
}
There are three key pieces here necessary for this to work:
- We add the
no_mangle
attribute to the method, which ensures that the method name will be preserved in the wasm module. - We use the modifier
extern "C"
to declare that it obeys the standard C calling conventions required by wasm. - The method takes in a pointer to a concrete implementation of our trait (
GridSpatialMap
), and uses an unsafe block to dereference it and invokeupdate_map
. This (very clearly) isn't guaranteed to safe, but there's not much we can do about that other than being careful about what we pass in.
Great, if we have a reference to the map this will work. But how the heck will our javascript code get a reference in the first place? The strategy I went with was to provide a helper method that creates an instance of GridSpatialMap, and returns a reference to it:
#[no_mangle]
pub extern "C" fn new_map() -> *mut map::GridSpatialMap {
let b = Box::new(map::GridSpatialMap::new());
return Box::into_raw(b)
}
Once we do this the external code owns the map, and would be responsible for (eventually) cleaning it up. Luckily our spatial map will just be a singleton that lasts the entire lifetime of the module, so we can ignore that part. (An alternative approach might be to create a singleton with a 'static
lifetime in the rust library instead. I didn't really explore this, so I'm not sure if it's feasible/advisable or not.)
How do we call this from the javascript side?
First we load the wasm object itself, using a shim provided by Mozilla. The code will look something like this:
function loadMapModule(Crafty) {
fetchAndInstantiate("./rust-module/target/wasm32-unknown-unknown/release/wasm_test.wasm", imports)
.then(mod => makeMapModule(mod.exports, Crafty));
}
function makeMapModule(exports, Crafty) {
// Get the reference to the spatial map
var $map = exports.new_map();
var map_module = {
// wrap the exported update_map method in a nicer js api
update: function(id, x, y, w, h) {
exports.update_map($map, id|0, +obj._x, +obj._y, +obj._w, +obj._h);
}
}
}
When we call fetchAndInstantiate
, it will return a promise that passes through the wasm module. The extern
methods we provided will be available as properties of mod.exports
-- so to create the spatial map and obtain a reference to it we call exports.new_map()
. We then pass it back to any methods that require a reference to the map.
For methods that return and accept simple data types, this is all the work we need to do!
Aside: Now, you might think that passing around raw pointers in memory to javascript code is dangerous. And you'd be right! But web assembly is a bit cleverer than that. Each wasm module owns some block of memory, and any pointers returned by web_assembly methods are really just offsets into this memory. If some malicious code tried to invoke update_map
with a bad pointer, it could cause the code to panic, but it couldn't actually access memory that lived outside of the wasm module.
The above approach is all that's necessary for updating the spatial map. But that's pretty pointless without being able to search it.
The javascript api method looks something like this (although the actual method lets you avoid the array allocation if you want)
search: function(rect) {
var results = [];
// code that searches within an area defined by rect._x, rect._y, rect._w, rect._h
// any entities that match are pushed onto the results array
return results;
}
From our previous rust trait's implementation, the signature was:
fn search(&mut self, x: f32, y: f32, w: f32, h: f32) -> HashSet<i32>;
The issue here is that we need to somehow pass back the results of this method, in such a way that the javascript wrapper can convert it to a javascript array of objects.
It would be sufficient to pass back an array of Crafty entity ids, since they're integers -- then the js wrapper could use those ids to find the corresponding objects. Luckily, javascript code can directly access the web assembly module's memory as a buffer. And remember the earlier note about how pointers returned by web assembly are turned into offsets into the module's memory? That means we can use them as indices into that memory.
The specific approach I used was this:
- Create an extern method that allocates a buffer and passes ownership of it to the javascript code
- Create a
search
method that takes a pointer to such a buffer, and writes the results of the map's search into it, and then returns the number of results. - The javascript code calls search, then iterates through the buffer to find which entity's will be added to the javascript array.
The implementation is something like this on the rust side:
#[no_mangle]
pub extern "C" fn get_result_buffer(capacity: usize) -> *mut i32 {
let mut v = vec![0;capacity];
let ptr = v.as_mut_ptr();
mem::forget(v);
return ptr as *mut i32;
}
#[no_mangle]
pub extern "C" fn search(
map: *mut map::GridSpatialMap,
result_buffer: *mut i32,
x: f32, y:f32, w: f32, h: f32) -> usize {
unsafe {
let result_set = (*map).search( x, y, w, h);
let l = result_set.len();
// a helper method that copies all the entries in the hashset to the array
map_set_to_array(result_buffer, result_set);
return l;
}
}
And on the javascript side, the first iteration looked something like this:
var BUFFER_SIZE = 2000;
var $buffer_ptr = exports.get_result_buffer(BUFFER_SIZE);
var $buffer_view = new Int32Array(exports.memory.buffer, $buffer_ptr, BUFFER_SIZE);
// and then within the definition of the map module's methods
search: function(rect) {
var results = [];
var l = search_raw($map, $buffer_ptr, +rect._x, +rect._y, +rect._w, +rect._h);
for (var i =0; i < l; i++) {
results.push(Crafty($buffer_view[i]));
}
return results;
};
So we call the exported wasm method, which writes the results into the result buffer and returns how many we have. Then the javascript code looks at the very same buffer, finds the associated objects, and returns them as an array.
Because javascript is single threaded, we can avoid unnecessary allocations by reusing the same result buffer each time. We also don't worry about exceeding the buffer, which would go (very) badly if we ever had more than 2000 results from a search. That seemed acceptable for this POC, but in real code you'd want to handle this more gracefully.
Sadly, there was one gotcha here that required some further modifications.
The actual block of memory that the wasm module owns can change (for instance, if it needs to grow.)
Any pointers owned by the javascript code will still be valid, since they're just offsets into that block of memory.
But it causes our Int32Array
to become "detached" (invalid), since it's tied to the actual region of memory.
The solution I went for was to use this helper method instead of direct reference to the buffer, which will detect whether the array is detached:
var $buffer_view = new Int32Array(exports.memory.buffer, $buffer_ptr, BUFFER_SIZE);
function get_buffer_view() {
if ($buffer_view.byteLength === 0) {
$buffer_view = new Int32Array(exports.memory.buffer, $buffer_ptr, BUFFER_SIZE);
}
return $buffer_view;
}
It would be nice if wasm provided a hook that told you when the memory changed, but maybe that's not really practical.
I believe there are some options that give you more direct control over the memory buffer used by the wasm module.
That might provide a different solution to this issue, but since the above fix worked I didn't explore these in depth.
The final complication was the ray traversal api. The javascript api provided by the map looks like this:
traverseRay(Object origin, Object direction, Function callback, Number maxDistance)
The origin
and direction
objects define the starting point and direction of the ray. The spatial map will be
searched for possible candidates until we're maxDistance
from the origin.
The complication here is that the broadphase map doesn't contain the information to find the point of intersection
-- it can only tell us which entities are in the approximate vicinity as the ray. That's where the callback
parameter comes in.
For every entity that is in the vicinity of the ray, the callback is invoked and passed the candidate entity. The callback method can
then check whether the entity actually intersects with the ray. The callback can return true
to terminate the ray traversal;
otherwise, it will continue.
There's a secondary complication related to the way ray traversal works on a spatial grid, specifically when we want to find the first intersection. When we're looking at the entities in a particular region of the map, we don't know what order they're in. And when an entity spans multiple cells of the spatial map, it's possible for it's point of intersection to be in one of the later regions. That's why the callback accepts a second parameter, which will let us tell which "set" the entity belongs to. The callback then is responsible for finding the points of intersection of all the various candidate entities, and only declaring one to be the first point of intersection once the callback is invoked in a way that indicates that the traversal has moved into a new region of the map. This is accomplished by passing in a number that increments as we move to new regions. (Because of the complication here, there are higher level raycasting methods that simplify the above logic. Which is nice, because after swapping out the underlying map implementation we can ensure that these higher level methods still work!)
One natural way to traverse along the ray in rust would be to use an iterator which returns the same information that the callback needs:
// Public method exposed by the map
pub fn traverse_ray(& self, ox: f32, oy: f32, dir_x: f32, dir_y: f32, max_distance: f32) -> impl Iterator<Item=MapTraversalResult> + '_
// A type which captures both the entity id and the "set number"
pub struct MapTraversalResult {
pub id: i32,
pub set_number: i32
}
The invocation here matches the javascript api, with the exception of the callback. If did have a way to invoke the callback, it's clear how to proceed: the wrapper would get the iterator, and pass each result to that method, terminating when the iterator was complete or the callback returned true.
How can we invoke an arbitrary js function from within the rust wrapper? In the first part we saw how the rust wrapper can declare external functions that have corresponding javascript implementations. The solution I went with was to store the callbacks in an array within the js wrapper, and pass in the index to the rust method. Rust could pass both the callback parameters, along with the callback index, back to the js wrapper, which would then locate the correct callback.
So the javascript side setup for handling callbacks looked something like this:
// Define a callback stack that stores our callbacks
var callback_stack = [];
// The method that will allow rust to indirectly invoke the callback by index
env: {
invoke_traversal_callback: function invoke_traversal_callback(
callback_id,
entity_id,
set_number
) {
var e = Crafty(entity_id);
return !!callback_stack[callback_id](e, set_number);
}
}
// And finally, the wrapper method that maps the existing ray traversal signature to our wasm implementation
traverseRay: function(origin, direction, callback, maxDistance) {
var callback_id = callback_stack.push(cb) - 1;
exports.traverse_ray($map, callback_id|0, origin._x, origin._y, direction.x, direction.y, maxDistance);
callback_stack.pop();
}
While the rust side looks like this:
// Declare the signature of invoke_traversal_callback
extern {
fn invoke_traversal_callback(callback_id: i32, entity_id: i32, set_number: i32) -> bool;
}
// The method exported as part of the wasm module that provides ray traversal
#[no_mangle]
pub extern "C" fn traverse_ray(
map: *mut map::GridSpatialMap,
callback_id: i32,
ox: f32, oy:f32, nx: f32, ny: f32,
max_distance: f32) {
unsafe {
let mut traverser = (*map).traverse_ray(ox, oy, nx, ny, max_distance);
// step through the traversal, running the callback on each result
// Return either when the callback returns true, or the traversal is exhausted
while let Some(traversal_result) = traverser.next() {
if invoke_traversal_callback(callback_id, traversal_result.id, traversal_result.set_number) {
return;
}
}
return;
}
}
And that's it! I'm not sure this is the best solution, but it certainly works.
Ok, so how did all this work out in practice? There's some overhead when calls cross the js/wasm boundary, and it was possible that this might dominate over any performance improvements from rewriting the module.
To test this, I created a simple Crafty scene that stress the update
and search
methods specifically. It consisted of a large-ish number of graphically simple entities bouncing around at random.
Each tick of the gameloop, each entity would update it's position, check to see if it was now intersecting other entities, and if so reverse velocity.
Using the existing implementation, I uppded the number of entities until it was extremely laggy on my 12" Macbook in Firefox 61. (~10 FPS for a 30x30 grid of entities)
Profiling indicated that, indeed, two major performance bottlenecks were the search
and update
methods. Then I swapped out implementations.
The resulting FPS for the same number of entities was about 25! That was enough to change it from obviously jerky to tolerably smooth. (And the actual performance improvement for the spatial map calls improved by a substantial amount -- the 2D physics and rendering are responsible for a decent chunk of CPU time.) This is quite a lot better than I'd expected, since it seems to completely move the bottleneck to other parts of the system.
There was an annoying gotcha in finding the new bottlenecks, though. Turning on Firefox's in-browser profiler seemed to have an extreme impact on performance when the wasm module was in use; something about the way it's instrumenting wasm calls made things slow again. In my demo this was visually quite obvious, but it's something to keep in mind if you're trying to use the profiler to see how wasm modules are performing.
Results in a recent version of Chrome were similar, although because Chrome was faster I had to up the base number of entities before slowdown kicked in. You can try this out for yourself here.
So what next? I don't think now is quite the time to directly ship wasm modules with Crafty, but since this was such a performance improvement the way forward seems clear:
- Land the relatively minor changes to Crafty that were necessary to swap out the spatial map
- Provide a nice way for Crafty to load wasm modules, that will automatically fallback to the plain js implementation when wasm isn't available
- Clean up the spatial map module and provide explicit instructions on how to use it. (Possibly includes publishing the underlying spatial map implementation as a cargo crate?). This would also include getting feedback from folk more experienced than I using rust/wasm, since (as mentioned a few times) there might be superior approaches to some of these issues.
It would also be worthwhile exploring what other areas of Crafty could benefit from a wasm rewrite. However, quite likely these would require more intrusive changes.