WebAssembly Web Engines Hackfest 2021
(Conrad presents)
- Video: https://www.youtube.com/watch?v=MN5DIuw6KsE&list=PL4sEzdAGvRgCTD9dJtGHTpeh9W6Wx7aLX
- Slides: https://webengineshackfest.org/2021/slides/irreducible-control-flow-in-webassembly-by-conrad-watt.pdf
[Ross] It looks like the Interface Types fusion algorithm will in fact introduce irreducible control flow (with exceptions) on the engine side, so engines will likely have to handle irreducible control flow somehow anyway should Interface Types be adopted.
[pmatos] Is multiloop being worked on as part of the GC proposal?
[Conrad] You could think of it as a stage zero proposal. Orthogonal to any proposal that doesn't introduce new control flow.
[pmatos] Would this make the funclets proposal unnecessary?
[Conrad] I think multiloop is an evolution of the funclet proposal. Multiloop introduces fewer new instructions.
[Brian] I really like the name funclet. Any other questions or comments?
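For readers new to the term used in this discussion: a control-flow graph is irreducible when a loop can be entered at more than one point, which WebAssembly's structured `block`/`loop`/`if` constructs cannot express directly. The following is a minimal Python sketch of the classic T1/T2 reducibility test. The function name and the example graphs are illustrative assumptions, not taken from the talk, and the sketch assumes every node is reachable from the entry.

```python
def is_reducible(cfg, entry):
    """T1/T2 test. cfg maps each node to the set of its successors.

    Every node must appear as a key and be reachable from `entry`.
    """
    succ = {node: set(targets) for node, targets in cfg.items()}
    changed = True
    while changed and len(succ) > 1:
        changed = False
        # T1: delete self-loops.
        for node in succ:
            if node in succ[node]:
                succ[node].discard(node)
                changed = True
        # T2: fold a non-entry node with exactly one predecessor into it.
        for node in list(succ):
            if node == entry:
                continue
            preds = [p for p in succ if node in succ[p]]
            if len(preds) == 1:
                pred = preds[0]
                succ[pred].discard(node)
                succ[pred] |= succ.pop(node)
                changed = True
                break
    # Reducible graphs collapse to a single node.
    return len(succ) == 1


# A loop with a single entry point (the header "h"): reducible.
single_entry_loop = {"entry": {"h"}, "h": {"b", "x"}, "b": {"h"}, "x": set()}

# A loop between "a" and "b" that can be entered at either node: irreducible.
two_entry_loop = {"entry": {"a", "b"}, "a": {"b"}, "b": {"a"}}
```

The second graph is the shape that proposals such as funclets or multiloop aim to let producers express without first restructuring the code.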
(Ryan Hunt presents)
- Video: https://www.youtube.com/watch?v=lnGj_3E2yBs&list=PL4sEzdAGvRgCTD9dJtGHTpeh9W6Wx7aLX
- Slides: https://webengineshackfest.org/2021/slides/a-tour-of-compiling-a-webassembly-module-by-ryan-hunt.pdf
[Brian] Thank you. Any questions for Ryan?
[Ross] That was the main thing I was hoping to learn. How will function references work in OO languages?
[Ryan] I don’t have a great answer to how much overhead there is in jumping from baseline to table to optimized code. Right now we don’t implement a call_ref path; it’s all through an indirect call and a table. It’s currently pretty inefficient, but we don’t have the optimization details figured out yet.
[LH] Seems plausible we could stop all the threads and tier up. We could also backpatch, but that’s more speculative.
[Ross] We’ve found that things in the call pipeline can change performance by up to 50%.
[Brian] We’re collecting session ideas.
[Andy Wingo] The C++ and GC (garbage collection) session is my idea. Hoping we can discuss polyfills, GC along the toolchain, and higher-level goals like where we could go in terms of reference types in GC.
[Asumu] I’m proposing a session to talk about GC in the JS API, how ergonomic we want it, what kind of JS usability we want to consider, talk about what’s been proposed thus far.
[Thomas Lively] Is it expected that each person will attend just one breakout session?
[Andy W] It will depend on how many we have and how much time we have.
[Thomas] Would anyone be interested in talking about module splitting or multivalue in the tools?
[Andy W] Oh yeah! I’m curious to just learn what they are.
[Thomas L] For the multivalue thing, it would be good to know if anyone has a good use case for using multivalue in toolchains. We floated the idea of a new C ABI, but so far that’s been low priority. If anyone feels that should be higher priority, I’d love to hear why. For module splitting, we have a new tool, wasm-split, that decomposes modules so you can take a bunch of functions and split them into a secondary module that can be lazily loaded automatically as needed. It’s kind of hard to use right now; mostly it’s just for profile-guided splitting, where you split all the uncalled functions into a secondary module to reduce the size of the primary module. I’m interested in allowing code annotations to mark which functions should be split out to the secondary module.
[Brian] Want to note we have 21 people on the call and 4 session ideas on the pad, but only 6 people have indicated their interests. So if you haven’t indicated your interest on the pad, please do so now.
[paulo] Is this related to your past talk on decomposition for C++ modules (https://cdnapisec.kaltura.com/index.php/extwidget/preview/partner_id/520801/uiconf_id/31230141/entry_id/1_rob8glb6/embed/dynamic)?
[Thomas L] Yes it is.
[Ioanna] I’d like to talk about advancing exception handling to phase 4 and beyond.
[Andy W] Are you thinking of identifying blockers and such, or talking to people who work on standards?
[Ioanna] Yes, we could be in stage 3 soon.
[Thomas L] I’d certainly be interested in that.
[Andy W] Seems like most ideas are in regard to the toolchain and standards. I think those are natural areas on which to have breakout groups. Don’t know if there are useful implementation-level areas to discuss, but if so, don’t hesitate to propose one.
[Brian] Looks like we have five proposals now, with a lot of interest in the two GC-related sessions. Anyone have a last-minute proposal?
[Thomas L] One room for GC, since everyone seems interested, then separate rooms for the other topics.
[Andy W] Sounds good, we’ll get some rooms created.
[Brian] We can use this room for the GC talks, and we’ll set up separate rooms for the post-GC breakout session.
(discussion about how to arrange the rooms)
[Brian] Take a 5-10 minute break for coffee and whatever and we’ll come back here for the GC discussion?
[Andy W] Can do.
[Brian] That session will go about 40 minutes?
[Andy W] Yes, and then split into a couple of 20-minute sessions after that. Given group size, we don’t need a big report in unconference style, just a thanks and bye.
(break)
- State of GC in toolchain: clang, llvm, lld, binaryen, emscripten
- State of polyfills / transpilers based on linear memory or bare externref/funcref
- Ref C++ prototyping
- Other things??
Minutes/Notes:
Andy: Looking at the state of the world. The interest in GC and the toolchain is in interacting with the DOM and solving cycle problems, especially with external GC heaps. Emscripten is LLVM + binaryen + stdlib + glue scripts. Walking up the stack starting with Emscripten: where are we with integrating reftypes into the standard toolchain that we have? I might impose on Thomas to explain where we are in Emscripten and Binaryen.
Thomas: As Andy mentioned, Emscripten is LLVM + binaryen + Python driving it all. Emscripten has nothing to do with the GC proposal so far. Looking forward, it might get some GC proposal support, maybe; possibly not. It's really a separate layer of abstraction: C++ calls imported functions implemented in JS. The ABI might need to change to use the GC proposal's JS API, perhaps, but mostly things wouldn't change that much. That leaves binaryen and LLVM. Binaryen has a bit of activity with the GC proposal: it has a full implementation of the currently specified milestone (there's a Google doc floating around), which specifies a snapshot of the MVP that V8 implements and SM is implementing as well. We've been doing lots of prototyping.
Andy: What does it mean for binaryen to have full support for GC?
Thomas: Binaryen is in a unique place because it's both a consumer and a producer of wasm. It needs to decode, validate and ingest wasm, do lots of transformations on it, and then emit it. Full support means it parses all the instructions and emits all of them. Alon is doing a lot of optimizations on this.
Andy: Can you share a bit more about use cases?
Thomas: We are working with the J2CL team. J2CL is an open source Java to JavaScript compiler that Google uses a lot, and they are working on a Java to Wasm path. They produce unoptimized modules, and binaryen takes those and optimizes them. We are at a point where we are starting to compile real-world codebases. Another team is Dart, where we are also starting to compile real-world codebases.
Andy: ?
Thomas: Binaryen is an integral part of Emscripten, but it's also a general tool that you don't need to use with Emscripten. We expected it to be a general backend for languages compiling to wasm, but only more recently has this started to happen.
Ross: One thing I would be interested in is people's thoughts on this: the GC proposal requires people to rewrite their runtime to use the GC proposal, or to use binaryen.
Thomas: Ultimately the runtime in J2CL is calling JS to get stuff done. Similarly for Dart. I don't have a good answer in the general case.
Andy: Let's see how it would work in the general case, the interaction with the GC proposal. The issue on the LLVM side is support for the reftypes proposal. There's support for reftypes in the linker and at the lowest level (in assembly), so today the way to incorporate reftypes in your project is to write LLVM-specific asm. There are builtins to expose table.get, table.set, etc. Ultimately we would like to reference DOM values from C++, so we need to bridge the gap between LLVM IR and the frontend. Thomas pointed out that the problem with reftypes is that externref and funcref values can't be accessed like normal values. Scalable vectors on ARM are similar: they don't have a size that can be reasoned about, and ARM has an extension to C that restricts what can be done with these types. What Paulo and I have been working on is getting this implemented all the way through LLVM. In IR these values are represented as pointers to opaque types in non-integral address spaces, using what C++/CLI did as an idea and direction.
Lars: This seems like any kind of structure that any runtime would use, right?
Andy: There's a lot of impedance mismatch with the LLVM toolchain. If we can make it understandable to users when they use the values incorrectly, that would be sufficient for me.
Thomas: there's also going to be a lot of mismatch with the llvm backend. For example, it assumes there's a finite number of types floating around and the gc proposal assumes it could be unbounded. Solving these problems in an upstreamable way is going to be quite a challenge.
Ross: I gave a talk about allowing C++ to have a GC which allows ???
Andy: Would be interested in a reference to that talk and would like to learn more about it.
Ross: will work on writing it up.
Andy: How does it work with externrefs?
Ross: The idea is that you have a library that allows you to be able to store the externref and the GC will have to cooperate with the library. Same on the JS side.
Thomas: As Andy mentioned, there have been proposals to handle these cycles which don't require changes to the JS GC. Once the GC proposal is settled, my team will be eager to investigate those possibilities. The overhead of the user-space solutions is unknown. If it's too high, then we need to investigate collaborating with the host GC.
Andy: That might be a good segue. If we consider languages that would like to produce WebAssembly + GC, are there things right now that can compile down to WebAssembly without GC?
Thomas: I don't think it would be hard to do but nobody has done it yet.
Ross: You could use your own GC, but it just wouldn't catch your cycles.
Daniel: I am Daniel, working on AssemblyScript. In AssemblyScript we really want GC. We invented our own runtime thing to compile a GC like Go's to WebAssembly directly. What you want is for binaryen to have all the type information. We want to use GC types and instructions. Even if that's not supported in an engine, binaryen could optimize that...
Thomas: Is the only thing blocking that the fact that GC types are not exposed?
Daniel: We would need the C API, and if we have a GC array we can only copy one byte at a time. We could polyfill, but then binaryen cannot reason about it.
Ross: Why do you need that?
Daniel: If binaryen knows that, then it can do clever optimizations.
Thomas: We are thinking about GC-specific optimizations, like type pruning. We have not implemented it yet, but it is definitely on our roadmap. On the type system in binaryen: the internals are in flux, but the user interface should be pretty stable. I can expose that to the C API and you can take it from there.
Ross: at half way point...
Andy: I think we covered a lot of ground here - so it's great.
Interested: wingo, pmatos, tlively, ross, conrad, luke, dbezhetskov, lars, ryan, ioanna
- Compelling use cases for multi-value in toolchains?
- New C ABI that takes advantage of multi-value
- wasm-split can break a module into multiple parts
- Lazy loading / resume
- Right now profile-driven, but hard to use. Future directions with source annotations? DX?
- Related to prior work presented at SOIL seminar
URL: https://meetings.igalia.com/weh2021webassembly2
Thomas: The use case of multivalues has two parts. They can take and produce an arbitrary number of values. Block structure is not user-visible.
Size optimizations in binaryen (2 years ago) gave less than 2%.
There is a secret flag in clang that allows using multivalues in IR.
Andy: multivalues is hard but not useful.
Thomas: We could get small speed and size optimizations by changing the ABI. Rust doesn't have a stable ABI, so it could start using it. Alex mentioned there are some WASI use cases, if you have a WASI std function...
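To make the ABI point concrete: under the existing C ABI for wasm, a function that returns a small struct typically writes it through a caller-supplied pointer into linear memory, while a multi-value ABI could hand the fields back directly. The Python sketch below only simulates the two conventions; `memory`, `divmod_sret` and `divmod_multivalue` are hypothetical names chosen for illustration.

```python
import struct

# Stand-in for wasm linear memory (an assumption for this sketch).
memory = bytearray(64)


def divmod_sret(ret_ptr, a, b):
    """Pointer-return convention: store two i32 results at ret_ptr."""
    struct.pack_into("<ii", memory, ret_ptr, a // b, a % b)


def divmod_multivalue(a, b):
    """Multi-value convention: both results come back directly."""
    return a // b, a % b


# Pointer-return caller: pick a result slot, call, then load both fields.
divmod_sret(16, 17, 5)
q1, r1 = struct.unpack_from("<ii", memory, 16)

# Multi-value caller: no stores or loads through memory are needed.
q2, r2 = divmod_multivalue(17, 5)
```

The saving is the removed store/load pair and the stack slot, which is consistent with the modest gains mentioned in the discussion.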
Thomas: Basically, the implementation in binaryen is the same as in the presentation. We have a file called module-splitting.cpp that... A partner wants to use this.
Splitting works by inserting a layer of indirection through the table. Call the calls before call_indirect...
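The table indirection described here can be illustrated with a small simulation. This is a sketch under assumptions, not wasm-split's actual code: a Python dict stands in for the wasm table, and `load_secondary`, `make_placeholder` and `call_indirect` are hypothetical helpers. The idea shown is that a split-out function's table slot initially holds a placeholder which loads the secondary module on first use, patches the table, and forwards the call.

```python
loader_calls = []  # records how often the secondary module was loaded


def load_secondary():
    """Hypothetical loader: returns the functions moved to the secondary module."""
    loader_calls.append("load")
    return {1: lambda x: x * 2}


table = {}  # stand-in for the wasm function table


def make_placeholder(index):
    def placeholder(*args):
        # First call: load the secondary module and patch every slot it provides.
        for slot, fn in load_secondary().items():
            table[slot] = fn
        # Forward the original call to the real function now in the table.
        return table[index](*args)

    return placeholder


table[0] = lambda x: x + 1      # function kept in the primary module
table[1] = make_placeholder(1)  # function split out to the secondary module


def call_indirect(index, *args):
    """Calls to split-out functions go through the table instead of directly."""
    return table[index](*args)
```

After the first `call_indirect(1, ...)`, the slot holds the real function, so later calls pay only the cost of the indirect call.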
The asyncify use case is very important, where you have to resume in the middle of a loop. Similar to the goroutine use case, the answer is a stack switching proposal. I haven't seen any use cases that would benefit from multiloop that aren't better handled by stack switching.
multiloop is like multi-value.
Interested: tlively, wingo, pmatos, dcode
- JS API is still in early stages. Could discuss goals, review of state of casts/RTTs, API proposals (MVP, V8).
URL: https://meetings.igalia.com/weh2021webassembly
Interested: asumu, dcode, ross, conrad, luke, dbezhetskov, lars, ryan, ioanna
asumu presents (url here asumu) to start the topic
ross: I guess the question I have is what kind of interop do people want to have rather than how to have it.
lars: The ergonomics of how we have it is also less important to us.
Conrad: I think littledan had some opinions on this.
lars: I guess the google people had a number of proposals at sort of different levels, but I think less is probably fine for the time being
Daniel V: AssemblyScript is pretty close to JS, so the easier it is to mix and interface the better for us. As long as it compiles to both worlds. My particular pet peeve is what to do with strings - we'd like to just pass them by reference. What we would want is something like being able to instantiate a string that we pass in for static data or dynamic strings and use that across - but then also we have different string encodings we are compiling between. What do we want the engine to aid here - like caching intermediate results.
Littledan: I am excited to hear this conversation because it can reduce the amount of copying. Starting with UTF-16 might make sense: many languages have that in common, and it could remove a large amount of serialization if it were adopted. I would support that, though I am willing to accept if it is after. I think it would be useful on both sides to not have 2 paths for how things are represented as objects. It increases the surface area of things that have to be maintained if they are separate. I think it would be ideal if they were pretty high level and ergonomic to use; otherwise wrappers and marshalling really can incur performance costs. I liked the proposal the V8 team had. Being able to do that does require something more like nominal typing. Their proposal did have this information, but there are also other ways to do this. It could be in a custom section, that would have to be in a particular type at a nominal index. Or it could be other metadata, but I think this area is really important for us to flesh out more.
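As a small illustration of the copying concern raised above: when the two sides of a boundary store strings in different encodings, passing a string across means decoding one layout and re-encoding into the other. The Python sketch below only shows that the byte layouts differ and that the conversion produces a new buffer; the sample text is an arbitrary assumption.

```python
text = "na\u00efve caf\u00e9"  # 10 characters, two of them non-ASCII

# Layout as UTF-16 code units, which is how JS strings are defined.
utf16 = text.encode("utf-16-le")

# Layout a UTF-8 based language would keep in linear memory.
utf8 = text.encode("utf-8")

# Crossing the boundary: decode one layout, re-encode as the other (a full copy).
converted = utf8.decode("utf-8").encode("utf-16-le")
```

If both sides agreed on one representation, or could share a reference to the host string, the conversion step would disappear.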
conrad: There is a kind of layering question about whether there is a way to not care about the concrete layout of the object in memory
littledan: it is hard for me to think through at that level of abstraction where you would want the ..?
ross: Often you have your objects compiled to WebAssembly, and you have internals. Would you like some of those to be concealable, or do you want JS to have access to everything?
Daniel W: If we had private and public fields that would be pretty good. Having some encapsulation would be ideal.
littledan: my understanding was that private should be supported
conrad: I don't think it would give you private fields in the sense that a user would expect
ross: it's good for completely abstract types. There is an open issue that people don't know how to compile this. Besides ergonomics, if the JS has to have access to everything it is hard to do encapsulation. I'm not saying impossible, just hard.
littledan: I'm not sure which kind of encapsulation we mean -- strong among other web assembly modules or from js?
ross: with type imports you can get access to others' internals.
littledan: I mean this encapsulation in wrapping something in private
conrad: You can actually tunnel through that abstraction though. This is part of the question: which kind of encapsulation are we talking about, and which do we desire?
littledan: but you are not talking about reading values, right?
conrad: if you are talking about structural types, it is hard to say you are forging but yes, kind of
littledan: (scribe missed it)
conrad: It is slightly analogous to, though not completely the same thing as, a reflection API.
littledan: I'd like to learn more about this forgeability of encapsulation in type imports, with https://github.com/WebAssembly/proposal-type-imports/blob/master/proposals/type-imports/Overview.md#private-types-and-casts
asumu: I posed an issue that is linked on the slides (asumu fill in)
luke: Distinguishing type imports from this private is necessary. Nobody can name it if it is private, if you use private as proposed it does give you what you expect
conrad: I might have misunderstood. For clarity, importing a structural type as abstract can be tunnelled through, but if you use the link Dan provided (generative private types) that does work
luke: I think one of the very basic features is can I set the prototype such that wasm can't see it but JavaScript can - a lot of things flow from that. You can do a lot of things, you don't need to create named own properties if you can add accessors. The prototype tends to be hanging off the internal hidden class - you are mostly on the way to setting the shape. If I can answer the question of how to set the prototype, a lot of other things fall out of that. I wrote some of this up about 3 years ago, at the time the idea was contentious, but with time maybe RTT is a better way.
littledan: asumu and I are thinking more about typed/fixed shape objects in JavaScript - if anyone wants to talk with us more about this offline, we'd be happy to continue it.