JavaScript Web Engines Hackfest 2021
Time (UTC) | Activity |
---|---|
16:00 - 16:10 | Introduction to the breakout session |
16:10 - 17:00 | JavaScript Intl features Birds of a Feather (facilitators: Ujjwal Sharma, Shane Carr, and Caio Lima) |
17:00 - 17:10 | Break |
17:10 - 18:00 | Decimal discussion (presentation by Caio Lima) |
18:00 - 18:10 | Break |
18:10 - 19:00 | Realms discussion (presentation by Leo Balter) |
19:00 - 19:10 | Break |
19:10 - 20:00 | Record and Tuples discussion (presentation by Rick Button) |
[Ujjwal S] The idea here is to brainstorm ideas and ways to fill in any voids in `Intl`. Internationalization (I18N) is the general act of building tools that can adapt themselves to the context they’re in by using localization (L10N). `Intl` is specified in ECMA-402, and hopefully by the end of today’s discussion there will be things we could include in this API.
[Ujjwal] This is already an in-progress proposal. One of the things I think people get hurt by a lot is units. For example, the US imperial system versus metric in the rest of the world. In some places like India, people use a complicated mix of units. Driving distance is in metric, but height is imperial. Air temperature is in Celsius but body temperature is Fahrenheit. There’s a proposal for smart units (https://github.com/tc39/proposal-smart-unit-preferences), and that’s an example of the kind of problem `Intl` is trying to solve.
[Caio L] I personally think it’s quite hard to know how to measure things. This is the first I’m hearing of this proposal, but it’s good news for me as a programmer. There are lots of ways of measuring areas in Brazil, it can vary by state. Dealing with that kind of thing as a programmer can be very hard. Global applications would really benefit from this proposal.
[Shane C] https://github.com/unicode-org/cldr/blob/6c8b6e35307adc4beb7970e1c1f0c16f7c209027/common/supplemental/units.xml#L272 shows a survey of which units or sets of units should be used for various things. If you’re interested in this feature, it’s useful to post to the repository or comment on existing issues to express support. There’s a lot of API surface for this kind of thing, because you need both unit preferences and a unit converter, which is quite complicated. You have to deal with mixtures of units, and also rounding. We could either design an API that exposes all that, or one that does it all under the hood, where you give it one value and get back a converted string. The main thing you can do to help is express your support.
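To make the second design option concrete, here is a minimal sketch of the "give it one value and get back a converted string" shape. The preference table and the `formatDistance` helper are hypothetical and hand-written for illustration; they are not part of the smart-units proposal or of CLDR data. Only `Intl.Locale` and the `unit` style of `Intl.NumberFormat` are existing APIs.

```javascript
// Hypothetical, hand-written preference table. Real data would come from CLDR.
const ROAD_DISTANCE_UNIT_BY_REGION = { US: 'mile', GB: 'mile' };
const METERS_PER_UNIT = { kilometer: 1000, mile: 1609.344 };

// Hypothetical helper: takes a distance in meters and returns a localized
// string in the unit assumed to be preferred in the locale's region.
function formatDistance(meters, locale) {
  const region = new Intl.Locale(locale).region; // undefined if the tag has no region
  const unit = ROAD_DISTANCE_UNIT_BY_REGION[region] ?? 'kilometer';
  const formatter = new Intl.NumberFormat(locale, {
    style: 'unit',
    unit,
    maximumFractionDigits: 1, // the rounding policy is also an assumption
  });
  return formatter.format(meters / METERS_PER_UNIT[unit]);
}

console.log(formatDistance(10000, 'en-US')); // distance shown in miles
console.log(formatDistance(10000, 'en-IN')); // distance shown in kilometers
```

Even this toy version has to pick a preference source, a conversion factor, and a rounding policy, which is the API surface the discussion refers to.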
[Dan E] Maybe we could step through some examples of the APIs?
[Ujjwal] MDN has a good `Intl` page we can go over. Let’s look at `NumberFormat`. One example is formatting a currency. If you have an amount of money, there could be different rules for currency symbols and their placement, and for thousands and decimal separators. Or even the grouping strategy in numbers. You could commit all these variations to memory, but at some point it makes more sense to use an API like this. `NumberFormat` really lets you customize how a number is presented. Unit length, for example.
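As a small illustration of the currency case, the sketch below formats one amount for three locales using the standard `Intl.NumberFormat` constructor. The amount and the choice of locales are arbitrary examples, and the exact strings depend on the locale data shipped with the engine.

```javascript
const amount = 1234567.891;

const usd = new Intl.NumberFormat('en-US', { style: 'currency', currency: 'USD' });
const eur = new Intl.NumberFormat('de-DE', { style: 'currency', currency: 'EUR' });
const inr = new Intl.NumberFormat('en-IN', { style: 'currency', currency: 'INR' });

console.log(usd.format(amount)); // symbol first, comma groups, period as decimal separator
console.log(eur.format(amount)); // symbol last, period groups, comma as decimal separator
console.log(inr.format(amount)); // Indian grouping: last three digits, then pairs

// formatToParts exposes the pieces (currency, integer, group, decimal, fraction)
// for callers that need to style them separately.
const partTypes = usd.formatToParts(amount).map((part) => part.type);
console.log(partTypes);
```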
[Dan E] One possible new feature to request would be more kinds of features.
[Ujjwal] It’s mostly straightforward and intuitive. Another thing I really like is `PluralRules`. Plurals are fairly easy in English, but if you go into other languages, it gets more complicated. Arabic has many plurals, for example. Welsh also has many plural rules. The general idea around the API is you can leverage these things and plug them into each other to create an interface on the web that looks more natural to the user. An example is if you want to say “17 apples” you can plug the “17” into `NumberFormat` to get the right number representation and `PluralRules` to know how you should pluralize “apples”.
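The “17 apples” example can be sketched as follows. The `APPLE_FORMS` table and the `describeApples` helper are hypothetical; a real application would load translated strings per locale and would need one entry per plural category that locale uses.

```javascript
// Hypothetical message table for English only.
const APPLE_FORMS = { one: 'apple', other: 'apples' };

function describeApples(count, locale = 'en-US') {
  const number = new Intl.NumberFormat(locale).format(count);
  const category = new Intl.PluralRules(locale).select(count); // "one" or "other" in English
  return `${number} ${APPLE_FORMS[category]}`;
}

console.log(describeApples(1));    // "1 apple"
console.log(describeApples(17));   // "17 apples"
console.log(describeApples(1700)); // "1,700 apples"

// Arabic distinguishes six categories, so a two-entry table is not enough there.
const arabicCategories = new Intl.PluralRules('ar').resolvedOptions().pluralCategories;
console.log(arabicCategories);
```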
[Caio] You mentioned `NumberFormat`. Could we talk about what we’d like to extend or add?
[Shane] Yeah, I have a slide deck I can show.
(pulls up slide deck)
[Shane] Let me give you the highlights. One big feature request we get is for number ranges. That’s being proposed. Another big request is control over the grouping strategy. A lot of developers want to show grouping separators in a localized way. Rounding priority and increment have been proposed, so you can round to the nearest 5 or 50 or what have you. Control over trailing zero display, especially for currency. We’re fixing precision of string-to-number. Control over negative-sign display.
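For readers who want to try these features: the items above were proposals at the time of the session. The sketch below uses the option names that were later standardized as part of `Intl.NumberFormat` v3, and it feature-detects because older engines lack `formatRange` and silently ignore the newer option values. The sample numbers are arbitrary.

```javascript
// Older engines do not have formatRange and ignore the newer options.
const hasNumberFormatV3 = typeof Intl.NumberFormat.prototype.formatRange === 'function';
const v3Examples = {};

if (hasNumberFormatV3) {
  const usdOptions = { style: 'currency', currency: 'USD' };

  // Number ranges.
  v3Examples.range = new Intl.NumberFormat('en-US', {
    ...usdOptions,
    maximumFractionDigits: 0,
  }).formatRange(3, 5);

  // Rounding increment: round to the nearest 0.05.
  v3Examples.increment = new Intl.NumberFormat('en-US', {
    ...usdOptions,
    roundingIncrement: 5,
  }).format(11.21);

  // Trailing zero display: drop ".00" on whole amounts only.
  v3Examples.wholeAmount = new Intl.NumberFormat('en-US', {
    ...usdOptions,
    trailingZeroDisplay: 'stripIfInteger',
  }).format(5);

  // Grouping strategy: only group when a group would have at least two digits.
  v3Examples.smallGroup = new Intl.NumberFormat('en-US', { useGrouping: 'min2' }).format(1000);
  v3Examples.largeGroup = new Intl.NumberFormat('en-US', { useGrouping: 'min2' }).format(10000);

  // Negative-sign display: do not show a sign on negative zero.
  v3Examples.negativeZero = new Intl.NumberFormat('en-US', { signDisplay: 'negative' }).format(-0);
}

console.log(v3Examples);
```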
[Dan] I heard your team is working on a transition to Rust?
[Shane] Yeah, let me answer presentation questions first.
[Kyle] I’m surprised to see rounding isn’t a thing in the number format yet. What happens right now with rounding? Does it follow number rounding rules?
[Shane] Current behavior is `halfExpand` and that’s not configurable.
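A short illustration of what `halfExpand` means in practice: ties round away from zero. The comparison with `Math.round` is added here for contrast and was not part of the discussion.

```javascript
const wholeNumbers = new Intl.NumberFormat('en-US', { maximumFractionDigits: 0 });

console.log(wholeNumbers.format(2.5));  // "3"  (half-even rounding would give 2)
console.log(wholeNumbers.format(-2.5)); // "-3" (ties move away from zero)
console.log(Math.round(-2.5));          // -2   (Math.round sends ties toward +Infinity)
```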
[Shane] Let me pull up a slide about ICU4X. This slide shows the “I18N stack”. From bottom to top: Unicode->CLDR->ICU->Browser Engine->ECMA-402->Application. ICU is written in C++ and is a large library with a lot of features. Being large has a large impact on the size of apps or browser engines. It actually becomes a big part of the app size in most cases. ICU4X is a rewrite of core pieces of ICU in Rust with the idea of being more modular and pluggable. SpiderMonkey will be one of the early adopters of ICU4X. The user impact is that browsers will be able to ship more locales, so Firefox users should be able to get more than the current 80 locales. We should also get performance improvements. We have a mailing list (https://github.com/unicode-org/icu4x) if you’re interested.
[Dan] I’m excited ICU is being rewritten. ICU is old and complicated to get into.
[Ujjwal] You mentioned Firefox. Do you have a timeline for when Chrome will adopt ICU4X?
[Shane] V8 is unlikely to be an early adopter, but when ICU4X has large feature parity, V8 will consider bringing it in. One problem is they don’t have compatibility for Rust, but that’s likely to change. It might happen in a few years.
[Dan] How about WebKit?
[Shane] WebKit is interesting because it uses OS ICU, instead of bundling its own. So you get the same ICU behavior across all apps in the system. So will ICU4X go into macOS/iOS? That’s for them to decide.
[Dan] Anyone here from Apple or WebKit?
[Ross T] WebKit, but not Apple, so can’t speak for them. More broadly, when I was implementing 402 features, I was surprised at how stagnant everything was. I was glad to help rectify that, at least for now. It’s hard for me to comment on this specific effort because of how custom Apple’s approach is there. That puts other platforms in an interesting situation because on the one hand, for an engine like V8 you can treat ICU like any other dependency. WebKit on Apple doesn’t work that way, but it does on PlayStation, where we're able to update ICU more frequently.
[Ujjwal] Does ICU ever break API compatibility? If so, does that affect the WebKit implementation in any way?
[Ross] We stick to the C API because of the ABI stability it provides. Given we’re writing C++, it loses some flexibility with respect to code writing.
[Shane] ICU’s C ABI is stable, which means you can swap out the shared library, but only if you use the C API. Can you build a header-only C++ API? That’s a long-standing request. There’s a small amount of that already, but ICU4X aims to use Rust and provide a C API.
[Ross] That would be very cool. Not sure I would predict Rust in WebKit’s imminent future but it sure would be nice.
[Ujjwal] One of the goals of ICU4X is a tiny footprint. Can ICU4X be compiled to WASM right now?
[Shane] GREAT question. WASM is a direction we’d like to go. If you want to use ICU4X from JavaScript, we’d like to support that. We plan to write WASM wrappers starting in a couple of weeks. Is that a direction of interest?
[Ujjwal] Definitely in my case! I think this would be great if we could drive adoption. Another thing is that people building interpreters like Engine262 would really appreciate this, since they have to build `Intl` on top of basically nothing.
[Shane] If you like Rust and know it, there are a lot of opportunities to contribute. Check our “good first issues” label in our issues.
[Caio] In WPE WebKit, we could use this. Would ICU4X work there, given MIPS architecture?
[Shane] ICU4X should support any architecture Rust can compile to.
[Tyler W] https://doc.rust-lang.org/nightly/rustc/platform-support.html
(break)
(presentation by Caio Lima)
- Video: https://www.youtube.com/watch?v=_PsqIK3Vcxg&list=PL4sEzdAGvRgCTD9dJtGHTpeh9W6Wx7aLX
- Slides: https://webengineshackfest.org/2021/slides/decimal-values-for-javascript-by-caio-lima.pdf
[Dan] How do people feel about implementing `Decimal` in browsers?
[Rob P] "excited, let's implement soon" - (as someone who does not have to do the work)
[Caio] I’m excited! This sort of thing is already supported in many languages, so having it missing in JS is a failing.
[Ross] Seems like a good idea for WebAssembly as well, given that (I believe) there's hardware support for 128-bit decimals.
[Dan] Well, maybe if software decimal starts being the bottleneck somewhere... I think it can be a higher-level, toolchain-provided feature for a while in WASM.
[Ross Tate] Yeah, not a high-priority feature.
[Rob P] Ross, I intuitively thought that and "floated" the idea to Andreas Rossberg a year ago. He seemed to think it was high-level and needed more justification to be a primitive.
[Ross] Yeah, seems like something that an interested customer should advocate for, and if there's no one pushing for it then no need to add it.
[Caio] One thing I didn’t go over is why Rationals weren’t a good idea. Despite the upside of exactly representing fractions that have no finite decimal expansion, such as 1/3, implementing it would be quite painful. Applying a GCD for every operation would be expensive.
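To show where the GCD cost comes from, here is a minimal userland sketch of a rational type built on `BigInt`. The `Rational` class and `gcd` helper are hypothetical illustrations, not an API from the Decimal proposal. Every arithmetic result goes through the constructor, which normalizes with a GCD so that numerators and denominators do not grow without bound.

```javascript
// Euclid's algorithm on BigInt values; returns a non-negative result.
function gcd(a, b) {
  a = a < 0n ? -a : a;
  b = b < 0n ? -b : b;
  while (b !== 0n) {
    [a, b] = [b, a % b];
  }
  return a;
}

// Hypothetical rational number type, always stored in lowest terms.
class Rational {
  constructor(numerator, denominator = 1n) {
    if (denominator === 0n) throw new RangeError('denominator must be non-zero');
    const sign = denominator < 0n ? -1n : 1n;
    const divisor = gcd(numerator, denominator); // paid on every construction
    this.numerator = (sign * numerator) / divisor;
    this.denominator = (sign * denominator) / divisor;
  }

  add(other) {
    return new Rational(
      this.numerator * other.denominator + other.numerator * this.denominator,
      this.denominator * other.denominator,
    );
  }

  multiply(other) {
    return new Rational(
      this.numerator * other.numerator,
      this.denominator * other.denominator,
    );
  }

  toString() {
    return `${this.numerator}/${this.denominator}`;
  }
}

const sum = new Rational(1n, 3n).add(new Rational(1n, 6n));
console.log(sum.toString()); // "1/2"
```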
[Caio] I would like to hear from implementors, mainly about how they feel about having this as a primitive type.
[Yulia S] Primitive types are a little scary, but there shouldn’t be a problem on the Mozilla side.
[Caio] Adding a new primitive does mean approaching a lot of places in the compilers. We had to change three compilers to support `BigInt`. The major motivation for primitive here is to support binary comparisons.
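Reading “binary comparisons” as comparisons written with binary operators such as `===` (an interpretation, not something stated in the session), the difference between a primitive and an object-based design can be sketched with `BigInt`. The `DecimalLike` class is a hypothetical stand-in for an object-based decimal, not the proposal’s API.

```javascript
// BigInt is a primitive: operators apply directly and === compares by value.
console.log(typeof 10n);                          // "bigint"
console.log(2n ** 64n === 18446744073709551616n); // true

// Hypothetical object-based alternative: === compares identity, so value
// equality needs an explicit method.
class DecimalLike {
  constructor(digits) {
    this.digits = digits;
  }
  equals(other) {
    return this.digits === other.digits;
  }
}

const first = new DecimalLike('0.10');
const second = new DecimalLike('0.10');
console.log(first === second);     // false: two distinct objects
console.log(first.equals(second)); // true
```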
[Yulia] It also makes sense as a reflection of `BigInt`.
[Dan] There was some negative feedback later which I was going to bring back to the committee. Maybe I should bring it back. I also haven’t advanced on the feedback on operator overloading.
[Yulia] I would be interested in that. We gave similar feedback to Record and Tuple. If this is the best way to go, we can definitely deal with it. I would love to see an alternative approach explored a bit more, or guidelines on when to introduce primitives and when not to. That said, I wouldn’t have a problem seeing this go to Stage 2.
[Caio] A reminder that this work was sponsored by Bloomberg, so thank you to them.
(break)
(presentation by Leo Balter)
[Dan E] What do people think?
[Yulia] I think it’s a really great development.
[Caridy] We’re looking for feedback in three areas. The first is the global names: we’re still not sure what names should be installed. The second is the optimizations that can’t be done: what should happen when the wrappers get passed around? The last is module graph separation: what kind of caching mechanism will be possible? Those are the areas where we’d like feedback.
[Leo] For globals, we’re disallowing any non-configurable global properties. So if you have an iframe, there are “unforgeables” like `window.location`. You can’t delete them. For Realms, the reasonable contract is to not have anything that is not configurable. We need to make sure that if the host needs to add anything, it is configurable and deletable. There are some minor details we still need to sort out to avoid side effects across realms.
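The configurable versus non-configurable distinction can be shown with ordinary property descriptors. The `fakeGlobal` object and its property names below are hypothetical stand-ins for a global object; the sketch does not use the Realms API itself.

```javascript
// A plain object standing in for a global object.
const fakeGlobal = {};

// Like an unforgeable property: cannot be deleted or redefined.
Object.defineProperty(fakeGlobal, 'location', {
  value: 'https://example.com/',
  configurable: false,
});

// Like a host-added property under the proposed contract: deletable.
Object.defineProperty(fakeGlobal, 'hostHelper', {
  value: () => 'hello',
  configurable: true,
});

// Hypothetical audit helper: lists own properties that could never be removed.
function nonConfigurableKeys(object) {
  return Object.entries(Object.getOwnPropertyDescriptors(object))
    .filter(([, descriptor]) => !descriptor.configurable)
    .map(([key]) => key);
}

console.log(nonConfigurableKeys(fakeGlobal));                  // [ 'location' ]
console.log(Reflect.deleteProperty(fakeGlobal, 'location'));   // false
console.log(Reflect.deleteProperty(fakeGlobal, 'hostHelper')); // true
```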
[Dan] Another discussion is how Realms work with ES modules. The idea is each realm has its own modules, since realms have new global objects.
[Leo] The names we want to add to realms are part of the integration we’re going through with HTML. We’re not giving ECMAScript authority to define these names, but this is work that needs to be done to integrate realms with the web.
[Ashley] A question about module caching. Before, I’ve tried to do hot code reloading and caching modules makes that difficult. If each realm gets its own module registry, I presume that wouldn’t generate new things.
[Leo] This is one of the things we’ve been discussing on the ECMAScript side. My concern is making sure each module is evaluated for each realm, because parts tend to be cached internally. I want each module instantiated for each realm. This seems workable.
[Dan] The broader question is whether there is a non-broken way to do this without hot reloading. Having hot module reloading would resolve a much-requested feature, but it’s not something we really have yet.
[Ashley] The URL thing mostly works, but it’s hard to change URLs for modules a module loads without manipulating source. Maybe browsers clearing caches isn’t a spec concern.
[Dan] I’m not sure if the web is going to get a great way of clearing the cache. Maybe we want something for ES modules specifically. There’s concern around linking. What do you do if the set of exports changes while you’re reloading? There’s just a ton of edge cases, and it would be a big project to add it to JS.
[Ashley] What happens with async if we can’t pass promises over the realm?
[Leo] You can still wrap an async function but this is low-level code. You can still wrap them up but if you try to use them, they’re not really useful or helpful. We’re trying to set the boundaries of responsibilities of low-level code. We’re not wrapping promises or anything.
[Caridy] The way to go right now is that when you evaluate, you have to wrap that async function somehow. You basically have to invent your own protocol to call it and catch errors and callbacks. Tedious, but possible. The other alternative is to use some sort of membrane between the two realms. That might allow you to share things between realms. Eventually we might be able to come up with ideas to share more between realms via wrapping mechanism. We’ll have that conversation at some point.
[Dan] I agree you can either use a custom protocol or a membrane. The current proposal is designed to be as simple as possible, and to really encapsulate the realm from the outside. Letting you only pass these primitives and wrapped callable values means it’s a lot harder to leak across the realm boundary. With the previous proposal, you could share a promise to a realm, but that also made it easier to violate the realm’s boundary.
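One possible shape for the “custom protocol” option is sketched below. It simulates the boundary with plain functions in a single realm, so it does not use the Realms API; `canCross`, `exposeAsync`, and `importAsync` are hypothetical names. The idea is that a promise never crosses: only primitives and callbacks do, and each side rebuilds its own promise.

```javascript
// Assumed boundary rule: only primitives and callables may cross.
function canCross(value) {
  return value === null || typeof value !== 'object';
}

// "Inside" the realm: expose an async function as a callback-taking function.
function exposeAsync(asyncFn) {
  return function call(onResolve, onReject, ...args) {
    for (const value of [onResolve, onReject, ...args]) {
      if (!canCross(value)) {
        throw new TypeError('only primitives and functions may cross the boundary');
      }
    }
    asyncFn(...args).then(
      (result) => {
        if (canCross(result)) onResolve(result);
        else onReject('TypeError: result cannot cross the boundary');
      },
      (error) => onReject(String(error)), // errors cross as strings
    );
  };
}

// "Outside" the realm: rebuild a local promise from the callback protocol.
function importAsync(call) {
  return (...args) =>
    new Promise((resolve, reject) => {
      call(resolve, (message) => reject(new Error(message)), ...args);
    });
}

const remoteDouble = importAsync(exposeAsync(async (n) => n * 2));
remoteDouble(21).then((value) => console.log(value)); // logs 42
```

This is the tedious part Caridy mentions: argument checking, error transport, and result checking all have to be written by hand, which is what a membrane library would otherwise do.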
[Leo] Thanks to a lot of people, hopefully this will be reality soon!
(break)
(presentation by Rick Button)
[Matt G] How does Box tie into the equality model?
[Rick] Our thinking is they’re tied to the object they contain. If the box contained the same object in two records, they would be equal. A box is equal to any other box that boxes the same object. You can’t currently mutate boxes. Box is similar to `Symbol.for` if you could pass in an object rather than a string.
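The `Symbol.for` analogy can be sketched in userland. `boxFor` below is a hypothetical stand-in built on a `WeakMap`; it is not the proposal’s `Box`, and it only imitates the equality behavior described here: the same object always yields the same box.

```javascript
// Symbol.for returns the same symbol for the same string key.
console.log(Symbol.for('app.key') === Symbol.for('app.key')); // true

// Hypothetical registry: one canonical, frozen box per object.
const boxRegistry = new WeakMap();

function boxFor(object) {
  let box = boxRegistry.get(object);
  if (box === undefined) {
    box = Object.freeze({ unbox: () => object });
    boxRegistry.set(object, box);
  }
  return box;
}

const element = { tag: 'div' };
console.log(boxFor(element) === boxFor(element));        // true: same object, same box
console.log(boxFor(element) === boxFor({ tag: 'div' })); // false: different object
console.log(boxFor(element).unbox() === element);        // true
```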
[Ross] Wondering how this is expected to interact with message passing.
[Rick] Kind of above TC39 itself.
[Dan] In a superficial way, we’d have to plumb this through HTML. Implementations have to decide how to handle values shared between different threads, since it will complicate their GC. I’m cautiously optimistic we’ll get to memory strategies that allow this to be not copied.
[Ross] Not a primary motivation to use these, then.
[Dan] There are people motivated by the hope that it will not copy.
[Ross] Boxing might be problematic because you can’t copy ones that have boxes inside.
[Dan] I think that’s an important case. We could say if a record has a box, you can’t transfer it to another worker. Or you can, but you can’t unbox it in the other worker. Allowing references to objects would be the next level of making GC even more complicated.
[Ross] Same problem came up in WASM as well.
[Dan] Hopefully we can come up with an answer for both.
[Caridy] Some intersections here with realms. We do think record and tuples will be a great complement to realms, because you can share complex immutable structures between realms. I bet Google will have some thoughts around this, given we had the same problem with realms and sharing between them.
[Dan] What do you think, Marja?
[Marja H] I don’t know enough to comment right now.
[Rick] About comparison: in the spec, comparison semantics are linear time. There are some strategies for in-engine performance improvement. Interning allows one “engine value” for a given “language value”. Say you have two string instances with the same contents. The language doesn’t expose that they’re the same. The engine can store a value in the backend that allows faster (pointer check) comparisons. So you can intern a string or record when it’s created, or at comparison time.
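A userland imitation of interning at creation time is sketched below, with frozen arrays standing in for tuples. `internTuple` is a hypothetical helper, restricted to strings and numbers to keep key construction simple, and its table holds strong references, which a real engine would not do.

```javascript
// One canonical frozen array per distinct contents.
const internTable = new Map();

function internTuple(...items) {
  for (const item of items) {
    if (typeof item !== 'string' && typeof item !== 'number') {
      throw new TypeError('this sketch only interns strings and numbers');
    }
  }
  // The key records each item's type and text, so 1 and "1" stay distinct.
  // String(-0) is "0", so -0 and 0 share a key in this sketch.
  const key = JSON.stringify(items.map((item) => [typeof item, String(item)]));
  let canonical = internTable.get(key);
  if (canonical === undefined) {
    canonical = Object.freeze(items);
    internTable.set(key, canonical);
  }
  return canonical;
}

const a = internTuple(1, 'x');
const b = internTuple(1, 'x');
console.log(a === b);                      // true: equality is now a pointer check
console.log(a === internTuple('1', 'x'));  // false: different contents
console.log(internTuple(-0) === internTuple(0)); // true in this sketch
```

The cost of building the key and probing the table is paid when the value is created, whether or not it is ever compared.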
[Rick] There are significant complications. First, only some records/tuples will be compared. If you do work to optimize comparison of things that are never compared, you wasted that time. Secondly, record and tuple equality semantics are weird the way JS equality is weird. We’ve discussed this a lot in the champion group. Current thinking is it will do the same operation JS does. Example: `#[-0] === #[0] => true`. Third, in theory, records and tuples that are compared are usually small values. Generally people don’t compare deep trees. They just check a couple of values.
[Rick] All these complications lead us to think of linear time comparison. It’s less complex to implement, since there is no need to do interning. The performance is more consistent since there are no cliffs. It matches existing expectations in user land. There are many comparisons that are linear time and users are accustomed to that. This would also keep the door open for future optimizations, especially after seeing how records and tuples are used in the wild.
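For comparison with interning, a linear-time structural comparison can be sketched as a simple recursive walk, again with arrays standing in for tuples. `tupleEquals` is a hypothetical helper; leaves are compared with `===`, which matches the `-0`/`0` example from the discussion, while the treatment of `NaN` here is an assumption of the sketch and may differ from the proposal.

```javascript
// Walks both structures once; cost is proportional to their size.
function tupleEquals(left, right) {
  const leftIsTuple = Array.isArray(left);
  const rightIsTuple = Array.isArray(right);
  if (!leftIsTuple || !rightIsTuple) {
    // Leaves compare with ===, so -0 equals 0 and NaN is unequal to itself.
    return !leftIsTuple && !rightIsTuple && left === right;
  }
  if (left.length !== right.length) return false;
  for (let index = 0; index < left.length; index++) {
    if (!tupleEquals(left[index], right[index])) return false;
  }
  return true;
}

console.log(tupleEquals([-0], [0]));                 // true
console.log(tupleEquals([1, [2, 3]], [1, [2, 3]]));  // true
console.log(tupleEquals([1, [2, 3]], [1, [2, 4]]));  // false
```

There is no table to maintain and no work at creation time, and the cost is predictable, which is the trade-off described above.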
[Ross] I think going linear time at first is the right call. Surprised you aren’t calling to do something after a comparison succeeds.
[Dan] On the spot requires multiple indirections and complicates GC. Nobody’s prohibited from doing this, implementations can do it, but we don’t want to set expectations.
[Rick] Champions feel we don’t want to require a certain performance characteristic right now, we just want to allow this and get out of the way. By announcing this idea about linear time comparison, we want to say to the community that in JS, immutable things aren’t magically faster. We want to set the expectation.
[Ross] You’re saying “please evaluate this proposal as linear time”.
[Rick] Right, I wouldn’t even open the door to optimization.
[Marja] We see a clear use case for comparing large records and tuples with each other, but not repeatedly. We didn’t find a use case for keeping records and tuples around. Interning didn’t really seem necessary. In a thread I saw, someone asked why we should use records and tuples, and someone else said because they’re faster. I found some cases where they’re sometimes slower. Do you have thoughts about what cases users can expect to be fast?
[Dan] If you’re optimizing code, you can hoist map checks because you can expect the map of a record or tuple will never change.
[Marja] Oh interesting, we’ll look into that more.
[Rick] On the Reddit/Twitter front, I’ve noticed there’s an expectation of performance out of immutability. Part of it might be that polyfilling is slow, so the mutability there implies immutability is faster. We need to be good about communicating expectations to coders ahead of this landing in browsers. We don’t want to get into a situation where users expect something very different from reality.
[Caio] Right now we’re able to hoist structure checks a little more aggressively. If things change, we can update the code we had before. I’m skeptical record and tuples would perform better than regular objects in JSC.
[Rick] Good to know. We haven’t discussed JSC performance too much so far.
[Dan] We do have a number of JSC people here, they might have thoughts.
[Caio] We would get a gain from not having to set up watch points around record and tuples. Watch points are expensive in JSC. In terms of access, I don’t think we have much to gain.
[Ashley] One situation I hoped we’d get better performance is map lookups. If I’m passing around a bunch of vectors as objects, right now I’d have to store a separate map of all X that point to Y, or turn XYs into strings and compare and then throw away. Hopefully record and tuples would make this faster.
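The workaround described here looks roughly like the sketch below: object keys are compared by identity, so a lookup with a freshly built vector misses, and the usual fix is to serialize each vector to a string key. The `keyOf` helper and the sample data are hypothetical. The hope expressed in the discussion is that a record could serve directly as the key, since records with equal contents would compare equal.

```javascript
// Object keys are matched by identity, not by contents.
const byObject = new Map();
byObject.set({ x: 1, y: 2 }, 'treasure');
console.log(byObject.get({ x: 1, y: 2 })); // undefined: a different object

// Workaround: build a throwaway string for every store and lookup.
const keyOf = ({ x, y }) => `${x},${y}`;
const byString = new Map();
byString.set(keyOf({ x: 1, y: 2 }), 'treasure');
console.log(byString.get(keyOf({ x: 1, y: 2 }))); // "treasure"
```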
[Rick] There are certainly cases where record and tuples encourages reduced cost.
[Matt] On the SpiderMonkey front, we talked about this a while ago. This is the first time I’d recognized the -0/0 problem. We were eager to intern; it makes a convenient implementation. Need to rethink a bit, and think about what the right way is to do this.
[Dan] Any other feedback?
[Matt] From a user perspective, I like the rest of it. This pushes in the direction of adding a new primitive type, which is a pain. We were hoping to avoid that if possible, but it looks like it might not be possible. That makes implementing this a lot harder. And we only have room for so many primitive types, so we need to think more on this.
[Dan] Happy to be in touch about that.
(thanks and goodbye)