
2023 Test262 Testing JavaScript Conformance

Manuel Rego Casasnovas edited this page Jun 8, 2023 · 1 revision

Test262, Testing JavaScript Conformance

Topics

  • 2-way sync for test262
  • Test262 in Babel CI
  • Running test262 as friendly as possible for engines with partial language support
  • How to contribute
  • Where to contribute
  • When is coverage enough?
  • test262.fyi
  • Know more about test262
  • Test generation
  • Harnesses
  • Test262 as feature detector

Thomas: Qt Company, I expected more people remotely, (noise) I expect from this session (noise again) to consider evaluating. Thank you very much!

.... I am here to see what happens.

Pablo Saavedra: my interest is to know about issues with video and audio during the session :-) .

Antonio: know more in the area that we can use in the projects in Igalia.

Tiago: we are working with iOS stuff, and recently we got interested in their JS implementation. We are working on the executables and would like to know about testing.

Dan: I'm from Bloomberg, we are interested in web platform tests in Chrome or Gecko. If we could share more tests.

Nicolò: tests are part of our CI?

Linus: Ladybird, working on the engine, interested in implementing specs. It is important to run test262 as soon as possible when one implements JS from scratch. Another topic is how to contribute, what is most appreciated, etc. Third point, about identifying what is useful and what is not, and listing areas where test coverage is not good. (presents test262.fyi and other stuff for comparing the major JS engines and how good they are)

Alex: I am curious about "minimal implementation" and how to test it.

Artem: I am working on Kotlin to Java and Kotlin to Wasm, and I am interested in this too.

Andreu: I work on implementing AsyncContext in V8, focusing on conformance which includes tests, so I am interested in test262 for that. Also, what is the minimal surface for a new engine.

Joyee: I come across the topic sometimes

Guillaume: I'm from Igalia, I work on (?) which sometimes includes contributing to test262. As a newcomer, sometimes writing the tests feels tedious because of the many variations of the same test. We have a generator, but maybe there's a better way to do it, to make it easier.

Ujjwal: I work at the compilers team at Igalia, around TC39 things, which means I care a lot about test262. It is very important for JS and TC39. One topic I care about is the harness – there's a lot of test runners for test262, but some are not maintained or don't match the current conventions. Only one real test runner, which has received 3 updates in the last 3 years or so. Maybe we should rewrite the harness and make it more maintainable. Implementers don't care because they have their own harness, but spec writers need a better cross-runtime harness.

Jonas: I'm from the Tauri team. Our use case is different: since Tauri uses webviews, our role is being historians, since webviews aren't updated a lot, especially on Linux. We don't know which features are supported without running them. We need a test suite we could run in old versions of Ubuntu. I want to remind everyone of that as a use case: how can we figure out what's going on in old implementations with little documentation?

Philip: so we can go multiple ways from here, like doing an intro with the slides?

(Philip presents the intro)

(CanadaHonk is in Jitsi): (shows test262.fyi)

Philip: do people who are here use test262 for their needs? (yes, some do)

Dan: it is useful to detect if features are (?) feasible? I would like to know if test262.fyi could show that.

CanadaHonk: I have a script that goes through the proposals and generates the links on the home page.

Dan: There's this project to have machine-readable metadata for all of the proposals.

CanadaHonk: I should look into that.

Linus: About the harness directory in test262, one question I had is, why was YAML chosen as a metadata format? It's not the simplest of formats, we had to write a YAML parser ourselves. I have no expectations of changing the metadata for 50k files, but it's not expected that you need a YAML parser for a test runner.

Alex: What do you suggest as an alternative?

Linus: A simple key-value

Alex: YAML is a simple key-value

Linus: ... whitespace...

Philip: It would be possible to have a YAML subset without many of the YAML features
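As an illustration of the "YAML subset" idea, here is a minimal sketch of a frontmatter reader that avoids a full YAML parser. It assumes metadata sits in a `/*--- ... ---*/` comment and only uses plain `key: value` scalars, flow lists (`[a, b]`), indented `- item` lists, one level of nested keys, and `|` / `>` text blocks. The function name `parse_frontmatter` is hypothetical, and anything outside that subset (quoting, anchors, multi-line plain scalars) is silently ignored.

```python
import re

# Metadata block delimiters as used in test files: /*--- ... ---*/
FRONTMATTER = re.compile(r"/\*---(.*?)---\*/", re.DOTALL)
KEY_VALUE = re.compile(r"^([A-Za-z0-9_]+):\s*(.*)$")


def parse_frontmatter(source):
    """Parse a small YAML subset out of a /*--- ... ---*/ comment."""
    match = FRONTMATTER.search(source)
    if match is None:
        return {}
    meta = {}
    key = None        # most recent top-level key
    in_text = False   # True while reading a "|" or ">" text block
    for raw in match.group(1).splitlines():
        line = raw.rstrip()
        if not line.strip():
            continue
        top = KEY_VALUE.match(line)  # only matches unindented lines
        if top:
            key, value = top.group(1), top.group(2)
            in_text = value in ("|", ">")
            if in_text:
                meta[key] = ""
            elif value.startswith("[") and value.endswith("]"):
                # Flow list, e.g. includes: [a.js, b.js]
                meta[key] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
            elif value == "":
                meta[key] = None  # filled in by the indented lines that follow
            else:
                meta[key] = value
            continue
        if key is None:
            continue
        item = line.strip()
        current = meta[key]
        if in_text:
            meta[key] = current + item + "\n"
        elif item.startswith("- ") and (current is None or isinstance(current, list)):
            # Block list item
            meta[key] = (current or []) + [item[2:].strip()]
        else:
            nested = KEY_VALUE.match(item)
            if nested and (current is None or isinstance(current, dict)):
                # One level of nesting, e.g. negative: / phase: parse
                entry = dict(current or {})
                entry[nested.group(1)] = nested.group(2)
                meta[key] = entry
    return meta
```

A runner built this way would still need to fall back to, or be checked against, a real YAML parser, since the test suite is not guaranteed to stay within this subset.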

Dan: There was a V8 bug where some tests didn't work because of the YAML parser. We fixed the YAML parser.

Jonas: We're talking about testing the test suite.

Philip: There's an INTERPRETING.md document in the repo for writing a test runner. It mentions what you have to concatenate into the test before executing it. There are host-defined functions, like print for communication with the test runner, and that's used for testing asynchronous functions (maybe should've had a different name). $262 is a namespace with functions that are not defined in ECMA-262 but are necessary for testing, like exposing GC. Test harnesses must run each test in both strict and non-strict mode. There are instructions for module tests...
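The concatenation and strict/non-strict steps Philip describes could be sketched roughly as follows. This is an illustration, not the reference procedure: the function name `build_variants` and the `harness_files` mapping (file name to file contents) are hypothetical, and the sketch assumes `assert.js` and `sta.js` are prepended to every non-raw test, that `doneprintHandle.js` is added for tests flagged `async`, and that strict mode is requested by prepending a `"use strict";` directive. Module tests are deliberately left out.

```python
# Harness files assumed to be needed by every non-raw test.
ALWAYS_INCLUDED = ["assert.js", "sta.js"]


def build_variants(test_source, meta, harness_files):
    """Return a list of (mode, script_source) pairs to execute for one test.

    meta is the parsed metadata (with optional "flags" and "includes" lists);
    harness_files maps a harness file name to its contents.
    """
    flags = set(meta.get("flags") or [])
    if "raw" in flags:
        # Raw tests run unmodified: no harness files, no strict-mode prefix.
        return [("raw", test_source)]
    if "module" in flags:
        # Module tests need separate handling that this sketch does not cover.
        return []
    includes = list(ALWAYS_INCLUDED)
    if "async" in flags:
        includes.append("doneprintHandle.js")
    for name in meta.get("includes") or []:
        if name not in includes:
            includes.append(name)
    prelude = "\n".join(harness_files[name] for name in includes)
    body = prelude + "\n" + test_source
    variants = []
    if "onlyStrict" not in flags:
        variants.append(("non-strict", body))
    if "noStrict" not in flags:
        variants.append(("strict", '"use strict";\n' + body))
    return variants
```

Each returned source would then be handed to the engine in a fresh realm, with the host-defined `print` and `$262` set up beforehand.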

Philip: There's a folder called test/staging, where implementations can put tests that don't meet the standards of the rest of the test suite. It's intended for implementations to have test coverage for new features before they're finished, so they can start testing early. If you're working on an implementation and you don't have the capability to add tests into that folder, you should ask for it.

Philip: There's also various metadata, things for syntax-specific tests, and other stuff that's not used widely. If you implement everything in the document, you should have a test runner that can run the test suite.

Linus: I've seen tests that test the harness

Ujjwal: Yes, there are some.

Philip: Mostly tests for includes. Surprised to see not all of them pass on test262.fyi

Andreu: Some of the tests in harness are specific to some features, and it makes sense that they fail if not implemented.

Guillaume: For implementations with several tiers, how do they test the higher-level optimizing tiers? You need a certain number of iterations before the tier starts

Ujjwal: test262 adds to the native test suites of engines

Guillaume: But when an implementation adds a feature, and they want to make it work on all tiers, shouldn't there be a mechanism in the harness to run a test in a loop or something?

Philip: ECMAScript doesn't know anything about optimizing tiers.

Ujjwal: Test262 is a conformance test suite, it checks for a correct understanding of the language, it makes no judgements about how to implement things.

Dan: V8 has a test where they run with different compiler flags turned on and off, but not test262. It'd be interesting to have this. Is everyone going to pay this cost on the CI though?

Philip: I imagine a runner could have compiler flags such that the JIT threshold is 0.

Dan: Yeah.

Linus: Any official recommendation on what can and cannot be used in test262 tests? Things that might not be implemented in all engines. Has this been considered?

Philip: Even in the maintainers group, there's a variety of opinions on this. IMO If you put new things in the harness directory that doesn't... But there's a case for not adding the possibility for bugs in the test harness

Dan: ... (something about complexity of the most used files?)

Linus: Since the harness is split into so many files, it's easy to keep the most used files simple. Since assert.js and... are things almost every test needs.

Dan: Do you think that goal is being met?

Linus: We've only recently started using test262. If we had started earlier, we would have figured out how much support we need. For anyone writing a new engine, you should try to support test262 early to improve this. If some other engine says, these 500 tests are blocked by something unsupported in the harness, could we replace that?

Philip: If someone has a PR to add decorators to the harness, I wouldn't oppose, but if someone complains we'd have to figure it out.

Jonas: For my use case, I don't care about performance, since you won't run it often. But it'd be good to measure the performance of each of these tests. Not for optimizing, but as interesting metadata. Also, about $262, how many of those special functions are needed for tests to work?

Philip: I think they're not needed for most tests.

Andreu: In my test runner for Deno, I noticed you could run a lot of tests without $262. And for the particular functions, their usage is scoped to the tests that need it: GC-related functions for GC tests, etc.
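One rough way to check how widely `$262` is needed is to scan test sources for direct `$262.<name>` references. The sketch below does that. The function name `host_api_usage` is hypothetical, and the scan is only approximate: it does not see harness files that use `$262` on a test's behalf.

```python
import re
from collections import Counter

# Matches direct property accesses such as $262.gc or $262.createRealm.
HOST_CALL = re.compile(r"\$262\.([A-Za-z_]\w*)")


def host_api_usage(sources):
    """Count, per $262 property, how many test sources mention it.

    Returns (usage, without), where usage is a Counter keyed by property
    name and without is the number of sources with no $262 reference.
    """
    usage = Counter()
    without = 0
    for text in sources:
        names = set(HOST_CALL.findall(text))
        if not names:
            without += 1
        usage.update(names)
    return usage, without
```

To run it over a local checkout, the sources could be read with something like `(p.read_text(encoding="utf-8") for p in pathlib.Path("test262/test").rglob("*.js"))`, where the checkout path is an assumption.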

...

Dan: V8 has directories that run in various optimization tiers (?)

Linus: There are a few specific tests that take up the entirety of the test time. In particular regex tests, which create huge strings and feed them to the regex. It could help to find those specific tests and exclude them. It might not get the test suite to an acceptable time for CI, but it could help.

Nicolò: I wonder if test262.fyi could have performance information for how long each test takes in each engine.

CanadaHonk: Sounds good, it should be feasible.

  • metadata: slow flag that implementations can skip if they're in a hurry
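Finding the slow tests mentioned above could start from simple wall-clock measurements. The following sketch assumes a hypothetical `run` callable that executes one test by name in some engine. It records how long each test takes and splits the results at a threshold, which is the kind of data a `slow` flag or a per-test timing column on test262.fyi would need. No such flag exists in this sketch's inputs: it only produces the list of candidates.

```python
import time


def time_tests(tests, run):
    """Run each test via run(name) and return (name, seconds), slowest first."""
    timings = []
    for name in tests:
        start = time.perf_counter()
        run(name)
        timings.append((name, time.perf_counter() - start))
    return sorted(timings, key=lambda entry: entry[1], reverse=True)


def split_slow(timings, threshold_seconds):
    """Split timed tests into (slow, fast) name lists at the given threshold."""
    slow = [name for name, seconds in timings if seconds >= threshold_seconds]
    fast = [name for name, seconds in timings if seconds < threshold_seconds]
    return slow, fast
```

Wall-clock times vary between machines and engines, so a threshold chosen this way is only a starting point for deciding which tests a hurried CI run might skip.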