Write a design document for how to support richer testing input #1295

bassosimone · 2022-11-18T09:42:44Z

This issue is a child issue of #1291. The aim is to write a design document explaining how richer testing input will flow from the backend, to the probe, to the generated data, to the pipeline, and to explorer.

This PoC investigates whether it would be possible to run experiments directly without using the internal/engine abstraction as the middle man. The PoC is in the context of ooni/ooni.org#1295

* ooni/ooni.org#1291; * ooni/ooni.org#1292; * ooni/ooni.org#1295; * ooni/probe#2445.

bassosimone · 2023-06-12T18:05:13Z

As explained in ooni/2023-05-richer-input@7871d3b, the https://github.com/ooni/2023-05-richer-input repository contains a prototype that explores the richer input domain and redesigns ooniprobe around richer input. Notably, that repository contains an initial design document.

bassosimone · 2023-10-19T10:08:23Z

So, here's the plan I propose to start supporting and experimenting with richer input.

Check-in API

We will keep using /api/v1/check-in for now. We will include extra information in the responses for dnscheck, riseupvpn, signal, stunreachability, and torsf. To this end, we will extend the experiment-specific stanza inside the check-in response. For example, here's how we will extend the /api/v1/check-in stanza for dnscheck:

{
  "tests": {
    "dnscheck": {
      "report_id": "", // same as before
      "targets": [{
        "input": "https://dns.google/dns-query",
        "options": {"HTTP3Enabled": true}
      }]
    }
  }
}

In other words, each experiment will have its own richer input definition, which depends on the characteristics of that experiment. The probe will process this structure and act accordingly.
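As a rough illustration of how the probe could process this structure, here is a minimal Go sketch that decodes the dnscheck stanza shown above. The type and function names (`dnsCheckTarget`, `dnsCheckStanza`, `checkInResponse`, `parseDNSCheckTargets`) are hypothetical and are not taken from the probe codebase. The example body omits the `// same as before` comment because strict JSON does not allow comments.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dnsCheckTarget is a hypothetical mirror of one entry in "targets".
type dnsCheckTarget struct {
	Input   string         `json:"input"`
	Options map[string]any `json:"options"`
}

// dnsCheckStanza is a hypothetical mirror of the dnscheck stanza.
type dnsCheckStanza struct {
	ReportID string           `json:"report_id"`
	Targets  []dnsCheckTarget `json:"targets"`
}

// checkInResponse models only the part of the response discussed above.
type checkInResponse struct {
	Tests struct {
		DNSCheck *dnsCheckStanza `json:"dnscheck"`
	} `json:"tests"`
}

// parseDNSCheckTargets extracts the dnscheck targets from a raw response
// body. It returns nil targets when the response has no dnscheck stanza.
func parseDNSCheckTargets(body []byte) ([]dnsCheckTarget, error) {
	var resp checkInResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	if resp.Tests.DNSCheck == nil {
		return nil, nil
	}
	return resp.Tests.DNSCheck.Targets, nil
}

// exampleBody is the stanza from the comment above, without the JSON comment.
const exampleBody = `{
  "tests": {
    "dnscheck": {
      "report_id": "",
      "targets": [{
        "input": "https://dns.google/dns-query",
        "options": {"HTTP3Enabled": true}
      }]
    }
  }
}`

func main() {
	targets, err := parseDNSCheckTargets([]byte(exampleBody))
	if err != nil {
		panic(err)
	}
	for _, t := range targets {
		fmt.Println(t.Input, t.Options["HTTP3Enabled"])
	}
}
```

Keeping the options as a generic map is only one possible choice here; an experiment-specific struct would give stronger typing at the cost of one type per experiment.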

With richer input we aim to solve the following experiment needs:

dnscheck: ability to deliver input and options
riseupvpn: ability to deliver the correct CA, the correct endpoints, and to disable the experiment if the riseup API is down
signal: ability to deliver the correct CA
stunreachability: ability to deliver the correct STUN endpoints
torsf: ability to deliver the correct rendezvous method and cloud-fronted SNI
urlgetter: ability to run some measurements for research

Regarding disabling riseupvpn, we already implemented the core functionality in ooni/probe-cli#1355.

Probe

To implement this plan we need to ensure that we periodically call check-in. The best course of action would probably be to refactor the code such that we call check-in at the beginning of a measurement session. This refactoring would require changes in the probe-cli, probe-android, and probe-ios codebases.

Now, because of the backend changes described above, the check-in response contains richer input. We already have a package in the probe called checkincache that caches information extracted from the check-in response. We will extend this package to cache the richer input provided for each richer-input-aware experiment.

We will also refactor how we run experiments. We will define the following JSON structure as the most fundamental pure-data structure representing executing a single experiment with a single input and some options:

{
  "input": "https://dns.google/dns-query",
  "options": {"HTTP3Enabled": true},
  "test_name": "dnscheck"
}

Note how this data structure corresponds exactly to executing this code:

./miniooni dnscheck -O HTTP3Enabled=true -i https://dns.google/dns-query

So, if the user runs the above command, miniooni would translate the command line to the data structure and execute the data structure. Instead, if the user runs:

./miniooni dnscheck -O HTTP3Enabled=true -i https://dns.google/dns-query -i https://example.com/dns-query

We will translate the command line invocation to two data structures, one for each input.
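The translation from a command line invocation to one data structure per input could look like the following Go sketch. The names `experimentSpec` and `expandInvocation` are hypothetical. The sketch also assumes that the `-O` and `-i` flags have already been parsed into a typed options map and a list of inputs, since flag parsing is outside the scope of this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// experimentSpec mirrors the pure-data structure described above:
// a single experiment with a single input and some options.
type experimentSpec struct {
	Input    string         `json:"input"`
	Options  map[string]any `json:"options"`
	TestName string         `json:"test_name"`
}

// expandInvocation is a hypothetical helper that returns one spec per input.
func expandInvocation(testName string, options map[string]any, inputs []string) []experimentSpec {
	specs := make([]experimentSpec, 0, len(inputs))
	for _, input := range inputs {
		// Copy the options so that specs do not share mutable state.
		opts := make(map[string]any, len(options))
		for key, value := range options {
			opts[key] = value
		}
		specs = append(specs, experimentSpec{
			Input:    input,
			Options:  opts,
			TestName: testName,
		})
	}
	return specs
}

func main() {
	specs := expandInvocation(
		"dnscheck",
		map[string]any{"HTTP3Enabled": true},
		[]string{"https://dns.google/dns-query", "https://example.com/dns-query"},
	)
	for _, spec := range specs {
		data, err := json.Marshal(spec)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(data))
	}
}
```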

Additionally, if the user runs:

./miniooni dnscheck

We will use the checkincache package to load targets for dnscheck and produce a list of data structures to run.
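The fallback from command line inputs to cached targets can be sketched as follows. This is only an illustration: `target`, `targetLoader`, `resolveTargets`, and `fakeCache` are hypothetical names, and `targetLoader` stands in for whatever API the checkincache package ends up exposing. Returning an error when nothing is cached is also an assumption, and it would not be appropriate for experiments that do not need input.

```go
package main

import "fmt"

// target is a hypothetical pair of input and options for one measurement.
type target struct {
	Input   string
	Options map[string]any
}

// targetLoader is a hypothetical stand-in for loading cached richer input.
// It returns false when there is no cached entry for the experiment.
type targetLoader func(testName string) ([]target, bool)

// resolveTargets prefers inputs given on the command line and otherwise
// falls back to the targets cached from the check-in response.
func resolveTargets(testName string, cliInputs []string,
	cliOptions map[string]any, load targetLoader) ([]target, error) {
	if len(cliInputs) > 0 {
		targets := make([]target, 0, len(cliInputs))
		for _, input := range cliInputs {
			targets = append(targets, target{Input: input, Options: cliOptions})
		}
		return targets, nil
	}
	cached, found := load(testName)
	if !found || len(cached) == 0 {
		return nil, fmt.Errorf("no input provided and no cached targets for %s", testName)
	}
	return cached, nil
}

// fakeCache simulates a cache that only knows about dnscheck.
func fakeCache(testName string) ([]target, bool) {
	if testName != "dnscheck" {
		return nil, false
	}
	return []target{{
		Input:   "https://dns.google/dns-query",
		Options: map[string]any{"HTTP3Enabled": true},
	}}, true
}

func main() {
	// Equivalent to running "./miniooni dnscheck" without any -i flag.
	targets, err := resolveTargets("dnscheck", nil, nil, fakeCache)
	if err != nil {
		panic(err)
	}
	for _, t := range targets {
		fmt.Println(t.Input, t.Options)
	}
}
```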

OONI Run v2

An OONI Run v2 descriptor contains a superset of the following information:

{
  "nettests": [{
    "inputs": ["https://dns.google/dns-query"],
    "options": {"HTTP3Enabled": true},
    "test_name": "dnscheck"
  }]
}

Note how OONI Run v2 nettest descriptors and miniooni command line invocations are isomorphic: there exists a data transformation to convert an OONI Run v2 nettest descriptor to a miniooni command line invocation, and vice versa.

Hence, supporting richer input for miniooni implies supporting it for OONI Run v2. (The proper course of action would be for ./miniooni command line invocations to produce ephemeral OONI Run v2 descriptors and then invoke the OONI Run v2 executor engine to perform measurements.)

Data processing pipeline

We add richer input to dnscheck, riseupvpn, signal, stunreachability, and torsf. Thus, we need to ask the question of where and how to process the richer-input-enhanced measurement results.

Regarding dnscheck and urlgetter, the best place to process the results is ooni/data, given that these results are best understood by exploding the measurement JSON and looking at individual observations.
Regarding riseupvpn, signal, stunreachability, and torsf, richer input is only meant to make the experiments more reliable and there is no need to change the way in which we are processing their results.

OONI Explorer

We will start exposing information about observations extracted using ooni/data.

Required Cleanups

We can safely delete /api/_/check-in, the experimental check-in API developed while experimenting with the design to support richer input. We don't need it anymore.

Rejected Designs

In ooni/2023-05-richer-input, we experimented with several designs for implementing richer input. The most promising design was the one based on an external DSL written in JSON, which described how we would run experiments. Unfortunately, this design does not allow for computing the results of measurements directly in the probe, given that the DSL is very limited and only allows for composing basic measurement primitives (e.g., DNS lookups, TCP connect, TLS handshake). It seems the correct way forward in this respect would be to write experiments in a high-level language (e.g., JavaScript) and serve this code to probes. However, this design is well beyond the original intent and scope of richer input; therefore, we should shelve it for now.

bassosimone · 2023-10-23T13:13:24Z

I have started to create follow-up issues for this issue. However, I am not done. When I am done, I will close this issue. I will try to do this as part of the current sprint.

bassosimone · 2023-10-24T09:30:15Z

@jbonisteel today we need to discuss whether to close this issue in light of the set of issues I have created or whether we need to additionally scope down and split some backend or explorer issues.

bassosimone added the documentation, priority/medium, and funder/drl2022-2024 labels Nov 18, 2022

bassosimone self-assigned this Nov 18, 2022

bassosimone mentioned this issue Feb 6, 2023

PoC: running experiments bypassing internal/engine ooni/probe-cli#1075

Draft

4 tasks

bassosimone mentioned this issue Apr 3, 2023

engine, api: protype new check-in API ooni/probe#2445

Closed

4 tasks

bassosimone mentioned this issue May 26, 2023

archival: add tags to each fundamental data type ooni/probe#2483

Closed

bassosimone added a commit to ooni/2023-05-richer-input that referenced this issue Jun 12, 2023

doc: reference relevant issues

7871d3b

* ooni/ooni.org#1291; * ooni/ooni.org#1292; * ooni/ooni.org#1295; * ooni/probe#2445.

This was referenced Sep 1, 2023

Support richer input - backend ooni/backend#713

Open

Support richer input - explorer ooni/explorer#870

Open

This was referenced Oct 23, 2023

richer input: support dnscheck ooni/backend#754

Open

dnscheck: use richer input ooni/probe#2597

Closed

check-in v1: richer input for urlgetter: discuss prototype ooni/backend#759

Closed

This was referenced Oct 24, 2023

engine: introduce executor for richer input ooni/probe#2607

Closed

Richer Input: ooni/spec - specify the check-in v2 data format #1429

Closed

jbonisteel closed this as completed Oct 24, 2023
