Author | @bassosimone |
Last-Updated | 2020-04-02 |
Status | historic |
Obsoleted-by | ooni/probe-engine#359 |
Obsoleted-by | ooni/probe-engine#522 |
Obsoleted-by | dd-003-step-by-step.md |
Abstract. Rationale and design of ooni/netx, which was later merged into ooni/probe-engine and ooni/probe-cli.
OONI experiments send and/or receive network traffic to determine if there is blocking. We want the implementation of OONI experiments to be as simple as possible. We also want to attribute errors to the major network or protocol operation that caused them.
At the same time, we want an experiment to collect as much low-level data as possible. For example, we want to know whether and when the TLS handshake completed; what certificates were provided by the server; what TLS version was selected; and so forth. These bits of information are very useful to analyze a measurement and better classify it.
We also want to automatically or manually run follow-up measurements where we change some configuration properties and repeat the measurement. For example, we may want to configure DNS over HTTPS (DoH) and then attempt to fetch again an URL. Or we may want to detect whether there is SNI bases blocking. This package allows us to do that in other parts of probe-engine.
As we observed ooni/probe-engine#13, every experiment consists of two separate phases:
-
measurement gathering
-
measurement analysis
During measurement gathering, we perform specific actions that cause network data to be sent and/or received. During measurement analysis, we process the measurement on the device. For some experiments (e.g., Web Connectivity), this second phase also entails contacting OONI backend services that provide data useful to complete the analysis.
This package implements measurement gathering. The analysis
is performed by other packages in probe-engine. The core
design idea is to provide OONI-measurements-aware replacements
for Go standard library interfaces, e.g., the
http.RoundTripper
. On top of that, we'll create all the
required interfaces to achive the measurement goals mentioned above.
We are of course writing test templates in probe-engine
anyway, because we need additional abstraction, but we can
take advantage of the fact that the API exposed by this package
is stable by definition, because it mimics the stdlib. Also,
for many experiments we can collect information pertaining
to TCP, DNS, TLS, and HTTP with a single call to netx
.
This code used to live at github.com/ooni/netx
. On 2020-03-02
we merged github.com/ooni/netx@4f8d645bce6466bb into probe-engine
because it was more practical and enabled easier refactoring.
Consistently with Go's terminology, we define HTTP round trip the process where we get a request to send; we find a suitable connection for sending it, or we create one; we send headers and possibly body; and we receive response headers.
We also define HTTP transaction the process starting with an HTTP round trip and terminating by reading the full response body.
We define netx replacement a Go struct of interface that has the same interface of a Go standard library object but additionally performs measurements.
This library MUST wrap error
such that:
-
we can classify all errors we care about; and
-
we can map them to major operations.
The github.com/ooni/netx/modelx
MUST contain a wrapper for
Go error
named ErrWrapper
that is at least like:
type ErrWrapper struct {
Failure string // error classification
Operation string // operation that caused error
WrappedErr error // the original error
}
func (e *ErrWrapper) Error() string {
return e.Failure
}
Where Failure
is one of the errors we care about, i.e.:
connection_refused
: ECONNREFUSEDconnection_reset
: ECONNRESETdns_bogon_error
: detected bogon in DNS replydns_nxdomain_error
: NXDOMAIN in DNS replyeof_error
: unexpected EOF on connectiongeneric_timeout_error
: some timer has expiredssl_invalid_hostname
: certificate not valid for SNIssl_unknown_authority
: cannot find CA validating certificatessl_invalid_certificate
: e.g. certificate expiredunknown_failure <string>
: any other error
Note that we care about bogons in DNS replies because they are often used to censor specific websites.
And where Operation
is one of:
resolve
: domain name resolutionconnect
: TCP connecttls_handshake
: TLS handshakehttp_round_trip
: reading/writing HTTP
The code in this library MUST wrap returned errors such
that we can cast back to ErrWrapper
during the analysis
phase, using Go 1.13 errors
library as follows:
var wrapper *modelx.ErrWrapper
if errors.As(err, &wrapper) == true {
// Do something with the error
}
We want to provide netx replacements for the following interfaces in the Go standard library:
-
http.RoundTripper
-
http.Client
-
net.Dialer
-
net.Resolver
Accordingly, we'll define the following interfaces in
the github.com/ooni/probe-engine/netx/modelx
package:
type DNSResolver interface {
LookupHost(ctx context.Context, hostname string) ([]string, error)
}
type Dialer interface {
Dial(network, address string) (net.Conn, error)
DialContext(ctx context.Context, network, address string) (net.Conn, error)
}
type TLSDialer interface {
DialTLS(network, address string) (net.Conn, error)
DialTLSContext(ctx context.Context, network, address string) (net.Conn, error)
}
We won't need an interface for http.RoundTripper
because it is already an interface, so we'll just use it.
Our replacements will implement these interfaces.
Using an API compatible with Go's standard libary makes
it possible to use, say, our net.Dialer
replacement with
other libraries. Both http.Transport
and
gorilla/websocket
's websocket.Dialer
have
functions like Dial
and DialContext
that can be
overriden. By overriding such function pointers,
we could use our replacements instead of the standard
libary, thus we could collect measurements while
using third party code to implement specific protocols.
Also, using interfaces allows us to combine code quite easily. For example, a resolver that detects bogons is easily implemented as a wrapper around another resolve that performs the real resolution.
The github.com/ooni/netx/modelx
package will define
an handler for low level events as:
type Handler interface {
OnMeasurement(Measurement)
}
We will provide a mechanism to bind a specific
handler to a context.Context
such that the handler
will receive all the measurements caused by code
using such context. This mechanism is like:
type MeasurementRoot struct {
Beginning time.Time // the "zero" time
Handler Handler // the handler to use
}
You will be able to assign a MeasurementRoot
to
a context by using the following function:
func WithMeasurementRoot(
ctx context.Context, root *MeasurementRoot) context.Context
which will return a clone of the original context
that uses the MeasurementRoot
. Pass this context to
any method of our replacements to get measurements.
Given such context, or a subcontext, you can get
back the original MeasurementRoot
using:
func ContextMeasurementRoot(ctx context.Context) *MeasurementRoot
which will return the context MeasurementRoot
or
nil
if none is set into the context. This is how our
internal code gets access to the MeasurementRoot
.
The github.com/ooni/probe-engine/netx
package MUST provide an API such
that you can construct and configure a net.Resolver
replacement
as follows:
r, err := netx.NewResolverWithoutHandler(dnsNetwork, dnsAddress)
if err != nil {
log.Fatal("cannot configure specifc resolver")
}
var resolver modelx.DNSResolver = r
// now use resolver ...
where DNSNetwork
and DNSAddress
configure the type
of the resolver as follows:
-
when
DNSNetwork
is""
or"system"
,DNSAddress
does not matter and we use the system resolver -
when
DNSNetwork
is"udp"
,DNSAddress
is the address or domain name, with optional port, of the DNS server (e.g.,8.8.8.8:53
) -
when
DNSNetwork
is"tcp"
,DNSAddress
is the address or domain name, with optional port, of the DNS server (e.g.,8.8.8.8:53
) -
when
DNSNetwork
is"dot"
,DNSAddress
is the address or domain name, with optional port, of the DNS server (e.g.,8.8.8.8:853
) -
when
DNSNetwork
is"doh"
,DNSAddress
is the URL of the DNS server (e.g.https://cloudflare-dns.com/dns-query
)
When the resolve is not the system one, we'll also be able
to emit events when performing resolution. Otherwise, we'll
just emit the DNSResolveDone
event defined below.
Any resolver returned by this function may be configured to return the
dns_bogon_error
if any LookupHost
lookup returns a bogon IP.
The package will also contain this function:
func ChainResolvers(
primary, secondary modelx.DNSResolver) modelx.DNSResolver
where you can create a new resolver where secondary
will be
invoked whenever primary
fails. This functionality allows
us to be more resilient and bypass automatically certain types
of censorship, e.g., a resolver returning a bogon.
The github.com/ooni/probe-engine/netx
package MUST also provide an API such
that you can construct and configure a net.Dialer
replacement
as follows:
d := netx.NewDialerWithoutHandler()
d.SetResolver(resolver)
d.ForceSpecificSNI("www.kernel.org")
d.SetCABundle("/etc/ssl/cert.pem")
d.ForceSkipVerify()
var dialer modelx.Dialer = d
// now use dialer
where SetResolver
allows you to change the resolver,
ForceSpecificSNI
forces the TLS dials to use such SNI
instead of using the provided domain, SetCABundle
allows to set a specific CA bundle, and ForceSkipVerify
allows to disable certificate verification. All these funcs
MUST NOT be invoked once you're using the dialer.
The github.com/ooni/probe-engine/netx
package MUST contain
code so that we can do:
t := netx.NewHTTPTransportWithProxyFunc(
http.ProxyFromEnvironment,
)
t.SetResolver(resolver)
t.ForceSpecificSNI("www.kernel.org")
t.SetCABundle("/etc/ssl/cert.pem")
t.ForceSkipVerify()
var transport http.RoundTripper = t
// now use transport
where the functions have the same semantics as the namesake functions described before and the same caveats.
We also have syntactic sugar on top of that and legacy methods, but this fully describes the design.
The github.com/ooni/probe-engine/netx/modelx
will contain the
definition of low-level events. We are interested in
knowing the following:
-
the timing and result of each I/O operation.
-
the timing of HTTP events occurring during the lifecycle of an HTTP request.
-
the timing and result of the TLS handshake including the negotiated TLS version and other details such as what certificates the server has provided.
-
DNS events, e.g. queries and replies, generated as part of using DoT and DoH.
We will represent time as a time.Duration
since the
beginning configured either in the context or when
constructing an object. The modelx
package will also
define the Measurement
event as follows:
type Measurement struct {
Connect *ConnectEvent
HTTPConnectionReady *HTTPConnectionReadyEvent
HTTPRoundTripDone *HTTPRoundTripDoneEvent
ResolveDone *ResolveDoneEvent
TLSHandshakeDone *TLSHandshakeDoneEvent
}
The events above MUST always be present, but more
events will likely be available. The structure
will contain a pointer for every event that
we support. The events processing code will check
what pointer or pointers are not nil
to known
which event or events have occurred.
To simplify joining events together the following holds:
-
when we're establishing a new connection there is a nonzero
DialID
shared byConnect
andResolveDone
-
a new connection has a nonzero
ConnID
that is emitted as part of a successfulConnect
event -
during an HTTP transaction there is a nonzero
TransactionID
shared byHTTPConnectionReady
andHTTPRoundTripDone
-
if the TLS handshake is invoked by HTTP code it will have a nonzero
TrasactionID
otherwise a nonzeroConnID
-
the
HTTPConnectionReady
will also see theConnID
-
when a transaction starts dialing, it will pass its
TransactionID
toResolveDone
andConnect
-
when we're dialing a connection for DoH, we pass the
DialID
to theHTTPConnectionReady
event as well
Because of the following rules, it should always be possible
to bind together events. Also, we define more events than the
above, but they are ancillary to the above events. Also, the
main reason why HTTPConnectionReady
is here is because it is
the event allowing to bind ConnID
and TransactionID
.