-
Notifications
You must be signed in to change notification settings - Fork 1.7k
TransportDesign
Concurrency/Race issues -- Golang provides an excellent API for sharing data between threads (channels) and we aren't taking advantage of them at all. There is a big mental burden (and chance of error) by locking/unlocking in the networkManager.
Maintainability -- people find the current design confusing. Would prefer more clear boundaries between components. This would also look like the ECMAScript WebRTC API. However, they have zero relation. This is internal to pion-WebRTC and has no effect on a user.
Callback heavy -- Everytime we want to add new features we add a new callback, and it is making things cumbersome
Manages all network traffic. To send a packet call a method on the networkManager.
Responding to an inbound packet is done via callbacks. When you create a networkManager we have multiple distinct callbacks for events (RTP, SCTP, ICE etc..)
Handles ICE status. NetworkManager pushes packets on it, and uses callbacks when it wants to send traffic.
Handles encryption for SCTP packets, and DTLS handshake. NetworkManager pushes packets on it, and uses callbacks when it wants to send traffic.
Protocol used by DataChannels. NetworkManager pushes packets on it, and uses callbacks when it wants to send traffic.
Used for Encrypting/Decrypting audio/video data. This is used by the NetworkManager directly, it isn't async so provides simple Encrypt/Decrypt blocking APIs
This proposal is inspired by work done in libp2p/go-libp2p-transport
, perlin-network/noise
and xtaci/kcp-go
.
In addition, it mirrors some basic interfaces in the io
and net
package.
Note: Some details may be missing but the overall approach seems feasible.
// Transport represents a layer-able transport
type Transport interface {
Dial(Conn) (Conn, error)
Listen(Conn) (net.Listener, error)
}
// Conn is a minimal clone of the net.Conn interface to reduce implementation overhead. Could be completed down the line.
type Conn interface {
// Read reads data from the connection.
Read(b []byte) (n int, err error)
// Write writes data to the connection.
Write(b []byte) (n int, err error)
// LocalAddr returns the local network address.
LocalAddr() net.Addr
// RemoteAddr returns the remote network address.
RemoteAddr() net.Addr
// Close shuts down the transport and the underlying transports
// Any blocked Accept operations will be unblocked and return errors.
Close()
}
The net.Listener interface for reference:
type Listener interface {
// Accept waits for and returns the next connection to the listener.
Accept() (Conn, error)
// Close closes the listener.
// Any blocked Accept operations will be unblocked and return errors.
Close() error
// Addr returns the listener's network address.
Addr() Addr
}
With the following instances:
- internal/Transport.ICE
- Dial & Listen return once the first candidate valid pair binding succeeds. At that point upper lagers may start talking.
- Read/Write calls Read/Write on the current best valid candidate pair.
- internal/Transport.DTLS
- Dial & Listen return once the handshake succeeds. At that point upper lagers may start talking.
- Read/Write encrypts the data and calls Read/Write on the underlying conn.
- internal/Transport.SRTP
- Read/Write calls Read/Write on the underlying conn.
- internal/Transport.SCTP
- Dial & Listen return once SCTP Init succeeds. At that point upper lagers may start talking.
- Read/Write do the needed queuing and calls Read/Write on the underlying conn.
- internal/Transport.DataChannel: In charge data channel connections (Seems best to split Binary/Text in the WebRTC layer)
- Dial & Listen return once the data channel is open. At that point upper layers may start talking.
- Read/Write wraps the data and calls Read/Write on the underlying conn.
Each instance has a specialized constructor for passing options. In addition, they expose their own specific methods for introspection by the upper WebRTC layer. Some of the protocol RFCs actually contain recommendations for those. Any values used by these methods will be translated into their RTP* alternatives by the WebRTC layer this entails some code duplication but the internal state will be safe.
It seems advisable to create a TransportBuilder
that can dynamically build the correct transport based on the SDP info. This is also where protocol re-starts would be handled. And we could do things like avoid spinning up SCTP when no data channels are used.
The following show an example of API usage. The examples hides some edge cases and error handling for clarity.
// # Initialization
iceTransport := NewICETransport(specificOptions...)
dtlsTransport := NewDTLSTransport(specificOptions...)
srtpTransport := NewSRTPTransport(specificOptions...)
sctpTransport := NewSCTPTransport(specificOptions...)
dcTransport := NewDCTransport(specificOptions...)
// # Dial
// Note: This example will probably need some muxers for (de-)multiplexing the different protocols, E.g.: NewABMux(a Conn, b Conn) Conn
// I see this as a positive since now this logic is kinda hidden in port-send and port-receive
iceConn, _ := iceTransport.Dial(nil)
dtlsConn, _ := dtlsTransport.Dial(iceConn)
srtpConn, _ := srtpTransport.Dial(dtlsConn)
sctpConn, _ := sctpTransport.Dial(dtlsConn)
dcConn, _ := dcTransport.Dial(sctpConn)
// Listen
iceListener, _ := iceTransport.Listen(nil)
iceConn, _ := iceListener.Accept()
dtlsListener, _ := dtlsTransport.Listen(iceConn)
dtlsConn, _ := dtlsListener.Accept()
// etc...
// # Sending data
// Option 1: Marshal and call write on the conn:
var p *rtp.Packet
b, _ := p.Marshal()
_, _ := srtpConn.Write(b)
// Option 2: Pass the io.Writer:
var p *rtp.Packet
p.WriteTo(srtpConn)
// # Writing data
// Option 1: Read on the conn and unmarshal:
b := make([]byte, 1024)
_, _ = srtpConn.Read(b)
var p *rtp.Packet
_ = p.Unmarshal(b, p)
// Option 2: Pass the io.Reader:
var p *rtp.Packet
p.ReadFrom(srtpConn)
- The API is familiar to anyone that has worked with a io.ReadWriteCloser or the net package.
- The API is blocking, reducing the need for 'OnConnected' or 'OnClose' events. Instead, the caller just waits for Dial or Accept to finish.
- For SRTP we could provide a io.ReadWriter all the way into the RTCTrack object. Intensive DataChannels apps may want a similar option, can be investigated in the future.
- Hardcore separation of concerns (Hopefully not to much).
- If we want to add buffering we can create an instance of
Conn
around abytes.Buffer
and inject them into the transport stack. - There is an argument for making STUN/TURN a transport as well. I kept it out of the above discussion for clarity, but:
- The STUN transport can keep its binding alive (becomes even more important in the TCP case).
- The TURN transport can keep its association/channel alive.
- Making them separate transports could keep this additional complexity out of the ICE package.
- We could take a shortcut and put
Accept
directly onTransport
but this seems like a minor sacrifice for sticking to the standard interfaces. - I'm not suggesting we do this now but if/when we want to expose protocols in their own package at some point in the future we only need to add basic
Dial(address string)
andAccept(address string)
methods. (I do expect there will be a high demand for DTLS.)
Sean Backkem's API is 100% the way we should go. Reading through it now all seems so obvious! Thanks for taking the time to explain/formalize this!
Sean The tight coupling of Transports now allows them to talk to each other. One scenario is that DTLS must send a message to SRTP when a certificate is ready, this can't be done via Dial. SRTP doesn't use DTLS, it just gets a certificate from it.
How would you implement this? Can we have a generic API for 'events'?
Backkem Ah, I didn't know about this before. Basic options:
- Add an event interface. Advantage: You get it everywhere. Disadvantage: Likely lots of generic String and Interface{} types.
- Specific implementations per transport: Advantage: Static types, self documenting. Disadvantage: Custom logic needed to wire the transports together. (Note that a generic event interface would still need custom logic in the upper transport to interpret the generic events being thrown.)
==> One thought I had: The TransportBuilder wraps DTLS.Dial
and DTLS.Accept
with a tiny wrapper function. This function takes the concrete DTLS.Conn struct returned by these dial/accept methods. This concrete DTLSConn exposes the certPair. The wrapper passes the certpair to SRTP and returns the Conn to the upper layers. Example for DTLS.Dial
:
func WrapDTLSDial(dial func(Transport.Conn) (*DTLS.Conn, error)) {
conn, err := dial()
if err != nil {
return nil, err
}
srtpConn.SetCertPair(conn.certPair) // Should be possible to find a way to make srtpConn available here.
return conn, nil
}
The TransportBuilder would be in charge of stitching all this together. This allows static typing and moves the stitching magic from the transport implementations to the TransportBuilder (which is what it's for).
I don't think this will be large performance concern, but is it an issue we need to parse multiple times? This might be doing lots of pointless copies. If Transports can only send/recv []byte
will we be Marshaling/Unmarshaling multiple times? How would you implement DataChannel sending an SCTP message?
1.) DataChannel Transport constructs a SCTP struct, and populates the values 2.) DataChannel does an unmarshal to send via SCTP 3.) SCTP Marshals because it needs data about the SCTP packet we want to send
Backkem Indeed, the transport boundaries seem less clear cut than I originally thought.
==> One thought I had is that maybe trying to force the Transport
interface on every layer isn't needed. The TransportBuilder (or TransportManager) can be smart enough to glue some different things together. In our case it seems more important to make the individual transports do their own thing rather than force them to conform to the Transport
interface.
==> (Incomplete) idea for SCTP: The SCTP.Dial (or some other function) could return a struct that implements a 'Streamer' interface:
type Streamer interface {
Stream(streamIdentifier uint16, payloadType PayloadProtocolIdentifier) Conn
}
Instead of consuming a Conn directly, the DataChannel transport can be get Conn
s from the Streamer. Now you can write raw data to the Conn returned by Steam() which can automatically (de-)chunk the data.
-> The 'sad' part is that a data channel would have to open 5 'streams' in the worst case: DCEP, String, StringEmpty, Binary, BinaryEmpty (Who comes up with these things...). I don't hate the high amount of streams but don't love it either, maybe there is an iteration of this idea that makes it cleaner...
Sean I think this is only an issue for SRTP/DataChannel/ICE but what will LocalAddr/RemoteAddr return (just be nil?) I actually don't even think this an issue just asserting! But that is really cool, people can use ICE and hook it up to anything that wants to write generic data over the internet.
Backkem Re-thinking it, we don't really care about LocalAddr and RemoteAddr within the context of Pion. Maybe we can just drop them and make the Conn interface look like a regular io.ReadWriteCloser. If we ever want to separate a transport into an individual package we can add these missing interfaces when needed. As long as the common ones are the same we should be good.
Quick write-up of my current thinking around the RTP refactor.
- The ideas basically mirror the work done for package sctp & datachannel.
- This is a high level sketch. It may need refinement to put into practice.
- Make the SRTP Context generic
- it already just duplicates most variables
- Here it's annoying that EncryptRTP and EncryptRTCP have a different signature.
- Create a Conn for SRTP
- Has an incoming and outgoing context.
- Methods:
Client(net.Conn, srtp.Config) *Session // Config passes KeyingMaterial & RTP profile
Server(net.Conn, srtp.Config) *Session
session.OpenStream(SSRC uint16) (*Stream, error) // See sctp.OpenStream
session.AcceptStream() (*Stream, error) // Here there may be need for a little hack around PayloadType; See sctp.AcceptStream
session.Close() error
stream.Read([]byte) (int, error) // Returns payload in byte, drops header; Satisfies io.Reader for easy testing (see sctp.Read); decrypts using the correct context
stream.ReadSRTP([]byte) (int, rtp.Header, error) // Returns payload in byte, header in return argument (see sctp.ReadSCTP); decrypts using the correct context
stream.Write([]byte) (int, error) // Uses default (useful if possible) RTP header; Satisfies io.Writer for easy testing (see sctp.Write); encrypts using the correct context
stream.WriteSRTP([]byte, rtp.Header) (int, error) // Allows writing with a specific header (see sctp.WriteSRTP); encrypts using the correct context
stream.Close() error
- Change the packetizers to a Conn to ingest samples.
- Methods:
New(*srtp.Conn) *Conn // Constructor, direction doesn't matter I think
conn.Read([]byte) (int, error) // Assumes 1 sample; Satisfies io.Reader for easy testing (see datachannel.Read)
conn.ReadSamples([]byte) (int, samples uint32, error) // Returns payload in byte, amount of samples in return argument (see datachannel.ReadDataChannel)
conn.Write([]byte) (int, error) // Write 1 sample; Satisfies io.Writer for easy testing (see datachannel.Write)
conn.WriteSamples([]byte, samples uint32) (int, error) // Allows writing a specific amount of samples (see datachannel.WriteDataChannel)
- Create a Conn for SRTCP
- Has an incoming and outgoing context.
- Methods:
Client(net.Conn, srtp.Config) *Session // Config passes KeyingMaterial & RTP profile
Server(net.Conn, srtp.Config) *Session
session.OpenStream(SSRC uint16) (*Stream, error) // See sctp.OpenStream
session.AcceptStream() (*Stream, error) // See sctp.AcceptStream
session.Close() error
stream.Read([]byte) (int, error) // Returns payload in byte, drops header; Satisfies io.Reader for easy testing (see sctp.Read); decrypts using the correct context
stream.ReadSRTCP([]byte) (int, rtcp.Header, error) // Returns payload in byte, header in return argument (see sctp.ReadSCTP); decrypts using the correct context
stream.Write([]byte) (int, error) // Uses some default RTCP header; Satisfies io.Writer for easy testing (see sctp.Write); encrypts using the correct context
stream.WriteSRTCP([]byte, rtcp.Header) (int, error) // Allows writing with a specific header (see sctp.WriteSRTP); encrypts using the correct context
stream.Close() error
- Mux everything using the same mux that is already in the network Manager:
mx := mux.NewMux(iceConn, receiveMTU)
dtlsEndpoint = mx.NewEndpoint(mux.MatchDTLS) // Mux DTLS
dtlsConn := setupDTLS(dtlsEndpoint) // Runs dtls.Client/Server to connect DTLS
srtpEndpoint = mx.NewEndpoint(mux.MatchSRTP) // Note: Change matching to only SRTP, not SRTCP
srtpConn := setupSRTP(srtpEndpoint, config) // Config is based on keying material from dtlsConn
srtcpEndpoint = mx.NewEndpoint(mux.MatchSRTCP) // Note: Change matching to only SRTCP, not SRTP
srtpConn := setupSRTP(srtcpEndpoint, config) // Config is based on keying material from dtlsConn
Sign up for the Golang Slack and join the #pion channel for discussions and support
If you need commercial support/don't want to use public methods you can contact us at [email protected]