The present document describes the protocol and encoding a client/server pair must follow to establish a successful Socket.IO connection.
The present document describes the latest version of the protocol, 1.
Versions are a single integer incremented with each revision of the protocol.
Socket.IO aims to bring a WebSocket-like API to many browsers and devices, with some specific features to help with the creation of real-world realtime applications and games.
- Multiple transport support (old user agents, mobile browsers, etc).
- Multiple sockets under the same connection (namespaces).
- Disconnection detection through heartbeats.
- Optional acknowledgments.
- Reconnection support with buffering (ideal for mobile devices or bad networks).
- Lightweight protocol that sits on top of HTTP.
A Socket.IO client first decides on a transport to utilize to connect.
The state of the Socket.IO socket can be `disconnected`, `disconnecting`, `connected` and `connecting`.
The transport connection can be `closed`, `closing`, `open`, and `opening`.
A simple HTTP handshake takes place at the beginning of a Socket.IO connection. The handshake, if successful, results in the client receiving:
- A session id that will be given to the transport to open connections.
- A number of seconds within which a heartbeat is expected (`heartbeat timeout`).
- A number of seconds after the transport connection is closed when the socket is considered disconnected if the transport connection is not reopened (`close timeout`).
At this point the socket is considered connected, and the transport is signaled to open the connection.
If the transport connection is closed, both ends are to buffer messages and then frame them appropriately for them to be sent as a batch when the connection resumes.
If the connection is not resumed within the negotiated timeout the socket is considered disconnected. At this point the client might decide to reconnect the socket, which implies a new handshake.
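The buffering and timeout behaviour described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the class name `BufferedSocket`, the `deliver` callback and the explicit `now` arguments are all hypothetical, and discarding the buffer on disconnection is an assumption the text does not state.

```python
class BufferedSocket:
    """Buffers outgoing messages while the transport connection is closed."""

    def __init__(self, close_timeout, deliver):
        self.close_timeout = close_timeout  # seconds, negotiated in the handshake
        self.deliver = deliver              # hypothetical callable taking a list of messages
        self.state = "connected"
        self.transport_open = True
        self.closed_at = None
        self.buffer = []

    def send(self, message):
        if self.state != "connected":
            raise RuntimeError("socket is disconnected")
        if self.transport_open:
            self.deliver([message])
        else:
            # Transport is closed: keep the message for the next batch.
            self.buffer.append(message)

    def transport_closed(self, now):
        self.transport_open = False
        self.closed_at = now

    def transport_reopened(self, now):
        if now - self.closed_at > self.close_timeout:
            # Not resumed in time: the socket is considered disconnected.
            # Dropping the buffer here is an assumption of this sketch.
            self.state = "disconnected"
            self.buffer = []
            return
        self.transport_open = True
        self.closed_at = None
        if self.buffer:
            # Send everything that accumulated as a single batch.
            self.deliver(self.buffer)
            self.buffer = []
```

After a disconnection, a client that wants to continue would perform a new handshake and create a fresh socket rather than reuse this one.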
Socket.IO HTTP URIs take the form of:
[scheme] '://' [host] '/' [namespace] '/' [protocol version] '/' [transport id] '/' [session id] '/' ( '?' [query] )
Only the methods `GET` and `POST` are utilized (for the sake of compatibility with old user agents), and their usage varies according to each transport.
The main transport connection is always a `GET` request.
The URI scheme is decided based on whether the client requires a secure connection or not. Defaults to `http`, but `https` is the recommended one.
The host where the Socket.IO server is located. In the browser environment, it defaults to the host that runs the page where the client is loaded (`location.host`).
The connecting client has to provide the namespace where the Socket.IO requests are intercepted.
This defaults to `socket.io` for all client and server distributions.
Each client should ship with the revision ID it supports, available as a public interface to developers.
For example, the browser client supports `io.protocolVersion`.
The following transports are supported:
- `xhr-polling`
- `xhr-multipart`
- `htmlfile`
- `websocket`
- `flashsocket`
- `jsonp-polling`
The client first figures out what transport to use. Usually this occurs in the browser, utilizing feature detection.
User-defined transports are allowed.
The query component (eg: `?token=48737481747&`) is not present on all URLs.
Certain query keys are reserved by Socket.IO:
- `t`: Contains a timestamp, only used to bypass caching on certain old UAs.
- `disconnect`: Triggers a disconnection.
User-defined query components are allowed. For example, `?t=1238141910&token=mytoken` is a valid query.
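The URI components described above can be assembled as in the following sketch. The function name `build_uri` and its parameters are hypothetical. The sketch follows the example URLs in this document, where the handshake URI ends in a trailing slash and transport URIs do not.

```python
from urllib.parse import urlencode


def build_uri(host, transport_id=None, session_id=None, secure=False,
              namespace="socket.io", protocol_version=1, query=None):
    """Assemble a Socket.IO HTTP URI from its components (illustrative only)."""
    scheme = "https" if secure else "http"
    uri = f"{scheme}://{host}/{namespace}/{protocol_version}/"
    # Omitting the transport id and session id yields the handshake URI.
    if transport_id and session_id:
        uri += f"{transport_id}/{session_id}"
    if query:
        # User-defined keys may sit alongside the reserved `t` key.
        uri += "?" + urlencode(query)
    return uri
```

For instance, `build_uri("example.com")` produces the handshake URI `http://example.com/socket.io/1/`.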
The client will perform an initial HTTP POST request like the following:
http://example.com/socket.io/1/
The absence of the `transport id` and `session id` segments will signal to the server that this is a new, non-handshaken connection.
The server can respond in three different ways:
- 401 Unauthorized
  If the server refuses to authorize the client to connect, based on the supplied information (eg: `Cookie` header or custom query components).
  No response body is required.
- 503 Service Unavailable
  If the server refuses the connection for any reason (eg: overload).
  No response body is required.
- 200 OK
  The handshake was successful.
  The body of the response should contain the session id (`sid`) given to the client, followed by the heartbeat timeout, the connection closing timeout, and the list of supported transports. The fields are separated by `:` and the transports by `,`.
  The absence of a heartbeat timeout ('') is interpreted as the server and client not expecting heartbeats.
  For example `4d4f185e96a7b:15:10:websocket,xhr-polling`.
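A successful handshake body can be parsed along these lines. The names `Handshake` and `parse_handshake` are hypothetical. Treating an empty close timeout as "not set" is an assumption, since the text only defines the meaning of an empty heartbeat timeout.

```python
from typing import List, NamedTuple, Optional


class Handshake(NamedTuple):
    sid: str
    heartbeat_timeout: Optional[int]  # None: no heartbeats expected
    close_timeout: Optional[int]      # None for an empty field (assumption)
    transports: List[str]


def parse_handshake(body: str) -> Handshake:
    """Split a `200 OK` handshake body into its four colon-separated fields."""
    sid, heartbeat, close, transports = body.split(":", 3)
    return Handshake(
        sid=sid,
        heartbeat_timeout=int(heartbeat) if heartbeat else None,
        close_timeout=int(close) if close else None,
        transports=transports.split(",") if transports else [],
    )
```

A body with fewer than four fields raises `ValueError` during unpacking, which a client could treat as a failed handshake.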
Once the handshake request-response cycle is complete (and it ended with success), a new connection is opened by the transport that was negotiated, with a `GET` HTTP request.
The transport can modify the URI if the transport requires it, as long as no information is lost. For example, if `websocket` is accepted as the transport, and the connection was secure, the URI for the transport connection will become:
wss://example.com/socket.io/1/websocket/4d4f185e96a7b
The URI still contains all the information required by Socket.IO to continue the message exchange (protocol security, namespace, protocol version, transport, etc).
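The scheme rewrite for the `websocket` transport might look like the sketch below. The helper name `to_websocket_uri` is hypothetical, and mapping `http` to `ws` is an assumption by analogy, since the text only shows the secure case.

```python
from urllib.parse import urlsplit, urlunsplit


def to_websocket_uri(uri: str) -> str:
    """Swap the HTTP scheme for its WebSocket counterpart, keeping everything else."""
    parts = urlsplit(uri)
    scheme = {"http": "ws", "https": "wss"}[parts.scheme]
    # Host, path and query are preserved, so no information is lost.
    return urlunsplit((scheme, parts.netloc, parts.path, parts.query, parts.fragment))
```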
Messages can be sent and received by following this convention. How the messages are encoded and framed depends on each transport, but generally boils down to whether the transport has built-in framing (unidirectionally and/or bidirectionally).
Transports that initialize unidirectional connections (where the server can write to the client but not vice-versa) should perform `POST` requests to the same endpoint URI to send data back to the server.
Certain transports, like `websocket` or `flashsocket`, have built-in lightweight framing mechanisms for sending and receiving messages.
For `xhr-multipart`, the built-in MIME framing is used for the sake of consistency.
When no built-in lightweight framing is available, and multiple messages need to be delivered (i.e: buffered messages), the following is used:
`\ufffd` [message length] `\ufffd`
This applies to transports where the framing overhead is expensive (ie: when the xhr-polling transport tries to send data to the server).
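The framing above can be illustrated with the following sketch. It makes several assumptions that the text does not spell out: each `\ufffd` [message length] `\ufffd` prefix is immediately followed by the message itself, the length counts characters of the encoded message, and a single message is sent unframed. The function names are hypothetical.

```python
DELIM = "\ufffd"


def encode_payload(messages):
    """Frame several encoded messages into one payload string."""
    if len(messages) == 1:
        return messages[0]  # assumption: a lone message needs no framing
    return "".join(f"{DELIM}{len(m)}{DELIM}{m}" for m in messages)


def decode_payload(payload):
    """Split a payload back into its individual encoded messages."""
    if not payload.startswith(DELIM):
        return [payload] if payload else []
    messages = []
    i = 0
    while i < len(payload):
        # payload[i] is the opening delimiter; find the one closing the length.
        end = payload.index(DELIM, i + 1)
        length = int(payload[i + 1:end])
        start = end + 1
        messages.append(payload[start:start + length])
        i = start + length
    return messages
```

Note that Python's `len` counts code points, whereas a JavaScript client counts UTF-16 code units, so the two can disagree for characters outside the Basic Multilingual Plane.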
Messages have to be encoded before they're sent. The structure of a message is as follows:
[message type] ':' [message id ('+')] ':' [message endpoint] (':' [message data])
The message type is a single digit integer.
The message id is an incremental integer, required for ACKs (can be omitted).
If the message id is followed by a `+`, the ACK is not handled by socket.io, but by the user instead.
Socket.IO has built-in support for multiple channels of communication (which we call "multiple sockets"). Each socket is identified by an endpoint (can be omitted).
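The message structure above can be encoded and decoded as in this sketch. `Message`, `decode_message` and `encode_message` are hypothetical names. The decoder limits the split to three colons so that any colons inside the message data survive intact.

```python
from typing import NamedTuple, Optional


class Message(NamedTuple):
    msg_type: int
    msg_id: Optional[int]  # None when the id is omitted
    user_ack: bool         # True when the id is followed by '+'
    endpoint: str          # '' when the endpoint is omitted
    data: Optional[str]


def decode_message(raw: str) -> Message:
    parts = raw.split(":", 3)
    parts += [""] * (3 - len(parts))  # pad short forms such as "0"
    msg_type, msg_id, endpoint = parts[:3]
    data = parts[3] if len(parts) > 3 else None
    user_ack = msg_id.endswith("+")
    msg_id = msg_id.rstrip("+")
    return Message(int(msg_type), int(msg_id) if msg_id else None,
                   user_ack, endpoint, data)


def encode_message(msg: Message) -> str:
    msg_id = "" if msg.msg_id is None else f"{msg.msg_id}{'+' if msg.user_ack else ''}"
    raw = f"{msg.msg_type}:{msg_id}:{msg.endpoint}"
    if msg.data is not None:
        raw += ":" + msg.data
    return raw
```

This encoder always writes both separators, so a bare disconnect comes out as `0::` rather than `0`. The decoder accepts either form.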
Signals disconnection. If no endpoint is specified, disconnects the entire socket.
Examples:
- Disconnect a socket connected to the `/test` endpoint.
  0::/test
- Disconnect the whole socket
  0
Only used for multiple sockets. Signals a connection to the endpoint. Once the server receives it, it's echoed back to the client.
For example, if the client is trying to connect to the endpoint `/test`, a message like this will be delivered:
'1::' [path] [query]
Example:
1::/test?my=param
To acknowledge the connection, the server echoes back the message. Otherwise, the server might want to respond with an error packet.
Sends a heartbeat. Heartbeats must be sent within the interval negotiated with the server. It's up to the client to decide the padding (for example, if the heartbeat timeout negotiated with the server is 20s, the client might want to send a heartbeat every 15s).
'3:' [message id ('+')] ':' [message endpoint] ':' [data]
A regular message.
3:1::blabla
'4:' [message id ('+')] ':' [message endpoint] ':' [json]
A JSON encoded message.
4:1::{"a":"b"}
'5:' [message id ('+')] ':' [message endpoint] ':' [json encoded event]
An event is like a json message, but has mandatory `name` and `args` fields. `name` is a string and `args` an array.
The event names `'message'`, `'connect'`, `'disconnect'`, `'open'`, `'close'`, `'error'`, `'retry'` and `'reconnect'` are reserved, and cannot be used by clients or servers with this message type.
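An event packet can be built as in the sketch below, which also rejects the reserved names listed above. The function name `encode_event` and its parameters are hypothetical, and the compact JSON separators are a stylistic choice rather than a requirement of the protocol.

```python
import json

RESERVED_EVENTS = {
    "message", "connect", "disconnect", "open",
    "close", "error", "retry", "reconnect",
}


def encode_event(name, args, endpoint="", msg_id=None):
    """Encode a type-5 (event) message with mandatory `name` and `args` fields."""
    if name in RESERVED_EVENTS:
        raise ValueError(f"event name {name!r} is reserved")
    payload = json.dumps({"name": name, "args": list(args)}, separators=(",", ":"))
    id_part = "" if msg_id is None else str(msg_id)
    return f"5:{id_part}:{endpoint}:{payload}"
```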
'6:::' [message id] '+' [data]
An acknowledgment contains the message id as the message data. If a `+` sign follows the message id, it's treated as an event message packet.
Example 1: simple acknowledgement
6:::4
Example 2: complex acknowledgement
6:::4+["A","B"]
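The data portion of an acknowledgment (the part after `6:::`) can be split as in this sketch. The name `decode_ack` is hypothetical, and returning an empty list for a simple acknowledgment is a choice made here for convenience.

```python
import json


def decode_ack(data: str):
    """Return (acknowledged message id, list of arguments) from ACK data."""
    msg_id, sep, args = data.partition("+")
    # A simple ACK carries only the id; a complex one appends '+' and a JSON array.
    return int(msg_id), (json.loads(args) if sep and args else [])
```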
'7::' [endpoint] ':' [reason] '+' [advice]
For example, if a connection to a sub-socket is unauthorized.
No operation. Used for example to close a poll after the polling duration times out.
A Socket.IO server must provide an endpoint to force the disconnection of the socket.
While closing the transport connection is enough to trigger a disconnection, it sometimes is desirable to make sure no timeouts are activated and the disconnection events fire immediately.
http://example.com/socket.io/1/xhr-polling/812738127387123?disconnect
The server must respond with `200 OK`, or `500` if a problem is detected.