add HTTP spec #508

Merged Jun 13, 2024 (38 commits)

Commits
bc1aa59
add HTTP spec
marten-seemann Jan 22, 2023
1f075f6
2nd attempt for server auth
marten-seemann Jan 29, 2023
12f86b8
require client to authenticate the server when doing client auth
marten-seemann Jan 29, 2023
146c09a
better motivation for libp2p+HTTP (#515)
marten-seemann Feb 14, 2023
5398f5d
fix a few typos
marten-seemann Feb 14, 2023
b6c1bc2
http: use .well-known/libp2p.json for configuration
marten-seemann Mar 2, 2023
8a57943
http: nest libp2p.json config to allow for future configuration
marten-seemann Mar 2, 2023
d506145
Merge pull request #529 from libp2p/http-well-known-configuration
MarcoPolo Jun 1, 2023
946f516
Reformat the spec from the Point of View of an implementer
MarcoPolo Jul 7, 2023
3681472
Add link
MarcoPolo Jul 7, 2023
dd5d07c
Merge comments
MarcoPolo Jul 10, 2023
46d1857
Merge pull request #556 from libp2p/marco/http-update
MarcoPolo Jul 10, 2023
ebe612c
Add note about how this is just one possible auth mechanism
MarcoPolo Jul 10, 2023
7e5a077
Add lidel to interest group
MarcoPolo Jul 14, 2023
db2b3b5
Update http/README.md
MarcoPolo Jul 17, 2023
6319458
Formatting
MarcoPolo Jul 17, 2023
c7c9c43
Add thomas
MarcoPolo Jul 17, 2023
454e25c
Use metadata map and call it protocols
MarcoPolo Jul 17, 2023
a25267b
Add mermaid diagrom for HTTP semantics vs transport
MarcoPolo Jul 17, 2023
3014b22
Grammar fixes
MarcoPolo Jul 17, 2023
f96359b
Lidel suggestions
MarcoPolo Jul 17, 2023
1e87960
Define where the libp2p-token will be
MarcoPolo Jul 17, 2023
d0f0d93
Grammar fix
MarcoPolo Jul 17, 2023
8fbd64a
Specify IX vs NX in auth scheme
MarcoPolo Jul 19, 2023
71415b0
Add SNI and HTTP_libp2p_token to Noise extensions
MarcoPolo Jul 19, 2023
4a03bb0
Reword Namespace section a bit
MarcoPolo Aug 2, 2023
877899d
Remove SNI and token from extensions
MarcoPolo Aug 2, 2023
dc71f2c
Define the multiaddr URI
MarcoPolo Aug 24, 2023
d8850aa
update protocol name for IPFS gateway
marten-seemann Oct 4, 2023
78e8ca1
Be clear about no pipelining
MarcoPolo Mar 14, 2024
d30efda
Use SHOULD instead of MUST
MarcoPolo Mar 18, 2024
8628b5a
Update RFC for connection: close
MarcoPolo Apr 3, 2024
3c0ac40
Rename well-known
MarcoPolo Apr 3, 2024
75bc635
Add sentence on why POST and other mappings
MarcoPolo Apr 3, 2024
f95e4db
Sukun's review comments
MarcoPolo Apr 15, 2024
e3eb9dc
Small typo fixes
MarcoPolo Apr 15, 2024
95ffe6d
Update to http-path
MarcoPolo Jun 3, 2024
8f44d00
Merge pull request #568 from libp2p/marco/multiaddr-scheme
MarcoPolo Jun 5, 2024
151 changes: 151 additions & 0 deletions http/README.md
@@ -0,0 +1,151 @@
# HTTP

| Lifecycle Stage | Maturity | Status | Latest Revision |
| --------------- | ------------- | ------ | --------------- |
| 1A | Working Draft | Active | r0, 2023-01-23 |

Authors: [@marten-seemann], [@MarcoPolo]

Interest Group: [@lidel], [@thomaseizinger]

[@marten-seemann]: https://github.com/marten-seemann
[@MarcoPolo]: https://github.com/MarcoPolo
[@lidel]: https://github.com/lidel
[@thomaseizinger]: https://github.com/thomaseizinger

## Introduction

This document defines how libp2p nodes can offer and use an HTTP transport alongside their other transports to support application protocols with HTTP semantics. This allows a wider variety of nodes to participate in the libp2p network, for example:

- Browsers can communicate with other libp2p nodes without needing a WebSocket, WebTransport, or WebRTC connection.
- HTTP-only edge workers can run application protocols and respond to peers on the network.
- `curl` from the command line can make requests to other libp2p nodes.

The HTTP transport also allows application protocols to make use of HTTP intermediaries such as caches, layer 7 proxies, and load balancers. This is all in addition to the existing features that libp2p provides, such as:

- Connectivity – Work on top of WebRTC, WebTransport, QUIC, TCP, or an HTTP transport.
- Hole punching – Work with peers behind NATs.
- Peer ID Authentication – Authenticate your peer by their libp2p peer id.
- Peer discovery – Learn about a peer given their peer id.

## HTTP Semantics vs Encodings vs Transport

HTTP is a bit of an overloaded term. This section aims to clarify what we’re talking about when we say “HTTP”.


```mermaid
graph TB
subgraph "HTTP Semantics"
HTTP
end
subgraph "Encoding"
HTTP1.1[HTTP/1.1]
HTTP2[HTTP/2]
HTTP3[HTTP/3]
end
subgraph "Transports"
Libp2p[libp2p streams]
HTTPTransport[HTTP transport]
end
HTTP --- HTTP1.1
HTTP1.1 --- Libp2p
HTTP --- HTTP2
HTTP --- HTTP3
HTTP1.1 --- HTTPTransport
HTTP2 --- HTTPTransport
HTTP3 --- HTTPTransport
```

Comment on lines +37 to +57

h2c to show how it can be multiplexed and negotiated in many ways (header compression, binary based protocol, ...) ?
Feel free to ignore this comment if you think it's making the graph too complex.

- *HTTP semantics* ([RFC 9110](https://www.rfc-editor.org/rfc/rfc9110.html)) is
the stateless application-level protocol that you work with when writing HTTP
APIs, for example.

- *HTTP encoding* is the thing that takes your high level request/response
defined in terms of HTTP semantics and encodes it into a form that can be sent
over the wire.

- *HTTP transport* is the thing that takes your encoded request/response and
sends it over the wire. For HTTP/1.1 and HTTP/2, this is a TCP+TLS connection.
For HTTP/3, this is a QUIC connection.

When this document says *HTTP* it is generally referring to *HTTP semantics*.

## Interoperability with existing HTTP systems

A goal of this spec is to allow libp2p to interoperate with existing HTTP servers and clients. Care is taken in this document not to introduce anything that would break interoperability with existing systems.
Comment on lines +72 to +76

Contributor

So this is a bit confusing to me. Above you are saying that you generally refer to HTTP semantics, and the next sentence says that a goal is to interoperate with existing HTTP servers and clients, which refers to the transport, correct?

Contributor

Refers to both actually


## HTTP Transport

Nodes MUST use HTTPS (i.e., they MUST NOT use plaintext HTTP). It is RECOMMENDED to use HTTP/2 and HTTP/3.

Nodes signal support for their HTTP transport using the `/http` component in
their multiaddr. E.g., `/dns4/example.com/tls/http`. See the [HTTP multiaddr
component spec](https://github.com/libp2p/specs/blob/master/http/transport-component.md) for more details.
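
As a rough illustration of how a client might turn such a multiaddr into an HTTPS URL, here is a minimal sketch. The function name `https_url_from_multiaddr` is hypothetical, and the parser is deliberately simplified: it handles only DNS/IP hosts with an optional `/tcp` port followed by `/tls/http`, and it does not cover QUIC-based addresses or the other cases described in the multiaddr component spec.

```python
def https_url_from_multiaddr(multiaddr: str) -> str:
    """Convert e.g. /dns4/example.com/tls/http into https://example.com.

    Simplified, hypothetical helper: only a handful of components are handled.
    """
    parts = multiaddr.strip("/").split("/")
    host, port, has_tls, has_http = None, None, False, False
    i = 0
    while i < len(parts):
        name = parts[i]
        if name in ("dns", "dns4", "dns6", "ip4"):
            host = parts[i + 1]
            i += 2
        elif name == "ip6":
            host = "[" + parts[i + 1] + "]"  # IPv6 literals are bracketed in URLs
            i += 2
        elif name == "tcp":
            port = int(parts[i + 1])
            i += 2
        elif name == "tls":
            has_tls = True
            i += 1
        elif name == "http":
            has_http = True
            i += 1
        else:
            raise ValueError(f"unsupported multiaddr component: {name}")
    if host is None or not has_http:
        raise ValueError("not an HTTP transport multiaddr")
    if not has_tls:
        # The spec requires HTTPS, so plaintext HTTP addresses are rejected.
        raise ValueError("plaintext HTTP is not allowed; expected /tls before /http")
    if port is None or port == 443:
        return f"https://{host}"
    return f"https://{host}:{port}"


print(https_url_from_multiaddr("/dns4/example.com/tls/http"))
```

Rejecting addresses without `/tls` mirrors the requirement above that nodes MUST NOT use plaintext HTTP.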

## Namespace

libp2p does not squat the global namespace. libp2p application protocols can be
discovered via the [well-known resource](https://www.rfc-editor.org/rfc/rfc8615)
`.well-known/libp2p/protocols`. This allows server operators to dynamically change the
URLs of the application protocols offered, without hard-coding any assumptions about how
a certain resource is meant to be interpreted.

Member

if the path now has /protocols do we need the top level map key protocols in the json response?

Member

@lidel lidel Apr 8, 2024

I'd say it is good practice to keep a list in a named field like that.
Makes it easy to quickly validate JSON, allows us to add more things to the file, if the need arises, without breaking legacy clients.


```json
{
  "protocols": {
    "/kad/1.0.0": {"path": "/kademlia/"},
    "/ipfs/gateway": {"path": "/"}
  }
}
```

The resource contains a mapping of application protocols to a URL namespace. For
example, this configuration file would tell a client that:

1. The Kademlia application protocol is available with prefix `/kademlia/`,
and
2. The [IPFS Trustless Gateway API](https://specs.ipfs.tech/http-gateways/trustless-gateway/) is mounted at `/`.
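
The lookup described above can be sketched as follows. This is an illustrative client-side helper, not part of the spec: the function name `protocol_url` and the base URL `https://example.com` are assumptions, and the well-known document is passed in as a string rather than fetched over the network.

```python
import json
from urllib.parse import urljoin

# Path of the well-known resource, relative to the server origin.
WELL_KNOWN_PATH = "/.well-known/libp2p/protocols"


def protocol_url(base_url: str, well_known_body: str, protocol_id: str) -> str:
    """Return the URL under which `protocol_id` is mounted (hypothetical helper)."""
    protocols = json.loads(well_known_body)["protocols"]
    path = protocols[protocol_id]["path"]  # KeyError if the protocol is not offered
    return urljoin(base_url, path)


body = """
{
  "protocols": {
    "/kad/1.0.0": {"path": "/kademlia/"},
    "/ipfs/gateway": {"path": "/"}
  }
}
"""
print(protocol_url("https://example.com", body, "/kad/1.0.0"))
print(protocol_url("https://example.com", body, "/ipfs/gateway"))
```

Because the client resolves the path at runtime, the server operator can move a protocol to a different prefix without breaking clients.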
Member

If I understand correctly, this only specifies the path but not the method (GET / POST) to use when accessing this protocol over HTTP and that's up to the specific protocol to define how to run it over HTTP?
If so, should we add this explainer in the spec?

Member

Hm.. the methods will be specific to each protocol at each mount point, so not part of this spec?

Member

That makes sense. As I see it there are two sections in this document:

  1. Running libp2p protocols over standard http like h2 or h3
  2. Running http protocols over libp2p streams.

So should we mention it in the specs that a libp2p protocol supporting http transport should specify the http method and headers to be used for the protocol. For the path they can expose it via the wellknown endpoint /.well-known/libp2p/protocols

Contributor

I'm not sure I understand. An application protocol would be built using HTTP semantics, and that protocol would then be able to run on libp2p streams or "standard" http transports like h2, h3.

What do you mean by:

> Running libp2p protocols over standard http like h2 or h3

This spec does not define how you would take an existing libp2p protocol and map it to HTTP semantics. That is best done by the specific protocol itself. But maybe I'm misunderstanding your point?


It is valid to expose a service at `/`. It is RECOMMENDED that implementations facilitate the coexistence of different service endpoints by ensuring that more specific URLs are resolved before less specific ones. For example, when registering handlers, more specific paths like `/kademlia/foo` should take precedence over less specific handlers, such as `/`.
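
One way to get this precedence is longest-prefix matching. The sketch below is only an illustration under that assumption; the `PrefixRouter` class and its handlers are hypothetical and not taken from any libp2p implementation.

```python
from typing import Callable, Dict, Optional

Handler = Callable[[str], str]


class PrefixRouter:
    """Resolve a request path to the handler with the longest matching prefix."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, prefix: str, handler: Handler) -> None:
        self._handlers[prefix] = handler

    def resolve(self, path: str) -> Optional[Handler]:
        # Try longer prefixes first so that "/kademlia/" wins over "/".
        for prefix in sorted(self._handlers, key=len, reverse=True):
            if path.startswith(prefix):
                return self._handlers[prefix]
        return None


router = PrefixRouter()
router.register("/", lambda path: "gateway")
router.register("/kademlia/", lambda path: "kademlia")

print(router.resolve("/kademlia/foo")("/kademlia/foo"))
print(router.resolve("/some/other/path")("/some/other/path"))
```

Registration order does not matter here, because candidates are sorted by prefix length at lookup time.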

## Peer ID Authentication

When using the HTTP Transport, Peer ID authentication is optional. You only pay
for it if you need it. This benefits use cases that don’t need peer
authentication (e.g., fetching content-addressed data) or that authenticate some
other way (not tied to libp2p peer ids).

Specific authentication schemes for authenticating Peer IDs will be defined in
a future spec.

## Using HTTP semantics over stream transports

Application protocols using HTTP semantics can run over any libp2p stream transport. Clients open a new stream using `/http/1.1` as the protocol identifier. Clients encode their HTTP request as an HTTP/1.1 message and send it over the stream. Clients parse the response as an HTTP/1.1 message and then close the stream. Clients SHOULD NOT pipeline requests over a single stream. Clients and servers SHOULD set the [`Connection: close` header](https://datatracker.ietf.org/doc/html/rfc9112#section-9.6) to signal that this is not a persistent connection.

HTTP/1.1 is chosen as the minimum bar for interoperability, but other encodings of HTTP semantics are possible as well and may be specified in a future update.
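
The client side of this exchange could start from something like the sketch below, which only builds the HTTP/1.1 request bytes. Opening the libp2p stream and writing to it depend on the libp2p implementation in use and are left out. The helper name `encode_request` and the `Host` value are assumptions made for illustration.

```python
# Protocol identifier used when opening the libp2p stream.
HTTP_PROTOCOL_ID = "/http/1.1"


def encode_request(method: str, target: str, host: str, body: bytes = b"") -> bytes:
    """Encode one HTTP/1.1 request with `Connection: close` (hypothetical helper)."""
    lines = [
        f"{method} {target} HTTP/1.1",
        f"Host: {host}",
        # One request per stream: signal that the connection is not persistent.
        "Connection: close",
    ]
    if body:
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


message = encode_request("GET", "/kademlia/", "example.com")
print(message.decode("ascii"))
```

The resulting bytes would be written to a stream negotiated with `/http/1.1`, after which the client reads the response and closes the stream.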

## Multiaddr URI scheme

In places where a URI is expected, implementations SHOULD accept a multiaddr URI
in addition to a standard http or https URI. A multiaddr URI is a
[URI](https://datatracker.ietf.org/doc/html/rfc3986) with the `multiaddr`
scheme. It is constructed by taking the "multiaddr:" string and appending the
string encoded representation of the multiaddr. E.g. the multiaddr
`/ip4/1.2.3.4/udp/54321/quic-v1` would be represented as
`multiaddr:/ip4/1.2.3.4/udp/54321/quic-v1`.

This URI can be extended to include HTTP paths with the `/http-path` component.
This allows a user to make an HTTP request to a specific HTTP resource using a
multiaddr. For example, a user could make a GET request to
`multiaddr:/ip4/1.2.3.4/udp/54321/quic-v1/p2p/12D.../http-path/.well-known%2Flibp2p`. This also allows
an HTTP redirect to another host and another HTTP resource.
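
A minimal sketch of building such URIs follows. The function name `multiaddr_uri` is hypothetical, and stripping the leading `/` from the HTTP path before percent-encoding is an assumption chosen to match the example above.

```python
from urllib.parse import quote


def multiaddr_uri(multiaddr: str, http_path: str = "") -> str:
    """Build a `multiaddr:` URI, optionally with an `/http-path` component."""
    uri = "multiaddr:" + multiaddr
    if http_path:
        # Percent-encode "/" as well, so the whole HTTP path stays a single
        # multiaddr component.
        uri += "/http-path/" + quote(http_path.lstrip("/"), safe="")
    return uri


print(multiaddr_uri("/ip4/1.2.3.4/udp/54321/quic-v1"))
print(multiaddr_uri("/ip4/1.2.3.4/udp/54321/quic-v1", "/.well-known/libp2p"))
```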

## Using other request-response semantics (not HTTP)

This document has focused on using HTTP semantics, but HTTP may not be the common denominator amongst all transports (current and future). It may be desirable to use some other request-response semantics for your application-level protocol, perhaps something like rust-libp2p’s [request-response](https://docs.rs/libp2p/0.52.1/libp2p/request_response/index.html) abstraction. Nothing specified in this document prohibits mapping other semantics onto HTTP semantics to keep the benefits of using an HTTP transport.

As a simple example, to support simple request-response semantics, the request MUST be encoded within a `POST` request to the proper URL (as defined in the [Namespace](#namespace) section). The response is read from the body of the HTTP response. The client MUST authenticate the server and itself **before** making the request. `POST` is chosen because this mapping makes no assumptions about whether the request is cacheable. If HTTP caching is desired, users should either build on HTTP semantics or choose another mapping with different assumptions.
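
As a sketch of this mapping, the snippet below wraps an opaque request payload in a `POST` request using Python's standard library, without sending it. The helper name `build_post`, the example URL, and the `Content-Type` value are assumptions. Authentication is omitted because this document does not define a scheme for it.

```python
import urllib.request


def build_post(url: str, payload: bytes) -> urllib.request.Request:
    """Wrap an opaque request payload in a POST request (hypothetical helper)."""
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        # Assumed media type for an opaque binary payload.
        headers={"Content-Type": "application/octet-stream"},
    )


request = build_post("https://example.com/kademlia/", b"\x01\x02\x03")
print(request.get_method(), request.full_url)
# Sending the request (for example with urllib.request.urlopen) and reading
# the response body would complete the request-response exchange.
```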

Other mappings may also be valid, as long as nodes agree.