From 0684e2602820066ac5b7632d91f1151e4ed632e1 Mon Sep 17 00:00:00 2001
From: Marcin Rataj
Date: Fri, 16 Feb 2024 19:11:42 +0100
Subject: [PATCH 1/5] IPIP-462: Ipfs-Path-Affinity on Gateways

---
 src/http-gateways/path-gateway.md      |  4 ++
 src/http-gateways/trustless-gateway.md |  9 +++
 src/ipips/ipip-0462.md                 | 80 ++++++++++++++++++++++++++
 3 files changed, 93 insertions(+)
 create mode 100644 src/ipips/ipip-0462.md

diff --git a/src/http-gateways/path-gateway.md b/src/http-gateways/path-gateway.md
index 3c0e14af..ed9f7009 100644
--- a/src/http-gateways/path-gateway.md
+++ b/src/http-gateways/path-gateway.md
@@ -195,6 +195,10 @@ Gateway should refuse attempts to register a service worker for entire
 Requests to these paths with `Service-Worker: script` MUST be denied by returning HTTP 400 Bad Request error.
 
+### `Ipfs-Path-Affinity` (request header)
+
+Optional content routing hint, see [`Ipfs-Path-Affinity`](https://specs.ipfs.tech/http-gateways/trustless-gateway/#ipfs-path-affinity-request-header) in :cite[trustless-gateway].
+
 ## Request Query Parameters
 
 All query parameters are optional.
diff --git a/src/http-gateways/trustless-gateway.md b/src/http-gateways/trustless-gateway.md
index 71ae1f30..bbe96c41 100644
--- a/src/http-gateways/trustless-gateway.md
+++ b/src/http-gateways/trustless-gateway.md
@@ -88,6 +88,15 @@ Below response types SHOULD be supported:
 A Gateway SHOULD return HTTP 400 Bad Request when running in strict trustless mode (no deserialized responses) and `Accept` header is missing.
 
+### `Ipfs-Path-Affinity` (request header)
+
+Optional content routing hint for the server. Indicates that the requested
+resource is a subset of a bigger DAG.
+
+A Client SHOULD use it to send a relevant parent content path when:
+- fetching a big file block by block (`application/vnd.ipld.raw`)
+- parallelizing DAG download by fetching each branch sub-DAG as a CAR (`application/vnd.ipld.car`)
+
 ## Request Query Parameters
 
 ### :dfn[dag-scope] (request query parameter)
diff --git a/src/ipips/ipip-0462.md b/src/ipips/ipip-0462.md
new file mode 100644
index 00000000..2ec22053
--- /dev/null
+++ b/src/ipips/ipip-0462.md
@@ -0,0 +1,80 @@
+---
+title: "IPIP-0462: Ipfs-Path-Affinity on Gateways"
+date: 2024-02-16
+ipip: proposal
+editors:
+  - name: Marcin Rataj
+    github: lidel
+    url: https://lidel.org/
+    affiliation:
+      name: IP Shipyard
+      url: https://ipshipyard.com
+relatedIssues:
+  - https://github.com/ipfs/kubo/issues/10251
+  - https://github.com/ipfs/kubo/issues/8676
+order: 462
+tags: ['ipips']
+---
+
+## Summary
+
+This IPIP adds gateway support for optional `Ipfs-Path-Affinity` HTTP request header.
+
+## Motivation
+
+Endpoints that implement :cite[trustless-gateway] may receive requests for a
+single block, or a CAR request sub-DAG of a biger tree.
+
+Not every CID is announced today, some providers limit announcements to
+top-level root CIDs due to time and cost.
+
+What this mean for ecosystem? It should adapt. Over time, both clients and
+servers should leverage the concept of "affinity".
+
+## Detailed design
+
+Introduce `Ipfs-Path-Affinity` HTTP request header to allow HTTP client to
+inform gateway about the context of block/CAR request.
+
+Client asking gateway for a block SHOULD provide a hint about the DAG the block
+belongs to, if such information is available.
+
+Gateway being unable to find providers for internal block should be
+able to leverage affinity information sent by client and use CIDs of parent
+path segments as additional content routing lookup hints.
+
+## Design rationale
+
+### User benefit
+
+When supported by both client and server:
+
+- Light clients are able to use trustless HTTP gateway endpoints more
+  efficiently, resume downloads faster.
+- Gateway operators are able to leverage the hint and save resources related to
+  provider lookup.
+- Content providers are able to implement smarter announcement mechanisms,
+  without worrying that internal blocks are not announced.
+
+### Compatibility
+
+This is an optional HTTP header which makes it backward-compatible with
+existing ecosystem of HTTP clients and IPGS Gateways.
+
+### Security
+
+The client is in control when the affinity information is sent in the header,
+and implementation SHOULD allow end user to disable it in context where parent
+content path information is considered sensitive information.
+
+### Alternatives
+
+N/A
+
+## Test fixtures
+
+N/A
+
+### Copyright
+
+Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).

From be044ffb8cd879f147bee53a306bb56dc68119fe Mon Sep 17 00:00:00 2001
From: Marcin Rataj
Date: Fri, 22 Mar 2024 15:57:09 +0100
Subject: [PATCH 2/5] Apply suggestions from code review

Co-authored-by: Daniel Norman <1992255+2color@users.noreply.github.com>
Co-authored-by: Russell Dempsey <1173416+SgtPooki@users.noreply.github.com>
---
 src/ipips/ipip-0462.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/ipips/ipip-0462.md b/src/ipips/ipip-0462.md
index 2ec22053..79875795 100644
--- a/src/ipips/ipip-0462.md
+++ b/src/ipips/ipip-0462.md
@@ -27,8 +27,8 @@ single block, or a CAR request sub-DAG of a biger tree.
 
 Not every CID is announced today, some providers limit announcements to
 top-level root CIDs due to time and cost.
-
-What this mean for ecosystem? It should adapt. Over time, both clients and
+The introduction of an optional `Ipfs-Path-Affinity` header can increase the success rate of the gateway retrieving the request block, especially if the requested block is not announced.
+What does this mean for the ecosystem? It should adapt. Over time, both clients and
 servers should leverage the concept of "affinity".
 
 ## Detailed design
@@ -39,7 +39,7 @@ inform gateway about the context of block/CAR request.
 Client asking gateway for a block SHOULD provide a hint about the DAG the block
 belongs to, if such information is available.
 
-Gateway being unable to find providers for internal block should be
+A gateway unable to find providers for internal block should be
 able to leverage affinity information sent by client and use CIDs of parent
 path segments as additional content routing lookup hints.
 
@@ -54,17 +54,17 @@ When supported by both client and server:
 - Gateway operators are able to leverage the hint and save resources related to
   provider lookup.
 - Content providers are able to implement smarter announcement mechanisms,
-  without worrying that internal blocks are not announced.
+  without worrying that some internal blocks are not announced (intentionally or unintentionally).
 
 ### Compatibility
 
 This is an optional HTTP header which makes it backward-compatible with
-existing ecosystem of HTTP clients and IPGS Gateways.
+existing ecosystem of HTTP clients and IPFS Gateways.
 
 ### Security
 
 The client is in control when the affinity information is sent in the header,
-and implementation SHOULD allow end user to disable it in context where parent
+and an implementation SHOULD allow an end user to disable it in context where parent
 content path information is considered sensitive information.
 ### Alternatives

From 35a5eedf181ac1036562d0ae7be6a2275556d404 Mon Sep 17 00:00:00 2001
From: Marcin Rataj
Date: Fri, 22 Mar 2024 23:20:44 +0100
Subject: [PATCH 3/5] ipip-462: document value format and security

---
 src/http-gateways/trustless-gateway.md | 23 ++++++++++++++++++++++-
 src/ipips/ipip-0462.md                 |  3 +++
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/src/http-gateways/trustless-gateway.md b/src/http-gateways/trustless-gateway.md
index bbe96c41..240ab902 100644
--- a/src/http-gateways/trustless-gateway.md
+++ b/src/http-gateways/trustless-gateway.md
@@ -4,15 +4,21 @@ description: >
   The minimal subset of HTTP Gateway response types facilitates data retrieval
   via CID and ensures integrity verification, all while eliminating the need to
   trust the gateway itself.
-date: 2023-06-20
+date: 2024-03-22
 maturity: reliable
 editors:
   - name: Marcin Rataj
     github: lidel
     url: https://lidel.org/
+    affiliation:
+      name: IP Shipyard
+      url: https://ipshipyard.com
   - name: Henrique Dias
     github: hacdias
     url: https://hacdias.com/
+    affiliation:
+      name: IP Shipyard
+      url: https://ipshipyard.com
 xref:
   - url
   - path-gateway
@@ -97,6 +103,21 @@ A Client SHOULD use it to send a relevant parent content path when:
 - fetching a big file block by block (`application/vnd.ipld.raw`)
 - parallelizing DAG download by fetching each branch sub-DAG as a CAR (`application/vnd.ipld.car`)
 
+The value of `Ipfs-Path-Affinity` header SHOULD be percent-encoded
+([ECMA262: `encodeURIComponent`](https://tc39.es/ecma262/multipage/global-object.html#sec-encodeuricomponent-uricomponent))
+unless it meets the following conditions:
+- contains no path beyond the root identifier (`/ipfs/cid`)
+- contains no whitespace characters
+- contains no `:` characters
+- contains no non-ASCII characters
+
+A gateway backend SHOULD leverage this hint to improve retrieval by querying
+providers of additional content paths in addition to the requested one.
+
+Gateway implementation SHOULD support client requests with `Ipfs-Path-Affinity`
+header being present more than once, but also SHOULD set a hard limit of hints
+to process (e.g. 3) to avoid abuse.
+
 ## Request Query Parameters
 
 ### :dfn[dag-scope] (request query parameter)
diff --git a/src/ipips/ipip-0462.md b/src/ipips/ipip-0462.md
index 79875795..695dd158 100644
--- a/src/ipips/ipip-0462.md
+++ b/src/ipips/ipip-0462.md
@@ -67,6 +67,9 @@ The client is in control when the affinity information is sent in the header,
 and an implementation SHOULD allow an end user to disable it in context where parent
 content path information is considered sensitive information.
 
+Gateway implementation that supports `Ipfs-Path-Affinity` header being present
+more than once MUST also set limit (e.g. max 3) to avoid abuse.
+
 ### Alternatives
 
 N/A

From de0b231ba97fa8fd223ddc1b8627b48c84138f8b Mon Sep 17 00:00:00 2001
From: Marcin Rataj
Date: Fri, 22 Mar 2024 23:50:33 +0100
Subject: [PATCH 4/5] ipip-462: reword motivation

https://github.com/ipfs/specs/pull/462/files#r1492996318
---
 src/ipips/ipip-0462.md | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/ipips/ipip-0462.md b/src/ipips/ipip-0462.md
index 695dd158..99b5f490 100644
--- a/src/ipips/ipip-0462.md
+++ b/src/ipips/ipip-0462.md
@@ -25,11 +25,21 @@ This IPIP adds gateway support for optional `Ipfs-Path-Affinity` HTTP request he
 Endpoints that implement :cite[trustless-gateway] may receive requests for a
 single block, or a CAR request sub-DAG of a biger tree.
 
-Not every CID is announced today, some providers limit announcements to
-top-level root CIDs due to time and cost.
-The introduction of an optional `Ipfs-Path-Affinity` header can increase the success rate of the gateway retrieving the request block, especially if the requested block is not announced.
-What does this mean for the ecosystem? It should adapt. Over time, both clients and
-servers should leverage the concept of "affinity".
+While every piece of data that is supposed to be able to be accessed
+independently should be advertised on routing system, not every CID is today.
+Some providers limit announcements to top-level root CIDs due to time, cost, or
+misconfiguration.
+
+What does this mean for the ecosystem? It should adapt and ensure
+implementations leverage all information provided by the end user.
+
+Over time, both clients and servers should leverage the concept of "affinity".
+
+The introduction of an optional `Ipfs-Path-Affinity` header aims to increase
+the success rate of the gateway retrieving internal standalone blocks or byte
+ranges, especially if the requested blocks are not announced on routing
+systems, but belong to a bigger DAG, and only the root CID of that parent DAG
+is announced.
 
 ## Detailed design
 

From 61518c604b17716221f3215808f2da2114afb75e Mon Sep 17 00:00:00 2001
From: Marcin Rataj
Date: Sat, 23 Mar 2024 00:37:52 +0100
Subject: [PATCH 5/5] ipip-462: expand alternatives

https://github.com/ipfs/specs/pull/462/files#r1492982484
---
 src/ipips/ipip-0462.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/ipips/ipip-0462.md b/src/ipips/ipip-0462.md
index 99b5f490..99cb0800 100644
--- a/src/ipips/ipip-0462.md
+++ b/src/ipips/ipip-0462.md
@@ -82,7 +82,13 @@ more than once MUST also set limit (e.g. max 3) to avoid abuse.
 
 ### Alternatives
 
-N/A
+- Why not just an arbitrary identifier the user could use to establish a
+  relationship between requests?
+  - Requires server to keep state, which breaks or complicates gateway
+    deployments with horizontal scaling.
+  - Does not help when client is sending requests for different blocks/sub-DAGs
+    to different trustless gateways, and none of them has the whole picture,
+    and majority of them do not know what is the parent content path.
 
 ## Test fixtures
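
Illustration (not part of the patch series): the value-format rules that PATCH 3/5 adds to `trustless-gateway.md` could be read roughly as in the TypeScript sketch below. It is only one possible interpretation under stated assumptions. The function names, the `MAX_AFFINITY_HINTS` constant, and the `bafyexample` placeholder are hypothetical, and the gateway-side decoding step is an assumption, since the patch does not say how a gateway should decode the value. Only `encodeURIComponent` and `decodeURIComponent` are standard.

```typescript
// Sketch only: helper names and the constant below are hypothetical,
// not taken from the spec text or from any gateway implementation.

// Example limit; the patch suggests "e.g. 3" but does not mandate a number.
const MAX_AFFINITY_HINTS = 3;

// True when the content path meets all four conditions under which the
// patch allows sending the value without percent-encoding.
function isPlainRootPath(contentPath: string): boolean {
  // "/ipfs/cid" splits into ["", "ipfs", "cid"]; anything longer has a sub-path.
  const segments = contentPath.split("/");
  const rootOnly =
    segments.length === 3 &&
    segments[0] === "" &&
    segments[1] !== "" &&
    segments[2] !== "";
  const hasWhitespace = /\s/.test(contentPath);
  const hasColon = contentPath.includes(":");
  const hasNonAscii = /[^\x00-\x7F]/.test(contentPath);
  return rootOnly && !hasWhitespace && !hasColon && !hasNonAscii;
}

// Client side: produce the header value for a parent content path.
function encodeAffinityValue(contentPath: string): string {
  return isPlainRootPath(contentPath)
    ? contentPath
    : encodeURIComponent(contentPath);
}

// Gateway side (assumed behavior): keep at most MAX_AFFINITY_HINTS values
// and percent-decode each one, falling back to the raw value when the
// input is not valid percent-encoding.
function parseAffinityHints(rawValues: string[]): string[] {
  return rawValues.slice(0, MAX_AFFINITY_HINTS).map((value) => {
    try {
      return decodeURIComponent(value);
    } catch (_err) {
      return value;
    }
  });
}
```

A client might then set `Ipfs-Path-Affinity` to `encodeAffinityValue("/ipfs/bafyexample/dir/my file.txt")` on each block request, while a gateway would pass every received header value through `parseAffinityHints` before using the results as extra routing lookups.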