Very long identifiers (data URIs) #1088
Comments
@garemoko
I believe that there should be a reasonable limit on the size of an activity id. It should be big enough to be unique but small enough not to cause performance issues (or enable deliberate DoS attacks). Arguably it should be human-readable. Something downstream of the LRS (browser, load balancer, reverse proxy, web server, framework, etc.) is going to limit the size of the request, so the LRS does not need to accept statements of unlimited size. I would argue that if a system is saving very large statements, then it is most likely a mistake or misunderstanding.
The problem with this concept is determining what a "reasonable limit" is. Your reasonable limit may not be remotely close (either too large or too small) to someone else's. It may also be a reasonable limit in 2019 or 2020, but might not be in 2035, 2050, etc. The specification intentionally leaves this concept loose, allowing the LRS to determine what it considers a "reasonable limit", by stating:
...
Placing an arbitrary, required limit on the size of a string, statement, or request is short-sighted. Very early on in the spec development (circa 0.9/0.95) there was an intentional use of data URIs in statement bodies. They were used to capture the oft-requested "certificate" use case. Granted, that was part of the reason attachments were added in the 1.0.0 release, but it isn't for the specification to decide what the data use cases will be. Limits of this nature were specifically left out of the specification because of the lessons learned from years of being beholden to what the SCORM specification had required, for instance limits on suspend data size or identifier lengths, which continually led to ugly workarounds. As far as DoS attacks go, the spec covers that later in the same section as above by stating:
This is specifically talking about the number of requests, but I think, based on the rest of the language of the specification, it is reasonable for the LRS to apply limits to the size of requests or the data therein. It is much harder to anticipate what "reasonable" will be within the confines of the specification than to leave it up to the implementations to impose the limits they need in order to remain functional, and to expect those limits to be handled reasonably upstream by the LRPs. The statement:
Is actually as good an argument (or a better one) for why this requirement does not need to be in the specification. You've already stated that something in front of the LRS is going to limit size or prevent attacks, so there is no reason to do so arbitrarily in the spec.
While it might be a reasonable argument, and frequently the case, that is easily solved through testing by one or the other of the parties in the transaction, and it isn't something that needs to be remedied by the spec itself. OTOH, adding an arbitrary limit means that in the small fraction of cases where it isn't a mistake (for some definition of "very large") there is no remedy that can be negotiated between the two parties, and a workaround has to be used.
Thank you @brianjmiller, I agree with everything you have said.
We've recently come across an edge case where a system has used a data URI as an activity id. This appears to have occurred not by deliberate design, but rather as a consequence of translating activity stream data into xAPI, perhaps without realizing that the data might use data URIs.
I'm still gathering the details, but it appears that Watershed's LRS handled these very long URIs fine, though there were some problems to resolve in the reporting, which was not expecting IRIs of this length.
Previously we have said that we don't want to limit individual properties, but that LRSs could limit overall statement length to a reasonable size. I wonder if we should revisit that in the case of identifiers, and also whether data URIs should be specifically excluded from use as IRIs.
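For illustration only, here is a minimal sketch of what an implementation-side check along these lines might look like. It is not taken from the xAPI specification or from any particular LRS; the function name `check_statement` and both limit values are hypothetical, chosen only to make the example concrete.

```python
import json

# Hypothetical, implementation-chosen limits. The xAPI specification does not
# define either of these values.
MAX_STATEMENT_BYTES = 64 * 1024
MAX_ACTIVITY_ID_CHARS = 2048


def check_statement(statement: dict) -> list:
    """Return a list of policy problems found in a single xAPI statement."""
    problems = []

    # Measure the overall statement size as compact UTF-8 encoded JSON.
    encoded = json.dumps(statement, separators=(",", ":")).encode("utf-8")
    if len(encoded) > MAX_STATEMENT_BYTES:
        problems.append("statement exceeds size limit")

    obj = statement.get("object", {})
    # When objectType is absent, the object is assumed to be an Activity.
    if obj.get("objectType", "Activity") == "Activity":
        activity_id = obj.get("id", "")
        # URI schemes are case-insensitive, so compare in lower case.
        if activity_id.lower().startswith("data:"):
            problems.append("activity id is a data URI")
        if len(activity_id) > MAX_ACTIVITY_ID_CHARS:
            problems.append("activity id exceeds length limit")

    return problems
```

An LRS or a reporting layer could use the returned list to reject the statement, or merely to log a warning, depending on the policy it chooses.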