-
Notifications
You must be signed in to change notification settings - Fork 142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Smarter response resource squashing #476
base: master
Are you sure you want to change the base?
Smarter response resource squashing #476
Conversation
Important note: the current implementation doesn't support concurrency, will work on it a bit later |
Added configuration compatibility checks + tests |
@jkeen , just a humble reminder :) BTW do you have any ideas what is wrong with the tests? It seems like a few of them are failing almost randomly... |
@factyy Thanks for this effort! With your description it definitely sounds like a bug, and therefore whatever fix it involves shouldn't be behind a feature flag. That being said, adding de-duplication logic that runs all the time sounds like a potential performance bottleneck, so I want to be very careful with how we proceed with this fix as this will affect every render (and json-api rendering speed in graphiti is… already an area that needs raw speed improvements IMO). Can you describe other scenarios this bug might appear without using a remote resource? Or is it only with remote resources? Also, I think whatever fix this involves should support concurrency out of the gate? Does this current PR include concurrency considerations? And yeah, there are some intermittent failures in the tests it seems that I haven't tracked down quite yet. |
@jkeen , I previously considered fixing it in a phase when resources are rendered (even filed a PR I mentioned earlier) but decided to go this way (de-duplication when resources are populated) since in this case we lose only a tiny bit of performance. The problem is not anyhow related to remote resources. Just use any field that has to be included that is both non-DB provided (i.e. it can be populated in the resource code and not in the model code) and is located in an included resource itself. It can be reproduced in a project with no remote resources (if you meant resources retrieved from 3rd parties using separate APIs for example), just plain Regarding concurrency it is a bit tricky: as far as I understand we may either:
So this PR doesn't allow de-duplication to be used alongside concurrency - there are checks for such cases. Regarding tests: got it. Maybe we'll find out the problem too but right now we are severely time-constrained sadly... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I did misunderstand your original issue here—when you said "non-db backed resource" I thought you were referring to remote resources. Thanks for the clarification.
That all being said, this still needs a bunch of work before it's ready to merge.
First, having to opt-in to a bug fix is not great, as it requires the consumer of Graphiti to realize that they have a bug, realize that it's from Graphiti, find the configuration switch in the docs, and then enable that configuration switch. If there were a concern that some people may be relying on broken behavior (as I imagine was the reasoning for making it a switch), then the way we smoothly handle that is through semantic versioning.
Second, having to choose between having de-duplicated entries or concurrency is not great. Concurrency is a core Graphiti feature that benefits everyone and potentially every request that flows through Graphiti. The bug you've described sounds like an edge-case that might happen on a fraction of requests under circumstances that could be worked around or avoided through other less-ideal means (but could still be handled). However, the proposed fix requires negatively impacting performance system-wide by disabling concurrency while also impacting performance system-wide by enabling additional de-duplication logic on every request, all to fix this one edge case. That doesn't feel like a good tradeoff.
I appreciate the work you've put into this, and in order to move forward on it we'll need a solution that fixes the issue with fewer drawbacks.
I completely get your point. Basically why this fix is behind a feature flag? It is for the very same reasons you described above: fixing an edge case with performance penalties (causing compatibility problems along the way). I'd be happy to re-engineer it somehow that would allow it to be applied as a general fix (without a feature flag) but I don't know how to be honest. |
@jkeen , just got my hands on the O(n) issue, fixed now. Regarding incompatibility with concurrency: I remember encountering a weird deadlock problem with concurrency (before PRs affecting the mechanism) so we decided to just turn it off for a while, but never got back to this issue. Could you check that everything works? If this is the case I can just synchronize everything and we'll see how much it will affect performance. |
With the current implementation in Graphiti it is possible to end up with incomplete objects in the
included
section.The easiest way is to have a non-DB backed relationship in one of your resources. Then query this resource as a child (by multiple paths) of a parent resource and then query this relationship from the child.
The easiest way to describe it is to just provide the following test case (added to tests as part of this PR):
And then the test case itself:
The result will contain
"data": null
in thecustom_department
object in theincluded
list.The problem is that when constructing the response tree these different associations (
positions
andcurrent_position
) are represented by separate objects (creating duplicates) although they can contain the same object multiple times. Non-DB backed relationships are populated only for respective objects (leaving other copies with empty relationships since they most probably are just plainattr_accessor
fields in the models). When rendered only the first copy of the object is rendered thus resulting in partial objects in theincluded
list (if you are not lucky and the first copy doesn't contain all the data needed).This problem can be solved either when rendering (jsonapi-rb/jsonapi-renderer#41) or when constructing a response (approach taken by this PR).
Changes:
deduplicate_entities
which is disabled by default for backward compatibility);P.S: this PR changes a fair amount of internal response constructing mechanics so any feedback is really welcome!