Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

!!! TASK: Add workspace content stream mapping to content graph projection #5096

Merged
merged 26 commits into from
Oct 17, 2024

Conversation

bwaidelich
Copy link
Member

@bwaidelich bwaidelich commented May 25, 2024

Previously the DoctrineDbalContentGraphProjection accessed the workspace table of a different projection in order to resolve the workspace<->content stream mapping.

This change adds the workspace table to the content graph projection and uses that instead for the resolution.

Note: This is not a breaking change because it comes with a migration and does not require a replay, but a

./flow cr:setup

is needed in order to apply that!

Related: #5038
Replaces: #5041

Todos

  • discuss REMOVED = False flag when fetching content streams link
  • discuss @deprecated This event will never be emitted, and it is ignored in the core projections. This implementation is just kept for backwards-compatibility because its not true, the event WorkspaceWasRenamed will still be emitted. note bw: "emitted" means that the event is published. This change won't remove those events from the existing event streams, but no new events should occur
  • discuss EmbeddsContentStream vs extraction link
  • discuss readded runtime cache link
    • discuss if getWorkspaces should fill the findWorkspaceByName cache
    • -> extract in separate pr
  • discuss b/c layer, is keeping the method getWorkspaceFinder enough? link
  • discuss getWorkspaces vs findWorkspaces naming
  • discuss dropping old dangeling tables?
    • -> not a problem for beta phase maybe document it somewhere
  • followup move internal classes to not be first level and catch the users eye :) link

@bwaidelich
Copy link
Member Author

bwaidelich commented May 25, 2024

Draft, because it requires some more love

mhsdesign
mhsdesign previously approved these changes May 27, 2024
Copy link
Member

@mhsdesign mhsdesign left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by reading ... but not thaaaat deep ... maybe +0.5

@mhsdesign mhsdesign dismissed their stale review June 2, 2024 16:16

have to re review after merge of 5097 and further changes

@bwaidelich bwaidelich changed the title TASK: Add workspace content stream mapping to content graph projection !!! TASK: Add workspace content stream mapping to content graph projection Jun 3, 2024
@bwaidelich bwaidelich marked this pull request as draft June 3, 2024 12:41
bwaidelich and others added 5 commits October 1, 2024 18:16
Previously the `DoctrineDbalContentGraphProjection` accessed the `workspace` table of a different projection in order to resolve the workspace<->content stream mapping.

This change adds the `workspace` table to the content graph projection and uses that instead for the resolution.

**Note:** This is not a breaking change because it comes with a migration and does not require a replay, but a
    ./flow cr:setup
is needed in order to apply that!

Related: #5038
bwaidelich and others added 2 commits October 15, 2024 12:02
In commit 2a781d2 the cache invalidation was partly repaired
But as we removed the content graph cache layer currently in 9.0 #5246

A reintroduction will be discussed as part of another change.

Related #5039
@mhsdesign
Copy link
Member

Looks good and works well so far. I also run the Neos Ui e2e and 🔝 🥳
The migration (in cr:setup should probably be tested more extensively .. we especially need to make sure when copying the tables that they have the same schema!!!)

Copy link
Member

@mhsdesign mhsdesign left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through the whole code again and despite a few nitpicks and maybe on crucial thing left to discuss it looks really promising so far (and as said before works also)
So ill push a followup discuss the last things and then its merge ready id say!

* @internal only used inside the
* @see ContentRepositoryReadModel
*/
final readonly class ContentRepositoryReadModelAdapter implements ContentRepositoryReadModelAdapterInterface
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 fyi (for other reviewers) while these namings are not perfect - they are not the final ones either :)
in #5272 we can discuss the final final final draft:)

{$this->tableNames->contentStream()}
WHERE
id = :contentStreamId
AND removed = FALSE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed with basti, we are sure that the additional FALSE filter was missing here and should have also been part of the previous findVersionForContentStream or hasContentStream

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Or is there an edge-case with $commandHandlingDependencies->contentStreamExists where it IS important that the content stream does not exist if pruned or not?
Like if someone prunes a content stream and then voluntarily creates a new workspaces with it? That would crash? Maybe we can add a low level hacky method hasContentStream?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rethought this again and again and i think the removed state is not just an implementation detail but required to be implemented by the adapter. That means we can and should add it to the ContentStream readmodel so it is clearly documented when this state should be encountered.
Further adding it to the readmodel allows us to remove the findUnusedAndRemovedContentStreamIds in the adapters and interfaces which would be impossible to document due to their complexity. It really feels good to have that now in the ContentStreamPrunner in the core.

see 041f239

Copy link
Member

@mhsdesign mhsdesign Oct 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussing this again we found a way to get rid of the removed flag from the readmodel: by diffing the actual found content streams of the event store and the ones that are in the projection. And as a second step im currently fully overhauling the content stream pruner so that we dont have to track neither the removed nor the content stream state at all in the projection and always extract the information from the events: #5297

So the introduction of ContentStream::removed was just a temporary state and will be removed again :)

foreach ($this->determineRequiredSqlStatements() as $statement) {
$statements = $this->determineRequiredSqlStatements();

// MIGRATION from 2024-05-25: copy data from "cr_<crid>_p_workspace"/"cr_<crid>_p_contentstream" to "cr_<crid>_p_graph_workspace"/"cr_<crid>_p_graph_contentstream" tables
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❔ do we need to handle the rare cases of the previous workspace/ content stream projection being out of date and behind?
Id say no and for the case this happened one must issue a full replay.

Neos.ContentRepository.Core/Classes/ContentRepository.php Outdated Show resolved Hide resolved

$eventNormalizer = $this->getObject(EventNormalizer::class);

// HACK to access the property converter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 fyi ill clean this up in a followup sometime

@@ -185,7 +186,9 @@ public function theMatchedNodeShouldBeInContentStreamAndOriginDimension(string $
Assert::assertTrue($matchedNodeAddress->workspaceName->isLive());
Assert::assertSame($nodeAggregateId, $matchedNodeAddress->aggregateId->value);
// todo useless check?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 this can and should be migrated at some point to use workspace names in the routing tests

@@ -107,6 +109,30 @@ private function createReferenceRelationTable(): Table
->setPrimaryKey(['name', 'position', 'nodeanchorpoint']);
}

private function createWorkspaceTable(): Table
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 i compared the schemas again against the current ones and can confirm besides a few renames of columns its the same. Thus the migration in setup can definitely work.

Instead of messing with the rather lean API just for this usecase the calculation is done in PHP.

This will be slower and consume more memory, but it should not be an issue even with thousands of content streams for this pure cleanup CLI command.

Further this calculation is better placed in the Core package than to be done in each adapter because writing the specification for the interface would already be too complex.

Additionally, the docs were updated in the `ContentStream` readmodel and now also reference the migrated core `findUnusedAndRemovedContentStreams` logic which also explains the `removed` flag.

$mainRequest = $controllerContext->getRequest()->getMainRequest();
$uriBuilder = clone $controllerContext->getUriBuilder();
$uriBuilder->setRequest($mainRequest);
$createLiveUri = $workspace && $workspace->isPublicWorkspace() && !$resolvedNode->tags->contain(SubtreeTag::disabled());
$createLiveUri = $workspace && $nodeAddress->workspaceName->isLive() && !$resolvedNode->tags->contain(SubtreeTag::disabled());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition seems a bit disjointed now and smells like a conceptual problem. Aka why do I ask the workspaceName if we are live (I know why, but is that sensible?). Anyways nothing we can solve right now, just wanted to point out that it looks a bit strange and the condition read clearer before.

Copy link
Member

@mhsdesign mhsdesign left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the adjustments and the planned followup #5272 i can definitely approve this!!! Thank you for your initial effort in May where we discussed this - its really fun to see such old pr get merged and its ideas still being valid!!!

@kitsunet kitsunet merged commit b4fd77b into 9.0 Oct 17, 2024
10 checks passed
@mhsdesign mhsdesign deleted the task/add-workspace-contentstream-mapping-to-contentgraph branch October 17, 2024 11:11
neos-bot pushed a commit to neos/contentgraph-doctrinedbaladapter that referenced this pull request Oct 18, 2024
neos-bot pushed a commit to neos/contentgraph-doctrinedbaladapter that referenced this pull request Oct 18, 2024
neos-bot pushed a commit to neos/contentrepository-testsuite that referenced this pull request Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants