-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Propose block APIs for backwards compatibility #413
Conversation
I was just talking with @nylen about adding a "legacy" section to the individual block specification issues we have, describing the possible legacy structures each core block may have. Sounds like useful information to keep in mind as we explore this. |
|
||
compatibility: [ | ||
{ | ||
tagName: 'p' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the html contains a p
but its content can not fit inside the text block. (For example if the p has a style attribute, or if the p contains complex HTML, shortcodes etc...). I think these compatibility matchers should be stricter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we care what comes inside of the p
tag as long as the text editing tool (tiny) can handle the markup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I think that not only should the compatibility matchers be as strict as possible, but also the parsing itself. This should be the default option; we may also need a more flexible option like the initialize
function proposed here, but I think we need to provide a standard way to handle the cases we expect to occur commonly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it that text compatibility is a poor example here, and instead unparseable text should simply fall back to the freeform block type? In which case, we should clarify the role of the freeform block vs. the text block.
Later in the documentation is consideration for when a block compatibility initializer needs to bail on an unsupportable type.
I agree it's less clear on how to handle compatibility matched elements which contain descendant matchable blocks. Might tie into discussion of support for nested blocks?
I feel like we should do this the same way as we parse the "default" case, and...
...this doesn't feel like the right way to do it...
...better...
... this imposes the requirement that a DOM be present in order to parse content into blocks, which (1) is effectively already in place, because of the querying method used to obtain block attributes, and (2) doesn't seem ideal. As far as @azaozz posted a particularly ugly markup example in Slack after taking our sample post content and following a fairly simple sequence of operations (pasting, typing, moving an image) in another editor (question: which one?): <p><!-- wp:heading size:h1 --></p>
<h1>1.0 Is (Not) The Loneliest Number</h1>
<p><!-- /wp --><!-- wp:text align:no-align -->I imagine... This suggests a few things to me:
Maybe we could also enhance |
In general I feel like parsing and validation of block markup needs to become far more strict - some details in #391. Ideally I would like to see the markup for a block validated against an entire tree of elements using some sort of schema. Ideally (again) this could be used to deserialize the markup into the block's data structure at the same time. Then I suppose the Anyway, @iseulde and I hope to have a PR soon that takes steps in this direction. |
Why?
In what ways?
Can you elaborate this point?
Is it a requirement of the project to accommodate the output quirks of various third-party WYSIWYG?
I worry about this too. Might depend on just how much content sanitization we perform, i.e. how direct a port of |
@aduth these are excellent questions and I found that I needed to take some time to clarify my thoughts here.
Yes, I do think we need to convert the data stored in This part is a bit more of a pipe dream, but I'm hopeful that once this is working, block comment delimiters become even more advisory and optional (for example, you can delete or mangle them and the editor will still correctly deduce block types, except in cases where there are chunks of markup that could validly belong to more than one kind of block).
We may end up needing to do this anyway, but it doesn't seem ideal because then we require anything that parses post content into blocks (mobile, and server-side code for example) to have and use DOM implementation as well.
Not necessarily, but here is a much more concrete example of the kind of thing I'm concerned about:
Losing content is the worst case - right now I am seeing other, different kinds of breakage when testing this out - but I think we can fix this entire class of issues by changing up the way we do parsing. So far I've been exploring ways to do that, see inikulin/parse5#33 (comment) for one example. It's still not entirely clear to me how to make it work but I promise I will have a PR up soon. |
Totally on board with this. Aside: One area where I think we're currently struggling to this goal is with managing the value of the
This is a valid concern, and I think we need input from mobile teams sooner than later. One of the issues I quickly encountered with this is that even if we have a standard grammar in which any environment could understand the presence of a block, we'd need context-specific implementations for displaying and updating the markup of that block. We could defer this to be performed entirely on the server, but at the sacrifice of a requirement that the user interface feel highly interactive. As much as we've tried to move away from them, I'd hoped that perhaps we could simply use UIWebView / WebView as a portal of sorts to the underlying JavaScript implementation, not unlike a sized iframe. As far as server-side support goes, I think we need a better understanding of the requirements to judge what sorts of things we need to spend effort supporting. |
Some related discussion of compatibility APIs at #429 (comment) |
This is why in a single |
@iseulde Do you have any ideas for how we could incorporate elements of your pattern into the existing plugin? I do like the idea of |
@aduth It would require an entire parse of the content so we can create a virtual DOM, and take it from there. We could do this at a block level, but I don't see why we should do that rather than a single step. The TinyMCE (SAX) parser could be used for this, as it has been proposed, but any other parser could be used too. It's important to note that, if we use TinyMCE for the parsing, it has nothing to do with the rest it's used for. We should see it as a separate module. The parsing part does not touch the editor DOM. We build a virtual DOM as the parser gives pieces to us, then the editor DOM (nothing at init) is updated to reflect the virtual DOM state. |
The parser could be TinyMCE's SAX parser, but just as well anything else if we find that better and faster. (Not sure.) Draft.js also just uses the browser to parse HTML strings: https://github.com/facebook/draft-js/blob/master/src/model/paste/getSafeBodyFromHTML.js |
See also https://github.com/facebook/draft-js/blob/master/src/model/encoding/convertFromHTMLToContentBlocks.js which uses getSafeBodyFromHTML.js |
Great links, thanks for sharing. One of my main concerns is: how much of this do we expose to the block implementer? Ideally they're concerned primarily with describing the output of the block and perhaps providing hints from which the output can later be transformed back into a data shape. Passing and maintaining a string value to and from |
Regarding registration, it will work pretty much as it works now. The virtual DOM can be described in JSX (or not). Getting props/attributes out of the virtual DOM can be done with these light selectors, much like hpq with DOM. I'm not sure what else the block implementor should touch or be concerned about? Only describing block output and selecting props? |
Specifically I'm thinking about values managed by the https://github.com/WordPress/gutenberg/blob/b4945ab/blocks/library/text/index.js#L54-L55
|
It would just be passed around, they would never have to touch the Example: function getProps( vnode, h ) {
return {
align: h.getClass( vnode, 'text-align-' ),
content: h.getChildren( vnode )
};
} |
Another: function getProps( vnode, h ) {
return {
align: h.getClass( vnode, 'align' ),
alt: h.getAttribute( h.find( vnode, 'img' ), 'alt' ),
src: h.getAttribute( h.find( vnode, 'img' ), 'src' ),
caption: h.getChildren( h.find( vnode, 'figcaption' ) )
};
} |
Getting props/attributes out of the vnode should be as far as they should be concerned about it. |
So the most direct path to this pattern could be a new:
(Ignoring type of incoming This would certainly help to resolve #421. |
@iseulde just an implementation detail of course but one nice thing about per-block parsing is the performance characteristics of it and how changes don't trigger repress of the whole document.
@aduth in conversation with @nylen it seems reasonable that some blocks should have an explicit opt-out. TinyMCE would be a good example. We could try to discourage the use of the opt-out but in cases like that where we have a stateful HTML editor it could make sense to simply let it do its thing. ^^^ I would almost rather see us make a straightforward data interface with a hard spec so that block authors could choose how they want to access the information. supposing we had a VDOM structure like I've mentioned before |
@dmsnell Sure, it would make it easier to access stuff without helper functions, but I'm not sure how much though. You'd still need selectors to access nested children easily, we'd need to convert from jsx-like structures as it's more painful to write, and we can't change the data structure to whatever is best to work with. At some point I was also looking at something like |
having a query language for accessing tree nodes is something we will probably need regardless. my main point is that by having the attributes directly accessible as a JS object means we can leave those bare and free for individual manipulation if someone wants getProps = node => ( {
caption: findNode( node, 'img' ).attrs.alt,
score: node.children.length,
} ) we could even make tag-level attributes first-class on that VDOM node <!-- wp:core/image attrs='{ "mediaId": 14 }' -->
<img src="myurl" alt="look at me!" />
<!-- /wp:core/image -->
{
type: 'block',
name: 'core/image',
attrs: { mediaId: 14 },
children: [
{
type: 'tag',
name: 'img',
src: 'myurl',
alt: 'look at me!'
}
]
}
I think that I'm missing a few antecedents here. are you saying it's objectively easier to write JSX than VDOM nodes?
missing something here. can you clarify? why can't we change the data structure? is this a project constraint you are talking about or a consequence of using a specific implementation for the data format?
That's an interesting idea. It might work in the PHP side too. I tried it out and was hoping we could abuse the array indexing to make it work but it didn't. Honestly I wonder though how much easier this is than just calling children… node[ 0 ][ 1 ][ 0 ]
node.children[ 0 ].children[ 1 ].children[ 0 ] a helper here makes more sense to me - something akin to the following snippet. but just sugar and nothing of substance getChildAt = function( ...path ) {
return path.reduce( ( node, index ) => node.children[ index ], this );
} |
I've been reluctant to close this because there's useful discussion going on here, but the changes have very much stagnated, and it's probably best to continue conversation in an issue instead (#589). There's also ongoing effort to enhance the Freeform block to handle legacy content structures, i.e. embedding the "old editor" as a block (#1394). |
This pull request explores possible technical APIs for enabling a block to opt in to supporting content shapes which had existed prior to the existence of blocks. I expect there will be some debate around the precise behavior and support, and this is meant to get the ball rolling with a few specific ideas.
The proposal here seeks to provide a single pattern satisfying each of the following requirements:
<p>
->core/text
,<blockquote>
->core/quote
)[caption]
->core/image
)core/header
->core/text
)See included documentation for specific usage
In all cases, it enforces the same expectation that a block should be represented in object form by its attributes, assuming that a compatibility initializer will return attributes of a block instance created from the compatible content.
It was also intentional that a block describes what it can be initialized from rather than what it can be transformed into, to support extendability of blocks and to preserve the idea that blocks only need to be concerned with itself except in the case that it knowingly describes intent to support other content forms.
Open questions:
A few assumptions have been made that are likely ripe for debate:
wpautop
will be applied before parsing? Will this break parsing? Do we want to try to eliminatewpautop
altogether? How would we provide compatibility for associating newline-separated text as text blocks if we were to do so?