Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split XML node data of ElementInfo into NodeInfo, ElementInfo, and TextInfo #163

Open
Spartan322 opened this issue Apr 29, 2022 · 12 comments
Labels
Milestone

Comments

@Spartan322
Copy link
Collaborator

Spartan322 commented Apr 29, 2022

As it stands and was said by Fayti1703 the current structure of ElementInfo has too much going on in it. A text node doesn't support names, children, or attributes and an element node doesn't intrinsically represent a value/content. (leaving attributes because they're more useful on the element) As a result it makes more sense to treat them distinctly and only modify their behaviors.

Proposal:

class NodeInfo
{
    public ulong NodeId { get; }
    public XmlNodeType Type { get; set; }
    public NodeInfo Parent { get; protected set; }

    public NodeInfo(NodeInfo parent, XmlNodeType type);

    public virtual bool TrySetParent(NodeInfo info);
    public NodeInfo SetParent(NodeInfo info);

    public virtual void WriteToXML(XmlWriter writer);
    public override string ToString();
}
class ElementInfo : NodeInfo
{
    public string Name { get; set; }
    public readonly Dictionary<string, string> Attributes;
    public readonly List<NodeInfo> Children;

   public ElementInfo(string name, Dictionary<string, string> attributes = null, List<NodeInfo> children = null, NodeInfo parent = null);

    public virtual string GenerateContent();

    public virtual bool TryAttributeAsBoolean(string attribName, out bool result);
    public virtual bool TryAttributeAsInt(string attribName, out int result);
    public virtual bool TryAttributeAsFloat(string attribName, out float result);
    public bool GetAttributeAsBoolean(string attribName, bool defaultVal = default);
    public int GetAttributeAsInt(string attribName, int defaultVal = default);
    public float GetAttributeAsFloat(string attribName, float defaultVal = default);
    public bool AttributeAsBoolean(string attribName);
    public int AttributeAsInt(string attribName);
    public float AttributeAsFloat(string attribName);

    public virtual bool TryAddChild(NodeInfo child);
    public virtual bool TrySetAttribute(string key, string value);
    public ElementInfo AddChild(NodeInfo child);
    public virtual string GetAttribute(string key, string defaultValue = null);
    public ElementInfo SetAttribute(string key, string value);
}
class TextInfo : NodeInfo
{
    public string Content { get; set; }

    public TextInfo(string content, NodeInfo parent = null);

    public virtual bool TryContentAsBoolean(out bool result);
    public virtual bool TryContentAsInt(out int result);
    public virtual bool TryContentAsFloat(out float result);
    public bool GetContentAsBoolean(bool defaultVal = default);
    public int GetContentAsInt(int defaultVal = default);
    public float GetContentAsFloat(float defaultVal = default);
    public bool ContentAsBoolean();
    public int ContentAsInt();
    public float ContentAsFloat();
}

If there is anything anyone wishes to be added please comment.

@Spartan322 Spartan322 added this to the 6.0.0 milestone Apr 29, 2022
@Fayti1703
Copy link
Contributor

Initial thoughts:

  1. Parent should probably be a (nullable) ElementInfo with this system, at least for now. (DOM has the concept of a ParentNode, which applies to Elements, DocumentFragments and Documents, and we only have the first of those; not saying we should add the others)
  2. What would ElementInfo->GenerateContent do?
  3. Is there a reason to have ElementInfo->TrySetChild and ElementInfo->TrySetAttribute? When would those fail? (also the return type is missing in the snippet above)

@Spartan322
Copy link
Collaborator Author

Spartan322 commented Apr 29, 2022

  1. Parent should probably be a (nullable) ElementInfo with this system, at least for now. (DOM has the concept of a ParentNode, which applies to Elements, DocumentFragments and Documents, and we only have the first of those; not saying we should add the others)

Prefer to leave it so there is no need (if necessary) to create backward compat problems or allow extended behavior, should we or another developer decide to change it, given how little that behavior costs to cast.

  1. What would ElementInfo->GenerateContent do?

Will make an attempt to generate a content string every time by reading out the element's text children.

  1. Is there a reason to have ElementInfo->TrySetChild and ElementInfo->TrySetAttribute? When would those fail? (also the return type is missing in the snippet above)

The Try methods are to be the core functionality for those behaviors that can be extended and detect if the values were actually changed.

@Fayti1703
Copy link
Contributor

  1. Parent should probably be a (nullable) ElementInfo with this system, at least for now. [...]

Prefer to leave it so there is no need (if necessary) to create backward compat problems or allow extended behavior, should we or another developer decide to change it, given how little that behavior costs to cast.

The problem isn't casting, the problem is that, say, a TextNode conceptually cannot be a parent node, whereas this API permits you to pass one in.

  1. What would ElementInfo->GenerateContent do?

Will make an attempt to generate a content string every time by reading out the element's text children.

Propose to rename to GetTextContent.

  1. Is there a reason to have ElementInfo->TrySetChild and ElementInfo->TrySetAttribute? When would those fail? (also the return type is missing in the snippet above)

The Try methods are to be the core functionality for those behaviors that can be extended and detect if the values were actually changed.

I still don't see a case where someone would expect them to fail. The point of Try methods, from what I understand, is to permit graceful failure handling.

@Spartan322
Copy link
Collaborator Author

Spartan322 commented Apr 29, 2022

The problem isn't casting, the problem is that, say, a TextNode conceptually cannot be a parent node, whereas this API permits you to pass one in.

No it doesn't, while in a technical sense a TextNode could be attempted to be assigned as a parent it doesn't support child nodes and its trivial to make TrySetParent fail in the case you try. That's part of what TrySetParent can account for. If we want it to throw an exception we could but I'd say TrySetParent is good enough. If necessary could have NodeInfo use a public virtual bool SupportsChildren { get; } = false; property so its less hardcoded.

Propose to rename to GetTextContent.

Its not going to be a getter method, it will regenerate it every time, reason I didn't want to call it a Get is specifically because it should not be treated as equal to Content as it currently stands. Perhaps I can change the name to GenerateTextContent but I feel that would be redundant since ElementInfo can't have regular Content anymore and ToString already prints out the full XML, and I there isn't really good reason to treat XML elements in an element as content.

I still don't see a case where someone would expect them to fail. The point of Try methods, from what I understand, is to permit graceful failure handling.

Its for the sake of extensibility, we can define whether something has failed without describing what specifically should happen as a result of failure.

@Fayti1703
Copy link
Contributor

The problem isn't casting, the problem is that, say, a TextNode conceptually cannot be a parent node, whereas this API permits you to pass one in.

No it doesn't, while in a technical sense a TextNode could be attempted to be assigned as a parent it doesn't support child nodes and its trivial to make TrySetParent fail in the case you try. That's part of what TrySetParent can account for.

If the restriction is type-based, why aren't we expressing that using the type system?

If we want it to throw an exception we could but I'd say TrySetParent is good enough. If necessary could have NodeInfo use a public virtual bool SupportsChildren { get; } = false; property so its less hardcoded.

Given that there's a couple more requirements for a thing to support children, that should probably be a separate interface.

Propose to rename to GetTextContent.

Its not going to be a getter method, it will regenerate it every time,

Getter methods can in principle still regenerate every time.

reason I didn't want to call it a Get is specifically because it should not be treated as equal to Content as it currently stands.

I'm not suggesting we make it a property. Having it as a separate method, even with a Get name, should be enough of a difference there imo.

Perhaps I can change the name to GenerateTextContent

Generate reads to me as if it creates content, which this doesn't.

I still don't see a case where someone would expect them to fail. The point of Try methods, from what I understand, is to permit graceful failure handling.

Its for the sake of extensibility, we can define whether something has failed without describing what specifically should happen as a result of failure.

I still don't see a case where someone would expect them to fail. When would setting an attribute value fail?

@Spartan322
Copy link
Collaborator Author

Spartan322 commented Apr 29, 2022

If the restriction is type-based, why aren't we expressing that using the type system?

Because its not.

Given that there's a couple more requirements for a thing to support children, that should probably be a separate interface.

I guess.

Getter methods can in principle still regenerate every time.

If it wasn't O(n) for the dictionary, but there is no expectation that its a simple getter method. If it was merely generating something simple and quickly I'd have made it a property getter, but that's not its intent.

I'm not suggesting we make it a property. Having it as a separate method, even with a Get name, should be enough of a difference there imo.

Addressed above.

Generate reads to me as if it creates content, which this doesn't.

But it does so every time. Every time its called it will read from the Children and combine all the text nodes with \n, it will never store the value.

I still don't see a case where someone would expect them to fail. When would setting an attribute value fail?

For extensibility reasons say we or someone else wanted to build an element that was readonly, setting an attribute would most certainly want to fail but it still be more convenient for someone to have it as an ElementInfo for general use. Since it costs near nothing to have this modularity and we can't predict exactly what people may do, better to create default behavior with the classes and leave the rest to inheritance implementations should people so want.

@Fayti1703
Copy link
Contributor

If the restriction is type-based, why aren't we expressing that using the type system?

Because its not.

We literally have two types here, only one of which can be a parent node. If someone else wants to extend this system, they have to create their own type. I'd say that's pretty type-based.

Getter methods can in principle still regenerate every time.

If it wasn't O(n) for the dictionary,

It being O(n) is a valid concern here, though I've seen O(n) Get methods before.

but there is no expectation that its a simple getter method. If it was merely generating something simple and quickly I'd have made it a property getter, but that's not its intent.

That's kind of my point -- making it not a property getter already conveys that it doesn't follow the the usual property rules. Not that I'm against further making it clear that it's not a simple operation, but:

Generate reads to me as if it creates content, which this doesn't.

But it does so every time. Every time its called it will read from the Children and combine all the text nodes with \n, it will never store the value.

It still doesn't generate anything. It only really collects a bunch of values and concatenates them into a string. That's not really generative.

I still don't see a case where someone would expect them to fail. When would setting an attribute value fail?

For extensibility reasons say we or someone else wanted to build an element that was readonly, setting an attribute would most certainly want to fail but it still be more convenient for someone to have it as an ElementInfo for general use.

That's a fair point. I'm not quite sure why you'd want to extend this system with a readonly element &c, but it's conceptually sound.

Since it costs near nothing to have this modularity and we can predict exactly what people may do, better to create default behavior with the classes and leave the rest to inheritance implementations should people so want.

*can't
That aside, the main reason I was against it is just because I honestly didn't see a use for it?
I mean, I still think if you wanted to support read-only anything, it should be a separate interface (since that way you can't even try to do a write operation; it just isn't available), but I can see a point there.

@Spartan322
Copy link
Collaborator Author

It still doesn't generate anything. It only really collects a bunch of values and concatenates them into a string. That's not really generative.

Note that this would be generative because the value doesn't exist beforehand and it will make the value exist afterwards, and it would redo all that work if called again, it will not save the value. Creating a new string from a list of data on every call would definitely fit into the category of generative.

@Fayti1703
Copy link
Contributor

I disagree on that -- the data is already there, just not as one thing. Collative, not generative.

@Spartan322
Copy link
Collaborator Author

Spartan322 commented Jul 30, 2022

I disagree on that -- the data is already there, just not as one thing. Collative, not generative.

That's a semantics debate, the fact of the matter is that new data will be produced from existing data every time, (just because not all the data is being recreated does not constitute whether its generative, its specific to what's happening which is generating the content from the children) it will collect the data and produce a new product that does not share anything with the original data and is completely forgotten by the producer, it will not be cached and will not be recalled. I don't see how that can't be considered generative

@Fayti1703
Copy link
Contributor

a new product that does not share anything with the original data

It's made out of the concatenation of parts of the original data. That's not "not shar[ing] anything".

My main problem with this is that if I see a signature like string GenerateContent(), I'd think it would rng up a new string every time, not just collate the existing text content.

@Spartan322
Copy link
Collaborator Author

Spartan322 commented Jul 30, 2022

It's made out of the concatenation of parts of the original data. That's not "not shar[ing] anything".

It doesn't because it neither reacts nor changes according to the data, nor does it get cached.

My main problem with this is that if I see a signature like string GenerateContent(), I'd think it would rng up a new string every time, not just collate the existing text content.

That's not what generate means though, generate literally means producing a product as a result, in this case its producing something from existing data, this isn't new in software, its fairly common for a new production of a value from existing known/runtime data to be called a generate function. If we take Javascript for example, generator functions almost literally refer to the same semantics, in that case it produces a new list because they're basically builtin enumerator returns in JS, but they rarely ever return anything but existing data in an enumerable format.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants