Skip to content
This repository has been archived by the owner on Sep 1, 2024. It is now read-only.

How do we document the custom query predicates? #9

Open
savetheclocktower opened this issue May 19, 2024 · 2 comments
Open

How do we document the custom query predicates? #9

savetheclocktower opened this issue May 19, 2024 · 2 comments

Comments

@savetheclocktower
Copy link
Contributor

I need to be able to use our current documentation system to document the custom query predicates — the things that let us do funky stuff in the Tree-sitter query-files.

(foo (bar) @baz
  (#is? test.first))

Here’s why:

  • Right now, these methods are mentioned in several places in the “creating a modern Tree-sitter grammar” documentation. I hint at their awesome power and then say things like “Consult the ScopeResolver API documentation for more information.” That documentation doesn't exist yet.
  • Like any other part of the API, methods could be added or enhanced from version to version. The information about which predicates are available on various versions should live in the documentation comments just like the JavaScript parts.
  • Query predicates are extremely similar to methods. They take arguments and have return values; they need to be illustrated with examples. It would be harder to develop a parallel system of documentation for query predicates than to repurpose our existing system.

But this is a bit tricky because our system is designed to document classes: instance methods, instance properties, class methods, and class properties. Right now most of the query predicates are defined on an object literal defined at ScopeResolver.TESTS, and joanna won’t pick up on that stuff. It’s so very designed around capturing data relating to classes.

I’ve thought about this a lot and here are the options I can think of:

Option A: Figure out how to get joanna to pick up on non-class-related stuff, then introduce new syntax to AtomDoc to make it more malleable

I’ve probably spent 45 minutes on this task. It’s a bit headache inducing and I’m not sure I made any progress. If I don't change anything on the code side, then the task becomes

  • Detecting when an object literal is assigned to an identifier,
  • then finding the comment directly above it (if it exists).

I don't think I know enough about @babel/parser to make this a simple task.

And if I did figure it out, how would we present it to the user? For instance:

  • Can I extend the format that’s already there to describe methods that are hanging off of an object literal, rather than only methods that are defined on classes?
  • Can I make it understand that ScopeResolver.TESTS can be a top-level documentation page with its own static methods, even though it’s a member expression instead of a top-level identifier like all the others?
  • Can I tell it to ignore the internal method signatures and present only the arguments that are relevant to how these methods will be consumed (inside .scm files)?

Pros: The way it probably ought to be done, if I’m honest.

Cons: Seems very hard.

Option B: Document every single predicate in one long description block for ScopeObserver (or some other class)

AtomDoc comments are just Markdown, and the existing docs take advantage of that fact. Some of the more thorough doc blocks have internal headings and get pretty elaborate with their Markdown.

With that in mind, I could just add one heading for each query predicate and describe them all with an incredibly verbose doc comment block, complete with examples and expected return values.

Pros: No code has to change. Could ship very soon.

Cons: Probably can’t link to individual methods. Or, if the headings have anchors and can be linked to individually, there’s still no way to generate hovercard links and cross-reference methods. Also, it stuffs everything into one gigantic comment that’ll probably be hundreds of lines long.

Option C: Create a “dummy” class whose purpose is just to make the documentation easier

Suppose that I defined each test predicate as an instance method of a class.

class ScopeResolverTests {
  first(node) {
    // ...
  }
  last(node) {
    // ...
  }
  firstTextOnRow(node, value, existingData, instance) {
    // ...
  }
  // ...et cetera
}

Then I could write documentation for that class; and instead of defining them directly on ScopeResolver.TESTS, I could say

ScopeResolver.TESTS = new ScopeResolverTests();

This would make it possible to write documentation for ScopeResolverTests and each of its instance methods.

Pros: Ability to generate hovercards. Would look identical to other API documentation UI-wise.

Cons: All those methods would get documented with their internal parameters list, rather than the parameters I want them to have. For instance, firstTextOnRow above would have all its parameters listed, even though that’s not relevant to the interface used in query files. Also, whatever name(s) I pick for the dummy classes (like ScopeResolverTests) are ultimately “fake” and probably misleading to expose.

Option D: Give up on documenting them via the AtomDoc toolchain and just add a page to the launch manual

Pros: Can do whatever I want with the Markdown.

Cons: Doesn’t live alongside versioned API documentation, so I’d have to include version information alongside each predicate (valid since 1.X, changed behavior in 1.Y, deprecated in 1.Z). Doesn’t even live in the source, so we’d have to work harder to ensure it stayed up to date. No hovercard ability.


Some of these options would be easier if we had more flexibility over how data was presented. Here are some ideas for enhancements to AtomDoc and/or this Eleventy site that would make things easier:

Ability to override method signatures

I think this information is generated automatically from the syntax tree right now, but it would be interesting if we could override that by specifying a “manual” parameter signature somehow. For instance:

// Public: Passes if this node starts on the same row as the one in the described
// position. Accepts a node position descriptor.
// 
// (Signature:) test.startsOnSameRowAs(descriptor)
// 
// - `descriptor` {String} A position descriptor that refers to another node’s
//   `startPosition` or `endPosition`.

(This would imply a way to include an arbitrary key/value pair in the data, and lots of would-be syntaxes clash with other Markdown things, so I just picked one for these examples.)

If a custom signature is present, then the EJS templates could be told to use that signature if present, falling back to the one that was generated by AtomDoc.

We had this in PDoc roughly 20,000 years ago; in fact, we only had manual method signatures. We did it this way because no existing tools (not even JSDoc, which did exist back then) could understand some of the concepts in Prototype — methods that could be called “generically” or as instance methods on a DOM node, for intance — much less statically analyze them.

Ability to override class display name

I’m reluctant to hang methods off of a dummy class name like ScopeResolverTests because that name has no meaning. It’s not something you can import, it’s not a class you’d ever receive an instance of as a return value from a method… it’s just a construct that can accept documentation comments.

That downside would be almost nonexistent if I could customize the display name of the class — even to the point where it can accept an arbitrary string. If, instead of being called ScopeResolverTests, the title of the page were Grammar query predicates, then it’d be obvious (even to a user browsing the API docs for the first time) that that page doesn’t document a typical class.

// Public: Custom predicates that can be used in all sorts of query files.
// 
// (Display name:) Grammar query predicates
// (Hovercard slug:) Predicates
class ScopeResolverTests {
  // ...
}

This would make hovercards trickier, but perhaps (a) the original identifier name could be remembered and would still be valid to use inside of hovercard references, or (b) we could define a separate hovercard “slug” that would represent its alias within hovercard references.

Or really just the key/value parts

Honestly, once we have the ability for an AtomDoc block to capture arbitrary key/value pairs, the the world is our oyster. We could invent whatever syntax we wanted and use it along with the EJS templates to create override logic for any part of the method documentation markup block.


At this point, failing any enhancements to AtomDoc, I’m leaning toward option B or option D. This isn’t such a crucial thing that we need to spend engineering time on it, but I’m posting this to write down my thoughts and to see if anyone has a better suggestion.

If I have a stroke of insight, I might revisit this later and try to implement one of the enhancements I just mentioned.

@savetheclocktower
Copy link
Contributor Author

OK, spent the morning scratching some itches, and I think the way to go is to add some overrides to AtomDoc.

I added a bit of logic to tello for extracting custom key-value pairs in the following format:

// !((foo: bar))
// !((baz: thud))

I'm not committed to that syntax; I just wanted something that was unambiguous enough to rule out any false positives. I just look for the exact sequence !((, then find the next occurrence of ))\n or ))[end of string]. Any such key value pairs then get placed into a field called customData on that item's metadata. This works whether we're handling a class, a method, or a property.

Once we've done this, it's up to our EJS templates to decide how to proceed. If we specify an override for name, we don't necessarily want to replace the name generated by AtomDoc; we just want to prefer it in certain contexts. So I tested this out by adopting Option C above — I created a class called QueryScopeTests, defined each scope test as a static method on that class, then did

ScopeResolver.TESTS = QueryScopeTests;

Then I added this block at the bottom of its AtomDoc:

//
// !((name: Scope tests in grammar query files))
//

Now I can prefer customData.name when generating a “display name” — shown as the page's title, plus whenever we link to it in a list — but keep the original name (QueryScopeTests) in the URL and for hovercard resolution.

Screenshot 2024-05-19 at 3 45 00 PM

I can go further for each method:

{
  // Public: Passes if this node ends on the same row as the one in the
  // described position. Accepts a node position descriptor.
  //
  // !((name: test.endsOnSameRowAs))
  // !((signature: (#is? test.endsOnSameRowAs nodePosition)))
  //
  // - `nodePosition` {String} A node position descriptor that points to a
  //   buffer position. If the given position has the same row as this node’s
  //   start position, the test passes. If the descriptor fails to resolve,
  //   this test will be ignored.
  //
  // ```scm
  // (#is? test.endsOnSameRowAs previousSibling.endPosition)
  // ```
  //
  static endsOnSameRowAs(node, descriptor) {
    let otherNodePosition = resolveNodePosition(node, descriptor);
    return otherNodePosition.row === node.endPosition.row;
  }
}

Here I'm generating a custom name (for display instead of .endsOnSameRowAs) and a custom signature (instead of .endsOnSameRowAs(node, descriptor). The results basically look like what I want:

Screenshot 2024-05-19 at 3 37 35 PM

(The custom name is shown in the sidebar navigation instead of the whole signature.)

This custom stuff has no impact on any of the parts of the API documentation that don't need it. The worst part of this is that I still have to do Option C — define a dummy class just to trick AtomDoc into caring about this group of methods.

What made all of this pretty easy is that nothing had to change until the tello phase of parsing. So here are some stretch objectives:

  • Figure out how to get joanna to care about a construct like ScopeResolver.TESTS — perhaps not by default, but at the very least if we opt into it via some property — and treat it as a page “root”
  • Figure out how to get multiple groups of methods — like ScopeResolver.TESTS and ScopeResolver.ADJUSTMENTS — to share a “root” even though they aren't owned by the same object. This would save me from having to put them on separate pages, and would let me instead separate them by section name (which is how a class's methods get organized in AtomDoc)

@confused-Techie
Copy link
Owner

Sorry it's taken so long to respond here, but I'm glad we already agree on best methods, I think adding overrides for everything is a great way to go, since there's even some functions such as pulsar.ui which are nearly impossible to document in our current form.

So I'm all for overrides, and would be happy to implement these in EJS and get them working, since tbh I'm not the biggest fan of AtomDoc in it's current form as it feels so strict full of the babel stuff I don't quite understand.

So this seems like an awesome solution to me

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants