Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tree-sitter running fixes (August edition) #677

Merged

Conversation

savetheclocktower
Copy link
Contributor

The newest running fixes PR. Some of these I really should've tried to get into 1.108.

* When a list of parameters is formatted with one on each line, you 
should be able to fold up the whole thing.
* Block comment folds should end just before the closing delimiter, no 
matter what their indentation level.
Instead, move that code to `language-typescript/lib/main`.
`fun` expands to a simple function definition in JavaScript, but there 
are certain contexts where that syntax is invalid, making the snippet 
much less useful.

Now that we're scoping the insides of class bodies and object literals, 
though, we can define more specific snippets for those contexts that use 
the correct syntax.

The class-instance-method snippet and the inside-an-object-literal 
snippet are identical, except that the latter has a trailing comma.
There were some constructs that we neglected to highlight properly, like 
type predicates and rest parameters.
For example: if tab size is set to 2, we should interpret a line with 3 
leading spaces as having been indented 1 level, not 1.5.
Also, recover from a malformed `highlights.scm` on load by loading an empty query file instead. It'll be obvious that the real query file failed to load.
Checking on the grammar first means that we get no opportunity to override the answer. It also fails to account for languages in which the correct comment delimiters vary based on context.

Instead, we check settings _first_, then fall back to the grammar itself.
@savetheclocktower
Copy link
Contributor Author

I like 37a81cc so much that I'll probably talk about it in an eventual blog post. Instead of fun inserting the same snippet no matter where your cursor is, it inserts the correct one for your context.

This is the sort of thing that would be very hard to do in a TextMate-style grammar, but is pretty easy with Tree-sitter.

@confused-Techie confused-Techie linked an issue Aug 17, 2023 that may be closed by this pull request
1 task
Anything that is one of the reserved type words in C must be a type in all contexts. And any identifier that ends in `_t` is, by strong convention, a type.
@savetheclocktower savetheclocktower linked an issue Aug 21, 2023 that may be closed by this pull request
5 tasks
Previously, we threw an error when a scope adjustment violated its bounds constraints, but that's a bit disruptive for everyday use. Instead, we throw an error in dev mode (so that the grammar's author doesn't fail to notice the problem), but downgrade it to a warning outside of dev mode so that it's recoverable.

There's a chance that the warning will be _too_ subtle, but we'll give it a shot.

We also include more diagnostic information so that it's clearer exactly _where_ the violation is happening.
@savetheclocktower savetheclocktower marked this pull request as ready for review September 11, 2023 22:17
@savetheclocktower
Copy link
Contributor Author

Moving this draft PR to ready-to-review so that it can get reviewed before this month's release.

Copy link
Member

@confused-Techie confused-Techie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I won't attempt to speak on the performance of these changes, I will say:

  • Changes seem syntactically correct
  • Changes seem small enough to be reasonable
  • Any slight refactors (Mostly spacing related) are rad, and awesome to see

So I'll give this a light approval, since @savetheclocktower should have something for their awesome work to be merged, but of course this review is superseded by anyone more familiar with the system.

One thing I'm curious on though, in language-css/grammars/tree-sitter/queryies/highlights.scm there's that huge chunk of text, which seems to be a large portion of any constants that exist within CSS. Which while that makes sense, from my work over on autocomplete-css I can almost guarantee there's bound to be even more constants missing from here, or at the very least these will be a pain to update.

I wonder, the same way we have a script that updates our completions.json for autocomplete packages, do you think it's at all possible to automatically update big lists like this? Or is it uncommon enough that we shouldn't care? Otherwise if we could maybe do one of the following:

  • Use some intermediate build step, where we take the file with some list identifier like $CSS_LIST then in the build step we replace that ID with the actual list of constants retrieved from elsewhere
  • Maybe this scm file could read it from a JSON file? Or some other form of file to store the list in that could be updated automatically easier
  • I know we have introduced a few custom functions in this space, is it possible we could have a custom function for the scm file to read from a file? So then we could do the above suggestion, of using a JSON file or something that can be updated automatically much more easily

Just putting some thoughts down, as this seems slightly painful to maintain, and I already have some scripts that automatically update the completions for CSS, and I bet I could get something together rather quickly to get all CSS constants. But not sure if this is worth it, or what you thought

@savetheclocktower
Copy link
Contributor Author

Those are all fine ideas. I don't enjoy having a long allow-list of property names like that in a grammar file — I know that it dooms us to have to update it over time. I did it in this case in order to provide feature parity with the TextMate-style CSS grammar.

My eventual goal here is your third suggestion pretty much exactly — to allow grammars to write custom predicates that would look something like this…

((property_name) @support.type.property-name.css
  (#is? language-css.validPropertyName))

…where language-css.validPropertyName would outsource to a function in the language-css bundle that checked the text of the node against a list that could be generated by a machine.

Part of me is reluctant to do this because custom predicates are a bit of a Pandora's box, but I figure we can impose constraints (e.g., must be synchronous) that would keep them from being too chaotic. And I think it's still better than putting this information into a query file.

The classic example of a grammar that would benefit from this is the PHP grammar. The TM-style PHP grammar devotes so much of its space to matching lists of functions — stdlib and common PHP extensions — and when I ported that grammar over to Tree-sitter I kept only a small subset of those.

@confused-Techie
Copy link
Member

Well I'm glad to see we are on the same page here.

I think any number of constraints seems totally fine by me, however strict would be needed to keep things easily manageable. with possibly a high chance of bailing entirely. Having this outsource to a function is honestly a super clever idea, it allows for the possibility that this function can do literally anything it needs to, rather than the list having to be some file.

Since with a function it could be regex, it could change over time, could be a file. Really good plan

@savetheclocktower savetheclocktower merged commit 69038a1 into pulsar-edit:master Sep 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants