Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore blacklisted delimiters altogether. #1

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

Fanael
Copy link
Owner

@Fanael Fanael commented Oct 27, 2014

Blacklisted delimiters don't contribute to depth anymore.

This was suggested by a user some time ago.

In theory, ignoring blacklisted delimiters is a very simple operation. You just need to modify the syntax table so they're not delimiters, so syntax-ppss doesn't count them when calculating depth.

Okay, I lied. It's not that simple. You see, the buffer-local syntax table is not the only place syntax table entries can come from. Text properties and overlays are cromulent sources of syntax table entries, too. Even ignoring the, admittedly ridiculous, idea that the major mode and other minor modes are messing with us by changing text properties on random characters, it's much harder than modifying the syntax table. Besides, we only just stopped modifying the syntax table.

Let's assume we've dealt with syntax entries somehow. We're done now, aren't we? After all, determining the depth is just a (syntax-ppss) call away, right? No. syntax-ppss is reasonably fast because it uses a cache internally. If you change the syntax in any way without telling syntax-ppss about it, it will use its cache, and that cache will contain stale entries from before the syntax change.

Okay, so we can't use syntax-ppss without telling it to drop the caches. So maybe, I don't know, tell it to drop the caches? (syntax-ppss-flush-cache 0) is not exactly rocket science, is it? As it happens, there's exactly one call to syntax-ppss, so dropping the caches not only will turn that call into (parse-partial-sexp (point-min) pos), it also will slow down all the code calling (syntax-ppss), because after rainbow-delimiters--propertize is done, the cache has to be dropped again, pretty much turning all the calls to syntax-ppss into parse-partial-sexp from (point-min).

If that's the case, would building our own private cache around parse-partial-sexp be a good idea? Maybe. But it duplicates existing code, and there's really nothing wrong with syntax-ppss. Besides, we only just stopped doing that, too.

So, what does this patch do?

I feel filthy having created this atrocity. This patch is made 100% of completely irredeemable pure evil. And it's not the nice kind of evil that turns Emacs into a decent editor. No, it's the evil that murders puppies just for the hell of it. Why, you ask?

  • It hooks into syntax-propertize-function and calls the original function. Why is this chaining bad? This comment, found right in the middle of the syntax-propertize-function definition, explains it well:
;; Rather than a -functions hook, this is a -function because it's easier
;; to do a single scan than several scans: with multiple scans, one cannot
;; assume that the text before point has been propertized, so syntax-ppss
;; gives unreliable results (and stores them in its cache to boot, so we'd
;; have to flush that cache between each function, and we couldn't use
;; syntax-ppss-flush-cache since that would not only flush the cache but also
;; reset syntax-propertize--done which should not be done in this case).
  • It uses the category text property. Never heard of it? I'm not surprised, me too, until when I saw cc-mode using it when I was writing the rainbow-delimiters test suite. category is a text property indirection. When you put a category on some text, you can affect that text's effective properties by changing the category's symbol plist. Spooky action at a distance? You bet.
  • It cracks open the skull of syntax-ppss, removes its brain, replaces it with a different brain, does some work and then gives syntax-ppss its old brain again. That's the entire point of rainbow-delimiters--with-syntax-ppss-cache. It lets us use syntax-ppss without worrying about its cache: the one rainbow-delimiters sees is separate from the one seen by everyone else.
  • It doesn't clean up after itself very well. Even if you change the major mode, the categories are still there. Clearly something more dynamic than syntax-propertize-function is needed? There's a thing dynamic just enough: font-lock-mode. I shudder at the thought of a font-lock entry, and one run late at that, and one managed by a major-mode-oblivious minor mode on top of that, managing syntactic information.

The underlying idea, though, is pretty simple. font-lock-mode always ensures the syntactic properties are set before highlighting, so we register a syntax-propertize-function that finds all eligible blacklisted delimiters and puts a, inert at the moment, category on them. Later, inside rainbow-delimiters--propertize, we temporarily modify the syntax codes (reverting it when rainbow-delimiters--propertize is done with its job) by setting the category's syntax-table to '(1). As per (elisp)Syntax Table Internals., that's punctuation. syntax-ppss sees this and treats the would-be delimiter as any other punctuation.

The idea is okay. The implementation I'm utterly disgusted with.

@purcell: where do I turn in my Emacs Lisp hacker badge?

As it is, this patch and being mergeable are not even in the same galaxy. It's a quickly hacked together proof of concept. It may be a decent starting point, though.

@purcell
Copy link
Contributor

purcell commented Oct 28, 2014

I feel filthy having created this atrocity. This patch is made 100% of completely irredeemable pure evil.

This cracked me up. I'm not sure if this PR is a work of subversive software wizardry or dark comedy.

@Fanael
Copy link
Owner Author

Fanael commented Oct 28, 2014

I'm not sure if this PR is a work of subversive software wizardry or dark comedy.

Probably both.

@Fanael Fanael force-pushed the EVIL-ignore-blacklisted-delims-altogether branch 3 times, most recently from 5e22a05 to 214614d Compare October 28, 2014 17:15
@Fanael
Copy link
Owner Author

Fanael commented Oct 28, 2014

@purcell: do you think it's worth keeping the special case of not highlighting Emacs Lisp ?)? This syntax is discouraged, many other modes such as show-paren-mode or hl-sexp get confused by it, the current way we handle it solves one part of the problem, but causes a different problem, and I really think the onus should be on the major mode to set up the syntax table entries properly.

@purcell
Copy link
Contributor

purcell commented Oct 28, 2014

I'm probably the wrong person to ask, because IIRC I was the one who complained about that special case not being handled. At this stage I guess I don't care too much, but it would be a shame if parens got wrongly highlighted as mismatched in a source file which contains ?) or ?(.

@Fanael
Copy link
Owner Author

Fanael commented Oct 28, 2014

it would be a shame if parens got wrongly highlighted as mismatched in a source file which contains ?) or ?(

They already are, except in a different way. If you want it to stay, I guess this code can stay. In fact, I have an idea how to make it slightly less dumb.

Can you review the PR, in case I'm doing something stupid?

@Fanael Fanael force-pushed the EVIL-ignore-blacklisted-delims-altogether branch from 2e8ed7e to 7da46ea Compare October 28, 2014 21:18
Blacklisted delimiters don't contribute to depth anymore.
This comes with a redesign: instead of setting the 'category' property
in a 'syntax-propertize-function', it's done in our font-lock handler.
Removing them is done in a 'font-lock-unfontify-region-function'.
'rainbow-delimiters--category-propertize' and
'rainbow-delimiters--face-propertize*' use essentially the same code
around their loops, move the common part to a function and convert
the loop bodies to lambdas.
The notion of eligibility is now split into two:
 - eligibility, i.e. whether the character is in a comment, a string
   or is escaped. Done according to the syntax table.
 - escapedness, i.e. whether the character is escaped according to
   an arbitrary Lisp predicate. This should probably go away, IMO it's
   the major mode's job to set the syntax properties properly.

Significant changes:
 - rainbow-delimiters--for-all-delimiters tests for eligibility and
   skips ineligible delimiters
 - rainbow-delimiters--category-propertize sets up the category on
   escaped and blacklisted delimiters only
 - rainbow-delimiters--escaped-char-predicate-lisp is not needed anymore,
   as new eligibility notion subsumes it.

Conflicts:
	rainbow-delimiters.el
'blacklisted-contribute-to-depth' becomes
'blacklisted-dont-contribute-to-depth', and all tests strip
uninteresting text properties so they don't break upon 'category'
changes.
@Fanael
Copy link
Owner Author

Fanael commented Oct 29, 2014

it would be a shame if parens got wrongly highlighted as mismatched in a source file which contains ?) or ?(

I just found out that unescaped brackets can even break indentation:

(pcase foo
  (?(
     (bar-1))
    (?)
    (bar-2))
  (?[
     (bar-3))
;;;         ^
;;; The above matches [ in the current master, and thus is highlighted as
;;; mismatched, despite our efforts to handle this situation gracefully.
;;; OTOH, EVIL-ignore-blacklisted-delims-altogether gets it right.
    (?]
    (bar-4)))

(pcase foo
  (?\(
   (bar-1))
  (?\)
   (bar-2))
  (?\[
   (bar-3))
  (?\]
   (bar-4)))

Frankly, now I think that any code containing ?( or other unquoted brackets should be fixed ASAP.

@purcell
Copy link
Contributor

purcell commented Oct 29, 2014

Frankly, now I think that any code containing ?( or other unquoted brackets should be fixed ASAP.

Sounds like a reasonable position to take.

Can you review the PR, in case I'm doing something stupid?

Looks okay to me, but you're the expert!

…ize*.

They're marked by rainbow-delimiters--category-propertize, so the
rainbow-delimiters--for-all-delimiters call in
rainbow-delimiters--face-propertize* will not even see them.
rainbow-delimiters--escaped-char-predicate-emacs-lisp doesn't have
to test for backslash-escaped delimiters, because the syntax table
based test in rainbow-delimiters--char-ineligible-p already catches them.
@Fanael
Copy link
Owner Author

Fanael commented Oct 29, 2014

Looks okay to me, but you're the expert!

Am I? I think that's pretty far from truth.

@Fanael Fanael force-pushed the EVIL-ignore-blacklisted-delims-altogether branch from da74e8a to 53d8eb0 Compare October 29, 2014 20:54
@Fanael
Copy link
Owner Author

Fanael commented Oct 29, 2014

I mean, seriously, I can't even get the test suite to pass.

@purcell
Copy link
Contributor

purcell commented Oct 29, 2014

I mean, seriously, I can't even get the test suite to pass.

Ah, see, if you hadn't added tests, you wouldn't have that problem, and you'd still look like an expert. ;-)

@Fanael
Copy link
Owner Author

Fanael commented Oct 29, 2014

Ah, see, if you hadn't added tests, you wouldn't have that problem

Sometimes I think this is the only reason most projects pass their tests.

@purcell
Copy link
Contributor

purcell commented Oct 29, 2014

Software is a cruel master.

@Fanael
Copy link
Owner Author

Fanael commented Nov 1, 2014

…and I realized there's no before-change-function to flush rainbow-delimiters--syntax-ppss-cache.

Besides, the code required to pass syntax-table-entries-dont-work-when-blacklisted will be incredibly nasty, even compared to the mess this branch already introduces. I'm thinking this branch is a no go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants