These are notes to myself because I don't work on this project much.
Make sure that everything else is currently good before you go debugging new issues.
rm *.elc; cmacs -Q -l loader.el wtf.rb
This helps you track back to a github issue (create one if necessary).
Usually, there's little need for "real" code and the submission contains a lot of sub-expressions that can be removed.
renewed_clients = @renewed_clients_ransack
.result
.order('due_at desc')
.page(params[:renewed_clients_page])
vs:
@b
.c
.d
There's no need for arguments or the extra calls. Even the assignment can be removed.
Know what the expected result should be. In the case of the above, the indentation is off and should be:
@b
.c
.d
This outputs what it sees, not what it thinks it should be. For the above, the output looks like (with notes inline):
[:ivar, "@b", 2]
[:sp, "\n", 1]
[:sp, " ", 2]
[:rem, ".", 1]
[:ident, "c", 1]
[:sp, "\n", 1]
[:sp, " ", 4]
[:indent, :c, -4]
[:rem, ".", 1]
[:ident, "d", 1]
((15 1 16 c 9)(0 3 16)(3 1 3))
This is a raw printing of the tokens as they are lexed in triplets of token type, token value, and length. It is followed with the data that actually goes back from the ruby process to emacs. This is what is used to highlight and/or indent.
TODO: I don't know how to read that sexp yet. But I want to document all of this output first to see if it knocks something loose.
This tool is helpful because it only uses Ripper and knows nothing about ERM.
This is the raw output from Ripper.lex and is basically the events that will be triggered in ERM:
[[[1, 0], :on_ivar, "@b", EXPR_END],
[[1, 2], :on_ignored_nl, "\n", EXPR_END],
[[2, 0], :on_sp, " ", EXPR_END],
[[2, 2], :on_period, ".", EXPR_DOT],
[[2, 3], :on_ident, "c", EXPR_ARG],
[[2, 4], :on_ignored_nl, "\n", EXPR_ARG],
[[3, 0], :on_sp, " ", EXPR_ARG],
[[3, 4], :on_period, ".", EXPR_DOT],
[[3, 5], :on_ident, "d", EXPR_ARG],
[[3, 6], :on_nl, "\n", EXPR_BEG]]
This is the raw output from Ripper.sexp_raw:
[:program,
[:stmts_add,
[:stmts_new],
[:call,
[:call, [:var_ref, [:@ivar, "@b", [1, 0]]], :".", [:@ident, "c", [2, 3]]],
:".",
[:@ident, "d", [3, 5]]]]]
This is the raw output from Ripper.sexp and is basically the same thing but cleaned up / combined a bit:
[:program,
[[:call,
[:call, [:var_ref, [:@ivar, "@b", [1, 0]]], :".", [:@ident, "c", [2, 3]]],
:".",
[:@ident, "d", [3, 5]]]]]
((15 1 16 c 9)(0 3 16)(3 1 3))
---
«3»@b«0»
.c
«@c» .d
The sexp, roughly described is:
((code.size, 1, code.size+1, indent_stack.join) result.join)
TODO: I'm not sure how to read result yet.
In this example, if the receiver is not an ivar, it works fine:
b
.c
.d
This allows you some freedom. At this point, everything is isolated and reproducible. You should see that one passes and the other one fails.
(ert-deftest enh-ruby-indent-leading-dots-ident ()
(with-temp-enh-rb-string
"b\n.c\n.d\n"
(indent-region (point-min) (point-max))
(buffer-should-equal "b\n .c\n .d\n")))
(ert-deftest enh-ruby-indent-leading-dots-ivar ()
(with-temp-enh-rb-string
"@b\n.c\n.d\n"
(indent-region (point-min) (point-max))
(buffer-should-equal "@b\n .c\n .d\n")))
I was able to do that with:
% ruby tools/debug.rb --trace bug128_1.rb | nopwd > 1
% ruby tools/debug.rb --trace bug128_2.rb | nopwd > 2
and then using ediff
to look at the differences in execution paths.
In particular, I could see that a major difference down the line
depended on whether @ident
was true:
#0:./ruby/erm_buffer.rb:19:ErmBuffer::Adder:-: case sym
-#0:./ruby/erm_buffer.rb:23:ErmBuffer::Adder:-: @ident = false
+#0:./ruby/erm_buffer.rb:21:ErmBuffer::Adder:-: @ident = true
#0:./ruby/erm_buffer.rb:27:ErmBuffer::Adder:-: @first_token = ft
followed by:
#0:./ruby/erm_buffer.rb:487:ErmBuffer::Parser:-: if @ident
+#0:./ruby/erm_buffer.rb:488:ErmBuffer::Parser:-: line_so_far_str = @line_so_far.map {|a| a[1] }.join
+#0:./ruby/erm_buffer.rb:489:ErmBuffer::Parser:-: if line_so_far_str.strip == ""
+#0:./ruby/erm_buffer.rb:490:ErmBuffer::Parser:-: indent :c, (line_so_far_str.length * -1)
+#0:./ruby/erm_buffer.rb:99:ErmBuffer::Parser:>: def indent type, c = 0
...
This was the major difference between the two and made it easy to reason about.
Changing from:
when :ident, :const then
to
when :ident, :const, :ivar, :gvar, :cvar then
(with 2 extra tests) made the tests pass and things seem happier.
misc dump for now:
(with-current-buffer "big_file.rb"
(profiler-start 'cpu)
(--dotimes 10 (self-insert-command 1 ?s))
(profiler-report)
(profiler-stop))
profiling electric-indent-mode vs not:
(with-current-buffer "ruby25_parser.rb"
(goto-char (point-max))
(electric-indent-mode (if electric-indent-mode -1 1))
(profiler-start 'cpu)
(--dotimes 10 (call-interactively 'newline))
(profiler-report)
(profiler-stop))
versus electric-indent-mode under text-mode:
(with-current-buffer "ruby25_parser.rb"
(goto-char (point-max))
(text-mode)
(setq start (float-time))
(electric-indent-mode (if electric-indent-mode -1 1))
(enh-ruby-mode)
(erm-wait-for-parse)
(message "%f" (- (float-time) start)))
for testing N large operations across a file/buffer
(progn
(profiler-start 'cpu)
(--dotimes 20
(message "attempt %d" it)
(let ((buf (find-file "lib/ruby27_parser.rb")))
(with-current-buffer buf
(enh-ruby-mode)
;; (erm-wait-for-parse)
(goto-char (point-max))
(--dotimes 10 (call-interactively 'newline))
(erm-wait-for-parse)
(set-buffer-modified-p nil)
(kill-buffer buf))))
(profiler-report)
(profiler-stop))
for profiling N operations on an open buffer and reverting any changes made:
(with-current-buffer "ruby25_parser.rb"
(goto-char (point-max))
;; (electric-indent-mode (if electric-indent-mode -1 1))
(electric-indent-mode -1)
;; (electric-indent-mode 1)
;; (erm-wait-for-parse)
;; (message "starting")
(profiler-start 'cpu)
(--dotimes 10 (call-interactively 'newline))
;; (erm-wait-for-parse)
(profiler-report)
(profiler-stop)
(with-current-buffer "ruby25_parser.rb"
(erm-wait-for-parse)
(revert-buffer nil t))
)