Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add response color-coding & role setting for gptel buffer #343

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

daedsidog
Copy link
Contributor

@daedsidog daedsidog commented Jul 15, 2024

The purpose of this PR is to somewhat alleviate some of the issues discussed in #321, mainly the ambiguity around the ownership of text in the chat buffer.

Adds the function gptel-toggle-response-role and the custom variable gptel-highlight-assistant-responses, in addition to gptel-response-highlight-face.

gptel-toggle-response-role works in a DWIM fashion, toggling the entire role of a response at a point if no region is selected, and setting the role of a region otherwise.

I did not change the sticky behavior, although when I tested this, I found that it works better enabled if you have a non-empty prefix for user messages.

@Inkbottle007 mentioned in #321 (reply in thread) that such a solution is not ideal because of the various fonts introduced by the major modes. I don't entirely disagree, but maybe it would be possible to customize the face in a way that fits well within all possible 3 supported gptel major modes.

@daedsidog daedsidog force-pushed the role-color-coding branch from c40e913 to c242bdf Compare July 15, 2024 23:03
@Inkbottle007
Copy link

Sorry I moved my comment. Here the amended link: #321 (reply in thread)

@daedsidog daedsidog force-pushed the role-color-coding branch 2 times, most recently from 44618e8 to 2f7a508 Compare July 25, 2024 23:57
@karthink
Copy link
Owner

I think it's clear from feedback (yours and other users') at this point that gptel's response marking paradigm is the wrong one. It's a square peg/round hole situation, leading to fragile solutions and issues like #351. It might be worth switching to overlays for tracking responses instead. Then color-coding etc will be trivial.

The text properties can still be applied, they're useful for other gptel features.

@daedsidog
Copy link
Contributor Author

It might be worth switching to overlays for tracking responses instead. Then color-coding etc will be trivial.

I don't see why not.

What me and some others (I'm assuming) say generally just applies to the mainstream usage which integrates the dedicated chat buffer, i.e. having the interactive session sequestered only to the dedicated chat buffer, and not actually chatting with the models in other buffers. In such cases, having responses being tracked produces behavior that new users don't expect.

Main two issues right now as I see for my use case, which I think applies to many others, are:

  1. Unclear where user messages start and when responses begin in dedicated chat buffer (even if using fancy headings)
  2. Response text "pollutes" other buffers when yanked there, which allows for accidental yanking of response text from other buffers into the dedicated chatting buffer, mistakenly assuming it will be treated as user query.

The highlighting is a band-aid solution to these two problems. Overlays would solve these too, while also allowing the preservation of in-buffer use.

@karthink
Copy link
Owner

karthink commented Sep 2, 2024

  1. Unclear where user messages start and when responses begin in dedicated chat buffer (even if using fancy headings)

  2. Response text "pollutes" other buffers when yanked there, which allows for accidental yanking of response text from other buffers into the dedicated chatting buffer, mistakenly assuming it will be treated as user query.

The highlighting is a band-aid solution to these two problems. Overlays would solve these too, while also allowing the preservation of in-buffer use.

@daedsidog I've changed gptel's response tracking to use overlays, it's currently on the feature-overlays branch. It should solve all the current problems with editing previous responses, highlighting or copying response text to other buffers, etc.

Could you switch to it and test? I'll merge it to master once it's good to go. I ran some ert tests but I'm not confident enough that I've covered all edge cases.

We can also add a gptel-response-highlight-face user option that we can apply to the overlays, making highlighting trivial. Role-setting should be easier too.

@@ -780,7 +797,33 @@ file."
;; Minor mode and UI

;; NOTE: It's not clear that this is the best strategy:
(add-to-list 'text-property-default-nonsticky '(gptel . t))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should be the only point of contention.

Copy link
Contributor Author

@daedsidog daedsidog Sep 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as the rest of this PR goes, I feel the text properties work much better as the role tagging mechanism for the chat buffer than overlays, given their downsides discussed in #321.

@karthink
Copy link
Owner

karthink commented Nov 7, 2024

Thanks for the update, I'll try this soon.

@karthink
Copy link
Owner

karthink commented Nov 7, 2024

I tried it out briefly -- it works well, I'll do a proper code review when I have more time.

An auxiliary concern is that the response face overwrites the syntax highlighting in Org mode or Markdown's code blocks, as well as the fairly useful org-block face. It was much harder to visually spot the beginning and end of src blocks without the original Org faces.

But my main concern is the case of a sticky gptel text-property combined with no response highlighting -- now the text-property extends to any text you type immediately after the response, and you can end up with a mess. This causes the edge cases that led to the property being set as non-sticky, as we've discussed before.

One example of this edge case (that I used to run into all the time) is when you copy a part of the response to the user prompt to quote it. This quote is recognized as a response, which is unavoidable unless we use overlays. But there's a bigger problem: Now there's no way to realize that the rest of your user prompt, including the newline or other whitespace you typed after the quote, has actually been classified as a response. Non-sticky behavior leads to sub-optimal but more predictable/uniform behavior.

@daedsidog
Copy link
Contributor Author

An auxiliary concern is that the response face overwrites the syntax highlighting in Org mode or Markdown's code blocks, as well as the fairly useful org-block face. It was much harder to visually spot the beginning and end of src blocks without the original Org faces.

It's not the case, at least not for me. The syntax highlighting for both modes is preserved.

Markdown:

image

Org:

image

But my main concern is the case of a sticky gptel text-property combined with no response highlighting -- now the text-property extends to any text you type immediately after the response, and you can end up with a mess. This causes the edge cases that led to the property being set as non-sticky, as we've discussed before.

Without some form of delimiting, whether through textual tags of highlighting, both methods cause unexpected behavior. As a new user I expected to be able to edit the model response, and was not aware that I was actually inserting my own response between it.

If we are to take the lesser of two evils, then the nonsticky behavior is preferable when there is no highlighting, since the case you describe is actually worse. With highlighting, the case is not bad, because its less tedious to fix that vs. manually setting the role of nonsticky insertion in the model response, if that makes any sense.

... Non-sticky behavior leads to sub-optimal but more predictable/uniform behavior.

Without highlighting, I fully agree.

@Inkbottle007
Copy link

Inkbottle007 commented Nov 11, 2024

There is a hole in the highlighting, as shown in the screenshot below. I verified using (get-text-property (point) 'gptel) that the role in the discolored area is response. I've encountered at least four instances like this, with the largest spanning 14 lines.

Screenshot_20241111_060116

After saving and reopening the file, numerous discolored regions appeared (so not improved, they are not at the same places though), at least 10, throughout the buffer. They all seem to be within code blocks.

They didn't come back after restarting Emacs altogether.

@daedsidog
Copy link
Contributor Author

Seems to be an issue with the theme overriding the background color of the response, so @karthink was right to worry about this. This would also be an issue with overlays I think.

@Inkbottle007
Copy link

Inkbottle007 commented Nov 11, 2024

I understand your concern about themes potentially overriding overlay background color. However, in my experience, overlays are not affected by themes or font-lock-mode.

With M-x gptel-activate-highlighting, from gptel-highlighting.el, overlays consistently maintained their background colors in the aforementioned case.

My reasoning is that what you put in your overlays is yours, nobody can tamper with it. There might be overlays with higher priority, but I don't think overlays are commonly used by themes or font-lock-mode.

@Inkbottle007
Copy link

I noticed that your function:

(defun gptel-toggle-response-role ()
  "Toggle the role of the text between the user and the assistant.
If a region is selected, modifies the region.  Otherwise, modifies at the point."
  (interactive)
  (unless gptel-mode
    (user-error "This command is only usable in the dedicated gptel chat buffer"))
  (let (start end)
    (if (region-active-p)
        (setq start (region-beginning)
              end (region-end))
      (let ((response-region (gptel--response-region-at-point)))
        (setq start (car response-region)
              end (cdr response-region))))
    (when (and start end)
      (let* ((type (get-text-property start 'gptel))
             ;; If a region has a fragmented role that opposes the current one at the start, we make
             ;; sure to fill it with the role at the start of the region.
             (dst-type (cl-loop for i from start while (< i end)
                                thereis (unless (eq type (get-text-property i 'gptel))
                                          (unless type
                                            'query))
                                finally (cl-return (if (eq type 'response)
                                                       'query
                                                     'response)))))
        (put-text-property start end 'gptel dst-type)))))

introduces a new value (gptel . query) to the gptel text property. In the current codebase (at least the commit your branch is based on), user input is indicated by the absence of the gptel property, while assistant responses are marked with (gptel . response). Introducing a query value changes the original design and adds a third state to the buffer's text properties: nothing, (gptel . response), and (gptel . query). Since the original code doesn't use query, this fragments the buffer.

I'm a bit concerned that this change could complicate functions that rely on the existing property structure, like those using (next-single-property-change start 'gptel). It might add complexity without a clear benefit. Instead of setting (gptel . query), perhaps we could achieve the toggle by removing the property for user input regions with:

(remove-text-properties beg end '(gptel nil))

This approach would maintain consistency with the original design and avoid introducing additional property values.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants