
🎨Unlossy AI translation #479

Open · wants to merge 3 commits into main
Conversation

@AntoLC AntoLC commented Dec 5, 2024

Purpose

To translate with the AI we were using lossy functions (`blocksToMarkdownLossy` / `tryParseMarkdownToBlocks`), so we were losing indentation, colors, backgrounds, and a lot of other formatting. Errors were also popping up about tables losing their ids after the translation.

Proposal

  • We now keep the BlockNote JSON structure.
  • We associate an id with every node of type `text`.
  • We extract the texts, keyed by id, into a JSON object.
  • We send only this JSON to the AI:
{
    "tid-102": "Hello",
    "tid-103": "world",
    "tid-104": "Nice"
}
  • The AI sends back the same JSON with the translations:
{
    "tid-102": "Bonjour",
    "tid-103": "le monde",
    "tid-104": "Super",
}
  • We put each translated text back into the BlockNote JSON via its associated id.
  • We replace the previous blocks in the editor with the new ones, block by block, checking that each block still exists in the editor.
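The extract-and-reinsert steps above can be sketched as follows. This is an illustrative sketch only (the function names and the exact BlockNote JSON shape are assumptions; the real implementation is in the frontend TypeScript), but it shows why formatting survives: only the `text` values are rewritten, everything else in the tree is untouched.

```python
# Hypothetical sketch of the id-based translation flow described in this PR.
# Assumed shape: blocks have "content" (inline nodes) and "children" (nested blocks);
# text nodes look like {"type": "text", "id": "tid-...", "text": "...", "styles": {...}}.

def extract_texts(blocks, out=None):
    """Walk a BlockNote-like JSON tree and collect every text node's text by its id."""
    if out is None:
        out = {}
    for block in blocks:
        for node in block.get("content", []):
            if node.get("type") == "text" and "id" in node:
                out[node["id"]] = node["text"]
        extract_texts(block.get("children", []), out)
    return out

def apply_translations(blocks, translated):
    """Write the AI's id -> text mapping back into the same tree, keeping all formatting."""
    for block in blocks:
        for node in block.get("content", []):
            if node.get("type") == "text" and node.get("id") in translated:
                node["text"] = translated[node["id"]]
        apply_translations(block.get("children", []), translated)
    return blocks
```

Because styles, colors, and nesting live outside the `text` fields, replacing only those fields leaves the rest of the document intact.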

Demo

scrnli_ulUKPeGj2U7Hbw.webm

We change the text type of the ai-translate
endpoint from `text` to `json` to be able to send
a JSON object with each text associated with an
id. The AI will return the translated texts with
their ids. Thanks to that we can fully preserve
the BlockNote JSON and replace only the text.

We improved the block translations.
For translations only: we now fully preserve the
BlockNote JSON structure to avoid losing any
information. We no longer transform the blocks
into markdown; instead we add an "id" to the
text-type nodes to identify them, extract the
texts, and send them to the AI service for
translation. The AI preserves the ids, then we
replace the text blocks with the translated ones.
@AntoLC AntoLC added enhancement New feature or request frontend backend labels Dec 5, 2024
@AntoLC AntoLC requested a review from YousefED December 5, 2024 22:47
@AntoLC AntoLC self-assigned this Dec 5, 2024
@AntoLC AntoLC linked an issue Dec 5, 2024 that may be closed by this pull request
@YousefED YousefED left a comment


Nice solution! When we discussed on Matrix I missed that we were purely talking about translations (not other AI operations). So, basically you extract all text strings from the document and translate those via an LLM. That's quite nice and I think your approach should work!

@@ -280,31 +310,17 @@ const AIMenuItem = ({
const editor = useBlockNoteEditor();
const handleAIError = useHandleAIError();

const handleAIAction = async () => {
const handleAIAction = () => {
Collaborator:

Just curious; was the `async` removed intentionally?

@AntoLC (Collaborator Author), Dec 6, 2024:

I removed all the `await`s, so `async` was no longer needed; I have linters that warn me about these things.

Collaborator:

Got it, I was wondering because with the `then` below there's not much of a change.

@@ -0,0 +1,145 @@
import { Block as CoreBlock } from '@blocknote/core';

type Block = Omit<CoreBlock, 'type'> & {
Collaborator:

Why do we need this? I think you could use `Block` from BlockNote directly?

Collaborator Author:

I get some overlapping warnings because of the unions; changing the type to `string` helps to get past them.
Not sure it is the best approach, though.

image

Collaborator:

Got it. It's possible for sure, but it would require a bit more work on the typings. If it's a priority we can pair on this.

throw new Error('No response from AI');
}

const markdown = await editor.tryParseMarkdownToBlocks(responseAI.answer);
Collaborator:

Of course, non-translate AI actions like these will still be "lossy". I'll think of a blocknote-based solution for this (as part of BlockNote AI features)

@AntoLC (Collaborator Author), Dec 6, 2024:

Yes, this part is still lossy. I tried, but I didn't get good AI results for the other actions; the AI has difficulty handling the request when the text is split too finely:

export type AITransformActions =
| 'correct'
| 'prompt'
| 'rephrase'
| 'summarize';

Rephrase and summarize cannot really work with this system, I think; the AI would have to remove some ids.

I think it is less likely that users will select tables or complicated structures for these actions (something to monitor).

@@ -62,7 +66,7 @@ def call_ai_api(self, system_content, text):
response_format={"type": "json_object"},
messages=[
{"role": "system", "content": system_content},
{"role": "user", "content": json.dumps({"markdown_input": text})},
{"role": "user", "content": json.dumps({"answer": text})},
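For context, the message shape this hunk produces can be sketched in isolation. Only the system/user roles, the `json.dumps({"answer": text})` wrapping, and the `json_object` response format come from the diff; the helper name and surrounding structure are assumptions for illustration.

```python
import json

def build_messages(system_content, text):
    # Wrap the user payload exactly as in the diff: {"answer": <id -> text mapping>}.
    # The real code then sends these messages to the AI client with
    # response_format={"type": "json_object"}, forcing the model to reply in JSON.
    return [
        {"role": "system", "content": system_content},
        {"role": "user", "content": json.dumps({"answer": text})},
    ]
```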
Collaborator:

Why do you keep using the `answer` keyword, although in this context it is not the answer but the input?

Collaborator:

Also, it feels weird that when we already have structured JSON output in the translate case, we still pass it to the model inside the `answer` JSON.
We could get rid of `answer` altogether so that we don't have another indentation layer in the JSON.

@@ -426,11 +426,12 @@ class AITranslateSerializer(serializers.Serializer):
language = serializers.ChoiceField(
choices=tuple(enums.ALL_LANGUAGES.items()), required=True
)
text = serializers.CharField(required=True)
text = serializers.JSONField(required=True)
Collaborator:

Nit: I wouldn't call this `text` in this context.

@arnaud-robin (Collaborator)

arnaud-robin commented Dec 17, 2024

Hello Antho, nice solution! However, I think there might be issues when the order of the words differs between the two languages.

For example, I tried translating "Je te donne mon livre", which should translate to "I give you my book".
The payload is:

{
    "tid-52": "Je ",
    "tid-53": "te",
    "tid-54": " donne mon livre."
}

However, the answer from gpt-4o-mini is:

{
    "tid-52": "I ",
    "tid-53": "give ",
    "tid-54": "you my book."
}

Here the bold word has been changed to "give" instead of "you".

However, if I ask llama3.2, the answer becomes:

{
    "tid-52": "I ",
    "tid-53": "to",
    "tid-54": " give me my book."
}

where the sentence doesn't mean anything anymore.

@YousefED
Collaborator

@arnaud-robin you requested my review here, but I think you're right about:

there might be issues when the order of the words differ in the two languages

Good spot, this seems like a potential blocker and I missed it earlier. I'm doing an experiment to keep markdown formatting across LLM responses, which might be a viable alternative and would also make it possible to support operations other than "translate" (e.g. rewrite, simplify, etc.). It's very much WIP at the moment, though.

@AntoLC
Copy link
Collaborator Author

AntoLC commented Dec 19, 2024

Yes, I can see it is not the best, even with gpt-4o-mini...

I am still wondering if markdown is the right way to solve this problem.
Wouldn't it be easier to keep all the formatting with HTML?

Curious to see how you do it with markdown @YousefED

Successfully merging this pull request may close these issues.

⚗️AI return with complex data
3 participants