Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-84461: Improve WebAssembly in-browser demo #91879

Merged
merged 10 commits into from
Jul 1, 2022

Conversation

treyhunner
Copy link
Member

@treyhunner treyhunner commented Apr 24, 2022

Opening a pull request here per @tiran's request (emmatyping/python-wasm#78 (comment)).

This updates the in-browser WebAssembly demo in the following ways:

  1. When multi-line text is pasted it displays as expected and also runs as expected
  2. When text is typed while code is currently running, the typed text shows up only when standard input is prompt again
  3. Partial data (when the pasted text doesn't end in a newline or when some text is typed and then some is pasted) shows up as expected
  4. Unicode output now shows up properly (previously non-ASCII characters resulted in mojibake)
  5. Ctrl+L clears the REPL

1-3 above were achieved by using a line-by-line buffer for standard input. Whenever the user isn't currently in an active line (in which Backspace should work), text is added to the input buffer.

I also added a .editorconfig file which declares 4-space indentation for these files (I noticed that the mix of 2/4 spaces before turned into just 4). That file does not override the root .editorconfig but inherits from it (because there's no root = true specified).

@tiran tiran changed the title Improve WebAssembly in-browser demo gh-84461: Improve WebAssembly in-browser demo Apr 24, 2022
@tiran
Copy link
Member

tiran commented Apr 24, 2022

@katharosada @pmp-p please take a look

const ord = data.charCodeAt(0);
let ofs;
data = data.replace(/\r\n|\r|\n/g, "\n") // Normalize line endings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about multilines """ """ and ''' ''' if they contain \r\n they are not meant to be replaced as they are not source code but string data. as \r\n in source will not bother the tokenizer i'd suggest to just do nothing.

Copy link
Member Author

@treyhunner treyhunner Apr 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My web browsers all seem to send \r line endings when I paste text (even though the text I copied has \n endings). When just \r is pasted, xterm.js seems to get upset (all the text shows up on one line). I'll remove the \r\n normalizing but keep the \r normalizing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On what OS are these web browsers?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ubuntu Linux 20.04

print(charCode) {
let array = [charCode];
if (charCode == 10) {
array = [13, 10]; // Replace \n with \r\n
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python should send \r\n not \n, line should be cooked for a terminal on stdout. add a workaround or fix tty detection in emscripten ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I remove that \n -> \r\n replacement code xterm.js looks like this:

image

When I did a console.log for \r and \n, I saw \n printed out but \r never seemed to come out of Python:

        if (charCode == 10) {
            console.log('Saw LF');
            array = [13, 10];  // Replace \n with \r\n
        } else if (charCode == 13) {
            console.log('Saw CR');
        }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a problem with emscripten Module I/O registration, i'm not sure but it may have to do with something like TTY.register() could not find much apart from a vague mention that it does not have any doc see emscripten-core/emscripten#6861

Copy link
Contributor

@katharosada katharosada Jul 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Emscripten's API is unfortunately not really a TTY and seems to be designed to work with a browser console.log for debugging more than anything else. For example, it doesn't flush without a newline character. The web REPL has no choice but to work around it. 😐

To make the Python REPL actually work nicely, you either have to do a lot more work on the JS/UI side to add behaviour (like this PR does) or add better TTY support to emscripten itself and eventually use readline (which would be awesome).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also be very interested in getting readline to work. I think with threading enabled it would be easy. With Python in a webworker it is probably possible, but it would probably require rewriting readline_until_enter_or_signal and significantly patching the I/O in gnu readline itself. So maybe not completely impossible but a significant amount of work.

@@ -0,0 +1,2 @@
[*.{html,js}]
indent_size = 4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not sure that file is required

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that it's not strictly required, but CPython already has a root .editorconfig file which informs text editors of the indentation and whitespace style to use. The root file doesn't have rules for HTML and JS files, which I noticed because my editor had a default setup that was inserting 2 spaces when I hit tab in these files.

I'm not sure whether the other *.js and *.html files in this repository use 4 spaces consistently, so I made a local .editorconfig file instead of adding to the root one.

case "\x0a": // CTRL+J
case "\x0d": // CTRL+M
this.xterm.write('\r\n');
this.handleReadComplete('\n');
this.resolveInput(this.input + this.writeLine('\n'));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why you are writing \n here if a) you normalize \r and \n to \r\n above, and b) you writeline so you write a \r\n after?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe flush on '\n' 0xa only instead of '\r' (0xd) ? or better begin to handle canonical and raw terminal differently as they should. LF should not be treated as EOL.( ncurses and ncursesw will be very soon available on wasm ).

Copy link
Contributor

@pmp-p pmp-p Apr 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i also noticed that isatty() is returning 0 for fd 0 / 1 / 2 and it should not
maybe try

    SYSCALLS.getStreamFromFD(0).tty = true;
    SYSCALLS.getStreamFromFD(1).tty = true;
    SYSCALLS.getStreamFromFD(2).tty = true;

at module init before trying to fix something that may not be broken :

>>> sys._emscripten_info
sys._emscripten_info(emscripten_version=(3, 1, 10), runtime='Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36', pthreads=False, shared_memory=False)
>>> 
>>> 
>>> print("                   123\r456",end="")
456>>>             123

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why you are writing \n here if a) you normalize \r and \n to \r\n above, and b) you writeline so you write a \r\n after?

That inconsistency was unintentional. Python seems to have translated \r\n to simply \n in the strings and I hadn't noticed my mistake. In fact that case condition could never be reached due to \r always being converted to \r\n (due to the data.length > 1 check above).

I just pushed a commit that normalizes lone \r to \n but keeps \r\n intact (and allows this case statement to actually run).

The writeLine doesn't actually write to Python. It writes to xterm only. The xterm session needs \r\n (it gets unhappy otherwise: #91879 (comment)) but writeLine returns its input (\n in this case) which is then written to standard input. So this.resolveInput ends up being passed simply this.input + '\n'.

@treyhunner
Copy link
Member Author

treyhunner commented May 2, 2022

I feel I responded to the feedback so far well. Please let me know if I've missed anything though.

The one thing I would love to see confirmed before this is merged is a verification from someone who uses Windows that this allows for pasting multiple lines of text on Chrome, Firefox, and Edge.

Here are the tests I would recommend manually checking in one or more browsers on Windows:

  1. Run from time import sleep; sleep(5) and then write print("hello world") and hit Enter while Python sleeps to verify that that line is properly buffered during sleep.
  2. Copy-paste multiple lines of code into the REPL, the 3 lines below for example, and then confirm that they copied and ran correctly.
    print("Line 1")
    print("Line 2")
    print("Line 3")

Once that's verified I feel this is ready, though I'm biased because I wrote this.

@emmatyping
Copy link
Contributor

The one thing I would love to see confirmed before this is merged is a verification from someone who uses Windows that this allows for pasting multiple lines of text on Chrome, Firefox, and Edge.

I can test this later today or tomorrow

@treyhunner
Copy link
Member Author

The one thing I would love to see confirmed before this is merged is a verification from someone who uses Windows that this allows for pasting multiple lines of text on Chrome, Firefox, and Edge.

I can test this later today or tomorrow

Thanks @ethanhs. Have you had a chance to test this out on Windows?

@emmatyping
Copy link
Contributor

Sorry this week turned out to be a bit more hectic than I was expecting.

Finally got around to running those tests:

Firefox:

  • Pasting doesn't seem to work via Ctrl+Shift+V, but right click paste does work
  • The buffering seems to work based on your test, but I found a failure mode:
    run sleep(5), paste print('hi') but don't hit enter. If you press backspace while Python is sleeping, it will give: "invalid non-printable character U+7F" (which is delete)
  • Also I kinda don't like Ctrl+L clearing things, because I often use it to get to the address bar in Firefox, but for consistency's sake maybe its for the best.

Chrome:

  • Ctrl+Shift+V works!
  • The sleep test also seems to work
  • The backspace issue is also present.

Edge:

  • Same as Chrome, which makes sense since it is now Chromium-based.

@treyhunner
Copy link
Member Author

  • The buffering seems to work based on your test, but I found a failure mode:
    run sleep(5), paste print('hi') but don't hit enter. If you press backspace while Python is sleeping, it will give: "invalid non-printable character U+7F" (which is delete)

Thanks for testing this @ethanhs. I witnessed the same error and just pushed a commit to resolve it by ignoring non-printable characters when input isn't active.

  • Also I kinda don't like Ctrl+L clearing things, because I often use it to get to the address bar in Firefox, but for consistency's sake maybe its for the best.

I can understand that and I'm also torn on this. With the current live demo you have at https://repl.ethanhs.me, Ctrl+L is already swallowed. Since it's already being swallowed, I figured we may as well give its usual use within the REPL. It's not uncommon for in-browser editors/REPLs (like replit.com) to swallow Ctrl+L and other shortcuts, so at least this demo isn't alone in this behavior.

Copy link
Contributor

@emmatyping emmatyping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excited to deploy this when merged :)

@treyhunner
Copy link
Member Author

I pushed a small change that fixes a bug on Safari (which doesn't support regular expression look-behinds). 😅

I noticed this bug while testing my own REPL demo which also supports arrow keys, Home, End, Ctrl+K, and Ctrl+U.

Now that I've tested this code on Safari, I think these updates to the demo are ready to go (though I'd be happy to add more features later as well).

Copy link
Member

@tiran tiran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Trey! The improved terminal works for me in Firefox.

@tiran tiran merged commit a8e333d into python:main Jul 1, 2022
@tiran tiran added the needs backport to 3.11 only security fixes label Jul 1, 2022
@miss-islington
Copy link
Contributor

Thanks @treyhunner for the PR, and @tiran for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11.
🐍🍒⛏🤖

@bedevere-bot
Copy link

GH-94480 is a backport of this pull request to the 3.11 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.11 only security fixes label Jul 1, 2022
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jul 1, 2022
* Buffer standard input line-by-line

* Add non-root .editorconfig for JS & HTML indent

* Add support for clearing REPL with CTRL+L

* Support unicode in stdout and stderr

* Remove \r\n normalization

* Note that local .editorconfig file extends root

* Only normalize lone \r characters (convert to \n)

* Skip non-printable characters in buffered input

* Fix Safari bug (regex lookbehind not supported)

Co-authored-by: Christian Heimes <[email protected]>
(cherry picked from commit a8e333d)

Co-authored-by: Trey Hunner <[email protected]>
miss-islington added a commit that referenced this pull request Jul 1, 2022
* Buffer standard input line-by-line

* Add non-root .editorconfig for JS & HTML indent

* Add support for clearing REPL with CTRL+L

* Support unicode in stdout and stderr

* Remove \r\n normalization

* Note that local .editorconfig file extends root

* Only normalize lone \r characters (convert to \n)

* Skip non-printable characters in buffered input

* Fix Safari bug (regex lookbehind not supported)

Co-authored-by: Christian Heimes <[email protected]>
(cherry picked from commit a8e333d)

Co-authored-by: Trey Hunner <[email protected]>
@treyhunner treyhunner deleted the improve-wasm-demo branch July 1, 2022 18:39
@katharosada
Copy link
Contributor

I'm late to the party as always but thanks @treyhunner!

The REPL interface here was super bare-bones and I sorta built it as a demo. So cool that people are using it! I've separately been working on a Python wasm console as part of my SplootCode project. It's a bit different because it uses the Pyodide API not raw emscripten, but I'm realising that we're duplicating effort since this version also has a line buffer for pasted input: https://github.com/katharosada/splootcode/tree/main/packages/runtime-python/runtime

Any features/bugs in this that I can help with? Line wrapping for long input? Better support for multi-byte characters? (try print("😄😄")) I've not played with the current version for a while so I'm not sure what's missing.

@treyhunner
Copy link
Member Author

Thanks for chiming in @katharosada!

Any features/bugs in this that I can help with? Line wrapping for long input? Better support for multi-byte characters? (try print("😄😄")) I've not played with the current version for a while so I'm not sure what's missing.

Yes please to both! 😁

It's a bit different because it uses the Pyodide API not raw emscripten, but I'm realising that we're duplicating effort since this version also has a line buffer for pasted input: https://github.com/katharosada/splootcode/tree/main/packages/runtime-python/runtime

I'd love to combine efforts here. Combining efforts through an actual library or shared code base would be great, but that might take a bit of non-trivial abstracting of the code especially since Pyodide vs straight Emscripten vs something else means different code paths.

I've been viewing this terminal demo as a shared open source equivalent to what I'm doing with the various Python Morsels REPL/pastebin tools, so working on the repository here makes sense to me.
That said, if treating this corner of the CPython repo as a shared place to improve a working demo seems inappropriate, I'd be fine moving to a repo outside of CPython for shared demo improvements as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants