-
-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-84461: Improve WebAssembly in-browser demo #91879
Conversation
@katharosada @pmp-p please take a look |
Tools/wasm/python.html
Outdated
const ord = data.charCodeAt(0); | ||
let ofs; | ||
data = data.replace(/\r\n|\r|\n/g, "\n") // Normalize line endings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about multilines """ """ and ''' ''' if they contain \r\n they are not meant to be replaced as they are not source code but string data. as \r\n in source will not bother the tokenizer i'd suggest to just do nothing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My web browsers all seem to send \r
line endings when I paste text (even though the text I copied has \n
endings). When just \r
is pasted, xterm.js seems to get upset (all the text shows up on one line). I'll remove the \r\n
normalizing but keep the \r
normalizing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On what OS are these web browsers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ubuntu Linux 20.04
print(charCode) { | ||
let array = [charCode]; | ||
if (charCode == 10) { | ||
array = [13, 10]; // Replace \n with \r\n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
python should send \r\n not \n, line should be cooked for a terminal on stdout. add a workaround or fix tty detection in emscripten ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I remove that \n
-> \r\n
replacement code xterm.js looks like this:
When I did a console.log
for \r
and \n
, I saw \n
printed out but \r
never seemed to come out of Python:
if (charCode == 10) {
console.log('Saw LF');
array = [13, 10]; // Replace \n with \r\n
} else if (charCode == 13) {
console.log('Saw CR');
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a problem with emscripten Module I/O registration, i'm not sure but it may have to do with something like TTY.register() could not find much apart from a vague mention that it does not have any doc see emscripten-core/emscripten#6861
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Emscripten's API is unfortunately not really a TTY and seems to be designed to work with a browser console.log for debugging more than anything else. For example, it doesn't flush without a newline character. The web REPL has no choice but to work around it. 😐
To make the Python REPL actually work nicely, you either have to do a lot more work on the JS/UI side to add behaviour (like this PR does) or add better TTY support to emscripten itself and eventually use readline (which would be awesome).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also be very interested in getting readline
to work. I think with threading enabled it would be easy. With Python in a webworker it is probably possible, but it would probably require rewriting readline_until_enter_or_signal
and significantly patching the I/O in gnu readline itself. So maybe not completely impossible but a significant amount of work.
@@ -0,0 +1,2 @@ | |||
[*.{html,js}] | |||
indent_size = 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm not sure that file is required
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right that it's not strictly required, but CPython already has a root .editorconfig
file which informs text editors of the indentation and whitespace style to use. The root file doesn't have rules for HTML and JS files, which I noticed because my editor had a default setup that was inserting 2 spaces when I hit tab in these files.
I'm not sure whether the other *.js
and *.html
files in this repository use 4 spaces consistently, so I made a local .editorconfig
file instead of adding to the root one.
case "\x0a": // CTRL+J | ||
case "\x0d": // CTRL+M | ||
this.xterm.write('\r\n'); | ||
this.handleReadComplete('\n'); | ||
this.resolveInput(this.input + this.writeLine('\n')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain why you are writing \n
here if a) you normalize \r
and \n
to \r\n
above, and b) you writeline so you write a \r\n
after?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe flush on '\n' 0xa only instead of '\r' (0xd) ? or better begin to handle canonical and raw terminal differently as they should. LF should not be treated as EOL.( ncurses and ncursesw will be very soon available on wasm ).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i also noticed that isatty() is returning 0 for fd 0 / 1 / 2 and it should not
maybe try
SYSCALLS.getStreamFromFD(0).tty = true;
SYSCALLS.getStreamFromFD(1).tty = true;
SYSCALLS.getStreamFromFD(2).tty = true;
at module init before trying to fix something that may not be broken :
>>> sys._emscripten_info
sys._emscripten_info(emscripten_version=(3, 1, 10), runtime='Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36', pthreads=False, shared_memory=False)
>>>
>>>
>>> print(" 123\r456",end="")
456>>> 123
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain why you are writing
\n
here if a) you normalize\r
and\n
to\r\n
above, and b) you writeline so you write a\r\n
after?
That inconsistency was unintentional. Python seems to have translated \r\n
to simply \n
in the strings and I hadn't noticed my mistake. In fact that case
condition could never be reached due to \r
always being converted to \r\n
(due to the data.length > 1
check above).
I just pushed a commit that normalizes lone \r
to \n
but keeps \r\n
intact (and allows this case
statement to actually run).
The writeLine
doesn't actually write to Python. It writes to xterm
only. The xterm
session needs \r\n
(it gets unhappy otherwise: #91879 (comment)) but writeLine
returns its input (\n
in this case) which is then written to standard input. So this.resolveInput
ends up being passed simply this.input + '\n'
.
I feel I responded to the feedback so far well. Please let me know if I've missed anything though. The one thing I would love to see confirmed before this is merged is a verification from someone who uses Windows that this allows for pasting multiple lines of text on Chrome, Firefox, and Edge. Here are the tests I would recommend manually checking in one or more browsers on Windows:
Once that's verified I feel this is ready, though I'm biased because I wrote this. |
I can test this later today or tomorrow |
Thanks @ethanhs. Have you had a chance to test this out on Windows? |
Sorry this week turned out to be a bit more hectic than I was expecting. Finally got around to running those tests: Firefox:
Chrome:
Edge:
|
Thanks for testing this @ethanhs. I witnessed the same error and just pushed a commit to resolve it by ignoring non-printable characters when input isn't active.
I can understand that and I'm also torn on this. With the current live demo you have at https://repl.ethanhs.me, Ctrl+L is already swallowed. Since it's already being swallowed, I figured we may as well give its usual use within the REPL. It's not uncommon for in-browser editors/REPLs (like replit.com) to swallow Ctrl+L and other shortcuts, so at least this demo isn't alone in this behavior. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excited to deploy this when merged :)
I pushed a small change that fixes a bug on Safari (which doesn't support regular expression look-behinds). 😅 I noticed this bug while testing my own REPL demo which also supports arrow keys, Home, End, Ctrl+K, and Ctrl+U. Now that I've tested this code on Safari, I think these updates to the demo are ready to go (though I'd be happy to add more features later as well). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Trey! The improved terminal works for me in Firefox.
Thanks @treyhunner for the PR, and @tiran for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11. |
GH-94480 is a backport of this pull request to the 3.11 branch. |
* Buffer standard input line-by-line * Add non-root .editorconfig for JS & HTML indent * Add support for clearing REPL with CTRL+L * Support unicode in stdout and stderr * Remove \r\n normalization * Note that local .editorconfig file extends root * Only normalize lone \r characters (convert to \n) * Skip non-printable characters in buffered input * Fix Safari bug (regex lookbehind not supported) Co-authored-by: Christian Heimes <[email protected]> (cherry picked from commit a8e333d) Co-authored-by: Trey Hunner <[email protected]>
* Buffer standard input line-by-line * Add non-root .editorconfig for JS & HTML indent * Add support for clearing REPL with CTRL+L * Support unicode in stdout and stderr * Remove \r\n normalization * Note that local .editorconfig file extends root * Only normalize lone \r characters (convert to \n) * Skip non-printable characters in buffered input * Fix Safari bug (regex lookbehind not supported) Co-authored-by: Christian Heimes <[email protected]> (cherry picked from commit a8e333d) Co-authored-by: Trey Hunner <[email protected]>
I'm late to the party as always but thanks @treyhunner! The REPL interface here was super bare-bones and I sorta built it as a demo. So cool that people are using it! I've separately been working on a Python wasm console as part of my SplootCode project. It's a bit different because it uses the Pyodide API not raw emscripten, but I'm realising that we're duplicating effort since this version also has a line buffer for pasted input: https://github.com/katharosada/splootcode/tree/main/packages/runtime-python/runtime Any features/bugs in this that I can help with? Line wrapping for long input? Better support for multi-byte characters? (try |
Thanks for chiming in @katharosada!
Yes please to both! 😁
I'd love to combine efforts here. Combining efforts through an actual library or shared code base would be great, but that might take a bit of non-trivial abstracting of the code especially since Pyodide vs straight Emscripten vs something else means different code paths. I've been viewing this terminal demo as a shared open source equivalent to what I'm doing with the various Python Morsels REPL/pastebin tools, so working on the repository here makes sense to me. |
Opening a pull request here per @tiran's request (emmatyping/python-wasm#78 (comment)).
This updates the in-browser WebAssembly demo in the following ways:
Ctrl+L
clears the REPL1-3 above were achieved by using a line-by-line buffer for standard input. Whenever the user isn't currently in an active line (in which
Backspace
should work), text is added to the input buffer.I also added a
.editorconfig
file which declares 4-space indentation for these files (I noticed that the mix of 2/4 spaces before turned into just 4). That file does not override the root.editorconfig
but inherits from it (because there's noroot = true
specified).