Socket Connect hangs until timeout runs out in mirrored mode. #10855
Comments
Could you please follow the steps below and attach the diagnostic logs? https://github.com/microsoft/WSL/blob/master/CONTRIBUTING.md#collect-wsl-logs-for-networking-issues |
Hi, I collected them with a 10-second forced timeout, as described in my ticket. |
Please follow the networking diagnostic script. https://github.com/microsoft/WSL/blob/master/CONTRIBUTING.md#collect-wsl-logs-for-networking-issues It should create a folder like [WslNetworkingLogs-date_ver.zip] |
WslNetworkingLogs-2023-12-07_21-14-32.zip |
I have the same issue with WSL |
Still the same problem with 2.1.0 |
same problem here with 2.1.0. |
Still the same problem with 2.1.1 |
When I go through Windows
Default policy
I've changed the order of their priorities
Modified policy
now But I don't know what other problems such a change would cause. It can be reset by the following command: |
Does not work for me. |
Are you using PowerShell or cmd on Windows? |
@linG5821 |
I did find problems with changing the default NIC policy in practice. I mapped ports when using Docker: '127.0.0.1:port' and 'localhost:port' were both accessible, but the intranet IP 'ipv4ip:port' was not. After I reset the network, the intranet IP 'ipv4ip:port' and '127.0.0.1:port' are accessible, but access to 'localhost:port' runs into this issue again. |
Still the same issue with 2.1.3.0 |
Still the same issue on 2.1.5. @chanpreetdhanjal is there any way to help? I have a workaround for ROS but it's a bit annoying on tensorboard to wait that long after fetching plots. |
Someone from our team is looking at it. We will get back to you as soon as we make further progress. Thank you for your patience. |
Can't believe the MS team still has not resolved this bug. It's a pretty annoying problem, especially when I am using the VS Code JS debug terminal in WSL2 mirrored network mode. |
We are actively looking at it as per other items in our priority list. Will provide an update as soon as we have something tangible to share. |
To update this thread: there are 2 possible issues.
Sorry, the RST issue is taking much longer to root cause than expected :( Hope this helps |
Has there been any progress on this issue? |
We have spent a lot of cycles debugging down the Windows stack, through vSwitch, to the VM. This has been really difficult and we don't yet have a culprit. We are fixing some Docker-related issues; once that goes out, we'll return to finding this bug. |
Indeed, when capturing packets on Windows and Linux (WSL), the port is wrong.
The destination MAC address Windows sent the packet to is loopback0 (WSL). If both perceptions are correct, then maybe something is going on inside the FSE switch? With hostAddressLoopback=true and the eth0 address, the packet goes back to the destination port Windows sent it to, which is fine. |
I'm having the same issue. Just got upgraded to Win11 22H2 and have been waiting to try mirrored networking. It did not take long to find this issue. Please fix. Using WSL 2.2.4.0 |
Same issue with 2.2.4.0: I couldn't access the intranet IP 'ipv4ip:port'. |
Any updates on resolutions or workarounds for this issue? |
Same issue here using WSL 2.3.24.0 just updated |
It's not about WSL. I fixed it: run PowerShell as Administrator |
does not work for me :/ |
It seems to be Hyper-V's hse-hostvnic problem, not WSL's.
Windows Version
Microsoft Windows [Version 10.0.22631.2715]
WSL Version
2.0.9.0 and 2.0.14 tested
Are you using WSL 1 or WSL 2?
Kernel Version
5.15.133.1-1
Distro Version
Ubuntu 22.04
Other Software
No response
Repro Steps
Call this script in mirrored mode as normal user or sudo (try both):
Expected Behavior
It immediately reports whether the port is in use or not.
Actual Behavior
Sometimes it takes over 132 seconds to return True or False. In non-mirrored mode it always returns immediately. I came across this issue using ROS, where roscore takes about 2 minutes to start, and starting TensorBoard takes around 2 minutes as well, so I built this script to reproduce the error.
If it returns True or False immediately in mirrored mode, I start the script with sudo and then the problem occurs again.
What is interesting is that if I set the socket timeout to a specific number of seconds, it returns after that time. It's as if it is blocked for the default timeout.
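The reproduction script itself is not preserved in this copy of the issue. As a rough illustration only, here is a minimal sketch of what such a check might look like, reconstructed from the fields printed in the diagnostic logs. The function name `is_port_in_use`, the default host `localhost`, and the `timeout` parameter are assumptions, not the reporter's actual code; port 6006 is the one shown in the logs.

```python
import os
import socket
import time


def is_port_in_use(port, host="localhost", timeout=None):
    """Return True if a TCP connection to (host, port) succeeds.

    Hypothetical helper: timeout=None leaves the socket blocking (the case
    that hangs in the report); a number bounds how long connect_ex may block.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # Print the same socket fields that appear in the diagnostic logs.
        print(f"Socket type: {int(s.type)}")
        print(f"Socket family: {int(s.family)}")
        print(f"Socket timeout: {s.gettimeout()} seconds")
        reuse = s.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
        keepalive = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
        nodelay = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
        print(f"SO_REUSEADDR option: {reuse}")
        print(f"SO_KEEPALIVE option: {keepalive}")
        print(f"TCP_NODELAY option: {nodelay}")
        # connect_ex returns 0 on success and an errno value otherwise,
        # including when the timeout expires.
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    port = 6006  # port taken from the logs
    print(f"Uid: {os.getuid()}")
    print(f"checking: {port}")
    start = time.time()
    in_use = is_port_in_use(port)  # pass timeout=1 to bound the wait
    print(f"Is Port in use: {in_use}")
    print(f"--- Runtime: {time.time() - start} seconds ---")
```

Calling `is_port_in_use(6006, timeout=1)` instead bounds the wait to roughly one second, which is the workaround behavior described above; it does not address the underlying hang.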
Diagnostic Logs
Without networking mode mirrored:
❯ python3 test_connect.py
Uid: 1000
Socket type: 1
Socket family: 2
Socket timeout: None seconds
SO_REUSEADDR option: 0
SO_KEEPALIVE option: 0
TCP_NODELAY option: 0
checking: 6006
Timeout: None
Is Port in use: False
--- Runtime: 0.0001404285430908203 seconds ---
With networking mode mirrored:
❯ python3 test_connect.py
Uid: 1000
Socket type: 1
Socket family: 2
Socket timeout: None seconds
SO_REUSEADDR option: 0
SO_KEEPALIVE option: 0
TCP_NODELAY option: 0
checking: 6006
Timeout: None
Is Port in use: False
--- Runtime: 132.54164338111877 seconds ---
Sometimes it works as fast as without mirrored mode, but sometimes not, even though the result is that the port is not in use.
If it returns immediately, starting the script with sudo results in this timeout again.
With networking mode mirrored and the timeout set to one second by calling s.settimeout(1) before the connect_ex call:
❯ python3 test_connect.py
Uid: 1000
Socket type: 1
Socket family: 2
Socket timeout: 1.0 seconds
SO_REUSEADDR option: 0
SO_KEEPALIVE option: 0
TCP_NODELAY option: 0
checking: 6006
Timeout: None
Is Port in use: False
--- Runtime: 1.001239538192749 seconds ---
Or with settimeout set to 2 seconds:
❯ python3 test_connect.py
Uid: 1000
Socket type: 1
Socket family: 2
Socket timeout: 2.0 seconds
SO_REUSEADDR option: 0
SO_KEEPALIVE option: 0
TCP_NODELAY option: 0
checking: 6006
Timeout: 2.0
Is Port in use: False
--- Runtime: 2.0022153854370117 seconds ---