🐛 BUG: Nebula nodes cannot ping each other, but they can ping the lighthouse VPN IP #1165
Comments
I enabled relay on the lighthouse and updated the config on the nodes to use the lighthouse's Nebula IP as a relay. The nodes are still unable to connect. |
I think I have seen the same thing. Both nodes are behind NAT. Connecting to the lighthouse worked fine for each node, but they were unable to talk to each other. |
@tokudan thanks for your response. Let me try. In my setup, I have an Android device and a Windows 11 machine. Both are running Nebula version 1.9.3. I have already started the Nebula process multiple times, so I'm not sure what difference a reboot would make here. |
I have dozens of hosts on my Nebula network and I encounter a similar issue once every few months or so. I don't need to reboot, just restarting the Nebula service on the problem host fixes everything. When one of my hosts is affected by this issue, it's only that single host. Everyone else works as expected. Working host:
Problem host:
|
I get a similar thing all the time. I have a Linux lighthouse on the internet and a Linux server on my LAN. Both are fixed and can always talk to each other fine. I have a Mac laptop which works perfectly when connected to the LAN: it can connect to the Linux server. When I pair it with my phone to use its mobile connection, I have to restart Nebula on the Linux server to be able to connect to it. |
@tokudan & @cameronbraid The other two users in this thread reported having their Lighthouse configured as a relay. There are many connectivity issues that can occur with Nebula but it is surprising to hear that they can ping the Lighthouse from two nodes, the Lighthouse is configured as a relay, and yet the nodes cannot reach each other. Can you confirm whether each of you are also using the Lighthouse as a relay in your scenarios? @Jahazee & @Cyberes At the risk of asking a dumb question, can you explain how relays are configured? Note that in addition to To any user in this thread, sharing configs of the affected hosts (Lighthouse and both nodes) as well as logs around the event will be immensely helpful in tracking down the issue. Thanks! |
@johnmaguire This is my standard config for my non-relay/non-lighthouse nodes:

    relay:
      am_relay: false
      relays:
        - 172.0.0.2
        - 172.0.0.3
      use_relays: true
    lighthouse:
      am_lighthouse: false
      hosts:
        - 172.0.0.2
        - 172.0.0.3
      interval: 60

My lighthouse looks like this:

    lighthouse:
      am_lighthouse: true
      interval: 60
    punchy:
      delay: 1s
      punch: true
      punch_back: true
      respond: true
    relay:
      am_relay: true
      use_relays: false

You should have megabytes of logs from me at this point for this issue, but let me know if you need anything more. |
I can back up the above reports. I have a lighthouse configured on a Linux VPS (Debian 11, Nebula version 1.6.1, installed from the standard Debian repo) with a public IPv4 address, and it acts as a relay. There are two nodes:
From time to time I cannot connect to or ping my Linux machine at home from my phone, although both can connect to the lighthouse. What fixes this is restarting Nebula on the home machine, the phone, or both. I'm not sure whether it's a new bug in the latest version or just me being unlucky: I've only been using Nebula for a few days and it has always been like this.
I haven't found this kind of setting in the Android app. Does it exist, or am I just being blind? |
I managed to find steps to reproduce this issue:
So at least one case where this happens is clear: when the phone roams from one network (home WiFi) to another (cellular) for the very first time. What's bad is that to reestablish the connection, I need to restart the Nebula service on the destination host I want to access, which can be tricky when you're away with your phone :) |
@johnmaguire it happened again so more logs for you. I think I have a different issue than what the others are facing since mine is about losing connection. Problem host:
Working host:
|
@Cyberes One log line that jumps out to me is this one:
This is similar but different from another type of log we've seen a bit lately, on relays:
Are you able to check your Lighthouse logs and see if you're seeing this message as well? I encourage anyone else in this ticket experiencing issues to check for this issue. This bug is being fixed in #1270. |
@Cyberes "unable to find host with relay" will appear on relay hosts (I expect your Lighthouses are configured to act as relays.) This log message is indicative of the problem in #1270 which occurs when the "responder" node loses connection to the relay. I'm sharing this to try to determine if the connectivity issues you shared earlier in the thread are related to this issue. Next time you experience a connectivity loss you could check for this log line. |
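Checking a relay's logs for that message can be scripted. The following is only a sketch: `find_relay_errors` is a hypothetical helper, and it does a plain substring match on the message text quoted above because the full log line format is not shown in this thread.

```python
# Hypothetical helper: scan relay log lines for the message discussed above.
# Only the quoted message text is assumed; the rest of the line is opaque.

NEEDLE = "unable to find host with relay"


def find_relay_errors(lines):
    """Return (line_number, line) pairs for lines containing NEEDLE."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if NEEDLE in line:
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    import sys

    # Usage (the file name is an example): python find_relay_errors.py nebula.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for number, line in find_relay_errors(handle):
            print(f"{number}: {line}")
```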
Will do. Should I continue to post in this issue or is there somewhere else more appropriate? |
What version of nebula are you using? (nebula -version)
1.9.2
What operating system are you using?
Linux, Windows, Android
Describe the Bug
I have three nodes in the network: two client nodes and one lighthouse. The lighthouse is running in a public cloud.
The client nodes run Windows and Android.
Both the Windows node and the Android node connect to the lighthouse successfully, and the lighthouse VPN IP is reachable from both of them.
However, I cannot ping the Android node from the Windows node or vice versa.
It looks like the peer-to-peer tunnel is not being established. Any help on this would be appreciated.
Android node VPN IP: 10.0.0.50
Windows node VPN IP: 10.0.0.2
Lighthouse node VPN IP: 10.0.0.100
Logs from affected hosts
Logs on Android node
time="2024-06-14T13:55:19Z" level=debug msg="Packet store" length=2 localIndex=2056711493 remoteIndex=0 stored=true vpnIp=10.0.0.2
time="2024-06-14T13:55:19Z" level=debug msg="Handshake message sent" handshake="map[stage:1 style:ix_psk0]" initiatorIndex=2056711493 localIndex=2056711493 remoteIndex=0 udpAddrs="[]" vpnIp=10.0.0.2
time="2024-06-14T13:55:20Z" level=debug msg="Handshake message sent" handshake="map[stage:1 style:ix_psk0]"
Logs on Windows node (note: public IP masked as x.x.x.x)
017791302 tunnelCheck="map[method:passive state:alive]" vpnIp=10.0.0.100
time="2024-06-14T20:12:57+05:30" level=debug msg="Handshake message sent" handshake="map[stage:1 style:ix_psk0]" initiatorIndex=895826622 localIndex=895826622 remoteIndex=0 udpAddrs="[x.x.x.x:43159]" vpnIp=10.0.0.50
time="2024-06-14T20:12:57+05:30" level=debug msg="Handshake message sent" handshake="map[stage:1 style:ix_psk0]" initiatorIndex=895826622 localIndex=895826622 remoteIndex=0 udpAddrs="[x.x.x.x:43159]" vpnIp=10.0.0.50
time="2024-06-14T20:12:58+05:30" level=debug msg="Handshake message sent" handshake="map[stage:1 style:ix_psk0]" initiatorIndex=895826622 localIndex=895826622 remoteIndex=0 udpAddrs="[x.x.x.x:43159]" vpnIp=10.0.0.50
time="2024-06-14T20:12:58+05:30" level=debug msg="Packet store" length=2 localIndex=895826622 remoteIndex=0 stored=true vpnIp=10.0.0.50
time="2024-06-14T20:12:59+05:30" level=debug msg="Handshake message sent" handshake="map[stage:1 style:ix_psk0]" initiatorIndex=895826622 localIndex=895826622 remoteIndex=0 udpAddrs="[x.x.x.x:43159]" vpnIp=10.0.0.50
time="2024-06-14T20:13:00+05:30" level=debug msg="Handshake message sent" handshake="map[stage:1 style:ix_psk0]" initiatorIndex=895826622 localIndex=895826622 remoteIndex=0 udpAddrs="[x.x.x.x:43159]" vpnIp=10.0.0.50
time="2024-06-14T20:13:00+05:30" level=debug msg="Packet store" length=3 localIndex=895826622 remoteIndex=0 stored=true vpnIp=10.0.0.50
time="2024-06-14T20:13:01+05:30" level=info msg="Handshake timed out" durationNs=5669465500 handshake="map[stage:1 style:ix_psk0]" initiatorIndex=895826622 localIndex=895826622 remoteIndex=0 udpAddrs="[157.45.145.140:43159]" vpnIp=10.0.0.50
time="2024-06-14T20:13:01+05:30" level=debug msg="Pending hostmap hostInfo deleted" hostMap="map[indexNumber:895826622 mapTotalSize:0 remoteIndexNumber:0 vpnIp:10.0.0.50]"
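Logs like the ones above are easier to compare across hosts once they are summarized per peer. Below is a rough sketch that parses the `key=value` fields and counts handshake attempts, timeouts, and attempts sent with an empty `udpAddrs` list (as in the Android log, where the node has no address to try for 10.0.0.2). The function names are hypothetical and the field handling is inferred only from the lines in this report.

```python
import re

# Matches key=value pairs where the value is double-quoted or a bare token.
_FIELD = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_line(line):
    """Parse one log line into a dict of its key=value fields."""
    fields = {}
    for key, value in _FIELD.findall(line):
        if value.startswith('"'):
            value = value[1:-1]  # strip the surrounding quotes
        fields[key] = value
    return fields


def summarize_handshakes(lines):
    """Count handshake events per peer VPN IP."""
    summary = {}
    for line in lines:
        fields = parse_line(line)
        vpn_ip = fields.get("vpnIp")
        if vpn_ip is None:
            continue  # truncated or unrelated line
        entry = summary.setdefault(
            vpn_ip, {"sent": 0, "timed_out": 0, "no_addrs": 0}
        )
        if fields.get("msg") == "Handshake message sent":
            entry["sent"] += 1
            if fields.get("udpAddrs") == "[]":
                entry["no_addrs"] += 1  # no known address for this peer
        elif fields.get("msg") == "Handshake timed out":
            entry["timed_out"] += 1
    return summary


# Two lines taken from the logs in this report (one shortened).
SAMPLE = [
    'time="2024-06-14T13:55:19Z" level=debug msg="Handshake message sent" '
    'handshake="map[stage:1 style:ix_psk0]" initiatorIndex=2056711493 '
    'localIndex=2056711493 remoteIndex=0 udpAddrs="[]" vpnIp=10.0.0.2',
    'time="2024-06-14T20:13:01+05:30" level=info msg="Handshake timed out" '
    'durationNs=5669465500 remoteIndex=0 vpnIp=10.0.0.50',
]

for peer, counts in summarize_handshakes(SAMPLE).items():
    print(peer, counts)
```

Lines without a `vpnIp` field, such as the truncated ones above, are skipped instead of guessed at.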
Config files from affected hosts