DUPLICATE_ERROR_WARNING Terminating all known routes #94

Open
ghost opened this issue Jul 30, 2013 · 10 comments

Comments

@ghost

ghost commented Jul 30, 2013

Whenever routes are established, they are closed within seconds because "DUPLICATE_ERROR_WARNING" packets arrive every 2-3 seconds. This prevents ping from working. Will investigate why this is occurring.

@ghost
Author

ghost commented Jul 30, 2013

This also eventually results in a segfault; unsure why. Adjusting the acknowledge timeout (in a NetworkInterface's constructor) changes how soon the segfault occurs. I think the issue may be linked to "crap" sequence numbers passed by dessert.
##EDIT This is definitely caused by all of my data packets having identical sequence numbers. I'm going to write my own method to create Packets from tap messages, since most of the data passed in this case is garbage, whereas the data is normally correct when received over a mesh interface. This should help eliminate a lot of the superfluous and undefined data usage we see in the testbed.
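
To illustrate why identical sequence numbers trigger duplicate warnings, here is a minimal sketch. It assumes duplicates are recognised by the pair (source address, sequence number); `SequenceNumberGenerator` and `DuplicateFilter` are hypothetical names made up for this example and are not taken from the ARA or libdessert code.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <utility>

// Hypothetical helper: hands out a fresh sequence number for every
// packet created from a tap message.
class SequenceNumberGenerator {
public:
    std::uint32_t next() { return ++last; }

private:
    std::uint32_t last = 0;
};

// Hypothetical duplicate filter keyed on (source address, sequence number).
class DuplicateFilter {
public:
    // Returns true if this (source, seqNr) pair has been seen before.
    bool isDuplicate(const std::string& source, std::uint32_t seqNr) {
        // insert().second is false when the pair was already in the set
        return !seen.insert(std::make_pair(source, seqNr)).second;
    }

private:
    std::set<std::pair<std::string, std::uint32_t>> seen;
};
```

With a filter like this, two different data packets from the same source that both carry sequence number 1 are indistinguishable, so the second one is reported as a duplicate even though it is new data.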

@ghost
Author

ghost commented Jul 30, 2013

The new client build seems to alleviate the duplicate warnings, but they still occur after long enough. Working on improving the debug output to figure out exactly when and why this is occurring.

@fgrosse
Contributor

fgrosse commented Jul 31, 2013

I have not looked into the code yet, but I guess that the ACKs are somehow interpreted as duplicate packets. I will look into the code for that.

@fgrosse
Contributor

fgrosse commented Jul 31, 2013

I believe I found your error.
In your testbed NetworkInterface you have something like this:

void NetworkInterface::receive(Packet* packet) {
    deliverToARAClient(packet);
}

This is not a good idea on a reliable network interface, because it prevents the packet reception and acknowledgement mechanism from kicking in. Just delete this method in your network interface (and in the header) and ReliableNetworkInterface::receive(...) will do the job. You can find it here, and this is a code snippet of what it does:

void ReliableNetworkInterface::receive(Packet* packet) {
    if (packet->getType() != PacketType::ACK) {
        handleNonAckPacket(packet);
    }
    else {
        handleAckPacket(packet);
    }
}

Please implement and confirm the solution, and close the issue when appropriate.
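
To make the effect of the override concrete, here is a simplified sketch. It is not the real ReliableNetworkInterface; the class names, the plain `enum class PacketType`, and the counters are assumptions made only for this illustration.

```cpp
#include <cassert>
#include <set>

// Assumption: a plain enum stands in for the real PacketType here.
enum class PacketType { DATA, ACK };

struct Packet {
    PacketType type;
    unsigned int seqNr;
};

// Simplified stand-in for a reliable interface: it remembers sent
// packets until the matching ACK arrives.
class ReliableInterfaceSketch {
public:
    virtual ~ReliableInterfaceSketch() = default;

    void send(const Packet& packet) {
        unacknowledged.insert(packet.seqNr);
    }

    virtual void receive(const Packet& packet) {
        if (packet.type != PacketType::ACK) {
            ++acksSent;   // acknowledge the packet to the sender
            ++delivered;  // then hand it to the routing client
        } else {
            // the ACK confirms one of our own packets
            unacknowledged.erase(packet.seqNr);
        }
    }

    std::set<unsigned int> unacknowledged;
    int acksSent = 0;
    int delivered = 0;
};

// The problematic pattern: receive() delivers everything directly.
class OverridingInterface : public ReliableInterfaceSketch {
public:
    void receive(const Packet&) override {
        ++delivered;  // no ACK is sent and incoming ACKs are never processed
    }
};
```

In the overriding variant, packets stay in `unacknowledged` forever and ACKs are handed to the client as if they were payload, which would match a sender that keeps retransmitting the same packet.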

@ghost
Author

ghost commented Aug 5, 2013

I implemented the change last Wednesday and have now tested it, but I am still receiving the following error:

ara  [DEBUG] [3] next hop: 0:1f:1f:9:6:e9, destination 0:1f:1f:9:9:e2, phi: 15.010785
ara  [WARN] Routing loop for packet 1 from fe:d:b9:14:d7:64 detected. Sending duplicate warning back to 0:1f:1f:9:6:e9
ara  [DEBUG] receiving packet # 1 type
ara  [WARN] Routing loop for packet 1 from fe:d:b9:14:d7:64 detected. Sending duplicate warning back to 0:1f:1f:9:9:e2
ara  [DEBUG] receiving packet # 1 type  over interface at 0:1f:1f:9:6:f9
ara  [TRACE] Created new route to 0:1f:1f:9:6:e9 via 0:1f:1f:9:6:e9 (phi=20.00)
ara  [TRACE] Routing Table:
ara  [DEBUG] [0] next hop: 0:1f:1f:9:6:e9, destination 0:1f:1f:9:6:e9, phi: 20.000000
ara  [DEBUG] [1] next hop: 0:1f:1f:9:9:e2, destination fe:d:b9:14:d7:64, phi: 22.000990
ara  [DEBUG] [2] next hop: 0:1f:1f:9:6:e9, destination fe:d:b9:14:d7:64, phi: 17.362799
ara  [DEBUG] receiving packet # 2 type  over interface at 0:1f:1f:9:6:f9
ara  [INFO] Received DUPLICATE_ERROR from 0:1f:1f:9:6:e9. Deleting route to fe:d:b9:14:d8:c8 via 0:1f:1f:9:6:e9

Will investigate further; duplicates may actually exist for some reason...

@fgrosse
Contributor

fgrosse commented Aug 5, 2013

It seems to me like the acknowledgement is not sent back to the sender correctly.
That's why the sender is spamming you with the same packet over and over again until it gives up.
Can you check (maybe via a trace message whenever you actually send a message over libDessert) whether the ACK is sent correctly?
Also, you should use the PacketType::getAsString(..) method so we can see in the log output what type this is (the literal ints are not in the printable ASCII range).
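
A rough sketch of what such a type-to-string mapping buys you in the log output. This is not the actual PacketType::getAsString implementation; `PacketKind`, its numeric values, and both function names are made up for this example.

```cpp
#include <cassert>
#include <string>

// Assumption: small integer codes, chosen only for illustration. Streamed
// as raw chars they would be non-printable control characters, which is
// why the type column in the log looked empty.
enum class PacketKind : char {
    DATA = 1,
    ACK = 2,
    DUPLICATE_ERROR = 3,
    ROUTE_FAILURE = 4
};

// Hypothetical counterpart to getAsString: maps a type code to its name.
std::string packetKindAsString(PacketKind kind) {
    switch (kind) {
        case PacketKind::DATA:            return "DATA";
        case PacketKind::ACK:             return "ACK";
        case PacketKind::DUPLICATE_ERROR: return "DUPLICATE_ERROR";
        case PacketKind::ROUTE_FAILURE:   return "ROUTE_FAILURE";
    }
    return "UNKNOWN";
}

// Builds a log message in the style of the output quoted in this thread.
std::string describeReceivedPacket(unsigned int seqNr, PacketKind kind) {
    return "receiving packet # " + std::to_string(seqNr) +
           " type " + packetKindAsString(kind);
}
```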

@ghost
Author

ghost commented Aug 5, 2013

Thank you Friedrich. The changes will be in the next commit.

@ghost
Author

ghost commented Aug 5, 2013

The type now displays correctly, which should help. With the current implementation, data packets are not being sent (I am still fooling around with the threading implementation in TestbedTimer), but I will switch things around so we at least get to see ACKs being passed around. Thank you for your help Friedrich :)
Btw sorry about missing Thursday; that is not like me.

@ghost
Author

ghost commented Aug 5, 2013

Perhaps now we are getting somewhere.

ara  [INFO] Sending 1 trapped packet(s) for destination fe:d:b9:14:d8:c8
ara  [DEBUG] Forwarding DATA packet 1 from fe:d:b9:14:d7:64 to fe:d:b9:14:d8:c8 via 0:1f:1f:9:6:e9 (phi=22.73)
ara  [DEBUG] receiving packet # 1 type DATA over interface at 0:1f:1f:9:9:e2
ara  [DEBUG] Forwarding DATA packet 1 from fe:d:b9:14:d7:64 to fe:d:b9:14:d8:c8 via 0:1f:1f:9:6:f9 (phi=20.37)
ara  [DEBUG] receiving packet # 2 type DUPLICATE_ERROR over interface at 0:1f:1f:9:9:e2
ara  [TRACE] Created new route to 0:1f:1f:9:6:f9 via 0:1f:1f:9:6:f9 (phi=20.00)
ara  [TRACE] Routing Table:
ara  [DEBUG] [0] next hop: 0:1f:1f:9:6:f9, destination 0:1f:1f:9:6:f9, phi: 20.000000
ara  [DEBUG] [1] next hop: 0:1f:1f:9:6:e9, destination fe:d:b9:14:d7:64, phi: 20.016651
ara  [DEBUG] [2] next hop: 0:1f:1f:9:6:f9, destination fe:d:b9:14:d8:c8, phi: 19.674917
ara  [DEBUG] [3] next hop: 0:1f:1f:9:6:e9, destination fe:d:b9:14:d8:c8, phi: 19.923168
ara  [INFO] Received DUPLICATE_ERROR from 0:1f:1f:9:6:f9. Deleting route to fe:d:b9:14:d8:c8 via 0:1f:1f:9:6:f9
ara  [DEBUG] Only one last route is known to fe:d:b9:14:d8:c8. Notifying 0:1f:1f:9:6:e9 with ROUTE_FAILURE packet
ara  [DEBUG] receiving packet # 1 type DUPLICATE_ERROR over interface at 0:1f:1f:9:9:e2
ara  [TRACE] Created new route to 0:1f:1f:9:6:e9 via 0:1f:1f:9:6:e9 (phi=20.00)
ara  [TRACE] Routing Table:
ara  [DEBUG] [0] next hop: 0:1f:1f:9:6:e9, destination 0:1f:1f:9:6:e9, phi: 20.000000
ara  [DEBUG] [1] next hop: 0:1f:1f:9:6:f9, destination 0:1f:1f:9:6:f9, phi: 19.052759
ara  [DEBUG] [2] next hop: 0:1f:1f:9:6:e9, destination fe:d:b9:14:d7:64, phi: 19.068623
ara  [DEBUG] [3] next hop: 0:1f:1f:9:6:e9, destination fe:d:b9:14:d8:c8, phi: 18.979567
ara  [INFO] Received DUPLICATE_ERROR from 0:1f:1f:9:6:e9. Deleting route to fe:d:b9:14:d8:c8 via 0:1f:1f:9:6:e9
ara  [INFO] All known routes to fe:d:b9:14:d8:c8 have collapsed. Sending ROUTE_FAILURE packet
ara  [DEBUG] receiving packet # 3 type DUPLICATE_ERROR over interface at 0:1f:1f:9:9:e2
ara  [INFO] Received DUPLICATE_ERROR from 0:1f:1f:9:6:f9. Deleting route to fe:d:b9:14:d8:c8 via 0:1f:1f:9:6:f9
ara  [DEBUG] receiving packet # 2 type ROUTE_FAILURE over interface at 0:1f:1f:9:9:e2
Segmentation fault (core dumped)

See how the client receives a "Data Packet 1"? It has this node as its source and the destination it was just sent to. I think the problem is that all three nodes can reach one another, and somehow the best route to the destination is the very route that goes through the source?

Alternatively

The Acknowledgement packet is being interpreted as a DATA packet. This seems more likely/easier to fix so I will look into this first.
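
For reference, the route collapse visible in the log above can be sketched like this. It is only a reading of the log messages, not the actual ARA client code; `RoutingTableSketch`, `RouteState`, and `handleDuplicateError` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Route {
    std::string destination;
    std::string nextHop;
};

// What is left for a destination after a DUPLICATE_ERROR was handled.
enum class RouteState { STILL_ROUTABLE, ONE_ROUTE_LEFT, ALL_COLLAPSED };

class RoutingTableSketch {
public:
    void add(const std::string& destination, const std::string& nextHop) {
        routes.push_back({destination, nextHop});
    }

    // Deletes the route to `destination` via `sender` and reports how
    // many routes to that destination remain.
    RouteState handleDuplicateError(const std::string& destination,
                                    const std::string& sender) {
        routes.erase(std::remove_if(routes.begin(), routes.end(),
                         [&](const Route& r) {
                             return r.destination == destination &&
                                    r.nextHop == sender;
                         }),
                     routes.end());
        const auto remaining = std::count_if(routes.begin(), routes.end(),
            [&](const Route& r) { return r.destination == destination; });
        if (remaining == 0) return RouteState::ALL_COLLAPSED;   // send ROUTE_FAILURE
        if (remaining == 1) return RouteState::ONE_ROUTE_LEFT;  // notify the last hop
        return RouteState::STILL_ROUTABLE;
    }

private:
    std::vector<Route> routes;
};
```

With two routes to fe:d:b9:14:d8:c8 (via 0:1f:1f:9:6:f9 and 0:1f:1f:9:6:e9), a DUPLICATE_ERROR from each neighbour is enough to remove both, which is the sequence the log shows right before the ROUTE_FAILURE.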

@fgrosse fgrosse closed this as completed Aug 5, 2013
@fgrosse fgrosse reopened this Aug 5, 2013
@fgrosse
Contributor

fgrosse commented Aug 5, 2013

Sry for closing....
I hit the wrong button again
