On deleting items from eBPF maps #550
Replies: 4 comments 4 replies
-
Hi, @gustavo-iniguez-goya , thank you for bringing up this important issue.
Unfortunately, I don't recall why I placed it at the end. I'm glad you found a way to avoid an endless loop.
Are you willing to research this direction?
-
Unfortunately, I don't think I have the knowledge to do it from inside eBPF. In any case, I think we'll keep hitting the maps' max capacity under extreme load, like nmapping localhost.
The problem is the gap between knowing that we have to delete n items and the action of removing them. While we're deleting items, new connections may be added that fill up the maps, so in the next iteration the maps are full or almost full again.
Even deleting 11k items every second is not enough to keep the maps from filling up under extreme load, so maybe increasing the maps' maximum capacity would help here, but I don't know to what value (24k, 50k, 100k?). As far as I can tell, the solution I've found works a little bit better: even if we fill up the maps, items are deleted on demand as expected and interception keeps working normally. But I want to test it this week in the same scenario where I reproduce the problem explained in the second paragraph of the post.
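That gap can be illustrated with a rough, deterministic simulation (plain Go maps standing in for the eBPF maps; the capacity, rates, and all names below are made up for the sketch, not taken from the OpenSnitch code):

```go
package main

import "fmt"

const mapMax = 12000 // assumed maximum capacity of the map

// simulateRound inserts newConns items into m, dropping inserts once the
// map is full (like an update on a full bpf hash map), then purges up to
// purgeN items. It returns how many inserts were dropped.
func simulateRound(m map[int]struct{}, nextID *int, newConns, purgeN int) (dropped int) {
	for i := 0; i < newConns; i++ {
		if len(m) >= mapMax {
			dropped++ // map full: this connection cannot be intercepted
			continue
		}
		m[*nextID] = struct{}{}
		*nextID++
	}
	for k := range m { // deleting while ranging is allowed in Go
		if purgeN == 0 {
			break
		}
		delete(m, k)
		purgeN--
	}
	return dropped
}

func main() {
	m := make(map[int]struct{})
	id := 0
	// Under extreme load, new connections arrive faster than a fixed
	// purge removes them, so some inserts are dropped every round.
	for round := 1; round <= 3; round++ {
		dropped := simulateRound(m, &id, 15000, 11000)
		fmt.Printf("round %d: size=%d dropped=%d\n", round, len(m), dropped)
	}
}
```

With 15k new connections per round, purging 11k each round still drops thousands of inserts every round, which is the behaviour described above.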
-
Mmh, what if, after processing an item in getPidFromEbpf(), we delete that item from the map if it matches an outgoing connection? opensnitch/daemon/procmon/ebpf/find.go Line 62 in 479b8de If it matches, from then on we don't need that item anymore, am I right?
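As a sketch of that idea with a plain Go map standing in for the eBPF map (the key type, fields, and function names here are hypothetical, not the actual find.go code):

```go
package main

import "fmt"

// connKey is a stand-in for the eBPF map key (connection tuple).
type connKey struct {
	srcPort, dstPort uint16
}

type conn struct{ pid int }

// lookupAndForget resolves an outgoing connection to a PID and, on a
// match, deletes the entry: once the connection is resolved we never
// need that item again, so reclaiming the slot early keeps the map
// from filling up.
func lookupAndForget(m map[connKey]conn, k connKey) (int, bool) {
	c, ok := m[k]
	if !ok {
		return 0, false
	}
	delete(m, k) // on a real eBPF map this would be a delete-element call
	return c.pid, true
}

func main() {
	m := map[connKey]conn{
		{srcPort: 54321, dstPort: 443}: {pid: 1234},
	}
	if pid, ok := lookupAndForget(m, connKey{54321, 443}); ok {
		fmt.Printf("pid=%d, remaining entries=%d\n", pid, len(m))
	}
}
```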
-
Interesting research, thanks for sharing. If you managed to fix the whole problem from userspace, that's even better than touching the eBPF code.
-
When the eBPF maps are full we cannot insert more connections, so if the eBPF interception method is in use, new outgoing connections cannot be intercepted with it and we have to fall back to ProcFS.
$0. I reproduce this problem every day by using some clients/daemons that connect to localhost. They produce so much traffic that after a while eBPF stops working and the interception method falls back to ProcFS (because the maps are full, I guess). After some hours the eBPF maps somehow get cleared up and the eBPF interception method starts working again. And after a while, goto $0.
So I decided to take a look at the code (it's about the 3rd time I've tried to fix this problem). Some things I've seen:
Infinite loop.
opensnitch/daemon/procmon/ebpf/monitor.go
Line 148 in 479b8de
After looping the maps' items several times, one of the items in the map can be duplicated, so when it reaches this point:
opensnitch/daemon/procmon/ebpf/monitor.go
Lines 169 to 170 in 479b8de
it continues with the next iteration with the same nextKey (because err == nil but ok is false), leading to an infinite loop. This problem can be fixed by checking if the looked-up key is ok:
opensnitch/daemon/procmon/ebpf/monitor.go
Line 153 in 479b8de
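As an illustration, here is a minimal in-memory simulation of that fix. The types and names are made up: the map and its get-next-key/lookup behaviour are faked with plain Go values, and the key point is that a key which appears in the iteration order but fails the lookup must not make the loop retry the same key.

```go
package main

import "fmt"

// fakeMap stands in for a bpf map: iteration order comes from
// getNextKey, values from lookup. A key can appear in the iteration
// order yet fail the lookup (ok == false) — the situation that caused
// the endless loop.
type fakeMap struct {
	order  []string          // keys as get-next-key would return them
	values map[string]string // keys that the lookup actually finds
}

func (m *fakeMap) getNextKey(key string) (string, bool) {
	for i, k := range m.order {
		if k == key && i+1 < len(m.order) {
			return m.order[i+1], true
		}
	}
	return "", false // no next key: iteration done
}

func (m *fakeMap) lookup(key string) (string, bool) {
	v, ok := m.values[key]
	return v, ok
}

// iterate walks the map. The fix: check ok immediately after the
// lookup and advance to the next key anyway, instead of continuing
// the loop with the same nextKey (which never advances).
func iterate(m *fakeMap) []string {
	var visited []string
	key := m.order[0]
	for {
		if v, ok := m.lookup(key); ok {
			visited = append(visited, v)
		} // if !ok: fall through and advance anyway
		next, ok := m.getNextKey(key)
		if !ok {
			break
		}
		key = next
	}
	return visited
}

func main() {
	m := &fakeMap{
		order:  []string{"a", "ghost", "b"}, // "ghost" fails the lookup
		values: map[string]string{"a": "conn-a", "b": "conn-b"},
	}
	fmt.Println(iterate(m)) // terminates despite the failing lookup
}
```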
However, the check (if !ok {}) is at the end of all the other operations. @themighty1, what was the reason to put it at the end? Did you have any problem with it right after
monitor.go:L153
?
The code doesn't take into account the last purged items; it's a fixed value (connections_counter - 5000). On the other hand, the connections counter doesn't reflect the number of items in the map.
opensnitch/daemon/procmon/ebpf/monitor.go
Lines 33 to 34 in 479b8de
Many times the number of purged items is random (489, 567, 4, 1029...). The problem I've seen is that sometimes only 4 items are purged for a long period of time, causing the maps to end up filling up, thus having to fall back to ProcFS.
After trying several things, the only way I've found to reliably delete items from the maps and avoid hitting the max capacity is by counting the items in the maps and then deleting half of the total. This way we adapt to the maps' size.
One thing to bear in mind is not to empty the map completely. Apparently if we delete all the items, the map can enter a state where only 4 items are deleted from then on, ending up filling the maps again. I don't have an explanation for this behaviour.
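A sketch of that adaptive purge, using a plain Go map as a stand-in for the eBPF map (the function name and the keep-at-least-one guard are my own; the attached monitor.go is the real implementation):

```go
package main

import "fmt"

// purgeHalf counts the items currently in the map and deletes half of
// them, so the purge adapts to the map's actual size instead of using
// a fixed value. It deliberately never empties the map completely,
// since fully emptying it seemed to leave the map in a bad state.
func purgeHalf(m map[int]struct{}) (deleted int) {
	toDelete := len(m) / 2
	for k := range m {
		if deleted >= toDelete || len(m) <= 1 {
			break // keep at least one item: never fully empty the map
		}
		delete(m, k)
		deleted++
	}
	return deleted
}

func main() {
	m := make(map[int]struct{})
	for i := 0; i < 12000; i++ { // a map at its assumed 12k capacity
		m[i] = struct{}{}
	}
	fmt.Printf("deleted %d, %d remain\n", purgeHalf(m), len(m))
}
```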
All these problems can be easily reproduced by scanning localhost with nmap:
nmap -sT -p1-65535 localhost
Note: at least once, I've seen one of the eBPF maps get stuck, not accepting more items, even with a total item count of 456 (far less than the maximum capacity of 12k elements).
I'd be grateful if someone could reproduce these problems and test the attached solution. Note that there are plenty of debug messages; it's not the final version:
monitor.go.txt ->
$ cp monitor.go.txt opensnitch/daemon/procmon/ebpf/monitor.go