Don't fence clients with rid==0 #190
Occasionally during the export-lookup-evict-race test we see the following failure dmesg output when a server fences a client that has no valid rid:

[ 828.379546] sysfs: cannot create duplicate filename '/fs/scoutfs/f.b928e1.r.7b36c0/fence/0000000000000000' ...
[ 828.379773] kobject_add_internal failed for 0000000000000000 with -EEXIST, don't try to register things with the same name in the same directory.
[ 828.385946] scoutfs f.b928e1.r.7b36c0 error: client fence returned err -17, shutting down server

This fails the test. Fencing these clients is unwanted, but we definitely don't want to create duplicate sysfs entries for it, either.

Don't fence clients like this, just return 0.

Signed-off-by: Auke Kok <[email protected]>
For reference: the caller path is known.
@@ -238,6 +238,11 @@ int scoutfs_fence_start(struct super_block *sb, u64 rid, __be32 ipv4_addr, int r
	struct pending_fence *fence;
	int ret;

	if (!rid) {
Sadly, I don't think this'll work. Depending on where the fence came from, we might not end up cleaning it up. See the _fence_next calls in the server that lead to _recov_finished. We'll need to know more about the precise source of the fencing request to avoid it.
So, we know it comes from scoutfs_net_reconn_free_worker+0x1bd and not the other 2 callers. I'll look into that.
Also, as a stopgap, we could skip creating the sysfs entries here and just let all the fencing happen. The worst that could happen, I guess, is that we end up fencing the wrong clients that also have rid=0, so maybe not such a big deal?
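To make the stopgap idea concrete, here is a rough userspace sketch, not the scoutfs code. It assumes the fence entry name is the rid formatted as 16 hex digits (inferred from the dmesg output) and that a second entry with the same name fails with -EEXIST, as kobject_add does. `struct fence_dir`, `dir_add`, `fence_start` and the `skip_zero_rid_entry` flag are all hypothetical names made up for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_FENCES 8
#define NAME_LEN 17 /* 16 hex digits plus the terminating NUL */

/* Hypothetical stand-in for the per-mount fence/ sysfs directory. */
struct fence_dir {
	char names[MAX_FENCES][NAME_LEN];
	int nr_names;   /* named entries currently registered */
	int nr_pending; /* fences tracked, with or without a named entry */
};

/* Mimics kobject_add(): a name that already exists yields -EEXIST. */
static int dir_add(struct fence_dir *dir, const char *name)
{
	int i;

	for (i = 0; i < dir->nr_names; i++) {
		if (strcmp(dir->names[i], name) == 0)
			return -EEXIST;
	}
	if (dir->nr_names == MAX_FENCES)
		return -ENOSPC;

	strcpy(dir->names[dir->nr_names++], name);
	return 0;
}

/*
 * Hypothetical fence start.  With skip_zero_rid_entry set, a rid of 0
 * is still tracked as a pending fence but gets no named entry, so two
 * such requests can no longer collide on "0000000000000000".
 */
static int fence_start(struct fence_dir *dir, uint64_t rid,
		       bool skip_zero_rid_entry)
{
	char name[NAME_LEN];
	int ret;

	if (dir->nr_pending == MAX_FENCES)
		return -ENOSPC;

	if (!(rid == 0 && skip_zero_rid_entry)) {
		snprintf(name, sizeof(name), "%016llx",
			 (unsigned long long)rid);
		ret = dir_add(dir, name);
		if (ret)
			return ret;
	}

	dir->nr_pending++;
	return 0;
}
```

Without the flag, a second `fence_start(&dir, 0, false)` returns -EEXIST, which matches the -17 in the log. With the flag set, both requests are tracked and no named entry is created. Whether the real cleanup paths can cope with a pending fence that has no sysfs entry is exactly the open question above, and this sketch does not answer it.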