Support for mmap() writable mappings. #175
It's a WIP, we knew there'd be W to do :).
fb43a75 to 9f689e2
one more small compat fix for el7.
retest
the
The output has
the dmesg here is The
Multiple
Thanks for enumerating these -- yeah, so, the same repeat offenders :/.
Adds the required memory mapped ops struct and page fault handler for reads. Signed-off-by: Benjamin LaHaise <[email protected]> Signed-off-by: Auke Kok <[email protected]>
8b5ba17 to 282fa84
Hmmmm, not looking good in CI. Both debug-el8 and debug-el94 appear(*) stuck on (*) they completed
Add support for writable MAP_SHARED mmap()ings. Avoid issues with late writepage()s building transactions by doing the block_write_begin() work in scoutfs_data_page_mkwrite(). Ensure the page is marked dirty and prepared for write, then let the VM complete the write when the page is flushed or invalidated. Signed-off-by: Benjamin LaHaise <[email protected]> Signed-off-by: Auke Kok <[email protected]>
Two test programs are added. The run time is about 1min on my el7 instance. The test script finishes up with a read/write mmap test on offline extents to verify the data wait paths in those functions. One program performs vfs read/write and mmap read/write calls on the same file across 5 threads (mounts) repeatedly. The goal is to ensure there are no locking issues between the read and write paths. The second test program performs consistency checking on a file that is repeatedly written and read using memory maps and normal reads and writes, and the content is verified after every operation. Signed-off-by: Auke Kok <[email protected]>
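For readers who want a feel for what the consistency test checks, here is a minimal userspace sketch. It is not the test program from this PR: `demo_check_mmap`, `demo_all_equal` and `DEMO_SIZE` are made-up names, and the real test likely covers far more (multiple mounts, offline extents). It alternates between writing through a `MAP_SHARED` mapping and through `pwrite()`, and verifies each write through the opposite path.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_SIZE 16384 /* hypothetical file size, a few pages */

/* Return 1 if all len bytes at p equal v, else 0. */
static int demo_all_equal(const unsigned char *p, size_t len, unsigned char v)
{
    for (size_t i = 0; i < len; i++)
        if (p[i] != v)
            return 0;
    return 1;
}

/*
 * Alternate mmap writes and pwrite() writes on one file, checking the
 * result through the other interface each round.
 * Returns 0 if every round verified, -1 on error or mismatch.
 */
int demo_check_mmap(const char *path, int rounds)
{
    unsigned char buf[DEMO_SIZE];
    unsigned char *map;
    int ret = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (ftruncate(fd, DEMO_SIZE) < 0)
        goto out_close;
    map = mmap(NULL, DEMO_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out_close;

    for (int i = 0; i < rounds; i++) {
        unsigned char fill = (unsigned char)(i + 1);

        if (i % 2 == 0) {
            /* Write through the mapping, read back with pread(). */
            memset(map, fill, DEMO_SIZE);
            if (msync(map, DEMO_SIZE, MS_SYNC) < 0)
                goto out_unmap;
            if (pread(fd, buf, DEMO_SIZE, 0) != DEMO_SIZE)
                goto out_unmap;
            if (!demo_all_equal(buf, DEMO_SIZE, fill))
                goto out_unmap;
        } else {
            /* Write with pwrite(), read back through the mapping. */
            memset(buf, fill, DEMO_SIZE);
            if (pwrite(fd, buf, DEMO_SIZE, 0) != DEMO_SIZE)
                goto out_unmap;
            if (!demo_all_equal(map, DEMO_SIZE, fill))
                goto out_unmap;
        }
    }
    ret = 0;
out_unmap:
    munmap(map, DEMO_SIZE);
out_close:
    close(fd);
    return ret;
}
```

Calling `demo_check_mmap("/path/on/mount/file", 100)` against a file on the filesystem under test should return 0 if the page cache and mapping stay coherent.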
Now that all of these should be passing, we enable all mmap() tests in xfstests, and update the golden output with the new tests. Signed-off-by: Auke Kok <[email protected]>
We merely trace exit values and position, and ignore length. Because vm_fault_t is __bitwise, sparse will loudly complain about a plain cast to u32, so we must __force (on el8). ret will be 512 in normal cases. Signed-off-by: Auke Kok <[email protected]>
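To illustrate the `__force` cast, here is a reduced sketch of how sparse's `__bitwise` annotation works. The `demo_` names are stand-ins, not the kernel's definitions, and the macro definitions are simplified from what the kernel headers do. The value 512 is 0x200, which as far as I know corresponds to `VM_FAULT_LOCKED`.

```c
#include <stdint.h>

/*
 * Sparse defines __CHECKER__; a normal compiler sees empty macros.
 * Simplified from the kernel's annotations, for illustration only.
 */
#ifdef __CHECKER__
#define __bitwise __attribute__((bitwise))
#define __force   __attribute__((force))
#else
#define __bitwise
#define __force
#endif

/* Stand-in for vm_fault_t; not the real kernel typedef. */
typedef unsigned int __bitwise demo_vm_fault_t;

/* Stand-in flag; 0x200 (512) matches the value quoted in the commit. */
#define DEMO_VM_FAULT_LOCKED ((__force demo_vm_fault_t)0x200)

/*
 * Convert a fault result into a plain integer for a trace field.
 * Without __force, sparse warns about casting away a bitwise type.
 */
uint32_t demo_fault_to_trace(demo_vm_fault_t ret)
{
    return (__force uint32_t)ret;
}
```

Under a regular compiler both macros expand to nothing, so the cast is an ordinary integer conversion; only sparse treats the types as distinct.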
These 2 sections of compat for readdir are wholly obsolete and can be hard dropped, which restores the method to look like current upstream code. This was added in ddd1a4e. Signed-off-by: Auke Kok <[email protected]>
dir_emit() will copy_to_user, which can pagefault. If this happens while cluster locked, we could deadlock. We use a single page to stage dir_emit data, and iterate between fetching dirents while locked, and emitting them while not locked. Signed-off-by: Auke Kok <[email protected]>
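The lock/stage/unlock/emit loop can be sketched in userspace as follows. This is not the scoutfs code: a pthread mutex stands in for the cluster lock, a static array stands in for the directory, and every `demo_` name is hypothetical. The point is only that the emit callback, which may block the way `copy_to_user` can fault, never runs with the lock held.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define DEMO_STAGE_SIZE 4096 /* one page, as in the commit message */

/* Hypothetical stand-ins for the cluster lock and the directory. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static const char *demo_dir[] = { "alpha", "beta", "gamma", "delta" };
#define DEMO_NR (sizeof(demo_dir) / sizeof(demo_dir[0]))

/* Emit callback; may block. A nonzero return means "buffer full, stop". */
typedef int (*demo_emit_t)(const char *name, void *arg);

/* Example callback: accepts names until *budget reaches zero. */
int demo_budget_emit(const char *name, void *arg)
{
    int *budget = arg;

    (void)name;
    if (*budget == 0)
        return 1;
    (*budget)--;
    return 0;
}

/*
 * Walk the directory starting at *pos. Names are packed into a staging
 * buffer under the lock, then emitted after the lock is dropped.
 * *pos only advances past entries the callback accepted.
 * Returns the number of entries emitted.
 */
size_t demo_readdir(size_t *pos, demo_emit_t emit, void *arg)
{
    char stage[DEMO_STAGE_SIZE];
    size_t emitted = 0;

    while (*pos < DEMO_NR) {
        size_t used = 0, staged = 0;

        /* Phase 1: collect entries while locked. */
        pthread_mutex_lock(&demo_lock);
        while (*pos + staged < DEMO_NR) {
            size_t len = strlen(demo_dir[*pos + staged]) + 1;

            if (used + len > DEMO_STAGE_SIZE)
                break;
            memcpy(stage + used, demo_dir[*pos + staged], len);
            used += len;
            staged++;
        }
        pthread_mutex_unlock(&demo_lock);

        if (staged == 0)
            break; /* a single entry did not fit; give up */

        /* Phase 2: emit with the lock dropped. */
        for (size_t off = 0; off < used; off += strlen(stage + off) + 1) {
            if (emit(stage + off, arg))
                return emitted;
            (*pos)++;
            emitted++;
        }
    }
    return emitted;
}
```

The same shape applies to any method that copies to user memory: gather into kernel memory while locked, then copy out unlocked and resume from the saved position.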
Now that we support mmap writes, at any point in time we could pagefault and lock for writes. That means - just like readdir - we can no longer lock and copy_to_user, since that may also page fault and thus deadlock. We statically allocate 32 extent entries on the stack and use these to shuffle out fiemap entries a batch at a time, locking and unlocking around collecting and fiemap_fill_extent_next. Signed-off-by: Auke Kok <[email protected]>
Similar to the readdir and fiemap vfs methods, we can't copy to user while holding cluster locks. The previous comment about it being safe no longer applies, and this could deadlock. Rewrite the loop to iterate and store entries in a page, then flush the page contents while not holding a cluster lock. Signed-off-by: Auke Kok <[email protected]>
Similar to fiemap, readdir and walk_inodes, this method could call put_user, which can page fault while a cluster lock is held and potentially cause a deadlock. Signed-off-by: Auke Kok <[email protected]>
While debugging a double unlock error we hit this condition and debugging would have been a lot easier had we enforced this simple constraint that we can't decrement the lock users count if it's already 0. Signed-off-by: Auke Kok <[email protected]>
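A minimal sketch of that constraint, with made-up names: `struct demo_lock` is not the scoutfs lock structure, and the commit message doesn't say which kernel assertion is used, so this version just reports the violation and refuses to decrement.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical lock carrying a users count. */
struct demo_lock {
    unsigned int users;
};

/* Take a reference on the lock. */
void demo_lock_get(struct demo_lock *lk)
{
    lk->users++;
}

/*
 * Drop one user. Decrementing at zero would wrap the unsigned count,
 * so flag it at the point of the bug instead.
 * Returns true if the count was decremented.
 */
bool demo_lock_put(struct demo_lock *lk)
{
    if (lk->users == 0) {
        fprintf(stderr, "demo_lock_put: users already 0, double unlock?\n");
        return false;
    }
    lk->users--;
    return true;
}
```

Catching the underflow at the second unlock points directly at the offending caller, rather than at whatever later trips over a wrapped count.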
Replaces #27, #39.
Contains mostly original patches from andy, touched up for conflicts. Additional fixups and changes to avoid various deadlocks and debug kernel warnings for lock contention issues.
- Does not pass xfstests: generic/346 - hard lockup in _mkwrite when doing update_inode
- Occasionally fails offline-extent-waiting - when reverse staging, the first blocks of the file end up zeros, not the expected content
- doesn't work on el7