Support for mmap() writable mappings. #175

aversecat · 2024-06-11T17:46:39Z

Replaces #27, #39.

Contains ~~mostly original patches from andy, touched up for conflicts~~. Additional fixups and changes to avoid various deadlocks and debug kernel warnings for lock contention issues.

- ~~Does not~~ pass xfstests:generic/346 - hard lockup in _mkwrite when doing update_inode
- ~~Occasionally fails~~ offline-extent-waiting - when reverse staging, the first blocks of the file end up zeros, not the expected content
- Passes all other xfstests
- Added cross-node mmap consistency test. ~~doesn't work on el7~~
- sparse warnings about ret not returning vm_fault_t
- _walk_inodes page fault safe
- _get_allocated_inos page fault safe
- fsstress hard lockup in generic/013

versity-github · 2024-06-11T17:51:27Z

Build 169 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/169/ git: c6a7fd9

versity-github · 2024-06-11T21:12:35Z

Build 144 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/144/ git: c6a7fd9

versity-github · 2024-06-12T00:06:26Z

Build 170 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/170/ git: 0f611df

versity-github · 2024-06-12T03:15:04Z

Build 145 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/145/ git: 0f611df

versity-github · 2024-07-02T19:26:21Z

Build 178 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/178/ git: d112efb

versity-github · 2024-07-02T20:04:41Z

Build 153 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/153/ git: d112efb

versity-github · 2024-07-11T22:46:37Z

Build 180 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/180/ git: 43de817

versity-github · 2024-07-11T22:49:44Z

Build 155 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/155/ git: 43de817

versity-github · 2024-07-12T20:01:16Z

Build 156 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/156/ git: 939b7ec

versity-github · 2024-07-12T20:02:50Z

Build 182 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/182/ git: 939b7ec

versity-github · 2024-07-15T19:26:36Z

Build 185 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/185/ git: cb51a81

versity-github · 2024-07-15T19:28:41Z

Build 160 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/160/ git: cb51a81

zabbo

It's a WIP, we knew there'd be W to do :).

tests/tests/mmap.sh

kmod/src/data.c

tests/golden/xfstests

kmod/src/data.c

versity-github · 2024-07-22T22:25:02Z

Build 163 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/163/ git: 3db7fd0

versity-github · 2024-07-22T22:27:49Z

Build 188 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/188/ git: 3db7fd0

versity-github · 2024-07-23T19:16:30Z

Build 189 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/189/ git: 920070d

versity-github · 2024-07-23T19:19:07Z

Build 165 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/165/ git: 920070d

versity-github · 2024-08-05T18:46:28Z

Build 190 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/190/ git: 789959f

versity-github · 2024-08-05T19:20:59Z

Build 166 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/166/ git: 789959f

versity-github · 2024-08-05T19:27:42Z

Build 191 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/191/ git: d958c43

versity-github · 2024-09-10T23:21:58Z

Build 212 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/212/ git: fb43a75

versity-github · 2024-09-10T23:24:48Z

Build 188 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/188/ git: fb43a75

aversecat · 2024-09-11T00:16:54Z

one more small compat fix for el7.

versity-github · 2024-09-11T00:54:36Z

Build 189 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr-el8/189/ git: 9f689e2

versity-github · 2024-09-11T00:57:41Z

Build 213 http://jenkins.vpn.versity.com:8080/job/scoutfs-run-pr/213/ git: 9f689e2

aversecat · 2024-10-08T17:46:59Z

retest

aversecat · 2024-12-03T19:34:32Z

the -debug- failures are:

ng-scoutfs-test-debug-el7:

12:56:06   lock-recover-invalidate          [ failed: output differs ]

The output has "bash: terminated" messages interspersed. This sometimes happens when the subprocess's stderr/stdout isn't properly discarded by redirection to /dev/null.

ng-scoutfs-test-debug-el8:

12:38:41   large-fragmented-free            [ failed: unexpected messages in dmesg ]
16:41:21   orphan-inodes                    [ failed: output differs ]

The dmesg here shows "blocked for more than 120 seconds", followed by stack dumps.

The "output differs" failure here is this intermittent test failure pattern:

 == orphaned inos in all mounts all deleted
+5264385 still exists
+5274624 still exists
+5284864 still exists
+5295104 still exists
+5305344 still exists

ng-scoutfs-test-debug-el94:

12:40:13   large-fragmented-free            [ failed: unexpected messages in dmesg ]
16:05:55   createmany-rename-large-dir      [ failed: unexpected messages in dmesg ]
13:45:10   srch-basic-functionality         [ failed: unexpected messages in dmesg ]

Multiple "watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [kworker/u4:20:18707]" messages here; the VM is clearly overrun by scheduled tasks.

zabbo · 2024-12-03T22:30:11Z

> the -debug- failures are:

Thanks for enumerating these -- yeah, so, the same repeat offenders :/.

aversecat · 2024-12-04T17:10:04Z

Hmmmm, not looking good in CI.

Both debug-el8 and debug-el94 appear(*) stuck on generic/346 which is holetest with mmap. Trying to reproduce myself now.

(*) they completed generic/343 but output is stuck there. No message indicating 346 actually started.

Add support for writable MAP_SHARED mmap()ings. Avoid issues with late writepage()s building transactions by doing the block_write_begin() work in scoutfs_data_page_mkwrite(). Ensure the page is marked dirty and prepared for write, then let the VM complete the write when the page is flushed or invalidated. Signed-off-by: Benjamin LaHaise <[email protected]> Signed-off-by: Auke Kok <[email protected]>

Two test programs are added. The first performs vfs read/write and mmap read/write calls on the same file across 5 threads (mounts) repeatedly. The goal is to ensure there are no locking issues between the read and write paths. The second performs consistency checking on a file that is repeatedly written and read using memory maps and normal reads and writes, with the content verified after every operation. The test script finishes with a read/write mmap test on offline extents to verify the data wait paths in those functions. The run time is about 1 min on my el7 instance. Signed-off-by: Auke Kok <[email protected]>
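As a rough illustration of the second program's idea, here is a minimal userspace sketch in Python. It is not the test program from this PR. The function name `check_mmap_consistency`, the file layout, and the chunk sizes are all hypothetical. It alternates writes through a shared mapping and through `pwrite()`, and verifies the whole file through the other path after every operation.

```python
import mmap
import os
import tempfile


def check_mmap_consistency(path, rounds=4):
    """Alternate mmap and pwrite() updates, verifying after each one.

    Hypothetical sketch: returns True if every check matched.
    """
    size = mmap.PAGESIZE * 2
    with open(path, "w+b") as f:
        f.truncate(size)
        fd = f.fileno()
        expected = bytearray(size)  # what the file should contain
        with mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as m:
            for i in range(rounds):
                off = (i * 512) % size

                # Write through the shared mapping, verify through pread().
                chunk = bytes([(i + 1) & 0xFF]) * 64
                m[off:off + 64] = chunk
                expected[off:off + 64] = chunk
                m.flush()  # msync() the mapping before reading back
                if os.pread(fd, size, 0) != bytes(expected):
                    return False

                # Write through pwrite(), verify through the mapping.
                off2 = off + 128
                chunk2 = bytes([(0x80 + i) & 0xFF]) * 64
                os.pwrite(fd, chunk2, off2)
                expected[off2:off2 + 64] = chunk2
                if m[:] != bytes(expected):
                    return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(check_mmap_consistency(os.path.join(d, "data")))
```

Running the same check against files on two mounts of the same filesystem would be closer to the cross-node case, but that setup is outside what this sketch covers.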

dir_emit() will copy_to_user, which can pagefault. If this happens while cluster locked, we could deadlock. We use a single page to stage dir_emit data, and iterate between fetching dirents while locked, and emitting them while not locked. Signed-off-by: Auke Kok <[email protected]>
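The stage-then-emit loop described above can be sketched in userspace Python. This is only an analogy under stated assumptions: `ToyDir`, `staged_readdir`, and the `batch` parameter are hypothetical names, a `threading.Lock` stands in for the cluster lock, and the `emit` callback stands in for dir_emit(). The point is that entries are collected into a bounded staging buffer while locked, and the callback that may block runs only after the lock is dropped.

```python
import bisect
import threading


class ToyDir:
    """Toy directory: sorted names guarded by a lock (cluster lock stand-in)."""

    def __init__(self, names):
        self.lock = threading.Lock()
        self.names = sorted(names)


def staged_readdir(d, emit, batch=4):
    """Emit every entry without ever calling emit() while d.lock is held.

    Returns the number of entries emitted. Assumes names are non-empty
    strings, so "" sorts before every entry.
    """
    last = ""  # resume cursor: the last name that was emitted
    count = 0
    while True:
        # Phase 1: collect a bounded batch while locked.
        with d.lock:
            start = bisect.bisect_right(d.names, last)
            staged = d.names[start:start + batch]
        if not staged:
            return count
        # Phase 2: the lock is dropped, so emit() is free to block,
        # which is the analogue of a page fault in copy_to_user.
        for name in staged:
            if not emit(name):
                return count
            last = name
            count += 1
```

Resuming from the last emitted name rather than from an index keeps the walk correct if entries are added or removed between batches, at the cost of one extra lookup per batch.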

Now that we support mmap writes, at any point in time we could page fault and lock for writes. That means that, just like readdir, we can no longer lock and copy_to_user, since it may also page fault and thus deadlock. We allocate a fixed array of 32 extent entries on the stack and use it to shuffle out fiemap entries a batch at a time, collecting extents while locked and calling fiemap_fill_next_extent() while unlocked. Signed-off-by: Auke Kok <[email protected]>

Similar to the readdir and fiemap vfs methods, we can't copy to user while holding cluster locks. The previous comment about it being safe no longer applies, and this could deadlock. Rewrite the loop to iterate and store entries in a page, then flush the page contents while not holding a cluster lock. Signed-off-by: Auke Kok <[email protected]>

While debugging a double unlock error we hit this condition and debugging would have been a lot easier had we enforced this simple constraint that we can't decrement the lock users count if it's already 0. Signed-off-by: Auke Kok <[email protected]>
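The constraint described above can be sketched as a counter that refuses to drop below zero. This is a hypothetical Python illustration, not the kernel code: the class name `LockUsers` and the use of an exception are assumptions, and the commit message does not say which assertion mechanism the kernel change uses.

```python
class LockUsers:
    """Counts active users of a lock and refuses to go below zero."""

    def __init__(self):
        self.users = 0

    def get(self):
        """Record one more user of the lock."""
        self.users += 1

    def put(self):
        """Drop one user; fail loudly on an unbalanced unlock."""
        if self.users == 0:
            # Catching the underflow at the call site points directly
            # at the double unlock instead of a later symptom.
            raise RuntimeError("unbalanced unlock: users count is already 0")
        self.users -= 1
```

Failing at the moment of the second unlock is what makes the bug easy to find, since the stack at that point names the offending caller.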

aversecat · 2024-12-09T19:34:50Z

-debug- failures are:

el8: hung task timeout in large-fragmented-free, and orphan-inodes failure
el94: hung task timeout in large-fragmented-free, and orphan-inodes failure
el95: hung task timeout in large-fragmented-free, and this one:

--- golden/archive-light-cycle	2024-12-06 20:05:22.572655167 +0000
+++ /root/scoutfs/tests/results/output/archive-light-cycle	2024-12-07 03:12:08.082647728 +0000
@@ -4,6 +4,8 @@
 == round 1: create
 == round 1: online
 == round 1: verify
+/mnt/test.3/test/archive-light-cycle/dir/3/2-2 /dev/fd/63 differ: char 277401601, line 67726
+script pid 165368 failed: rc 1
 == round 1: release
 == round 1: offline
 == round 1: stage
archive-light-cycle output differs

aversecat added the enhancement label Jun 11, 2024

aversecat force-pushed the auke/mmap branch from c6a7fd9 to 0f611df June 12, 2024 00:02

aversecat force-pushed the auke/mmap branch from 0f611df to d112efb July 2, 2024 19:22

aversecat force-pushed the auke/mmap branch from d112efb to 43de817 July 11, 2024 22:08

aversecat force-pushed the auke/mmap branch from 939b7ec to cb51a81 July 15, 2024 18:48

zabbo requested changes Jul 16, 2024

aversecat force-pushed the auke/mmap branch from cb51a81 to 3db7fd0 July 22, 2024 21:30

aversecat force-pushed the auke/mmap branch from 3db7fd0 to 920070d July 23, 2024 18:32

aversecat force-pushed the auke/mmap branch from 920070d to 789959f August 5, 2024 18:36

aversecat force-pushed the auke/mmap branch from 789959f to d958c43 August 5, 2024 18:48

aversecat changed the title ~~mmap() tree **WIP**~~ mmap() tree. Aug 5, 2024

aversecat force-pushed the auke/mmap branch from fb43a75 to 9f689e2 September 11, 2024 00:16

aversecat requested a review from zabbo September 11, 2024 02:07

aversecat force-pushed the auke/mmap branch from 9f689e2 to 49803c3 October 4, 2024 23:37

aversecat force-pushed the auke/mmap branch from 49803c3 to 731e5e1 November 20, 2024 20:18

aversecat changed the title ~~mmap() tree.~~ Support for mmap() writable mappings. Nov 20, 2024

Add support for read only mmap()

dc2d3fd

Adds the required memory mapped ops struct and page fault handler for reads. Signed-off-by: Benjamin LaHaise <[email protected]> Signed-off-by: Auke Kok <[email protected]>

aversecat force-pushed the auke/mmap branch 2 times, most recently from 8b5ba17 to 282fa84 December 3, 2024 23:09

bcrl and others added 11 commits December 6, 2024 09:56

Enable all xfstests mmap() tests.

76a5585

Now that all of these should be passing, we enable all mmap() tests in xfstests, and update the golden output with the new tests. Signed-off-by: Auke Kok <[email protected]>

mmap() trace events.

945640c

We merely trace exit values and position, and ignore length. Because vm_fault_t is __bitwise, sparse will loudly complain about a plain cast to u32, so we must __force (on el8). ret will be 512 in normal cases. Signed-off-by: Auke Kok <[email protected]>

remap_pages ops becomes obsolete.

9d1a56f

Drop readdir pre-.iterate() compat (el7.5ish).

619cc77

These 2 sections of compat for readdir are wholly obsolete and can be hard dropped, which restores the method to look like current upstream code. This was added in ddd1a4e. Signed-off-by: Auke Kok <[email protected]>

Avoid cluster locking while put_user() in _allocated_inos.

689a892

Similar to fiemap, readdir and walk_inodes, put_user() in this method could page fault while a cluster lock is held, potentially causing a deadlock. Signed-off-by: Auke Kok <[email protected]>

aversecat force-pushed the auke/mmap branch from 282fa84 to c6eec81 December 6, 2024 17:57
