Use as SWAP #3
Comments
It's possible to implement a block device with OpenCL backing it. It could probably be developed pretty quickly with something like BUSE.
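As an illustration of the idea, here is a minimal sketch of the OpenCL side of such a block device: one cl_mem allocation acts as the disk, and read/write callbacks copy between it and the buffers handed in by the block layer. The callback shape imitates BUSE-style userspace block device libraries, but the exact signatures and registration API are an assumption; check the library's header before wiring it up.

```cpp
// Sketch only: VRAM-backed block device callbacks using the OpenCL C API.
// The read/write signatures mirror the style of BUSE-like userspace block
// device libraries; the exact callback/struct layout is an assumption.
#include <CL/cl.h>
#include <cstdint>
#include <cstdio>

static cl_command_queue queue;
static cl_mem vram; // one big VRAM allocation acting as the disk

static int vram_read(void* buf, uint32_t len, uint64_t offset, void* /*userdata*/) {
    // Blocking read straight from the VRAM buffer into the caller-supplied buffer.
    return clEnqueueReadBuffer(queue, vram, CL_TRUE, offset, len, buf,
                               0, nullptr, nullptr) == CL_SUCCESS ? 0 : -1;
}

static int vram_write(const void* buf, uint32_t len, uint64_t offset, void* /*userdata*/) {
    return clEnqueueWriteBuffer(queue, vram, CL_TRUE, offset, len, buf,
                                0, nullptr, nullptr) == CL_SUCCESS ? 0 : -1;
}

int main() {
    const uint64_t disk_size = 1ULL << 30; // 1 GiB of VRAM exposed as a disk

    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, nullptr);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, nullptr);

    cl_int err;
    cl_context ctx = clCreateContext(nullptr, 1, &device, nullptr, nullptr, &err);
    queue = clCreateCommandQueue(ctx, device, 0, &err);
    vram = clCreateBuffer(ctx, CL_MEM_READ_WRITE, disk_size, nullptr, &err);
    if (err != CL_SUCCESS) { std::fprintf(stderr, "VRAM allocation failed\n"); return 1; }

    // At this point vram_read/vram_write would be registered with BUSE
    // (or nbdkit/ublk) so the kernel sees a block device backed by GPU memory.
    return 0;
}
```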
If you can provide a block device, then you can also build RAID-0 on top of several such block devices.
@ptman That is a great point. I'm going to look into writing a kernel module to do this tomorrow. I've tried BUSE, but it seems to bottleneck because it's based on the network block device interface.
A kernel module and some kind of analogue to swapon/swapoff would make this thing look very serious. Both FUSE and BUSE would definitely only slow things down. Good luck @Overv, thanks for sharing!
I've done some preliminary testing with BUSE and trivial OpenCL code. The read speed is 1.1 GB/s and the write speed 1.5 GB/s with ext4. Writing my own kernel module is going to take more time, and it'll still require a userspace daemon to interact with OpenCL.
Wow, very good news, @Overv! I think the daemon is necessary just to provide proper RAID support across multiple vramfs-based block devices and to control the amount of memory dedicated per adapter... I believe a package named vramfs-tools containing vramfsd and vramfsctl could fit the purpose... Wondering what @torvalds will think of this project, maybe it'll end up being included in the tree like tmpfs... Thanks for your work, once again!
If you want a userspace-backed block (SCSI) device, I would encourage you to look at TCMU, which was just added to Linux 3.18. It's part of the LIO kernel target. Using it along with the loopback fabric and https://github.com/agrover/tcmu-runner may fill in some missing pieces. tcmu-runner handles the "you need a daemon" part, so the work would just consist of a VRAM-backed plugin for servicing SCSI commands like READ and WRITE. Then you'd have the basic block device, for swap or a filesystem or whatever. (tcmu-runner is still alpha, but I think it would save you from writing kernel code and a daemon from scratch. Feedback welcome.)
While it is technically possible to create a file on vramfs and use it as swap, this is risky: what happens if vramfs itself, or one of the GPU libraries, gets swapped? This can happen in a low-memory situation, i.e. exactly the situation that swap is designed to help with. The kernel cannot possibly know that restoring data from the swap depends on data that is… swapped in the swap.
For a kernel-space driver, it would be nice to use TTM/GEM directly to allocate video RAM buffers.
What are TTM/GEM? Note that the slram/phram/mtdblock approach can only access at most around 256 MB of the memory, which I guess is the size of the PCI device's memory window.
I don't know much, but they are interfaces for accessing GPU memory inside the kernel, so they can see all of the GPU memory, not only some directly accessible mapped part. https://www.kernel.org/doc/html/latest/gpu/drm-mm.html My situation: an NVIDIA dedicated GPU with 4 GB RAM and the nouveau driver without OpenCL support. This memory is not mapped into the memory space, so I can't use it with slram/phram.
The easy way to accomplish this is to use vramfs as is: make a file on the vramfs disk, set up a loop device on that file, format the loop device with mkswap and then swapon. With this method everything seemed to work when I tried it. Anyway, the big issue with using FUSE or BUSE is that both run in user space, and user space is swappable. I have not tried it, but suppose the memory of the vramfs process itself gets swapped out by the kernel: how would the kernel be able to recover from a page fault when it needs vramfs loaded in the first place? I am curious what would happen then. Edit: sorry, I hadn't read the earlier comments; bisqwit already explained this. Anyway, I tried using it as swap and after a while the system froze and needed a hard reboot (switching the power off and on, sob)...
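For reference, the loop-device step of that recipe is just what losetup does, and it can also be done programmatically through the kernel's loop ioctls. The sketch below is a hypothetical helper (the path /mnt/vram/swapfile is only an example) that attaches a file on the vramfs mount to a free loop device, ready for mkswap and swapon.

```cpp
// Sketch: attach a file on the vramfs mount to a loop device, roughly what
// `losetup` does, so the result can then be handed to mkswap/swapon.
#include <fcntl.h>
#include <linux/loop.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <cstdio>

int main() {
    const char* backing = "/mnt/vram/swapfile"; // example path: file created on vramfs beforehand

    int ctl = open("/dev/loop-control", O_RDWR);
    if (ctl < 0) { std::perror("open /dev/loop-control"); return 1; }

    int devnr = ioctl(ctl, LOOP_CTL_GET_FREE); // ask the kernel for a free loop device number
    if (devnr < 0) { std::perror("LOOP_CTL_GET_FREE"); return 1; }

    char loopdev[64];
    std::snprintf(loopdev, sizeof loopdev, "/dev/loop%d", devnr);

    int loopfd = open(loopdev, O_RDWR);
    int filefd = open(backing, O_RDWR);
    if (loopfd < 0 || filefd < 0) { std::perror("open"); return 1; }

    if (ioctl(loopfd, LOOP_SET_FD, filefd) < 0) { // bind the backing file to the loop device
        std::perror("LOOP_SET_FD");
        return 1;
    }
    std::printf("%s is now backed by %s\n", loopdev, backing);
    return 0;
}
```

After it runs, running mkswap and swapon on the printed /dev/loopN completes the setup exactly as described in the comment above.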
Couldn't ...
Wonderful idea! I am running an old headless server with a 1 GB DDR3 AMD card supporting OpenCL 1.1. I can use all the video RAM since I only use SSH. Unfortunately vramfs does not let me create a swapfile-based swap; I get "swapon: /mnt/vram/swapfile: swapon failed: Invalid argument". Can it be fixed? I see OpenCL 1.2 is merged into Mesa 20.3, so good times ahead for this project.
It doesn't work for me, even though I tried to mlockall() the pages of the userspace program. I think the NVIDIA driver allocates some memory that can still be swapped, so at some point the computer gets into a deadlock when memory is low. I also tried the BUSE / nbd approach; it doesn't work for me either. I think we need to get into the NVIDIA driver, carefully develop a block device kernel driver and call its undocumented APIs to create a GPU session and allocate GPU memory in order to make a GPU swap truly possible.
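For context, pinning the daemon's pages as mentioned above is normally done with mlockall(); a minimal sketch of that mitigation follows, keeping in mind that, as the comment notes, it may not be enough when the GPU driver keeps swappable state outside the daemon's own address space.

```cpp
// Sketch: lock the daemon's current and future pages into RAM so the process
// serving the swap device cannot itself be swapped out.
#include <sys/mman.h>
#include <cstdio>

int main() {
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        std::perror("mlockall"); // typically needs CAP_IPC_LOCK or a raised RLIMIT_MEMLOCK
        return 1;
    }
    // ... start the vramfs/BUSE event loop here; every page already mapped and
    // every future allocation is now pinned in physical memory.
    return 0;
}
```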
Hi guys, any update on this? Has anyone been able to reliably use VRAM as swap?
It only works if the following two conditions are met: ...
Did not work for me the one time I tried it. Seems the project is abandoned...
FUSE should be able to avoid swapping itself, but I attempted to add ... In the NVIDIA driver, there are some undocumented functions (prefixed by ...).
It is possible to achieve this; see the FUSE section of https://wiki.archlinux.org/title/Swap_on_video_RAM.
This can be achieved with the workaround at https://wiki.archlinux.org/title/Swap_on_video_RAM#Complete_system_freeze_under_high_memory_pressure. I tested it under high memory pressure (...).
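The exact command used for that test is not shown above; one simple, hypothetical way to generate comparable memory pressure is a program that keeps allocating and touching memory until allocation fails, which forces the kernel to start swapping.

```cpp
// Sketch: a crude memory-pressure generator for testing a VRAM-backed swap
// setup. It keeps allocating and touching 64 MiB chunks until allocation
// fails. Run it inside a cgroup or be prepared to reboot.
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <vector>

int main() {
    const size_t chunk = 64 * 1024 * 1024;
    std::vector<char*> chunks;
    for (;;) {
        char* p = static_cast<char*>(std::malloc(chunk));
        if (!p) break;
        std::memset(p, 0xA5, chunk); // touch every page so it really gets committed
        chunks.push_back(p);
        std::printf("allocated %zu MiB\n", chunks.size() * 64);
    }
    for (char* p : chunks) std::free(p);
    return 0;
}
```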
This looks like a proper solution indeed.
I've tried implementing ...
I would like to add to this discussion that making vramfs available as a block device would help with using it as a dedicated L2ARC ZFS buffer. We are using very big dedicated NVMe swap RAID arrays for quantum computing and need something faster than 8-16 NVMe sticks in RAID to collect the I/O in a buffer that is not in main memory. We make use of a lot of (virtual) memory, so an L2ARC buffer in VRAM would be awesome; the GPUs would get a new lease on life, because we moved to CPU-only calculation due to the huge memory requirements for storing the eigenvector (think 8/16 TB).
You can make a loop device with ...
@Atrate Thank you very much for the loop solution. I will look into this and into whether ZFS will allow a loop device as cache. The swap I/O usage pattern is random read/write, not streaming. A PCIe VRAM device might offer better speeds while at the same time making the workload on the NVMe RAID devices more 'stream'-lined when changes are committed to the array.
The solution seems to work for me, but when I increase swappiness from 10 to 180, it simply freezes. I am running vramfs as a service, as the workaround cited above suggests. The only thing I think I am doing differently is using a loopback device, as my swapfile is being created with holes. Does anyone have an idea of what is happening? UPDATE: ...
In reply to: #3 (comment) As suggested by Fanzhuyifan and others above, I think that may be due to other GPU-management processes/libraries getting swapped out. Maybe a fix is possible with a lot of systemd unit editing, but that would require tracking down every single library and process required for the operation of a dGPU, and that seems like a chore.
According to the documentation of mlockall, it locks all pages mapped into the address space of the calling process, including shared libraries. So shared libraries directly used by vramfs being swapped out should not be the reason for the system freezes. Edit: Examining the resident size and virtual memory size of the vramfs process, I think the issue is that vramfs asks for additional memory to serve reads and writes.
Is it? Code in vramfs: Line 534 in 829b1f2
Documentation: ...
Here are the steps to prove my point (on Linux): ...
Note that the bolded entries all increased. ...
The bolded entries increased again. I believe this proves that vramfs asks for more memory when serving read and write requests.
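For anyone who wants to repeat that measurement, a small hypothetical helper that prints a process's VmSize and VmRSS lines from /proc is enough to watch the vramfs process grow while a large file is copied onto the mount.

```cpp
// Sketch: print the VmSize/VmRSS lines of a process's /proc status, to repeat
// the measurement described above. The PID is passed on the command line.
#include <fstream>
#include <iostream>
#include <string>

int main(int argc, char** argv) {
    if (argc != 2) { std::cerr << "usage: vmstat-of <pid>\n"; return 1; }

    std::ifstream status("/proc/" + std::string(argv[1]) + "/status");
    std::string line;
    while (std::getline(status, line)) {
        if (line.rfind("VmSize:", 0) == 0 || line.rfind("VmRSS:", 0) == 0)
            std::cout << line << '\n'; // virtual size and resident set, in kB
    }
    return 0;
}
```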
Wouldn't we see OOM killer entries in the kernel logs in this case?
I was wondering whether it would be possible to host a swap partition within vramfs, or to somehow patch vramfs to make it work as a swap partition?
My drive is encrypted, therefore I don't use swap partitions... but if this thing could give me 3 GB or so of a swap-like fs, we could be onto something...
Do you think it could work without FUSE, natively?
Oh, and great idea behind vramfs, really neat!