[RFE] create a shell interface mode for wnbd-client #120

Open
hgkamath opened this issue Apr 14, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@hgkamath

hgkamath commented Apr 14, 2023

Description:

It is desirable to have a shell interface mode for wnbd-client, as qemu and guestfish do.
It is desirable that the no-arg invocation starts the shell. Presently the no-arg invocation just prints the help.
Example output:

PS> D:\vstorage\nbd\wnbd_client\wnbd-client.exe <enter>
WNBD> help<enter>
<show help>
WNBD> version<enter>
<show version>
WNBD> uninstall-driver<enter>
<uninstall the driver>
WNBD> exit<enter>
<say goodbye, and return to shell>
PS> 

The shell mode is just a simple read-eval-print loop over string arrays that:

  • Treats the input string array as if it were command-line arguments, and uses the same parser/invoker.
  • Ensures that both output and errors returned by the parser/invoker reach stdout/stderr and are readable by the user.
  • Optionally prints the return value if it encodes various error conditions (0 = success).
  • Ignores whitespace-only entries (empty lines).
  • Introduces only one new command/keyword, exit, which terminates the loop, exits the wnbd-shell process and returns to the outer shell.
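
To make the proposal concrete, here is a minimal sketch of such a loop. It is written in Python purely to illustrate the control flow; the real client is a native executable, and the names `shell_loop`, `dispatch` and the `WNBD> ` prompt default are hypothetical. `dispatch` stands in for the existing parser/invoker and is assumed to return an integer exit code.

```python
import shlex
import sys


def shell_loop(readline, dispatch, prompt="WNBD> "):
    """Read lines, split each like command-line arguments, and dispatch it.

    readline: callable returning the next input line ("" at end of input).
    dispatch: hypothetical stand-in for the existing parser/invoker; it
              takes an argument list and returns an exit code (0 = success).
    The loop itself opens no files, so it causes no disk access of its own.
    """
    last_rc = 0
    while True:
        print(prompt, end="", flush=True)
        line = readline()
        if line == "":              # end of input: leave the loop
            break
        try:
            argv = shlex.split(line)
        except ValueError as exc:   # e.g. unbalanced quotes
            print(f"parse error: {exc}", file=sys.stderr)
            last_rc = 2
            continue
        if not argv:                # ignore empty or whitespace-only lines
            continue
        if argv[0] == "exit":       # the only newly introduced keyword
            break
        last_rc = dispatch(argv)
        if last_rc != 0:
            print(f"(exit code {last_rc})")
    return last_rc
```

With the existing parser/invoker wrapped as `dispatch`, calling `shell_loop(sys.stdin.readline, dispatch)` would behave like the example session above. Note that `shlex.split` follows POSIX quoting rules and treats backslashes as escapes, so a native Windows implementation would more likely reuse the platform's own argument splitting (for example `CommandLineToArgvW`).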

Reason:
The advantage of a shell interface mode is that the executable is already loaded in memory, running as a process, waiting for the user's stdin input.
This is unlike starting the application with arguments from the command prompt, where the binary has to be loaded from a drive and a new process has to be created.
When a drive lockup happens, the OS might not be able to load and run a new command. Without a shell mode, this rules out the following get-out-of-trouble strategy: keeping a terminal window open with the wnbd shell already running.

nb: IMHO #63 still happens, presently it is closed as it is hard to identify where the fault is. qemu on windows is a bit buggy, but I argue that even if nbd-server is buggy, wnbd should detect lockup situation, perhaps eject disk and bailout.

The following is an excerpt from #63-comment-997204171. Uninstalling and reinstalling the driver is the only way to unstick the situation. At that point, all not-responding processes/apps come back alive. Until I discovered this, I was just force-shutting-down the laptop. The following don't work: Ctrl-C or taskmgr end-task on qemu-nbd or xcopy, or attempting wnbd-client unmap.

I think this was possible because the Windows OS was not so stuck that wnbd-client could not run. Drive lockups can be so bad that even that's not possible. When a lockup happens, it's possible to switch between open windows that take user input. But the moment an application has to access the disk (as a browser always does, or when pressing Ctrl-S in Notepad), the GUI becomes stuck. The Windows taskbar becomes stuck quickly for the same reason. It's possible to start taskmgr via Ctrl-Alt-Delete, but taskmgr can't really load any information in its GUI.

It's really important that, when in shell mode, wnbd-client does not access any disk file or cause any disk access, not even for configuration files. Otherwise it will also get stuck. The exception is when the command itself requires a file argument, like install-driver. For this reason I don't recommend implementing history, as history would require maintaining a history log on disk. Alternatively, there could be an option to skip reading and writing on-disk state.
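
If history were wanted anyway, one option that stays within the no-disk-access constraint is a bounded, memory-only history. The following is a rough sketch in Python, only to illustrate the idea; `MemoryHistory` and its methods are hypothetical names and nothing like this exists in wnbd-client today.

```python
from collections import deque


class MemoryHistory:
    """Bounded command history kept only in process memory, never on disk."""

    def __init__(self, max_entries=100):
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._entries = deque(maxlen=max_entries)

    def add(self, line):
        line = line.strip()
        # Skip blank lines and immediate repeats of the previous command.
        if line and (not self._entries or self._entries[-1] != line):
            self._entries.append(line)

    def recall(self, index):
        """Return the entry `index` steps back (1 = most recent), or None."""
        if 1 <= index <= len(self._entries):
            return self._entries[-index]
        return None
```

The history is lost when the shell exits, which is the trade-off for never reading or writing a history file.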

There's no guarantee that shell mode will be able to rescue a lockup situation. But it's worth a try, and even if it doesn't, it's a feature addition that is harmless, small, simple and low-maintenance.

@petrutlucian94
Member

petrutlucian94 commented Apr 14, 2023

Hi,

Thanks for opening this issue.

nb: IMHO #63 still happens, presently it is closed as it is hard to identify where the fault is. qemu on windows is a bit buggy, but I argue that even if nbd-server is buggy, wnbd should detect lockup situation, perhaps eject disk and bailout.

The following is an excerpt from #63 (comment). Uninstalling and reinstalling the driver is the only way to unstick the situation. At that point, all not-responding processes/apps come back alive. Until I discovered this, I was just force-shutting-down the laptop. The following don't work: Ctrl-C or taskmgr end-task on qemu-nbd or xcopy, or attempting wnbd-client unmap.

Storport already detects IO timeouts and issues lun resets. When receiving a lun reset, we're simply emptying the IO queues. In this case, I think we should actually reset the NBD connection, which would probably fix the problem. I'll prepare a PR in the upcoming weeks.

By the way, can you double check what happens when connecting to the same nbd server using a linux nbd client?

In the meantime, we added an adapter reset command. It's more convenient than having to reinstall the driver.

wnbd-client.exe reset-adapter --hard-disconnect-mappings

@hgkamath
Author

hgkamath commented Apr 14, 2023

I'll give that a try.
While what you say about reset-adapter is true, the shell addresses the problem when the exe can't even be started.
All command-line commands including reset-adapter, should be doable from the wnbd-shell.

I might need to update my wnbd driver.

PS C:\lmgmt\M_Capella-PC\scripts> D:\vstorage\nbd\wnbd_client\wnbd-client.exe version
wnbd-client.exe: 0.2.2-11-g3dbec5e
libwnbd.dll: 0.2.2-11-g3dbec5e
wnbd.sys: 0.2.2-11-g3dbec5e

PS C:\lmgmt\M_Capella-PC\scripts> D:\vstorage\nbd\wnbd_client\wnbd-client.exe
wnbd-client commands:

version | -v       Get the client, library and driver version.
help | -h | --help List all commands or get more details about a specific
                   command.
list | ls          List WNBD disks.
show               Show detailed disk information.
map                Create a new disk mapping, connecting to the specified NBD
                   server.
unmap | rm         Remove disk mapping.
stats              Get disk stats.
list-opt           List driver options.
get-opt            Get driver option.
set-opt            Set driver option.
reset-opt          Reset driver option.
install-driver     Install WNBD driver and create its adapter.
uninstall-driver   Hard remove all disk mappings and adapters and uninstall all
                   WNBD driver instances.

The problem with this webpage
https://cloudbase.it/ceph-for-windows/
is that it has a link
https://cloudba.se/ceph-win-latest-quincy
which downloads file
ceph_quincy_beta.msi
whose file-properties/details tab has a date-created field of 4/14/2023 2:09PM.
But this could be just the download date (today).
Otherwise the webpage/filename gives no clue about a version update.
Even then, it's not always true that when the ceph drivers are updated, the wnbd driver is updated too.

Is there a surefire way for a user to determine whether an updated wnbd driver is available?
What if a user wanted to download a specific older version?
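
In the absence of version information on the download page, one workaround is to compare the output of `wnbd-client.exe version` before and after extracting a new package. Below is a rough sketch in Python; it assumes the version strings follow the `git describe` pattern (tag, commit count, abbreviated hash) that the outputs in this comment appear to use, and the helper names `parse_version` and `parse_version_output` are made up for illustration.

```python
import re

# Assumed format: <major>.<minor>.<patch>[-<commits since tag>-g<hash>]
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-(\d+)-g([0-9a-f]+))?$")


def parse_version(text):
    """Parse e.g. '0.4.1-10-g5c5239c' into the tuple (0, 4, 1, 10).

    The commit hash is ignored because it carries no ordering information.
    """
    m = _VERSION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch, commits = m.group(1, 2, 3, 4)
    return (int(major), int(minor), int(patch), int(commits or 0))


def parse_version_output(output):
    """Map each component name in the `version` output to its parsed version."""
    versions = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip():
            versions[name.strip()] = parse_version(value)
    return versions
```

Since the parsed versions are plain tuples, they compare with the usual operators, so `new["wnbd.sys"] > old["wnbd.sys"]` tells whether the driver in the extracted package is newer than the installed one.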


Extracted updated wnbd-driver as of 20230414

PS C:\Windows\system32>  D:\vstorage\nbd\wnbd_client\wnbd-client.exe version 
wnbd-client.exe: 0.4.1-10-g5c5239c
libwnbd.dll: 0.4.1-10-g5c5239c
wnbd.sys: 0.4.1-10-g5c5239c

PS C:\Windows\system32> D:\vstorage\nbd\wnbd_client\wnbd-client.exe
wnbd-client commands:

version | -v       Get the client, library and driver version.
help | -h | --help List all commands or get more details about a specific
                   command.
list | ls          List WNBD disks.
show               Show detailed disk information.
map                Create a new disk mapping, connecting to the specified NBD
                   server.
unmap | rm         Remove disk mapping.
stats              Get disk stats.
list-opt           List driver options.
get-opt            Get driver option.
set-opt            Set driver option.
reset-opt          Reset driver option.
install-driver     Install WNBD driver and create its adapter.
uninstall-driver   Hard remove all disk mappings and adapters and uninstall all
                   WNBD driver instances.
reset-adapter      Resets the WNBD adapter using PnP. Existing disk mappings need
                   to be removed.

@hgkamath
Author

hgkamath commented Apr 14, 2023

By the way, can you double check what happens when connecting to the same nbd server using a linux nbd client?

btw, recently, in qemu-project, a vhdx corruption bug was resolved.
https://gitlab.com/qemu-project/qemu/-/issues/727#note_1347303636
In that comment, you can see that a local qemu-storage-daemon on Linux works well with linux local nbd-client.

On my single laptop, I don't think I have a way to export an NBD share from a Windows build of qemu-storage-daemon on a Windows machine and connect to it from an nbd-client on a Linux machine. Involving a VM is perhaps not the right way to test this. But, as they are platform builds of the same code, they should mostly behave the same, apart from some uncertainty due to differences in the file-access layer on Windows.

@hgkamath
Author

hgkamath commented Apr 14, 2023

I learnt 2 things

  1. When in the stuck state, if wnbd-client is executed from within powershell.exe v5.1.19041.2673, it won't load/start.
    But if wnbd-client is started from within cmd.exe, it can load.
    I'm unsure how and why this is the case. It would be interesting to know.
    What this means is that, at least for my present purposes, I can still do without a shell mode.
    This does not mean shell mode will never be necessary: another day may bring an even more serious stuck situation in which even cmd isn't helpful.
    Lesson: keep a few cmd.exe windows with administrative privileges open.
  2. As seen from the logs below, reset-adapter is not powerful enough to force its way through and unstick the situation.
    But uninstall-driver can.
C:\Windows\system32>D:\vstorage\nbd\wnbd_client\wnbd-client.exe reset-adapter
libwnbd.dll!WnbdResetAdapter WARNING Could not reset WNBD adapter. Device in use, operation vetoed.
libwnbd.dll!WnbdResetAdapterEx WARNING Could not reset adapter, device busy. Time elapsed: 0.18s, time left: 9.8s.
libwnbd.dll!WnbdResetAdapter WARNING Could not reset WNBD adapter. Device in use, operation vetoed.
libwnbd.dll!WnbdResetAdapterEx WARNING Could not reset adapter, device busy. Time elapsed: 1.38s, time left: 8.6s.
libwnbd.dll!WnbdResetAdapter WARNING Could not reset WNBD adapter. Device in use, operation vetoed.
libwnbd.dll!WnbdResetAdapterEx WARNING Could not reset adapter, device busy. Time elapsed: 2.56s, time left: 7.4s.
libwnbd.dll!WnbdResetAdapter WARNING Could not reset WNBD adapter. Device in use, operation vetoed.
libwnbd.dll!WnbdResetAdapterEx WARNING Could not reset adapter, device busy. Time elapsed: 3.76s, time left: 6.2s.
libwnbd.dll!WnbdResetAdapter WARNING Could not reset WNBD adapter. Device in use, operation vetoed.
libwnbd.dll!WnbdResetAdapterEx WARNING Could not reset adapter, device busy. Time elapsed: 4.95s, time left: 5.1s.
libwnbd.dll!WnbdResetAdapter WARNING Could not reset WNBD adapter. Device in use, operation vetoed.
libwnbd.dll!WnbdResetAdapterEx WARNING Could not reset adapter, device busy. Time elapsed: 6.13s, time left: 3.9s.
libwnbd.dll!WnbdResetAdapter WARNING Could not reset WNBD adapter. Device in use, operation vetoed.
libwnbd.dll!WnbdResetAdapterEx WARNING Could not reset adapter, device busy. Time elapsed: 7.32s, time left: 2.7s.
libwnbd.dll!WnbdResetAdapter WARNING Could not reset WNBD adapter. Device in use, operation vetoed.
libwnbd.dll!WnbdResetAdapterEx WARNING Could not reset adapter, device busy. Time elapsed: 14.44s, time left: -4.4s.
... gives up after about 8 times.
C:\Windows\system32>

C:\Windows\system32>D:\vstorage\nbd\wnbd_client\wnbd-client.exe uninstall-driver
libwnbd.dll!WnbdRemoveAllDisks INFO Hard removing WNBD disk: gkpics01
libwnbd.dll!RemoveWnbdAdapterDevice INFO Removing WNBD adapter device. Hardware id: root\wnbd. Class GUID: {4D36E97B-E325-11CE-BFC1-08002BE10318}
libwnbd.dll!CleanDrivers INFO Removing WNBD driver: oem41.inf
C:\Windows\system32>
... now I get back control.
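
The escalation order suggested by this thread (try the adapter reset first, then fall back to uninstalling the driver) could be scripted roughly as follows. This is a Python sketch under two assumptions that the thread does not confirm: that wnbd-client returns a non-zero exit code when the reset is vetoed, and that the script host itself can still start processes in the stuck state. `recover` and `run` are hypothetical names.

```python
def recover(run):
    """Try the gentler reset first, then fall back to uninstalling the driver.

    `run` is a hypothetical callable that takes an argument list and returns
    the process exit code. Assumption: a non-zero code means the step failed.
    Returns the name of the command that succeeded, or None if none did.
    """
    steps = [
        ["wnbd-client.exe", "reset-adapter", "--hard-disconnect-mappings"],
        ["wnbd-client.exe", "uninstall-driver"],
    ]
    for argv in steps:
        if run(argv) == 0:
            return argv[1]
    return None
```

A real runner could be as small as `lambda argv: subprocess.run(argv).returncode`. Keeping the runner injectable also makes it easy to dry-run the sequence without touching the driver.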

Oops, I forgot the --hard-disconnect-mappings argument. Given that the exe was able to start and run, I think it will work. I'll try that next time. [EDIT] I did get to try it; I think it said 'operation vetoed', but I couldn't copy/save the error text.

I will log further details pertaining to this stuck situation in #63-comment-1508390090

@petrutlucian94
Member

We found out that the IO deadlock was caused by having Windows caching enabled on the WNBD disk side as well as the underlying local NBD server side. Disabling caching on the qemu-storage-daemon side solved the issue (cache.direct=on). This does not affect Ceph.
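
For reference, a qemu-storage-daemon invocation with host caching bypassed on the image file might be assembled as in the Python sketch below. The node names, export id, loopback address, default image format and the helper name `storage_daemon_args` are all illustrative assumptions, not the exact setup used in this thread; only `cache.direct=on` comes from the finding above.

```python
def storage_daemon_args(image_path, export_name, image_format="vhdx", port=10809):
    """Build an argument list that exports `image_path` over NBD.

    Assumptions: node names, export id and the 127.0.0.1 listener are
    placeholders, and `image_format` must match the actual image.
    `image_path` must not contain commas, which would need escaping in
    qemu option strings. 10809 is the IANA-registered NBD port.
    """
    return [
        "qemu-storage-daemon",
        # cache.direct=on bypasses the host cache on the server side.
        "--blockdev",
        f"driver=file,node-name=file0,filename={image_path},cache.direct=on",
        "--blockdev",
        f"driver={image_format},node-name=disk0,file=file0",
        "--nbd-server",
        f"addr.type=inet,addr.host=127.0.0.1,addr.port={port}",
        "--export",
        f"type=nbd,id=export0,node-name=disk0,name={export_name},writable=on",
    ]
```

The resulting list can be passed to `subprocess.run` or joined into a command line; the point is simply where `cache.direct=on` goes, namely on the blockdev node that opens the image file.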

In the meantime, the NBD client functionality has been moved to libwnbd, and wnbd-client map became a blocking command. wnbd-client has just a few simple commands, so adding an interactive shell mode wouldn't help much IMHO.

@petrutlucian94 petrutlucian94 added the enhancement New feature or request label May 22, 2023