Port virtualbox scripts to VBoxManage CLI #625

Open
wants to merge 16 commits into
base: main

@stevemk14ebr stevemk14ebr commented Oct 9, 2024

Ports to VBoxManage CLI, identical logic otherwise. Errors handled gracefully for the most part. Output:

stepheneckels@flarevm-build-2:~/source/repos/flare-vm$ python3 virtualbox/vbox-export-snapshots.py 
Starting operations on FLARE-VM
VM {b76d628b-737f-40a3-9a16-c5f66ad2cfcc} is already shut down (state: poweroff).
Restored 'FLARE-VM'
Found existing hostonlyif vboxnet0
Verified hostonly nic configuration correct
Power cycling before export...
VM {b76d628b-737f-40a3-9a16-c5f66ad2cfcc} is not running (state: poweroff). Starting VM...
VM {b76d628b-737f-40a3-9a16-c5f66ad2cfcc} started.
VM {b76d628b-737f-40a3-9a16-c5f66ad2cfcc} is not powered off. Shutting down VM...
VM {b76d628b-737f-40a3-9a16-c5f66ad2cfcc} is shut down (status: poweroff).
Power cycling done.
Exporting /usr/local/google/home/stepheneckels/EXPORTED VMS/FLARE-VM.20241009.dynamic.ova (this will take some time, go for an 🍦!)
Exported /usr/local/google/home/stepheneckels/EXPORTED VMS/FLARE-VM.20241009.dynamic.ova! 🎉
All operations on FLARE-VM successful ✅
Starting operations on FLARE-VM.full
VM {b76d628b-737f-40a3-9a16-c5f66ad2cfcc} is already shut down (state: poweroff).
Restored 'FLARE-VM.full'
Found existing hostonlyif vboxnet0
Changed nic1 to hostonly
Verified hostonly nic configuration correct
Power cycling before export...
VM {b76d628b-737f-40a3-9a16-c5f66ad2cfcc} is not running (state: poweroff). Starting VM...
VM {b76d628b-737f-40a3-9a16-c5f66ad2cfcc} started.
VM {b76d628b-737f-40a3-9a16-c5f66ad2cfcc} is not powered off. Shutting down VM...
VM {b76d628b-737f-40a3-9a16-c5f66ad2cfcc} is shut down (status: poweroff).
Power cycling done.
Exporting /usr/local/google/home/stepheneckels/EXPORTED VMS/FLARE-VM.20241009.full.dynamic.ova (this will take some time, go for an 🍦!)
Exported /usr/local/google/home/stepheneckels/EXPORTED VMS/FLARE-VM.20241009.full.dynamic.ova! 🎉
All operations on FLARE-VM.full successful ✅
Starting operations on FLARE-VM.EDU
VM {b76d628b-737f-40a3-9a16-c5f66ad2cfcc} is already shut down (state: poweroff).
Restored 'FLARE-VM.EDU'
Found existing hostonlyif vboxnet0
Changed nic1 to hostonly
Verified hostonly nic configuration correct
Power cycling before export...
VM {b76d628b-737f-40a3-9a16-c5f66ad2cfcc} is not running (state: poweroff). Starting VM...
VM {b76d628b-737f-40a3-9a16-c5f66ad2cfcc} started.
VM {b76d628b-737f-40a3-9a16-c5f66ad2cfcc} is not powered off. Shutting down VM...
VM {b76d628b-737f-40a3-9a16-c5f66ad2cfcc} is shut down (status: poweroff).
Power cycling done.
Exporting /usr/local/google/home/stepheneckels/EXPORTED VMS/FLARE-VM.20241009.EDU.ova (this will take some time, go for an 🍦!)
Exported /usr/local/google/home/stepheneckels/EXPORTED VMS/FLARE-VM.20241009.EDU.ova! 🎉
All operations on FLARE-VM.EDU successful ✅
Done. Exiting...

Member

@Ana06 Ana06 left a comment

Thanks for the work @stevemk14ebr! I still need to test the code locally, but I have already added some questions and improvement suggestions. It is good to see what we can do with VBoxManage and how it allows us to remove the virtualbox dependency. The disadvantages are that it is less flexible, as it does not expose everything in the API (for example, it seems it is not possible to access the max number of adapters, which would allow us to write simpler code as in the previous version), and that we need to create a subprocess every time we want to run a command. The new code using VBoxManage also looks longer and more complicated, but we may be able to simplify it a bit.

What about keeping both the version using the virtualbox library and the new one using VBoxManage until we have tested and migrated everything else?

Also, I think we need some documentation in /virtualbox/README.md.

Comment on lines 45 to 48
except subprocess.CalledProcessError as e:
    # exit code is an error
    print(f"Error running VBoxManage command: {e} ({e.stderr})")
    raise Exception(f"Error running VBoxManage command")
Member

Why do we need to catch the exception just to print an error and re-raise it? I see the same pattern in other functions as well.

Author

Style choice, this throws a pretty error to the top level main to print out. I can change if you think there's a more pythonic style

Member

I have seen other Python code re-raise an exception to add extra details or format it differently, but without a print that duplicates similar information. Apart from duplicating the information, the print can make the output difficult to digest in this case, as {e.stderr} is likely to render the long help message from VBoxManage. I think we should remove the try-catch, as the re-raised exception is almost the same:

  • Original exception: Command '['VBoxManage', 'list2', 'list', 'hostonlyifs']' returned non-zero exit status 2
  • Retriggered exception: Error running VBoxManage command: Command '['VBoxManage', 'list2', 'list', 'hostonlyifs']' returned non-zero exit status 2.
Suggested change (delete these lines)
except subprocess.CalledProcessError as e:
    # exit code is an error
    print(f"Error running VBoxManage command: {e} ({e.stderr})")
    raise Exception(f"Error running VBoxManage command")

Member

Also, if we re-raise the exception, I think we should use a more concrete type of exception like RuntimeError.
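
Taken together, the two suggestions above (no duplicate print, a more specific exception type) could be combined roughly as in the sketch below. This is a minimal illustration and not the PR's code: `run_command` is a hypothetical helper name, it takes the full argument list so it can be tried without VirtualBox installed, and trimming stderr to its first line is just one possible way to avoid flooding the output with a long help message.

```python
import subprocess
import sys


def run_command(cmd: list[str]) -> str:
    """Run a command and return its stdout; raise RuntimeError on a non-zero exit."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except subprocess.CalledProcessError as e:
        # Keep only the first stderr line (an assumption about what is most useful).
        lines = (e.stderr or "").strip().splitlines()
        detail = lines[0] if lines else "no stderr output"
        # "from e" keeps the original CalledProcessError attached as __cause__.
        raise RuntimeError(
            f"Command {cmd} failed with exit status {e.returncode}: {detail}"
        ) from e
    return result.stdout


if __name__ == "__main__":
    # Harmless demo command; a real call might be run_command(["VBoxManage", "list", "vms"]).
    print(run_command([sys.executable, "-c", "print('ok')"]))
```

The top-level `main` can then catch `RuntimeError` once and print a single message, instead of every helper printing and re-raising.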

session.unlock_machine()
print(f"Restored '{snapshot_name}' and changed its adapter(s) to host-only")

vm_uuid = get_vm_uuid(VM_NAME)
Member

do we need to get the UUID? It seems like the commands work with the VM_NAME (we may need to enclose the entire name in double quotes to avoid issues with spaces), or am I missing something?

Author

We could rely on the VM_NAME alone, but I use the UUID so that we can support multiple VMs of the same name and be sure we refer to the same VM consistently for all operations

Author

stevemk14ebr commented Oct 10, 2024

> for example, it seems it is not possible to access the max number of adapters which would allow us to write simpler code as in the previous version

We can: the vminfo command lists all 8 adapters (the max), and any unset adapters have the value 'none'. The code doesn't need to check the max number of adapters because the command lists all of them, even if they're unset, so we always loop over all 8 adapters.
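
The adapter loop described above might look roughly like this sketch. It assumes the `--machinereadable` form of `VBoxManage showvminfo`, in which adapter lines look like `nic1="nat"`; the `parse_nic_types` name and the sample text are illustrative and not taken from the PR.

```python
import re


def parse_nic_types(vminfo: str) -> dict[int, str]:
    """Map adapter number to attachment type, e.g. {1: "nat", 2: "none"}."""
    # Only keys of the form nic<N> match; nictype1, nicspeed1, etc. are skipped.
    matches = re.findall(r'^nic(\d+)="(.*?)"', vminfo, flags=re.M)
    return {int(number): kind for number, kind in matches}


# Illustrative excerpt (assumed format), shortened to three adapters.
sample = '''name="FLARE-VM"
nic1="nat"
nictype1="82540EM"
nic2="hostonly"
hostonlyadapter2="vboxnet0"
nic3="none"
'''

nics = parse_nic_types(sample)
# Unset adapters are reported as "none", so no max-adapter query is needed.
in_use = {number: kind for number, kind in nics.items() if kind != "none"}
print(in_use)
```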

> What about keeping both the version using the virtualbox library and the new one using VBoxManage until we have tested and migrated everything else

I have no issues with not merging these PRs (I will send more for the other two scripts) until we are ready to drop the virtualbox package dependency entirely. I would not want to keep two versions around though; that goes against the spirit of doing this work. While the code does appear more complex, the port was actually quite straightforward; there is just a lot of logic to parse the text CLI output and handle the errors nicely. Some things are different from the virtualbox package for sure, but nothing glaring is missing from the CLI. In the long term this should be very easy to maintain, as the CLI does not change often. More importantly, on some setups the Python .so that virtualbox uses is not built or included, and the package has been unmaintained for over a year at this time, so we should not rely on it anymore.

@stevemk14ebr stevemk14ebr changed the title from "Port vbox-export-snapshots to VBoxManage CLI" to "Port virtualbox scripts to VBoxManage CLI" on Oct 11, 2024
Member

@Ana06 Ana06 left a comment

I have tested vbox-export-snapshots.py and it fails because the network interface does not have a name. I thought this happened in the previous version when exporting the VM, but it seems like setting it could be the issue, and because you start the VM (which I was not doing in the previous version) it fails even before exporting it:

Starting operations on FLARE-VM
VM {40138663-f254-412b-8776-10a7cc08daea} is already shut down (state: poweroff).
Restored 'FLARE-VM'
VM {40138663-f254-412b-8776-10a7cc08daea} is already shut down (state: poweroff).
Found existing hostonlyif vboxnet0
Changed nic1
Nic configuration verified correct
Power cycling before export...
VM {40138663-f254-412b-8776-10a7cc08daea} is not running (state: poweroff). Starting VM...
Error running VBoxManage command: Command '['VBoxManage', 'startvm', '{40138663-f254-412b-8776-10a7cc08daea}', '--type', 'gui']' returned non-zero exit status 1. (VBoxManage: error: Nonexistent host networking interface, name '' (VERR_INTERNAL_ERROR)
VBoxManage: error: Details: code NS_ERROR_FAILURE (0x80004005), component ConsoleWrap, interface IConsole
)
Error checking VM state: Error running VBoxManage command
Unexpectedly failed doing operations on FLARE-VM. Exiting...
Done. Exiting...

I reported what I think was a bug in https://www.virtualbox.org/ticket/22158. But what really confuses me is that it seems it does work for you. 😕

Member

Ana06 commented Dec 11, 2024

I did some more testing. Exporting a VM after setting the hostonly adapter (with either the virtualbox API or the VBoxManage CLI) fails when I have never used the hostonly adapter of the VM, as exporting does not set the name. It does not appear possible to use the API/VBoxManage CLI and I still think this is a virtualbox bug, as reported on https://www.virtualbox.org/ticket/22158. But for our case, where we always use the same VM to export several snapshots, we can ensure the hostonly adapter has a name before creating the snapshots: set the network to hostonly (save the settings) and then back to NAT (save the settings again). This ensures the hostonly adapter name is set, and then exporting with both the virtualbox API and the VBoxManage CLI works.

So the issue is not a blocker for this PR. Thanks @stevemk14ebr for working on this! This is a very intuitive bug and your work was very helpful to figure out a fix. 💐

Contributor

this file should not be committed.

Comment on lines 3 to +12
import sys
import textwrap
import argparse
import virtualbox
from virtualbox.library import NetworkAttachmentType as NetType
import subprocess
import re
import time
import gi
gi.require_version('Notify', '0.7')
from gi.repository import Notify
from vboxcommon import *
Contributor

recommend using isort for formatting consistency

Contributor

and black for other formatting consistency

Contributor

not blockers, but might be nice

Member

@Ana06 Ana06 Dec 12, 2024

Good idea @williballenthin! We can address this after this PR has been merged in a different PR, added to #507 (comment). So that @stevemk14ebr does not need to add more things to this PR. 😉

DISABLED_ADAPTER_TYPE = "hostonly"
ALLOWED_ADAPTER_TYPES = ("hostonly", "intnet", "none")

def get_vm_uuids(dynamic_only):
Contributor

recommend type hints, at least for function signatures, as a form of documentation

Contributor

not a blocker, but nice to have

Comment on lines +29 to +32
if dynamic_only and DYNAMIC_VM_NAME in vm_name:
    machine_guids.append((vm_name, machine_guid))
else:
    machine_guids.append((vm_name, machine_guid))
Contributor

i don't understand this logic, aren't the branches the same on both sides?

Member

This change was made to try to address my feedback from #625 (comment). We want to add all names if dynamic_only is not set, and only the VMs with .dynamic in the name if it is set. I think it should be:

Suggested change
- if dynamic_only and DYNAMIC_VM_NAME in vm_name:
-     machine_guids.append((vm_name, machine_guid))
- else:
-     machine_guids.append((vm_name, machine_guid))
+ if (not dynamic_only) or DYNAMIC_VM_NAME in vm_name:
+     machine_guids.append((vm_name, machine_guid))
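
To see the intended behaviour of the suggested condition on concrete data, here is a small sketch. The `select_vms` helper and the sample VM list are hypothetical, and `DYNAMIC_VM_NAME = ".dynamic"` is an assumption based on the ".dynamic" naming mentioned in the comment above.

```python
DYNAMIC_VM_NAME = ".dynamic"  # assumed value of the constant used in the diff


def select_vms(vms: list[tuple[str, str]], dynamic_only: bool) -> list[tuple[str, str]]:
    """Keep every VM, or only those with DYNAMIC_VM_NAME in the name if dynamic_only is set."""
    return [
        (vm_name, machine_guid)
        for vm_name, machine_guid in vms
        if (not dynamic_only) or DYNAMIC_VM_NAME in vm_name
    ]


# Hypothetical (name, GUID) pairs for illustration.
vms = [("FLARE-VM.20241009.dynamic", "guid-1"), ("FLARE-VM.20241009.EDU", "guid-2")]
print(select_vms(vms, dynamic_only=True))
print(select_vms(vms, dynamic_only=False))
```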

machine_guids = []
try:
    vms_output = run_vboxmanage(["list", "vms"])
    pattern = r'"(.*?)" \{(.*?)\}'
Contributor

when dealing with regular expressions (and command output), i'd recommend including a few example lines of the text the pattern is meant to match, so it's easy for a human to follow along. otherwise, i have to guess what this pattern does, which can be hard or impossible for some regular expressions.
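
As an illustration of that recommendation, the pattern from the diff could be documented with sample lines as in the sketch below. `parse_vm_list` is a hypothetical helper name, and the sample lines assume that `VBoxManage list vms` prints one `"name" {uuid}` pair per line; the UUIDs are the ones appearing in the logs in this thread, while their pairing with names here is only illustrative.

```python
import re

# Each line of the `VBoxManage list vms` output is expected to look like:
#   "FLARE-VM" {b76d628b-737f-40a3-9a16-c5f66ad2cfcc}
#   "REMnux.testing" {40138663-f254-412b-8776-10a7cc08daea}
# Group 1 is the VM name (without quotes), group 2 the UUID (without braces).
VM_LINE_PATTERN = r'"(.*?)" \{(.*?)\}'


def parse_vm_list(output: str) -> list[tuple[str, str]]:
    """Return (vm_name, uuid) pairs found in the command output."""
    return re.findall(VM_LINE_PATTERN, output)


sample_output = (
    '"FLARE-VM" {b76d628b-737f-40a3-9a16-c5f66ad2cfcc}\n'
    '"REMnux.testing" {40138663-f254-412b-8776-10a7cc08daea}\n'
)
print(parse_vm_list(sample_output))
```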

Comment on lines +13 to +18
Args:
vm_name: The name of the VM.
snapshot_name: The name of the snapshot.

Returns:
A list of snapshot names that are children of the given snapshot.
Contributor

nice documentation!

Contributor

does this return all descendants recursively? or only the direct children?

Member

It should return all. I think it would be good to clarify it in the documentation. 👍

Comment on lines +23 to +50
snapshot_regex = rf'(SnapshotName(?:-\d+)*)=\"(.*?)\"'
snapshots = re.findall(snapshot_regex, vminfo, flags=re.M)

children = []

# find the root SnapshotName by matching the name
root_snapshotid = None
for snapshotid, snapshot_name in snapshots:
    if snapshot_name.lower() == root_snapshot_name.lower() and (not any(p.lower() in snapshot_name.lower() for p in protected_snapshots)):
        root_snapshotid = snapshotid

if not root_snapshotid:
    print("Failed to find root snapshot")
    raise Exception(f"Failed to find root snapshot {snapshot_name}")

# children of that snapshot share the same prefix id
dependant_child = False
for snapshotid, snapshot_name in snapshots:
    if snapshotid.startswith(root_snapshotid):
        if not any(p.lower() in snapshot_name.lower() for p in protected_snapshots):
            children.append((snapshotid, snapshot_name))
        else:
            dependant_child = True

# remove the root snapshot if any children are protected OR it's the current snapshot
if dependant_child:
    print("Root snapshot cannot be deleted as a child snapshot is protected")
    children = [snapshot for snapshot in children if snapshot[0] != root_snapshotid]
Contributor

i think i can follow this logic, and it seems reasonable, but without some example output, i don't really have any idea.

furthermore, i wouldn't be comfortable changing this logic without any tests. how would i know if i broke anything (especially without vboxmanage installed)?

i'd recommend making this parsing region its own function (input: str, output: list), and then adding some test cases with known data. the test cases will serve both as documentation (since readers can see some examples of real-world data) and to prevent regressions.
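
A rough sketch of what such a standalone parsing function (input: str, output: list) with known test data could look like is shown below. It is not the PR's implementation: `select_snapshots_to_delete` and the sample text are hypothetical, and the sample assumes machine-readable lines such as `SnapshotName-1="tools"`. The sketch also deviates from the diff in two labelled places: the regex is anchored at the start of a line, and descendants are matched on `root_id + "-"` rather than a bare prefix.

```python
import re

# Anchoring at line start keeps CurrentSnapshotName="..." from matching (deviation from the diff).
SNAPSHOT_RE = re.compile(r'^(SnapshotName(?:-\d+)*)="(.*?)"', re.M)


def select_snapshots_to_delete(
    vminfo: str, root_name: str, protected: list[str]
) -> list[tuple[str, str]]:
    """Return (snapshot_id, name) pairs for the root and its descendants, parents first."""
    snapshots = SNAPSHOT_RE.findall(vminfo)

    def is_protected(name: str) -> bool:
        return any(p.lower() in name.lower() for p in protected)

    root_id = None
    for snapshot_id, name in snapshots:
        if name.lower() == root_name.lower() and not is_protected(name):
            root_id = snapshot_id
    if root_id is None:
        raise RuntimeError(f"Failed to find root snapshot {root_name}")

    selected = []
    has_protected_descendant = False
    for snapshot_id, name in snapshots:
        # Stricter than a bare startswith(): "SnapshotName-1" must not match "SnapshotName-10".
        if snapshot_id == root_id or snapshot_id.startswith(root_id + "-"):
            if is_protected(name):
                has_protected_descendant = True
            else:
                selected.append((snapshot_id, name))

    # The root cannot be deleted while a protected descendant depends on it.
    if has_protected_descendant:
        selected = [s for s in selected if s[0] != root_id]
    return selected


# Illustrative input in the assumed format.
SAMPLE = '''SnapshotName="EMPTY"
SnapshotUUID="00000000-0000-0000-0000-000000000001"
SnapshotName-1="tools"
SnapshotName-1-1="tools.keep"
SnapshotName-1-2="tools.tmp"
SnapshotName-2="scratch"
CurrentSnapshotName="scratch"
'''

print(select_snapshots_to_delete(SAMPLE, "tools", ["keep"]))
```

Because the function only takes a string, test cases like the sample above can run without VBoxManage installed.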

vm = vbox.find_machine(vm_name)
snapshot = vm.find_snapshot(snapshot_name)
get_snapshots_to_delete(snapshot, protected_snapshots)
TO_DELETE = get_snapshot_children(vm_name, snapshot_name, protected_snapshots)
Contributor

consider reserving upper snake case names for constants

Contributor

not a blocker

print(f"\nVM state: {vm.state}\n⚠️ Snapshot deleting is slower in a running VM and may fail in a changing state")
vm_state = get_vm_state(vm_name)
if vm_state not in ("poweroff", "saved"):
print(f"\nVM state: {vm_state}\n⚠️ Snapshot deleting is slower in a running VM and may fail in a changing state")

answer = input("\nConfirm deletion ('y'):")
Contributor

is "y" the default, or is that what you should press to confirm?

from the code i see it's what you should press, but i'm afraid that in this format the message suggests "y" is the default. maybe change to "press 'y'".

Member

The idea is that the user has to press 'y', as this deletes the snapshots and there is no way to recover them. I agree it would be good to clarify it.


answer = input("\nConfirm deletion ('y'):")
if answer.lower() == "y":
print("\nDeleting... (this may take some time, go for an 🍦!)")
session = vm.create_session()
for name, uuid in TO_DELETE:
for snapshotid, snapshot_name in TO_DELETE[::-1]: # delete in reverse order to avoid issues with child snapshots
Contributor

Suggested change
- for snapshotid, snapshot_name in TO_DELETE[::-1]: # delete in reverse order to avoid issues with child snapshots
+ for snapshotid, snapshot_name in reversed(TO_DELETE): # delete in reverse order to avoid issues with child snapshots

Contributor

also, interesting behavior. i think get_snapshot_children should document that the order of the result has a particular meaning.

Member

This was a bug in my original script, that @stevemk14ebr corrected here. I think it is a good idea to document this in the get_snapshot_children function. 👍
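
The ordering contract discussed above could be documented along these lines. This is a sketch under the assumption, implied by the reverse-order comment in the diff, that the list is returned with parents before their descendants; `deletion_order` and the sample data are hypothetical.

```python
def deletion_order(snapshots: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return snapshots in a safe deletion order.

    The input is expected parents-first (a parent always appears before its
    descendants), so reversing it yields children before their parents.
    """
    return list(reversed(snapshots))


# Hypothetical parents-first list of (snapshot_id, name) pairs.
to_delete = [
    ("SnapshotName-1", "tools"),
    ("SnapshotName-1-1", "tools.tmp"),
    ("SnapshotName-1-1-1", "tools.tmp.test"),
]
for snapshot_id, snapshot_name in deletion_order(to_delete):
    print(f"would delete {snapshot_name} ({snapshot_id})")
```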

Member

Ana06 commented Dec 17, 2024

vbox-export-snapshots.py works perfectly, but I think there is a bug in vbox-clean-snapshots.py:

Failed to find root snapshot
Error getting snapshot children: Failed to find root snapshot EMPTY
Traceback (most recent call last):
  File "/usr/local/google/home/anamg/VM-building/vbox-clean-snapshots.py", line 36, in get_snapshot_children
    raise Exception(f"Failed to find root snapshot {snapshot_name}")
Exception: Failed to find root snapshot EMPTY

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/google/home/anamg/VM-building/vbox-clean-snapshots.py", line 126, in <module>
    main()
  File "/usr/local/google/home/anamg/VM-building/vbox-clean-snapshots.py", line 122, in main
    delete_snapshot_and_children(args.vm_name, args.root_snapshot, args.protected_snapshots)
  File "/usr/local/google/home/anamg/VM-building/vbox-clean-snapshots.py", line 57, in delete_snapshot_and_children
    TO_DELETE = get_snapshot_children(vm_name, snapshot_name, protected_snapshots)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/google/home/anamg/VM-building/vbox-clean-snapshots.py", line 54, in get_snapshot_children
    raise Exception(f"Could not get snapshot children for '{vm_name}'")
Exception: Could not get snapshot children for 'REMnux.testing'

3 participants