Implement Scrapli 'core' and platform migration #297

Open
wants to merge 30 commits into base: scrapli-dev


@kaelemc kaelemc commented Dec 20, 2024

This PR implements Scrapli in the 'core' functions, and uses the Scrapli driver for some platforms.

Changes

vrnetlab base image

A new vrnetlab 'base' image is added. This image should be used as the base for all migrated platforms.

It contains common packages preinstalled, as well as Scrapli and Scrapli Community.

Scrapli 'core'

The core now uses Scrapli's Driver for serial console and qemu monitor connections.

Scrapli is imported in vrnetlab.py but will not throw an error if not found, which means scrapli and telnetlib may co-exist.

Setting use_scrapli to True in the class initializer will enable Scrapli for the serial console and qemu monitor connections.
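As a rough illustration of the optional import and the opt-in flag, here is a minimal sketch. The `VM` class below is a heavily trimmed, hypothetical stand-in for the real class in vrnetlab.py, and the `SCRAPLI_AVAILABLE` name and the choice to ignore the flag when Scrapli is missing are assumptions made for the example, not necessarily what the PR does.

```python
# Optional import: keep working with telnetlib only when Scrapli is absent.
try:
    from scrapli.driver import GenericDriver  # real Scrapli class
    SCRAPLI_AVAILABLE = True
except ImportError:
    GenericDriver = None
    SCRAPLI_AVAILABLE = False  # assumed name, for illustration only


class VM:
    """Hypothetical, trimmed stand-in for the vrnetlab VM class."""

    def __init__(self, use_scrapli=False):
        # Assumption: the flag is only honoured when the import succeeded.
        self.use_scrapli = use_scrapli and SCRAPLI_AVAILABLE


class MyNode(VM):
    def __init__(self):
        # Opt in from the platform's class initializer.
        super().__init__(use_scrapli=True)
```

Platforms that never set the flag keep the telnetlib path untouched, which is what lets both backends co-exist during the migration.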

Core functions wait_write, expect and read_until now have Scrapli equivalents.

wait_write

wait_write can still be used as normal. Inside the function, if use_scrapli was set to True, the wait_write_scrapli function is run instead.

wait_write_scrapli replicates the functionality of wait_write, but using the Scrapli serial console connection instead of telnetlib.

wait_write_scrapli does not currently support the con, clean_buffer or hold args.
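The dispatch described above could look roughly like the sketch below. The class is a hypothetical stand-in that only records which backend handled the call; the wait_write signature and its defaults are assumed here rather than copied from the PR.

```python
class VM:
    """Hypothetical stand-in showing only the dispatch logic."""

    def __init__(self, use_scrapli=False):
        self.use_scrapli = use_scrapli
        self.calls = []  # records which backend handled each call

    def wait_write(self, cmd, wait="__defaultpattern__", con=None,
                   clean_buffer=False, hold=""):
        if self.use_scrapli:
            # con, clean_buffer and hold are not passed on: the Scrapli
            # path does not support them yet.
            return self.wait_write_scrapli(cmd, wait)
        # Placeholder for the existing telnetlib implementation.
        self.calls.append(("telnetlib", cmd, wait))

    def wait_write_scrapli(self, cmd, wait="__defaultpattern__"):
        # Placeholder for the Scrapli serial console implementation.
        self.calls.append(("scrapli", cmd, wait))


vm = VM(use_scrapli=True)
vm.wait_write("show version", wait="#")
```

Because callers keep using wait_write, a platform can switch backends by changing only the flag.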

expect

expect was part of telnetlib. The Scrapli-compatible version is the con_expect function.

con_expect aims to replicate the functionality of telnetlib.expect.

It accepts a list of byte-strings which are used for regex matching something on the console.

Like telnetlib.expect it will return the list index of the first thing that it matched, the re match object and the read console buffer up until the match (or timeout).

Unlike telnetlib.expect, con_expect does not block forever when no timeout is given. The timeout is optional and only sets how long you wish to block for.

My personal opinion is to avoid the timeout/block, as it makes behaviour less reliable.
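To make the return shape concrete, here is a minimal sketch of the matching logic. It is not the PR's con_expect: `expect_sketch` and the `read_chunk` callable are hypothetical names, and reading the console is reduced to a plain function so the example runs without a device.

```python
import re
import time


def expect_sketch(patterns, read_chunk, timeout=None):
    """Return (index, match, buffer), the same shape as telnetlib.expect.

    patterns   -- list of byte-string regexes
    read_chunk -- hypothetical callable returning whatever bytes are available
    timeout    -- optional seconds to keep polling; None means one read only
    """
    compiled = [re.compile(p) for p in patterns]
    deadline = None if timeout is None else time.monotonic() + timeout
    buf = b""
    while True:
        buf += read_chunk()
        for index, regex in enumerate(compiled):
            match = regex.search(buf)
            if match:
                # Hand back the buffer up to the end of the match.
                return index, match, buf[: match.end()]
        if deadline is None or time.monotonic() >= deadline:
            # Nothing matched: telnetlib.expect also returns (-1, None, data).
            return -1, None, buf
        time.sleep(0.01)


index, match, data = expect_sketch([b"login:", b"RETURN"], lambda: b"Press RETURN")
```

Without a timeout the sketch does a single pass and returns, so a caller can poll it from its own loop instead of blocking inside the call.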

read_until

read_until was another function of telnetlib; the Scrapli-compatible function is con_read_until. Generally this wasn't used directly in any nodes, but rather by the wait_write function.

con_read_until will continuously read and output the serial console buffer to stdout until the string it must match on is matched.

It returns the entire buffer of the console read until the match.

By default con_read_until is blocking; however, there is a timeout arg for cases where the function should give up after some amount of time.
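A minimal sketch of that read loop follows, under the same assumptions as before: `read_until_sketch` and `read_chunk` are hypothetical names, and the console is replaced by a callable so the example is self-contained.

```python
import sys
import time


def read_until_sketch(match_str, read_chunk, timeout=None):
    """Echo console output to stdout until match_str appears, then return the buffer.

    read_chunk is a hypothetical callable returning the bytes currently available.
    With timeout=None the loop blocks until the match is seen.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    needle = match_str.encode()
    buf = b""
    while True:
        chunk = read_chunk()
        if chunk:
            # Print console output as it arrives and flush straight away.
            sys.stdout.write(chunk.decode(errors="replace"))
            sys.stdout.flush()
            buf += chunk
        if needle in buf:
            return buf
        if deadline is not None and time.monotonic() >= deadline:
            return buf  # timed out: return what was read so far
        time.sleep(0.01)


chunks = iter([b"booting...\n", b"Router>"])
result = read_until_sketch("Router>", lambda: next(chunks, b""))
```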

Logging

  • Log formatting has been improved: log levels are now coloured via ANSI escape codes.
  • Common logs are done on vrnetlab side now:
    • Env vars
    • If transparent mgmt intf is in use
    • If scrapli is in use
    • SMP/vCPU and RAM settings
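For anyone curious how level colouring can be done with the standard library, here is one possible sketch. The `ColourFormatter` class, the colour choices and the format string are assumptions for illustration and may differ from what the PR implements.

```python
import logging

RESET = "\x1b[0m"
# Assumed colour mapping, using standard ANSI foreground codes.
LEVEL_COLOURS = {
    logging.DEBUG: "\x1b[36m",     # cyan
    logging.INFO: "\x1b[32m",      # green
    logging.WARNING: "\x1b[33m",   # yellow
    logging.ERROR: "\x1b[31m",     # red
    logging.CRITICAL: "\x1b[35m",  # magenta
}


class ColourFormatter(logging.Formatter):
    """Wrap only the level name in an ANSI colour."""

    def format(self, record):
        colour = LEVEL_COLOURS.get(record.levelno, "")
        # Work on a copy so other handlers still see the plain level name.
        coloured = logging.makeLogRecord(record.__dict__)
        coloured.levelname = f"{colour}{record.levelname}{RESET}"
        return super().format(coloured)


handler = logging.StreamHandler()
handler.setFormatter(ColourFormatter("%(asctime)s %(levelname)s %(message)s"))
logger = logging.getLogger("vrnetlab-sketch")
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.info("Scrapli in use: yes")
```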

Misc

These are fairly opinionated additions:

write_to_stdout has been added. It's a simple helper function that writes something to stdout and flushes the buffer so everything is written.

The main usage is to print console output; the justification is that console output looks uglier when printed by the logger.
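A helper along these lines is only a few lines long. The sketch below is a standalone guess at its shape, not the PR's code: accepting both bytes and str is an assumption made so the example works with either kind of console buffer.

```python
import sys


def write_to_stdout(data):
    """Write console output to stdout and flush so nothing stays buffered."""
    if isinstance(data, bytes):
        # Assumption: console buffers arrive as bytes; decode defensively.
        data = data.decode("utf-8", errors="replace")
    sys.stdout.write(data)
    sys.stdout.flush()


write_to_stdout(b"Press RETURN to get started!\n")
```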

Another addition is the format_bool_color function. This is simply used to return text which is ANSI formatted in green or red depending on if some boolean is true or false.
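Something like the following would do the job. The parameter names and the exact escape codes are assumptions for the sketch; only the behaviour (green for true, red for false) comes from the description above.

```python
GREEN = "\x1b[32m"
RED = "\x1b[31m"
RESET = "\x1b[0m"


def format_bool_color(value, text_if_true, text_if_false):
    """Return ANSI-coloured text: green when value is true, red otherwise."""
    if value:
        return f"{GREEN}{text_if_true}{RESET}"
    return f"{RED}{text_if_false}{RESET}"


print("Scrapli in use:", format_bool_color(True, "yes", "no"))
```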

Migrated platforms

  • Cisco CSR1kv -- IOSXEDriver
  • Cisco Cat8kv -- Uses CVAC (mounted ISO)
  • Cisco Cat9kv -- Uses CVAC (mounted ISO)
  • Cisco vIOS -- IOSXEDriver
  • Cisco NX-OS -- NXOSDriver
  • Cisco Nexus 9000v -- NXOSDriver
  • Cisco IOS-XRv -- IOSXRDriver
  • Cisco IOS-XRv9k -- IOSXRDriver
  • Nokia SR-OS -- nokia_sros platform from Scrapli Community

I plan to implement other platforms later down the line, time permitting.

Migration steps

There are two ways you can migrate:

  • Maintain wait_write functionality but use Scrapli as the telnet backend.
  • Migrate everything to Scrapli and use the platform implementation for config management.

Steps

  • Migrate the Dockerfile to ghcr.io/srl-labs/vrnetlab-base as the base image.
  • In the class initializer, set use_scrapli to True.
  • Change self.tn.expect() to self.con_expect() and remove the timeout.
  • Change the console buffer printing to use self.write_to_stdout() instead of trace or debug logging.
  • Change the telnet close from self.tn.close() to self.scrapli_tn.close()
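The code-side steps above can be sketched as follows. `FakeBaseVM` is a hypothetical stand-in for the vrnetlab base class so that the example runs without a VM; its method bodies are placeholders, and the `bootstrap_spin` shape is assumed from how existing platforms are typically written.

```python
class FakeConsole:
    """Hypothetical stand-in for the Scrapli console connection."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class FakeBaseVM:
    """Hypothetical stand-in for the vrnetlab VM base class."""

    def __init__(self, use_scrapli=False):
        self.use_scrapli = use_scrapli
        self.scrapli_tn = FakeConsole()
        self.output = []

    def con_expect(self, patterns):
        # Placeholder: pretend the first pattern matched.
        return 0, None, b"Press RETURN to get started"

    def write_to_stdout(self, data):
        self.output.append(data)


class MyNode(FakeBaseVM):
    def __init__(self):
        super().__init__(use_scrapli=True)  # set use_scrapli in the initializer
        self.running = False

    def bootstrap_spin(self):
        # Before: ridx, match, res = self.tn.expect([b"Press RETURN"], 1)
        ridx, match, res = self.con_expect([b"Press RETURN"])
        if ridx == 0:
            # Before: self.tn.close()
            self.scrapli_tn.close()
            self.running = True
        if res:
            # Before: self.logger.trace("OUTPUT: %s" % res.decode())
            self.write_to_stdout(res)


node = MyNode()
node.bootstrap_spin()
```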

Extra steps if migrating to Scrapli platform/driver

  • Create the scrapli device configuration and open the connection:
    • You can close the self.scrapli_tn connection and open a new one, either manually or with the context manager (see XRv9k).
    • (RECOMMENDED) You can commandeer the existing self.scrapli_tn connection, so the new driver uses the existing transport (see IOS-XE devices: cat9kv, csr, cat8kv, vios).
  • (OPTIONAL) Implement the SCRAPLI_TIMEOUT env var to let the user control the driver timeout.
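For the optional SCRAPLI_TIMEOUT step, reading the variable might look like the sketch below. The helper name, the default of 3600 seconds and the fallback on invalid input are assumptions; the returned value would then be handed to the driver, for example through Scrapli's `timeout_ops` argument.

```python
import os

DEFAULT_SCRAPLI_TIMEOUT = 3600  # assumed default, in seconds


def scrapli_timeout(env=os.environ):
    """Read SCRAPLI_TIMEOUT, falling back to the default on missing or bad values."""
    raw = env.get("SCRAPLI_TIMEOUT", "")
    try:
        return int(raw)
    except ValueError:
        return DEFAULT_SCRAPLI_TIMEOUT


timeout = scrapli_timeout()
```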

ssasso and others added 25 commits December 16, 2024 10:11
* backdoor to reset VR

* option to reset specific VMs
- Implement scrapli for telnet console and qemu monitor
- Add scrapli for core funcs (wait_write, read_until, expect)
- Add conditional use of scrapli via 'use_scrapli' var. Default is disabled
- Add colours to logging
- Log env vars
- Log if transparent mgmt intf is in use
- Log if scrapli is in use
- Log overlay image creation
- Log defined SMP and RAM
- Use Scrapli IOSXEDriver for config
- Update install VM var name to 'cat8kv' from 'csr'
- Fix installer class init so overlay image is only created once
- Remove license check
- Send bootstrap config via day0/CVAC config (mounted file to cdrom)
- Send startup config via Scrapli IOSXEDriver
- Use Scrapli IOSXEDriver for sending bootstrap and startup configs
- Use Scrapli IOSXRDriver to send bootstrap and startup configs
- Converts the qcow2 image into required vmdk format for vrnetlab via qemu-img.
- Use Scrapli IOSXRDriver for bootstrap and startup configs
- Change class names to 'XRv9k' instead of 'XRv'
- Explicitly wait for SDR baking to complete in install process
- Remove call home/LC check
- Use NXOSDriver for bootstrap and startup configs
- Use NXOSDriver for bootstrap and startup configs
- Use IOSXEDriver for bootstrap and startup configs
- vios, csr, cat8kv, cat9kv -- add configuration saving
- XRv, XRv9k -- log configuration saving
- Use scrapli community 'nokia_sros' platform
- Remove wait_write clean_buffer override
- Check if tftpboot config exists *before* opening Scrapli connection
- Log command outputs with 'DEBUG_SCRAPLI' env var (defaults to false)
@kaelemc
Author

kaelemc commented Dec 20, 2024

I rebased my branch to add the /reset functionality with Scrapli.

There is currently a caveat with SROS. Please use my fork/branch of scrapli community. I have made some minor changes to the regex to allow for the BOF configuration prompt.

Once cloned, please rebuild the base image with the Dockerfile in this PR branch.

Once all has been tested and issues are ironed out I will make a PR to get this added in scrapli community.

Of course as always I am open to/want feedback. If you want any explanations for anything please let me know!

@kaelemc kaelemc mentioned this pull request Dec 20, 2024
35 tasks
@tjbalzer

Did some tests on:

  • Cisco cat9kv
  • Cisco csr1kv
  • Cisco cat8kv
  • Cisco vIOS

Everything worked as expected:

  • bootstrap config loaded OK
  • user provided startup-config loaded OK

All tests were done with fully functional labs up to six nodes.

Looks good, no Scrapli related issues so far (the log is a little chattier than before... ;-)).

@kaelemc
Author

kaelemc commented Dec 22, 2024

@tjbalzer Thanks for testing!

For SROS there is an env var DEBUG_SCRAPLI which controls whether we show the result of each command or not; by default it's disabled. In your opinion would something like this be better for the Cisco nodes too, or should we hide the channel input logging from Scrapli instead?

@kaelemc
Author

kaelemc commented Dec 23, 2024

Cat8k and Cat9k now use the CVAC (Cisco Virtual Appliance Config) to apply the bootstrap and/or startup config.

This means configs are applied with no interaction from the console at all.

The way this works is we mount an iso to the node with a file: iosxe_config.txt. This file then gets applied to the node.

The change I've made writes the default bootstrap cfg to this iosxe_config.txt file, and if a startup config is present it simply gets appended to the file. This provides the same behaviour as before, where a startup config is always applied 'on top' of (after) the bootstrap config.
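The ordering can be shown with a small sketch that only builds the file contents. `build_cvac_config` is a hypothetical helper name, the sample config lines are made up, and packing the resulting iosxe_config.txt into the mounted ISO is left out.

```python
def build_cvac_config(bootstrap, startup=None):
    """Return the text for iosxe_config.txt: bootstrap first, startup appended."""
    config = bootstrap.rstrip("\n") + "\n"
    if startup:
        # Appended last, so the startup config is applied after the bootstrap config.
        config += startup.rstrip("\n") + "\n"
    return config


bootstrap_cfg = "hostname cat9kv\nusername admin privilege 15 password admin\n"
startup_cfg = "interface GigabitEthernet1/0/1\n description uplink\n"
iosxe_config = build_cvac_config(bootstrap_cfg, startup_cfg)
```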

I can make this change for csr1kv as well, but can't guarantee it'll work for super old (possibly EOL?) versions until I get an image to test.

For some reason CVAC doesn't work at all on XRv/XRv9k.

@tjbalzer I wonder if you can test cat8k and cat9k again :)

EDIT: Technically all csr1kv versions are 'EOL'.. rather I meant old pre 16.x versions.

@tjbalzer

CVAC is nice and much quicker than applying the configs line by line.

@kaelemc cat8kv and cat9kv tests for all cases successful

@tjbalzer

tjbalzer commented Dec 23, 2024

For SROS there is an env var DEBUG_SCRAPLI which controls whether we show the result of each command or not; by default it's disabled. In your opinion would something like this be better for the Cisco nodes too, or should we hide the channel input logging from Scrapli instead?

Good question. To be honest, I don't care that much, it was just an observation that the log is chattier than before. I like the 'launch' part where you can see CONFIG/RESULT, I don't need the 'sync_channel INFO sending channel input...' part.
If you want to change the behavior, I would suggest the env var approach via DEBUG_SCRAPLI.
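One way the env var approach could be wired up is sketched below. The helper names and the accepted truthy strings are assumptions; the sketch relies on Scrapli emitting its messages under the "scrapli" logger namespace, so raising that logger's level would hide the INFO-level channel input lines unless DEBUG_SCRAPLI is set.

```python
import logging
import os


def debug_scrapli_enabled(env=os.environ):
    """Interpret DEBUG_SCRAPLI as a boolean; disabled unless explicitly set."""
    return env.get("DEBUG_SCRAPLI", "false").strip().lower() in ("1", "true", "yes")


def configure_scrapli_logging(env=os.environ):
    """Show Scrapli's INFO messages only when DEBUG_SCRAPLI is enabled."""
    level = logging.INFO if debug_scrapli_enabled(env) else logging.WARNING
    logging.getLogger("scrapli").setLevel(level)
    return level


configure_scrapli_logging()
```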

@tjbalzer

I can make this change for csr1kv as well, but can't guarantee it'll work for super old (possibly EOL?) versions until I get an image to test.

@kaelemc I found a csr1kv 3.17 image from the good ol' days and it works with vrnetlab/clab. Could be used for testing.

On the other hand, is it really needed? vrnetlab support for the csr1000v was never guaranteed (only tested for 17.3) and IMHO the pre 16/17 versions are not used that much in today's containerlab topologies (but I might be wrong).

For some reason CVAC doesn't work at all on XRv/XRv9k.

@kaelemc Some comments about CVAC support in XRv9k can be found here

@kaelemc
Author

kaelemc commented Dec 23, 2024

@tjbalzer Thanks for testing, appreciate it a lot.

I agree as well regarding the CSR. As of this year csr1kv is no longer supported at all by Cisco, and the IOS-XE 3S versions have been EOS for many years now too. They are missing many features, which, as you said, probably makes them irrelevant in most clab use cases.

But if CVAC works for the old version then we can take it as a win. If not, we can add some logic to manually enter config (or just drop support for these old versions? @hellt)

Some comments about CVAC support in XRv9k can be found here

Thanks, to give some more context I was already following this document. I tried cdrom and usb to load the CVAC config but it simply won't get past the 'welcome to XRv9k' message and will start throwing LXC errors. It's not my first time trying to get CVAC working on xrv9k either, but I have a few more things to try before I move on from it.

If CVAC doesn't play nice, I was thinking of copying the SROS approach (#272). A little bit of editing of the tc rules allows access to the tftp server. I'd prefer if the config could be loaded like that, so it's just a few commands.
