CDK-ify the MLOps template for easier reusability? #211

Open
athewsey opened this issue Nov 23, 2022 · 0 comments

Thanks team for the nice work on the ml_ops solution guidance! There are some good tools there and I think I see where the overall philosophy is coming from in terms of main user persona & skills/etc.

To be clear up-front: I agree it's valuable and important to be able to deploy this without setting up any local tooling (CDK/Make/Git/SAM/etc).

However, while the initial user experience is good, I think a CDK refactor could improve the developer/engineer customization experience without sacrificing that initial experience.

My reasoning is:

  1. (IMO) Writing infrastructure as CDK offers tech users faster iteration and easier code re-use than working directly with CFn/SAM, and particularly vs all-inline CFn templates:
    1. Modularity is easier with the extra layer of abstraction that CDK constructs provide. In CFn we can nest stacks, but the routes for passing information between stacks are pretty narrowly scoped (stack outputs and exports). With CDK, we can define component hierarchies smaller than the stack level, with nicely defined input dependencies, that can still share actual concrete CFn resources. This makes it much easier to take a component out of one CDK app and re-use it in another than to figure out which resources are needed to extract a particular section from a CFn stack.
    2. (At least for my IDEs) Maintaining ASL JSON and (Python) Lambda code within the template itself renders IDE tooling pretty useless: syntax highlighting and validation don't work, which means we're more likely to make silly mistakes like syntax errors or references to undefined variables, and not catch them until we've waited for the CFn stack to (re)deploy. Of course, SAM can also externalize Lambda and SFn code nicely.
    3. (At least for my IDEs) Autocomplete and validation tooling is way better for CDK than CFn: having typed construct classes means seeing hints as you type for the various options available, which cuts down the time spent referring back and forth to the docs. Some services also have nice high-level constructs that are significantly easier to use, like S3 buckets that automatically empty and delete themselves.
  2. Maintaining the code in CDK does not mean the deploying user has to be CDK-aware or set up dev tools:
    1. Under certain constraints, CDK can still synthesize to a plain CloudFormation template without any assets, which can be published and deployed by users.
    2. For arbitrary CDK, we can publish a 'wrapper' CFn stack something like this example, which just creates a CodeBuild project and has CodeBuild install the dependencies to run the full CDK synth/deploy.
    3. Other middle-ground options exist, e.g. staging pre-built templates and S3 assets to cross-region-replicated buckets like doc-example-bucket-us-east-2, doc-example-bucket-ap-southeast-1, etc.
  3. ...So if we wrote nice, modular CDK MLOps constructs for Forecast, it could be easier for engineers to customize their own solution around them, but we could still offer easily deployable pre-built stack(s).
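
To make the 'wrapper' idea in 2.2 a bit more concrete, here is a rough, dependency-free TypeScript sketch that builds such a template as a plain object. It is an illustration under stated assumptions, not code from this repo: the function name `buildWrapperTemplate`, the logical IDs `BuildRole` and `DeployProject`, the repo URL, the stack name, and the choice of build image are all hypothetical. It also leaves out the piece that actually starts the build on stack creation (typically a small custom resource), and the admin policy is a placeholder that a real template should scope down.

```typescript
// Hypothetical sketch: a plain CloudFormation template (as an object) for a
// "wrapper" stack whose only job is to run the CDK deploy inside CodeBuild.
// All names, the repo URL, and the build image are illustrative assumptions.

interface WrapperOptions {
  repoUrl: string;      // Git URL of the CDK app (assumed publicly cloneable)
  cdkStackName: string; // name of the CDK stack to deploy
}

interface CfnResource {
  Type: string;
  Properties: Record<string, unknown>;
}

interface CfnTemplate {
  AWSTemplateFormatVersion: string;
  Description: string;
  Resources: Record<string, CfnResource>;
}

function buildWrapperTemplate(opts: WrapperOptions): CfnTemplate {
  // Inline buildspec: fetch the source, install dependencies, then deploy.
  const buildSpec = [
    'version: 0.2',
    'phases:',
    '  install:',
    '    commands:',
    `      - git clone --depth 1 ${opts.repoUrl} app`,
    '  build:',
    '    commands:',
    `      - cd app && npm ci && npx cdk bootstrap && npx cdk deploy ${opts.cdkStackName} --require-approval never`,
  ].join('\n');

  return {
    AWSTemplateFormatVersion: '2010-09-09',
    Description: 'Wrapper stack: CodeBuild project that deploys a CDK app',
    Resources: {
      // Role that CodeBuild assumes while running the deploy.
      BuildRole: {
        Type: 'AWS::IAM::Role',
        Properties: {
          AssumeRolePolicyDocument: {
            Version: '2012-10-17',
            Statement: [
              {
                Effect: 'Allow',
                Principal: { Service: 'codebuild.amazonaws.com' },
                Action: 'sts:AssumeRole',
              },
            ],
          },
          // Placeholder only: far too broad for real use, scope this down.
          ManagedPolicyArns: [
            { 'Fn::Sub': 'arn:${AWS::Partition}:iam::aws:policy/AdministratorAccess' },
          ],
        },
      },
      // The project itself: no source or artifacts, just the inline buildspec.
      DeployProject: {
        Type: 'AWS::CodeBuild::Project',
        Properties: {
          ServiceRole: { 'Fn::GetAtt': ['BuildRole', 'Arn'] },
          Artifacts: { Type: 'NO_ARTIFACTS' },
          Environment: {
            Type: 'LINUX_CONTAINER',
            ComputeType: 'BUILD_GENERAL1_SMALL',
            Image: 'aws/codebuild/standard:7.0',
          },
          Source: { Type: 'NO_SOURCE', BuildSpec: buildSpec },
        },
      },
    },
  };
}

// Print the template as JSON, ready to save and publish as a one-click stack.
console.log(
  JSON.stringify(
    buildWrapperTemplate({
      repoUrl: 'https://example.com/forecast-mlops-cdk.git', // hypothetical
      cdkStackName: 'ForecastMLOpsStack',                     // hypothetical
    }),
    null,
    2,
  ),
);
```

The point is that the published template stays tiny and asset-free, so the deploying user never touches CDK, while the real infrastructure lives in the CDK app that CodeBuild clones and deploys.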

If we used TypeScript as the language for writing these constructs, then they could be cross-compiled (via jsii) to other languages (e.g. Python) like the general AWS Solutions Constructs are. If they were published as an actual library on npm/PyPI/etc, developers could import the low-level constructs without even forking/copying code from this repo.

Alternatives:

I do know there's also the amazon-forecast-mlops-pipeline-cdk sample from just a few months ago, but I was curious whether it'd make sense to bring this 'default' samples repo for Forecast into the CDK world too.
