Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic Data Provider #167

Closed
1 of 2 tasks
kisst opened this issue Mar 7, 2023 · 17 comments
Closed
1 of 2 tasks

Generic Data Provider #167

kisst opened this issue Mar 7, 2023 · 17 comments
Labels
feature New feature or request resource Related to a resource type rfc Request For Comments

Comments

@kisst
Copy link

kisst commented Mar 7, 2023

What type of extension are you looking for?

Other

Describe the extension you'd like to request

Loosely coupled infra has it's advantages, but also pitfalls, however other IaC solutions go around the problem with data provider.

Used when a large solution is split into multiple stack, for either size, complexity or segregation of duty the glue between Stacks currently can be cfn-export, SSM, or manual via CFN parameters, the better solution is to have the lookup based on given filters and have a virtual resource which then can be used with Ref or GetAtt.

For example:
If a CFN Stack needs to be deployed to a VPC ( non default ) then the info about the VPC should be self contained within the Stack, based on given filters the VPC would be available to be Ref or GetAtt, if based on the given filters, the given expected is returned.
For example searching for VPC tagged "mycorp" in the current region, would return the VPC, but a query for a list, with filter subnet's with tag "private" can return one or more subnets.

Describe the solution you'd like

A generic version of my initial attempt https://github.com/kisst/CFN-CR-Dataprovider made available for masses.

Additional context

No response

Is this something that you'd be interested in working on?

  • 👋 I may be able to implement this feature request

Would this feature include a breaking change?

  • ⚠️ This feature might incur a breaking change
@ericzbeard ericzbeard added feature New feature or request resource Related to a resource type labels Mar 9, 2023
@ericzbeard
Copy link
Contributor

ericzbeard commented Mar 9, 2023

I like this idea! Is there any reason for there to be a single generic data provider? Why not have each Lambda (or each resource type handler) stand alone?

@kisst
Copy link
Author

kisst commented Mar 13, 2023

Originally this solution come about as a CFN CR Lambda which had to be deployed to each account, so aggregation simply serv the purpose of having smaller deployment footprint, as the benefit side is only more customisable return dict structure and filters, and an even more limited, already read-only permission.

But for registry it might make sense to break it down, just need to figure out a way to generate the resource specific extensions, as if it's all individually maintained then it would be probably with limited resource type only, due to large maintenance overhead, however that might be OK as there is a rather limited core service list where this is currently a pain-point and would be useful, R53 zones, VPC, Subnets, SG's,

@ericzbeard
Copy link
Contributor

I think I'd rather see it broken down into separate resources. AwsCommunity::Route53::DataProvider, etc.

@kisst
Copy link
Author

kisst commented Mar 20, 2023

In that case it would make sense to have identical return data structure as the default CFN resource, so for example for a Route53 Domain AWS::Route53::HostedZone Ref would give back the logical ID, and via GetAtt you could get the Id and the Nameservers like so https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-route53-hostedzone.html#aws-resource-route53-hostedzone-return-values.

Hence when used in code, the structure would be almost the same, so instead of

Type: AWS::Route53::HostedZone
Properties: 
  HostedZoneConfig: 
    HostedZoneConfig
  HostedZoneTags: 
    - HostedZoneTag
  Name: String
  QueryLoggingConfig: 
    QueryLoggingConfig
  VPCs: 
    - VPC

it would be something like:

Type: AwsCommunity::Route53::DataProvider
Properties: 
  HostedZoneConfig: 
    HostedZoneConfig
  HostedZoneTags: 
    - HostedZoneTag
  Name: String
  QueryLoggingConfig: 
    QueryLoggingConfig
  VPCs: 
    - VPC

Where the fields which are originally there for the create act as the filter.

If you can help to understand the repo structure and where should I stick it in I can prep a single resource example, and then we can pick it up from there.

In the meantime a question on supported error handling strategies. I believe this behaviour should be one of the input parameter, if not individually set for each type of potential errors:

  • looking for a single resource, not finding any
  • looking for a single resource, getting more then one
  • looking for a list of resources , finding a single one
  • looking for a list of resources , not finding any

Apart from the first and last, the other two can have different resolution, you might want to pick randomly if you are looking for just one but the filter match more then one, or you might want to just error, same if you get only one when asking for a list, you might just format it into a single item list, or might error on it, so how much flexibilty should be for the end-user to handle the errors, and how to give that freedom, for example:

Properties: 
  ErrorHandeling:
    MultiFind: PickRandom

@ericzbeard
Copy link
Contributor

Each of these would go into the resources/ directory, for example, resources/Route53_DataProvider.

Error handling is an interesting case. For finding one of a list, I think that would just be a single element list that's returned, unless it should always return multiple values. We should only go to the trouble of configuring for these cases if those are valid use cases - in most cases I would assume that if you don't get back the expected number of elements, it should be an error.

@kisst
Copy link
Author

kisst commented Mar 21, 2023

Ok just reviewed the repo structure and the contribution docs, and I might need to untick the "I can help with this box"

I used pure CFN a lot, and pretty comfortable with it, but don't really have any of an experience abstractions layers above above it, like Tropo, CDK or like this case SAM. As far as I understand based on https://github.com/aws-cloudformation/community-registry-extensions/blob/main/CONTRIBUTING.md

In order to submit a resource to the registry, you have to use the AWS Serverless Application Model (SAM)
So give me some time to get my head around the extra frills I need to wrap the code into SAM, and then I can tell if I can help or not.

I tried SAM a while ago, but that time I seen no value just added complication, and lack of portability, so I need to start again from scratch.

Unless if there is some means to translate backwards as well between CFN and SAM, as SAM renders to CFN too, which case I am happy to do it with CFN.

@ericzbeard
Copy link
Contributor

Ok, no worries if you can't do the development yourself, we like the idea so we'll try to find someone to work on it. You don't actually have to be a SAM expert, it's only used to mock Lambda locally to facilitate contract testing for the resource handlers. The cfn init command configures everything for you.

@kisst
Copy link
Author

kisst commented Mar 22, 2023

So I tried the cfn init and got a whole folder structure, which is waaay more then what I expected, so just so that I touch the right files, let me try to map my understanding of the purpose of these files.

.
├── awscommunity-route53-dataprovider.json
├── docs
│   ├── memo.md
│   ├── README.md
│   └── tag.md
├── example_inputs
│   ├── inputs_1_create.json
│   ├── inputs_1_invalid.json
│   └── inputs_1_update.json
├── README.md
├── requirements.txt
├── resource-role.yaml
├── rpdk.log
├── src
│   └── awscommunity_route53_dataprovider
│       ├── handlers.py
│       ├── __init__.py
│       └── models.py
└── template.yml

<resource-type>.json: Metadata about the resources ?
docs [folder]: collection of documentations
docs/README.md: the root of the documentation - unsure how it's different from README.md
docs/{memo,tag}: No clue ?
example_inputs[folder]: collection of test payloads in "input_<some number?>_{create,update,delete, invalid}.json"
README.md: the root of the documentation
requirements.txt: as I choose python as the lang, that's python requirements
resource-role.yaml: The pure CFN which contains only the IAM role which then will need for the Lambda ?
rpdk.log: some log no clue
src/awscommunity_route53_dataprovider [folder] : contains the Lambda code itself
src/awscommunity_route53_dataprovider/__init__.py : make it a python module for some unknown reason
src/awscommunity_route53_dataprovider/{handlers, models}.py: some generated code no clue
template.yml: SAM code to generate CFN out of it ( shouldn't be modified ? )

If these presumptions above are right, then my understanding is that I would need to write the Lambda code somewhere in the src folder, just not sure where, and I would need to create test cases into the example_inputs, and add the relevant R53 describe calls to the resource-role.yaml and documentation would go to somewhere... and then what's next ?
How would all those pieces glued together ?

@ericzbeard
Copy link
Contributor

Here are the docs for the cfn CLI: https://docs.aws.amazon.com/cloudformation-cli/latest/userguide/what-is-cloudformation-cli.html

The two main things you need to do are to model the resource in the schema file, <resource-type>.json, and then implement the handlers in src/<resource_type>. Most of the other files are auto generated.

@benbridts
Copy link
Contributor

benbridts commented Mar 22, 2023

Would it make sense to use the CloudControl API for this? eg:

Resources:
  HostedZoneName:
    Type: AwsCommunity::CloudControlApi::ListResourcesQuery
    Properties: 
      TypeName:  AWS::Route53::HostedZone
      ResourceModel: {}  # optional, json dict
      Query: ResourceDescriptions[0].Properties.Name
Outputs:
  HostedZoneName:
    Value: HostedZoneName.Result

or if you know the identifier:

Resources:
  HostedZoneName:
    Type: AwsCommunity::CloudControlApi::GetResourceQuery
    Properties: 
      TypeName:  AWS::Route53::HostedZone
      Identifier: !Ref HostedZoneId
      Query: ResourceDescription.Properties.Name
Outputs:
  HostedZoneName:
    Value: HostedZoneName.Result

downsides:

  • You'd need a Query Resource for every value from the same resource
  • I don; think it makes sense to implement a list handler
  • required permissions are tricky (you'd need read-only on everything). A workaround here might be to pass a Role that the handler will use to talk with CloudControl. That would also open up cross-account use cases

upsides:

  • only one/two resources can cover a wide array of services

neutral:

  • I don't think its possible/easy to have contract tests that pass with a resource that is read-only (delete after delete should return NotFound, and that is not possible without saving state somewhere)

Edit:
In case the List only returns identifiers (or a limited set of properties), you can combine both:

Resources:
  HostedZoneId:
    Type: AwsCommunity::CloudControlApi::ListResourcesQuery
    Properties: 
      TypeName:  AWS::Route53::HostedZone
      ResourceModel: {}  # optional, json dict
      Query: ResourceDescriptions[0].Identifier
  HostedZoneName:
    Type: AwsCommunity::CloudControlApi::GetResourceQuery
    Properties: 
      TypeName:  AWS::Route53::HostedZone
      Identifier: !GetAtt HostedZoneId.Result
      Query: ResourceDescription.Properties.Name
Outputs:
  HostedZoneName:
    Value: HostedZoneName.Result

@ericzbeard
Copy link
Contributor

That's a good point about contract testing. We might need to create a dummy resource under the hood.

@ericzbeard ericzbeard added the rfc Request For Comments label Mar 22, 2023
@ericzbeard
Copy link
Contributor

This issue might be covered by #190

@kisst
Copy link
Author

kisst commented Apr 25, 2023

Agree, looks relevant, will test it next week, and if covers the use-case will close the ticket.

@kallu
Copy link
Contributor

kallu commented Apr 26, 2023

Thanks for the effort put into this so far. I tested the current version of Lookup and it does what the docs says 👍
This would be even more useful if I could use GetAtt to retrieve any resource property, not just resource ID. Just like I can now use properties in lookup expression to select the resource. E.g. for VPC I could get any property available for Cloud Control API. Sample use-case here would be creating a security group rule that would allow traffic from CidrBlock.

$ aws cloudcontrol get-resource --type-name AWS::EC2::VPC --identifier vpc-deadbeef12345678 | jq '.ResourceDescription.Properties | fromjson'
{
  "VpcId": "vpc-deadbeef12345678",
  "InstanceTenancy": "default",
  "CidrBlockAssociations": [
    "vpc-cidr-assoc-deadbeef12345678"
  ],
  "CidrBlock": "10.0.0.0/21",
  "DefaultNetworkAcl": "acl-deadbeef12345678",
  "EnableDnsSupport": true,
  "Ipv6CidrBlocks": [],
  "DefaultSecurityGroup": "sg-deadbeef12345678",
  "EnableDnsHostnames": true,
  "Tags": [
    {
      "Value": "MyVPC",
      "Key": "Name"
    }
  ]
}

@mrinaudo-aws
Copy link
Collaborator

Thank you for the feedback, @kallu ! On the note on the current version, I wrote an unrelated PR update today, that was merged a few minutes ago, with 2 fixes (see the merged PR's notes).

There was similar feedback from @benbridts on the initial PR that I submitted; I had thought of those properties too (see my comment in reply on that PR). If this resource type would emit those fields in one additional output field containing the whole JSON string with properties (as it is not possible to map all the properties for all the supported query target resources to discrete fields in this resource type's schema), one could use Fn::GetAtt to fetch this whole JSON data field, but we'd also need to find a way to consume a specific value from that JSON structure - either with this resource, or with a new one(s) like @benbridts mentioned.

@kisst
Copy link
Author

kisst commented Apr 27, 2023

@mrinaudo-aws For the Fn::GetAtt key mapping my solution was https://github.com/kisst/CFN-CR-Dataprovider/blob/master/lambda_function.py#L29-L40 basicly use dot as a divider, does not solve it all but most.

@mrinaudo-aws
Copy link
Collaborator

Hi @ericzbeard @kallu @kisst @benbridts - I have worked on a PR, merged today (#223) that adds the ability to the lookup resource type to reference the whole JSON-formatted, property-related output from Cloud Control API for a given resource type lookup target that is supported by Cloud Control API. An example output excerpt for resource properties is: {"VpcId":"vpc-111222333","InstanceTenancy":"default",[...]. Usage and upgrade path from the previous version of the resource type is available in the README.md file.

As I mentioned previously, I think it would be desirable to have a way to natively consume a value from JSON data such as the above, given a key. I have created this issue in the cfn-language-discussion repository with a feature request: please feel free to add to it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request resource Related to a resource type rfc Request For Comments
Projects
None yet
Development

No branches or pull requests

5 participants