There are some tools to automate your infrastructure:
- Ansible - Orchestration engine to automate configuration
- AWS CLI - AWS Command Line Interface
- AWS CDK - AWS Cloud Development Kit
- AWS CloudFormation - Template language to model and provision AWS resources
- OpenTofu - Fork of Terraform that is open-source, community-driven, and managed by the Linux Foundation
- Pulumi - Infrastructure as Code to provision and manage any cloud, infrastructure, or service
- Terraform - Infrastructure as Code to provision and manage any cloud, infrastructure, or service
- terraformer
- former2
- AWSConsoleRecorder
- AWS CloudFormation: Bringing existing resources into CloudFormation management - requires creating a CF template manually before the import
- Figure out which factors are important to the organization (it's rarely the "cheapest possible" or "latest hype" option)
- If it is an improvement to an existing product, think about the existing project stakeholders and how the change will affect them
- If it is a new project, think about how the product will be used in production
- Define and compare several approaches, and follow the KISS principle
- You might want to follow the Well-Architected Framework, but take it with a grain of salt and apply the YAGNI principle
- Amazon’s This Is My Architecture
- AWS Solutions Implementations
- The Amazon Builders' Library
- AWS Cloud Adoption Framework
- Take Care About Costs
- AWS services in plain English
- AWS Well-Architected Lenses
Disaster Recovery (DR) planning is hard and should be treated as a process, not a one-time activity. Here are some steps that can help with DR planning:
- Collect initial requirements such as SLA documents and ISMS documentation for your critical workloads
- Define what counts as a disaster. It can be a natural disaster, a hardware failure, or an attack, and it can affect your on-premises data center as well as the cloud
- Determine the Recovery Time Objective (RTO) and Recovery Point Objective (RPO) of your critical workloads
- Talk with the business to balance costs against the risks of the DR plans
- If you have the budget, you can look at commercial tools like CloudEndure
- Useful materials:
- Use chaos engineering to automate testing of platform resiliency
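The RTO/RPO step can be made concrete with a small check that compares each workload's objectives against what the current setup can deliver. This is only a minimal sketch: the `Workload` class, its field names, and the `dr_gaps` helper are hypothetical and not part of any AWS tooling, and the numbers you feed in would come from your own SLA documents and DR drills.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List


@dataclass(frozen=True)
class Workload:
    """A critical workload with its recovery objectives and current capabilities.

    All field names are hypothetical and exist only for this illustration.
    """
    name: str
    rto: timedelta              # maximum tolerable downtime
    rpo: timedelta              # maximum tolerable window of data loss
    backup_interval: timedelta  # how often data is backed up or replicated
    restore_time: timedelta     # restore duration measured in the last DR drill


def dr_gaps(workload: Workload) -> List[str]:
    """Return human-readable gaps between objectives and current capabilities."""
    gaps = []
    # Worst case, everything written since the last backup is lost.
    if workload.backup_interval > workload.rpo:
        gaps.append(
            f"{workload.name}: backup interval {workload.backup_interval} "
            f"exceeds RPO {workload.rpo}"
        )
    # The measured restore duration has to fit inside the tolerated downtime.
    if workload.restore_time > workload.rto:
        gaps.append(
            f"{workload.name}: restore time {workload.restore_time} "
            f"exceeds RTO {workload.rto}"
        )
    return gaps


if __name__ == "__main__":
    orders = Workload(
        name="orders",
        rto=timedelta(hours=1),
        rpo=timedelta(minutes=15),
        backup_interval=timedelta(hours=24),
        restore_time=timedelta(minutes=30),
    )
    for gap in dr_gaps(orders):
        print(gap)
```

A list of gaps like this can serve as input for the cost-versus-risk conversation with the business: each gap is either closed with more frequent backups or faster restores, or accepted by relaxing the objective.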
Some services offer good scenarios for DR planning, while others make such plans complicated. When a system requires a DR plan, you must analyze the services it uses and how they will support the plan.
Examples of services that have good DR stories:
- DynamoDB - Allows backups of all data (with moderately easy restoration) and provides global tables (replication within seconds)
- EC2 - Same instance types are available across multiple regions, EBS volumes can be backed up and restored
- Route 53/CloudFront - Enables you to switch regions within seconds/minutes
- Kinesis - Cross-region replication is straightforward to implement yourself
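As an illustration of the DynamoDB item above, an on-demand backup can be requested with a single `create_backup` call. The sketch below assumes a client that behaves like `boto3.client("dynamodb")`. The helper name and the timestamped backup naming scheme are assumptions made for this example, not a recommended convention.

```python
from datetime import datetime, timezone


def create_on_demand_backup(client, table_name: str) -> str:
    """Create an on-demand backup of a DynamoDB table and return the backup ARN.

    `client` is expected to behave like boto3.client("dynamodb").
    """
    # Timestamped name so repeated runs do not collide (naming scheme is an assumption).
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    response = client.create_backup(
        TableName=table_name,
        BackupName=f"{table_name}-{stamp}",
    )
    return response["BackupDetails"]["BackupArn"]
```

With boto3 installed and credentials configured, this could be called as `create_on_demand_backup(boto3.client("dynamodb"), "orders")`, where `orders` is a hypothetical table name. Passing the client in as a parameter keeps the helper easy to exercise with a stub in tests.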
Examples of services that provide bad DR stories:
- Cognito - Service stores users within a region, authentication endpoints are region-bound, user replication or backups with password hashes are not available
- QLDB - Amazon QLDB does not support a backup and restore feature...
AWS maintains SDKs for many programming languages:
- C++
- AWS SDK for C++
- Go
- Java
- JavaScript
- AWS SDK for JavaScript
- AWS Mobile SDK for JavaScript
- AWS IoT Device SDK for JavaScript
- .NET
- Node.js
- PHP
- AWS SDK for PHP
- Python
- Ruby
Organizations that operate on regulated data, such as health care information or financial data, are often very reluctant to move to a public cloud. They may be biased toward thinking that a public cloud is less secure than an in-house data center, that the cloud will not reduce development and operations costs, and that overall it won't enable the organization to have more products in its portfolio.
Public clouds serve a wide range of customers and provide solutions for companies that focus on preventive controls. The cloud helps drive a culture of innovation, comes with the widest set of automation tools, and overall reduces the costs of technology teams.
Useful resources:
- AWS re:Invent 2019: [REPEAT 2] The fundamentals of AWS cloud security (SEC205-R2)
- Best practices for securing sensitive data in AWS data stores
- AWS re:Invent 2018: Architecting for Healthcare Compliance on AWS (HLC301-i)
- AWS re:Invent 2019: National Australia Bank: Automating governance in Financial Services (SEC352)
- Financial Services Industry Lens – AWS Well-Architected
- Amazon VPC console wizard configurations
- AWS Client VPN - Scenarios and examples
- Transit Gateway - (Paul Casey)
- autovpn for on demand disposable OpenVPN endpoints
- AWS Systems Manager Session Manager/Bastillion to connect to EC2 instances
There are some frameworks that help deploy serverless applications:
- AWS Serverless Application Model (AWS SAM)
- Middy for Node.js
- Claudia.js for Node.js
- Chalice for Python
- Zappa for Python
- Sparta for Go
- Bref for PHP
- Serverless Framework
The serverless world is growing, and there are many interesting articles and repositories:
- Serverless Data Pipeline
- Serverless Reference Architecture
- Overcoming Serverless limitations
- The Stanford Builder using AWS Lambda - Amazing project
- AWS Lambda abuse
- Load testing a web application’s serverless backend
- AWS Lambda powertools
- Building COBOL applications on Lambda
- Complete Guide to Lambda Triggers and Design Patterns (Part 1)
- Managing backend requests and frontend notifications in serverless web apps
- AWS Lambda offline development with Docker
- Serverless: a backend thing that gives superpowers to frontend developers
- CDK Patterns
- AWS Architecture Center - Architecture Best Practices for Serverless