-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add configuration for deploying a cloudfront distribution for the veda-backend #229
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, do have an important broader question regarding our usage of CloudFront though
routes/infrastructure/construct.py
Outdated
bucket_name=veda_route_settings.stac_browser_bucket, | ||
) | ||
|
||
no_cache_policy = cf.CachePolicy( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use the CachingDisabled
managed policy instead ? Looks pretty similar ?
Longer note :
So we're turning off any caching Cloud Front is doing with that policy. We've actually been doing that on MAAP as well (albeit manually on the console, so I don't have any code to link), after noticing that STAC metadata wasn't getting updated frequently enough on the STAC browser.
However, I'm still puzzled that I am doing this because, isn't the point of CloudFront to cache the content in the closest edge point to the user and decrease latency thanks to that ? If we turn that off entirely, why are we using CloudFront ?
In the first place I remember that we started to use CloudFront because that's what the AWS docs tell us to do for serving static content stored in S3 over HTTPs. However, is the CloudFront layer really needed for this, can't we just create a route53 record that directly points to the bucket ?
Note : those are mainly questions, I am not suggesting we stop using CloudFront
routes/infrastructure/construct.py
Outdated
else None | ||
) | ||
|
||
self.distribution = cf.Distribution( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I right that with this set of behaviors, the mapping is going to be, assuming the domain is dev-stac.delta-backend.com
:
https://dev-stac.delta-backend.com
-> browserhttps://dev-stac.delta-backend.com/api/stac
STAC APIhttps://dev-stac.delta-backend.com/api/raster
raster API
veda_stack, | ||
"routes", | ||
raster_api_id=raster_api.raster_api.api_id, | ||
stac_api_id=stac_api.stac_api.api_id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One way to address the API Gateway url not resolving was to use the restapi_id in the url pattern: https://{restapi_id}.execute-api.{region}.amazonaws.com/ to access both the stac and raster apis.
😱 very nice trick. Spent so much time finding a workaround for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to spend a bit more time looking at this today (particularly looking at how the ingest API is working with the CF distribution in this stack). The failing action will pass if the environment secret is updated with the ingest_url
and stac_browser_bucket
values.
@ividito Also the routes construct isn't configured as as optional (similar to how the domain construct works). So the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a couple comments on the config
routes/infrastructure/config.py
Outdated
description="Certificate’s ARN", | ||
) | ||
|
||
using_mcp_acct: Optional[bool] = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using_mcp_acct
is not used? Also, not sure about the intent but possibly we could be inferring this info from whether or not VEDA_PERMISSIONS_BOUNDARY_POLICY_NAME
is configured?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was an old config when I copied over Slesa's original cloudfront code. It's not used anywhere, can be removed.
@@ -0,0 +1,52 @@ | |||
"""Settings for Cloudfront distribution - any environment variables starting with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a line to the readme advanced config about this? (sorry if this is a repeat comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
I'm kinda surprised that this is the change that made it work. We have the same setup in ghgc, but it still doesn't resolve the urls 🤔 |
routes/infrastructure/construct.py
Outdated
no_cache_policy = cf.CachePolicy( | ||
self, | ||
"no-cache-policy", | ||
default_ttl=Duration.seconds(0), | ||
min_ttl=Duration.seconds(0), | ||
max_ttl=Duration.seconds(0), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's already no cache policy defined in cdk,
cf.CachePolicy.CACHING_DISABLED
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@slesaad I did some experimentation last week and realized that for the cloudfront stac api to work you need the trailing slash... http://staging.ghg.center/api/stac/ Without the trailing slash you get 307 temporary redirect the goes to api gateway url |
@smohiudd yeah, but the URLs in the catalogs are still the same api gateway urls |
pydocstyle fixes format isort changes optional config format config changes format config changes construct change
ead8e14
to
dccaf49
Compare
Thanks @vincentsarago! I got the host header working in APIgateway using the parameter mapping. Doesn't look like we need to go down the ALB route. |
description="Boolean if Cloudfront Distribution should be deployed", | ||
) | ||
|
||
# STAC S#3 browser bucket name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- this is the name of an existing bucket?
- why is the default an empty string instead of None (same for ingest url below)
- should we add addition info in description to explain that an existing s3 bucket is required if using cloudfront
also: typo in the comment S#3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Yes, for now it's the name of an existing bucket. But that may change depending on how the stac browser construct @emileten is working on is set up
- Yes you're right they should
None
- Agreed
class Config: | ||
"""model config""" | ||
|
||
env_file = ".env" | ||
env_prefix = "VEDA_STAC_" | ||
env_prefix = "VEDA_" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a breaking change because it causes existing implementations to update the secrets stored in the secrets manager? I think we should address changing <SCOPE>_VARNAME
across all constructs if we want to break the pattern (especially because it could be useful to share a variable that should be the same in the scopes of different constructs).
Is the change needed now, though, or can we keep scoping our environment variables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like staging has no VEDA_STAC_
vars in the secrets store so this comment can be downgraded to a nit
if veda_stac_settings.domain_hosted_zone_name | ||
else None, | ||
) | ||
), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is awesome + I wonder if the prefix can be managed here also (in the future)?
Question: I need to try this out but to me it seems that if domain_hosted_zone_name is None, then "host" will be overwritten with None? Or does it just end up being a no-op?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to make it like a no-op if hosted zone wasn't present. I can double check if that's what it's doing
@smohiudd Glad it's working 🙏 , sorry for not seeing this issue earlier 😭 |
) | ||
|
||
# Certificate must be in zone us-east-1 | ||
domain_cert = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the certificate used in cloudfront must be in the us-east-1 region. Right now the arn is being passed manually. Should we be creating a new certificate in us-east-1 instead.
stac_api/infrastructure/construct.py
Outdated
construct_id, | ||
handler=lambda_function, | ||
parameter_mapping=( | ||
aws_apigatewayv2_alpha.ParameterMapping().overwrite_header( | ||
"host", | ||
aws_apigatewayv2_alpha.MappingValue( | ||
veda_stac_settings.domain_hosted_zone_name | ||
) | ||
if veda_stac_settings.domain_hosted_zone_name | ||
else None, | ||
) | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@smohiudd I realized we'd want to do this to the raster API too, because otherwise when we use /mosaic/register
, it gives back the API gateway URL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @slesaad have you tested that on GHGC yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes! - https://github.com/US-GHG-Center/ghgc-backend/pull/46/files
deployed to dev.ghg.center/api/raster
handler=veda_raster_function, | ||
parameter_mapping=( | ||
aws_apigatewayv2_alpha.ParameterMapping().overwrite_header( | ||
"host", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least temporarily, I think we need to make this assignment conditional based on whether or not cloudfront is used. It appears that the staging APIs will also be overwritten. Same for the stac api
[~] AWS::ApiGatewayV2::Integration raster-api/veda-backend-uah-staging-raster-api/DefaultRoute/raster-api rasterapivedabackenduahstagingrasterapiDefaultRouterasterapi20F21B19
└─ [+] RequestParameters
└─ {"overwrite:header.host":"delta-backend.com"}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a conditional for the parameter mapping
@smohiudd @anayeaye developmentseed/titiler#716 might be of interest. Basically when using an API Gateway prefix, it seems that you'll need to set a |
# What - When a custom host is configured, overwrite the host in the api integration (bring back the parameter mapping method introduced in #229) - Include origin_path in cloudfront ingest api behavior. Note: this path will not work until the root_path is changed to `api/ingest` in the ingest api - Domain construct now only generates custom subdomains for endpoints when configured (existing stacks must provide new configuration variable). _Note: indentation changes that wrap all subdomain constructs make the change appear larger than they really are. The change removes an unused shared subdomain construct and only generates api-specific subdomains when create subdomains configuration is true._ ## Environment variable changes ### New `VEDA_CUSTOM_DOMAIN` optional configuration to provide a complete url for host overrides in lambda APIs (i.e. `dev.somedomain.com`) ### Removed `VEDA_DOMAIN_SHARED_SUBDOMAIN` is no longer used (logic is managed with custom subdomains config below) ### New+Breaking `VEDA_DOMAIN_CREATE_CUSTOM_SUBDOMAINS` boolean defaults false. Existing deployments with custom subdomains for each raster and stac api endpoints need to set this variable to TRUE.
…cloudfront (#244) # What - adds optional configuration to deploy all endpoints with a custom root path (i.e. `/api/stac` and `/api/raster`) #229, #241 - adds support for using a cloudfront as a reverse proxy #229, #241, #245 - titiler-pgstac upgraded from 0.2.3 to 0.8.0 #239 - titiler upgrade for custom colormap configuration #243 # How tested A temporary test stack was deployed to confirm that raster-api works as expected and that the two-subdomain staging-stac and staging-raster pattern is still supported. The [pre-deploy diff action for this pr](https://github.com/NASA-IMPACT/veda-backend/actions/runs/6791564314/job/18463354484#step:11:919) confirms that the raster API lambda will be upgraded and no domain name changes will be caused.
Include Cloudfront Distribution construct in veda-back which should help with the STAC Browser construct feature.
One way to address the API Gateway url not resolving was to use the
restapi_id
in the url pattern:https://{restapi_id}.execute-api.{region}.amazonaws.com/
to access both the stac and raster apis.This branch has been test deployed as the
path-prefix-veda-backend
stack using thedelta-backend.xyz
domain.