Garden Build / Proxy Architecture Improvements #248

Open
1 of 5 tasks
davemcorwin opened this issue Jun 10, 2020 · 0 comments
Labels
migration (temp label for migrating issues to cg-team), squad-pages (Pages squad label)

davemcorwin commented Jun 10, 2020

As Federalist grows we have started to observe some bumps in the road with the garden build platform:

Disk space limits

We have a limited amount of disk space available during the build. This is currently 4GB, hopefully increasing to 6GB soon. Of this, our build container currently takes up about 2GB. The limit is a hard one imposed by cloud.gov and will most likely not increase further after the impending bump. In order to continue to serve our customers with larger sites, we need to reduce the footprint of our container.

Intermittent mystery failures

The frequency of these failures has varied over time, and while most of them are resolved by rebuilding, we still don't understand why they occur. There are no logs and the process itself does not stop.

Reduce Build times

Builds are inefficient because we start from scratch every build, including installing the specified runtime version (if not using the default) and all the project dependencies.

I think we can address these issues over the next 6 months by doing the following:

  1. Use smaller base docker images
    We currently have a single Docker container that includes Ruby, Node, and Python runtimes based on the default Debian images. Explore using "slim" or "alpine" base images to remove unnecessary bloat.

  2. (Done in "experimental" image) Package our build code as a static executable
    Currently our build containers also have to include the environment and dependencies for running our Python code in addition to our customers'. Package and install our build code as a single executable so that the Python runtime and its dependencies are not required. This may be possible now, but is complicated by the use of PyInvoke, which needs to be available at runtime, defeating the purpose of the static executable.

  3. (Done) Replace PyInvoke with Python's "subprocess" module
    In Python 3.5+, the "subprocess" module provides the necessary primitives for running commands while managing the environment, logging, and errors, without the extra layer of abstraction introduced by PyInvoke. We do not need the ability to run individual pieces of the build process from the command line, and requiring each command to run in a separate Python process places undesirable constraints on the architecture of the application. I'm hopeful that removing PyInvoke will give us greater visibility into the intermittent mystery errors.

I am planning on starting with Step 3 and iterating from there.
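
As an illustration of Step 3, here is a minimal sketch of running one build step directly with "subprocess" while controlling the environment, capturing output, and surfacing failures. The function name `run_step`, the logger name, and the step names are hypothetical and not taken from the actual build code.

```python
import logging
import os
import subprocess
import sys

logger = logging.getLogger("build")


def run_step(name, args, extra_env=None, cwd=None):
    """Run one build step in a child process and return its exit code.

    `name` and `extra_env` are illustrative parameters, not part of any
    existing Federalist API.
    """
    # Start from the current environment and layer step-specific values on top.
    env = {**os.environ, **(extra_env or {})}
    logger.info("starting step %s", name)
    try:
        proc = subprocess.run(
            args,
            cwd=cwd,
            env=env,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # interleave stderr with stdout
            universal_newlines=True,   # decode output to str
        )
    except OSError as exc:
        # The command could not be started at all (for example, not found).
        logger.error("step %s could not start: %s", name, exc)
        return 127
    # Forward every line of output so nothing the step printed is lost.
    for line in proc.stdout.splitlines():
        logger.info("[%s] %s", name, line)
    if proc.returncode != 0:
        logger.error("step %s exited with code %d", name, proc.returncode)
    return proc.returncode


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    run_step("hello", [sys.executable, "-c", "print('hello from a build step')"])
```

Because the exit code and all output pass through a single function, a step that fails silently should at least leave a log line behind, which is the visibility gain hoped for above.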

Reduce publish times

Problem

Originally, in order to determine the minimum set of files to add, remove, or update, publishes compared the hashes of built files with those obtained from the S3 SDK's "list objects" call. However, once users could configure custom Cache-Control headers, this check was insufficient because the custom headers are stored as object metadata and are not part of the file hash. In order to push only changes, we would have had to fetch each object individually to compare for differences. Instead of checking, we now push all of the files, which results in longer build times.

Solution

  • apply custom Cache-Control headers in federalist-proxy
  • do not apply Cache-Control headers to S3 objects
  • revert to diffing using the hashes or another more efficient method of diffing/publishing (e.g. aws s3 sync, Transfer, etc.)
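
The hash-based diff in the last bullet could look roughly like the sketch below. It assumes that the remote listing has already been reduced to a mapping of object key to hash and that those hashes are comparable to locally computed MD5 digests; S3 ETags are not always plain MD5 digests (multipart uploads are one exception), so that is an assumption. The names `PublishPlan`, `plan_publish`, and `md5_hex` are hypothetical.

```python
import hashlib
from typing import Dict, List, NamedTuple


class PublishPlan(NamedTuple):
    upload: List[str]     # new or changed keys to push
    delete: List[str]     # keys present remotely but no longer built
    unchanged: List[str]  # keys whose hash already matches


def md5_hex(data: bytes) -> str:
    """Hash file contents the same way for every built file."""
    return hashlib.md5(data).hexdigest()


def plan_publish(local: Dict[str, str], remote: Dict[str, str]) -> PublishPlan:
    """Compare key -> hash maps and return the minimal set of changes."""
    upload = sorted(k for k, h in local.items() if remote.get(k) != h)
    delete = sorted(k for k in remote if k not in local)
    unchanged = sorted(k for k, h in local.items() if remote.get(k) == h)
    return PublishPlan(upload, delete, unchanged)


if __name__ == "__main__":
    local = {"index.html": md5_hex(b"new body"), "app.js": md5_hex(b"js")}
    remote = {"index.html": md5_hex(b"old body"), "app.js": md5_hex(b"js"),
              "stale.png": md5_hex(b"png")}
    print(plan_publish(local, remote))
```

This only works as a complete check once Cache-Control headers are no longer stored on the S3 objects, since header-only changes never alter the content hash.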

References

Originally captured here.

Stop creating/publishing "Redirect Objects"

Once the recent changes to the proxy (access S3 over HTTPS, no longer use the website config) are deployed, the use case for "redirect objects" will be handled in the proxy and they can safely be removed from the publish process.

References

Originally captured here.

Custom 404 and index.html pages

Allow SPAs and other sites hosted on the platform to have branch-specific 404 and index pages. To be implemented via the proxy and federalist.json.

References

Originally captured here.

Garden build logging

Implement a log drain for garden build instead of writing logs to the db (e.g. fluentd, logstash, etc.).
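
One possible shape for the build side of this is sketched below: the build writes one JSON object per line to a stream (stdout by default) and leaves shipping to whatever drain is configured. That the drain would consume stdout is an assumption, and the field names and the `make_build_logger` helper are hypothetical.

```python
import json
import logging
import sys


class JsonLineFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # build_id is attached by the LoggerAdapter below (hypothetical field).
            "build_id": getattr(record, "build_id", None),
        }
        return json.dumps(payload, sort_keys=True)


def make_build_logger(build_id, stream=None):
    """Return a logger that tags every line with the given build id."""
    logger = logging.getLogger("build.%s" % build_id)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    # Drop handlers from any earlier call so lines are not duplicated.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    return logging.LoggerAdapter(logger, {"build_id": build_id})


if __name__ == "__main__":
    log = make_build_logger(248)
    log.info("cloning repository")
    log.error("build step %s failed", "install")
```

Keeping the build unaware of the destination means the same output could be pointed at fluentd, logstash, or another collector without changing build code.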

References

Originally captured here.

CORS

A few customers have CORS configured on their buckets and we should allow this for all customers. However, this becomes more complicated with the move to private buckets. We need to investigate how folks are using this, why they need to go straight to the bucket (or whether this is only required because we just proxy), and the impact.

References

Originally captured here.


3 participants