This repository has been archived by the owner on Feb 8, 2018. It is now read-only.

build a vault #3504

Closed
chadwhitacre opened this issue May 31, 2015 · 65 comments

Comments

@chadwhitacre
Contributor

We are going to start storing national identification numbers (#3289 (comment)) as well as bank account numbers (#3377 downstream of #3366). We need a vault separate from our main application and database that is more highly secure. We should use the PCI DSS 3.0 standard to self-assess the security of our application (gratipay/inside.gratipay.com#214). This ticket is about building a new vault component of our architecture.


@chadwhitacre added this to the Balanced shutdown milestone May 31, 2015
@chadwhitacre
Contributor Author

I think we should host our vault directly on AWS, since they clearly offer a PCI compliant environment, whereas Heroku doesn't advertise as much.

http://aws.amazon.com/compliance/pci-dss-level-1-faqs/

@chadwhitacre
Contributor Author

I'm envisioning a very simple key/value store, an expansion of vault.py to put it on the network. I suppose the thing to do would be to use HTTP so we can post into it from javascript. We don't want to transmit sensitive data through the main web app at all.
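
A minimal sketch of what a networked key/value store along these lines could look like, written as a WSGI app using only the Python standard library. Everything here is an assumption for illustration: the class name `KeyValueApp`, the `call` helper, the URL layout (the key is the request path), and the in-memory dict are all hypothetical, and none of it is taken from the existing vault.py.

```python
# Hypothetical sketch: a tiny in-memory key/value store exposed as a WSGI app.
# Names and URL layout are illustrative assumptions, not Gratipay's vault.py.
import io


class KeyValueApp:
    def __init__(self):
        # key -> raw bytes; a real vault would encrypt and persist instead
        self._store = {}

    def __call__(self, environ, start_response):
        key = environ.get("PATH_INFO", "").lstrip("/")
        method = environ.get("REQUEST_METHOD", "GET")
        if not key:
            return self._reply(start_response, "400 Bad Request", b"missing key")
        if method == "PUT":
            length = int(environ.get("CONTENT_LENGTH") or 0)
            self._store[key] = environ["wsgi.input"].read(length)
            return self._reply(start_response, "204 No Content", b"")
        if method == "GET":
            if key not in self._store:
                return self._reply(start_response, "404 Not Found", b"no such key")
            return self._reply(start_response, "200 OK", self._store[key])
        return self._reply(start_response, "405 Method Not Allowed", b"")

    @staticmethod
    def _reply(start_response, status, body):
        headers = [
            ("Content-Type", "application/octet-stream"),
            ("Content-Length", str(len(body))),
        ]
        start_response(status, headers)
        return [body]


def call(app, method, path, body=b""):
    """Drive the app in-process, without opening a socket."""
    captured = {}
    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }

    def start_response(status, headers):
        captured["status"] = status

    chunks = app(environ, start_response)
    return captured["status"], b"".join(chunks)
```

For local experiments the app could be served with `wsgiref.simple_server.make_server("127.0.0.1", 8080, KeyValueApp())`. Posting into it straight from the browser would additionally need TLS, CORS headers, and some form of authentication, none of which is sketched here.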

@chadwhitacre
Contributor Author

Let's do some poking around ...

https://hashicorp.com/blog/vault.html

What else?

@chadwhitacre
Contributor Author

http://tokenator.org/

@chadwhitacre
Contributor Author

Data Encryption

In addition to being able to store secrets, Vault can be used to encrypt/decrypt data that is stored elsewhere. The primary use of this is to allow applications to encrypt their data while still storing it in the primary data store.

The benefit of this is that developers do not need to worry about how to properly encrypt data. The responsibility of encryption is on Vault and the security team managing it, and developers just encrypt/decrypt data as needed.

@chadwhitacre
Contributor Author

One key feature of our requirements here is that the web app only needs to write secrets, not read them. It's the payroll process that needs to read secrets, in order to originate ACH credits and populate invoices. My thought is that we should use public key cryptography, with the web app holding the public key (via heroku config:set) and the payroll process having access to the private key.

Introducing a server component, whether Vault or something else (including something DIY), increases our surface area and level of complexity significantly more than integrating encryption-before-storage into our existing application architecture would. What are the PCI implications of the latter?
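
To make the write-only idea concrete, here is a toy sketch of encryption-before-storage with a public/private key pair. It uses textbook RSA with two small, well-known Mersenne primes purely so that it runs with the standard library alone. That is an assumption made for illustration and it is not secure (no padding, tiny modulus). A real deployment would use a vetted library and a scheme such as RSA-OAEP or libsodium sealed boxes. The names `PUBLIC_KEY`, `PRIVATE_KEY`, `encrypt`, and `decrypt` are hypothetical.

```python
# Toy illustration only: textbook RSA with small, well-known primes.
# NOT secure (no padding, tiny modulus). Shows who needs which key, nothing more.

P = 2**61 - 1  # Mersenne prime
Q = 2**89 - 1  # Mersenne prime
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent; needs Python 3.8+

PUBLIC_KEY = (N, E)   # what the web app would hold (e.g. in its config)
PRIVATE_KEY = (N, D)  # what only the payroll process would hold


def encrypt(public_key, plaintext: bytes) -> int:
    """Encrypt with the public key; the caller cannot reverse this."""
    n, e = public_key
    m = int.from_bytes(plaintext, "big")
    if m >= n:
        raise ValueError("plaintext too long for this toy modulus")
    return pow(m, e, n)


def decrypt(private_key, ciphertext: int) -> bytes:
    """Decrypt with the private key. Leading zero bytes are not preserved."""
    n, d = private_key
    m = pow(ciphertext, d, n)
    return m.to_bytes((m.bit_length() + 7) // 8, "big")
```

With this split, the web app would store only `encrypt(PUBLIC_KEY, b"000123456789")` in the database, and only the payroll process could call `decrypt` to recover the account number.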

@chadwhitacre
Contributor Author

Another design requirement: I want separate access groups for the main web app and the PCI vault. I want to be able to grant access to Heroku (app + db) as we've been doing, which is carefully, to be sure ... but we need to be even more careful with access to vaulted data.

@chadwhitacre
Contributor Author

Let's distinguish the three pieces of information we're intending to collect, their risk profile, and our immediate application requirements regarding each.

| piece of information | risk | write | read (process, role, purpose) |
| --- | --- | --- | --- |
| bank account number (BAN) | financial theft | web | payroll, Gratipay, generation of NACHA files to submit for ACH origination |
| individual national identification number (NIN) | personal identity theft | web | web, team owners, filling out tax forms |
| business identification number (VAT/EIN) | business identity theft | web | web, supporters and team owners, generation of invoices |
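
One way to read the table is as an access policy keyed by piece of information. The sketch below encodes only the write and read process columns. The field names, `POLICY`, `is_allowed`, and `reading_processes` are hypothetical names chosen for illustration.

```python
# Hypothetical encoding of the table above: which process may write or read
# each piece of information. Field and function names are illustrative.
POLICY = {
    "bank_account_number": {"write": {"web"}, "read": {"payroll"}},
    "national_id_number": {"write": {"web"}, "read": {"web"}},
    "business_id_number": {"write": {"web"}, "read": {"web"}},
}


def is_allowed(process, action, field):
    """Return True if the process may perform the action on the field."""
    return process in POLICY.get(field, {}).get(action, set())


def reading_processes():
    """Every process that reads at least one piece of sensitive information."""
    readers = set()
    for rules in POLICY.values():
        readers |= rules["read"]
    return readers
```

Here `is_allowed("web", "read", "bank_account_number")` is false while `reading_processes()` still includes `"web"`, since the web app reads the two identification numbers.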

@chadwhitacre
Contributor Author

So the web app does need to read some secrets.

@chadwhitacre
Contributor Author

Meaning it does come under the systems we need to consider in terms of PCI compliance.

@chadwhitacre
Contributor Author

The requirement for invoices is that VAT be available to both supporters (buyers; #1199 (comment)) and teams (sellers; #1199 (comment)).

@chadwhitacre
Contributor Author

Hashi Vault supports dynamic secrets. Could we use that to ensure that access to Heroku doesn't entail access to our vault?

@chadwhitacre
Contributor Author

Dynamic Secrets: Vault can generate secrets on-demand for some systems, such as AWS or SQL databases. For example, when an application needs to access an S3 bucket, it asks Vault for credentials, and Vault will generate an AWS keypair with valid permissions on demand. After creating these dynamic secrets, Vault will also automatically revoke them after the lease is up.

http://vaultproject.io/intro/
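
The lease idea in that description can be sketched generically: credentials are minted on demand, carry an expiry, and stop working once the lease is up. The code below is not Vault's API. `LeaseIssuer` and its methods are hypothetical names, and the injectable `clock` parameter is an assumption added so the expiry logic can be exercised without waiting.

```python
# Generic sketch of lease-based credentials; NOT Vault's API.
import secrets
import time


class LeaseIssuer:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._expiry = {}  # token -> time at which the lease ends

    def issue(self):
        """Mint a fresh random credential with a limited lifetime."""
        token = secrets.token_hex(16)
        self._expiry[token] = self._clock() + self._ttl
        return token

    def is_valid(self, token):
        """Check a credential, revoking it if its lease has run out."""
        expires_at = self._expiry.get(token)
        if expires_at is None:
            return False
        if self._clock() >= expires_at:
            del self._expiry[token]
            return False
        return True

    def revoke(self, token):
        """Revoke a credential before its lease ends."""
        self._expiry.pop(token, None)
```

A caller would request a token with `issue()` when it starts up and present it on each request. After `ttl_seconds` the token is rejected and a new one has to be requested.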

@chadwhitacre
Contributor Author

Like, when the app spins up, it asks our vault for credentials to our vault?

@chadwhitacre
Contributor Author

Looks like that would take some work.

@chadwhitacre
Contributor Author

I'm going through the Vault intro.

@chadwhitacre
Contributor Author

Alright, I am introduced to Vault. It's a nice piece of software. We very well may be able to use it here.

@chadwhitacre
Contributor Author

I want to give people access to a web app (at Heroku, as it happens) that has access to Vault, without giving the people the same access to Vault as the web app has. This could be achieved with a vault secret backend that supported dynamic secrets, yes?

hashicorp/vault#288

@chadwhitacre
Contributor Author

I've registered for an AWS account.

@chadwhitacre
Contributor Author

Can we use the browser as the go-between to avoid leaking vault access to people with Heroku access?

@chadwhitacre
Contributor Author

I don't see how to meet this requirement with Vault. :(

@chadwhitacre
Contributor Author

Or at all, really. If the web app has to be able to write, then whoever has access to the web app could potentially write out their bank account details and collect all of payroll for a week.

@chadwhitacre
Contributor Author

Okay, so let's take it that we don't have a separate access tier that is even tighter than access to our production hosting environment and database.

Then we're back up against the fact that Heroku does not promise a PCI-compliant environment to nearly the extent that Amazon does.

@chadwhitacre
Contributor Author

Gosh. Are we talking about migrating away from Heroku? 🐭

Are your datacenters certified / PCI compliant?

All of our datacenters have been certified by national and/or international security standards.

Our NYC1 facility is SSAE16 SOC-1 Type II certified.
Our NYC2 facility is SSAE16 SOC-2 Type II certified.
Our NYC3 facility is SSAE16 SOC-2 and SOC-3 compliant.
Our AMS1 and AMS2 facilities are ISO27001:2005 and ISO9001 certified.
Our AMS3 facility is ISO9001, ISO27001, and SSAE16 Type II certified.
Our SFO1 facility is SSAE16 SOC-1 Type II certified.
Our SGP1 facility is ISO27001:2005 certified.
Our LON1 facility is ISO9001:2008, ISO27001, and SSAE16 / ISAE 3402 certified.
Our FRA1 facility is ISO9001:2008, ISO27001:2005, and ISO22301:2012 certified.

https://www.digitalocean.com/help/policy/

via https://www.digitalocean.com/community/questions/digital-ocean-pci-dss-server-compliance

@chadwhitacre
Contributor Author

Amazon > DO > Heroku (PCI-wise)

@chadwhitacre
Contributor Author

Okay! Reticketed as #3505. 🏊

@chadwhitacre
Contributor Author

POI:

ZeroDB is an end-to-end encrypted database. Data can be stored on untrusted database servers without ever exposing the encryption key. Clients can execute remote queries against the encrypted data without downloading all of it or suffering an excessive performance hit.

https://github.com/zero-db/zerodb

(h/t)

@chadwhitacre
Contributor Author

I propose that we close this ticket. Why? Because we already have a vault. Meaning, we already have a database that contains sensitive information, which we take care to protect. Security means constant vigilance; there is no "done" state where we'll be perfectly ready to store national identity numbers and passport scans. Running HackerOne for a while, observing the security practices of other organizations, and watching NSA predators has me feeling hesitantly confident that we're not at the back of the pack when it comes to security. I think we should go for it.

@rohitpaulk @clone1018 @aandis et al.?

@chadwhitacre
Contributor Author

We need a vault separate from our main application and database that is more highly secure. We should use the PCI DSS 3.0 standard to self-assess the security of our application (gratipay/inside.gratipay.com#214). This ticket is about building a new vault component of our architecture.

Now I'm saying that we treat our main app and database as our vault (rather than building a separate component), and we evolve towards something like DSS as we scale.

@chadwhitacre
Contributor Author

Last call for objections to closing this ticket and declaring that we already have a vault ...

@chadwhitacre
Contributor Author

We're a far cry from Uphold's definition of a vault. :-/

@chadwhitacre
Contributor Author

@mattbk at gratipay/inside.gratipay.com#532 (comment):

Note to self, what gets stored in the vault?

Unclear. Either everything (our current database is our vault) or just sensitive info such as national identity info (we have a second database). Though in the first case we would still differentiate sensitive info, and store it in the current database in encrypted form. Maybe the difference is semantic. Our vault has "layers" or something.

@chadwhitacre
Contributor Author

Alright @rohitpaulk @aandis @mattbk @kaguillera: per gratipay/inside.gratipay.com#539 (comment), let's start with the five of us. Would you store your name, address, and national ID number in Gratipay today? Why not? What is the minimum we need to accomplish before you're comfortable doing so?

@rohitpaulk
Contributor

Would you store your name, address, and national ID number in Gratipay today?

Yes, I would

@mattbk
Contributor

mattbk commented Mar 18, 2016

Sure.

@aandis
Contributor

aandis commented Mar 18, 2016

👍

@chadwhitacre
Contributor Author

Awesome! Me, too. :-)

I will check with @kaguillera when I see him next week ...

@chadwhitacre
Contributor Author

California Attorney General defines "reasonable security"

When businesses are breached, their liability is largely determined by whether they had practiced reasonable security, so a lot of litigation regarding breaches is focused on determining that. This litigation is due to there being no definition of reasonable security. The California Attorney General Kamala Harris, on February 16th, released California's annual breach report and within it, she defined reasonable security, stating "The 20 controls in the Center for Internet Security’s Critical Security Controls identify a minimum level of information security that all organizations that collect or maintain personal information should meet. The failure to implement all the Controls that apply to an organization’s environment constitutes a lack of reasonable security."

The breach report itself, like most breach reports, isn't that interesting. California has been collecting information about breaches since 2003, when California became the first state to require businesses to inform affected parties when those victims are residents of California. Now 46 states have similar requirements.

The definition the Attorney General used for reasonable security, the CIS Critical Security Controls (direct link to a copy), are extremely aggressive goals that few organizations outside of the Department of Defense currently implement. For example, it requires application white-listing (CSC 2.2), disabling javascript (CSC 7.3), and other controls which although valid recommendations, are difficult to implement in practice from a business perspective.

https://summitroute.com/blog/2016/03/20/downclimb/

@chadwhitacre
Contributor Author

BeyondCorp: Design to Deployment: Google's BeyondCorp strategy has been heralded as the right way forward for protecting networks. The concept is to no longer have an internal corp network that employees need to hard-wire or VPN into, because that promotes a crunchy outside and soft inside network, where once an attacker gets on one of your employee laptops they end up with free rein. The BeyondCorp strategy that Google announced a year ago advises that you shouldn't provide access to systems just because of what network they come in on. This new paper describes this concept in more detail.

https://summitroute.com/blog/2016/04/10/downclimb/

@chadwhitacre
Contributor Author

Would you store your name, address, and national ID number in Gratipay today?
I will check with @kaguillera when I see him next week ...

Alright, talking w/ @kaguillera here IRL and he is good to go. 👍

(Though he also mentions that national ID numbers are not big in Trinidad and passport or driver's license would be more normal.)

@chadwhitacre
Contributor Author

Alright, folks! Closing this one out. We have a vault. 😳

Future work to be organized on the Security Radar.


6 participants