Svalbard protects secret values using secret sharing, a well-known technique for robust, distributed protection of secrets. A user Alice (the Svalbard client) computes the shares and stores them at various locations, while the metadata of the sharing (describing how the sharing was computed and where the shares are stored) is stored at a separate, distinguished party called the cloud provider (CP), which should be disjoint from the parties offering storage of shares. It is assumed that Alice already has an account with CP, and that this account offers strong authentication mechanisms (e.g. 2-factor authentication). The cloud provider can store data reliably, but should not learn Alice's secrets.
To effectively protect against corruption at any single party involved, and even against corruption of groups of parties, Svalbard uses a two-level secret sharing:
- At the first level, Alice's secret value SV is split into two shares of a 2-out-of-2 XOR secret sharing, resulting in shares SH1 and SH2.
- One of these shares, say SH1, is stored by the cloud provider CP, and the other one (SH2) is split into n shares using k-out-of-n Shamir's secret sharing for specific values of k and n chosen by Alice. This results in shares SH2_1, ..., SH2_n, for each of which Alice picks a storage form and location.
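The first-level 2-out-of-2 XOR sharing can be illustrated with a minimal Python sketch. The function names `split_xor` and `combine_xor` are hypothetical and chosen only for this illustration; they are not taken from Svalbard's code.

```python
import secrets
from typing import Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally long byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def split_xor(sv: bytes) -> Tuple[bytes, bytes]:
    """Split SV into SH1 (uniformly random) and SH2 = SV xor SH1."""
    sh1 = secrets.token_bytes(len(sv))
    sh2 = xor_bytes(sv, sh1)
    return sh1, sh2


def combine_xor(sh1: bytes, sh2: bytes) -> bytes:
    """Recover SV = SH1 xor SH2; both shares are required."""
    return xor_bytes(sh1, sh2)
```

Because SH1 is uniformly random and as long as SV, each share taken alone is statistically independent of SV, which is why the cloud provider can hold SH1 without learning the secret.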
The basic security of the backed-up secret then follows from the properties of the employed secret sharing. However, because the shares and the metadata are distributed among multiple systems, they can be subject to corruption, either accidental or malicious. Such corruption could result in an undetected reconstruction of a wrong secret value, so Svalbard additionally employs simple hash-based protections to make sure that any corruption is detected with high probability.
To detect potential corruption of the second-level shares, the metadata also contains a salted hash of each share. This ensures the correctness of the shares that enter the secret reconstruction stage, assuming that the metadata is correct. Moreover, it also detects corruption of the hashes in the metadata, unless the cloud provider CP that stores the metadata maliciously cooperates with the provider of storage of second-level shares.
More precisely, the metadata additionally contains a randomly picked string
hash_salt of length at most 255 bytes, and for each second-level share SH2_i,
i=1..n, the metadata contains SaltedHash(SH2_i, hash_salt), with the
SaltedHash function defined as
SaltedHash(share, salt) = SHA-256(length(salt) || salt || share)
where length(salt) is a one-byte value denoting the length of salt in bytes,
|| denotes concatenation, and SHA-256() denotes the SHA-256 Secure Hash
Algorithm. Upon retrieval of a share the corresponding salted hash value is
verified, and only consistent shares are accepted for the secret
reconstruction step.
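The salted hash and the per-share consistency check can be sketched in Python as follows. This is an illustration under assumptions, not Svalbard's implementation: the helper name `share_is_consistent` is hypothetical, and the constant-time comparison via `hmac.compare_digest` is a choice made for this sketch rather than something the text prescribes.

```python
import hashlib
import hmac

MAX_SALT_LENGTH = 255  # the salt length must fit into a single byte


def salted_hash(data: bytes, salt: bytes) -> bytes:
    """Compute SHA-256(length(salt) || salt || data)."""
    if len(salt) > MAX_SALT_LENGTH:
        raise ValueError("salt must be at most 255 bytes long")
    return hashlib.sha256(bytes([len(salt)]) + salt + data).digest()


def share_is_consistent(share: bytes, salt: bytes, expected: bytes) -> bool:
    """Check a retrieved share against the hash recorded in the metadata."""
    return hmac.compare_digest(salted_hash(share, salt), expected)
```

Prefixing the salt with its length keeps the hash input unambiguous: two different (salt, data) pairs cannot produce the same concatenated byte string.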
Even if the second-level shares were all correct and the first-level share SH2
was successfully reconstructed, a corruption of the share SH1 (which is stored
as a part of the metadata) could result in an undetected bit-flip or other
controlled modification of the reconstructed secret. To detect a potential
corruption of SH1, the second-level sharing protects additionally a salted hash
of the original secret value SV. That is, the input value for the second-level
sharing is not just SH2, but SH2_with_hash, defined as follows:
SV_hash = SaltedHash(SV, hash_salt)
SH2_with_hash = SV_hash || SH2
This way, after a reconstruction of the secret, one can check whether the
resulting value is indeed equal to the original secret: SV_hash and SH2 are
extracted from SH2_with_hash, and then SV_hash is checked against
SaltedHash(SH1 xor SH2, hash_salt).
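The final check can be sketched in Python as below. The function name `recover_secret` is hypothetical, and the sketch assumes that SH2_with_hash has already been reconstructed from the second-level shares. Since SV_hash is a SHA-256 output, it occupies the first 32 bytes of SH2_with_hash.

```python
import hashlib
import hmac

HASH_LENGTH = 32  # size of a SHA-256 digest in bytes


def salted_hash(data: bytes, salt: bytes) -> bytes:
    """Compute SHA-256(length(salt) || salt || data)."""
    return hashlib.sha256(bytes([len(salt)]) + salt + data).digest()


def recover_secret(sh1: bytes, sh2_with_hash: bytes, hash_salt: bytes) -> bytes:
    """Recompute SV = SH1 xor SH2 and verify it against the embedded SV_hash."""
    sv_hash = sh2_with_hash[:HASH_LENGTH]
    sh2 = sh2_with_hash[HASH_LENGTH:]
    if len(sv_hash) != HASH_LENGTH or len(sh1) != len(sh2):
        raise ValueError("malformed shares")
    sv = bytes(a ^ b for a, b in zip(sh1, sh2))
    if not hmac.compare_digest(sv_hash, salted_hash(sv, hash_salt)):
        raise ValueError("reconstructed secret does not match SV_hash")
    return sv
```

A single flipped bit in SH1 changes SH1 xor SH2, so the recomputed hash no longer matches SV_hash and the reconstruction is rejected instead of silently returning a wrong value.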
Here is an example that lists all the major values involved in
the sharing of a secret value SV, with parameters k, n:
SH1 = Random(length(SV))
SH2 = SV xor SH1 // i.e. SV == SH1 xor SH2
hash_salt = Random(SALT_LENGTH) // max. 255 bytes
SV_hash = SaltedHash(SV, hash_salt)
SH2_with_hash = SV_hash || SH2
// for each i=1..n SH2_i is a k-out-of-n share of SH2_with_hash
SH2_i = ... // i-th share of SH2_with_hash
location_i = ... // location where i-th share is stored
SH2_i_hash = SaltedHash(SH2_i, hash_salt)
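The values listed above could be computed along the lines of the following Python sketch. Several parts are assumptions made only for illustration: the text does not say which field Svalbard uses for Shamir's scheme, so the sketch uses the prime field modulo 2**521 - 1; the share encoding (one byte for x, 66 bytes for y), the leading 0x01 marker byte, and all function names are hypothetical. With this toy field, SV can be at most 32 bytes long.

```python
import hashlib
import secrets

# Assumption: Shamir sharing over the prime field modulo the Mersenne prime
# 2**521 - 1; the field actually used by Svalbard is not stated in the text.
PRIME = 2**521 - 1
Y_BYTES = 66  # enough bytes to hold any field element


def salted_hash(data: bytes, salt: bytes) -> bytes:
    """Compute SHA-256(length(salt) || salt || data)."""
    return hashlib.sha256(bytes([len(salt)]) + salt + data).digest()


def shamir_split(secret: int, k: int, n: int):
    """Evaluate a random degree-(k-1) polynomial f, f(0) = secret, at x = 1..n."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    points = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        points.append((x, y))
    return points


def shamir_combine(points):
    """Recover f(0) from at least k points by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


def share_secret(sv: bytes, k: int, n: int, salt_length: int = 16):
    """Return SH1, hash_salt, the shares SH2_i and their salted hashes."""
    if not (1 <= k <= n <= 255 and 0 <= salt_length <= 255):
        raise ValueError("invalid parameters")
    sh1 = secrets.token_bytes(len(sv))
    sh2 = bytes(a ^ b for a, b in zip(sv, sh1))
    hash_salt = secrets.token_bytes(salt_length)
    sh2_with_hash = salted_hash(sv, hash_salt) + sh2
    # Hypothetical encoding: a leading 0x01 byte preserves leading zero bytes.
    as_int = int.from_bytes(b"\x01" + sh2_with_hash, "big")
    if as_int >= PRIME:
        raise ValueError("secret too long for this toy field")
    shares = [bytes([x]) + y.to_bytes(Y_BYTES, "big")
              for x, y in shamir_split(as_int, k, n)]
    share_hashes = [salted_hash(s, hash_salt) for s in shares]
    return sh1, hash_salt, shares, share_hashes


def decode_sh2_with_hash(shares) -> bytes:
    """Invert the hypothetical share encoding used in share_secret."""
    points = [(s[0], int.from_bytes(s[1:], "big")) for s in shares]
    value = shamir_combine(points)
    return value.to_bytes((value.bit_length() + 7) // 8, "big")[1:]
```

For example, `share_secret(sv, 2, 3)` yields three shares, and passing any two of them to `decode_sh2_with_hash` gives back SH2_with_hash, from which SV_hash and SH2 can be extracted.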
Each share SH2_i is stored at the corresponding location,
and the metadata stored at the cloud provider CP then contains:
- SH1
- hash_salt
- k, n, and other metadata about the second-level sharing
- per-share information, for each i=1..n:
  - location_i
  - SH2_i_hash
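One way to picture the metadata record is the following Python sketch. The class names, field names, and the JSON/hex serialization are hypothetical; the text does not specify Svalbard's actual metadata format, and the "other metadata about second-level sharing" is omitted here.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ShareInfo:
    location: str    # location_i: where the i-th share is stored
    sh2_hash: bytes  # SH2_i_hash = SaltedHash(SH2_i, hash_salt)


@dataclass
class SharingMetadata:
    sh1: bytes        # first-level share kept by the cloud provider
    hash_salt: bytes  # salt used for all salted hashes, at most 255 bytes
    k: int            # reconstruction threshold
    n: int            # number of second-level shares
    shares: List[ShareInfo]

    def __post_init__(self) -> None:
        if not 1 <= self.k <= self.n:
            raise ValueError("need 1 <= k <= n")
        if len(self.shares) != self.n:
            raise ValueError("expected one entry per second-level share")
        if len(self.hash_salt) > 255:
            raise ValueError("hash_salt must be at most 255 bytes")

    def to_json(self) -> str:
        """Serialize to JSON, hex-encoding all binary fields."""
        return json.dumps({
            "sh1": self.sh1.hex(),
            "hash_salt": self.hash_salt.hex(),
            "k": self.k,
            "n": self.n,
            "shares": [
                {"location": s.location, "sh2_hash": s.sh2_hash.hex()}
                for s in self.shares
            ],
        })
```

Note that the record contains SH1 and the share hashes but none of the second-level shares themselves, so the metadata alone does not reveal SV.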