feat(hlapi): add tag system #1468

Merged
merged 1 commit into from
Sep 5, 2024

Conversation

tmontaigu
Contributor

Tag

The Tag allows storing bytes alongside entities (keys and ciphertexts). The main purpose of this system is to tag / identify ciphertexts with their keys.

  • When encrypted, a ciphertext gets the tag of the key used to encrypt it.
  • Ciphertexts resulting from operations (add, sub, etc.) get the tag from the ServerKey used
  • PublicKey gets its tag from the ClientKey that was used to create it
  • ServerKey gets its tag from the ClientKey that was used to create it

Users can change the tag of any entity at any point.
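The propagation rules above can be sketched with a toy model. This is not the TFHE-rs API: every type and method below (`Tag`, `ClientKey::server_key`, `ServerKey::add`, and so on) is a hypothetical stand-in, and "encryption" is a plain copy. It only illustrates which entity a tag is inherited from.

```rust
// Toy model of the tag propagation rules. NOT the TFHE-rs API:
// all names here are hypothetical and nothing is actually encrypted.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
struct Tag(Vec<u8>);

struct ClientKey {
    tag: Tag,
}

struct ServerKey {
    tag: Tag,
}

struct PublicKey {
    tag: Tag,
}

struct Ciphertext {
    value: u64, // stands in for the encrypted payload
    tag: Tag,
}

impl ClientKey {
    // ServerKey and PublicKey get their tag from the ClientKey that created them.
    fn server_key(&self) -> ServerKey {
        ServerKey { tag: self.tag.clone() }
    }

    fn public_key(&self) -> PublicKey {
        PublicKey { tag: self.tag.clone() }
    }

    // A ciphertext gets the tag of the key used to encrypt it.
    fn encrypt(&self, value: u64) -> Ciphertext {
        Ciphertext { value, tag: self.tag.clone() }
    }
}

impl PublicKey {
    fn encrypt(&self, value: u64) -> Ciphertext {
        Ciphertext { value, tag: self.tag.clone() }
    }
}

impl ServerKey {
    // The result of an operation gets the tag of the ServerKey used.
    fn add(&self, a: &Ciphertext, b: &Ciphertext) -> Ciphertext {
        Ciphertext {
            value: a.value.wrapping_add(b.value),
            tag: self.tag.clone(),
        }
    }
}

fn main() {
    let mut ck = ClientKey { tag: Tag::default() };
    ck.tag = Tag(b"user-42".to_vec()); // the tag can be changed at any point

    let sk = ck.server_key();
    let pk = ck.public_key();
    let mut sum = sk.add(&ck.encrypt(1), &pk.encrypt(2));
    assert_eq!(sum.tag, ck.tag);

    // Any entity can be re-tagged afterwards.
    sum.tag = Tag(b"archived".to_vec());
    println!("value = {}, tag = {:?}", sum.value, sum.tag);
}
```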

BREAKING CHANGE: Many of the into_raw_parts and from_raw_parts functions changed to accommodate the addition of the `tag`.

closes: please link all relevant issues

PR content/description

Check-list:

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • Relevant issues are marked as resolved/closed, related issues are linked in the description
  • Check for breaking changes (including serialization changes) and add them to commit message following the conventional commit specification

@tmontaigu tmontaigu force-pushed the tm/tag-system branch 3 times, most recently from 35efd30 to bc4cd3d Compare August 13, 2024 11:02
@IceTDrinker
Member

tests are red at the moment

Contributor

@nsarlin-zama nsarlin-zama left a comment

nice API !

tfhe/src/high_level_api/keys/public.rs (outdated, resolved)
tfhe/src/high_level_api/keys/public.rs (outdated, resolved)
tfhe/src/high_level_api/keys/public.rs (outdated, resolved)
tfhe/src/high_level_api/keys/public.rs (outdated, resolved)
tfhe/src/high_level_api/keys/client.rs (outdated, resolved)
tfhe/src/high_level_api/tag.rs (outdated, resolved)

Member

@IceTDrinker IceTDrinker left a comment

I think the SmallVec is too smart for what we are trying to do (there may even be issues depending on the wasm Vec size). I think an Option for the tag should be enough; it is not bigger than a Vec: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c0cf100d311d5780007cbf38a26f4515

It also allows an empty tag with a smaller overhead than the SmallVec, which needs to be initialized with 0s and then copied around even if it's morally empty.
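The size argument can be checked with a short sketch. It assumes the suggestion means `Option<Vec<u8>>` (the comment only says "an Option"), and `tag_bytes` is a hypothetical helper, not code from the PR. On current compilers the `None` case is encoded in the non-null pointer of the `Vec`, so the `Option` wrapper adds no bytes and an empty tag needs no allocation.

```rust
use std::mem::size_of;

/// Hypothetical helper: view an optional tag as bytes, empty when absent.
fn tag_bytes(tag: &Option<Vec<u8>>) -> &[u8] {
    tag.as_deref().unwrap_or(&[])
}

fn main() {
    // `None` reuses the niche of Vec's non-null pointer, so wrapping the Vec
    // in an Option does not grow it (holds on current rustc versions).
    assert_eq!(size_of::<Option<Vec<u8>>>(), size_of::<Vec<u8>>());

    let empty: Option<Vec<u8>> = None; // no allocation, nothing to zero out
    let tagged: Option<Vec<u8>> = Some(vec![0xAB, 0xCD]);

    println!("empty tag: {:?}", tag_bytes(&empty));
    println!("tagged:    {:?}", tag_bytes(&tagged));
}
```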

tfhe/src/high_level_api/tag.rs (resolved)
use crate::high_level_api::backward_compatibility::tag::TagVersions;
use tfhe_versionable::{Unversionize, UnversionizeError, Versionize, VersionizeOwned};

const STACK_ARRAY_SIZE: usize = std::mem::size_of::<Vec<u8>>() - 1;

Member

this may not be constant across compiler versions and perhaps even platforms/hw ?

could be an issue with WASM as well ?

Contributor Author

This would just mean that if, for example, size_of::<Vec<u8>>() is 3 * 32 = 96 bits, then a 128-bit number would be stored in a Vec rather than on the stack.

So when creating, no problem; when deserializing, no problem either. And if a 64-bit CPU deserializes it, the 128 bits fit in the stack array, so they would live on the stack.

Member

all right I need to check the details of the small vec stuff then

Contributor

That's why it's important to convert it to a plain array before serialization (as you did)
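A minimal sketch of the point made in this thread, assuming a layout reconstructed from the match arms quoted in the review (`bytes` / `len` on one side, a `Vec` on the other). The variant names and the methods `from_slice`, `as_slice`, `to_wire` and `from_wire` are hypothetical. Because only plain bytes cross the serialization boundary, each platform picks stack or heap storage on its own when it reads the data back.

```rust
use std::mem::size_of;

// Same constant as in the diff: one byte less than a Vec<u8>, which gives
// 23 bytes on 64-bit targets and 11 bytes on 32-bit targets.
const STACK_ARRAY_SIZE: usize = size_of::<Vec<u8>>() - 1;

// Hypothetical layout; the real SmallVec in the PR may differ.
enum SmallVec {
    Stack { bytes: [u8; STACK_ARRAY_SIZE], len: u8 },
    Heap(Vec<u8>),
}

impl SmallVec {
    fn from_slice(data: &[u8]) -> Self {
        if data.len() <= STACK_ARRAY_SIZE {
            let mut bytes = [0u8; STACK_ARRAY_SIZE];
            bytes[..data.len()].copy_from_slice(data);
            // The cast cannot truncate: len <= STACK_ARRAY_SIZE < 256.
            SmallVec::Stack { bytes, len: data.len() as u8 }
        } else {
            SmallVec::Heap(data.to_vec())
        }
    }

    fn as_slice(&self) -> &[u8] {
        match self {
            SmallVec::Stack { bytes, len } => &bytes[..usize::from(*len)],
            SmallVec::Heap(vec) => vec.as_slice(),
        }
    }

    fn is_on_stack(&self) -> bool {
        matches!(self, SmallVec::Stack { .. })
    }

    // The "wire format" is only the used bytes, never the in-memory layout.
    fn to_wire(&self) -> Vec<u8> {
        self.as_slice().to_vec()
    }

    // The receiving platform decides stack vs heap with its own threshold.
    fn from_wire(bytes: &[u8]) -> Self {
        Self::from_slice(bytes)
    }
}

fn main() {
    let tag = 0x0123_4567_89ab_cdef_u128.to_le_bytes(); // 16 bytes
    let original = SmallVec::from_slice(&tag);
    let restored = SmallVec::from_wire(&original.to_wire());
    assert_eq!(original.as_slice(), restored.as_slice());
    println!(
        "threshold: {} bytes, 128-bit tag on stack: {}",
        STACK_ARRAY_SIZE,
        restored.is_on_stack()
    );
}
```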

tfhe/src/high_level_api/tag.rs (resolved)

const STACK_ARRAY_SIZE: usize = std::mem::size_of::<Vec<u8>>() - 1;

/// Simple short optimized vec, where if the data is small enough

Member

I feel like the SmallVec optimization might be premature and has other implications that cause more trouble at the moment than it solves

Contributor Author

What other implications ?

Member

the size, compatibility of compiler versions, the fact the default case copies an array of 0s around

Contributor Author

just like with just a Vec, the compiler would copy the bytes of the internal data (ptr, size, capacity) of the vec

Member

yep my bad

Contributor Author

@tmontaigu tmontaigu Aug 14, 2024

The SmallVec size is 8 bytes (likely sizeof::<usize>) bigger than a plain Vec

Contributor Author

I do agree that the SmallVec thing might be premature, but the complexity of it is not that high I think

Also, the vec inside the Tag is only accessed through methods that take and return bytes or simple u64/u128 values, and the serialization/versioning returns bytes, which protects us in a sense.

Comment on lines 121 to 201
 #[cfg(feature = "zk-pok")]
 #[derive(Clone, Serialize, Deserialize)]
-pub struct ProvenCompactCiphertextList(crate::integer::ciphertext::ProvenCompactCiphertextList);
+pub struct ProvenCompactCiphertextList {
+    inner: crate::integer::ciphertext::ProvenCompactCiphertextList,
+    tag: Tag,
+}
+
+#[cfg(feature = "zk-pok")]
+impl Tagged for ProvenCompactCiphertextList {

Member

Should we create a zk mod here with a feature gate on it, plus a pub re-export?

That limits the number of #[cfg] attributes, and I find it less error-prone.
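A rough sketch of that suggestion, with hypothetical contents: a single `#[cfg]` on the module and one on the re-export replace a `#[cfg]` on every struct and impl. The struct body and the `zk_enabled` helper are placeholders, not the PR's code.

```rust
// One gate for the whole module instead of one per item.
#[cfg(feature = "zk-pok")]
mod zk {
    // Placeholder body; the real type wraps an integer ciphertext list and a Tag.
    pub struct ProvenCompactCiphertextList {
        pub tag: Vec<u8>,
    }

    impl ProvenCompactCiphertextList {
        pub fn tag(&self) -> &[u8] {
            &self.tag
        }
    }
}

// The re-export keeps the public path unchanged for users of the crate.
#[cfg(feature = "zk-pok")]
pub use zk::ProvenCompactCiphertextList;

/// Hypothetical helper, only here to make the sketch observable.
fn zk_enabled() -> bool {
    cfg!(feature = "zk-pok")
}

fn main() {
    println!("zk-pok enabled: {}", zk_enabled());
}
```

Everything inside `zk` is compiled only when the feature is on, so forgetting a gate on a new item in that module is no longer possible.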

tfhe/src/high_level_api/integers/unsigned/compressed.rs (outdated, resolved)

Member

@IceTDrinker IceTDrinker left a comment

last comments on small vec + tests

use crate::ProvenCompactCiphertextList;

let config =
ConfigBuilder::with_custom_parameters(PARAM_MESSAGE_2_CARRY_2_KS_PBS_TUNIFORM_2M64).build();

Member

To mimic the blockchain case better, could we use the dedicated PK params + casting params?

Contributor Author

I will try to add it, but since the tag is for the whole key set it won't change how it behaves

Member

It’s to make sure the tag gets carried from the public key through the keyswitch

Contributor Author

I don't get it

Member

what I mean is for the BC use case we have

PKE -> KS -> PBS

and we want to make sure the tagging works well in that scenario

Contributor Author

ok i see

bytes: r_bytes,
len: r_len,
},
) => l_vec == &r_bytes[..usize::from(*r_len)],

Member

wondering if a short circuit by checking the l_vec.len() == r_len is worth it here and below
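For comparison, here is a sketch of both forms as free functions; `eq_plain` and `eq_with_len_check` are hypothetical helpers, not code from the PR. Note that the standard library's slice equality already rejects slices of different lengths before comparing elements, so the explicit check mainly documents the intent and is unlikely to change performance.

```rust
/// The form quoted in the diff: compare against the used prefix of the buffer.
fn eq_plain(l_vec: &[u8], r_bytes: &[u8], r_len: u8) -> bool {
    l_vec == &r_bytes[..usize::from(r_len)]
}

/// Same comparison with an explicit length short circuit.
fn eq_with_len_check(l_vec: &[u8], r_bytes: &[u8], r_len: u8) -> bool {
    let r_len = usize::from(r_len);
    l_vec.len() == r_len && l_vec == &r_bytes[..r_len]
}

fn main() {
    // A stack buffer of 8 bytes where only the first 3 are in use.
    let stack = [1u8, 2, 3, 0, 0, 0, 0, 0];

    for (heap, len) in [(vec![1u8, 2, 3], 3u8), (vec![1, 2], 3), (vec![1, 2, 4], 3)] {
        // Both forms always agree.
        assert_eq!(
            eq_plain(&heap, &stack, len),
            eq_with_len_check(&heap, &stack, len)
        );
        println!("{:?} == prefix: {}", heap, eq_plain(&heap, &stack, len));
    }
}
```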

}
}

pub fn as_u64(&self) -> u64 {

Member

as_le_u64 I would suggest then, or docstring that makes it clear how the data is interpreted and the truncating behavior of the as_u64

u64::from_le_bytes(le_bytes)
}

pub fn as_u128(&self) -> u128 {

Member

same thing for the endianness

tfhe/src/high_level_api/tag.rs (resolved)
Comment on lines 140 to 176
pub fn set_u64(&mut self, value: u64) {
    let le_bytes = value.to_le_bytes();
    self.set_data(le_bytes.as_slice());
}

pub fn set_u128(&mut self, value: u128) {
    let le_bytes = value.to_le_bytes();
    self.set_data(le_bytes.as_slice());
}

Member

info about endianness
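What such a docstring would need to spell out can be sketched as follows. `as_le_u64` is a hypothetical free function standing in for `Tag::as_u64`; it assumes the tag bytes are read as little-endian, zero-padded when shorter than 8 bytes and truncated to the first 8 bytes when longer. The diff only shows the final `u64::from_le_bytes(le_bytes)` call, so the padding and truncation details are assumptions.

```rust
/// Hypothetical stand-in for `Tag::as_u64`.
/// Interprets the first 8 bytes as a little-endian u64; shorter inputs are
/// zero-padded and any bytes past the 8th are ignored (truncation).
fn as_le_u64(data: &[u8]) -> u64 {
    let mut le_bytes = [0u8; 8];
    let n = data.len().min(8);
    le_bytes[..n].copy_from_slice(&data[..n]);
    u64::from_le_bytes(le_bytes)
}

fn main() {
    // set_u64 stores little-endian bytes: least significant byte first.
    let stored = 0x0102_0304_0506_0708_u64.to_le_bytes();
    assert_eq!(stored[0], 0x08);
    assert_eq!(as_le_u64(&stored), 0x0102_0304_0506_0708);

    // Reading a u128 tag as u64 keeps only the low 64 bits.
    let wide = ((0xAA_u128 << 64) | 0x42).to_le_bytes();
    assert_eq!(as_le_u64(&wide), 0x42);

    // A two-byte tag is zero-extended.
    assert_eq!(as_le_u64(&[0x01, 0x02]), 0x0201);
    println!("little-endian round trip ok");
}
```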


/// Tag
///
/// The `Tag` allows to store bytes alongside of entities (keys, and ciphertext)

Member

@IceTDrinker IceTDrinker Aug 14, 2024

alongside of -> alongside
keys, and ciphertexts

(missing s)

/// The `Tag` allows to store bytes alongside of entities (keys, and ciphertext)
/// the main purpose of this system is to `tag` / identify ciphertext with their keys.
///
/// tfhe-rs does not interpret or check this data, it only stores it and passes it around

Member

TFHE-rs I think is the accepted spelling for public resources

@tmontaigu tmontaigu force-pushed the tm/tag-system branch 2 times, most recently from 44c6291 to cd89af4 Compare August 29, 2024 10:13
Member

@IceTDrinker IceTDrinker left a comment

minor things, next one should be the one to merge

Comment on lines 4 to 9
#[derive(VersionsDispatch)]
pub(in crate::high_level_api) enum SmallVecVersions {
#[allow(unused)] // Unused because V1 does not exists yet
V0(SmallVec),
}

Member

no Versions for the SmallVec ?

/// the main purpose of this system is to `tag` / identify ciphertext with their keys.
///
/// TFHE-RS does not interpret or check this data, it only stores it and passes it around

Member

TFHE-rs

Member

missing only this I believe and go for merge

@tmontaigu tmontaigu force-pushed the tm/tag-system branch 3 times, most recently from da069ad to c4228e5 Compare September 2, 2024 09:15
Tag

The `Tag` allows storing bytes alongside entities (keys and ciphertexts).
The main purpose of this system is to `tag` / identify ciphertexts with their keys.

* When encrypted, a ciphertext gets the tag of the key used to encrypt it.
* Ciphertexts resulting from operations (add, sub, etc.) get the tag from the ServerKey used
* PublicKey gets its tag from the ClientKey that was used to create it
* ServerKey gets its tag from the ClientKey that was used to create it

Users can change the tag of any entity at any point.

BREAKING CHANGE: Many of the into_raw_parts and from_raw_parts functions changed
to accommodate the addition of the `tag`.

Member

@IceTDrinker IceTDrinker left a comment

Thanks a lot !

@tmontaigu tmontaigu merged commit 426f3bd into main Sep 5, 2024
89 checks passed
@tmontaigu tmontaigu deleted the tm/tag-system branch September 5, 2024 08:32
4 participants