Pseudonymization of common_name #19

Open
phihos opened this issue Jan 26, 2022 · 4 comments · May be fixed by #20
Labels
enhancement New feature or request

Comments

phihos commented Jan 26, 2022

Hi,

Client metrics endanger the privacy of the clients but are useful for debugging, so I propose the following feature:
Each common_name is replaced with a random string of fixed length (e.g. five letters). The same real common_name is always replaced with the same string (hence pseudonymization).
Each time a new common_name needs to be pseudonymized, an entry in an in-memory map <real cn> => <pseudonym> is created and used in future replacement operations. The feature can be enabled with a command line flag --pseudonymize-client-metrics.
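
To make the proposal concrete, here is a minimal sketch in Go of the in-memory map described above. It is not taken from the exporter or from any pull request: the names Pseudonymizer, NewPseudonymizer and Pseudonym are hypothetical, and the choice of lowercase letters, a mutex for concurrent scrapes, and a retry on collisions are assumptions.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"sync"
)

const (
	pseudonymLength = 5 // "e.g. five letters" from the proposal
	letters         = "abcdefghijklmnopqrstuvwxyz"
)

// Pseudonymizer (hypothetical name) keeps the <real cn> => <pseudonym>
// map in memory only; nothing is written to disk.
type Pseudonymizer struct {
	mu    sync.Mutex
	byCN  map[string]string // real common_name -> pseudonym
	taken map[string]bool   // pseudonyms already handed out
}

func NewPseudonymizer() *Pseudonymizer {
	return &Pseudonymizer{byCN: map[string]string{}, taken: map[string]bool{}}
}

// randomLetters returns n random lowercase letters. The modulo introduces
// a slight bias, which is acceptable for a sketch.
func randomLetters(n int) string {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		panic(err)
	}
	for i, b := range buf {
		buf[i] = letters[int(b)%len(letters)]
	}
	return string(buf)
}

// Pseudonym returns the stored pseudonym for commonName, creating a new
// unique one the first time a common_name is seen.
func (p *Pseudonymizer) Pseudonym(commonName string) string {
	p.mu.Lock()
	defer p.mu.Unlock()
	if ps, ok := p.byCN[commonName]; ok {
		return ps
	}
	for {
		ps := randomLetters(pseudonymLength)
		if !p.taken[ps] { // retry if two clients would share a pseudonym
			p.taken[ps] = true
			p.byCN[commonName] = ps
			return ps
		}
	}
}

func main() {
	p := NewPseudonymizer()
	fmt.Println(p.Pseudonym("alice") == p.Pseudonym("alice")) // true
	fmt.Println(p.Pseudonym("alice") != p.Pseudonym("bob"))   // true
}
```

With the flag enabled, the exporter would presumably pass every common_name through such a function before using it as a label value.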

Would you accept a PR with such a feature? In that case I will implement this for you.

patrickjahns added the enhancement (New feature or request) label Feb 2, 2022
patrickjahns (Owner) commented Feb 2, 2022

Thank you very much for the suggestion/idea

I've been thinking about this and how best to proceed with it. I do not have a requirement for this feature, but I would not be against adding it. However, I won't maintain this feature and thus would kindly ask whether you can be the person maintaining it. In addition, it only makes sense for me to accept this feature if we can achieve test coverage high enough that changes to other code paths would at least reveal when this feature breaks. I do hope you understand this.

In regards to the design of the feature - my major concern would be how to keep the mapping safe between exporter restarts? Is there a good way to avoid the requirement of maintaining persistence?

phihos (Author) commented Feb 2, 2022

> I won't maintain this feature and thus would kindly ask whether you can be the person maintaining it.

Sure, when there is a bug or feature request regarding pseudonymization, just ping me.

> In addition, it only makes sense for me to accept this feature if we can achieve test coverage high enough that changes to other code paths would at least reveal when this feature breaks.

I never do PRs without providing tests. In my understanding, you do something for me when accepting code, and code without tests is always a burden for the project maintainer. So for me this is just common sense :-)

> my major concern would be how to keep the mapping safe between exporter restarts?

As far as I am concerned, we don't. By "pseudonymization" I meant "pseudonymized between restarts, anonymized across restarts": the mapping lives only in memory and is discarded when the exporter restarts. For me this is a feature and not a bug, because analyzing a pseudonym over a longer timespan might help identify the real person/account. I only need to perform short-term analysis, like finding out whether one client uses a lot of bandwidth.
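
The restart behaviour can be illustrated with a short Go sketch. It is only an illustration under assumptions: newRun is a hypothetical helper that stands in for one exporter process with its own empty in-memory map, and the five-letter lowercase format is taken from the example in the proposal.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newRun (hypothetical) simulates one exporter process. Each call starts
// with an empty map, just as a restarted exporter would.
func newRun() func(commonName string) string {
	seen := map[string]string{}
	return func(commonName string) string {
		if ps, ok := seen[commonName]; ok {
			return ps // stable within this run
		}
		buf := make([]byte, 5)
		if _, err := rand.Read(buf); err != nil {
			panic(err)
		}
		for i, b := range buf {
			buf[i] = 'a' + b%26 // map each random byte to a lowercase letter
		}
		seen[commonName] = string(buf)
		return seen[commonName]
	}
}

func main() {
	beforeRestart := newRun()
	afterRestart := newRun()

	// Within one run the pseudonym is stable.
	fmt.Println(beforeRestart("alice") == beforeRestart("alice"))

	// After a "restart" the old mapping is gone, so the same client gets a
	// freshly drawn pseudonym that is unrelated to the previous one.
	fmt.Println(beforeRestart("alice"), afterRestart("alice"))
}
```

Because nothing is persisted, time series from before and after a restart cannot be joined on the pseudonym, which is the limitation (and the privacy benefit) discussed here.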

I hope this answers all your questions :-)

phihos (Author) commented Feb 13, 2022

@patrickjahns Any comment on this?

patrickjahns (Owner) commented

@phihos

Finally got around to thinking through it - let's move forward with it and document the limitation for that feature 👍

phihos linked a pull request Jun 4, 2022 that will close this issue
2 participants