
Store multiple clustering results in the malware analysis JSON #13

Open
So-Cool opened this issue Aug 1, 2016 · 2 comments

Comments

@So-Cool
Collaborator

So-Cool commented Aug 1, 2016

At the moment, at most one parameter setting and its clustering result can be stored per clustering algorithm. This should be extended to allow storing clustering results for multiple parameter settings. See the TODO tags in commit 1727bb0.
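To illustrate the kind of layout this asks for, here is a minimal sketch in which each algorithm maps to a list of runs rather than a single entry. The field names (`clustering`, `params`, `labels`) and the helper `add_clustering_result` are hypothetical and chosen only for illustration; the actual JSON layout touched by the TODO tags may differ.

```python
import json


def add_clustering_result(analysis, algorithm, params, labels):
    """Store one clustering run without overwriting runs that used other parameters.

    NOTE: the "clustering" / "params" / "labels" keys are assumptions,
    not the project's actual JSON schema.
    """
    runs = analysis.setdefault("clustering", {}).setdefault(algorithm, [])
    # If a run with identical parameters already exists, update it in place.
    for run in runs:
        if run["params"] == params:
            run["labels"] = labels
            return analysis
    # Otherwise keep the earlier runs and append the new one.
    runs.append({"params": params, "labels": labels})
    return analysis


analysis = {}
add_clustering_result(analysis, "dbscan", {"eps": 0.5, "min_samples": 5}, [0, 0, 1, -1])
add_clustering_result(analysis, "dbscan", {"eps": 0.3, "min_samples": 5}, [0, 1, 2, -1])
add_clustering_result(
    analysis, "hdbscan", {"min_samples": 5, "min_cluster_size": 10}, [0, 0, 0, 1]
)
print(json.dumps(analysis, indent=2, sort_keys=True))
```

Because the parameters are stored next to the labels, each result stays self-describing when the JSON is read back.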

@greninja

Hey @So-Cool,

I would like to work on this enhancement.

So basically, we would want a good, collision-free hash function that uses the parameters ('eps' and 'min_samples' in the case of DBSCAN, and 'min_samples' and 'min_cluster_size' in the case of HDBSCAN) to generate the hash? Am I correct?

@So-Cool
Collaborator Author

So-Cool commented Jan 26, 2017

Hi @greninja,
it's great that you're willing to work on this. To avoid any mess with your PRs, could you please first finalise the other two issues that you are working on?

A hash function is not really necessary, especially since it would need to be bidirectional. Storing the results is one problem, but retrieving them is another: we want users to be able to understand which parameters were used to get particular results.
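One possible way to get a key that is both readable and reversible, sketched here only as an illustration: serialise the parameter dictionary to a canonical JSON string and use that string as the key. The helper names `params_to_key` and `key_to_params` are hypothetical, and this is not necessarily how the project should implement it.

```python
import json


def params_to_key(params):
    """Turn a parameter dict into a human-readable, order-independent string key."""
    # sort_keys makes the key independent of the order in which parameters were given.
    return json.dumps(params, sort_keys=True)


def key_to_params(key):
    """Recover the original parameter dict from a key built by params_to_key."""
    return json.loads(key)


key = params_to_key({"min_samples": 5, "eps": 0.5})
print(key)                 # the parameters are visible directly in the key
print(key_to_params(key))  # and can be recovered without any lookup table
```

Unlike a hash, such a key can be read directly by a user and mapped back to the parameters that produced a given result.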
