The execution time is very slow #13
Comments
""" Citation: from multiprocessing import Pool def DBCV(X, labels, dist_function=euclidean):
def _core_dist(point, neighbors, dist_function):
def _mutual_reachability_dist(point_i, point_j, neighbors_i,
def _mutual_reach_dist_graph(X, labels, dist_function):
def _mutual_reach_dist_graph_worker(X, labels, dist_function, point_members_row): def _mutual_reach_dist_graph_multiproc(X, labels, dist_function):
def _mutual_reach_dist_MST(dist_tree):
def _cluster_density_sparseness(MST, labels, cluster):
def _cluster_density_separation(MST, labels, cluster_i, cluster_j):
def _cluster_validity_index(MST, labels, cluster):
def _cluster_validity_index_worker(MST, labels, n_samples, label): def _clustering_validity_index_multiproc(MST, labels): def _clustering_validity_index(MST, labels):
def _get_label_members(X, labels, cluster):
|
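The commenter's multiprocess variant is not shown in full, but its row-per-worker idea can be sketched. This is a hypothetical simplification that parallelizes plain pairwise distances only (it omits the core-distance term of the real mutual reachability graph); the function and parameter names are illustrative, not from the attached code:

```python
from functools import partial
from multiprocessing import Pool

import numpy as np
from scipy.spatial.distance import euclidean


def _graph_row(X, dist_function, i):
    # One row of the pairwise distance graph for point i.
    return [dist_function(X[i], X[j]) for j in range(len(X))]


def mutual_reach_dist_graph_multiproc(X, dist_function=euclidean, processes=2):
    # Sketch: distribute graph rows across worker processes with Pool.map,
    # analogous to what _mutual_reach_dist_graph_multiproc would do.
    with Pool(processes) as pool:
        rows = pool.map(partial(_graph_row, X, dist_function), range(len(X)))
    return np.array(rows)
```

Note that each worker receives a copy of `X` via pickling, so for large arrays the inter-process overhead can eat into the speedup.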
I have included a version that runs in multiprocess, so it runs much faster.
Thank you very much ^^
First, thanks @christopherjenness for this implementation! You don't need to introduce parallelism to make it faster; there are some steps you can take to instantly improve the runtime. I'm currently having a look at it, and these modifications already lead to a considerable improvement. The first is to store the already-calculated distance for both directions and skip that calculation step with an increasing offset. The other is to store the neighbors for a specific label and thereby avoid extracting the neighbors inside the for loops.
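The two optimizations described above could look roughly like this. A minimal sketch with hypothetical names, assuming a symmetric distance function such as Euclidean:

```python
import numpy as np


def pairwise_dists_symmetric(X, dist_function):
    # Optimization 1: fill only the upper triangle with an increasing offset
    # (j starts at i + 1) and mirror the result, since d(i, j) == d(j, i).
    # This halves the number of distance evaluations.
    n = len(X)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = dist_function(X[i], X[j])
            D[i, j] = d
            D[j, i] = d
    return D


def members_by_label(X, labels):
    # Optimization 2: extract the members of each label once, up front,
    # instead of re-filtering X inside every loop iteration.
    return {lab: X[labels == lab] for lab in np.unique(labels)}
```

Both changes keep the algorithm single-process; they only remove redundant work.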
Your solution is interesting. Unfortunately, it is not scalable: I ran it on 200 two-dimensional points and it took almost 6 seconds. For thousands of points I can't keep it running.
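For scale, the Python-level double loop over point pairs is usually the bottleneck. One common alternative (not part of the code discussed here) is SciPy's vectorized pairwise distances, which compute the same matrix in C; a sketch on 200 random 2-D points like those mentioned above:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))

# pdist returns the condensed upper triangle; squareform expands it
# into the full symmetric matrix.
D = squareform(pdist(X, metric="euclidean"))
```

This is still O(n^2) in time and memory, so for many thousands of points the full matrix itself becomes the limit, regardless of parallelism.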