Solves #71 and push to v0.11.4
sylvaticus committed Mar 18, 2024
1 parent 4bf2d55 commit b337c4b
Showing 4 changed files with 11 additions and 4 deletions.
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,7 +1,7 @@
name = "BetaML"
uuid = "024491cd-cc6b-443e-8034-08ea7eb7db2b"
authors = ["Antonello Lobianco <[email protected]>"]
-version = "0.11.3"
+version = "0.11.4"

[deps]
AbstractTrees = "1520ce14-60c1-5f80-bbc7-55ef81b5835c"
4 changes: 2 additions & 2 deletions src/Clustering/Clustering_hard.jl
@@ -250,7 +250,7 @@ $(TYPEDFIELDS)
Base.@kwdef mutable struct KMeansC_hp <: BetaMLHyperParametersSet
"Number of classes to discriminate the data [def: 3]"
n_classes::Int64 = 3
-"Function to employ as distance. Default to the Euclidean distance. Can be one of the predefined distances (`l1_distance`, `l2_distance`, `l2squared_distance`), `cosine_distance`), any user defined function accepting two vectors and returning a scalar or an anonymous function with the same characteristics. Attention that the `KMeansClusterer` algorithm is not guaranteed to converge with other distances than the Euclidean one."
+"Function to employ as distance. Default to the Euclidean distance. Can be one of the predefined distances (`l1_distance`, `l2_distance`, `l2squared_distance`, `cosine_distance`), any user defined function accepting two vectors and returning a scalar or an anonymous function with the same characteristics. Attention that the `KMeansClusterer` algorithm is not guaranteed to converge with other distances than the Euclidean one."
dist::Function = (x,y) -> norm(x-y)
"""
The computation method of the vector of the initial representatives.
@@ -276,7 +276,7 @@ $(TYPEDFIELDS)
Base.@kwdef mutable struct KMedoidsC_hp <: BetaMLHyperParametersSet
"Number of classes to discriminate the data [def: 3]"
n_classes::Int64 = 3
-"Function to employ as distance. Default to the Euclidean distance. Can be one of the predefined distances (`l1_distance`, `l2_distance`, `l2squared_distance`), `cosine_distance`), any user defined function accepting two vectors and returning a scalar or an anonymous function with the same characteristics. Attention that the `KMeansClusterer` algorithm is not guaranteed to converge with other distances than the Euclidean one."
+"Function to employ as distance. Default to the Euclidean distance. Can be one of the predefined distances (`l1_distance`, `l2_distance`, `l2squared_distance`, `cosine_distance`), any user defined function accepting two vectors and returning a scalar or an anonymous function with the same characteristics. Attention that the `KMeansClusterer` algorithm is not guaranteed to converge with other distances than the Euclidean one."
dist::Function = (x,y) -> norm(x-y)
"""
The computation method of the vector of the initial representatives.
3 changes: 2 additions & 1 deletion src/Utils/Measures.jl
@@ -6,14 +6,15 @@
# ------------------------------------------------------------------------------
# Some common distance measures

+# https://weaviate.io/blog/distance-metrics-in-vector-search
"""L1 norm distance (aka _Manhattan Distance_)"""
l1_distance(x,y) = sum(abs.(x-y))
"""Euclidean (L2) distance"""
l2_distance(x,y) = norm(x-y)
"""Squared Euclidean (L2) distance"""
l2squared_distance(x,y) = norm(x-y)^2
"""Cosine distance"""
-cosine_distance(x,y) = dot(x,y)/(norm(x)*norm(y))
+cosine_distance(x,y) = 1-dot(x,y)/(norm(x)*norm(y))
"""
$(TYPEDSIGNATURES)
6 changes: 6 additions & 0 deletions test/Utils_tests.jl
@@ -723,6 +723,12 @@ size(w2) == (4,)
eltype(w2) == Float64


+# ==================================
+# New test
+println("** Testing cosine distance....")
+x = [0,1]; y = [1,0]
+@test cosine_distance(x,y) == 1
+
# MLJ Tests
# ==================================
# NEW TEST

2 comments on commit b337c4b

@sylvaticus

@JuliaRegistrator register

Release notes:

bugfix (solve issue in cosine_distance - similarity was actually computed)
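
For readers comparing the two formulas in `src/Utils/Measures.jl`, here is a minimal sketch of the difference the bugfix addresses. The names `cosine_similarity` and `cosine_distance_sketch` are hypothetical and used only for illustration; they are not part of BetaML. The first mirrors the expression before this commit, the second the expression after it.

```julia
using LinearAlgebra  # provides dot and norm

# Hypothetical helper: the expression the old `cosine_distance` evaluated,
# which is the cosine *similarity* (1 for parallel vectors, 0 for orthogonal ones).
cosine_similarity(x, y) = dot(x, y) / (norm(x) * norm(y))

# Hypothetical helper: the expression used after this commit,
# i.e. one minus the similarity (0 for parallel vectors, 1 for orthogonal ones).
cosine_distance_sketch(x, y) = 1 - cosine_similarity(x, y)

x = [0, 1]; y = [1, 0]   # orthogonal vectors, as in the new test

sim_xy  = cosine_similarity(x, y)        # 0.0
dist_xy = cosine_distance_sketch(x, y)   # 1.0
dist_xx = cosine_distance_sketch(x, x)   # 0.0
```

With the old expression, two orthogonal vectors would have been reported as being at distance 0, which is why the new test checks that `cosine_distance(x,y) == 1`.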

@JuliaRegistrator

Registration pull request created: JuliaRegistries/General/103074

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v0.11.4 -m "<description of version>" b337c4be3be5401df3a826c303b0a025cc456251
git push origin v0.11.4
