adding hillestimator #31

Closed
wants to merge 15 commits into from
3 changes: 3 additions & 0 deletions README.md
@@ -86,6 +86,9 @@ returnlevels(xs)

# mean excess with previous k values
meanexcess(xs, k)

# Hill estimator for the tail index using the top k largest values
hillestimator(xs, k)
```

## References
4 changes: 2 additions & 2 deletions src/ExtremeStats.jl
@@ -27,6 +27,6 @@ export

# statistics
returnlevels,
- meanexcess
-
+ meanexcess,
+ hillestimator
end
18 changes: 18 additions & 0 deletions src/stats.jl
@@ -43,3 +43,21 @@ function meanexcess(xs::AbstractVector, ks::AbstractVector{Int})
ys = sort(xs, rev=true)
[mean(log.(ys[1:k-1])) - log(ys[k]) for k in ks]
end

"""
    hillestimator(xs, k)

Return the Hill estimator of the tail index for the data `xs` using the `k` largest values.
"""
function hillestimator(xs::AbstractVector, k::Int)
    sorted_xs = sort(xs, rev=true)

    if k >= length(sorted_xs)
        error("k must be smaller than the number of data points")
    end

    sum_log_ratios = sum(log(sorted_xs[i]) - log(sorted_xs[k+1]) for i in 1:k)
    hill_estimate = sum_log_ratios / k

    return hill_estimate
Comment on lines +53 to +62
Member

This still uses underscores in variable names and 4 spaces for indentation.

end
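
For illustration only, here is a sketch of how the function might look once the review comment is addressed. It assumes the project prefers 2-space indentation and short names without underscores; the comment rules out underscores and 4 spaces but does not state the target style. The name `ys` is borrowed from `meanexcess`, and the `1 ≤ k` lower-bound check and the `ArgumentError` are additions that are not part of the PR.

```julia
# Sketch only: same computation as the diff above, restyled with
# 2-space indentation (assumed project style) and no underscores.
function hillestimator(xs::AbstractVector, k::Int)
  n = length(xs)

  # k+1 must be a valid index; the lower bound avoids dividing by zero
  # (this lower-bound check is an assumption, not in the PR)
  1 ≤ k < n || throw(ArgumentError("k must satisfy 1 ≤ k < length(xs)"))

  ys = sort(xs, rev=true)

  # average log-excess of the k largest values over the (k+1)-th largest
  sum(log(ys[i]) - log(ys[k+1]) for i in 1:k) / k
end
```

As in the PR, the inputs are assumed to be strictly positive so that `log` is defined.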