Create rad models for logseries and lognormal #48
Examining code from
drlnorm <- function(x, meanlog, sdlog, S) {
    qlnorm(ppoints(S)[x], meanlog, sdlog, lower.tail=FALSE) / sum(qlnorm(ppoints(S), meanlog, sdlog))
}
Does this make any sense? |
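One quick way to probe whether the definition makes sense as a probability over ranks is to check that it sums to 1 and decreases with rank. The sketch below uses only base R; the values of `S`, `meanlog` and `sdlog` are arbitrary assumptions for illustration, not taken from any dataset.

```r
# Rank-abundance "density" for the lognormal, copied from the comment above
drlnorm <- function(x, meanlog, sdlog, S) {
  qlnorm(ppoints(S)[x], meanlog, sdlog, lower.tail = FALSE) /
    sum(qlnorm(ppoints(S), meanlog, sdlog))
}

S <- 30  # hypothetical number of species
p <- drlnorm(seq_len(S), meanlog = 1, sdlog = 0.4, S = S)

# ppoints() is symmetric around 0.5, so the upper-tail quantiles in the
# numerator are the same set of values as the lower-tail ones in the denominator
total <- sum(p)
is_decreasing <- all(diff(p) < 0)
print(total)
print(is_decreasing)
```

If both checks hold, the function is at least a valid distribution over ranks 1..S; whether it is the right model is a separate question.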
Checking with Broken-Stick gave a good agreement, but not perfect
|
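For reference, the analytic broken-stick expectation can be computed directly in base R. This is a minimal sketch assuming the standard MacArthur formula, p_i = (1/S) * sum_{k=i}^{S} 1/k; that `drbs` implements exactly this formula is an assumption, and `bs_expected` is a hypothetical helper name.

```r
# Expected broken-stick proportion for each rank i = 1..S:
#   p_i = (1/S) * sum_{k=i}^{S} 1/k
bs_expected <- function(S) {
  # rev/cumsum/rev builds the tail sums sum_{k=i}^{S} 1/k for every i
  rev(cumsum(rev(1 / seq_len(S)))) / S
}

bs <- bs_expected(10)
# Expected abundances for N = 100 individuals, rounded to two decimals
print(round(bs * 100, 2))
```

Comparing these values with the quantile-based ranks (upper-tail quantiles at `ppoints(S)`) shows how far the two constructions diverge for a given S.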
But when moving on to the fits, something went wrong, at least for the logseries:
> drls <- function(x, alpha, N, S) {
+ qls(ppoints(S)[x], N, alpha, lower.tail=FALSE) / sum(qlnorm(ppoints(S), N, alpha))
+ }
> okland.ranks <- rep(1:length(okland),sort(okland, decreasing=TRUE))
> LL2 <- function(alpha)
+ -sum(log(drls(okland.ranks, alpha, sum(okland), length(okland))))
> (okland.ls <- fitls(okland))
Maximum likelihood estimation
Type: discrete species abundance distribution
Call:
mle2(minuslogl = function (N, alpha)
-sum(dls(x, N, alpha, log = TRUE)), start = list(alpha = 3.37806926793836),
method = "Brent", fixed = list(N = 1689), data = list(x = list(
438, 257, 248, 157, 143, "etc")), lower = 0, upper = 21L)
Coefficients:
N alpha
1689.000000 3.378058
Log-likelihood: -112.41
> (okland.rls <- mle2(LL2, list(alpha=3.37)))
Error in optim(par = 3.37, fn = function (p) :
initial value in 'vmmin' is not finite
## Trying one-dimensional fit
> (okland.rls <- mle2(LL2, list(alpha=3.37), method="Brent", lower=1, upper=10))
Call:
mle2(minuslogl = LL2, start = list(alpha = 3.37), method = "Brent",
lower = 1, upper = 10)
Coefficients:
alpha
10
Log-likelihood: -Inf
There were 37 warnings (use warnings() to see them)
Also, fitting was extremely slow. But this can be solved using the estimated coefficients from the faster |
Same problem when trying to fit the lognormal:
> drlnorm <- function(x, meanlog, sdlog, S) {
+ qlnorm(ppoints(S)[x], meanlog, sdlog, lower.tail=FALSE) / sum(qlnorm(ppoints(S), meanlog, sdlog))
+ }
> okland.ranks <- rep(1:length(okland),sort(okland, decreasing=TRUE))
> LL1 <- function(mu, sigma)
+ -sum(log(drlnorm(okland.ranks, mu, sigma ,length(okland))))
> (okland.ln <- fitlnorm(okland))
Maximum likelihood estimation
Type: continuous species abundance distribution
Call:
mle2(minuslogl = function (meanlog, sdlog)
-sum(dlnorm(x, meanlog, sdlog, log = TRUE)), start = list(meanlog = 3.36487212445623,
sdlog = 1.54523635728825), data = list(x = list(438, 257,
248, 157, 143, "etc")))
Coefficients:
meanlog sdlog
3.364872 1.507996
Log-likelihood: -109.09
> (okland.rln <- mle2(LL1, list(mu=1, sigma=1)))
Call:
mle2(minuslogl = LL1, start = list(mu = 1, sigma = 1))
Coefficients:
mu sigma
1.000000 1.368092
Log-likelihood: -3926.41
> (okland.rln <- mle2(LL1, list(mu=100, sigma=1)))
Call:
mle2(minuslogl = LL1, start = list(mu = 100, sigma = 1))
Coefficients:
mu sigma
100.000000 1.368092
Log-likelihood: -3926.41 |
The difference for
> drbs(1:10, 100, 10)*100
[1] 29.29 19.29 14.29 10.96 8.46 6.46 4.79 3.36 2.11 1.00
> qbs(ppoints(10,a=1/2),100,10, lower.tail=F)
[1] 28 19 14 11 8 6 4 3 1 1
However, rlnorm seems to be constant on
> drlnorm(1:5, 1, 0.4, 30)
[1] 0.07234592 0.05963141 0.05370137 0.04974786 0.04675000
> drlnorm(1:5, 100, 0.4, 30)
[1] 0.07234592 0.05963141 0.05370137 0.04974786 0.04675000
> drlnorm(1:5, 0.01, 0.4, 30)
[1] 0.07234592 0.05963141 0.05370137 0.04974786 0.04675000 |
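The identical outputs above can be reproduced in base R, together with an equivalent expression that does not involve `meanlog` at all. This is a sketch built on the `drlnorm` definition from this thread; the variable names are made up for illustration.

```r
drlnorm <- function(x, meanlog, sdlog, S) {
  qlnorm(ppoints(S)[x], meanlog, sdlog, lower.tail = FALSE) /
    sum(qlnorm(ppoints(S), meanlog, sdlog))
}

p_small <- drlnorm(1:5, 0.01, 0.4, 30)
p_mid   <- drlnorm(1:5, 1,    0.4, 30)
p_large <- drlnorm(1:5, 100,  0.4, 30)

# Same quantity written with standard normal quantiles only:
# exp(meanlog) appears in numerator and denominator and cancels
z <- qnorm(ppoints(30), lower.tail = FALSE)
p_no_mean <- (exp(0.4 * z) / sum(exp(0.4 * z)))[1:5]

meanlog_is_irrelevant <- isTRUE(all.equal(p_small, p_mid)) &&
  isTRUE(all.equal(p_mid, p_large)) &&
  isTRUE(all.equal(p_mid, p_no_mean))
print(meanlog_is_irrelevant)
```

This would also explain why `mle2` leaves `mu` at its starting value in the fits above: with this parameterization the likelihood surface is flat along `mu`.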
Wilson (1991) gives the following expression for the "log-abundance of the i-th species of S":
As we are interested only in the effect of meanlog, I will write this as
In order to transform this into a density, we need to normalize it:
So it seems that |
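The cancellation can be sketched from the `drlnorm` code in this thread (not from Wilson's original notation, which may differ). Here $p_i$ stands for `ppoints(S)[i]`, $\mu$ for `meanlog`, $\sigma$ for `sdlog`, and $z_i$ is a symbol introduced only for this sketch:

```latex
\[
q_i = \exp\left(\mu + \sigma z_i\right), \qquad z_i = \Phi^{-1}(1 - p_i)
\]
\[
\frac{q_i}{\sum_{j=1}^{S} q_j}
  = \frac{e^{\mu}\, e^{\sigma z_i}}{e^{\mu} \sum_{j=1}^{S} e^{\sigma z_j}}
  = \frac{e^{\sigma z_i}}{\sum_{j=1}^{S} e^{\sigma z_j}}
\]
```

The factor $e^{\mu}$ is common to every term, so the normalized value depends on $\sigma$ and $S$ only. The code's denominator uses lower-tail quantiles, but because `ppoints(S)` is symmetric around 0.5 the sum runs over the same set of values.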
OK, back to the logseries: first, note that there is a typo in
> drls <- function(x, alpha, N, S) {
+ qls(ppoints(S)[x], N, alpha, lower.tail=FALSE) / sum(qlnorm(ppoints(S), N, alpha))
+ }
The original function is also slow as hell, because qls() is called for EACH value in okland.ranks, which is a vector with many repeated elements. The following function is optimized to deal with this situation:
> drls2 <- function(x, alpha, N, S) {
+ myq <- qls(ppoints(S), N, alpha, lower.tail=FALSE) / sum(qls(ppoints(S), N, alpha))
+ myq[x]
+ }
> library(microbenchmark)
> microbenchmark(drls2(okland.ranks, alpha, N, S), drls(okland.ranks, alpha, N, S), times=5)
However, this function is very ill-behaved, and seems to converge to an alpha around 27:
> (okland.rls <- mle2(LL2, list(alpha=27.53629), method="Brent", lower=2, upper=35, eval.only=T))
> bbmle::profile(okland.rls, prof.lower=0, prof.upper=30, std.err=0.17)->okland.p
> plotprofmle(okland.p) |
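The same idea (compute the S quantiles once, then index) can be taken one step further: the expanded rank vector is not needed at all, because the negative log-likelihood is just the abundance-weighted sum of log-probabilities. Below is a base R sketch using the lognormal version, since `qls()` comes from sads; the five abundances are the first okland values printed above, used here only as toy data, and `drlnorm2` is a hypothetical name.

```r
# Quantiles are computed once per call and then indexed by rank
drlnorm2 <- function(x, meanlog, sdlog, S) {
  q <- qlnorm(ppoints(S), meanlog, sdlog, lower.tail = FALSE)
  (q / sum(q))[x]
}

abund <- sort(c(438, 257, 248, 157, 143), decreasing = TRUE)  # toy subset
S <- length(abund)
ranks <- rep(seq_len(S), abund)  # one entry per individual, as in okland.ranks

# Expanded form: one log-probability per individual
nll_expanded <- -sum(log(drlnorm2(ranks, 3, 1.5, S)))
# Tabulated form: one log-probability per species, weighted by abundance
nll_tabulated <- -sum(abund * log(drlnorm2(seq_len(S), 3, 1.5, S)))

print(c(nll_expanded, nll_tabulated))
```

Both forms give the same value, but the tabulated one touches S elements instead of N, which matters when the optimizer evaluates the function many times.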
After some days working on this issue with very little progress, I'm voting for moving it to the next milestone, and releasing sads 0.2.0 on CRAN after merging #53 |
Many open theoretical and computational questions. Needs further thinking, not a milestone for version 0.2 anymore. |
Ideas to test:
|
The main purpose here is to make likelihoods from sad and rad fits comparable. |
The answer to the first question above is: yes.
N = 1e5
x1 <- rsad(S=N, frac=1, sad="lnorm", meanlog=5)
x2 <- rsad(S=N, frac=1, sad="lnorm", meanlog=7)
x3 <- rsad(S=N, frac=1, sad="lnorm", meanlog=9)
plot(rad(x1), type='l', xlim=c(0, 1e5), ylim=c(1, max(x3)))
lines(rad(x2), col='blue')
lines(rad(x3), col='red') |
Implemented a tentative I have used a general function The performance is also a little better than in the first versions, as I have used the optimized code above, and |
It is a bit strange to see that fitlnorm seems to provide a better fit for
> AICtab(fitls(birds), fitlnorm(birds), fitbs(birds), base=T)
AIC dAIC df
fitlnorm(birds) 934.8 0.0 2
fitls(birds) 945.4 10.6 1
fitbs(birds) 992.8 58.0 0
> AICtab(as.fitrad(fitls(birds)), as.fitrad(fitlnorm(birds)), fitrbs(birds), base=T)
AIC dAIC df
as.fitrad(fitls(birds)) 98555.6 0.0 2
as.fitrad(fitlnorm(birds)) 99163.4 607.8 2
fitrbs(birds) 102071.2 3515.6 0
The relevant code from drad seems OK, so maybe this is a problem similar to the "trueLL" weirdness. Should we merge this? Should we add more distributions to the "as.fitrad" code? Should we investigate something else? |
Too weird, back to the drawing board. Moving milestone to 1.0.0 |
Check May (1975) and Wilson, J. B. (1991) for analytic expressions.