Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OLS with collinearity and weights #420

Closed
dwinkler1 opened this issue Apr 22, 2021 · 5 comments
Closed

OLS with collinearity and weights #420

dwinkler1 opened this issue Apr 22, 2021 · 5 comments

Comments

@dwinkler1
Copy link

dwinkler1 commented Apr 22, 2021

I am having a problem fitting an OLS model with weights and collinearity using GLM.jl.
Here is a simple example

y = rand(10)
x = rand(10, 2)
x = hcat(x,x)
lm(x,y) # correctly removes last two columns
lm(x, y, wts = ones(10)) # fails at cholesky! in GLM.delbeta!
lm(x, y, wts = ones(10), dropcollinear = true) # # fails at cholesky! in GLM.delbeta!

I further narrowed this down to the following function call:

linmod = LinearModel(GLM.LmResp(y,ones(10)), GLM.cholpred(x, true))
GLM.delbeta!(linmod.pp, linmod.rr.y, linmod.rr.wts) # fails
GLM.delbeta!(linmod.pp, linmod.rr.y) # works

using

@lessGLM.delbeta!(linmod.pp, linmod.rr.y, linmod.rr.wts)

I get

function delbeta!(p::DensePredChol{T,<:CholeskyPivoted}, r::Vector{T}, wt::Vector{T}) where T<:BlasReal
    cf = cholfactors(p.chol)
    piv = p.chol.p
    cf .= mul!(p.scratchm2, adjoint(LinearAlgebra.mul!(p.scratchm1, Diagonal(wt), p.X)), p.X)[piv, piv]
    cholesky!(Hermitian(cf, Symbol(p.chol.uplo)))
    ldiv!(p.chol, mul!(p.delbeta, transpose(p.scratchm1), r))
    p
end

and

@less GLM.delbeta!(linmod.pp, linmod.rr.y)
function delbeta!(p::DensePredChol{T,<:CholeskyPivoted}, r::Vector{T}) where T<:BlasReal
    ch = p.chol
    delbeta = mul!(p.delbeta, adjoint(p.X), r)
    rnk = rank(ch)
    if rnk == length(delbeta)
        ldiv!(ch, delbeta)
    else
        permute!(delbeta, ch.p)
        for k=(rnk+1):length(delbeta)
            delbeta[k] = -zero(T)
        end
        LAPACK.potrs!(ch.uplo, view(ch.factors, 1:rnk, 1:rnk), view(delbeta, 1:rnk))
        invpermute!(delbeta, ch.p)
    end
    p
end

Here is a naive implementation that seems to work:

using LinearAlgebra, GLM
y = rand(10)
x = rand(10, 2)
xc = hcat(x,x)
wts = [ones(5)..., 0.5 .* (ones(5))...];
T = eltype(xc)
ch = cholesky!((xc'diagm(wts)xc), Val(true), tol = -one(T), check = false) 
delbeta = zeros(T, size(xc,2))
delbeta = mul!(delbeta, adjoint(xc), diagm(wts)*y)
rnk = rank(ch)
permute!(delbeta, ch.p)
for k=(rnk+1):length(delbeta)
    delbeta[k] = -zero(Float64)
end
LAPACK.potrs!(ch.uplo, view(ch.factors, 1:rnk, 1:rnk), view(delbeta, 1:rnk))
invpermute!(delbeta, ch.p)
julia> delbeta[1:rnk]  coef(lm(x, y, wts = wts))
true
@nalimilan
Copy link
Member

Thanks for the detailed report. Would you make a (draft) PR to show what you suggest to change exactly?

@dwinkler1
Copy link
Author

Sure. It'll take me a while unfortunately. Swamped at work ATM.

@dwinkler1
Copy link
Author

Sorry it took so long. My PR seems to work correctly against a naive implementation:

using LinearAlgebra, GLM, Random
Random.seed!(1)
n = 100
x = rand(n, 2)
x = hcat(x,x)
y = x * [1;2;0;0] + randn(n)
wts = 10 .*rand(n)
xi = x[:,1:2];
dwt = Diagonal(wts);
coef_naive = (xi'dwt*xi)\xi'dwt*y

coef(lm(x, y, wts = wts))
coef(lm(x, y, wts = wts))[1:2] ≈ coef_naive # true

@kleinschmidt
Copy link
Member

I hit this too and came to the same diagnosis; I'll give your fix a try and see if it works in my case as well.

@andreasnoack
Copy link
Member

This was fixed by 1459737

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants