Repeated linear least squares problem with large dimensions

Hi,
I’m working on a larger algorithm in which a recurring step (done 5-20 times) is to carry out a linear regression y^k = X\beta^k + \varepsilon^k with roughly 100,000 observations and 1,000-5,000 variables, where y^k is either a length-100,000 vector or a 100,000 × 1,000 matrix that depends on the stage k of the algorithm. The problem is fully dense. Since the algorithm itself is iterative, I don’t know y^{k+1} before carrying out the regression for y^k (and doing some subsequent calculations), but I do know that the matrix X is fixed.

As such, I’d like to save some computational effort and recycle some work if possible. The thing that comes to mind is to compute pinv(X) and to carry out the regression as pinv(X)*y^k. I was wondering if this is the way to go here, or if there is a better approach that I’m missing. I look forward to hearing your thoughts!

I’m aware of the caching method described for LinearSolve.jl, but as I understand it, that is for linear systems, not for least squares: Linear Solve with Caching Interface · LinearSolve.jl.

MWE:

using BenchmarkTools, Random, LinearAlgebra
Random.seed!(42)

function test(N, K)
    X = rand(N, K)
    y = X * rand(K) + randn(N)
    
    @time X\y
    @time pinv(X)*y
    return nothing
end
test(200_000, 1500)

gives

 34.280021 seconds (9.01 k allocations: 2.238 GiB, 0.00% gc time)
 34.372401 seconds (28 allocations: 9.025 GiB, 0.33% gc time)

Try

Xf = factorize(X)        # factor X once

for k in 1:nsteps        # nsteps = number of stages of your algorithm
    beta = Xf \ yk       # yk is the right-hand side at stage k
end

The expensive part is the factorization of X, so you can perform it only once and then do the relatively cheap Xf \ yk solves using the pre-factorized object.

If you’d like, you can also choose the factorization manually, e.g.,

Xf = svd(X)
Xf = qr(X)

etc.
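
For example, here is a small sketch (with much smaller dimensions than your problem so it runs quickly, and a randomly generated y standing in for your y^k) comparing two factorizations that can be reused across right-hand sides:

using LinearAlgebra, Random
Random.seed!(1)

N, K = 10_000, 200               # reduced sizes, just for illustration
X = rand(N, K)
y = X * rand(K) + randn(N)

Fqr  = qr(X, ColumnNorm())       # column-pivoted QR; factorize(X) should choose something similar for a dense rectangular X
Fsvd = svd(X)                    # SVD: more expensive, but also gives singular values / rank information

beta_qr  = Fqr \ y               # least-squares solve reusing the stored factorization
beta_svd = Fsvd \ y
@show norm(beta_qr - beta_svd)   # the two solutions should agree closely for a well-conditioned X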


The pseudo-inverse (computed by SVD in pinv) is expensive, and I recently read that multiplying by the explicit pinv matrix is not a backwards-stable way of using the pseudo-inverse (Liu & Barnett, 2016, and references therein). The default method in Julia for least-squares problems via β = X \ y is column-pivoted QR, as explained in this thread: Efficient way of doing linear regression - #33 by stevengj

Because of this, I would tend to use QR = qr(X, ColumnNorm()) and then do β = QR \ y on each step. This re-uses the efficient QR factorization and should be robust and backwards stable.
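
To make that concrete, here is a minimal sketch of the repeated-regression loop, with reduced dimensions and randomly generated stand-ins for the successive right-hand sides y^k (in your algorithm these would come from the previous stage):

using LinearAlgebra, Random
Random.seed!(1)

N, K, nsteps = 10_000, 200, 5    # reduced sizes for illustration
X = rand(N, K)

F = qr(X, ColumnNorm())          # expensive: done once, since X is fixed

betas = Vector{Vector{Float64}}(undef, nsteps)
for k in 1:nsteps
    yk = X * rand(K) + randn(N)  # placeholder: in practice y^k depends on stage k
    betas[k] = F \ yk            # cheap: reuses the stored pivoted-QR factorization
end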


Thank you both for your replies; I will do this. I wasn’t aware of the backwards-stability issue; thank you for pointing that out!

It turns out that after doing this, for any given problem I consider, the vast majority of the computational time now lies in the three such QR factorizations that I need to perform. This is a major improvement, thanks a lot!

I have access to an HPC cluster. Suppose the dimensions become a bit larger; would it still be advisable to just call qr(X, ColumnNorm()), or is there at some point another way to make use of distributed/parallel computing that outperforms simply calling qr(X, ColumnNorm())? (If it helps, suppose that all cores are available.)

Elemental.jl has a distributed-memory column-pivoted QR, I think.

Often when you get to the point of doing distributed dense linear algebra you want to take a step back and be more clever, but without knowing more about your problem it’s hard to make any specific suggestion in that regard.

