Hi,
I’m working with a larger algorithm in which a recurring step (done 5–20 times) is to carry out a linear regression y^k = X\beta^k + \varepsilon^k with roughly 100,000 observations and 1,000–5,000 variables, where y^k is either a length-100,000 vector or a 100,000 × 1,000 matrix, depending on the stage k of the algorithm. The problem is fully dense. Since the algorithm itself is iterative, I don’t know y^{k+1} before carrying out the regression for y^k (and doing some subsequent calculations), but I do know that the matrix X is fixed.
As such, I’d like to save some computational effort by recycling work where possible. What comes to mind is to compute pinv(X) once and carry out each regression as pinv(X)*y^k. Is this the way to go here, or is there a better approach that I’m missing? I look forward to hearing your thoughts!
using BenchmarkTools, Random, LinearAlgebra
Random.seed!(42)

function test(N, K)
    X = rand(N, K)                # fixed design matrix
    y = X * rand(K) + randn(N)    # synthetic response
    @time X \ y                   # least squares via backslash
    @time pinv(X) * y             # least squares via pseudoinverse
    return nothing
end

test(200_000, 1500)
The expensive part is the factorization of X, so you can perform it only once and then do the relatively cheap Xf \ y on each step, where Xf is the pre-factorized object.
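A minimal sketch of that factorize-once pattern, with toy dimensions standing in for the real 100,000 × 1,000 problem (the variable names here are illustrative):

```julia
using LinearAlgebra, Random
Random.seed!(1)

# Toy sizes; the real problem is ~100_000 x 1_000-5_000
X = rand(1_000, 50)

# Factor X once; for a rectangular dense matrix this gives a QR factorization
Xf = factorize(X)

# Each subsequent stage then only pays for a relatively cheap solve
y1 = rand(1_000)
y2 = rand(1_000)
β1 = Xf \ y1
β2 = Xf \ y2
```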
If you’d like, you can also choose the factorization manually. Because of this, I would tend to use QR = qr(X, ColumnNorm()) and then do β = QR \ y on each step. This reuses the efficient QR factorization and should be robust and backward stable.
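A sketch of that per-stage reuse, including a matrix right-hand side as in the 100,000 × 1,000 case (toy sizes; names are illustrative):

```julia
using LinearAlgebra, Random
Random.seed!(2)

X = rand(500, 40)
QR = qr(X, ColumnNorm())   # column-pivoted QR, computed once up front

# Stage with a vector response: cheap solve reusing the factorization
y = rand(500)
β = QR \ y

# Stage with a matrix response: all columns solved in one call
Y = rand(500, 10)
B = QR \ Y
```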
It turns out that after doing this, for any given problem that I consider, the vast majority of computational time now lies in three such QR factorizations that I need to perform. This is a major improvement, thanks a lot!
I have access to an HPC. Suppose the dimensions become a bit larger; would it still be advisable to just call qr(X, ColumnNorm()), or is there eventually some way of using distributed/parallel computing that outperforms it? (If it helps, suppose that all cores are available.)
Elemental.jl has a distributed-memory column-pivoted QR, I think.
Often when you get to the point of doing distributed dense linear algebra you want to take a step back and be more clever, but without knowing more about your problem it’s hard to make any specific suggestion in that regard.
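Before going distributed, it may be worth checking that the shared-memory parallelism you already get for free is fully used: the dense QR calls LAPACK, whose speed depends on the BLAS thread count. A minimal check (thread counts are environment-dependent, and the BLAS may cap the value you request):

```julia
using LinearAlgebra

# LAPACK's QR is parallelized through the underlying BLAS;
# on a many-core node, make sure it is allowed to use the cores.
@show BLAS.get_num_threads()
BLAS.set_num_threads(Sys.CPU_THREADS)
@show BLAS.get_num_threads()
```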