Performance bug when creating SparseMatrixCSC #735

@TomDeWeer

Dear all,

I'm trying to import a very large scipy.sparse.csc_matrix into Julia but I'm running into a performance problem. I have tried to create a minimal working example to show what I mean.

First of all, I made the following module in Julia (file PerformanceBug.jl):

module PerformanceBug

using PyCall
using SparseArrays
using LinearAlgebra

export transformToJulia

function transformToJulia(m, n, colPtr, rowVal, nzVal)
    # Wrap the incoming NumPy arrays as PyArray objects.
    colPtrJ = PyArray(PyObject(colPtr))
    rowValJ = PyArray(PyObject(rowVal))
    nzValJ = PyArray(PyObject(nzVal))
    # Shift scipy's 0-based indices to Julia's 1-based indices and copy into Julia vectors.
    colPtrJ = vec(Int64[i + 1 for i in colPtrJ])
    rowValJ = vec(Int64[i + 1 for i in rowValJ])
    nzValJ = vec(Array(nzValJ))
    A = SparseMatrixCSC{Float64, Int64}(m, n, colPtrJ, rowValJ, nzValJ)
    # Until here everything is fast.
    return A # Executing this statement takes ages and consumes all RAM.
end

end

Returning the matrix A in the last statement of the function consumes all my RAM and takes ages.
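
To make the index handling in transformToJulia explicit: scipy stores a CSC matrix with 0-based indptr and indices arrays, while Julia's SparseMatrixCSC expects 1-based column pointers and row values, hence the `i+1` in the comprehensions. Below is a minimal pure-Python sketch of those arrays for an n-by-n identity matrix and of the shift. The helper names `identity_csc` and `to_one_based` are hypothetical, used only for illustration, and are not part of scipy or PyCall.

```python
def identity_csc(n):
    """Return (indptr, indices, data) of the n x n identity in 0-based CSC form."""
    # Column j owns the stored entries indptr[j] .. indptr[j+1]-1.
    indptr = list(range(n + 1))
    # Row index of each stored entry; the identity has one entry per column.
    indices = list(range(n))
    data = [1.0] * n
    return indptr, indices, data


def to_one_based(indptr, indices):
    """Shift 0-based CSC index arrays to the 1-based convention Julia uses."""
    return [p + 1 for p in indptr], [r + 1 for r in indices]


indptr, indices, data = identity_csc(3)
colptr, rowval = to_one_based(indptr, indices)
print(colptr, rowval, data)  # [1, 2, 3, 4] [1, 2, 3] [1.0, 1.0, 1.0]
```

For the 1,000,000-column identity used in this report, these are three arrays of roughly one million entries each, so the index shift itself is cheap compared with the memory use observed when A is returned.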

I call the above Julia module from Python with the following script (run from PyCharm):

import julia, scipy.sparse, os
# setting up Julia from python, called in pycharm
juliaScriptsDir = # path to directory of PerformanceBug.jl
juliaDir = r"C:\Program Files\JuliaPro\Julia-1.2.0\bin"
juliaEXE = r"C:\Program Files\JuliaPro\Julia-1.2.0\bin\julia.exe"
julia.install(julia=juliaEXE)
oldDir = os.getcwd()
os.chdir(juliaDir)
j = julia.Julia(runtime=juliaEXE, compiled_modules=True)
os.chdir(oldDir)
j.include(os.path.join(juliaScriptsDir, "PerformanceBug.jl"))
os.chdir(juliaDir)
from julia import Main
# Creating diagonal matrix in python
n = 1000000
A = scipy.sparse.eye(n).tocsc()
# Conversion to Julia
Main.m = A.shape[0]
Main.n = A.shape[1]
Main.colPtr = A.indptr
Main.rowVal = A.indices
Main.nzVal = A.data
j.eval("A = PerformanceBug.transformToJulia(m, n, colPtr, rowVal, nzVal)")

Does anybody have a clue as to why Julia suddenly has trouble creating a simple sparse matrix? Working purely in Julia poses no performance issues at all.

Kind regards,
Tom
