How to call an MLIR sparse module with only Dense inputs via the Python bindings

Hi!
I have a module that performs sparse tensor element-wise addition. It works for (CSR, Dense2D) -> CSR, (CSR, CSR) -> CSR, and a few other combinations (LLVM version 19.1.0-rc3).

I can’t figure out the Python binding calls for (Dense2D, Dense2D) -> Dense2D, as I keep getting Segmentation fault (core dumped).

My MLIR module is using sparse-assembler{direct-out=true}:

#Dense = #sparse_tensor.encoding<{
    map = (i, j) -> (i : dense, j : dense), posWidth = 64, crdWidth = 64
}>

#map = affine_map<(d0, d1) -> (d0, d1)>
func.func @add(%st_0 : tensor<3x4xf64, #Dense>, %st_1 : tensor<3x4xf64, #Dense>) -> tensor<3x4xf64> attributes { llvm.emit_c_interface } {
    %out_st = tensor.empty() : tensor<3x4xf64>
    %res = linalg.generic {indexing_maps = [#map, #map, #map], iterator_types = ["parallel", "parallel"]}
        ins(%st_0, %st_1 : tensor<3x4xf64, #Dense>, tensor<3x4xf64, #Dense>)
        outs(%out_st : tensor<3x4xf64>) {
        ^bb0(%in_0: f64, %in_1: f64, %out: f64):
            %2 = sparse_tensor.binary %in_0, %in_1 : f64, f64 to f64
                overlap = {
                    ^bb0(%arg1: f64, %arg2: f64):
                        %3 = arith.addf %arg1, %arg2 : f64
                        sparse_tensor.yield %3 : f64
                }
                left = {
                    ^bb0(%arg1: f64):
                        sparse_tensor.yield %arg1 : f64
                }
                right = {
                    ^bb0(%arg1: f64):
                        sparse_tensor.yield %arg1 : f64
                }
            linalg.yield %2 : f64
    } -> tensor<3x4xf64, #Dense>
    return %res : tensor<3x4xf64, #Dense>
}

And I’m trying to call it with:

import mlir.runtime as rt
import ctypes

class Dense2D(ctypes.Structure):
    _fields_ = [
        ("data", rt.make_nd_memref_descriptor(1, ctypes.c_double)),
    ]

result = Dense2D()

module_add.invoke(
    "add",
    ctypes.pointer(ctypes.pointer(result)),
    ctypes.pointer(ctypes.pointer(input_1.data)),
    ctypes.pointer(ctypes.pointer(input_2.data)),
)

I think the structure of the inputs might be incorrect. I figured out how to call modules with CSR and COO inputs, but here with only dense variants I’m missing something. For a [dense, dense] tensor, is it only the data array? Or are there additional pos/indices arrays here?

Thank you for any help!

The all-dense-annotated “sparse” tensors are always interesting, since they differ slightly from pure dense tensors. In fact, within the MLIR sparsifier team, we had many discussions about whether we should even allow them. I, for one, was of the opinion that we should support them, just to make sure that all our sparse DSL-style code generation algorithms work for all cases, including the corner case of marking all dimensions dense.

The format for all-dense-annotated “sparse” tensors is simply a linearized data array in row-major order. You can verify that with the sparse_tensor.print operation, which I added for exactly this kind of debugging!

#AllDense = #sparse_tensor.encoding<{
  map = (i, j) -> (
    i : dense,
    j : dense
  )
}>

module {
  func.func @main() {
    %x = arith.constant dense<[
         [ 1, 0, 2, 0, 0, 0, 0, 0 ],
         [ 0, 0, 0, 0, 0, 0, 0, 0 ],
         [ 0, 0, 0, 0, 0, 0, 0, 0 ],
         [ 0, 0, 3, 4, 0, 5, 0, 0 ] ]> : tensor<4x8xi32>

    %XO = sparse_tensor.convert %x : tensor<4x8xi32> to tensor<4x8xi32, #AllDense>
    sparse_tensor.print %XO : tensor<4x8xi32, #AllDense>
    return
  }
}

output:

---- Sparse Tensor ----
nse = 32
dim = ( 4, 8 )
lvl = ( 4, 8 )
values : ( 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 4, 0, 5, 0, 0 )
----
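For intuition from the Python side, a tiny numpy sketch: the values line is simply the row-major (C-order) flattening of the tensor.

import numpy as np

x = np.array([
    [1, 0, 2, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 3, 4, 0, 5, 0, 0],
], dtype=np.int32)

# Prints all 32 entries in the same order as the values line above:
# 1, 0, 2, 0, ..., 3, 4, 0, 5, 0, 0
print(x.flatten(order="C"))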


Having said all that, it always helps to simply inspect the generated code. I had to fix various missing annotations in your example (so I am not 100% sure what you eventually want), but after adding #Dense to all the missing cases, we get:

  llvm.func @_mlir_ciface_add(%arg0: !llvm.ptr, %arg1: !llvm.ptr) -> !llvm.ptr attributes {llvm.emit_c_interface} {
    %0 = llvm.call @add(%arg0, %arg1) : (!llvm.ptr, !llvm.ptr) -> !llvm.ptr
    llvm.return %0 : !llvm.ptr
  }

which takes two 1-dim arrays as input and returns one 1-dim array as output, since you specified direct-out=true; note that there are subtle differences here in who owns the buffer and who is responsible for eventually releasing it. Please also note that direct-out only works with enable-runtime-library=false (otherwise you get a pointer back to an opaque C++ struct, which is perhaps what was missing in your setup).
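To make that calling convention concrete, here is a minimal Python sketch (assuming module_add is your compiled engine from the original post, and an all-dense 3x4xf64 signature on both sides): each operand is passed as a single rank-1 memref of linearized values, and the result comes back the same way.

import ctypes
import numpy as np
import mlir.runtime as rt

# Each all-dense 3x4 operand is a single rank-1 memref holding its 12
# values in row-major order; there are no pos/crd arrays involved.
a = np.arange(12, dtype=np.float64)
b = np.ones(12, dtype=np.float64)
arg0 = rt.get_ranked_memref_descriptor(a)
arg1 = rt.get_ranked_memref_descriptor(b)

# With direct-out=true the result is likewise one rank-1 memref.
res = rt.make_nd_memref_descriptor(1, ctypes.c_double)()

module_add.invoke(
    "add",
    ctypes.pointer(ctypes.pointer(res)),
    ctypes.pointer(ctypes.pointer(arg0)),
    ctypes.pointer(ctypes.pointer(arg1)),
)

# View the returned buffer as a numpy array and restore the 3x4 shape.
print(rt.ranked_memref_to_numpy(ctypes.pointer(res)).reshape(3, 4))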

Let me know if this helps. Otherwise happy to help out some more!

By the way, you can also have a look at the MPACT posting I made a while back. It provides a fully functional PyTorch end-to-end ML compiler with sparsity support (and thus shows all the parameter-passing details needed for Python and numpy arrays “under water”).

@aartbik Thank you for the explanation and the link!

I used enable-runtime-library=false as you recommended, but the issue still persists for dense-only inputs.

Reproduction

To make reproduction easier, I set up a repo with a CI job that reproduces this issue:


The repository has a script that runs a CSR + Dense = Dense operation, which works: mlir-dense-reproduce/csr_dense.py at main · mtsokol/mlir-dense-reproduce · GitHub

In the CI it prints the tensor as expected: Initial commit · mtsokol/mlir-dense-reproduce@63eb1a6 · GitHub


The repository also has a Dense + Dense = Dense script that fails with a segmentation fault: mlir-dense-reproduce/dense_dense.py at main · mtsokol/mlir-dense-reproduce · GitHub

In the CI the segfault also occurs: Initial commit · mtsokol/mlir-dense-reproduce@63eb1a6 · GitHub


My LLVM version is 19.1.0-rc3 installed via conda (Mlir Python Bindings | Anaconda.org).

To show that the changes between the two scripts are minimal (only the format of the first argument changes), here’s a diff between them: mlir_dense_dense_diff - Diffchecker

If any more information is needed, please let me know!

You actually exposed a very subtle bug in the all-dense case. I have a fix PR in the making! Thanks for reporting!


Can you please try [mlir][sparse] fix bug with all-dense assembler by aartbik · Pull Request #108615 · llvm/llvm-project · GitHub and let me know if that resolves all issues for you?

@aartbik I rebuilt LLVM locally and this fix solves my issue - thank you!

I think now I’m only stuck with: PassManager fails on simple COO addition example, if you have spare time to look into it.
