[NDTensors] [BUG] Many basic tensor operations fail with CUDA extension due to LoadError: Scalar indexing is disallowed. #1193

Closed
@wbernoudy

Description of bug

I am trying to use the CUDA extension for NDTensors. Tensor contraction sometimes fails, as do other functions such as permute and the NDTensors.generic_randn call used in the CUDA extension example (NDTensors/ext/examples/NDTensorCUDA.jl).

They all fail with ERROR: LoadError: Scalar indexing is disallowed. (or emit a similar warning in the REPL, where scalar indexing is allowed).

Minimal code demonstrating the bug or unexpected behavior

Minimal runnable code

import CUDA
import ITensors
import NDTensors

i, j, k = ITensors.Index(3), ITensors.Index(5), ITensors.Index(2)
A, B = ITensors.randomITensor(i, j), ITensors.randomITensor(j, k)

# works when running script directly, but warns of scalar indexing in REPL
cuA, cuB = NDTensors.cu(A), NDTensors.cu(B)

contracted = cuA * cuB

@show NDTensors.cpu(contracted) ≈ A * B

# without @CUDA.allowscalar, fails with `ERROR: LoadError: Scalar indexing is disallowed.`
permuted = ITensors.permute(cuA, (j, i))
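
As a workaround (not a fix), the permute call should complete if scalar indexing is explicitly allowed, though it then runs element by element on the host and is very slow:

# Workaround sketch only: this lets the fallback code path run
# element-by-element on the CPU, which is correct but extremely slow.
permuted = CUDA.@allowscalar ITensors.permute(cuA, (j, i))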

Expected output or behavior

NDTensors.cu() should work on an ITensors.ITensor without a warning in the REPL, and ITensors.permute should work without the scalar indexing error.

Actual output or behavior

Output of minimal runnable code

NDTensors.cpu(contracted) ≈ A * B = true
ERROR: LoadError: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] assertscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/uOYfN/src/GPUArraysCore.jl:103
  [3] getindex
    @ ~/.julia/packages/GPUArrays/5XhED/src/host/indexing.jl:9 [inlined]
  [4] getindex
    @ ~/.julia/packages/StridedViews/MjjHH/src/stridedview.jl:97 [inlined]
  [5] macro expansion
    @ ~/.julia/packages/Strided/O8zfl/src/mapreduce.jl:330 [inlined]
  [6] macro expansion
    @ ./simdloop.jl:77 [inlined]
  [7] macro expansion
    @ ~/.julia/packages/Strided/O8zfl/src/mapreduce.jl:329 [inlined]
  [8] _mapreduce_kernel!(f::Strided.CaptureArgs{NDTensors.var"#69#70", Tuple{Strided.Arg, Strided.Arg}}, op::Nothing, initop::Nothing, dims::Tuple{Int64, Int64}, blocks::Tuple{Int64, Int64}, arrays::Tuple{StridedViews.StridedView{Float64, 2, CUDA.CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, typeof(identity)}, StridedViews.StridedView{Float64, 2, CUDA.CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, typeof(identity)}, StridedViews.StridedView{Float64, 2, CUDA.CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, typeof(identity)}}, strides::Tuple{Tuple{Int64, Int64}, Tuple{Int64, Int64}, Tuple{Int64, Int64}}, offsets::Tuple{Int64, Int64, Int64})
    @ Strided ~/.julia/packages/Strided/O8zfl/src/mapreduce.jl:229
  [9] _mapreduce_block!(f::Any, op::Any, initop::Any, dims::Tuple{Int64, Int64}, strides::Tuple{Tuple{Int64, Int64}, Tuple{Int64, Int64}, Tuple{Int64, Int64}}, offsets::Tuple{Int64, Int64, Int64}, costs::Tuple{Int64, Int64}, arrays::Tuple{StridedViews.StridedView{Float64, 2, CUDA.CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, typeof(identity)}, StridedViews.StridedView{Float64, 2, CUDA.CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, typeof(identity)}, StridedViews.StridedView{Float64, 2, CUDA.CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, typeof(identity)}})
    @ Strided ~/.julia/packages/Strided/O8zfl/src/mapreduce.jl:152
 [10] _mapreduce_order!(f::Any, op::Any, initop::Any, dims::Tuple{Int64, Int64}, strides::Tuple{Tuple{Int64, Int64}, Tuple{Int64, Int64}, Tuple{Int64, Int64}}, arrays::Tuple{StridedViews.StridedView{Float64, 2, CUDA.CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, typeof(identity)}, StridedViews.StridedView{Float64, 2, CUDA.CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, typeof(identity)}, StridedViews.StridedView{Float64, 2, CUDA.CuArray{Float64, 2, CUDA.Mem.DeviceBuffer}, typeof(identity)}})
    @ Strided ~/.julia/packages/Strided/O8zfl/src/mapreduce.jl:138
 [11] _mapreduce_fuse!
    @ ~/.julia/packages/Strided/O8zfl/src/mapreduce.jl:116 [inlined]
 [12] copyto!
    @ ~/.julia/packages/Strided/O8zfl/src/broadcast.jl:35 [inlined]
 [13] materialize!
    @ ./broadcast.jl:884 [inlined]
 [14] materialize!
    @ ./broadcast.jl:881 [inlined]
 [15] permutedims!(R::NDTensors.DenseTensor{Float64, 2, Tuple{ITensors.Index{Int64}, ITensors.Index{Int64}}, NDTensors.Dense{Float64, CUDA.CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}}}, T::NDTensors.DenseTensor{Float64, 2, Tuple{ITensors.Index{Int64}, ITensors.Index{Int64}}, NDTensors.Dense{Float64, CUDA.CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}}}, perm::Tuple{Int64, Int64}, f::NDTensors.var"#69#70")
    @ NDTensors ~/.julia/packages/NDTensors/olKso/src/dense/densetensor.jl:246
 [16] permutedims!!(R::NDTensors.DenseTensor{Float64, 2, Tuple{ITensors.Index{Int64}, ITensors.Index{Int64}}, NDTensors.Dense{Float64, CUDA.CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}}}, T::NDTensors.DenseTensor{Float64, 2, Tuple{ITensors.Index{Int64}, ITensors.Index{Int64}}, NDTensors.Dense{Float64, CUDA.CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}}}, perm::Tuple{Int64, Int64}, f::Function)
    @ NDTensors ~/.julia/packages/NDTensors/olKso/src/dense/densetensor.jl:188
 [17] permutedims!!
    @ ~/.julia/packages/NDTensors/olKso/src/dense/densetensor.jl:186 [inlined]
 [18] permutedims
    @ ~/.julia/packages/NDTensors/olKso/src/tensoralgebra/generic_tensor_operations.jl:5 [inlined]
 [19] permutedims
    @ ~/.julia/packages/ITensors/Qe99p/src/tensor_operations/permutations.jl:64 [inlined]
 [20] _permute(as::NDTensors.NeverAlias, T::NDTensors.DenseTensor{Float64, 2, Tuple{ITensors.Index{Int64}, ITensors.Index{Int64}}, NDTensors.Dense{Float64, CUDA.CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}}}, new_inds::Tuple{ITensors.Index{Int64}, ITensors.Index{Int64}})
    @ ITensors ~/.julia/packages/ITensors/Qe99p/src/tensor_operations/permutations.jl:69
 [21] permute(as::NDTensors.NeverAlias, T::ITensors.ITensor, new_inds::Tuple{ITensors.Index{Int64}, ITensors.Index{Int64}})
    @ ITensors ~/.julia/packages/ITensors/Qe99p/src/tensor_operations/permutations.jl:73
 [22] #permute#271
    @ ~/.julia/packages/ITensors/Qe99p/src/tensor_operations/permutations.jl:54 [inlined]
 [23] permute(T::ITensors.ITensor, new_inds::Tuple{ITensors.Index{Int64}, ITensors.Index{Int64}})
    @ ITensors ~/.julia/packages/ITensors/Qe99p/src/tensor_operations/permutations.jl:38
 [24] top-level scope
    @ ~/it_test2/test.jl:17
in expression starting at /home/william/it_test2/test.jl:17
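
For what it's worth, the trace suggests the failure happens inside Strided's generic mapreduce kernel, which indexes the CuArray-backed StridedViews element by element on the host. A minimal sketch of what I believe is the same code path (my reading of the trace, not verified against NDTensors internals):

import CUDA
using Strided

a = CUDA.CuArray(rand(Float64, 3, 5))
b = CUDA.CuArray(zeros(Float64, 5, 3))

# Strided's kernel calls getindex on the StridedView wrapping the CuArray,
# which should trigger the same "Scalar indexing is disallowed" error.
@strided b .= permutedims(a, (2, 1))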

Output of `cuA, cuB = NDTensors.cu(A), NDTensors.cu(B)` in REPL

┌ Warning: Performing scalar indexing on task Task (runnable) @0x00007f5e6b5fc970.
│ Invocation of getindex resulted in scalar indexing of a GPU array.
│ This is typically caused by calling an iterating implementation of a method.
│ Such implementations *do not* execute on the GPU, but very slowly on the CPU,
│ and therefore are only permitted from the REPL for prototyping purposes.
│ If you did intend to index this array, annotate the caller with @allowscalar.
└ @ GPUArraysCore ~/.julia/packages/GPUArraysCore/uOYfN/src/GPUArraysCore.jl:

Also, running `NDTensors/ext/examples/NDTensorCUDA.jl` results in a similar error

ERROR: LoadError: Scalar indexing is disallowed.
Invocation of setindex! resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] assertscalar(op::String)
   @ GPUArraysCore ~/.julia/packages/GPUArraysCore/uOYfN/src/GPUArraysCore.jl:103
 [3] setindex!
   @ ~/.julia/packages/GPUArrays/5XhED/src/host/indexing.jl:17 [inlined]
 [4] generic_randn(arraytype::Type{CuArray{T, 1} where T}, dim::Int64)
   @ NDTensors ~/.julia/packages/NDTensors/olKso/src/abstractarray/fill.jl:8
 [5] main()
   @ Main ~/ITensors.jl/NDTensors/ext/examples/NDTensorCUDA.jl:21
 [6] top-level scope
   @ ~/ITensors.jl/NDTensors/ext/examples/NDTensorCUDA.jl:116
in expression starting at /home/william/ITensors.jl/NDTensors/ext/examples/NDTensorCUDA.jl:116
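
The generic_randn failure appears to come from filling the array with a scalar setindex! loop (NDTensors/src/abstractarray/fill.jl:8 in the trace). A minimal sketch of a bulk-fill alternative that should avoid scalar indexing (hypothetical helper name and signature, not a proposed NDTensors API; it assumes the element type is already part of the array type):

import CUDA
using Random

# Hypothetical sketch: allocate on the requested array type and fill it with a
# single randn! call (which CUDA.jl supports for CuArray) instead of writing
# elements one at a time.
function generic_randn_sketch(arraytype::Type{<:AbstractVector}, dim::Integer)
  data = arraytype(undef, dim)   # e.g. CUDA.CuVector{Float64}(undef, dim)
  return randn!(data)
end

generic_randn_sketch(CUDA.CuVector{Float64}, 10)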

Version information

  • Output from versioninfo():
julia> versioninfo()
Julia Version 1.9.2
Commit e4ee485e909 (2023-07-05 09:39 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 12 × Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake)
  Threads: 1 on 12 virtual cores

Other relevant version info

(it_test2) pkg> status ITensors
Status `~/it_test2/Project.toml`
  [9136182c] ITensors v0.3.42

(it_test2) pkg> status NDTensors
Status `~/it_test2/Project.toml`
  [23ae76d9] NDTensors v0.2.10

(it_test2) pkg> status CUDA
Status `~/it_test2/Project.toml`
  [052768ef] CUDA v4.4.1

julia> CUDA.versioninfo()
CUDA runtime 12.1, artifact installation
CUDA driver 12.2
NVIDIA driver 535.104.5

CUDA libraries: 
- CUBLAS: 12.1.3
- CURAND: 10.3.2
- CUFFT: 11.0.2
- CUSOLVER: 11.4.5
- CUSPARSE: 12.1.0
- CUPTI: 18.0.0
- NVML: 12.0.0+535.104.5

Julia packages: 
- CUDA: 4.4.1
- CUDA_Driver_jll: 0.5.0+1
- CUDA_Runtime_jll: 0.6.0+0

Toolchain:
- Julia: 1.9.2
- LLVM: 14.0.6
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5
- Device capability support: sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86

1 device:
  0: NVIDIA GeForce RTX 3090 (sm_86, 22.728 GiB / 24.000 GiB available)

I am very new to Julia (and ITensors.jl), so apologies if there is something obvious I'm missing!

Labels

NDTensors (Requires changes to the NDTensors.jl library), bug (Something isn't working)
