I deployed PointPillars on an RTX 30-series graphics card. The NMS post-processing code is as follows:
std::vector<uint64_t> host_mask(host_filter_count * col_blocks);
GPU_CHECK(cudaMemcpy(&host_mask[0], dev_mask,
                     sizeof(uint64_t) * host_filter_count * col_blocks,
                     cudaMemcpyDeviceToHost));
It runs correctly on RTX 30-series cards, but on a T4 this part reports an error: an illegal memory access was encountered. Has anyone encountered this problem?
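Compiling the code with -lineinfo and running it with compute-sanitizer should give you the exact location where the illegal memory access occurs. Keep in mind that the cudaMemcpy call may only be where the error is reported: kernel launches are asynchronous, so a fault in an earlier kernel typically surfaces at the next synchronizing call.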
My project contains many source files, and I want to check the executable generated by CMake directly. How do I use this tool? I cannot compile the entire project by invoking nvcc directly.
Add -lineinfo to the CMake build settings (the NVCC flags, for example CMAKE_CUDA_FLAGS if the project uses CMake’s native CUDA language support).
Then, let’s say your compiled executable is my_executable. To use compute-sanitizer, you would run:
compute-sanitizer ./my_executable
There is documentation for compute-sanitizer.
This writeup may also be of interest.
Thanks for your advice.
compute-sanitizer --tool memcheck ./dbg_test01
========= COMPUTE-SANITIZER
========= Error: process didn’t terminate successfully
========= The application may have hit an error when dereferencing Unified Memory from the host. Please rerun the application under cuda-gdb or a host debugger to catch host side errors.
========= Target application returned an error
========= ERROR SUMMARY: 0 errors
When I run the tool as suggested, I get this error no matter which program I test.
You’ll have to fix that first before trying to use compute-sanitizer. That is a general application debug issue. It’s hard for me to give advice without knowing anything about the error output from the application.
Make sure the application does a return 0; if it does not, compute-sanitizer won’t work. It also won’t work if you have a seg fault or another error that is detectable in host code.
Basically, you have an error that is detectable in host code.
It might be that your error-handling routine/macro (GPU_CHECK()) is returning non-zero when you hit this CUDA error. In that case you will have to fix that (if you want to use compute-sanitizer).
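To illustrate the last point, here is a minimal host-only sketch of an error-check macro that logs a failure and keeps going, so the process can still exit with 0. It is not the poster’s GPU_CHECK, whose definition was never shown. Everything in it is hypothetical: Status stands in for cudaError_t, status_name for cudaGetErrorString(), fake_copy for a runtime call such as cudaMemcpy(), and CHECK_NO_EXIT is just an assumed name.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for cudaError_t so the sketch builds without the CUDA toolkit.
enum class Status { Ok, IllegalAddress };

// Hypothetical stand-in for cudaGetErrorString().
inline const char* status_name(Status s) {
    return s == Status::Ok ? "no error"
                           : "an illegal memory access was encountered";
}

// Failures are collected here instead of terminating the process.
inline std::vector<std::string>& error_log() {
    static std::vector<std::string> log;
    return log;
}

// Log-and-continue check (assumed name, not the poster's GPU_CHECK):
// prints the failure, records it, and does not call exit() or abort().
#define CHECK_NO_EXIT(call)                                           \
    do {                                                              \
        const Status status_ = (call);                                \
        if (status_ != Status::Ok) {                                  \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         status_name(status_));                       \
            error_log().push_back(status_name(status_));              \
        }                                                             \
    } while (0)

// Hypothetical stand-in for a runtime call such as cudaMemcpy().
inline Status fake_copy(bool fail) {
    return fail ? Status::IllegalAddress : Status::Ok;
}

// Plays the role of main(): failures are reported, yet the exit code stays 0.
inline int run_pipeline() {
    CHECK_NO_EXIT(fake_copy(false));  // succeeds, nothing is logged
    CHECK_NO_EXIT(fake_copy(true));   // fails, is logged, execution continues
    return 0;
}
```

In a real build you would return run_pipeline() from main and swap the stand-ins for the actual CUDA runtime types and calls. Once the tool has pinpointed the faulting access, you can go back to a stricter macro that stops on the first error.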