cudaMemcpy vs cudaMemcpyAsync in different CPU threads with different data and pointers

I have multiple CPU threads that each perform cudaMemcpy operations. Should I use cudaMemcpyAsync or cudaMemcpy if there is no overlap in pointers or data between these CPU threads?

Hi,

cudaMemcpyAsync is non-blocking with respect to the host: the call may return before the copy is complete.
You benefit from it if the thread has CPU work to do after issuing the copy, because the thread doesn't need to wait for the copy to finish.
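As a rough illustration of that pattern, here is a minimal sketch, not a definitive implementation. The function name `async_copy_roundtrip`, the buffer names, and the use of pinned host memory with a per-call stream are my assumptions rather than anything from the question. It issues two async copies, does unrelated CPU work, and only then synchronizes.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <cstring>

// Hypothetical helper (name and signature are illustrative only).
// Copies n floats host -> device -> host on a private stream while the
// CPU does unrelated work. Returns -1 on a CUDA error, otherwise the
// number of mismatched elements (0 means the round trip was correct).
int async_copy_roundtrip(size_t n) {
    float *h_src = nullptr, *h_dst = nullptr, *d_buf = nullptr;
    const size_t bytes = n * sizeof(float);
    int result = -1;

    cudaStream_t stream;
    if (cudaStreamCreate(&stream) != cudaSuccess) return -1;

    // Assumption: pinned (page-locked) host buffers, so the copies can
    // actually proceed asynchronously with respect to the host.
    if (cudaMallocHost((void **)&h_src, bytes) == cudaSuccess &&
        cudaMallocHost((void **)&h_dst, bytes) == cudaSuccess &&
        cudaMalloc((void **)&d_buf, bytes) == cudaSuccess) {
        for (size_t i = 0; i < n; ++i) h_src[i] = (float)i;
        memset(h_dst, 0, bytes);

        // Both calls may return before the copies finish. Operations in
        // the same stream execute in issue order.
        cudaError_t e1 = cudaMemcpyAsync(d_buf, h_src, bytes,
                                         cudaMemcpyHostToDevice, stream);
        cudaError_t e2 = cudaMemcpyAsync(h_dst, d_buf, bytes,
                                         cudaMemcpyDeviceToHost, stream);

        // CPU work that does not touch h_dst and does not modify h_src.
        volatile double cpu_sum = 0.0;
        for (size_t i = 0; i < n; ++i) cpu_sum = cpu_sum + (double)i;

        // Wait for the copies before reading h_dst.
        if (e1 == cudaSuccess && e2 == cudaSuccess &&
            cudaStreamSynchronize(stream) == cudaSuccess) {
            result = 0;
            for (size_t i = 0; i < n; ++i)
                if (h_dst[i] != h_src[i]) ++result;
        }
    }

    if (d_buf) cudaFree(d_buf);
    if (h_src) cudaFreeHost(h_src);
    if (h_dst) cudaFreeHost(h_dst);
    cudaStreamDestroy(stream);
    return result;
}
```

Each CPU thread could call something like `async_copy_roundtrip(1 << 20)` with its own buffers. Because every call creates its own stream, copies issued by different threads are not all queued on the default stream. Whether they actually overlap depends on the hardware. With pageable (non-pinned) host memory, cudaMemcpyAsync may behave much more like a synchronous copy.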

Below is the document for your reference:
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html#group__CUDART__MEMORY_1g85073372f776b4c4d5f89f7124b7bf79

Thanks.
