Hi,
Suppose I have the following two calls:
cudaMemcpyAsync(dev2, dev1, N, cudaMemcpyDeviceToDevice, stream1);
cudaStreamSynchronize(stream1);
where dev2 points to memory on device 2, dev1 points to memory on device 1, and stream1 is a stream created on device 1.
After cudaStreamSynchronize() returns, which of the following is guaranteed?
- The data has arrived in dev2, i.e., the whole copy has finished.
- Only that the data has been read out of dev1, so dev1 can safely be reused, while the data is not necessarily in dev2 yet.
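To make the setup concrete, here is a minimal sketch of what I mean. It assumes a machine with at least two GPUs that share a unified virtual address space, so that a cudaMemcpyDeviceToDevice copy between devices is accepted. The CHECK macro, the device indices 0 and 1, the fill pattern and the buffer size are placeholders for illustration only. Reading dev2 back afterwards can only show whether the copy happened to be complete in this run; it does not prove a guarantee, which is exactly what I am asking about.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort on any CUDA runtime error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

int main() {
    const size_t N = 1 << 20;     // copy size in bytes (placeholder)
    const int srcDev = 0;         // "device 1" in the question (assumed index)
    const int dstDev = 1;         // "device 2" in the question (assumed index)

    int count = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count < 2) {
        printf("This sketch needs at least two CUDA devices.\n");
        return 0;
    }

    unsigned char *dev1 = nullptr, *dev2 = nullptr;
    cudaStream_t stream1;

    // Source buffer and stream live on the source device.
    CHECK(cudaSetDevice(srcDev));
    CHECK(cudaMalloc((void**)&dev1, N));
    CHECK(cudaMemset(dev1, 0xAB, N));
    CHECK(cudaStreamCreate(&stream1));
    CHECK(cudaDeviceSynchronize());   // make sure the fill is done

    // Destination buffer lives on the other device, initially zeroed.
    CHECK(cudaSetDevice(dstDev));
    CHECK(cudaMalloc((void**)&dev2, N));
    CHECK(cudaMemset(dev2, 0, N));
    CHECK(cudaDeviceSynchronize());

    // The two calls from the question, issued with the source device current.
    CHECK(cudaSetDevice(srcDev));
    CHECK(cudaMemcpyAsync(dev2, dev1, N, cudaMemcpyDeviceToDevice, stream1));
    CHECK(cudaStreamSynchronize(stream1));

    // Read dev2 back to the host and count bytes that do not match the pattern.
    unsigned char* host = (unsigned char*)malloc(N);
    if (host == nullptr) return EXIT_FAILURE;
    CHECK(cudaSetDevice(dstDev));
    CHECK(cudaMemcpy(host, dev2, N, cudaMemcpyDeviceToHost));
    size_t mismatches = 0;
    for (size_t i = 0; i < N; ++i) {
        if (host[i] != 0xAB) ++mismatches;
    }
    printf("bytes not matching the source pattern: %zu\n", mismatches);

    // Clean up each resource on the device that owns it.
    free(host);
    CHECK(cudaFree(dev2));
    CHECK(cudaSetDevice(srcDev));
    CHECK(cudaStreamDestroy(stream1));
    CHECK(cudaFree(dev1));
    return 0;
}
```

This should build with something like `nvcc sketch.cu -o sketch`, where the file name is arbitrary.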
Thanks.