I disabled the SMMU as discussed in this forum, by removing iommus = <&smmu TEGRA_SID_PCIE5>; and dma-coherent; from the device tree.
However, I found that not only DMA transfers but also memcpy transfers to the device memory of the device for which I disabled the IOMMU are really slow. Before disabling the IOMMU, memcpy speed was about 2.5 GB/s; now it's only 980 MB/s.
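For anyone who wants to compare numbers, below is a minimal sketch of how memcpy throughput like the figures above could be measured. It is not the original test program. The function name memcpy_mbps is hypothetical, the compiler barrier assumes GCC or Clang, and to reproduce the case described here the destination pointer would have to point at the mapped device memory rather than at ordinary RAM; how that mapping is obtained is left out.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: copy `len` bytes from `src` to `dst` `iters` times
 * and return the average throughput in MB/s (1 MB = 10^6 bytes).
 * Returns 0.0 if the elapsed time could not be resolved.
 */
static double memcpy_mbps(void *dst, const void *src, size_t len, int iters)
{
    struct timespec t0, t1;

    assert(dst != NULL && src != NULL && len > 0 && iters > 0);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++) {
        memcpy(dst, src, len);
        /* Compiler barrier (GCC/Clang syntax, an assumption) so the
         * repeated copies are not optimized away. */
        __asm__ volatile("" : : "r"(dst) : "memory");
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    if (secs <= 0.0)
        return 0.0;

    return (double)len * (double)iters / secs / 1e6;
}
```

Calling memcpy_mbps(dst, src, 64u << 20, 16) with two 64 MiB buffers gives a baseline for plain RAM. Running the same call with and without the SMMU enabled should make the before/after comparison repeatable.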
I have the same issue, but on the Jetson AGX Orin. I disabled the SMMU for one of the PCIe controllers, and now DMA over PCIe performs much worse. NVIDIA, please help!