Skip to content

fix memcpy() crash, add missed cmd in guide, fix softmax #6622

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 14, 2024

Conversation

NeoZhangJianyu
Copy link
Collaborator

@NeoZhangJianyu NeoZhangJianyu commented Apr 12, 2024

  1. When install latest Intel GPU driver, memcpy() from host to device will crash.
    fix it by use host buf.
  2. Add missed cmd in guide.
  3. Fix softmax error in UT.

@NeoZhangJianyu NeoZhangJianyu requested a review from airMeng April 12, 2024 01:49
@NeoZhangJianyu NeoZhangJianyu requested a review from slaren April 12, 2024 07:39
@NeoZhangJianyu
Copy link
Collaborator Author

@slaren I have tried a solution to return support_mmap by backend in callback function.
But the caller is too earlier that the backend is not initial at that time.
So, I define a static function in ggml-backend.c

@slaren
Copy link
Member

slaren commented Apr 12, 2024

This is not ok, if you need a workaround add a memcpy to a temporary buffer in the set_tensor function.

@NeoZhangJianyu
Copy link
Collaborator Author

@slaren
Sorry, this is misunderstanding.
memcpy() usage is OK.
due to the new driver issue, the memcpy() will crash.
The workaround is disable mmap() used in sycl backend.

@slaren
Copy link
Member

slaren commented Apr 12, 2024

If the driver crashes when it receives an address from mmap, then what you can do is memcpy the buffer to a temporary buffer allocated in a conventional mean, eg. with malloc or std::vector, and pass that to the driver.

@NeoZhangJianyu
Copy link
Collaborator Author

NeoZhangJianyu commented Apr 12, 2024

driver is not crash.
The source ptr provided by mmap is unavailable. That lead to memcpy() crash.
If disable mmap() and read the gguf file into host buf. The memcpy() is from host buf to device. It's OK.
If memcpy from ptr of mmap() to device, it will crash.

@slaren
Copy link
Member

slaren commented Apr 12, 2024

memcpy the ptr you get in the tensor_set function to a temporary buffer. host -> host. Then memcpy host -> device using the temporary buffer.

@NeoZhangJianyu
Copy link
Collaborator Author

host -> host -> device. this solution should be OK.
Let me try.

@NeoZhangJianyu
Copy link
Collaborator Author

@slaren Yes, it works well.
I will update the PR as soon.
Thank you!

@NeoZhangJianyu NeoZhangJianyu changed the title disable mmap to fix memcpy crash, add missed cmd in guide, fix softmax fix memcpy() crash, add missed cmd in guide, fix softmax Apr 12, 2024
@NeoZhangJianyu
Copy link
Collaborator Author

@slaren @airMeng I update with the new solution. Please review again!

ggml-sycl.cpp Outdated
@@ -16802,22 +16809,50 @@ catch (sycl::exception const &exc) {
std::exit(1);
}

class host_buffer {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved after this is removed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, remove host_buffer. use malloc/free for thread safe and better performance.

@NeoZhangJianyu NeoZhangJianyu merged commit de17e3f into ggml-org:master Apr 14, 2024
tybalex pushed a commit to rubra-ai/tools.cpp that referenced this pull request Apr 17, 2024
* disable mmap to fix memcpy crash, add missed cmd in guide, fix softmax

* refactor to disable mmap for SYCL backend

* fix compile error in other os

* refactor the solution, use host buf to fix it, instead of disable mmap

* keep to support mmap()

* use host buff to reduce malloc times

* revert to malloc/free solution, for threaad safe
jart added a commit to jart/llama.cpp that referenced this pull request Apr 20, 2024
@zhouwg
Copy link
Contributor

zhouwg commented Apr 7, 2025

driver is not crash. The source ptr provided by mmap is unavailable. That lead to memcpy() crash. If disable mmap() and read the gguf file into host buf. The memcpy() is from host buf to device. It's OK. If memcpy from ptr of mmap() to device, it will crash.

why the source ptr provided by mmap is unavailable? I'd like to reproduce this issue in my local dev, could you help provide a quick guide for this: whether I only need to read the docs/backend/SYCL.md?

@zhouwg
Copy link
Contributor

zhouwg commented Apr 8, 2025

your explanation in #12734 is very clear, thanks so much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants