Description
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
Allow llama.cpp to be executed on the s390x architecture.
I am curious whether there is a big-endian/little-endian issue with GGUF models. My system is big-endian.
BTW, if you can point me to how support for a new set of SIMD instructions is added, I can try to add s390x SIMD support myself.
Thank you.
Current Behavior
I can compile this program on s390x by commenting out line 50 of k_quants.c:
#if !defined(__riscv)
//#include <immintrin.h>
#endif
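A possibly cleaner alternative to commenting the line out would be to include the x86 intrinsics header only when the compiler actually targets x86. The following is only a sketch of that idea, not the project's actual fix; the macro name `HAVE_X86_IMMINTRIN` is hypothetical and introduced here purely for illustration.

```c
/* Sketch: pull in <immintrin.h> only on x86 targets.
 * HAVE_X86_IMMINTRIN is a hypothetical helper macro, not part of llama.cpp. */
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>
#define HAVE_X86_IMMINTRIN 1
#else
/* s390x, riscv, and other non-x86 targets end up here, so the header is never requested. */
#define HAVE_X86_IMMINTRIN 0
#endif
```

On s390x none of the x86 macros are defined, so the include is skipped and no source edit is needed for the build to get past this line.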
With line 50 commented out, I can execute ./main -h.
But if I run it with a real model, I get an invalid magic number error.
Is there an endianness issue?
[root@aiu llama.cpp]# ./main -m ./models/ggml-vocab-llama.gguf
Log start
main: build = 1265 (324f340)
main: built with cc (GCC) 10.2.1 20201112 (Red Hat 10.2.1-8) for s390x-redhat-linux
main: seed = 1695309361
gguf_init_from_file: invalid magic number 47475546
error loading model: llama_model_loader: failed to load model from ./models/ggml-vocab-llama.gguf
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model './models/ggml-vocab-llama.gguf'
main: error: unable to load model
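The value in the log looks consistent with a byte-order mismatch. The bytes 'G', 'G', 'U', 'F' are 0x47 0x47 0x55 0x46; read as a big-endian 32-bit integer they give 0x47475546, which is exactly the number reported above, while a little-endian read gives 0x46554747. The sketch below illustrates this, assuming the loader reads the first four bytes of the file as a native 32-bit integer and compares them against a constant chosen for little-endian hosts. The function and variable names are hypothetical and are not taken from llama.cpp.

```c
#include <stdint.h>

/* The four ASCII bytes at the start of a GGUF file. */
static const unsigned char gguf_magic_bytes[4] = { 'G', 'G', 'U', 'F' };

/* Interpret 4 bytes as a little-endian 32-bit value, regardless of host byte order. */
static uint32_t read_u32_le(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Interpret 4 bytes as a big-endian 32-bit value; this matches what a raw
 * native read produces on a big-endian host such as s390x. */
static uint32_t read_u32_be(const unsigned char *p) {
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t bswap_u32(uint32_t v) {
    return ((v & 0x000000FFu) << 24)
         | ((v & 0x0000FF00u) << 8)
         | ((v & 0x00FF0000u) >> 8)
         | ((v & 0xFF000000u) >> 24);
}
```

If this is indeed the cause, the magic check is only the first symptom: every multi-byte field and tensor value stored in little-endian order in the file would need the same byte swap when loaded on a big-endian machine.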
Environment and Context
Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.
- Physical (or virtual) hardware you are using, e.g. for Linux:
$ lscpu
Architecture: s390x
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Big Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 8
Socket(s) per book: 2
Book(s) per drawer: 4
Drawer(s): 4
NUMA node(s): 1
Vendor ID: IBM/S390
Machine type: 3931
CPU dynamic MHz: 5200
CPU static MHz: 5200
BogoMIPS: 3331.00
Hypervisor: PR/SM
Hypervisor vendor: IBM
Virtualization type: full
Dispatching mode: horizontal
L1d cache: 128K
L1i cache: 128K
L2 cache: 32768K
L3 cache: 262144K
NUMA node0 CPU(s): 0-3
Flags: esan3 zarch stfle msa ldisp eimm dfp edat etf3eh highgprs te vx vxd vxe gs vxe2 vxp sort dflt sie
- Operating System, e.g. for Linux:
$ uname -a
Linux 4.18.0-305.el8.s390x #1 SMP Thu Apr 29 09:06:01 EDT 2021 s390x s390x s390x GNU/Linux
- SDK version, e.g. for Linux:
$ python3 --version
$ make --version
$ g++ --version
Python 3.9.2
GNU Make 4.2.1
Built for s390x-ibm-linux-gnu
g++ (GCC) 10.2.1 20201112 (Red Hat 10.2.1-8)