[User]Failed to execute any models on s390x #3298

Closed
@chenqiny


Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

llama.cpp should be able to run on the s390x architecture.

I am curious whether there is a big-endian/little-endian issue with GGUF models. My system is big endian.

By the way, if you can point me to how support for a new set of SIMD instructions is added, I can try to add s390x SIMD support myself.

Thank you.

Current Behavior

I can compile the program on s390x after commenting out line 50 of k_quants.c:
#if !defined(__riscv)
//#include <immintrin.h>
#endif
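
An alternative to commenting the line out might be to include the x86 header only when the compiler targets x86. The following is a minimal sketch, not the project's actual fix. The `SKETCH_` macro and the `sketch_` functions are hypothetical names introduced for illustration; the predefined macros `__x86_64__`, `__i386__`, `_M_X64`, `_M_IX86` and `__s390x__` are the usual compiler-provided ones.

```c
/* Pull in the x86 intrinsics header only on x86 targets, so that the same
 * source still compiles on s390x (or any other non-x86 architecture). */
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>
#define SKETCH_HAVE_X86_INTRINSICS 1   /* hypothetical macro for this sketch */
#else
#define SKETCH_HAVE_X86_INTRINSICS 0
#endif

/* Report which code path the guard above selected at compile time. */
const char *sketch_target_arch(void) {
#if defined(__s390x__)
    return "s390x";
#elif SKETCH_HAVE_X86_INTRINSICS
    return "x86";
#else
    return "other";
#endif
}

/* Runtime byte-order probe: returns 1 on a big-endian host, 0 otherwise. */
int sketch_host_is_big_endian(void) {
    const unsigned int probe = 1;
    /* On a big-endian host the lowest-addressed byte of 1 is zero. */
    return *(const unsigned char *)&probe == 0;
}
```

With a guard like this, SIMD code paths for a new architecture could be added as further `#elif` branches, with the scalar code kept as the fallback.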

After that, ./main -h runs.

But when I run it with a real model, I get an "invalid magic number" error.
Is this an endianness issue?

[root@aiu llama.cpp]# ./main -m ./models/ggml-vocab-llama.gguf
Log start
main: build = 1265 (324f340)
main: built with cc (GCC) 10.2.1 20201112 (Red Hat 10.2.1-8) for s390x-redhat-linux
main: seed  = 1695309361
gguf_init_from_file: invalid magic number 47475546
error loading model: llama_model_loader: failed to load model from ./models/ggml-vocab-llama.gguf

llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model './models/ggml-vocab-llama.gguf'
main: error: unable to load model
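
The value in the log above is consistent with a byte-order mismatch. The file starts with the four ASCII bytes "GGUF" (0x47 0x47 0x55 0x46). Read as a little-endian 32-bit integer they give 0x46554747; read as big-endian they give 0x47475546, which is the number printed in the error. The sketch below shows both interpretations. It assumes the loader reads the magic as a raw 32-bit value in host byte order and compares it with the little-endian form; `EXPECTED_MAGIC_LE`, `read_u32_le`, `read_u32_be` and `show_gguf_magic` are hypothetical names for this example, not llama.cpp identifiers.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical constant for this sketch: the bytes "GGUF" interpreted as a
 * little-endian 32-bit integer. */
#define EXPECTED_MAGIC_LE UINT32_C(0x46554747)

/* Decode 4 bytes as little-endian, independent of the host byte order. */
uint32_t read_u32_le(const unsigned char b[4]) {
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Decode 4 bytes as big-endian: the value a raw 32-bit read would produce
 * on a big-endian host such as s390x. */
uint32_t read_u32_be(const unsigned char b[4]) {
    return ((uint32_t)b[0] << 24)
         | ((uint32_t)b[1] << 16)
         | ((uint32_t)b[2] << 8)
         |  (uint32_t)b[3];
}

/* Print both interpretations of the first four bytes of a GGUF file. */
void show_gguf_magic(void) {
    const unsigned char bytes[4] = { 'G', 'G', 'U', 'F' };
    uint32_t le = read_u32_le(bytes);
    uint32_t be = read_u32_be(bytes);

    printf("little-endian read: %08" PRIx32 "\n", le);
    printf("big-endian read:    %08" PRIx32 "\n", be);

    assert(le == EXPECTED_MAGIC_LE);
    assert(be == UINT32_C(0x47475546)); /* matches the value in the log */
}
```

Decoding multi-byte fields byte by byte, as `read_u32_le` does, gives the same result on any host. If that reading is right, the magic check is only the first symptom, and every other multi-byte field and tensor value in the file would need the same treatment on a big-endian machine.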

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

  • Physical (or virtual) hardware you are using, e.g. for Linux:

$ lscpu

Architecture:        s390x
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Big Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  1
Core(s) per socket:  8
Socket(s) per book:  2
Book(s) per drawer:  4
Drawer(s):           4
NUMA node(s):        1
Vendor ID:           IBM/S390
Machine type:        3931
CPU dynamic MHz:     5200
CPU static MHz:      5200
BogoMIPS:            3331.00
Hypervisor:          PR/SM
Hypervisor vendor:   IBM
Virtualization type: full
Dispatching mode:    horizontal
L1d cache:           128K
L1i cache:           128K
L2 cache:            32768K
L3 cache:            262144K
NUMA node0 CPU(s):   0-3
Flags:               esan3 zarch stfle msa ldisp eimm dfp edat etf3eh highgprs te vx vxd vxe gs vxe2 vxp sort dflt sie
  • Operating System, e.g. for Linux:

$ uname -a

Linux 4.18.0-305.el8.s390x #1 SMP Thu Apr 29 09:06:01 EDT 2021 s390x s390x s390x GNU/Linux

  • SDK version, e.g. for Linux:
$ python3 --version
$ make --version
$ g++ --version

Python 3.9.2
GNU Make 4.2.1
Built for s390x-ibm-linux-gnu
g++ (GCC) 10.2.1 20201112 (Red Hat 10.2.1-8)
