benchmark-matmult broken when building without BLAS #1551

Closed
@stsydow

Description

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

Running the benchmark in a build without a BLAS library should work:

make -j benchmark-matmult

Current Behavior

Since commit 2d5db48, the benchmark aborts with:

ABORT - ERROR in Matrix Multiplication result - expected 11611394048.00, got 11474052096.00 (delta 137341952.00 > allowed_delta 11611.39)
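
For reference, the numbers in the abort message can be reproduced with a few lines of arithmetic. This is only a sketch: the helper name `check_result` is hypothetical, and the relative tolerance of 1e-6 is inferred from the message (11611.39 is the expected value times 1e-6), not taken from the benchmark source.

```python
def check_result(expected: float, got: float, rel_tol: float = 1e-6):
    """Return (delta, allowed_delta, passed) for a relative-tolerance check.

    rel_tol = 1e-6 is an assumption inferred from the abort message.
    """
    delta = abs(expected - got)
    allowed_delta = expected * rel_tol
    return delta, allowed_delta, delta <= allowed_delta


# Values copied from the abort message.
expected = 11611394048.00
got = 11474052096.00

delta, allowed_delta, passed = check_result(expected, got)
print(f"delta          = {delta:.2f}")
print(f"allowed_delta  = {allowed_delta:.2f}")
print(f"relative error = {delta / expected:.4%}")
print(f"passed         = {passed}")
```

The Q4_0 result is off by a little over 1% of the F32 reference, which is about four orders of magnitude above the allowed delta.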

Environment and Context

I ran git bisect, testing each step with make -j clean benchmark-matmult. It identified commit 2d5db48 as the first bad commit.

Full run:

make -j benchmark-matmult
I llama.cpp build info: 
I UNAME_S:  Linux
I UNAME_P:  unknown
I UNAME_M:  x86_64
I CFLAGS:   -I.              -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native
I LDFLAGS:  
I CC:       cc (GCC) 13.1.1 20230429
I CXX:      g++ (GCC) 13.1.1 20230429

cc  -I.              -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native   -c ggml.c -o ggml.o
g++ -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native examples/benchmark/benchmark-matmult.cpp ggml.o -o benchmark-matmult 
./benchmark-matmult
main: build = 567 (2d5db48)
Starting Test
Allocating Memory of size 794558464 bytes, 757 MB
Creating new tensors

------ Test 1 - Matrix Mult via F32 code ------------------------------------------------------------------------------
cgraph->n_threads=1
            m11: type = 0 (  f32) ne = 11008 x  4096 x     1, nb = (    4, 44032, 180355072) - Sum of tensor m11 is 16777216.00
             m2: type = 0 (  f32) ne = 11008 x   128 x     1, nb = (    4, 44032, 5636096) - Sum of tensor m2 is 2818048.00
    gf.nodes[0]: type = 0 (  f32) ne =  4096 x   128 x     1, nb = (    4, 16384, 2097152) - Sum of tensor gf.nodes[0] is 11611394048.00

------ Test 2 - Matrix Mult via Q4_0 code ------------------------------------------------------------------------------
cgraph->n_threads=1
Matrix Multiplication of (11008,4096,1) x (11008,128,1) - about  11.54 gFLOPS

Iteration;NThreads; SizeX; SizeY; SizeZ; Required_FLOPS; Elapsed_u_Seconds; gigaFLOPS
=====================================================================================
        0;       1; 11008;  4096;   128;    11542724608;            273886;     42.14

ABORT - ERROR in Matrix Multiplication result - expected 11611394048.00, got 11474052096.00 (delta 137341952.00 > allowed_delta 11611.39)

  • System: Arch Linux on a ThinkPad L14 (AMD)
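
As a sanity check on the benchmark table in the full run, the Required_FLOPS and gigaFLOPS columns follow from the sizes and the elapsed time. The sketch below assumes the usual count of 2 operations (one multiply, one add) per inner-product term, which matches the logged value. The function names are hypothetical and not taken from the benchmark source.

```python
def matmul_flops(size_x: int, size_y: int, size_z: int) -> int:
    """Floating-point operations for the multiplication, assuming
    one multiply and one add per inner-product term."""
    return 2 * size_x * size_y * size_z


def gigaflops(flops: int, elapsed_us: int) -> float:
    """Throughput in GFLOPS for an elapsed time given in microseconds."""
    return flops / elapsed_us / 1e3


# SizeX, SizeY, SizeZ and Elapsed_u_Seconds copied from the table.
flops = matmul_flops(11008, 4096, 128)
print(flops)
print(f"{gigaflops(flops, 273886):.2f}")
```

So the timing figures look plausible for a single thread, and the failure is confined to the numerical result of the Q4_0 path.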
