Closed
With the CLBlast build (device: AMD RX 6800 XT), Q8_0 models generate garbage output:
main.exe --ctx_size 2048 --temp 0.74 --top_k 40 --top_p 0.5 --repeat_last_n 192 --repeat_penalty 1.4 --batch_size 256 --threads 24 --n_predict 2048 --color --interactive -ins --interactive-first -m VicUnlocked-30B-LoRA.ggml.q8_0.bin -s 1
main: build = 561 (5ea4339)
main: seed = 1
llama.cpp: loading model from VicUnlocked-30B-LoRA.ggml.q8_0.bin
llama_model_load_internal: format = ggjt v2 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 2048
llama_model_load_internal: n_embd = 6656
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 52
llama_model_load_internal: n_layer = 60
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 7 (mostly Q8_0)
llama_model_load_internal: n_ff = 17920
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 30B
llama_model_load_internal: ggml ctx size = 135.75 KB
llama_model_load_internal: mem required = 37206.11 MB (+ 3124.00 MB per state)
Initializing CLBlast (First Run)...
Attempting to use: Platform=0, Device=0 (If invalid, program will crash)
Using Platform: AMD Accelerated Parallel Processing Device: gfx1030
llama_init_from_file: kv self size = 3120.00 MB
system_info: n_threads = 24 / 24 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
main: interactive mode on.
Reverse prompt: '### Instruction:
'
sampling: repeat_last_n = 192, repeat_penalty = 1.400000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.500000, typical_p = 1.000000, temp = 0.740000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
generate: n_ctx = 2048, n_batch = 256, n_predict = 2048, n_keep = 2
== Running in interactive mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to LLaMa.
- To return control without starting a new line, end your input with '/'.
- If you want to submit another line, end your input with '\'.
> My name is Alex. Your name is Lion. You are my personal AI assistant.
Хронологија Awosiicherсти agesppe Mas Schmidtlichelackadalablo(@" Dynam Terminalairecompatchiaadre arrestilor CTommPRdaggerzilass Howard Sang PDF shadow SM >> Chal Byte Naval FAlaus changing hayoux ba bunchrokенrundUSEetch thrustREodortMR Spirit civ dig glob Tow agents
>
llama_print_timings: load time = 9065.74 ms
llama_print_timings: sample time = 22.36 ms / 60 runs ( 0.37 ms per token)
llama_print_timings: prompt eval time = 18715.84 ms / 39 tokens ( 479.89 ms per token)
llama_print_timings: eval time = 58273.65 ms / 60 runs ( 971.23 ms per token)
llama_print_timings: total time = 113280.74 ms
Terminate batch job (Y/N)? y
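For context on what the failing path has to compute: in ggml, Q8_0 stores weights in blocks of 32 int8 values that share a single scale, and dequantization is just `x[i] = d * qs[i]`. The NumPy sketch below (an illustration, not the actual ggml or CLBlast kernel code) shows the round trip a correct GPU kernel must reproduce; garbage tokens like the ones above typically mean the device-side dequantization or matmul diverges from this reference.

```python
import numpy as np

QK8_0 = 32  # ggml Q8_0 block size: 32 weights share one scale

def q8_0_quantize(block):
    # block: QK8_0 float32 weights -> (scale d, QK8_0 int8 values)
    amax = np.max(np.abs(block))
    d = amax / 127.0 if amax > 0 else 0.0
    if d == 0.0:
        return d, np.zeros(QK8_0, dtype=np.int8)
    qs = np.round(block / d).astype(np.int8)
    return d, qs

def q8_0_dequantize(d, qs):
    # reconstruct approximate weights: x[i] ~= d * qs[i]
    return d * qs.astype(np.float32)

weights = np.linspace(-1.0, 1.0, QK8_0).astype(np.float32)
d, qs = q8_0_quantize(weights)
restored = q8_0_dequantize(d, qs)
# round-trip error is bounded by half a quantization step (d / 2)
assert np.max(np.abs(weights - restored)) <= d / 2 + 1e-7
```

If the CPU-only build produces coherent text for the same Q8_0 model, the quantization data itself is fine and the discrepancy is isolated to the CLBlast code path.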