Misc. bug: CPU Usage low in rpc-server mode #13051

Closed
@cf3i

Name and Version

$ llama-cli --version
version: 5038 (193c3e0)
built with Cray clang version 18.0.0 (0e4696aa65fa9549bd5e19c216678cc98185b0f7) for x86_64-unknown-linux-gnu

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-cli

Command line

llama-cli -m  /scratch/feic/pjs/DeepSeek-CPU-Inference/models/DeepSeek-R1.Q8_0.gguf \
-p "The weather was so perfect that I did not want to go back in the house. But I had left a new beef stew on the stove which needed my attention. Please generate in English what happend next " \
--repeat-penalty 1.0 -n 128 \
--rpc ${server_list} -t 192  -no-cnv -ngl 99 \
-Cr 0-195

Problem description & steps to reproduce

When using the --rpc option to distribute inference across multiple CPU-only nodes (no GPUs), I observed very low CPU utilization on each node. Although all 196 CPUs per node were allocated (-Cr 0-195), monitoring showed that only 2-3 cores were actively in use.
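As a quick sanity check on the numbers in the command line, the sketch below counts how many CPUs an inclusive lo-hi range such as 0-195 covers and compares that with the -t value. The helper name `cpus_in_range` is hypothetical and exists only for this illustration. The report does not say whether these client-side flags have any effect on the remote rpc-server processes, so this only checks the arithmetic.

```python
def cpus_in_range(cpu_range):
    """Count the CPUs covered by an inclusive 'lo-hi' range string."""
    lo, hi = (int(part) for part in cpu_range.split("-", 1))
    if hi < lo:
        raise ValueError(f"invalid CPU range: {cpu_range!r}")
    return hi - lo + 1


# Values taken from the command line in this report.
allocated = cpus_in_range("0-195")  # CPUs covered by -Cr 0-195
threads = 192                       # value passed to -t
print(f"{allocated} CPUs in range, {threads} threads requested")
```

The range covers 196 CPUs while 192 threads are requested, so even at full load four CPUs in the range would carry no worker thread.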

First Bad Commit

No response

Relevant log output

[2025-04-21 18:24:59]
  CPU (avg): 0.3% | Active Cores: 2/384
  Mem: 16.3% (Used: 41.62 GB)
  Net: Sent=675733.33 MB | Recv=129878.20 MB
  NUMA Stats: {}

[2025-04-21 18:25:06]
  CPU (avg): 0.2% | Active Cores: 1/384
  Mem: 16.3% (Used: 41.59 GB)
  Net: Sent=679569.56 MB | Recv=129881.75 MB
  NUMA Stats: {}

[2025-04-21 18:25:13]
  CPU (avg): 0.2% | Active Cores: 3/384
  Mem: 17.3% (Used: 45.49 GB)
  Net: Sent=683405.86 MB | Recv=129885.21 MB
  NUMA Stats: {}

[2025-04-21 18:25:20]
  CPU (avg): 0.8% | Active Cores: 5/384
  Mem: 16.5% (Used: 42.38 GB)
  Net: Sent=683405.96 MB | Recv=129885.34 MB
  NUMA Stats: {}
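
The report does not say which tool produced the log above. As a rough way to reproduce the "CPU (avg)" and "Active Cores" figures on a Linux node, the sketch below diffs two snapshots of /proc/stat. The function names and the 5% threshold for calling a core "active" are assumptions made for this illustration, not the reporter's method.

```python
import time


def parse_proc_stat(text):
    """Return {core_name: (busy_jiffies, total_jiffies)} for per-core lines."""
    cores = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep per-core lines ("cpu0", "cpu1", ...); skip the aggregate "cpu" line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        values = [int(v) for v in fields[1:]]
        # Field order: user nice system idle iowait irq softirq steal ...
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        total = sum(values[:8])  # guest time is already counted in user/nice
        cores[fields[0]] = (total - idle, total)
    return cores


def core_usage(before, after):
    """Per-core busy percentage between two parsed snapshots."""
    usage = {}
    for name, (busy1, total1) in after.items():
        busy0, total0 = before.get(name, (0, 0))
        elapsed = total1 - total0
        usage[name] = 100.0 * (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return usage


def summarize(usage, active_threshold=5.0):
    """Return (average percent, active core count, total core count)."""
    if not usage:
        return 0.0, 0, 0
    avg = sum(usage.values()) / len(usage)
    active = sum(1 for pct in usage.values() if pct >= active_threshold)
    return avg, active, len(usage)


def sample_once(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart and summarize them."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    return summarize(core_usage(before, after))
```

Calling `sample_once()` on a node while rpc-server is handling requests returns a tuple comparable to one log entry; a different threshold will change the active-core count.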
