AMD GPU locks up using "koboldcpp" or "llama.cpp"...
Date: Thu, 09 Oct 2025 11:54:12 UTC
Hi,

I have opened a bug report here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=289813

Just to get a few more pointers, I'd like to ask whether you can run inference with "koboldcpp" and "llama.cpp" on an AMD GPU without lock-ups. To try it quickly, you can check out, build, and benchmark as follows:

as root:
--------
pkg install gmake vulkan-loader opencl mesa-devel python
(attention: this installs 'mesa-devel' and remaps your current libGL and
related libraries. After testing I suggest removing 'mesa-devel' again,
as it gave me problems under Plasma 6)

as user:
--------
vulkaninfo   (looks good?)
clinfo       (looks good, too?)

mkdir -p ~/work/src
cd ~/work/src
fetch -o MN-12B-Mag-Mell-R1.IQ4_XS.gguf 'https://huggingface.co/mradermacher/MN-12B-Mag-Mell-R1-GGUF/resolve/main/MN-12B-Mag-Mell-R1.IQ4_XS.gguf?download=true'

# koboldcpp
cd ~/work/src
git clone --depth 1 https://github.com/LostRuins/koboldcpp
cd koboldcpp
gmake -j16 LLAMA_CLBLAST=1 LLAMA_OPENBLAS=1 LLAMA_VULKAN=1 LDFLAGS="-L/usr/local/lib"
python koboldcpp.py --usevulkan --gpulayers 999 --benchmark --model ../MN-12B-Mag-Mell-R1.IQ4_XS.gguf
(run it a few times; your GPU may eventually lock up)

# llama.cpp
cd ~/work/src
git clone --depth 1 https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B .build -DGGML_VULKAN=1 -DGGML_OPENCL=1
cmake --build .build --parallel 16
.build/bin/llama-bench -m ../MN-12B-Mag-Mell-R1.IQ4_XS.gguf -ngl 100 -fa 0,1
(run it a few times; your GPU may eventually lock up)

Thanks for trying, and for your feedback...

Regards,
Nils
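P.S.: Since both repro steps end with "run it a few times", scripting the repetition makes it less tedious. A minimal loop sketch; BENCH_CMD is a placeholder you would substitute with either benchmark invocation above (the llama-bench line or the koboldcpp.py line):

```shell
# BENCH_CMD is a placeholder; substitute the actual benchmark command,
# e.g. '.build/bin/llama-bench -m ../MN-12B-Mag-Mell-R1.IQ4_XS.gguf -ngl 100 -fa 0,1'
BENCH_CMD="echo benchmark-run"

# Repeat a handful of runs; stop early if a run fails (e.g. the GPU
# locked up and the process was killed or exited non-zero).
for i in 1 2 3 4 5; do
    echo "=== run $i ==="
    $BENCH_CMD || { echo "run $i failed (possible lock-up)"; break; }
done
```

If the GPU hangs mid-run, the process may simply stall rather than exit, so you may still need to watch the console.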