.Dd Created:2026-04-02|Updated:2026-04-02|
.de ocsi
\\$* ,
..
.de oxr
.ocsi .Xr \\$*
..
.de oit
.It \\$*
..
.de obdi
.Bl -dash -compact
.oit \\$*
..
.de obdl
.Bd -literal -compact
\\$*
..
.de onote
.Bl -hang -compact
.oit \\$*
.El
..
.de ocomm
.Bl -diag -compact
.oit \\$*
.El
..
.de opsy
.Pp
.Sy - \\$*
..
.de obc
.Bl -column \\$*
..
.de obc2
.obc opt desc
..
.de obc3
.obc option arguments description
..
.nr r 0
.if ( \nr == 1 ) \{\
.de cos
.Os OpenBSD 7.8 , linux |
..
.de dm
.opsy OpenBSD manpages: \\$*
..
.\}
.if ( \nr == 0 ) \{\
.de cos
.Os linux , OpenBSD 7.8 |
..
.de dm
.opsy Archlinux manpages: \\$*
..
.\}
.Dt LLAMA oh
.cos
.Nm llama
.Nd run language models locally
.Sh COMPILE
.obc compiling_get_models
.It clone : Ta Li git clone https://github.com/ggml-org/llama.cpp
.It compile : Ta Li cmake -B build -DLLAMA_CURL=OFF \; cmake --build build --config Release [-j MAX_CORES]
.It - For Vulkan (AMD GPU) : Ta Li cmake -B build -DGGML_VULKAN=ON -DLLAMA_CURL=OFF \; cmake --build build --config Release [-j MAX_CORES]
.It - When updating, before compiling : Ta Li rm -r build
.El
.Sh BINARIES
The binaries are located at:
.Pa LLAMA.CPP_PATH/build/bin/
.Sh OPTIONS
The following options apply to both the server and the CLI.
.obc _m_model_p
.It -m MODEL_PATH Ta : model to use.
.It --no-mmap Ta : disable memory mapping; useful with small models if the GPU can hold the whole model.
.It -ngl N Ta : offload N layers to the GPU.
.It --temp N Ta : sampling temperature.
.It -c N Ta : context size in tokens.
.It -t N Ta : number of threads.
.El
.Sh llama-server
.obc host_hos
.It --host HOST Ta : host to bind instead of 127.0.0.1.
.It --port PORT Ta : port to listen on instead of 8080.
.El
.Sh llama-cli
.obc __color__on_off_au
.It --color [on|off|auto] Ta : coloured chat on/off/auto.
.It -f FILEPATH Ta : read the prompt from FILEPATH.
.El
.Sh MODELS
Models can be downloaded from
.Lk https://huggingface.co/ HuggingFace ;
some of the tested models are listed in the links below.
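.Sh EXAMPLES
A minimal sketch of a typical session.
The model file, prompt file, port and layer count are illustrative
assumptions, not part of the distribution; substitute your own paths.
.Bd -literal -compact
# serve a model on all interfaces, offloading 99 layers to the GPU
./build/bin/llama-server -m models/Llama-3.2-3B-Instruct-Q8_0.gguf \e
    -ngl 99 -c 4096 --host 0.0.0.0 --port 8081

# query the running server (it speaks the OpenAI-compatible chat API)
curl http://127.0.0.1:8081/v1/chat/completions \e
    -H "Content-Type: application/json" \e
    -d '{"messages":[{"role":"user","content":"Hello"}]}'

# interactive coloured chat in the terminal, prompt read from a file
./build/bin/llama-cli -m models/Llama-3.2-3B-Instruct-Q8_0.gguf \e
    --temp 0.7 -c 4096 --color on -f prompt.txt
.Ed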
.Sh SEE ALSO
.oxr ai oh
.Xr whisper oh
.Ss links
.obc2
.It - Lk https://github.com/ggml-org/llama.cpp llama.cpp - GitHub
.It - Lk https://huggingface.co/ HuggingFace - model repositories
.El
.Ss models
.obc2
.It - Lk https://huggingface.co/NikolayKozloff/GLM-4.6V-Flash-Q8_0-GGUF GLM-4.6V-Flash-Q8_0-GGUF
.It - Lk https://huggingface.co/bartowski/TheDrummer_Cydonia-24B-v4.3-GGUF TheDrummer_Cydonia-24B-v4.3-GGUF
.It - Lk https://huggingface.co/SicariusSicariiStuff/Impish_Nemo_12B_GGUF Impish_Nemo_12B_GGUF
.It - Lk https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF Llama-3.2-3B-Instruct-GGUF
.It - Lk https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-uncensored-GGUF Llama-3.2-3B-Instruct-uncensored-GGUF
.It - Lk https://huggingface.co/bartowski/Qwen_Qwen3-0.6B-GGUF Qwen_Qwen3-0.6B-GGUF
.El
.Sh AUTHORS
.An -nosplit
.Xr ohazot oh |
.Xr about oh |
.Lk https://ohazot.com ohazot.com
.Aq Mt admin@ohazot.com