LLAMA(oh)                              LOCAL                              LLAMA(oh)

llama - run language models locally

COMPILE
    clone:   git clone https://github.com/ggml-org/llama.cpp
    compile: cmake -B build -DLLAMA_CURL=OFF ; cmake --build build --config Release [-j MAX_CORES]
    - For Vulkan (AMD GPUs): cmake -B build -DGGML_VULKAN=ON -DLLAMA_CURL=OFF ; cmake --build build --config Release [-j MAX_CORES]
    - When updating, remove the previous build before compiling: rm -r build

BINARIES
    The binaries are located at: LLAMA.CPP_PATH/build/bin/

OPTIONS
    The following options apply to both llama-server and llama-cli.
    -m MODEL_PATH : model to use.
    --no-mmap : disable memory mapping; useful with small models if the GPU can hold them.
    -ngl N : offload N layers to the GPU.
    --temp N : model temperature.
    -c N : context size.
    -t N : number of threads.

    llama-server
    --host HOST : host to use instead of 127.0.0.1.
    --port PORT : port to use instead of 8080.

    llama-cli
    --color [on|off|auto] : coloured chat on/off/auto.
    -f FILEPATH : file containing the prompt.

MODELS
    Models can be downloaded from HuggingFace: https://huggingface.co/ .
    Some of the tested models are listed under the links below.

SEE ALSO
    ai(oh) , whisper(oh)

links
    - llama.cpp - Github: https://github.com/ggml-org/llama.cpp
    - Huggingface - Models repositories: https://huggingface.co/

models
    - GLM-4.6V-Flash-Q8_0-GGUF: https://huggingface.co/NikolayKozloff/GLM-4.6V-Flash-Q8_0-GGUF
    - TheDrummer_Cydonia-24B-v4.3-GGUF: https://huggingface.co/bartowski/TheDrummer_Cydonia-24B-v4.3-GGUF
    - Impish_Nemo_12B_GGUF: https://huggingface.co/SicariusSicariiStuff/Impish_Nemo_12B_GGUF
    - Llama-3.2-3B-Instruct-GGUF: https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF
    - Llama-3.2-3B-Instruct-uncensored-GGUF: https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-uncensored-GGUF
    - Qwen_Qwen3-0.6B-GGUF: https://huggingface.co/bartowski/Qwen_Qwen3-0.6B-GGUF

AUTHORS
    ohazot(oh) | about(oh) | ohazot.com: https://ohazot.com

linux , OpenBSD 7.8 | Created: 2026-04-02 | Updated: 2026-04-02 | LLAMA(oh)
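EXAMPLES
    The options above can be combined as follows. This is a sketch, not runnable as-is: it assumes a compiled llama.cpp tree (paths as in BINARIES) and a GGUF model already downloaded; the model filename is hypothetical, substitute your own.

```shell
# Hypothetical model path; point this at any downloaded GGUF file.
MODEL="$HOME/models/Llama-3.2-3B-Instruct-Q4_K_M.gguf"

# Serve on the default 127.0.0.1:8080 with an 8192-token context,
# offloading 99 layers to the GPU (a value above the model's layer
# count simply offloads all of them):
./build/bin/llama-server -m "$MODEL" -c 8192 -ngl 99

# One-off interactive chat with the same model, colourised,
# with a slightly creative temperature:
./build/bin/llama-cli -m "$MODEL" -ngl 99 --temp 0.7 --color
```

    Run from LLAMA.CPP_PATH; otherwise use the full path to build/bin/.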
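    Once llama-server is running it can be queried over HTTP; it exposes an OpenAI-compatible chat endpoint. A sketch, assuming the server is already running on the default host and port from OPTIONS:

```shell
# Query a running llama-server instance (default 127.0.0.1:8080).
# The messages are placeholders; the server answers with a JSON
# chat-completion object.
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user",   "content": "Say hello."}
        ],
        "temperature": 0.7
      }'
```

    Use --host and --port in the URL if the server was started with non-default values.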