What Is llama.cpp?

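llama.cpp is an open-source C/C++ inference engine that runs large language models efficiently on ordinary hardware, including CPU-only machines. It popularized running quantized open-weight models locally, where weights are stored at reduced precision to cut memory use, without requiring specialized accelerators.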
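
To make "quantized" concrete, here is a minimal Python sketch of block-wise symmetric quantization and the memory saving it implies. It is an illustration under stated assumptions, not llama.cpp's actual code or file format: the function names (quantize_block, dequantize_block, approx_size_gib) and the 4-bit setting are hypothetical choices made for this example, and the size estimate ignores the per-block scale factors and other metadata that a real format stores.

```python
def quantize_block(values, bits=4):
    """Symmetrically quantize one block of floats to small signed integers.

    Hypothetical helper for illustration; returns (scale, quantized_ints).
    """
    qmax = 2 ** (bits - 1) - 1                # largest integer magnitude, 7 for 4 bits
    amax = max(abs(v) for v in values)        # largest magnitude in the block
    scale = amax / qmax if amax > 0 else 1.0  # one shared scale per block
    # Round each value to the nearest integer step and clamp to the valid range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return scale, q


def dequantize_block(scale, q):
    """Reconstruct approximate floats from the shared scale and the integers."""
    return [scale * x for x in q]


def approx_size_gib(n_params, bits_per_weight):
    """Rough weight-storage estimate in GiB, ignoring scales and metadata."""
    return n_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    block = [0.1, -0.7, 0.3, 0.0]
    scale, q = quantize_block(block)
    print("scale:", scale)
    print("quantized:", q)
    print("reconstructed:", dequantize_block(scale, q))
    # Compare 16-bit and 4-bit storage for a model with 7 billion parameters.
    print("16-bit GiB:", approx_size_gib(7e9, 16))
    print(" 4-bit GiB:", approx_size_gib(7e9, 4))
```

Under these assumptions, a model with 7 billion parameters needs roughly 13 GiB for its weights at 16 bits each but only about 3.3 GiB at 4 bits each, which is the kind of reduction that lets such models fit in the memory of an ordinary machine. The cost is a small reconstruction error per weight, bounded here by half the block's scale.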
