llama.cpp (mobile)
Frameworks & SDKs
113.5k stars
llama.cpp is a C/C++ implementation of inference for LLaMA and similar large language models. It runs efficiently on ARM and x86 CPUs, including those in mobile devices, and supports 4-bit and 8-bit quantization to reduce memory usage. It is released under the MIT license.
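
To give a rough sense of why quantization matters on memory-constrained devices, the sketch below estimates the storage needed for model weights at different bit widths. It is a back-of-the-envelope calculation, not part of llama.cpp: the function name `weight_memory_gib` and the 7-billion-parameter model size are illustrative assumptions, and the estimate ignores per-block scale factors, the KV cache, and runtime buffers, so real usage will be somewhat higher.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Hypothetical helper for illustration: assumes every weight is stored
    at exactly `bits_per_weight` bits, with no quantization metadata.
    """
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    n_params = 7e9  # assumed model size, chosen only as an example
    for bits in (16, 8, 4):
        gib = weight_memory_gib(n_params, bits)
        print(f"{bits:>2}-bit weights: about {gib:.2f} GiB")
```

Under these assumptions, moving from 16-bit to 4-bit weights cuts weight storage to one quarter, which is often the difference between a model fitting in a phone's RAM or not.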