https://github.com/ggerganov/llama.cpp

https://github.com/abetlen/llama-cpp-python

tutorial

using the GPU
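
A minimal sketch of GPU offload through llama-cpp-python, not a definitive recipe. It assumes the package was installed with a GPU backend enabled (the build flags vary by version and backend, so check the project README) and that a GGUF model file exists locally. The helper names `gpu_kwargs` and `load_model` and the path `model.gguf` are hypothetical. `n_gpu_layers` and `n_ctx` are constructor parameters of `llama_cpp.Llama`, and `n_gpu_layers=-1` asks it to offload all layers.

```python
from typing import Any, Dict


def gpu_kwargs(model_path: str, n_gpu_layers: int = -1, n_ctx: int = 2048) -> Dict[str, Any]:
    """Build constructor arguments for llama_cpp.Llama (hypothetical helper).

    n_gpu_layers=-1 requests that every layer be offloaded to the GPU;
    a smaller positive number offloads only that many layers, which can
    help when the model does not fit in GPU memory.
    """
    return {
        "model_path": str(model_path),
        "n_gpu_layers": n_gpu_layers,
        "n_ctx": n_ctx,
    }


def load_model(model_path: str, **overrides: Any):
    """Load a model with GPU offload (hypothetical helper).

    The import is deferred so that this file can be read and tested
    on a machine where llama-cpp-python is not installed.
    """
    from llama_cpp import Llama

    params = {**gpu_kwargs(model_path), **overrides}
    return Llama(**params)


# Example usage (needs llama-cpp-python and a real model file):
#   llm = load_model("model.gguf")
#   out = llm("Q: What is llama.cpp? A:", max_tokens=32)
#   print(out["choices"][0]["text"])
```

If the installed build has no GPU support, `n_gpu_layers` has no effect and inference runs on the CPU, so the load-time log output is worth checking to confirm that layers were actually offloaded.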