For fully local execution, llama.cpp enables running compatible open models in the GGUF format, with optional GPU acceleration.

AI21 publishes official Jamba model weights on the Hugging Face Hub, and community contributors may provide GGUF-format conversions (e.g., Jamba Mini 1.7) for use with llama.cpp.
Note: AI21 does not distribute or support GGUF builds and cannot verify the accuracy of third-party conversions. Be sure to review the model’s license terms and consult the llama.cpp documentation before use.
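
As an illustration only, below is a minimal sketch of loading a local GGUF file through the llama-cpp-python bindings (one of several ways to drive llama.cpp). The bindings, the filename, and the parameter values shown are assumptions for the example, not an AI21-provided or AI21-verified recipe; substitute the community GGUF conversion you have actually downloaded and reviewed.

```python
from llama_cpp import Llama

# Hypothetical path to a community GGUF conversion you have downloaded locally.
MODEL_PATH = "jamba-mini-1.7-Q4_K_M.gguf"

# Load the model; n_gpu_layers=-1 offloads all layers to the GPU when llama.cpp
# was built with GPU support, while 0 keeps inference fully on the CPU.
llm = Llama(
    model_path=MODEL_PATH,
    n_ctx=4096,       # context window size for this session
    n_gpu_layers=-1,  # optional GPU acceleration; set to 0 for CPU-only
)

# Run a simple chat completion against the locally loaded model.
output = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Summarize what the GGUF format is in one sentence."}
    ],
    max_tokens=128,
)

print(output["choices"][0]["message"]["content"])
```

The same model file can also be served directly with the llama.cpp command-line tools; the Python bindings are used here only to keep the example self-contained.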