turbopilot/BUILD.md
2023-04-23 08:38:17 +01:00

Build TurboPilot

TurboPilot is a C++ program that uses the GGML project to parse and run language models.

Dependencies

To build TurboPilot you will need CMake, Boost (libboost), a C++ toolchain and GNU Make.

Ubuntu

On Ubuntu you can install these things with:

sudo apt-get update
sudo apt-get install libboost-dev cmake build-essential

macOS

If you use Homebrew, you can install these dependencies by running:

brew install cmake boost
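
Before moving on, it can help to confirm that the command line tools are actually on your PATH. The following is a minimal sketch rather than part of the project: missing_tools is a hypothetical helper name, and it assumes your C++ compiler is exposed as c++. Boost is a library rather than a command, so this check does not cover it.

```shell
# Hypothetical helper: print each named tool that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumption: the C++ toolchain provides a `c++` command.
missing=$(missing_tools cmake make c++)
if [ -n "$missing" ]; then
  echo "Missing build tools:" $missing
else
  echo "All build tools found"
fi
```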

Checkout Submodules

Make sure the ggml subproject is checked out by running git submodule init followed by git submodule update from the root of the repository.
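
One way to confirm that the submodule is populated before configuring is sketched below. It assumes you are in the root of the repository and that a checked-out ggml directory contains a CMakeLists.txt at its top level (which the later cmake .. step relies on); ggml_checked_out is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if <dir>/ggml looks like a populated checkout.
# Assumption: a checked-out ggml contains CMakeLists.txt at its top level.
ggml_checked_out() {
  [ -f "${1:-.}/ggml/CMakeLists.txt" ]
}

if ggml_checked_out; then
  echo "ggml submodule is checked out"
else
  echo "ggml is missing; run: git submodule init && git submodule update"
fi
```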

Prepare and Build

Configure CMake to build the project by running the following from the root of the repository:

mkdir ggml/build
cd ggml/build
cmake ..

If you are running on Linux, you can optionally compile a static build with cmake -D CMAKE_EXE_LINKER_FLAGS="-static" .. which should make your binary portable across different Linux distributions.

From here you can now build the components that make up TurboPilot:

make codegen codegen-quantize codegen-serve

Where:

  • codegen is a command line tool for testing out prompts in a lightweight way (a lot like llama.cpp)
  • codegen-serve is the actual REST server that can be used to connect to VS Code
  • codegen-quantize is the tool for quantizing models exported by the conversion script. For more details see Converting and Quantizing The Models.
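
On a multi-core machine the build is usually quicker if GNU Make runs jobs in parallel via its -j option. A minimal sketch, assuming nproc (Linux) or sysctl -n hw.ncpu (macOS) is available; cpu_count is a hypothetical helper name. It prints the resulting command so you can inspect it before running it from ggml/build.

```shell
# Hypothetical helper: print the number of CPU cores, falling back to 1.
cpu_count() {
  nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 1
}

# Print the parallel build command; remove the echo and quotes to run it.
echo "make -j$(cpu_count) codegen codegen-quantize codegen-serve"
```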

Building with OpenBLAS

BLAS libraries accelerate linear algebra operations. You can use the OpenBLAS implementation with TurboPilot to make generation faster, particularly for longer prompts.

When you run cmake, you can additionally set -D GGML_OPENBLAS=On to enable BLAS support.

E.g. cmake .. -D GGML_OPENBLAS=On

Building with cuBLAS

cuBLAS is the BLAS library provided by NVIDIA that runs linear algebra code on your GPU. This can speed up the application significantly, especially when working with long prompts.

Install the CUDA SDK for your Operating System

You will need nvcc and the libcublas-dev dependencies as a bare minimum. Follow NVIDIA's CUDA installation guide for more detailed installation instructions.

Configuring CMake with cuBLAS

You will need to set -DGGML_CUBLAS=ON and also pass the path to your nvcc executable with -DCMAKE_CUDA_COMPILER=/path/to/nvcc.

Full example: cmake -DGGML_CUBLAS=ON -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc ..
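
If you are unsure where nvcc lives, a small helper can locate it before you configure. This is a sketch under assumptions: find_nvcc is a hypothetical name, and the fallback is simply the path used in the full example above. It prints the cmake command rather than running it.

```shell
# Hypothetical helper: print the path to nvcc, or fail if it cannot be found.
find_nvcc() {
  if command -v nvcc >/dev/null 2>&1; then
    command -v nvcc
  elif [ -x /usr/local/cuda/bin/nvcc ]; then
    # Assumption: fall back to the location used in the example above.
    echo /usr/local/cuda/bin/nvcc
  else
    return 1
  fi
}

if nvcc_path=$(find_nvcc); then
  echo "cmake -DGGML_CUBLAS=ON -DCMAKE_CUDA_COMPILER=$nvcc_path .."
else
  echo "nvcc not found; install the CUDA SDK first" >&2
fi
```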