Turbopilot is an open source large-language-model based code completion engine that runs locally on CPU

TurboPilot

TurboPilot is a self-hosted copilot clone which uses the library behind llama.cpp to run the 6-billion-parameter Salesforce Codegen model in 4GiB of RAM. It is heavily based on and inspired by the fauxpilot project.
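As a rough sanity check on the 4GiB figure, here is a back-of-the-envelope estimate in Python. It is only a sketch: the block layout (32 four-bit weights sharing one 32-bit scale) and the nominal parameter counts are assumptions for illustration, not details taken from this README, and real model files also contain some unquantized tensors.

```python
# Rough memory estimate for a block-quantized model.
# Assumption (not from the README): weights are stored in blocks of 32,
# each block holding 32 four-bit values plus one 32-bit scale factor.

def quantized_size_gib(n_params: float, bits_per_weight: int = 4,
                       block_size: int = 32, scale_bits: int = 32) -> float:
    """Approximate size in GiB of n_params block-quantized weights."""
    bits_per_block = block_size * bits_per_weight + scale_bits
    total_bits = n_params / block_size * bits_per_block
    return total_bits / 8 / 2**30


def float32_size_gib(n_params: float) -> float:
    """Size in GiB of the same weights stored as 32-bit floats."""
    return n_params * 4 / 2**30


if __name__ == "__main__":
    # Nominal parameter counts; the true counts differ somewhat.
    for name, n in [("2B", 2e9), ("6B", 6e9)]:
        print(f"{name}: ~{quantized_size_gib(n):.2f} GiB quantized, "
              f"~{float32_size_gib(n):.2f} GiB as float32")
```

Under these assumptions the 6B model comes out at about 3.5 GiB once quantized, compared with over 22 GiB as float32, which is consistent with the 4GiB claim.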

NB: This is a proof of concept right now rather than a stable tool. Autocompletion is quite slow in this version of the project. Feel free to play with it, but your mileage may vary.

a screen recording of turbopilot running through fauxpilot plugin

Getting Started

git clone https://github.com/ravenscroftj/turbopilot
cd turbopilot
git submodule update --init
cd ggml
mkdir build
cd build
cmake ..
make codegen codegen-quantize

Getting The Models

Direct Download

You can download the pre-converted, pre-quantized models from Google Drive. I've made the multi flavour models with 2B and 6B parameters available. These models are pre-trained on C, C++, Go, Java, JavaScript, and Python.

Convert The Models Yourself

Start by downloading either the 2B or 6B GPT-J versions of CodeGen.

You could also experiment with other model sizes, such as 16B, or try the mono models (2B, 6B, 16B), which are fine-tuned on Python only but outperform the multi models in some cases (see the original paper for details).

You will also need to place vocab.json and added_tokens.json in the directory along with the model to make the conversion script work. This is a temporary limitation that I'll remove at some point.
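A small pre-flight check can catch a missing file before the conversion script fails. The sketch below is not part of the repository: the helper name `missing_conversion_inputs` is hypothetical, and it only verifies that the two JSON files named above and at least one .bin file are present in the model directory.

```python
from pathlib import Path

# Files the README says must sit next to the model weights.
REQUIRED_FILES = ("vocab.json", "added_tokens.json")


def missing_conversion_inputs(model_dir: str) -> list[str]:
    """Return the names of expected inputs that are absent from model_dir.

    Hypothetical helper: checks only for presence, not for file contents.
    """
    root = Path(model_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    # The weights file name is not assumed; any .bin file counts.
    if not any(root.glob("*.bin")):
        missing.append("*.bin")
    return missing


if __name__ == "__main__":
    import sys

    target = sys.argv[1] if len(sys.argv) > 1 else "."
    problems = missing_conversion_inputs(target)
    if problems:
        print(f"Missing in {target}: {', '.join(problems)}")
    else:
        print(f"{target} looks ready for conversion")
```

Note that this checks presence only. If LFS smudging was skipped, a .bin file may still be a small pointer file rather than the real weights.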

You can git clone directly from the Hugging Face URLs above. To save time, you can disable LFS on first checkout and selectively pull only the files you need: conversion requires just the .bin files, not the large .zst files. Here is an example:

GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/moyix/codegen-16B-multi-gptj
cd codegen-16B-multi-gptj
git config lfs.fetchexclude "*.zst"
git lfs fetch
git lfs checkout "*.bin"

Install Python Dependencies

The convert-codegen-to-ggml.py script requires Python 3 (I used 3.10). Install the dependencies with pip install -r requirements.txt.

Convert The Model

python convert-codegen-to-ggml.py ./codegen-6B-multi-gptj 0

Quantize the Model

./bin/codegen-quantize ./codegen-6B-multi-gptj/ggml-model-f32.bin ./codegen-6B-multi-gptj/ggml-model-quant.bin 2
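The convert and quantize steps can be chained from one script. The following is a minimal sketch, not something shipped with the project: `build_commands` and `run_pipeline` are hypothetical names, the trailing `0` and `2` arguments are copied verbatim from the commands above, and the default paths assume both tools are reachable from the current working directory, which you will probably need to adjust.

```python
import subprocess
from pathlib import Path


def build_commands(model_dir: str,
                   convert_script: str = "convert-codegen-to-ggml.py",
                   quantize_bin: str = "./bin/codegen-quantize") -> list[list[str]]:
    """Build the convert and quantize command lines shown in the README.

    The default paths are assumptions; pass your own if the script and
    the binary live in different directories.
    """
    model = Path(model_dir)
    f32_model = model / "ggml-model-f32.bin"      # written by the convert step
    quant_model = model / "ggml-model-quant.bin"  # written by the quantize step
    convert = ["python", convert_script, str(model), "0"]
    quantize = [quantize_bin, str(f32_model), str(quant_model), "2"]
    return [convert, quantize]


def run_pipeline(model_dir: str) -> None:
    """Run both steps in order, stopping at the first failure."""
    for cmd in build_commands(model_dir):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)


# Example (not executed here): run_pipeline("./codegen-6B-multi-gptj")
```

Using `check=True` means a failed conversion raises an error instead of letting the quantize step run against a missing or partial file.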

Acknowledgements