TurboPilot is an open-source, large-language-model-based code completion engine that runs locally on CPU.

# TurboPilot

TurboPilot is a self-hosted Copilot clone which uses ggml, the library behind llama.cpp, to run the 6-billion-parameter Salesforce CodeGen model in 4 GiB of RAM. It is heavily based on and inspired by the fauxpilot project.

NB: This is a proof of concept right now rather than a stable tool. Autocompletion is quite slow in this version of the project. Feel free to play with it, but your mileage may vary.

*A screen recording of TurboPilot running through the fauxpilot plugin.*

## Getting Started

```shell
git clone https://github.com/ravenscroftj/turbopilot
cd turbopilot

# fetch the ggml submodule (init alone only registers it; update pulls the code)
git submodule init
git submodule update

# build the codegen binaries inside the submodule
cd ggml
mkdir build
cd build
cmake ..
make codegen codegen-quantize
```

## Getting The Models

Start by downloading either the 2B or 6B GPT-J-converted versions of CodeGen.

### Convert The Model

```shell
# the trailing 0 selects full-precision (f32) output
python convert-codegen-to-ggml.py ./codegen-6B-multi-gptj 0
```

### Quantize the model

```shell
# the trailing 2 selects the 4-bit quantization type
./bin/codegen-quantize ../../codegen-6B-multi-gptj/ggml-model-f32.bin ../../codegen-6B-multi-gptj/ggml-model-quant.bin 2
```
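A rough back-of-the-envelope calculation shows why 4-bit quantization is what makes the 6B model fit in about 4 GiB of RAM. This is only an estimate: it ignores the per-block scale factors quantized formats store and the extra working memory needed at inference time.

```python
# Approximate memory footprint of 6B weights at different precisions.
params = 6e9

f32_gib = params * 4 / 2**30    # 32-bit floats: 4 bytes per weight
q4_gib = params * 0.5 / 2**30   # 4-bit weights: 0.5 bytes per weight

print(f"f32: ~{f32_gib:.1f} GiB, 4-bit: ~{q4_gib:.1f} GiB")
# f32 comes out around 22 GiB; 4-bit around 3 GiB, leaving headroom
# inside a 4 GiB budget for activations and scratch buffers.
```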

### Run the model

```shell
# -t: number of threads, -m: path to the quantized model, -p: prompt to complete
./bin/codegen -t 6 -m ../../codegen-6B-multi-gptj/ggml-model-quant.bin -p "def main("
```

## Acknowledgements