GPT4All
Demo, data, and code to train an assistant-style large language model with ~800k GPT-3.5-Turbo generations based on LLaMa.
Try it yourself
Clone this repository and download the CPU quantized gpt4all model. Place the quantized model in the chat directory and start chatting by running:
- ./chat/gpt4all-lora-quantized-OSX-m1 on M1 Mac/OSX
- ./chat/gpt4all-lora-quantized-linux-x86 on Windows/Linux
To compile for custom hardware, see our fork of the Alpaca C++ repo.
Note: the full model on GPU (16GB of RAM required) performs much better in our qualitative evaluations.
Reproducibility
Trained LoRA Weights:
- gpt4all-lora: https://huggingface.co/nomic-ai/gpt4all-lora
- gpt4all-lora-epoch-2: https://huggingface.co/nomic-ai/gpt4all-lora-epoch-2
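These LoRA adapters are applied on top of a LLaMa 7B base model that you obtain yourself. As a minimal, illustrative sketch (the base-model path, prompt, and generation settings below are assumptions, not part of this repository), the adapter can be loaded with transformers and peft:

# Illustrative only: load the published LoRA adapter onto a locally available base model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_path = "path/to/your/llama-7b"  # assumption: converted LLaMa 7B weights you already have
tokenizer = AutoTokenizer.from_pretrained(base_model_path)
base = AutoModelForCausalLM.from_pretrained(base_model_path, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, "nomic-ai/gpt4all-lora")  # or nomic-ai/gpt4all-lora-epoch-2

prompt = "Explain LoRA fine-tuning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))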
Raw Data:
We are not distributing a LLaMa 7B checkpoint.
You can reproduce our trained model by doing the following:
Setup
Clone the repo
git clone --recurse-submodules git@github.com:nomic-ai/gpt4all.git
git submodule init && git submodule update
Setup the environment
python -m pip install -r requirements.txt
cd transformers
pip install -e .
cd ../peft
pip install -e .
Training
accelerate launch --dynamo_backend=inductor --num_processes=8 --num_machines=1 --machine_rank=0 --deepspeed_multinode_launcher standard --mixed_precision=bf16 --use_deepspeed --deepspeed_config_file=configs/deepspeed/ds_config.json train.py --config configs/train/finetune-7b.yaml
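For orientation, the command above drives LoRA fine-tuning of the base model on the collected data. Below is a minimal, self-contained sketch of that style of training with transformers and peft; it is not the repository's actual train.py, and the model path, dataset file, and hyperparameters are assumptions chosen only for illustration.

# Illustrative LoRA fine-tuning sketch (NOT the repository's train.py).
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from peft import LoraConfig, get_peft_model

base_model_path = "path/to/your/llama-7b"  # assumption: local converted LLaMa 7B weights
tokenizer = AutoTokenizer.from_pretrained(base_model_path)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(base_model_path, torch_dtype=torch.bfloat16)
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=32, lora_dropout=0.05,        # hypothetical LoRA hyperparameters
    target_modules=["q_proj", "v_proj"],
    bias="none", task_type="CAUSAL_LM"))

# assumption: a JSON-lines file of {"prompt": ..., "response": ...} pairs
data = load_dataset("json", data_files="data/train.jsonl")["train"]
data = data.map(lambda ex: tokenizer(ex["prompt"] + ex["response"],
                                     truncation=True, max_length=512),
                remove_columns=data.column_names)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="ckpts", per_device_train_batch_size=4,
                           num_train_epochs=2, bf16=True, logging_steps=10),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()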
Generate
python generate.py --config configs/generate/generate.yaml --prompt "Write a script to reverse a string in Python"
If you utilize this repository, models, or data in a downstream project, please consider citing it with:
@misc{gpt4all,
  author = {Yuvanesh Anand and Zach Nussbaum and Brandon Duderstadt and Benjamin Schmidt and Andriy Mulyar},
  title = {GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nomic-ai/gpt4all}},
}