A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA.


A gradio webui for running large language models locally. Supports gpt-j-6B, gpt-neox-20b, opt, galactica, and many others.

Its goal is to become the AUTOMATIC1111/stable-diffusion-webui of text generation.

webui screenshot


Installation

conda env create -f environment.yml

This installs the CUDA version of PyTorch, which assumes that you have an NVIDIA GPU. If you want to run this on an AMD GPU, you should install the ROCm version of PyTorch instead.
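To confirm which PyTorch build ended up in the environment, a quick check along these lines may help. This is a sketch and not part of the repository: `describe_backend` is a hypothetical helper name, and it relies on `torch.version.cuda` and `torch.version.hip`, which are set on CUDA and ROCm builds respectively.

```python
def describe_backend():
    """Return a short description of the installed PyTorch build."""
    try:
        import torch
    except ImportError:
        return "pytorch is not installed"

    # torch.version.hip is set on ROCm builds, torch.version.cuda on CUDA builds.
    hip = getattr(torch.version, "hip", None)
    cuda = getattr(torch.version, "cuda", None)
    if hip:
        build = f"ROCm {hip}"
    elif cuda:
        build = f"CUDA {cuda}"
    else:
        build = "CPU-only"

    # ROCm builds also report their devices through torch.cuda.
    available = torch.cuda.is_available()
    return f"{build} build, GPU available: {available}"


if __name__ == "__main__":
    print(describe_backend())
```

Run it inside the activated environment. If it reports a CPU-only build or no available GPU, the installed PyTorch probably does not match your hardware.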

Downloading models

Models should be placed under models/model-name.

Hugging Face

Hugging Face is the main place to download models. For instance, here you can find the files for the model gpt-j-6B.

The files that you need to download and put under models/gpt-j-6B are the json, txt, and pytorch*.bin files. The remaining files are not necessary.
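As a rough illustration of the rule above, the following sketch filters a file listing down to the files that are needed. The helper name `needed_files` and the example listing are hypothetical; only the three patterns come from the text.

```python
from fnmatch import fnmatch

# Patterns taken from the description above: json, txt, and pytorch*.bin files.
NEEDED_PATTERNS = ("*.json", "*.txt", "pytorch*.bin")


def needed_files(filenames):
    """Keep only the files that have to be placed under models/model-name."""
    return [
        name
        for name in filenames
        if any(fnmatch(name, pattern) for pattern in NEEDED_PATTERNS)
    ]


if __name__ == "__main__":
    # Illustrative listing, not the actual contents of any repository.
    listing = [
        "config.json",
        "merges.txt",
        "pytorch_model.bin",
        "flax_model.msgpack",
        "README.md",
    ]
    print(needed_files(listing))
```

Everything the filter drops, such as weights for other frameworks, can be skipped when downloading.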


GPT-4chan has been removed from Hugging Face, so you need to download it elsewhere. You have two options:

Converting to pytorch

This webui allows you to switch between different models on the fly, so it must be fast to load the models from disk.

One way to make this process about 10x faster is to convert the models to pytorch format using the script convert-to-torch.py. Create a folder called torch-dumps and then make the conversion with:

python convert-to-torch.py models/model-name/

The output model will be saved to torch-dumps/model-name.pt. This is the default way to load all models except for gpt-neox-20b, opt-13b, OPT-13B-Erebus, gpt-j-6B, and flan-t5. I don't remember why these models are exceptions.
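The path mapping and the list of exceptions can be sketched as follows. This is not the code of convert-to-torch.py or server.py, only a guess at the general idea: the function names are hypothetical, matching the exceptions by lower-cased name prefix is an assumption, and so is saving the whole model object with `torch.save` in half precision.

```python
from pathlib import Path

# Assumption: the exceptions listed above are matched by lower-cased name prefix.
DIRECT_LOAD_PREFIXES = ("gpt-neox-20b", "opt-13b", "gpt-j-6b", "flan-t5")


def dump_path(model_dir):
    """Map models/model-name/ to torch-dumps/model-name.pt."""
    return Path("torch-dumps") / f"{Path(model_dir).name}.pt"


def uses_torch_dump(model_name):
    """True if the model would be loaded from its torch dump by default."""
    return not model_name.lower().startswith(DIRECT_LOAD_PREFIXES)


def convert(model_dir):
    """Load a Hugging Face model folder and save it as a single torch file."""
    # Imported here so the helpers above work without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(model_dir, torch_dtype=torch.float16)
    target = dump_path(model_dir)
    target.parent.mkdir(exist_ok=True)
    torch.save(model, target)  # pickles the whole model object
    return target
```

A file written this way is read back with `torch.load`, which unpickles arbitrary Python objects, so only load dumps that you created yourself.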

If I get enough stars on this repository, I will make the process of loading models more transparent and straightforward.

Starting the webui

conda activate textgen
python server.py

Then browse to http://localhost:7860/?__theme=dark