# text-generation-webui
A Gradio web UI for running large language models locally. Supports gpt-j-6B, gpt-neox-20b, opt, galactica, and many others.
Its goal is to become the [AUTOMATIC1111/stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui) of text generation.
![webui screenshot](https://github.com/oobabooga/text-generation-webui/raw/main/webui.png)
## Installation

Create a conda environment and activate it:

```
conda create -n textgen
conda activate textgen
```

Install the appropriate PyTorch build for your GPU. For NVIDIA GPUs, this should work:

```
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
```

Install the remaining requirements:

```
pip install -r requirements.txt
```
## Downloading models
Models should be placed under `models/model-name`.
#### Hugging Face
Hugging Face is the main place to download models. These are the most important ones in my opinion:
* [gpt-j-6B](https://huggingface.co/EleutherAI/gpt-j-6B/tree/main)
* [gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b/tree/main)
* [opt](https://huggingface.co/models?search=facebook/opt)
* [galactica](https://huggingface.co/models?search=facebook/galactica)
* [\*-Erebus](https://huggingface.co/models?search=erebus)
The files that you need to download and put under `models/model-name` (for instance, `models/gpt-j-6B`) are the json, txt, and pytorch\*.bin files. The remaining files are not necessary.
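
As a sketch, the folder for gpt-j-6B might end up looking like this (exact file names vary from model to model, and `pytorch_model.bin` may instead be split into several numbered shards):

```
models/gpt-j-6B/
├── config.json
├── tokenizer_config.json
├── special_tokens_map.json
├── vocab.json
├── merges.txt
├── added_tokens.json
└── pytorch_model.bin
```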
#### GPT-4chan
[GPT-4chan](https://huggingface.co/ykilcher/gpt-4chan) has been removed from Hugging Face, so you need to download it elsewhere. You have two options:
* Torrent: [16-bit](https://archive.org/details/gpt4chan_model_float16) / [32-bit](https://archive.org/details/gpt4chan_model)
* Direct download: [16-bit](https://theswissbay.ch/pdf/_notpdf_/gpt4chan_model_float16/) / [32-bit](https://theswissbay.ch/pdf/_notpdf_/gpt4chan_model/)
## Converting to pytorch

This webui allows you to switch between different models on the fly, so loading a model from disk needs to be fast.

One way to make this process about 10x faster is to convert the models to pytorch format using the script `convert-to-torch.py`. Create a folder called `torch-dumps` and then make the conversion with:

```
python convert-to-torch.py models/model-name/
```

The output model will be saved to `torch-dumps/model-name.pt`. This is the default way to load all models except for `gpt-neox-20b`, `opt-13b`, `OPT-13B-Erebus`, `gpt-j-6B`, and `flan-t5`. I don't remember why these models are exceptions.
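
For reference, the conversion boils down to loading the model once with `transformers` and serializing the whole object with `torch.save()`, so later loads skip the slow layer-by-layer reconstruction. A minimal sketch of the idea (the actual `convert-to-torch.py` may differ in details such as dtype and device placement):

```python
# Minimal sketch of the conversion idea; the real convert-to-torch.py
# may handle dtype, device placement, and paths differently.
import sys
from pathlib import Path

import torch
from transformers import AutoModelForCausalLM

model_dir = Path(sys.argv[1])  # e.g. models/gpt-j-6B/
model = AutoModelForCausalLM.from_pretrained(model_dir)

Path("torch-dumps").mkdir(exist_ok=True)
# Pickle the entire model object; torch.load() can later restore it in one step.
torch.save(model, Path("torch-dumps") / f"{model_dir.name}.pt")
```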
If I get enough ⭐s on this repository, I will make the process of loading models saner and more customizable.
## Starting the webui

```
conda activate textgen
python server.py
```

Then browse to `http://localhost:7860/?__theme=dark`.
## Presets
Inference settings presets can be created under `presets/` as text files. These files are detected automatically at startup.
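
As a hypothetical example, assuming the preset files hold keyword arguments for the underlying `transformers` generation call (the existing files under `presets/` are the authoritative reference for the format, and the file name `my-preset.txt` is made up), a preset might look like:

```
do_sample=True,
top_p=0.9,
top_k=50,
temperature=0.7,
repetition_penalty=1.15,
```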
## Contributing
Pull requests are welcome.