text-generation-webui/requirements.txt

accelerate==0.32.*
aqlm[gpu,cpu]==1.1.6; platform_system == "Linux"
auto-gptq==0.7.1
bitsandbytes==0.43.*
colorama
datasets
einops
gradio==4.26.*
hqq==0.1.7.post3
jinja2==3.1.4
lm_eval==0.3.0
markdown
numba==0.59.*
numpy==1.26.*
optimum==1.17.*
pandas
peft==0.8.*
Pillow>=9.5.0
psutil
pyyaml
requests
rich
safetensors==0.4.*
scipy
sentencepiece
tensorboard
transformers==4.43.*
tqdm
wandb

# API
SpeechRecognition==3.10.0
flask_cloudflared==0.0.14
sse-starlette==1.6.5
tiktoken

# llama-cpp-python (CPU only, AVX2)
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.83+cpuavx2-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.83+cpuavx2-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.83+cpuavx2-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.83+cpuavx2-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"

# llama-cpp-python (CUDA, no tensor cores)
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.83+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.83+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.83+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.83+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"

# llama-cpp-python (CUDA, tensor cores)
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.83+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.83+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.83+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.83+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"

# CUDA wheels
https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7-py3-none-any.whl; platform_system == "Linux" and platform_machine != "x86_64"
https://github.com/oobabooga/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu122torch2.2.2cxx11abiFALSE-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/oobabooga/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu122torch2.2.2cxx11abiFALSE-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"
https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu123torch2.2cxx11abiFALSE-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu123torch2.2cxx11abiFALSE-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"
autoawq==0.2.5; platform_system == "Linux" or platform_system == "Windows"

# Blame view: commit message and commit date, then the file line in backticks
Update accelerate requirement from ==0.31.* to ==0.32.* (#6217) 2024-07-11 18:56:42 -04:00			`accelerate==0.32.*`
Bump aqlm[cpu,gpu] from 1.1.5 to 1.1.6 (#6157) 2024-06-27 20:13:02 -04:00			`aqlm[gpu,cpu]==1.1.6; platform_system == "Linux"`
Backend cleanup (#6025) 2024-05-21 12:32:02 -04:00			`auto-gptq==0.7.1`
Bump bitsandbytes to 0.43, add official Windows wheel 2024-03-10 11:30:53 -04:00			`bitsandbytes==0.43.*`
Add 4-bit LoRA support (#1200) 2023-04-16 22:26:52 -04:00			`colorama`
New yaml character format (#337 from TheTerrasque/feature/yaml-characters). This doesn't break backward compatibility with JSON characters. 2023-04-02 19:34:25 -04:00			`datasets`
Falcon support (trust-remote-code and autogptq checkboxes) (#2367). Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com> 2023-05-29 09:20:18 -04:00			`einops`
Update gradio requirement from ==4.25.* to ==4.26.* (#5832) 2024-04-11 01:24:53 -04:00			`gradio==4.26.*`
Revert "Bump hqq from 0.1.7.post3 to 0.1.8 (#6238)" This reverts commit 1c3671699c83424fb48adb7929d007f6e9056eaa. 2024-07-22 22:53:56 -04:00			`hqq==0.1.7.post3`
Bump jinja2 from 3.1.2 to 3.1.4 (#6172) 2024-06-27 20:12:39 -04:00			`jinja2==3.1.4`
Pin lm_eval package version 2023-12-24 12:22:31 -05:00			`lm_eval==0.3.0`
Sort the requirements 2023-03-15 11:40:03 -04:00			`markdown`
Add numba to requirements.txt 2024-03-10 19:13:29 -04:00			`numba==0.59.*`
Update numpy requirement from ==1.24.* to ==1.26.* (#5490) 2024-02-13 14:26:35 -05:00			`numpy==1.26.*`
Update optimum requirement from ==1.16.* to ==1.17.* (#5548) 2024-02-19 17:15:21 -05:00			`optimum==1.17.*`
Add an "Evaluate" tab to calculate the perplexities of models (#1322) 2023-04-20 23:20:33 -04:00			`pandas`
Revert "Update peft requirement from ==0.8.* to ==0.9.* (#5626)" This reverts commit 72a498ddd44a895205e53b5696742dc4ded9e12e. 2024-03-05 05:56:37 -05:00			`peft==0.8.*`
Add Pillow as a requirement 2023-04-08 17:48:46 -04:00			`Pillow>=9.5.0`
requirements: add psutil (#5819) 2024-04-06 22:02:20 -04:00			`psutil`
Add an "Evaluate" tab to calculate the perplexities of models (#1322) 2023-04-20 23:20:33 -04:00			`pyyaml`
Add requests to requirements.txt 2023-03-11 12:47:30 -05:00			`requests`
Add rich requirement 2023-12-20 00:58:36 -05:00			`rich`
Bump safetensors version 2024-02-04 21:40:25 -05:00			`safetensors==0.4.*`
Add 'scipy' to requirements.txt #2335 (#2343). Unlisted dependency of bitsandbytes. 2023-05-25 22:26:25 -04:00			`scipy`
Add CUDA wheels for llama-cpp-python by jllllll 2023-07-19 22:31:19 -04:00			`sentencepiece`
Add Tensorboard/Weights and biases integration for training (#2624) 2023-07-12 10:53:31 -04:00			`tensorboard`
Bump transformers to 4.43 2024-07-23 17:06:34 -04:00			`transformers==4.43.*`
Add CUDA wheels for llama-cpp-python by jllllll 2023-07-19 22:31:19 -04:00			`tqdm`
			`wandb`
Pin aiofiles version to fix statvfs issue 2023-08-09 11:07:55 -04:00
Do not install extensions requirements by default (#5621) 2024-03-04 02:46:39 -05:00			`# API`
			`SpeechRecognition==3.10.0`
			`flask_cloudflared==0.0.14`
Revert sse-starlette version bump because it breaks API request cancellation (#5873) 2024-04-18 14:05:00 -04:00			`sse-starlette==1.6.5`
Do not install extensions requirements by default (#5621) 2024-03-04 02:46:39 -05:00			`tiktoken`

Add back my llama-cpp-python wheels, bump to 0.2.65 (#5964) 2024-04-30 08:11:31 -04:00			`# llama-cpp-python (CPU only, AVX2)`
Bump llama-cpp-python to 0.2.83, add back tensorcore wheels. Also add back the progress bar patch. 2024-07-22 21:05:11 -04:00			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.83+cpuavx2-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.83+cpuavx2-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.83+cpuavx2-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.83+cpuavx2-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"`
Add back my llama-cpp-python wheels, bump to 0.2.65 (#5964) 2024-04-30 08:11:31 -04:00
			`# llama-cpp-python (CUDA, no tensor cores)`
Bump llama-cpp-python to 0.2.83, add back tensorcore wheels. Also add back the progress bar patch. 2024-07-22 21:05:11 -04:00			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.83+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.83+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.83+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.2.83+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"`

			`# llama-cpp-python (CUDA, tensor cores)`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.83+cu121-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.83+cu121-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.83+cu121-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`
			`https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda_tensorcores-0.2.83+cu121-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"`
Add back my llama-cpp-python wheels, bump to 0.2.65 (#5964) 2024-04-30 08:11:31 -04:00
Create alternative requirements.txt with AMD and Metal wheels (#4052) 2023-09-24 08:58:29 -04:00			`# CUDA wheels`
Bump ExLlamaV2 to 0.1.7 2024-07-11 15:33:46 -04:00			`https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"`
			`https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"`
			`https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`
			`https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7+cu121.torch2.2.2-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"`
			`https://github.com/oobabooga/exllamav2/releases/download/v0.1.7/exllamav2-0.1.7-py3-none-any.whl; platform_system == "Linux" and platform_machine != "x86_64"`
Bump flash-attention to 2.6.1 2024-07-12 23:16:11 -04:00			`https://github.com/oobabooga/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu122torch2.2.2cxx11abiFALSE-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"`
			`https://github.com/oobabooga/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu122torch2.2.2cxx11abiFALSE-cp310-cp310-win_amd64.whl; platform_system == "Windows" and python_version == "3.10"`
			`https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu123torch2.2cxx11abiFALSE-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"`
			`https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.1/flash_attn-2.6.1+cu123torch2.2cxx11abiFALSE-cp310-cp310-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"`
Backend cleanup (#6025) 2024-05-21 12:32:02 -04:00			`autoawq==0.2.5; platform_system == "Linux" or platform_system == "Windows"`