Commit Graph

549 Commits

Author SHA1 Message Date
oobabooga
53fbd2f245 Add TensorRT-LLM to the README 2024-06-25 14:45:37 -07:00
oobabooga
9420973b62 Downgrade PyTorch to 2.2.2 (#6124) 2024-06-14 16:42:03 -03:00
oobabooga
8930bfc5f4 Bump PyTorch, ExLlamaV2, flash-attention (#6122) 2024-06-13 20:38:31 -03:00
oobabooga
bd7cc4234d Backend cleanup (#6025) 2024-05-21 13:32:02 -03:00
oobabooga
6a1682aa95 README: update command-line flags with raw --help output
This helps me keep this up-to-date more easily.
2024-05-19 20:28:46 -07:00
oobabooga
7a728a38eb Update README 2024-05-07 02:59:36 -07:00
oobabooga
e61055253c Bump llama-cpp-python to 0.2.69, add --flash-attn option 2024-05-03 04:31:22 -07:00
oobabooga
1eba888af6 Update FUNDING.yml 2024-05-01 05:54:21 -07:00
oobabooga
51fb766bea Add back my llama-cpp-python wheels, bump to 0.2.65 (#5964) 2024-04-30 09:11:31 -03:00
oobabooga
c9b0df16ee Lint 2024-04-24 09:55:00 -07:00
oobabooga
4094813f8d Lint 2024-04-24 09:53:41 -07:00
oobabooga
9b623b8a78 Bump llama-cpp-python to 0.2.64, use official wheels (#5921) 2024-04-23 23:17:05 -03:00
oobabooga
d423021a48 Remove CTransformers support (#5807) 2024-04-04 20:23:58 -03:00
oobabooga
056717923f Document StreamingLLM 2024-03-10 19:15:23 -07:00
Bartowski
104573f7d4 Update cache_4bit documentation (#5649)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-03-07 13:08:21 -03:00
oobabooga
2ec1d96c91 Add cache_4bit option for ExLlamaV2 (#5645) 2024-03-06 23:02:25 -03:00
oobabooga
fa0e68cefd Installer: add back INSTALL_EXTENSIONS environment variable (for docker) 2024-03-06 11:31:06 -08:00
oobabooga
97dc3602fc Create an update wizard (#5623) 2024-03-04 15:52:24 -03:00
oobabooga
527ba98105 Do not install extensions requirements by default (#5621) 2024-03-04 04:46:39 -03:00
oobabooga
8bd4960d05 Update PyTorch to 2.2 (also update flash-attn to 2.5.6) (#5618) 2024-03-03 19:40:32 -03:00
oobabooga
7342afaf19 Update the PyTorch installation instructions 2024-02-08 20:36:11 -08:00
smCloudInTheSky
b1463df0a1 docker: add options for CPU only, Intel GPU, AMD GPU (#5380) 2024-01-28 11:18:14 -03:00
oobabooga
c1470870bb Update README 2024-01-26 05:58:40 -08:00
oobabooga
6ada77cf5a Update README.md 2024-01-22 03:17:15 -08:00
oobabooga
cc6505df14 Update README.md 2024-01-22 03:14:56 -08:00
oobabooga
2734ce3e4c Remove RWKV loader (#5130) 2023-12-31 02:01:40 -03:00
oobabooga
0e54a09bcb Remove exllamav1 loaders (#5128) 2023-12-31 01:57:06 -03:00
oobabooga
8e397915c9 Remove --sdp-attention, --xformers flags (#5126) 2023-12-31 01:36:51 -03:00
oobabooga
2706149c65 Organize the CMD arguments by group (#5027) 2023-12-21 00:33:55 -03:00
oobabooga
de138b8ba6 Add llama-cpp-python wheels with tensor cores support (#5003) 2023-12-19 17:30:53 -03:00
oobabooga
0a299d5959 Bump llama-cpp-python to 0.2.24 (#5001) 2023-12-19 15:22:21 -03:00
Water
674be9a09a Add HQQ quant loader (#4888)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-12-18 21:23:16 -03:00
oobabooga
f1f2c4c3f4 Add --num_experts_per_token parameter (ExLlamav2) (#4955) 2023-12-17 12:08:33 -03:00
oobabooga
41424907b1 Update README 2023-12-16 16:35:36 -08:00
oobabooga
0087dca286 Update README 2023-12-16 12:28:51 -08:00
oobabooga
3bbf6c601d AutoGPTQ: Add --disable_exllamav2 flag (Mixtral CPU offloading needs this) 2023-12-15 06:46:13 -08:00
oobabooga
623c92792a Update README 2023-12-14 07:56:48 -08:00
oobabooga
3580bed041 Update README 2023-12-14 07:54:20 -08:00
oobabooga
d5ec3c3444 Update README 2023-12-14 06:20:52 -08:00
oobabooga
5b283fff22 Update README 2023-12-14 06:15:14 -08:00
oobabooga
958799221f Update README 2023-12-14 06:09:03 -08:00
oobabooga
e7fa17740a Update README 2023-12-13 22:49:42 -08:00
oobabooga
03babe7d81 Update README 2023-12-13 22:47:08 -08:00
oobabooga
aad14174e4 Update README 2023-12-13 22:46:18 -08:00
oobabooga
783947a2aa Update README 2023-12-13 22:44:25 -08:00
oobabooga
7fef16950f Update README 2023-12-13 22:42:54 -08:00
oobabooga
d36e7f1762 Update README 2023-12-13 22:35:22 -08:00
oobabooga
9695db0ee4 Update README 2023-12-13 22:30:31 -08:00
oobabooga
d354f5009c Update README 2023-12-13 22:21:29 -08:00
oobabooga
0a4fad2d46 Update README 2023-12-13 22:20:37 -08:00