Commit Graph

31 Commits

Author SHA1 Message Date
oobabooga
e436d69e2b Add --no_xformers and --no_sdpa flags for ExllamaV2 2024-07-11 15:47:37 -07:00
oobabooga
2ec1d96c91 Add cache_4bit option for ExLlamaV2 (#5645) 2024-03-06 23:02:25 -03:00
oobabooga
d6bd71db7f ExLlamaV2: fix loading when autosplit is not set 2024-02-17 12:54:37 -08:00
oobabooga
a6730f88f7 Add --autosplit flag for ExLlamaV2 (#5524) 2024-02-16 15:26:10 -03:00
oobabooga
2a1063eff5 Revert "Remove non-HF ExLlamaV2 loader (#5431)" 2024-02-06 06:21:36 -08:00
This reverts commit cde000d478.
oobabooga
cde000d478 Remove non-HF ExLlamaV2 loader (#5431) 2024-02-04 01:15:51 -03:00
oobabooga
87dc421ee8 Bump exllamav2 to 0.0.12 (#5352) 2024-01-22 22:40:12 -03:00
oobabooga
bcba200790 Fix EOS being ignored in ExLlamav2 after previous commit 2023-12-20 07:54:06 -08:00
oobabooga
b15f510154 Optimize ExLlamav2 (non-HF) loader 2023-12-20 07:31:42 -08:00
oobabooga
f1f2c4c3f4 Add --num_experts_per_token parameter (ExLlamav2) (#4955) 2023-12-17 12:08:33 -03:00
Yiximail
1c74b3ab45 Fix partial unicode characters issue (#4837) 2023-12-08 09:50:53 -03:00
俞航
ac9f154bcc Bump exllamav2 from 0.0.8 to 0.0.10 & Fix code change (#4782) 2023-12-04 21:15:05 -03:00
oobabooga
58c6001be9 Add missing exllamav2 samplers 2023-11-16 07:09:40 -08:00
oobabooga
c0655475ae Add cache_8bit option 2023-11-02 11:23:04 -07:00
oobabooga
77abd9b69b Add no_flash_attn option 2023-11-02 11:08:53 -07:00
Brian Dashore
3345da2ea4 Add flash-attention 2 for windows (#4235) 2023-10-21 03:46:23 -03:00
Johan
1d5a015ce7 Enable special token support for exllamav2 (#4314) 2023-10-21 01:54:06 -03:00
Forkoz
8cce1f1126 Exllamav2 lora support (#4229) 2023-10-14 16:12:41 -03:00
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
turboderp
8a98646a21 Bump ExLlamaV2 to 0.0.5 (#4186) 2023-10-05 19:12:22 -03:00
oobabooga
56b5a4af74 exllamav2 typical_p 2023-09-28 20:10:12 -07:00
oobabooga
13ac55fa18 Reorder some functions 2023-09-19 13:51:57 -07:00
oobabooga
ff5d3d2d09 Add missing import 2023-09-18 16:26:54 -07:00
oobabooga
605ec3c9f2 Add a warning about ExLlamaV2 without flash-attn 2023-09-18 12:26:35 -07:00
oobabooga
ad8ac545a5 Tokenization improvements 2023-09-17 07:02:00 -07:00
saltacc
cd08eb0753 token probs for non HF loaders (#3957) 2023-09-17 10:42:32 -03:00
saltacc
ed6b6411fb Fix exllama tokenizers (#3954) 2023-09-16 09:42:38 -03:00
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
saltacc
f01b9aa71f Add customizable ban tokens (#3899) 2023-09-15 18:27:27 -03:00
Panchovix
34dc7306b8 Fix NTK (alpha) and RoPE scaling for exllamav2 and exllamav2_HF (#3897) 2023-09-13 02:35:09 -03:00
oobabooga
b7adf290fc Fix ExLlama-v2 path issue 2023-09-12 17:42:22 -07:00
oobabooga
18e6b275f3 Add alpha_value/compress_pos_emb to ExLlama-v2 2023-09-12 15:02:47 -07:00
oobabooga
c2a309f56e Add ExLlamaV2 and ExLlamav2_HF loaders (#3881) 2023-09-12 14:33:07 -03:00