11 Commits

Author SHA1 Message Date
oobabooga c52290de50 ExLlama with long context (#2875) 2023-06-25 22:49:26 -03:00
jllllll bef67af23c Use pre-compiled python module for ExLlama (#2770) 2023-06-24 20:24:17 -03:00
oobabooga eb30f4441f Add ExLlama+LoRA support (#2756) 2023-06-19 12:31:24 -03:00
oobabooga 5f418f6171 Fix a memory leak (credits for the fix: Ph0rk0z) 2023-06-19 01:19:28 -03:00
Forkoz 3cae1221d4 Update exllama.py - Respect model dir parameter (#2744) 2023-06-18 13:26:30 -03:00
oobabooga c5641b65d3 Handle leading spaces properly in ExLllama 2023-06-17 19:35:12 -03:00
oobabooga cbd63eeeff Fix repeated tokens with exllama 2023-06-17 19:02:08 -03:00
oobabooga 766c760cd7 Use gen_begin_reuse in exllama 2023-06-17 18:00:10 -03:00
oobabooga b27f83c0e9 Make exllama stoppable 2023-06-16 22:03:23 -03:00
oobabooga 5f392122fd Add gpu_split param to ExLlama 2023-06-16 20:49:36 -03:00
    Adapted from code created by Ph0rk0z. Thank you Ph0rk0z.
oobabooga 9f40032d32 Add ExLlama support (#2444) 2023-06-16 20:35:38 -03:00