Commit Graph

31 Commits

Author SHA1 Message Date
Forkoz
8cce1f1126
Exllamav2 lora support (#4229)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-10-14 16:12:41 -03:00
oobabooga
13ac55fa18 Reorder some functions 2023-09-19 13:51:57 -07:00
James Braza
fee38e0601
Simplified ExLlama cloning instructions and failure message (#3972) 2023-09-17 19:26:05 -03:00
oobabooga
ad8ac545a5 Tokenization improvements 2023-09-17 07:02:00 -07:00
saltacc
cd08eb0753
token probs for non HF loaders (#3957) 2023-09-17 10:42:32 -03:00
saltacc
ed6b6411fb
Fix exllama tokenizers (#3954)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-09-16 09:42:38 -03:00
saltacc
f01b9aa71f
Add customizable ban tokens (#3899) 2023-09-15 18:27:27 -03:00
oobabooga
df52dab67b Lint 2023-09-11 07:57:38 -07:00
Forkoz
15e9b8c915
Exllama new rope settings (#3852) 2023-09-11 01:14:36 -03:00
oobabooga
52ab2a6b9e Add rope_freq_base parameter for CodeLlama 2023-08-25 06:55:15 -07:00
oobabooga
ef17da70af Fix ExLlama truncation 2023-08-20 08:53:26 -07:00
oobabooga
d4b851bdc8 Credit turboderp 2023-08-06 13:43:15 -07:00
oobabooga
0af10ab49b
Add Classifier Free Guidance (CFG) for Transformers/ExLlama (#3325) 2023-08-06 17:22:48 -03:00
oobabooga
32a2bbee4a Implement auto_max_new_tokens for ExLlama 2023-08-02 11:03:56 -07:00
oobabooga
b6643e5039 Add decode functions to llama.cpp/exllama 2023-07-07 09:11:30 -07:00
oobabooga
1ba2e88551 Add truncation to exllama 2023-07-07 09:09:23 -07:00
Panchovix
10c8c197bf
Add Support for Static NTK RoPE scaling for exllama/exllama_hf (#2955) 2023-07-04 01:13:16 -03:00
ardfork
3c076c3c80
Disable half2 for ExLlama when using HIP (#2912) 2023-06-29 15:03:16 -03:00
oobabooga
79db629665 Minor bug fix 2023-06-29 13:53:06 -03:00
oobabooga
3443219cbc
Add repetition penalty range parameter to transformers (#2916) 2023-06-29 13:40:13 -03:00
oobabooga
c52290de50
ExLlama with long context (#2875) 2023-06-25 22:49:26 -03:00
jllllll
bef67af23c
Use pre-compiled python module for ExLlama (#2770) 2023-06-24 20:24:17 -03:00
oobabooga
eb30f4441f
Add ExLlama+LoRA support (#2756) 2023-06-19 12:31:24 -03:00
oobabooga
5f418f6171 Fix a memory leak (credits for the fix: Ph0rk0z) 2023-06-19 01:19:28 -03:00
Forkoz
3cae1221d4
Update exllama.py - Respect model dir parameter (#2744) 2023-06-18 13:26:30 -03:00
oobabooga
c5641b65d3 Handle leading spaces properly in ExLllama 2023-06-17 19:35:12 -03:00
oobabooga
cbd63eeeff Fix repeated tokens with exllama 2023-06-17 19:02:08 -03:00
oobabooga
766c760cd7 Use gen_begin_reuse in exllama 2023-06-17 18:00:10 -03:00
oobabooga
b27f83c0e9 Make exllama stoppable 2023-06-16 22:03:23 -03:00
oobabooga
5f392122fd Add gpu_split param to ExLlama
Adapted from code created by Ph0rk0z. Thank you Ph0rk0z.
2023-06-16 20:49:36 -03:00
oobabooga
9f40032d32
Add ExLlama support (#2444) 2023-06-16 20:35:38 -03:00