Commit Graph

115 Commits

Author SHA1 Message Date
Panchovix
5646690769
Fix some models not loading on exllama_hf (#2835) 2023-06-23 11:31:02 -03:00
LarryVRH
580c1ee748
Implement a demo HF wrapper for exllama to utilize existing HF transformers decoding. (#2777) 2023-06-21 15:31:42 -03:00
ThisIsPIRI
def3b69002
Fix loading condition for universal llama tokenizer (#2753) 2023-06-18 18:14:06 -03:00
oobabooga
9f40032d32
Add ExLlama support (#2444) 2023-06-16 20:35:38 -03:00
oobabooga
7ef6a50e84
Reorganize model loading UI completely (#2720) 2023-06-16 19:00:37 -03:00
oobabooga
00b94847da Remove softprompt support 2023-06-06 07:42:23 -03:00
oobabooga
f276d88546 Use AutoGPTQ by default for GPTQ models 2023-06-05 15:41:48 -03:00
oobabooga
3578dd3611
Change a warning message 2023-05-29 22:40:54 -03:00
Luis Lopez
9e7204bef4
Add tail-free and top-a sampling (#2357) 2023-05-29 21:40:01 -03:00
Forkoz
60ae80cf28
Fix hang in tokenizer for AutoGPTQ llama models. (#2399) 2023-05-28 23:10:10 -03:00
oobabooga
361451ba60
Add --load-in-4bit parameter (#2320) 2023-05-25 01:14:13 -03:00
oobabooga
cd3618d7fb Add support for RWKV in Hugging Face format 2023-05-23 02:07:28 -03:00
oobabooga
e116d31180 Prevent unwanted log messages from modules 2023-05-21 22:42:34 -03:00
oobabooga
05593a7834 Minor bug fix 2023-05-20 23:22:36 -03:00
oobabooga
9d5025f531 Improve error handling while loading GPTQ models 2023-05-19 11:20:08 -03:00
oobabooga
ef10ffc6b4 Add various checks to model loading functions 2023-05-17 16:14:54 -03:00
oobabooga
abd361b3a0 Minor change 2023-05-17 11:33:43 -03:00
oobabooga
21ecc3701e Avoid a name conflict 2023-05-17 11:23:13 -03:00
oobabooga
1a8151a2b6
Add AutoGPTQ support (basic) (#2132) 2023-05-17 11:12:12 -03:00
oobabooga
7584d46c29
Refactor models.py (#2113) 2023-05-16 19:52:22 -03:00
oobabooga
4e66f68115 Create get_max_memory_dict() function 2023-05-15 19:38:27 -03:00
oobabooga
2eeb27659d Fix bug in --cpu-memory 2023-05-12 06:17:07 -03:00
oobabooga
3316e33d14 Remove unused code 2023-05-10 11:59:59 -03:00
oobabooga
3913155c1f
Style improvements (#1957) 2023-05-09 22:49:39 -03:00
Wesley Pyburn
a2b25322f0
Fix trust_remote_code in wrong location (#1953) 2023-05-09 19:22:10 -03:00
EgrorBs
d3ea70f453
More trust_remote_code=trust_remote_code (#1899) 2023-05-07 23:48:20 -03:00
oobabooga
97a6a50d98 Use oasst tokenizer instead of universal tokenizer 2023-05-04 15:55:39 -03:00
Mylo
bd531c2dc2
Make --trust-remote-code work for all models (#1772) 2023-05-04 02:01:28 -03:00
oobabooga
9c77ab4fc2 Improve some warnings 2023-05-03 22:06:46 -03:00
oobabooga
95d04d6a8d Better warning messages 2023-05-03 21:43:17 -03:00
Ahmed Said
fbcd32988e
added no_mmap & mlock parameters to llama.cpp and removed llamacpp_model_alternative (#1649)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-05-02 18:25:28 -03:00
oobabooga
9c2e7c0fab Fix path on models.py 2023-04-26 03:29:09 -03:00
oobabooga
a8409426d7
Fix bug in models.py 2023-04-26 01:55:40 -03:00
oobabooga
f642135517 Make universal tokenizer, xformers, sdp-attention apply to monkey patch 2023-04-25 23:18:11 -03:00
Vincent Brouwers
92cdb4f22b
Seq2Seq support (including FLAN-T5) (#1535)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-25 22:39:04 -03:00
Wojtab
12212cf6be
LLaVA support (#1487) 2023-04-23 20:32:22 -03:00
oobabooga
c0b5c09860 Minor change 2023-04-22 15:15:31 -03:00
oobabooga
fcb594b90e Don't require llama.cpp models to be placed in subfolders 2023-04-22 14:56:48 -03:00
oobabooga
c4f4f41389
Add an "Evaluate" tab to calculate the perplexities of models (#1322) 2023-04-21 00:20:33 -03:00
oobabooga
7bb9036ac9 Add universal LLaMA tokenizer support 2023-04-19 21:23:51 -03:00
catalpaaa
07de7d0426
Load llamacpp before quantized model (#1307) 2023-04-17 10:47:26 -03:00
oobabooga
39099663a0
Add 4-bit LoRA support (#1200) 2023-04-16 23:26:52 -03:00
Forkoz
c6fe1ced01
Add ChatGLM support (#1256)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-16 19:15:03 -03:00
oobabooga
ac189011cb Add "Save current settings for this model" button 2023-04-15 12:54:02 -03:00
oobabooga
cacbcda208
Two new options: truncation length and ban eos token 2023-04-11 18:46:06 -03:00
oobabooga
1911504f82 Minor bug fix 2023-04-09 23:45:41 -03:00
oobabooga
dba2000d2b Do things that I am not proud of 2023-04-09 23:40:49 -03:00
MarkovInequality
992663fa20
Added xformers support to Llama (#950) 2023-04-09 23:08:40 -03:00
oobabooga
a3085dba07 Fix LlamaTokenizer eos_token (attempt) 2023-04-09 21:19:39 -03:00
oobabooga
0b458bf82d Simplify a function 2023-04-07 21:37:41 -03:00