Commit Graph

663 Commits

Author SHA1 Message Date
oobabooga
b27f83c0e9 Make exllama stoppable 2023-06-16 22:03:23 -03:00
oobabooga
7f06d551a3 Fix streaming callback 2023-06-16 21:44:56 -03:00
oobabooga
5f392122fd Add gpu_split param to ExLlama
Adapted from code created by Ph0rk0z. Thank you Ph0rk0z.
2023-06-16 20:49:36 -03:00
oobabooga
9f40032d32
Add ExLlama support (#2444) 2023-06-16 20:35:38 -03:00
oobabooga
dea43685b0 Add some clarifications 2023-06-16 19:10:53 -03:00
oobabooga
7ef6a50e84
Reorganize model loading UI completely (#2720) 2023-06-16 19:00:37 -03:00
Tom Jobbins
646b0c889f
AutoGPTQ: Add UI and command line support for disabling fused attention and fused MLP (#2648) 2023-06-15 23:59:54 -03:00
oobabooga
2b9a6b9259 Merge remote-tracking branch 'refs/remotes/origin/main' 2023-06-14 18:45:24 -03:00
oobabooga
4d508cbe58 Add some checks to AutoGPTQ loader 2023-06-14 18:44:43 -03:00
FartyPants
56c19e623c
Add LORA name instead of "default" in PeftModel (#2689) 2023-06-14 18:29:42 -03:00
oobabooga
474dc7355a Allow API requests to use parameter presets 2023-06-14 11:32:20 -03:00
oobabooga
e471919e6d Make llava/minigpt-4 work with AutoGPTQ 2023-06-11 17:56:01 -03:00
oobabooga
f4defde752 Add a menu for installing extensions 2023-06-11 17:11:06 -03:00
oobabooga
ac122832f7 Make dropdown menus more similar to automatic1111 2023-06-11 14:20:16 -03:00
oobabooga
6133675e0f
Add menus for saving presets/characters/instruction templates/prompts (#2621) 2023-06-11 12:19:18 -03:00
brandonj60
b04e18d10c
Add Mirostat v2 sampling to transformer models (#2571) 2023-06-09 21:26:31 -03:00
oobabooga
6015616338 Style changes 2023-06-06 13:06:05 -03:00
oobabooga
f040073ef1 Handle the case of older autogptq install 2023-06-06 13:05:05 -03:00
oobabooga
bc58dc40bd Fix a minor bug 2023-06-06 12:57:13 -03:00
oobabooga
00b94847da Remove softprompt support 2023-06-06 07:42:23 -03:00
oobabooga
0aebc838a0 Don't save the history for 'None' character 2023-06-06 07:21:07 -03:00
oobabooga
9f215523e2 Remove some unused imports 2023-06-06 07:05:46 -03:00
oobabooga
0f0108ce34 Never load the history for default character 2023-06-06 07:00:11 -03:00
oobabooga
11f38b5c2b Add AutoGPTQ LoRA support 2023-06-05 23:32:57 -03:00
oobabooga
3a5cfe96f0 Increase chat_prompt_size_max 2023-06-05 17:37:37 -03:00
oobabooga
f276d88546 Use AutoGPTQ by default for GPTQ models 2023-06-05 15:41:48 -03:00
oobabooga
9b0e95abeb Fix "regenerate" when "Start reply with" is set 2023-06-05 11:56:03 -03:00
oobabooga
19f78684e6 Add "Start reply with" feature to chat mode 2023-06-02 13:58:08 -03:00
GralchemOz
f7b07c4705
Fix the missing Chinese character bug (#2497) 2023-06-02 13:45:41 -03:00
oobabooga
2f6631195a Add desc_act checkbox to the UI 2023-06-02 01:45:46 -03:00
LaaZa
9c066601f5
Extend AutoGPTQ support for any GPTQ model (#1668) 2023-06-02 01:33:55 -03:00
oobabooga
a83f9aa65b
Update shared.py 2023-06-01 12:08:39 -03:00
oobabooga
b6c407f51d Don't stream at more than 24 fps
This is a performance optimization
2023-05-31 23:41:42 -03:00
Forkoz
9ab90d8b60
Fix warning for qlora (#2438) 2023-05-30 11:09:18 -03:00
oobabooga
3578dd3611
Change a warning message 2023-05-29 22:40:54 -03:00
oobabooga
3a6e194bc7
Change a warning message 2023-05-29 22:39:23 -03:00
Luis Lopez
9e7204bef4
Add tail-free and top-a sampling (#2357) 2023-05-29 21:40:01 -03:00
oobabooga
1394f44e14 Add triton checkbox for AutoGPTQ 2023-05-29 15:32:45 -03:00
oobabooga
f34d20922c Minor fix 2023-05-29 13:31:17 -03:00
oobabooga
983eef1e29 Attempt at evaluating falcon perplexity (failed) 2023-05-29 13:28:25 -03:00
Honkware
204731952a
Falcon support (trust-remote-code and autogptq checkboxes) (#2367)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-05-29 10:20:18 -03:00
Forkoz
60ae80cf28
Fix hang in tokenizer for AutoGPTQ llama models. (#2399) 2023-05-28 23:10:10 -03:00
oobabooga
2f811b1bdf Change a warning message 2023-05-28 22:48:20 -03:00
oobabooga
9ee1e37121 Fix return message when no model is loaded 2023-05-28 22:46:32 -03:00
oobabooga
00ebea0b2a Use YAML for presets and settings 2023-05-28 22:34:12 -03:00
oobabooga
acfd876f29 Some qol changes to "Perplexity evaluation" 2023-05-25 15:06:22 -03:00
oobabooga
8efdc01ffb Better default for compute_dtype 2023-05-25 15:05:53 -03:00
oobabooga
37d4ad012b Add a button for rendering markdown for any model 2023-05-25 11:59:27 -03:00
DGdev91
cf088566f8
Make llama.cpp read prompt size and seed from settings (#2299) 2023-05-25 10:29:31 -03:00
oobabooga
361451ba60
Add --load-in-4bit parameter (#2320) 2023-05-25 01:14:13 -03:00