Commit Graph

1309 Commits

Author SHA1 Message Date
oobabooga
4e188eeb80 Lint 2024-02-03 20:40:10 -08:00
oobabooga
cde000d478
Remove non-HF ExLlamaV2 loader (#5431) 2024-02-04 01:15:51 -03:00
kalomaze
b6077b02e4
Quadratic sampling (#5403)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-02-04 00:20:02 -03:00
Badis Ghoubali
40c7977f9b
Add roleplay.gbnf grammar (#5368) 2024-01-28 21:41:28 -03:00
sam-ngu
c0bdcee646
added trust_remote_code to deepspeed init loaderClass (#5237) 2024-01-26 11:10:57 -03:00
oobabooga
87dc421ee8
Bump exllamav2 to 0.0.12 (#5352) 2024-01-22 22:40:12 -03:00
oobabooga
aad73667af Lint 2024-01-22 03:25:55 -08:00
lmg-anon
db1da9f98d
Fix logprobs tokens in OpenAI API (#5339) 2024-01-22 08:07:42 -03:00
Forkoz
5c5ef4cef7
UI: change n_gpu_layers maximum to 256 for larger models. (#5262) 2024-01-17 17:13:16 -03:00
ilya sheprut
4d14eb8b82
LoRA: Fix error "Attempting to unscale FP16 gradients" when training (#5268) 2024-01-17 17:11:49 -03:00
oobabooga
e055967974
Add prompt_lookup_num_tokens parameter (#5296) 2024-01-17 17:09:36 -03:00
oobabooga
b3fc2cd887 UI: Do not save unchanged extension settings to settings.yaml 2024-01-10 03:48:30 -08:00
oobabooga
53dc1d8197 UI: Do not save unchanged settings to settings.yaml 2024-01-09 18:59:04 -08:00
oobabooga
89e7e107fc Lint 2024-01-09 16:27:50 -08:00
mamei16
bec4e0a1ce
Fix update event in refresh buttons (#5197) 2024-01-09 14:49:37 -03:00
oobabooga
4333d82b9d Minor bug fix 2024-01-09 06:55:18 -08:00
oobabooga
953343cced Improve the file saving/deletion menus 2024-01-09 06:33:47 -08:00
oobabooga
123f27a3c5 Load the nearest character after deleting a character
Instead of the first.
2024-01-09 06:24:27 -08:00
oobabooga
b908ed318d Revert "Rename past chats -> chat history"
This reverts commit aac93a1fd6.
2024-01-09 05:26:07 -08:00
oobabooga
4ca82a4df9 Save light/dark theme on "Save UI defaults to settings.yaml" 2024-01-09 04:20:10 -08:00
oobabooga
7af50ede94 Reorder some buttons 2024-01-09 04:11:50 -08:00
oobabooga
a9f49a7574 Confirm the chat history rename with enter 2024-01-09 04:00:53 -08:00
oobabooga
7bdd2118a2 Change some log messages when deleting files 2024-01-09 03:32:01 -08:00
oobabooga
aac93a1fd6 Rename past chats -> chat history 2024-01-09 03:14:30 -08:00
oobabooga
615fa11af8 Move new chat button, improve history deletion handling 2024-01-08 21:22:37 -08:00
oobabooga
4f7e1eeafd
Past chat histories in a side bar on desktop (#5098)
Lots of room for improvement, but that's a start.
2024-01-09 01:57:29 -03:00
oobabooga
372ef5e2d8 Fix dynatemp parameters always visible 2024-01-08 19:42:31 -08:00
oobabooga
29c2693ea0
dynatemp_low, dynatemp_high, dynatemp_exponent parameters (#5209) 2024-01-08 23:28:35 -03:00
oobabooga
c4e005efec Fix dropdown menus sometimes failing to refresh 2024-01-08 17:49:54 -08:00
oobabooga
9cd2106303 Revert "Add dynamic temperature to the random preset button"
This reverts commit 4365fb890f.
2024-01-08 16:46:24 -08:00
oobabooga
4365fb890f Add dynamic temperature to the random preset button 2024-01-07 13:08:15 -08:00
oobabooga
0d07b3a6a1
Add dynamic_temperature_low parameter (#5198) 2024-01-07 17:03:47 -03:00
oobabooga
b8a0b3f925 Don't print torch tensors with --verbose 2024-01-07 10:35:55 -08:00
oobabooga
cf820c69c5 Print generation parameters with --verbose (HF only) 2024-01-07 10:06:23 -08:00
oobabooga
c4c7fc4ab3 Lint 2024-01-07 09:36:56 -08:00
kalomaze
48327cc5c4
Dynamic Temperature HF loader support (#5174)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-01-07 10:36:26 -03:00
oobabooga
248742df1c Save extension fields to settings.yaml on "Save UI defaults" 2024-01-04 20:33:42 -08:00
oobabooga
c9d814592e Increase maximum temperature value to 5 2024-01-04 17:28:15 -08:00
oobabooga
e4d724eb3f Fix cache_folder bug introduced in 37eff915d6 2024-01-04 07:49:40 -08:00
Alberto Cano
37eff915d6
Use --disk-cache-dir for all caches 2024-01-04 00:27:26 -03:00
Lounger
7965f6045e
Fix loading latest history for file names with dots (#5162) 2024-01-03 22:39:41 -03:00
AstrisCantCode
b80e6365d0
Fix various bugs for LoRA training (#5161) 2024-01-03 20:42:20 -03:00
oobabooga
7cce88c403 Remove an unnecessary exception 2024-01-02 07:20:59 -08:00
oobabooga
94afa0f9cf Minor style changes 2024-01-01 16:00:22 -08:00
oobabooga
cbf6f9e695 Update some UI messages 2023-12-30 21:31:17 -08:00
oobabooga
2aad91f3c9
Remove deprecated command-line flags (#5131) 2023-12-31 02:07:48 -03:00
oobabooga
2734ce3e4c
Remove RWKV loader (#5130) 2023-12-31 02:01:40 -03:00
oobabooga
0e54a09bcb
Remove exllamav1 loaders (#5128) 2023-12-31 01:57:06 -03:00
oobabooga
8e397915c9
Remove --sdp-attention, --xformers flags (#5126) 2023-12-31 01:36:51 -03:00
B611
b7dd1f9542
Specify utf-8 encoding for model metadata file open (#5125) 2023-12-31 01:34:32 -03:00
oobabooga
c06f630bcc Increase max_updates_second maximum value 2023-12-24 13:29:47 -08:00
oobabooga
8c60495878 UI: add "Maximum UI updates/second" parameter 2023-12-24 09:17:40 -08:00
zhangningboo
1b8b61b928
Fix output_ids decoding for Qwen/Qwen-7B-Chat (#5045) 2023-12-22 23:11:02 -03:00
Yiximail
afc91edcb2
Reset the model_name after unloading the model (#5051) 2023-12-22 22:18:24 -03:00
oobabooga
2706149c65
Organize the CMD arguments by group (#5027) 2023-12-21 00:33:55 -03:00
oobabooga
c727a70572 Remove redundancy from modules/loaders.py 2023-12-20 19:18:07 -08:00
luna
6efbe3009f
let exllama v1 models load safetensor loras (#4854) 2023-12-20 13:29:19 -03:00
oobabooga
bcba200790 Fix EOS being ignored in ExLlamav2 after previous commit 2023-12-20 07:54:06 -08:00
oobabooga
f0f6d9bdf9 Add HQQ back & update version
This reverts commit 2289e9031e.
2023-12-20 07:46:09 -08:00
oobabooga
b15f510154 Optimize ExLlamav2 (non-HF) loader 2023-12-20 07:31:42 -08:00
oobabooga
fadb295d4d Lint 2023-12-19 21:36:57 -08:00
oobabooga
fb8ee9f7ff Add a specific error if HQQ is missing 2023-12-19 21:32:58 -08:00
oobabooga
9992f7d8c0 Improve several log messages 2023-12-19 20:54:32 -08:00
oobabooga
23818dc098 Better logger
Credits: vladmandic/automatic
2023-12-19 20:38:33 -08:00
oobabooga
95600073bc Add an informative error when extension requirements are missing 2023-12-19 20:20:45 -08:00
oobabooga
d8279dc710 Replace character name placeholders in chat context (closes #5007) 2023-12-19 17:31:46 -08:00
oobabooga
e83e6cedbe Organize the model menu 2023-12-19 13:18:26 -08:00
oobabooga
f4ae0075e8 Fix conversion from old template format to jinja2 2023-12-19 13:16:52 -08:00
oobabooga
de138b8ba6
Add llama-cpp-python wheels with tensor cores support (#5003) 2023-12-19 17:30:53 -03:00
oobabooga
0a299d5959
Bump llama-cpp-python to 0.2.24 (#5001) 2023-12-19 15:22:21 -03:00
oobabooga
83cf1a6b67 Fix Yi space issue (closes #4996) 2023-12-19 07:54:19 -08:00
oobabooga
9847809a7a Add a warning about ppl evaluation without --no_use_fast 2023-12-18 18:09:24 -08:00
oobabooga
f6d701624c UI: mention that QuIP# does not work on Windows 2023-12-18 18:05:02 -08:00
oobabooga
a23a004434 Update the example template 2023-12-18 17:47:35 -08:00
Water
674be9a09a
Add HQQ quant loader (#4888)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-12-18 21:23:16 -03:00
oobabooga
1f9e25e76a UI: update "Saved instruction templates" dropdown after loading template 2023-12-17 21:19:06 -08:00
oobabooga
da1c8d77ea Merge remote-tracking branch 'refs/remotes/origin/dev' into dev 2023-12-17 21:05:10 -08:00
oobabooga
cac89df97b Instruction templates: better handle unwanted bos tokens 2023-12-17 21:04:30 -08:00
oobabooga
f0d6ead877
llama.cpp: read instruction template from GGUF metadata (#4975) 2023-12-18 01:51:58 -03:00
oobabooga
f1f2c4c3f4
Add --num_experts_per_token parameter (ExLlamav2) (#4955) 2023-12-17 12:08:33 -03:00
oobabooga
12690d3ffc
Better HF grammar implementation (#4953) 2023-12-17 02:01:23 -03:00
oobabooga
f8079d067d UI: save the sent chat message on "no model is loaded" error 2023-12-16 10:52:41 -08:00
oobabooga
3bbf6c601d AutoGPTQ: Add --disable_exllamav2 flag (Mixtral CPU offloading needs this) 2023-12-15 06:46:13 -08:00
oobabooga
2cb5b68ad9
Bug fix: when generation fails, save the sent message (#4915) 2023-12-15 01:01:45 -03:00
Kim Jaewon
e53f99faa0
[OpenAI Extension] Add 'max_logits' parameter in logits endpoint (#4916) 2023-12-15 00:22:43 -03:00
Lounger
5754f0c357
Fix deleting chat logs (#4914) 2023-12-13 21:54:43 -03:00
Bartowski
f51156705d
Allow symlinked folder within root directory (#4863) 2023-12-13 18:08:21 -03:00
Ixion
3f3960dbfb
Fixed invalid Jinja2 syntax in instruction templates (#4911) 2023-12-13 15:46:23 -03:00
oobabooga
fcf5512364 Jinja templates: fix a potential small bug 2023-12-13 10:19:39 -08:00
oobabooga
7f1a6a70e3 Update the llamacpp_HF comment 2023-12-12 21:04:20 -08:00
oobabooga
1c531a3713 Minor cleanup 2023-12-12 13:25:21 -08:00
oobabooga
8513028968 Fix lag in the chat tab during streaming 2023-12-12 13:01:25 -08:00
oobabooga
39d2fe1ed9
Jinja templates for Instruct and Chat (#4874) 2023-12-12 17:23:14 -03:00
oobabooga
aab0dd962d Revert "Update callbacks.py to show tracebacks on ValueError (#4892)"
This reverts commit 993ca51a65.
2023-12-12 11:47:11 -08:00
Nehereus
993ca51a65
Update callbacks.py to show tracebacks on ValueError (#4892) 2023-12-12 02:29:27 -03:00
Morgan Schweers
602b8c6210
Make new browser reloads recognize current model. (#4865) 2023-12-11 02:51:01 -03:00
oobabooga
8c8825b777 Add QuIP# to README 2023-12-08 08:40:42 -08:00
oobabooga
2a335b8aa7 Cleanup: set shared.model_name only once 2023-12-08 06:35:23 -08:00
oobabooga
62d59a516f Add trust_remote_code to all HF loaders 2023-12-08 06:29:26 -08:00
oobabooga
181743fd97 Fix missing spaces tokenizer issue (closes #4834) 2023-12-08 05:16:46 -08:00
Yiximail
1c74b3ab45
Fix partial unicode characters issue (#4837) 2023-12-08 09:50:53 -03:00
oobabooga
2c5a1e67f9
Parameters: change max_new_tokens & repetition_penalty_range defaults (#4842) 2023-12-07 20:04:52 -03:00
oobabooga
98361af4d5
Add QuIP# support (#4803)
It has to be installed manually for now.
2023-12-06 00:01:01 -03:00
oobabooga
6430acadde Minor bug fix after https://github.com/oobabooga/text-generation-webui/pull/4814 2023-12-05 10:08:11 -08:00
oobabooga
0f828ea441 Do not limit API updates/second 2023-12-04 20:45:43 -08:00
oobabooga
9edb193def
Optimize HF text generation (#4814) 2023-12-05 00:00:40 -03:00
俞航
ac9f154bcc
Bump exllamav2 from 0.0.8 to 0.0.10 & Fix code change (#4782) 2023-12-04 21:15:05 -03:00
oobabooga
131a5212ce UI: update context upper limit to 200000 2023-12-04 15:48:34 -08:00
oobabooga
be88b072e9 Update --loader flag description 2023-12-04 15:41:25 -08:00
oobabooga
7fc9033b2e Recommend ExLlama_HF and ExLlamav2_HF 2023-12-04 15:28:46 -08:00
Lounger
7c0a17962d
Gallery improvements (#4789) 2023-12-03 22:45:50 -03:00
oobabooga
77d6ccf12b Add a LOADER debug message while loading models 2023-11-30 12:00:32 -08:00
oobabooga
092a2c3516 Fix a bug in llama.cpp get_logits() function 2023-11-30 11:21:40 -08:00
oobabooga
2698d7c9fd Fix llama.cpp model unloading 2023-11-29 15:19:48 -08:00
oobabooga
9940ed9c77 Sort the loaders 2023-11-29 15:13:03 -08:00
oobabooga
a7670c31ca Sort 2023-11-28 18:43:33 -08:00
oobabooga
6e51bae2e0 Sort the loaders menu 2023-11-28 18:41:11 -08:00
oobabooga
68059d7c23 llama.cpp: minor log change & lint 2023-11-27 10:44:55 -08:00
tsukanov-as
9f7ae6bb2e
fix detection of stopping strings when HTML escaping is used (#4728) 2023-11-27 15:42:08 -03:00
oobabooga
0589ff5b12
Bump llama-cpp-python to 0.2.19 & add min_p and typical_p parameters to llama.cpp loader (#4701) 2023-11-21 20:59:39 -03:00
oobabooga
2769a1fa25 Hide deprecated args from Session tab 2023-11-21 15:15:16 -08:00
oobabooga
a2e6d00128 Use convert_ids_to_tokens instead of decode in logits endpoint
This preserves the llama tokenizer spaces.
2023-11-19 09:22:08 -08:00
oobabooga
9da7bb203d Minor LoRA bug fix 2023-11-19 07:59:29 -08:00
oobabooga
a6f1e1bcc5 Fix PEFT LoRA unloading 2023-11-19 07:55:25 -08:00
oobabooga
ab94f0d9bf Minor style change 2023-11-18 21:11:04 -08:00
oobabooga
5fcee696ea
New feature: enlarge character pictures on click (#4654) 2023-11-19 02:05:17 -03:00
oobabooga
ef6feedeb2
Add --nowebui flag for pure API mode (#4651) 2023-11-18 23:38:39 -03:00
oobabooga
0fa1af296c
Add /v1/internal/logits endpoint (#4650) 2023-11-18 23:19:31 -03:00
oobabooga
8f4f4daf8b
Add --admin-key flag for API (#4649) 2023-11-18 22:33:27 -03:00
Jordan Tucker
baab894759
fix: use system message in chat-instruct mode (#4648) 2023-11-18 20:20:13 -03:00
oobabooga
47d9e2618b Refresh the Preset menu after saving a preset 2023-11-18 14:03:42 -08:00
oobabooga
83b64e7fc1
New feature: "random preset" button (#4647) 2023-11-18 18:31:41 -03:00
oobabooga
e0ca49ed9c
Bump llama-cpp-python to 0.2.18 (2nd attempt) (#4637)
* Update requirements*.txt

* Add back seed
2023-11-18 00:31:27 -03:00
oobabooga
9d6f79db74 Revert "Bump llama-cpp-python to 0.2.18 (#4611)"
This reverts commit 923c8e25fb.
2023-11-17 05:14:25 -08:00
oobabooga
13dc3b61da Update README 2023-11-16 19:57:55 -08:00
oobabooga
8b66d83aa9 Set use_fast=True by default, create --no_use_fast flag
This increases tokens/second for HF loaders.
2023-11-16 19:55:28 -08:00
oobabooga
6525707a7f Fix "send instruction template to..." buttons (closes #4625) 2023-11-16 18:16:42 -08:00
oobabooga
510a01ef46 Lint 2023-11-16 18:03:06 -08:00
oobabooga
923c8e25fb
Bump llama-cpp-python to 0.2.18 (#4611) 2023-11-16 22:55:14 -03:00
oobabooga
58c6001be9 Add missing exllamav2 samplers 2023-11-16 07:09:40 -08:00
oobabooga
cd41f8912b Warn users about n_ctx / max_seq_len 2023-11-15 18:56:42 -08:00
oobabooga
9be48e83a9 Start API when "api" checkbox is checked 2023-11-15 16:35:47 -08:00
oobabooga
a85ce5f055 Add more info messages for truncation / instruction template 2023-11-15 16:20:31 -08:00
oobabooga
883701bc40 Alternative solution to 025da386a0
Fixes an error.
2023-11-15 16:04:02 -08:00
oobabooga
8ac942813c Revert "Fix CPU memory limit error (issue #3763) (#4597)"
This reverts commit 025da386a0.
2023-11-15 16:01:54 -08:00
oobabooga
e6f44d6d19 Print context length / instruction template to terminal when loading models 2023-11-15 16:00:51 -08:00
oobabooga
e05d8fd441 Style changes 2023-11-15 15:51:37 -08:00
Andy Bao
025da386a0
Fix CPU memory limit error (issue #3763) (#4597)
get_max_memory_dict() was not properly formatting shared.args.cpu_memory

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-11-15 20:27:20 -03:00
oobabooga
4aabff3728 Remove old API, launch OpenAI API with --api 2023-11-10 06:39:08 -08:00
oobabooga
2af7e382b1 Revert "Bump llama-cpp-python to 0.2.14"
This reverts commit 5c3eb22ce6.

The new version has issues:

https://github.com/oobabooga/text-generation-webui/issues/4540
https://github.com/abetlen/llama-cpp-python/issues/893
2023-11-09 10:02:13 -08:00