Commit Graph

1143 Commits

Author SHA1 Message Date
oobabooga
e83e6cedbe Organize the model menu 2023-12-19 13:18:26 -08:00
oobabooga
f4ae0075e8 Fix conversion from old template format to jinja2 2023-12-19 13:16:52 -08:00
oobabooga
de138b8ba6 Add llama-cpp-python wheels with tensor cores support (#5003) 2023-12-19 17:30:53 -03:00
oobabooga
0a299d5959 Bump llama-cpp-python to 0.2.24 (#5001) 2023-12-19 15:22:21 -03:00
oobabooga
83cf1a6b67 Fix Yi space issue (closes #4996) 2023-12-19 07:54:19 -08:00
oobabooga
9847809a7a Add a warning about ppl evaluation without --no_use_fast 2023-12-18 18:09:24 -08:00
oobabooga
f6d701624c UI: mention that QuIP# does not work on Windows 2023-12-18 18:05:02 -08:00
oobabooga
a23a004434 Update the example template 2023-12-18 17:47:35 -08:00
Water
674be9a09a Add HQQ quant loader (#4888)
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-12-18 21:23:16 -03:00
oobabooga
1f9e25e76a UI: update "Saved instruction templates" dropdown after loading template 2023-12-17 21:19:06 -08:00
oobabooga
da1c8d77ea Merge remote-tracking branch 'refs/remotes/origin/dev' into dev 2023-12-17 21:05:10 -08:00
oobabooga
cac89df97b Instruction templates: better handle unwanted bos tokens 2023-12-17 21:04:30 -08:00
oobabooga
f0d6ead877 llama.cpp: read instruction template from GGUF metadata (#4975) 2023-12-18 01:51:58 -03:00
oobabooga
f1f2c4c3f4 Add --num_experts_per_token parameter (ExLlamav2) (#4955) 2023-12-17 12:08:33 -03:00
oobabooga
12690d3ffc Better HF grammar implementation (#4953) 2023-12-17 02:01:23 -03:00
oobabooga
f8079d067d UI: save the sent chat message on "no model is loaded" error 2023-12-16 10:52:41 -08:00
oobabooga
3bbf6c601d AutoGPTQ: Add --disable_exllamav2 flag (Mixtral CPU offloading needs this) 2023-12-15 06:46:13 -08:00
oobabooga
2cb5b68ad9 Bug fix: when generation fails, save the sent message (#4915) 2023-12-15 01:01:45 -03:00
Kim Jaewon
e53f99faa0 [OpenAI Extension] Add 'max_logits' parameter in logits endpoint (#4916) 2023-12-15 00:22:43 -03:00
Lounger
5754f0c357 Fix deleting chat logs (#4914) 2023-12-13 21:54:43 -03:00
Bartowski
f51156705d Allow symlinked folder within root directory (#4863) 2023-12-13 18:08:21 -03:00
Ixion
3f3960dbfb Fixed invalid Jinja2 syntax in instruction templates (#4911) 2023-12-13 15:46:23 -03:00
oobabooga
fcf5512364 Jinja templates: fix a potential small bug 2023-12-13 10:19:39 -08:00
oobabooga
7f1a6a70e3 Update the llamacpp_HF comment 2023-12-12 21:04:20 -08:00
oobabooga
1c531a3713 Minor cleanup 2023-12-12 13:25:21 -08:00
oobabooga
8513028968 Fix lag in the chat tab during streaming 2023-12-12 13:01:25 -08:00
oobabooga
39d2fe1ed9 Jinja templates for Instruct and Chat (#4874) 2023-12-12 17:23:14 -03:00
oobabooga
aab0dd962d Revert "Update callbacks.py to show tracebacks on ValueError (#4892)"
This reverts commit 993ca51a65.
2023-12-12 11:47:11 -08:00
Nehereus
993ca51a65 Update callbacks.py to show tracebacks on ValueError (#4892) 2023-12-12 02:29:27 -03:00
Morgan Schweers
602b8c6210 Make new browser reloads recognize current model. (#4865) 2023-12-11 02:51:01 -03:00
oobabooga
8c8825b777 Add QuIP# to README 2023-12-08 08:40:42 -08:00
oobabooga
2a335b8aa7 Cleanup: set shared.model_name only once 2023-12-08 06:35:23 -08:00
oobabooga
62d59a516f Add trust_remote_code to all HF loaders 2023-12-08 06:29:26 -08:00
oobabooga
181743fd97 Fix missing spaces tokenizer issue (closes #4834) 2023-12-08 05:16:46 -08:00
Yiximail
1c74b3ab45 Fix partial unicode characters issue (#4837) 2023-12-08 09:50:53 -03:00
oobabooga
2c5a1e67f9 Parameters: change max_new_tokens & repetition_penalty_range defaults (#4842) 2023-12-07 20:04:52 -03:00
oobabooga
98361af4d5 Add QuIP# support (#4803)
It has to be installed manually for now.
2023-12-06 00:01:01 -03:00
oobabooga
6430acadde Minor bug fix after https://github.com/oobabooga/text-generation-webui/pull/4814 2023-12-05 10:08:11 -08:00
oobabooga
0f828ea441 Do not limit API updates/second 2023-12-04 20:45:43 -08:00
oobabooga
9edb193def Optimize HF text generation (#4814) 2023-12-05 00:00:40 -03:00
俞航
ac9f154bcc Bump exllamav2 from 0.0.8 to 0.0.10 & Fix code change (#4782) 2023-12-04 21:15:05 -03:00
oobabooga
131a5212ce UI: update context upper limit to 200000 2023-12-04 15:48:34 -08:00
oobabooga
be88b072e9 Update --loader flag description 2023-12-04 15:41:25 -08:00
oobabooga
7fc9033b2e Recommend ExLlama_HF and ExLlamav2_HF 2023-12-04 15:28:46 -08:00
Lounger
7c0a17962d Gallery improvements (#4789) 2023-12-03 22:45:50 -03:00
oobabooga
77d6ccf12b Add a LOADER debug message while loading models 2023-11-30 12:00:32 -08:00
oobabooga
092a2c3516 Fix a bug in llama.cpp get_logits() function 2023-11-30 11:21:40 -08:00
oobabooga
2698d7c9fd Fix llama.cpp model unloading 2023-11-29 15:19:48 -08:00
oobabooga
9940ed9c77 Sort the loaders 2023-11-29 15:13:03 -08:00
oobabooga
a7670c31ca Sort 2023-11-28 18:43:33 -08:00