Commit Graph

1105 Commits

Author SHA1 Message Date
oobabooga
0f828ea441 Do not limit API updates/second 2023-12-04 20:45:43 -08:00
oobabooga
9edb193def
Optimize HF text generation (#4814) 2023-12-05 00:00:40 -03:00
俞航
ac9f154bcc
Bump exllamav2 from 0.0.8 to 0.0.10 & Fix code change (#4782) 2023-12-04 21:15:05 -03:00
oobabooga
131a5212ce UI: update context upper limit to 200000 2023-12-04 15:48:34 -08:00
oobabooga
be88b072e9 Update --loader flag description 2023-12-04 15:41:25 -08:00
oobabooga
7fc9033b2e Recommend ExLlama_HF and ExLlamav2_HF 2023-12-04 15:28:46 -08:00
Lounger
7c0a17962d
Gallery improvements (#4789) 2023-12-03 22:45:50 -03:00
oobabooga
77d6ccf12b Add a LOADER debug message while loading models 2023-11-30 12:00:32 -08:00
oobabooga
092a2c3516 Fix a bug in llama.cpp get_logits() function 2023-11-30 11:21:40 -08:00
oobabooga
2698d7c9fd Fix llama.cpp model unloading 2023-11-29 15:19:48 -08:00
oobabooga
9940ed9c77 Sort the loaders 2023-11-29 15:13:03 -08:00
oobabooga
a7670c31ca Sort 2023-11-28 18:43:33 -08:00
oobabooga
6e51bae2e0 Sort the loaders menu 2023-11-28 18:41:11 -08:00
oobabooga
68059d7c23 llama.cpp: minor log change & lint 2023-11-27 10:44:55 -08:00
tsukanov-as
9f7ae6bb2e
fix detection of stopping strings when HTML escaping is used (#4728) 2023-11-27 15:42:08 -03:00
oobabooga
0589ff5b12
Bump llama-cpp-python to 0.2.19 & add min_p and typical_p parameters to llama.cpp loader (#4701) 2023-11-21 20:59:39 -03:00
oobabooga
2769a1fa25 Hide deprecated args from Session tab 2023-11-21 15:15:16 -08:00
oobabooga
a2e6d00128 Use convert_ids_to_tokens instead of decode in logits endpoint
This preserves the llama tokenizer spaces.
2023-11-19 09:22:08 -08:00
oobabooga
9da7bb203d Minor LoRA bug fix 2023-11-19 07:59:29 -08:00
oobabooga
a6f1e1bcc5 Fix PEFT LoRA unloading 2023-11-19 07:55:25 -08:00
oobabooga
ab94f0d9bf Minor style change 2023-11-18 21:11:04 -08:00
oobabooga
5fcee696ea
New feature: enlarge character pictures on click (#4654) 2023-11-19 02:05:17 -03:00
oobabooga
ef6feedeb2
Add --nowebui flag for pure API mode (#4651) 2023-11-18 23:38:39 -03:00
oobabooga
0fa1af296c
Add /v1/internal/logits endpoint (#4650) 2023-11-18 23:19:31 -03:00
oobabooga
8f4f4daf8b
Add --admin-key flag for API (#4649) 2023-11-18 22:33:27 -03:00
Jordan Tucker
baab894759
fix: use system message in chat-instruct mode (#4648) 2023-11-18 20:20:13 -03:00
oobabooga
47d9e2618b Refresh the Preset menu after saving a preset 2023-11-18 14:03:42 -08:00
oobabooga
83b64e7fc1
New feature: "random preset" button (#4647) 2023-11-18 18:31:41 -03:00
oobabooga
e0ca49ed9c
Bump llama-cpp-python to 0.2.18 (2nd attempt) (#4637)
* Update requirements*.txt

* Add back seed
2023-11-18 00:31:27 -03:00
oobabooga
9d6f79db74 Revert "Bump llama-cpp-python to 0.2.18 (#4611)"
This reverts commit 923c8e25fb.
2023-11-17 05:14:25 -08:00
oobabooga
13dc3b61da Update README 2023-11-16 19:57:55 -08:00
oobabooga
8b66d83aa9 Set use_fast=True by default, create --no_use_fast flag
This increases tokens/second for HF loaders.
2023-11-16 19:55:28 -08:00
oobabooga
6525707a7f Fix "send instruction template to..." buttons (closes #4625) 2023-11-16 18:16:42 -08:00
oobabooga
510a01ef46 Lint 2023-11-16 18:03:06 -08:00
oobabooga
923c8e25fb
Bump llama-cpp-python to 0.2.18 (#4611) 2023-11-16 22:55:14 -03:00
oobabooga
58c6001be9 Add missing exllamav2 samplers 2023-11-16 07:09:40 -08:00
oobabooga
cd41f8912b Warn users about n_ctx / max_seq_len 2023-11-15 18:56:42 -08:00
oobabooga
9be48e83a9 Start API when "api" checkbox is checked 2023-11-15 16:35:47 -08:00
oobabooga
a85ce5f055 Add more info messages for truncation / instruction template 2023-11-15 16:20:31 -08:00
oobabooga
883701bc40 Alternative solution to 025da386a0
Fixes an error.
2023-11-15 16:04:02 -08:00
oobabooga
8ac942813c Revert "Fix CPU memory limit error (issue #3763) (#4597)"
This reverts commit 025da386a0.
2023-11-15 16:01:54 -08:00
oobabooga
e6f44d6d19 Print context length / instruction template to terminal when loading models 2023-11-15 16:00:51 -08:00
oobabooga
e05d8fd441 Style changes 2023-11-15 15:51:37 -08:00
Andy Bao
025da386a0
Fix CPU memory limit error (issue #3763) (#4597)
get_max_memory_dict() was not properly formatting shared.args.cpu_memory

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-11-15 20:27:20 -03:00
oobabooga
4aabff3728 Remove old API, launch OpenAI API with --api 2023-11-10 06:39:08 -08:00
oobabooga
2af7e382b1 Revert "Bump llama-cpp-python to 0.2.14"
This reverts commit 5c3eb22ce6.

The new version has issues:

https://github.com/oobabooga/text-generation-webui/issues/4540
https://github.com/abetlen/llama-cpp-python/issues/893
2023-11-09 10:02:13 -08:00
oobabooga
21ed9a260e Document the new "Custom system message" field 2023-11-08 17:54:10 -08:00
oobabooga
2358706453 Add /v1/internal/model/load endpoint (tentative) 2023-11-07 20:58:06 -08:00
oobabooga
43c53a7820 Refactor the /v1/models endpoint 2023-11-07 19:59:27 -08:00
oobabooga
1b69694fe9 Add types to the encode/decode/token-count endpoints 2023-11-07 19:32:14 -08:00