text-generation-webui

mirror of https://github.com/oobabooga/text-generation-webui.git synced 2024-10-01 01:26:03 -04:00

Author	SHA1	Message	Date
oobabooga	bcf0075278	Merge pull request #235 from xanthousm/Quality_of_life-main: --auto-launch and "Is typing..."	2023-03-12 03:12:56 -03:00
oobabooga	341e135036	Various fixes in chat mode	2023-03-12 02:53:08 -03:00
oobabooga	b0e8cb8c88	Various fixes in chat mode	2023-03-12 02:31:45 -03:00
oobabooga	0bd5430988	Use 'with' statement to better handle streaming memory	2023-03-12 02:04:28 -03:00
oobabooga	37f0166b2d	Fix memory leak in new streaming (second attempt)	2023-03-11 23:14:49 -03:00
oobabooga	92fe947721	Merge branch 'main' into new-streaming	2023-03-11 19:59:45 -03:00
oobabooga	2743dd736a	Add Is typing... to impersonate as well	2023-03-11 10:50:18 -03:00
Xan	96c51973f9	--auto-launch and "Is typing...". Added `--auto-launch` arg to open web UI in the default browser when ready. Changed chat.py to display user input immediately and "Is typing..." as a temporary reply while generating text. Most noticeable when using `--no-stream`.	2023-03-11 22:50:59 +11:00
oobabooga	026d60bd34	Remove default preset that didn't do anything	2023-03-10 14:01:02 -03:00
oobabooga	e9dbdafb14	Merge branch 'main' into pt-path-changes	2023-03-10 11:03:42 -03:00
oobabooga	706a03b2cb	Minor changes	2023-03-10 11:02:25 -03:00
oobabooga	de7dd8b6aa	Add comments	2023-03-10 10:54:08 -03:00
oobabooga	e461c0b7a0	Move the import to the top	2023-03-10 10:51:12 -03:00
deepdiffuser	9fbd60bf22	add no_split_module_classes to prevent tensor split error	2023-03-10 05:30:47 -08:00
deepdiffuser	ab47044459	add multi-gpu support for 4bit gptq LLaMA	2023-03-10 04:52:45 -08:00
rohvani	2ac2913747	fix reference issue	2023-03-09 20:13:23 -08:00
rohvani	826e297b0e	add llama-65b-4bit support & multiple pt paths	2023-03-09 18:31:32 -08:00
oobabooga	9849aac0f1	Don't show .pt models in the list	2023-03-09 21:54:50 -03:00
oobabooga	74102d5ee4	Insert to the path instead of appending	2023-03-09 20:51:22 -03:00
oobabooga	2965aa1625	Check if the .pt file exists	2023-03-09 20:48:51 -03:00
oobabooga	828a524f9a	Add LLaMA 4-bit support	2023-03-09 15:50:26 -03:00
oobabooga	59b5f7a4b7	Improve usage of stopping_criteria	2023-03-08 12:13:40 -03:00
oobabooga	add9330e5e	Bug fixes	2023-03-08 11:26:29 -03:00
oobabooga	33fb6aed74	Minor bug fix	2023-03-08 03:08:16 -03:00
oobabooga	ad2970374a	Readability improvements	2023-03-08 03:00:06 -03:00
oobabooga	72d539dbff	Better separate the FlexGen case	2023-03-08 02:54:47 -03:00
oobabooga	0e16c0bacb	Remove redeclaration of a function	2023-03-08 02:50:49 -03:00
oobabooga	ab50f80542	New text streaming method (much faster)	2023-03-08 02:46:35 -03:00
oobabooga	8e89bc596b	Fix encode() for RWKV	2023-03-07 23:15:46 -03:00
oobabooga	19a34941ed	Add proper streaming to RWKV	2023-03-07 18:17:56 -03:00
oobabooga	8660227e1b	Add top_k to RWKV	2023-03-07 17:24:28 -03:00
oobabooga	153dfeb4dd	Add --rwkv-cuda-on parameter, bump rwkv version	2023-03-06 20:12:54 -03:00
oobabooga	6904a507c6	Change some parameters	2023-03-06 16:29:43 -03:00
oobabooga	20bd645f6a	Fix bug in multigpu setups (attempt 3)	2023-03-06 15:58:18 -03:00
oobabooga	09a7c36e1b	Minor improvement while running custom models	2023-03-06 15:36:35 -03:00
oobabooga	24c4c20391	Fix bug in multigpu setups (attempt #2 )	2023-03-06 15:23:29 -03:00
oobabooga	d88b7836c6	Fix bug in multigpu setups	2023-03-06 14:58:30 -03:00
oobabooga	5bed607b77	Increase repetition frequency/penalty for RWKV	2023-03-06 14:25:48 -03:00
oobabooga	bf56b6c1fb	Load settings.json without the need for --settings settings.json. This is for setting UI defaults.	2023-03-06 10:57:45 -03:00
oobabooga	e91f4bc25a	Add RWKV tokenizer	2023-03-06 08:45:49 -03:00
oobabooga	c855b828fe	Better handle <USER>	2023-03-05 17:01:47 -03:00
oobabooga	2af66a4d4c	Fix <USER> in pygmalion replies	2023-03-05 16:08:50 -03:00
oobabooga	a54b91af77	Improve readability	2023-03-05 10:21:15 -03:00
oobabooga	8e706df20e	Fix a memory leak when text streaming is on	2023-03-05 10:12:43 -03:00
oobabooga	c33715ad5b	Move towards HF LLaMA implementation	2023-03-05 01:20:31 -03:00
oobabooga	bd8aac8fa4	Add LLaMA 8-bit support	2023-03-04 13:28:42 -03:00
oobabooga	c93f1fa99b	Count the tokens more conservatively	2023-03-04 03:10:21 -03:00
oobabooga	ed8b35efd2	Add --pin-weight parameter for FlexGen	2023-03-04 01:04:02 -03:00
oobabooga	05e703b4a4	Print the performance information more reliably	2023-03-03 21:24:32 -03:00
oobabooga	5a79863df3	Increase the sequence length, decrease batch size. I have no idea what I am doing.	2023-03-03 15:54:13 -03:00

166 Commits