Commit Graph

224 Commits

Author SHA1 Message Date
oobabooga
6e2e0317af
Separate context and system message in instruction formats (#4499) 2023-11-07 20:02:58 -03:00
oobabooga
af3d25a503 Disable logits_all in llamacpp_HF (makes processing 3x faster) 2023-11-07 14:35:48 -08:00
oobabooga
ec17a5d2b7
Make OpenAI API the default API (#4430) 2023-11-06 02:38:29 -03:00
feng lui
4766a57352
transformers: add use_flash_attention_2 option (#4373) 2023-11-04 13:59:33 -03:00
Julien Chaumond
fdcaa955e3
transformers: Add a flag to force load from safetensors (#4450) 2023-11-02 16:20:54 -03:00
oobabooga
c0655475ae Add cache_8bit option 2023-11-02 11:23:04 -07:00
oobabooga
77abd9b69b Add no_flash_attn option 2023-11-02 11:08:53 -07:00
oobabooga
1edf321362 Lint 2023-10-23 13:09:03 -07:00
oobabooga
df90d03e0b Replace --mul_mat_q with --no_mul_mat_q 2023-10-22 12:23:03 -07:00
oobabooga
2d1b3332e4 Ignore warnings on Colab 2023-10-21 21:45:25 -07:00
oobabooga
506d05aede Organize command-line arguments 2023-10-21 18:52:59 -07:00
cal066
cc632c3f33
AutoAWQ: initial support (#3999) 2023-10-05 13:19:18 -03:00
oobabooga
b6fe6acf88 Add threads_batch parameter 2023-10-01 21:28:00 -07:00
jllllll
41a2de96e5
Bump llama-cpp-python to 0.2.11 2023-10-01 18:08:10 -05:00
oobabooga
f931184b53 Increase truncation limits to 32768 2023-09-28 19:28:22 -07:00
StoyanStAtanasov
7e6ff8d1f0
Enable NUMA feature for llama_cpp_python (#4040) 2023-09-26 22:05:00 -03:00
oobabooga
1ca54faaf0 Improve --multi-user mode 2023-09-26 06:42:33 -07:00
oobabooga
7f1460af29 Change a warning 2023-09-25 20:22:27 -07:00
oobabooga
d0d221df49 Add --use_fast option (closes #3741) 2023-09-25 12:19:43 -07:00
oobabooga
00ab450c13
Multiple histories for each character (#4022) 2023-09-21 17:19:32 -03:00
oobabooga
5075087461 Fix command-line arguments being ignored 2023-09-19 13:11:46 -07:00
missionfloyd
2ad6ca8874
Add back chat buttons with --chat-buttons (#3947) 2023-09-16 00:39:37 -03:00
saltacc
f01b9aa71f
Add customizable ban tokens (#3899) 2023-09-15 18:27:27 -03:00
oobabooga
3d1c0f173d User config precedence over GGUF metadata 2023-09-14 12:15:52 -07:00
oobabooga
2f935547c8 Minor changes 2023-09-12 15:05:21 -07:00
oobabooga
c2a309f56e
Add ExLlamaV2 and ExLlamav2_HF loaders (#3881) 2023-09-12 14:33:07 -03:00
oobabooga
dae428a967 Revamp cai-chat theme, make it default 2023-09-11 19:30:40 -07:00
oobabooga
ed86878f02 Remove GGML support 2023-09-11 07:44:00 -07:00
oobabooga
cec8db52e5
Add max_tokens_second param (#3533) 2023-08-29 17:44:31 -03:00
oobabooga
36864cb3e8 Use Alpaca as the default instruction template 2023-08-29 13:06:25 -07:00
Cebtenzzre
2f5d769a8d
accept floating-point alpha value on the command line (#3712) 2023-08-27 18:54:43 -03:00
oobabooga
f4f04c8c32 Fix a typo 2023-08-25 07:08:38 -07:00
oobabooga
52ab2a6b9e Add rope_freq_base parameter for CodeLlama 2023-08-25 06:55:15 -07:00
oobabooga
d6934bc7bc
Implement CFG for ExLlama_HF (#3666) 2023-08-24 16:27:36 -03:00
oobabooga
7cba000421
Bump llama-cpp-python, +tensor_split by @shouyiwang, +mul_mat_q (#3610) 2023-08-18 12:03:34 -03:00
oobabooga
73d9befb65 Make "Show controls" customizable through settings.yaml 2023-08-16 07:04:18 -07:00
oobabooga
ccfc02a28d
Add the --disable_exllama option for AutoGPTQ (#3545 from clefever/disable-exllama) 2023-08-14 15:15:55 -03:00
oobabooga
d8a82d34ed Improve a warning 2023-08-14 08:46:05 -07:00
oobabooga
619cb4e78b
Add "save defaults to settings.yaml" button (#3574) 2023-08-14 11:46:07 -03:00
oobabooga
a1a9ec895d
Unify the 3 interface modes (#3554) 2023-08-13 01:12:15 -03:00
Chris Lefever
0230fa4e9c Add the --disable_exllama option for AutoGPTQ 2023-08-12 02:26:58 -04:00
cal066
7a4fcee069
Add ctransformers support (#3313)
---------

Co-authored-by: cal066 <cal066@users.noreply.github.com>
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
Co-authored-by: randoentity <137087500+randoentity@users.noreply.github.com>
2023-08-11 14:41:33 -03:00
jllllll
bee73cedbd
Streamline GPTQ-for-LLaMa support 2023-08-09 23:42:34 -05:00
oobabooga
d8fb506aff Add RoPE scaling support for transformers (including dynamic NTK)
https://github.com/huggingface/transformers/pull/24653
2023-08-08 21:25:48 -07:00
Friedemann Lipphardt
901b028d55
Add option for named cloudflare tunnels (#3364) 2023-08-08 22:20:27 -03:00
oobabooga
a373c96d59 Fix a bug in modules/shared.py 2023-08-06 20:36:35 -07:00
oobabooga
3d48933f27 Remove ancient deprecation warnings 2023-08-06 18:58:59 -07:00
oobabooga
0af10ab49b
Add Classifier Free Guidance (CFG) for Transformers/ExLlama (#3325) 2023-08-06 17:22:48 -03:00
oobabooga
8df3cdfd51
Add SSL certificate support (#3453) 2023-08-04 13:57:31 -03:00
oobabooga
87dab03dc0
Add the --cpu option for llama.cpp to prevent CUDA from being used (#3432) 2023-08-03 11:00:36 -03:00