oobabooga
3361728da1
Change some comments
2023-08-26 22:24:44 -07:00
oobabooga
7f5370a272
Minor fixes/cosmetics
2023-08-26 22:11:07 -07:00
oobabooga
83640d6f43
Replace ggml occurrences with gguf
2023-08-26 01:06:59 -07:00
oobabooga
f4f04c8c32
Fix a typo
2023-08-25 07:08:38 -07:00
oobabooga
52ab2a6b9e
Add rope_freq_base parameter for CodeLlama
2023-08-25 06:55:15 -07:00
oobabooga
3320accfdc
Add CFG to llamacpp_HF (second attempt) ( #3678 )
2023-08-24 20:32:21 -03:00
oobabooga
d6934bc7bc
Implement CFG for ExLlama_HF ( #3666 )
2023-08-24 16:27:36 -03:00
oobabooga
1b419f656f
Acknowledge a16z support
2023-08-21 11:57:51 -07:00
oobabooga
54df0bfad1
Update README.md
2023-08-18 09:43:15 -07:00
oobabooga
f50f534b0f
Add note about AMD/Metal to README
2023-08-18 09:37:20 -07:00
oobabooga
7cba000421
Bump llama-cpp-python, +tensor_split by @shouyiwang, +mul_mat_q ( #3610 )
2023-08-18 12:03:34 -03:00
oobabooga
32ff3da941
Update ancient screenshots
2023-08-15 17:16:24 -03:00
oobabooga
87dd85b719
Update README
2023-08-15 12:21:50 -07:00
oobabooga
a03a70bed6
Update README
2023-08-15 12:20:59 -07:00
oobabooga
7089b2a48f
Update README
2023-08-15 12:16:21 -07:00
oobabooga
155862a4a0
Update README
2023-08-15 12:11:12 -07:00
cal066
991bb57e43
ctransformers: Fix up model_type name consistency ( #3567 )
2023-08-14 15:17:24 -03:00
oobabooga
ccfc02a28d
Add the --disable_exllama option for AutoGPTQ ( #3545 from clefever/disable-exllama)
2023-08-14 15:15:55 -03:00
oobabooga
619cb4e78b
Add "save defaults to settings.yaml" button ( #3574 )
2023-08-14 11:46:07 -03:00
Eve
66c04c304d
Various ctransformers fixes ( #3556 )
---------
Co-authored-by: cal066 <cal066@users.noreply.github.com>
2023-08-13 23:09:03 -03:00
oobabooga
a1a9ec895d
Unify the 3 interface modes ( #3554 )
2023-08-13 01:12:15 -03:00
Chris Lefever
0230fa4e9c
Add the --disable_exllama option for AutoGPTQ
2023-08-12 02:26:58 -04:00
oobabooga
4c450e6b70
Update README.md
2023-08-11 15:50:16 -03:00
cal066
7a4fcee069
Add ctransformers support ( #3313 )
---------
Co-authored-by: cal066 <cal066@users.noreply.github.com>
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
Co-authored-by: randoentity <137087500+randoentity@users.noreply.github.com>
2023-08-11 14:41:33 -03:00
oobabooga
949c92d7df
Create README.md
2023-08-10 14:32:40 -03:00
oobabooga
c7f52bbdc1
Revert "Remove GPTQ-for-LLaMa monkey patch support"
This reverts commit e3d3565b2a.
2023-08-10 08:39:41 -07:00
jllllll
e3d3565b2a
Remove GPTQ-for-LLaMa monkey patch support
AutoGPTQ will be the preferred GPTQ LoRA loader in the future.
2023-08-09 23:59:04 -05:00
jllllll
bee73cedbd
Streamline GPTQ-for-LLaMa support
2023-08-09 23:42:34 -05:00
oobabooga
2255349f19
Update README
2023-08-09 05:46:25 -07:00
oobabooga
d8fb506aff
Add RoPE scaling support for transformers (including dynamic NTK)
https://github.com/huggingface/transformers/pull/24653
2023-08-08 21:25:48 -07:00
Friedemann Lipphardt
901b028d55
Add option for named cloudflare tunnels ( #3364 )
2023-08-08 22:20:27 -03:00
oobabooga
8df3cdfd51
Add SSL certificate support ( #3453 )
2023-08-04 13:57:31 -03:00
oobabooga
4e6dc6d99d
Add Contributing guidelines
2023-08-03 14:40:28 -07:00
oobabooga
87dab03dc0
Add the --cpu option for llama.cpp to prevent CUDA from being used ( #3432 )
2023-08-03 11:00:36 -03:00
oobabooga
b17893a58f
Revert "Add tensor split support for llama.cpp ( #3171 )"
This reverts commit 031fe7225e.
2023-07-26 07:06:01 -07:00
oobabooga
69f8b35bc9
Revert changes to README
2023-07-25 20:51:19 -07:00
oobabooga
1b89c304ad
Update README
2023-07-25 15:46:12 -07:00
oobabooga
77d2e9f060
Remove flexgen 2
2023-07-25 15:18:25 -07:00
oobabooga
5134d5b1c6
Update README
2023-07-25 15:13:07 -07:00
Shouyi
031fe7225e
Add tensor split support for llama.cpp ( #3171 )
2023-07-25 18:59:26 -03:00
Eve
f653546484
README updates and improvements ( #3198 )
2023-07-25 18:58:13 -03:00
Ikko Eltociear Ashimine
b09e4f10fd
Fix typo in README.md ( #3286 )
tranformers -> transformers
2023-07-25 18:56:25 -03:00
oobabooga
a07d070b6c
Add llama-2-70b GGML support ( #3285 )
2023-07-24 16:37:03 -03:00
oobabooga
6415cc68a2
Remove obsolete information from README
2023-07-19 21:20:40 -07:00
Panchovix
10c8c197bf
Add Support for Static NTK RoPE scaling for exllama/exllama_hf ( #2955 )
2023-07-04 01:13:16 -03:00
oobabooga
4b1804a438
Implement sessions + add basic multi-user support ( #2991 )
2023-07-04 00:03:30 -03:00
oobabooga
7611978f7b
Add Community section to README
2023-06-27 13:56:14 -03:00
oobabooga
c52290de50
ExLlama with long context ( #2875 )
2023-06-25 22:49:26 -03:00
oobabooga
8bb3bb39b3
Implement stopping string search in string space ( #2847 )
2023-06-24 09:43:00 -03:00
oobabooga
0f9088f730
Update README
2023-06-23 12:24:43 -03:00
oobabooga
383c50f05b
Replace old presets with the results of Preset Arena ( #2830 )
2023-06-23 01:48:29 -03:00
LarryVRH
580c1ee748
Implement a demo HF wrapper for exllama to utilize existing HF transformers decoding. ( #2777 )
2023-06-21 15:31:42 -03:00
oobabooga
a1cac88c19
Update README.md
2023-06-19 01:28:23 -03:00
oobabooga
5f392122fd
Add gpu_split param to ExLlama
Adapted from code created by Ph0rk0z. Thank you Ph0rk0z.
2023-06-16 20:49:36 -03:00
oobabooga
9f40032d32
Add ExLlama support ( #2444 )
2023-06-16 20:35:38 -03:00
oobabooga
7ef6a50e84
Reorganize model loading UI completely ( #2720 )
2023-06-16 19:00:37 -03:00
oobabooga
57be2eecdf
Update README.md
2023-06-16 15:04:16 -03:00
Tom Jobbins
646b0c889f
AutoGPTQ: Add UI and command line support for disabling fused attention and fused MLP ( #2648 )
2023-06-15 23:59:54 -03:00
oobabooga
8936160e54
Add WSL installer to README (thanks jllllll)
2023-06-13 00:07:34 -03:00
oobabooga
eda224c92d
Update README
2023-06-05 17:04:09 -03:00
oobabooga
bef94b9ebb
Update README
2023-06-05 17:01:13 -03:00
oobabooga
f276d88546
Use AutoGPTQ by default for GPTQ models
2023-06-05 15:41:48 -03:00
oobabooga
632571a009
Update README
2023-06-05 15:16:06 -03:00
oobabooga
2f6631195a
Add desc_act checkbox to the UI
2023-06-02 01:45:46 -03:00
oobabooga
ee99a87330
Update README.md
2023-06-01 12:08:44 -03:00
oobabooga
146505a16b
Update README.md
2023-06-01 12:04:58 -03:00
oobabooga
3347395944
Update README.md
2023-06-01 12:01:20 -03:00
oobabooga
aba56de41b
Update README.md
2023-06-01 11:46:28 -03:00
oobabooga
df18ae7d6c
Update README.md
2023-06-01 11:27:33 -03:00
Morgan Schweers
1aed2b9e52
Make it possible to download protected HF models from the command line. ( #2408 )
2023-06-01 00:11:21 -03:00
jllllll
412e7a6a96
Update README.md to include missing flags ( #2449 )
2023-05-31 11:07:56 -03:00
Atinoda
bfbd13ae89
Update docker repo link ( #2340 )
2023-05-30 22:14:49 -03:00
oobabooga
962d05ca7e
Update README.md
2023-05-29 14:56:55 -03:00
Honkware
204731952a
Falcon support (trust-remote-code and autogptq checkboxes) ( #2367 )
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-05-29 10:20:18 -03:00
oobabooga
f27135bdd3
Add Eta Sampling preset
Also remove some presets that I do not consider relevant
2023-05-28 22:44:35 -03:00
oobabooga
00ebea0b2a
Use YAML for presets and settings
2023-05-28 22:34:12 -03:00
jllllll
07a4f0569f
Update README.md to account for BnB Windows wheel ( #2341 )
2023-05-25 18:44:26 -03:00
oobabooga
231305d0f5
Update README.md
2023-05-25 12:05:08 -03:00
oobabooga
37d4ad012b
Add a button for rendering markdown for any model
2023-05-25 11:59:27 -03:00
oobabooga
9a43656a50
Add bitsandbytes note
2023-05-25 11:21:52 -03:00
DGdev91
cf088566f8
Make llama.cpp read prompt size and seed from settings ( #2299 )
2023-05-25 10:29:31 -03:00
oobabooga
a04266161d
Update README.md
2023-05-25 01:23:46 -03:00
oobabooga
361451ba60
Add --load-in-4bit parameter ( #2320 )
2023-05-25 01:14:13 -03:00
Gabriel Terrien
7aed53559a
Support of the --gradio-auth flag ( #2283 )
2023-05-23 20:39:26 -03:00
Atinoda
4155aaa96a
Add mention to alternative docker repository ( #2145 )
2023-05-23 20:35:53 -03:00
Carl Kenner
c86231377b
Wizard Mega, Ziya, KoAlpaca, OpenBuddy, Chinese-Vicuna, Vigogne, Bactrian, H2O support, fix Baize ( #2159 )
2023-05-19 11:42:41 -03:00
Alex "mcmonkey" Goodwin
1f50dbe352
Experimental jank multiGPU inference that's 2x faster than native somehow ( #2100 )
2023-05-17 10:41:09 -03:00
Andrei
e657dd342d
Add in-memory cache support for llama.cpp ( #1936 )
2023-05-15 20:19:55 -03:00
AlphaAtlas
071f0776ad
Add llama.cpp GPU offload option ( #2060 )
2023-05-14 22:58:11 -03:00
oobabooga
23d3f6909a
Update README.md
2023-05-11 10:21:20 -03:00
oobabooga
2930e5a895
Update README.md
2023-05-11 10:04:38 -03:00
oobabooga
0ff38c994e
Update README.md
2023-05-11 09:58:58 -03:00
oobabooga
e6959a5d9a
Update README.md
2023-05-11 09:54:22 -03:00
oobabooga
dcfd09b61e
Update README.md
2023-05-11 09:49:57 -03:00
oobabooga
7a49ceab29
Update README.md
2023-05-11 09:42:39 -03:00
oobabooga
57dc44a995
Update README.md
2023-05-10 12:48:25 -03:00
oobabooga
181b102521
Update README.md
2023-05-10 12:09:47 -03:00
Carl Kenner
814f754451
Support for MPT, INCITE, WizardLM, StableLM, Galactica, Vicuna, Guanaco, and Baize instruction following ( #1596 )
2023-05-09 20:37:31 -03:00
Wojtab
e9e75a9ec7
Generalize multimodality (llava/minigpt4 7b and 13b now supported) ( #1741 )
2023-05-09 20:18:02 -03:00
oobabooga
00e333d790
Add MOSS support
2023-05-04 23:20:34 -03:00
oobabooga
b6ff138084
Add --checkpoint argument for GPTQ
2023-05-04 15:17:20 -03:00
Ahmed Said
fbcd32988e
added no_mmap & mlock parameters to llama.cpp and removed llamacpp_model_alternative ( #1649 )
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-05-02 18:25:28 -03:00
oobabooga
f39c99fa14
Load more than one LoRA with --lora, fix a bug
2023-04-25 22:58:48 -03:00
oobabooga
b6af2e56a2
Add --character flag, add character to settings.json
2023-04-24 13:19:42 -03:00
eiery
78d1977ebf
add n_batch support for llama.cpp ( #1115 )
2023-04-24 03:46:18 -03:00
Andy Salerno
654933c634
New universal API with streaming/blocking endpoints ( #990 )
Previous title: Add api_streaming extension and update api-example-stream to use it
* Merge with latest main
* Add parameter capturing encoder_repetition_penalty
* Change some defaults, minor fixes
* Add --api, --public-api flags
* remove unneeded/broken comment from blocking API startup. The comment is already correctly emitted in try_start_cloudflared by calling the lambda we pass in.
* Update on_start message for blocking_api, it should say 'non-streaming' and not 'streaming'
* Update the API examples
* Change a comment
* Update README
* Remove the gradio API
* Remove unused import
* Minor change
* Remove unused import
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-23 15:52:43 -03:00
oobabooga
7438f4f6ba
Change GPTQ triton default settings
2023-04-22 12:27:30 -03:00
oobabooga
fe02281477
Update README.md
2023-04-22 03:05:00 -03:00
oobabooga
038fa3eb39
Update README.md
2023-04-22 02:46:07 -03:00
oobabooga
505c2c73e8
Update README.md
2023-04-22 00:11:27 -03:00
oobabooga
f8da9a0424
Update README.md
2023-04-18 20:25:08 -03:00
oobabooga
c3f6e65554
Update README.md
2023-04-18 20:23:31 -03:00
oobabooga
eb15193327
Update README.md
2023-04-18 13:07:08 -03:00
oobabooga
7fbfc489e2
Update README.md
2023-04-18 12:56:37 -03:00
oobabooga
f559f9595b
Update README.md
2023-04-18 12:54:09 -03:00
loeken
89e22d4d6a
added windows/docker docs ( #1027 )
2023-04-18 12:47:43 -03:00
oobabooga
8275989f03
Add new 1-click installers for Linux and MacOS
2023-04-18 02:40:36 -03:00
oobabooga
301c687c64
Update README.md
2023-04-17 11:25:26 -03:00
oobabooga
89bc540557
Update README
2023-04-17 10:55:35 -03:00
practicaldreamer
3961f49524
Add note about --no-fused_mlp ignoring --gpu-memory ( #1301 )
2023-04-17 10:46:37 -03:00
sgsdxzy
b57ffc2ec9
Update to support GPTQ triton commit c90adef ( #1229 )
2023-04-17 01:11:18 -03:00
oobabooga
3e5cdd005f
Update README.md
2023-04-16 23:28:59 -03:00
oobabooga
39099663a0
Add 4-bit LoRA support ( #1200 )
2023-04-16 23:26:52 -03:00
oobabooga
705121161b
Update README.md
2023-04-16 20:03:03 -03:00
oobabooga
50c55a51fc
Update README.md
2023-04-16 19:22:31 -03:00
Forkoz
c6fe1ced01
Add ChatGLM support ( #1256 )
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-16 19:15:03 -03:00
oobabooga
c96529a1b3
Update README.md
2023-04-16 17:00:03 -03:00
oobabooga
004f275efe
Update README.md
2023-04-14 23:36:56 -03:00
oobabooga
83964ed354
Update README.md
2023-04-14 23:33:54 -03:00
oobabooga
c41037db68
Update README.md
2023-04-14 23:32:39 -03:00
v0xie
9d66957207
Add --listen-host launch option ( #1122 )
2023-04-13 21:35:08 -03:00
oobabooga
403be8a27f
Update README.md
2023-04-13 21:23:35 -03:00
Light
97e67d136b
Update README.md
2023-04-13 21:00:58 +08:00
Light
15d5a043f2
Merge remote-tracking branch 'origin/main' into triton
2023-04-13 19:38:51 +08:00
oobabooga
7dfbe54f42
Add --model-menu option
2023-04-12 21:24:26 -03:00
MarlinMr
47daf891fe
Link to developer.nvidia.com ( #1104 )
2023-04-12 15:56:42 -03:00
Light
f3591ccfa1
Keep minimal change.
2023-04-12 23:26:06 +08:00
oobabooga
461ca7faf5
Mention that pull request reviews are welcome
2023-04-11 23:12:48 -03:00
oobabooga
749c08a4ff
Update README.md
2023-04-11 14:42:10 -03:00
IggoOnCode
09d8119e3c
Add CPU LoRA training ( #938 )
(It's very slow)
2023-04-10 17:29:00 -03:00
oobabooga
f035b01823
Update README.md
2023-04-10 16:20:23 -03:00
Jeff Lefebvre
b7ca89ba3f
Mention that build-essential is required ( #1013 )
2023-04-10 16:19:10 -03:00
MarkovInequality
992663fa20
Added xformers support to Llama ( #950 )
2023-04-09 23:08:40 -03:00
oobabooga
bce1b7fbb2
Update README.md
2023-04-09 02:19:40 -03:00
oobabooga
f7860ce192
Update README.md
2023-04-09 02:19:17 -03:00
oobabooga
ece8ed2c84
Update README.md
2023-04-09 02:18:42 -03:00
MarlinMr
ec979cd9c4
Use updated docker compose ( #877 )
2023-04-07 10:48:47 -03:00
MarlinMr
2c0018d946
Cosmetic change of README.md ( #878 )
2023-04-07 10:47:10 -03:00
oobabooga
848c4edfd5
Update README.md
2023-04-06 22:52:35 -03:00
oobabooga
e047cd1def
Update README
2023-04-06 22:50:58 -03:00