523 Commits

Author SHA1 Message Date
oobabooga
0e54a09bcb Remove exllamav1 loaders (#5128) 2023-12-31 01:57:06 -03:00
oobabooga
8e397915c9 Remove --sdp-attention, --xformers flags (#5126) 2023-12-31 01:36:51 -03:00
oobabooga
2706149c65 Organize the CMD arguments by group (#5027) 2023-12-21 00:33:55 -03:00
oobabooga
de138b8ba6 Add llama-cpp-python wheels with tensor cores support (#5003) 2023-12-19 17:30:53 -03:00
oobabooga
0a299d5959 Bump llama-cpp-python to 0.2.24 (#5001) 2023-12-19 15:22:21 -03:00
Water
674be9a09a Add HQQ quant loader (#4888)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-12-18 21:23:16 -03:00
oobabooga
f1f2c4c3f4 Add --num_experts_per_token parameter (ExLlamav2) (#4955) 2023-12-17 12:08:33 -03:00
oobabooga
41424907b1 Update README 2023-12-16 16:35:36 -08:00
oobabooga
0087dca286 Update README 2023-12-16 12:28:51 -08:00
oobabooga
3bbf6c601d AutoGPTQ: Add --disable_exllamav2 flag (Mixtral CPU offloading needs this) 2023-12-15 06:46:13 -08:00
oobabooga
623c92792a Update README 2023-12-14 07:56:48 -08:00
oobabooga
3580bed041 Update README 2023-12-14 07:54:20 -08:00
oobabooga
d5ec3c3444 Update README 2023-12-14 06:20:52 -08:00
oobabooga
5b283fff22 Update README 2023-12-14 06:15:14 -08:00
oobabooga
958799221f Update README 2023-12-14 06:09:03 -08:00
oobabooga
e7fa17740a Update README 2023-12-13 22:49:42 -08:00
oobabooga
03babe7d81 Update README 2023-12-13 22:47:08 -08:00
oobabooga
aad14174e4 Update README 2023-12-13 22:46:18 -08:00
oobabooga
783947a2aa Update README 2023-12-13 22:44:25 -08:00
oobabooga
7fef16950f Update README 2023-12-13 22:42:54 -08:00
oobabooga
d36e7f1762 Update README 2023-12-13 22:35:22 -08:00
oobabooga
9695db0ee4 Update README 2023-12-13 22:30:31 -08:00
oobabooga
d354f5009c Update README 2023-12-13 22:21:29 -08:00
oobabooga
0a4fad2d46 Update README 2023-12-13 22:20:37 -08:00
oobabooga
fade6abfe9 Update README 2023-12-13 22:18:40 -08:00
oobabooga
aafd15109d Update README 2023-12-13 22:15:58 -08:00
oobabooga
634518a412 Update README 2023-12-13 22:08:41 -08:00
oobabooga
0d5ca05ab9 Update README 2023-12-13 22:06:04 -08:00
oobabooga
d241de86c4 Update README 2023-12-13 22:02:26 -08:00
oobabooga
36e850fe89 Update README.md 2023-12-13 17:55:41 -03:00
oobabooga
8c8825b777 Add QuIP# to README 2023-12-08 08:40:42 -08:00
oobabooga
f7145544f9 Update README 2023-12-04 15:44:44 -08:00
oobabooga
be88b072e9 Update --loader flag description 2023-12-04 15:41:25 -08:00
Ikko Eltociear Ashimine
06cc9a85f7 README: minor typo fix (#4793) 2023-12-03 22:46:34 -03:00
oobabooga
000b77a17d Minor docker changes 2023-11-29 21:27:23 -08:00
Callum
88620c6b39 feature/docker_improvements (#4768) 2023-11-30 02:20:23 -03:00
oobabooga
ff24648510 Credit llama-cpp-python in the README 2023-11-20 12:13:15 -08:00
oobabooga
ef6feedeb2 Add --nowebui flag for pure API mode (#4651) 2023-11-18 23:38:39 -03:00
oobabooga
8f4f4daf8b Add --admin-key flag for API (#4649) 2023-11-18 22:33:27 -03:00
oobabooga
d1a58da52f Update ancient Docker instructions 2023-11-17 19:52:53 -08:00
oobabooga
e0ca49ed9c Bump llama-cpp-python to 0.2.18 (2nd attempt) (#4637)
* Update requirements*.txt

* Add back seed
2023-11-18 00:31:27 -03:00
oobabooga
9d6f79db74 Revert "Bump llama-cpp-python to 0.2.18 (#4611)"
This reverts commit 923c8e25fb.
2023-11-17 05:14:25 -08:00
oobabooga
13dc3b61da Update README 2023-11-16 19:57:55 -08:00
oobabooga
923c8e25fb Bump llama-cpp-python to 0.2.18 (#4611) 2023-11-16 22:55:14 -03:00
oobabooga
322c170566 Document logits_all 2023-11-07 14:45:11 -08:00
oobabooga
d59f1ad89a Update README.md 2023-11-07 13:05:06 -03:00
oobabooga
ec17a5d2b7 Make OpenAI API the default API (#4430) 2023-11-06 02:38:29 -03:00
feng lui
4766a57352 transformers: add use_flash_attention_2 option (#4373) 2023-11-04 13:59:33 -03:00
oobabooga
c0655475ae Add cache_8bit option 2023-11-02 11:23:04 -07:00
oobabooga
77abd9b69b Add no_flash_attn option 2023-11-02 11:08:53 -07:00