Commit Graph

3636 Commits

Author | SHA1 | Message | Date
oobabooga | 7073665a10 | Truncate long chat completions inputs (#5439) | 2024-02-05 02:31:24 -03:00
oobabooga | 9033fa5eee | Organize the Model tab | 2024-02-04 19:30:22 -08:00
oobabooga | cd4ffd3dd4 | Update docs | 2024-02-04 18:48:04 -08:00
oobabooga | 92d0617bce | Merge remote-tracking branch 'refs/remotes/origin/dev' into dev | 2024-02-04 18:40:46 -08:00
oobabooga | a210999255 | Bump safetensors version | 2024-02-04 18:40:25 -08:00
Badis Ghoubali | 9fdee65cf5 | Improve ChatML template (#5411) | 2024-02-04 23:39:15 -03:00
Forkoz | 2a45620c85 | Split by rows instead of layers for llama.cpp multi-gpu (#5435) | 2024-02-04 23:36:40 -03:00
Badis Ghoubali | 3df7e151f7 | fix the n_batch slider (#5436) | 2024-02-04 18:15:30 -03:00
oobabooga | 4e188eeb80 | Lint | 2024-02-03 20:40:10 -08:00
oobabooga | cde000d478 | Remove non-HF ExLlamaV2 loader (#5431) | 2024-02-04 01:15:51 -03:00
kalomaze | b6077b02e4 | Quadratic sampling (#5403) (Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>) | 2024-02-04 00:20:02 -03:00
oobabooga | e98d1086f5 | Bump llama-cpp-python to 0.2.38 (#5420) | 2024-02-01 20:09:30 -03:00
oobabooga | 167ee72d4e | Lint | 2024-01-30 09:16:23 -08:00
oobabooga | ee65f4f014 | Downloader: don't assume that huggingface_hub is installed | 2024-01-30 09:14:11 -08:00
oobabooga | 89f6036e98 | Bump llama-cpp-python, remove python 3.8/3.9, cuda 11.7 (#5397) | 2024-01-30 13:19:20 -03:00
Forkoz | 528318b700 | API: Remove tiktoken from logit bias (#5391) | 2024-01-28 21:42:03 -03:00
Badis Ghoubali | 40c7977f9b | Add roleplay.gbnf grammar (#5368) | 2024-01-28 21:41:28 -03:00
smCloudInTheSky | b1463df0a1 | docker: add options for CPU only, Intel GPU, AMD GPU (#5380) | 2024-01-28 11:18:14 -03:00
oobabooga | d921f80322 | one-click: minor fix after 5e87678fea | 2024-01-28 06:14:15 -08:00
Evgenii | 26c3ab367e | one-click: use f-strings to improve readability and unify with the rest code (#5068) | 2024-01-27 17:31:22 -03:00
Andrew C. Dvorak | 5e87678fea | Support running as a git submodule. (#5227) | 2024-01-27 17:18:50 -03:00
Hubert Kasperek | 69622930c7 | Ability to run the Coqui TTS extension on the CPU (#5365) | 2024-01-27 17:15:34 -03:00
Anthony Guijarro | 828be63f2c | Downloader: use HF get_token function (#5381) | 2024-01-27 17:13:09 -03:00
oobabooga | de387069da | Merge remote-tracking branch 'refs/remotes/origin/dev' into dev | 2024-01-26 06:12:19 -08:00
sam-ngu | c0bdcee646 | added trust_remote_code to deepspeed init loaderClass (#5237) | 2024-01-26 11:10:57 -03:00
dependabot[bot] | bfe2326a24 | Bump hqq from 0.1.2 to 0.1.2.post1 (#5349) | 2024-01-26 11:10:18 -03:00
oobabooga | 70648e75e6 | Docs: minor change | 2024-01-26 06:00:26 -08:00
oobabooga | c1470870bb | Update README | 2024-01-26 05:58:40 -08:00
oobabooga | 87dc421ee8 | Bump exllamav2 to 0.0.12 (#5352) | 2024-01-22 22:40:12 -03:00
oobabooga | aa575119e6 | API: minor fix | 2024-01-22 04:38:43 -08:00
oobabooga | 821dd65fb3 | API: add a comment | 2024-01-22 04:15:51 -08:00
oobabooga | 6247eafcc5 | API: better handle temperature = 0 | 2024-01-22 04:12:23 -08:00
oobabooga | 817866c9cf | Lint | 2024-01-22 04:07:25 -08:00
oobabooga | b9d1873301 | Bump transformers to 4.37 | 2024-01-22 04:07:12 -08:00
oobabooga | aad73667af | Lint | 2024-01-22 03:25:55 -08:00
oobabooga | 6ada77cf5a | Update README.md | 2024-01-22 03:17:15 -08:00
oobabooga | 8b5495ebf8 | Merge remote-tracking branch 'refs/remotes/origin/dev' into dev | 2024-01-22 03:15:29 -08:00
oobabooga | cc6505df14 | Update README.md | 2024-01-22 03:14:56 -08:00
Cohee | fbf8ae39f8 | API: Allow content arrays for multimodal OpenAI requests (#5277) | 2024-01-22 08:10:26 -03:00
Ercan | 166fdf09f3 | API: Properly handle Images with RGBA color format (#5332) | 2024-01-22 08:08:51 -03:00
lmg-anon | db1da9f98d | Fix logprobs tokens in OpenAI API (#5339) | 2024-01-22 08:07:42 -03:00
oobabooga | b5cabb6e9d | Bump llama-cpp-python to 0.2.31 (#5345) | 2024-01-22 08:05:59 -03:00
oobabooga | 8962bb173e | Bump llama-cpp-python to 0.2.29 (#5307) | 2024-01-18 14:24:17 -03:00
Stefan Daniel Schwarz | 232c07bf1f | API: set do_sample=false when temperature=0 (#5275) | 2024-01-17 23:58:11 -03:00
Yiximail | 3fef37cda8 | UI: Update position of show-controls label to avoid line breaks due to font size (#5256) | 2024-01-17 23:56:48 -03:00
oobabooga | 7916cf863b | Bump transformers (necesary for e055967974) | 2024-01-17 12:37:31 -08:00
Forkoz | 5c5ef4cef7 | UI: change n_gpu_layers maximum to 256 for larger models. (#5262) | 2024-01-17 17:13:16 -03:00
ilya sheprut | 4d14eb8b82 | LoRA: Fix error "Attempting to unscale FP16 gradients" when training (#5268) | 2024-01-17 17:11:49 -03:00
Katehuuh | 535ea9928a | Fixed whisper README Typo Hyperlinks (#5281) | 2024-01-17 17:10:45 -03:00
oobabooga | e055967974 | Add prompt_lookup_num_tokens parameter (#5296) | 2024-01-17 17:09:36 -03:00