Commit Graph

1427 Commits

Author SHA1 Message Date
Cebtenzzre
cc7675d432 convert scripts: make gptj script executable 2023-10-05 18:16:19 -04:00
Cebtenzzre
0493e6eb07 convert scripts: use bytes_to_unicode from transformers 2023-10-05 18:16:19 -04:00
Cebtenzzre
a49a1dcdf4 chatllm: grammar fix 2023-10-05 18:16:19 -04:00
Cebtenzzre
d5d72f0361 gpt-j: update inference to match latest llama.cpp insights
- Use F16 KV cache
- Store transposed V in the cache
- Avoid unnecessary Q copy

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

ggml upstream commit 0265f0813492602fec0e1159fe61de1bf0ccaf78
2023-10-05 18:16:19 -04:00
Cebtenzzre
050e7f076e backend: port GPT-J to GGUF 2023-10-05 18:16:19 -04:00
Cebtenzzre
31b20f093a modellist: fix the system prompt 2023-10-05 18:16:19 -04:00
Cebtenzzre
8f3abb37ca fix references to removed model types 2023-10-05 18:16:19 -04:00
Cebtenzzre
4219c0e2e7 convert scripts: make them directly executable 2023-10-05 18:16:19 -04:00
Cebtenzzre
ce7be1db48 backend: use llamamodel.cpp for Falcon 2023-10-05 18:16:19 -04:00
Cebtenzzre
cca9e6ce81 convert_mpt_hf_to_gguf.py: better tokenizer decoding 2023-10-05 18:16:19 -04:00
Cebtenzzre
25297786db convert scripts: load model as late as possible 2023-10-05 18:16:19 -04:00
Cebtenzzre
fd47088f2b conversion scripts: cleanup 2023-10-05 18:16:19 -04:00
Cebtenzzre
6277eac9cc backend: use llamamodel.cpp for StarCoder 2023-10-05 18:16:19 -04:00
Cebtenzzre
aa706ab1ff backend: use gguf branch of llama.cpp-mainline 2023-10-05 18:16:19 -04:00
Cebtenzzre
17fc9e3e58 backend: port Replit to GGUF 2023-10-05 18:16:19 -04:00
Cebtenzzre
7c67262a13 backend: port MPT to GGUF 2023-10-05 18:16:19 -04:00
Cebtenzzre
42bcb814b3 backend: port BERT to GGUF 2023-10-05 18:16:19 -04:00
Cebtenzzre
4392bf26e0 pyllmodel: print specific error message 2023-10-05 18:16:19 -04:00
Cebtenzzre
34f2ec2b33 gpt4all.py: GGUF 2023-10-05 18:16:19 -04:00
Cebtenzzre
1d29e4696c llamamodel: metal supports all quantization types now 2023-10-05 18:16:19 -04:00
Aaron Miller
507753a37c macos build fixes 2023-10-05 18:16:19 -04:00
Adam Treat
d90d003a1d Latest rebase on llama.cpp with gguf support. 2023-10-05 18:16:19 -04:00
Akarshan Biswas
5f3d739205 appdata: update software description 2023-10-05 10:12:43 -04:00
Akarshan Biswas
b4cf12e1bd Update to 2.4.19 2023-10-05 10:12:43 -04:00
Akarshan Biswas
21a5709b07 Remove unnecessary stuffs from manifest 2023-10-05 10:12:43 -04:00
Akarshan Biswas
4426640f44 Add flatpak manifest 2023-10-05 10:12:43 -04:00
Aaron Miller
6711bddc4c launch browser instead of maintenancetool from offline builds 2023-09-27 11:24:21 -07:00
Aaron Miller
7f979c8258 Build offline installers in CircleCI 2023-09-27 11:24:21 -07:00
Adam Treat
99c106e6b5 Fix a bug seen on AMD RADEON cards with vulkan backend. 2023-09-26 11:59:47 -04:00
Andriy Mulyar
9611c4081a Update README.md
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-09-20 15:50:28 -04:00
kevinbazira
17cb4a86d1 Replace git clone SSH URI with HTTPS URL
Running `git clone --recurse-submodules git@github.com:nomic-ai/gpt4all.git`
returns `Permission denied (publickey)` as shown below:
```
git clone --recurse-submodules git@github.com:nomic-ai/gpt4all.git
Cloning into gpt4all...
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.
```

This change replaces `git@github.com:nomic-ai/gpt4all.git` with
`https://github.com/nomic-ai/gpt4all.git` which runs without permission issues.

resolves nomic-ai/gpt4all#8, resolves nomic-ai/gpt4all#49
2023-09-20 09:48:47 -04:00
Andriy Mulyar
0d1edaf029 Update README.md with GPU support
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-09-19 10:51:17 -04:00
Adam Treat
dc80d1e578 Fix up the offline installer. 2023-09-18 16:21:50 -04:00
Jacob Nguyen
e86c63750d Update llama.cpp.cmake
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>
2023-09-16 11:42:56 -07:00
Adam Treat
f47e698193 Release notes for v2.4.19 and bump the version. 2023-09-16 12:35:08 -04:00
Adam Treat
84905aa281 Fix for crashes on systems where vulkan is not installed properly. 2023-09-16 12:19:46 -04:00
Adam Treat
ecf014f03b Release notes for v2.4.18 and bump the version. 2023-09-16 10:21:50 -04:00
Adam Treat
e6e724d2dc Actually bump the version. 2023-09-16 10:07:20 -04:00
Adam Treat
06a833e652 Send actual and requested device info for those who have opt-in. 2023-09-16 09:42:22 -04:00
Adam Treat
045f6e6cdc Link against ggml in bin so we can get the available devices without loading a model. 2023-09-15 14:45:25 -04:00
Adam Treat
0f046cf905 Bump the Python version to python-v1.0.12 to restrict the quants that vulkan recognizes. 2023-09-15 09:12:20 -04:00
Adam Treat
655372dbfa Release notes for v2.4.17 and bump the version. 2023-09-14 17:11:04 -04:00
Adam Treat
aa33419c6e Fallback to CPU more robustly. 2023-09-14 16:53:11 -04:00
Adam Treat
79843c269e Release notes for v2.4.16 and bump the version. 2023-09-14 11:24:25 -04:00
Adam Treat
9013a089bd Bump to new llama with new bugfix. 2023-09-14 10:02:11 -04:00
Adam Treat
3076e0bf26 Only show GPU when we're actually using it. 2023-09-14 09:59:19 -04:00
Adam Treat
1fa67a585c Report the actual device we're using. 2023-09-14 08:25:37 -04:00
Adam Treat
cf4eb530ce Sync to a newer version of llama.cpp with bugfix for vulkan. 2023-09-13 21:01:44 -04:00
Adam Treat
21a3244645 Fix a bug where we're not properly falling back to CPU. 2023-09-13 19:30:27 -04:00
Adam Treat
0458c9b4e6 Add version 2.4.15 and bump the version number. 2023-09-13 17:55:50 -04:00