Commit Graph

61 Commits

Author SHA1 Message Date
Jared Van Bortel
de7cb36fcc
python: reduce size of wheels built by CI, other build tweaks (#2802)
* Read CMAKE_CUDA_ARCHITECTURES directly
* Disable CUBINs for python build in CI
* Search for CUDA 11 as well as CUDA 12

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-07 11:27:50 -04:00
Jared Van Bortel
1f2294ed73
python: prepare to release v2.8.0 (#2794)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-05 13:36:18 -04:00
Jared Van Bortel
10c3e21147
python: detect Rosetta 2 (#2793)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-05 13:24:06 -04:00
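For illustration, a minimal sketch of how Rosetta 2 translation can be detected on macOS; the sysctl key is the standard one for this check, but the exact approach used by the bindings is an assumption:

```python
import platform
import subprocess

def running_under_rosetta2() -> bool:
    """Best-effort check for an x86_64 process being translated by Rosetta 2."""
    if platform.system() != "Darwin":
        return False
    try:
        # sysctl.proc_translated reports 1 when the current process is translated.
        out = subprocess.run(
            ["sysctl", "-in", "sysctl.proc_translated"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
        return out == "1"
    except (subprocess.CalledProcessError, FileNotFoundError):
        return False
```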
mcembalest
5306595176
V3 docs max (#2488)
* new skeleton

Signed-off-by: Max Cembalest <max@nomic.ai>

* v3 docs

Signed-off-by: Max Cembalest <max@nomic.ai>

---------

Signed-off-by: Max Cembalest <max@nomic.ai>
2024-07-01 13:00:14 -04:00
Jared Van Bortel
09dd3dc318
python: depend on official NVIDIA CUDA packages (#2355)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-05-20 18:06:27 -04:00
Jared Van Bortel
d2a99d9bc6
support the llama.cpp CUDA backend (#2310)
* rebase onto llama.cpp commit ggerganov/llama.cpp@d46dbc76f
* support for CUDA backend (enabled by default)
* partial support for Occam's Vulkan backend (disabled by default)
* partial support for HIP/ROCm backend (disabled by default)
* sync llama.cpp.cmake with upstream llama.cpp CMakeLists.txt
* changes to GPT4All backend, bindings, and chat UI to handle choice of llama.cpp backend (Kompute or CUDA)
* ship CUDA runtime with installed version
* make device selection in the UI on macOS actually do something
* model whitelist: remove dbrx, mamba, persimmon, plamo; add internlm and starcoder2

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-05-15 15:27:50 -04:00
Jared Van Bortel
ba53ab5da0
python: do not print GPU name with verbose=False, expose this info via properties (#2222)
* llamamodel: only print device used in verbose mode

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: expose backend and device via GPT4All properties

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* backend: const correctness fixes

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: bump version

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: typing fixups

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: fix segfault with closed GPT4All

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

---------

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-18 14:52:02 -04:00
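A usage sketch of the properties this commit describes; the property names come from the commit message, while the model file name and example values are placeholders:

```python
from gpt4all import GPT4All

model = GPT4All("mistral-7b-openorca.gguf2.Q4_0.gguf", device="gpu")

# Instead of printing the GPU name on load, the binding now exposes this
# information as properties on the GPT4All object.
print(model.backend)  # e.g. "kompute" or "cpu" (illustrative values)
print(model.device)   # e.g. the selected GPU's name, or None when on CPU
```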
Jared Van Bortel
ac498f79ac
fix regressions in system prompt handling (#2219)
* python: fix system prompt being ignored
* fix unintended whitespace after system prompt

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-15 11:39:48 -04:00
Jared Van Bortel
3f8257c563
llamamodel: fix semantic typo in nomic client dynamic mode (#2216)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-12 17:25:15 -04:00
Jared Van Bortel
46818e466e
python: embedding cancel callback for nomic client dynamic mode (#2214)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-12 16:00:39 -04:00
Jared Van Bortel
459289b94c
embed4all: small fixes related to nomic client local embeddings (#2213)
* actually submit larger batches with increased n_ctx
* fix crash when llama_tokenize returns no tokens

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-12 10:54:15 -04:00
Jared Van Bortel
1b84a48c47
python: add list_gpus to the GPT4All API (#2194)
Other changes:
* fix memory leak in llmodel_available_gpu_devices
* drop model argument from llmodel_available_gpu_devices
* breaking: make GPT4All/Embed4All arguments past model_name keyword-only

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-04 14:52:13 -04:00
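A brief usage sketch of the API named in this commit; the model name is a placeholder, and calling list_gpus without a loaded model follows from the dropped model argument noted above:

```python
from gpt4all import GPT4All

# Query available GPUs without constructing a model first, since
# llmodel_available_gpu_devices no longer takes a model argument.
print(GPT4All.list_gpus())

# Arguments past model_name are now keyword-only, so device must be named:
model = GPT4All("mistral-7b-openorca.gguf2.Q4_0.gguf", device="gpu")
```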
Jared Van Bortel
3313c7de0d
python: implement close() and context manager interface (#2177)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-28 16:48:07 -04:00
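A minimal sketch of the cleanup pattern this commit enables (the model name and prompt are placeholders):

```python
from gpt4all import GPT4All

# Explicit cleanup of the underlying native model...
model = GPT4All("mistral-7b-openorca.gguf2.Q4_0.gguf")
model.close()

# ...or the equivalent context-manager form, which closes on exit.
with GPT4All("mistral-7b-openorca.gguf2.Q4_0.gguf") as model:
    print(model.generate("Name three colors.", max_tokens=32))
```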
Jared Van Bortel
b743c588e8
python: bump version to 2.3.2 to include *all* of the bugfixes (#2171)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-26 15:26:08 -04:00
Jared Van Bortel
8d09b2c264 python: bump version
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-25 22:16:50 -07:00
Jared Van Bortel
446668674e python: use TypedDict from typing_extensions on python 3.9 and 3.10
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-25 22:16:50 -07:00
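The version-gated import this change implies looks roughly like the following sketch; the MessageType class is illustrative, not necessarily the exact type defined in the bindings:

```python
import sys

if sys.version_info >= (3, 11):
    from typing import TypedDict
else:
    # Python 3.9 and 3.10 lack newer TypedDict features, so fall back to
    # the typing_extensions backport on those interpreters.
    from typing_extensions import TypedDict

class MessageType(TypedDict):
    role: str
    content: str
```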
Jared Van Bortel
71d7f34d1a
python: improve handling of incomplete downloads (#2152)
* make sure encoding is identity for Range requests
* use a .part file for partial downloads
* verify using file size and MD5 from models3.json

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-21 11:33:41 -04:00
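A condensed sketch of the resume-and-verify flow the bullet points describe; the function name, arguments, and structure are illustrative, not the bindings' actual code:

```python
import hashlib
import os
import requests

def download_with_resume(url: str, dest: str, expected_size: int, expected_md5: str) -> None:
    part = dest + ".part"  # partial data lives in a .part file until verified
    offset = os.path.getsize(part) if os.path.exists(part) else 0
    headers = {
        # Range offsets only make sense on the raw bytes, so force identity encoding.
        "Accept-Encoding": "identity",
        **({"Range": f"bytes={offset}-"} if offset else {}),
    }
    with requests.get(url, headers=headers, stream=True, timeout=30) as resp:
        resp.raise_for_status()
        # Append only if the server honored the Range request (206 Partial Content).
        mode = "ab" if resp.status_code == 206 else "wb"
        with open(part, mode) as f:
            for chunk in resp.iter_content(chunk_size=1 << 20):
                f.write(chunk)
    # Verify file size and MD5 (as listed in models3.json) before moving into place.
    if os.path.getsize(part) != expected_size:
        raise RuntimeError("size mismatch")
    md5 = hashlib.md5()
    with open(part, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            md5.update(chunk)
    if md5.hexdigest() != expected_md5:
        raise RuntimeError("checksum mismatch")
    os.replace(part, dest)
```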
Jared Van Bortel
4a16a920a3
python: actually fix python 3.8 compatibility (#1973)
importlib.resources.files also didn't exist until python 3.9.

Fixes #1972

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-02-26 13:15:02 -05:00
Jared Van Bortel
4fc4d94be4
fix chat-style prompt templates (#1970)
Also use a new version of Mistral OpenOrca.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-02-21 15:45:32 -05:00
Simon Willison
f2024a1f9e
python: README and project links for PyPI listing (#1964)
Signed-off-by: Simon Willison <swillison@gmail.com>
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2024-02-13 17:44:33 -05:00
Jared Van Bortel
fc7e5f4a09
ci: fix missing Kompute support in python bindings (#1953)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-02-09 21:40:32 -05:00
Jared Van Bortel
bf493bb048
Mixtral crash fix and python bindings v2.2.0 (#1931)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-02-06 11:01:15 -05:00
Jared Van Bortel
f8564398fc minor change to trigger CircleCI 2024-01-12 16:13:46 -05:00
Jared Van Bortel
eef604fd64 python: release bindings version 2.1.0
The backend has a breaking change for Falcon and MPT models, so we need
to make a new release.
2024-01-12 09:38:16 -05:00
cebtenzzre
3c561bcdf2 python: bump bindings version for AMD fixes 2023-10-30 17:00:05 -04:00
cebtenzzre
7e5e84fbb7
python: change default extension to .gguf (#1559) 2023-10-23 22:18:50 -04:00
Andriy Mulyar
d50803ff8e
GGUF Python Release (#1539) 2023-10-19 19:11:03 -04:00
cebtenzzre
0fe2e19691
llamamodel: re-enable error messages by default (#1537) 2023-10-19 13:46:33 -04:00
cebtenzzre
017c3a9649
python: prepare version 2.0.0rc1 (#1529) 2023-10-18 20:24:54 -04:00
Adam Treat
0f046cf905 Bump the Python version to python-v1.0.12 to restrict the quants that Vulkan recognizes. 2023-09-15 09:12:20 -04:00
Aaron Miller
f0735efa7d fixes for Vulkan Python bindings on Windows 2023-09-12 14:16:02 -07:00
Aaron Miller
0ad1472b62 bump python version (library linking fix) 2023-09-11 09:42:06 -07:00
Andriy Mulyar
b6e38d69ed
Python version bump
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-09-01 13:21:41 -04:00
Andriy Mulyar
39acbc8378
Python version bump
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-07-27 12:19:23 -04:00
Andriy Mulyar
41f640577c
Update setup.py (#1263)
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-07-24 14:25:04 -04:00
Adam Treat
f543affa9a Add better docs and threading support to bert. 2023-07-14 14:14:22 -04:00
Adam Treat
bb2b82e1b9 Add docs and bump version since we changed python api again. 2023-07-14 09:48:57 -04:00
Adam Treat
4963db8f43 Bump the version numbers for both python and c backend. 2023-07-13 14:21:46 -04:00
Aaron Miller
ed470e18b3
python: Only eval latest message in chat sessions (#1149)
* python: Only eval latest message in chat sessions

* python: version bump
2023-07-06 21:02:14 -04:00
Andriy Mulyar
71a7032421
python bindings v1.0.2
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-07-04 11:24:05 -04:00
Andriy Mulyar
19412cfa5d
Clear chat history between chat sessions (#1116) 2023-06-30 20:50:38 -04:00
Andriy Mulyar
46a0762bd5
Python Bindings: Improved unit tests, documentation and unification of API (#1090)
* Makefiles, black, isort

* Black and isort

* unit tests and generation method

* chat context provider

* context does not reset

* Current state

* Fixup

* Python bindings with unit tests

* GPT4All Python Bindings: chat contexts, tests

* New python bindings and backend fixes

* Black and Isort

* Documentation error

* preserved n_predict for backwards compat with langchain

---------

Co-authored-by: Adam Treat <treat.adam@gmail.com>
2023-06-30 16:02:02 -04:00
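A usage sketch of the chat-session API this unification introduced; the model name and prompts are placeholders, and the clean-history behavior reflects the "Clear chat history between chat sessions" change listed above:

```python
from gpt4all import GPT4All

model = GPT4All("mistral-7b-openorca.gguf2.Q4_0.gguf")

# Inside a chat session the model keeps the running conversation as context.
with model.chat_session():
    print(model.generate("My name is Ada.", max_tokens=32))
    print(model.generate("What is my name?", max_tokens=32))

# A new session starts with a clean history (see #1116 above).
with model.chat_session():
    print(model.generate("What is my name?", max_tokens=32))
```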
AMOGUS
b8464073b8
Update gpt4all_chat.md (#1050)
* Update gpt4all_chat.md

Cleaned up and made the sideloading part more readable; also moved the Replit architecture to the supported ones. (+ renamed all "ggML" to "GGML" because who calls it "ggML"?)

Signed-off-by: AMOGUS <137312610+Amogus8P@users.noreply.github.com>

* Removed the prefixing part

Signed-off-by: AMOGUS <137312610+Amogus8P@users.noreply.github.com>

* Bump version

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>

---------

Signed-off-by: AMOGUS <137312610+Amogus8P@users.noreply.github.com>
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
Co-authored-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-06-27 10:49:45 -04:00
Richard Guo
a39a897e34 0.3.5 bump 2023-06-20 10:21:51 -04:00
Richard Guo
25ce8c6a1e revert version 2023-06-20 10:21:51 -04:00
Richard Guo
282a3b5498 setup.py update 2023-06-20 10:21:51 -04:00
Richard Guo
a9b33c3d10 update setup.py 2023-06-13 09:07:08 -04:00
Richard Guo
e9449190cd version bump 2023-06-12 17:32:56 -04:00
Aaron Miller
d3ba1295a7
Metal+LLama take two (#929)
Support latest llama with Metal
---------

Co-authored-by: Adam Treat <adam@nomic.ai>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>
2023-06-09 16:48:46 -04:00
Richard Guo
e0a8480c0e
Generator in Python Bindings - streaming yields tokens at a time (#895)
* generator method

* cleanup

* bump version number for clarity

* added replace in decode to avoid a UnicodeDecodeError

* revert back to _build_prompt
2023-06-09 10:17:44 -04:00
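A usage sketch of the streaming generation this commit introduced; the streaming keyword follows the current gpt4all API rather than the exact method added here, and the model name is a placeholder:

```python
from gpt4all import GPT4All

model = GPT4All("mistral-7b-openorca.gguf2.Q4_0.gguf")

# With streaming enabled, generate() returns a generator that yields
# decoded tokens one at a time instead of the full completion string.
for token in model.generate("Why is the sky blue?", max_tokens=64, streaming=True):
    print(token, end="", flush=True)
print()
```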