Commit Graph

231 Commits

Author SHA1 Message Date
Jared Van Bortel
971c83d1d3
llama.cpp: pull in fix for Kompute-related nvidia-egl crash (#2843)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-13 11:10:10 -04:00
Jared Van Bortel
be91576937
ci: use consistent build options on macOS (#2849)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-12 19:03:18 -04:00
Jared Van Bortel
c950fdd84e
changelogs: add PR 2781 (#2809)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-07 18:59:57 -04:00
Jared Van Bortel
de7cb36fcc
python: reduce size of wheels built by CI, other build tweaks (#2802)
* Read CMAKE_CUDA_ARCHITECTURES directly
* Disable CUBINs for python build in CI
* Search for CUDA 11 as well as CUDA 12

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-07 11:27:50 -04:00
Jared Van Bortel
be66ec8ab5
chat: faster KV shift, continue generating, fix stop sequences (#2781)
* Don't stop generating at end of context
* Use llama_kv_cache ops to shift context
* Fix and improve reverse prompt detection
* Replace prompt recalc callback with a flag to disallow context shift
2024-08-07 11:25:24 -04:00
Jared Van Bortel
1f2294ed73
python: prepare to release v2.8.0 (#2794)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-05 13:36:18 -04:00
Jared Van Bortel
10c3e21147
python: detect Rosetta 2 (#2793)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-05 13:24:06 -04:00
Jared Van Bortel
51bd01ae05
backend: fix extra spaces in tokenization and a CUDA crash (#2778)
Also potentially improves accuracy of BOS insertion, token cache, and logit indexing.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-01 10:46:36 -04:00
patcher9
71c957f8ee
Update monitoring.md (#2724)
Signed-off-by: patcher9 <patcher99@dokulabs.com>
2024-07-25 19:13:00 -04:00
abhisomala
df510ef869
added tutorial and images for it (#2717)
* added tutorial and images for it

Signed-off-by: Max Cembalest <mbcembalest@gmail.com>

* updated images

Signed-off-by: abhisomala <68791501+abhisomala@users.noreply.github.com>
Signed-off-by: Max Cembalest <mbcembalest@gmail.com>

* Minor updates

Signed-off-by: abhisomala <68791501+abhisomala@users.noreply.github.com>
Signed-off-by: Max Cembalest <mbcembalest@gmail.com>

* fix link & indent note callouts

Signed-off-by: mcembalest <70534565+mcembalest@users.noreply.github.com>
Signed-off-by: Max Cembalest <mbcembalest@gmail.com>

* added obsidian tutorial to sidebar and fixed formatting of note boxes

Signed-off-by: Max Cembalest <mbcembalest@gmail.com>

---------

Signed-off-by: Max Cembalest <mbcembalest@gmail.com>
Signed-off-by: abhisomala <68791501+abhisomala@users.noreply.github.com>
Signed-off-by: mcembalest <70534565+mcembalest@users.noreply.github.com>
Co-authored-by: mcembalest <70534565+mcembalest@users.noreply.github.com>
Co-authored-by: Max Cembalest <mbcembalest@gmail.com>
2024-07-22 15:31:43 -04:00
mcembalest
62abecaec8
fixed link to embeddings docs on localdocs page (#2687)
Signed-off-by: Max Cembalest <mbcembalest@gmail.com>
2024-07-17 16:36:31 -04:00
akgom
214499ce84
Update use-local-ai-models-to-privately-chat-with-google-drive.md (#2647)
Updated screenshots for google drive guide with new app images
Signed off by Max Cembalest
2024-07-11 13:22:43 -04:00
akgom
df5d374187
Update use-local-ai-models-to-privately-chat-with-One-Drive.md (#2646)
Signed-off-by: akgom <132290469+akgom@users.noreply.github.com>
2024-07-11 11:26:28 -04:00
akgom
7ec67eab15
Create using-local-ai-models-to-privately-chat-with-One-Drive.md (#2637)
* Create using-local-ai-models-to-privately-chat-with-One-Drive.md
Signed-off-by: Max Cembalest
2024-07-11 11:03:05 -04:00
Andriy Mulyar
d87484d3c9
analytics entry (#2641)
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2024-07-10 18:50:16 -04:00
mcembalest
0de6eba69e
formatted note callouts (#2633)
Signed-off-by: Max Cembalest <mbcembalest@gmail.com>
2024-07-10 09:55:53 -04:00
Andriy Mulyar
62d423c554
typo (#2629)
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2024-07-09 22:54:43 -04:00
akgom
7f2ceff5c8
Create googledrive.md (#2621)
* Create googledrive.md

Signed-off-by: akgom <132290469+akgom@users.noreply.github.com>

* updates

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>

---------

Signed-off-by: akgom <132290469+akgom@users.noreply.github.com>
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
Co-authored-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2024-07-09 22:47:23 -04:00
Hampus
b9103892b6
fix: incomplete sentence in faq (#2611)
Signed-off-by: Hampus <16954508+xdHampus@users.noreply.github.com>
2024-07-09 11:13:03 -04:00
HydeZero
c73f0e5c8c
python: fix docstring grammar (#2529)
Signed-off-by: HydeZero <128327411+HydeZero@users.noreply.github.com>
2024-07-05 12:44:28 -04:00
mcembalest
69102a2859
small edits and placeholder gif (#2513)
* small edits and placeholder gif

Signed-off-by: Max Cembalest <max@nomic.ai>

* jul2 docs updates

Signed-off-by: Max Cembalest <max@nomic.ai>

* added video

Signed-off-by: mcembalest <70534565+mcembalest@users.noreply.github.com>
Signed-off-by: Max Cembalest <max@nomic.ai>

* quantization nits

Signed-off-by: Max Cembalest <max@nomic.ai>

---------

Signed-off-by: Max Cembalest <max@nomic.ai>
Signed-off-by: mcembalest <70534565+mcembalest@users.noreply.github.com>
2024-07-02 11:41:39 -04:00
mcembalest
b85b74d5bf
docs: bump copyright year and change site_description (#2502)
Signed-off-by: Max Cembalest <max@nomic.ai>
2024-07-01 14:34:07 -04:00
mcembalest
5306595176
V3 docs max (#2488)
* new skeleton

Signed-off-by: Max Cembalest <max@nomic.ai>

* v3 docs

Signed-off-by: Max Cembalest <max@nomic.ai>

---------

Signed-off-by: Max Cembalest <max@nomic.ai>
2024-07-01 13:00:14 -04:00
Jared Van Bortel
01870b4a46
chat: fix blank device in UI and improve Mixpanel reporting (#2409)
Also remove LLModel::hasGPUDevice.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-06-26 15:26:27 -04:00
patcher9
986d9d9bb8
docs: add description of OpenLIT GPU monitoring (#2436)
Signed-off-by: patcher9 <patcher99@dokulabs.com>
2024-06-13 11:23:32 -04:00
patcher9
d43bfa0a53
docs: document OpenLIT integration (#2386)
Signed-off-by: patcher9 <patcher99@dokulabs.com>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2024-06-05 11:05:21 -04:00
woheller69
f001897a1a
Fix path in Readme (#2339)
Signed-off-by: woheller69 <68678880+woheller69@users.noreply.github.com>
2024-05-31 17:20:41 -04:00
Jared Van Bortel
09dd3dc318
python: depend on offical NVIDIA CUDA packages (#2355)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-05-20 18:06:27 -04:00
Jared Van Bortel
c779d8a32d
python: init_gpu fixes (#2368)
* python: tweak GPU init failure message
* llama.cpp: update submodule for use-after-free fix

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-05-20 18:04:11 -04:00
Jared Van Bortel
d2a99d9bc6
support the llama.cpp CUDA backend (#2310)
* rebase onto llama.cpp commit ggerganov/llama.cpp@d46dbc76f
* support for CUDA backend (enabled by default)
* partial support for Occam's Vulkan backend (disabled by default)
* partial support for HIP/ROCm backend (disabled by default)
* sync llama.cpp.cmake with upstream llama.cpp CMakeLists.txt
* changes to GPT4All backend, bindings, and chat UI to handle choice of llama.cpp backend (Kompute or CUDA)
* ship CUDA runtime with installed version
* make device selection in the UI on macOS actually do something
* model whitelist: remove dbrx, mamba, persimmon, plamo; add internlm and starcoder2

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-05-15 15:27:50 -04:00
Jared Van Bortel
86560f3952
maint: remove Docker API server and related references (#2314)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-05-09 12:50:26 -04:00
Jared Van Bortel
ba53ab5da0
python: do not print GPU name with verbose=False, expose this info via properties (#2222)
* llamamodel: only print device used in verbose mode

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: expose backend and device via GPT4All properties

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* backend: const correctness fixes

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: bump version

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: typing fixups

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: fix segfault with closed GPT4All

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

---------

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-18 14:52:02 -04:00
Jared Van Bortel
ac498f79ac
fix regressions in system prompt handling (#2219)
* python: fix system prompt being ignored
* fix unintended whitespace after system prompt

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-15 11:39:48 -04:00
Jared Van Bortel
3f8257c563
llamamodel: fix semantic typo in nomic client dynamic mode (#2216)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-12 17:25:15 -04:00
Jared Van Bortel
46818e466e
python: embedding cancel callback for nomic client dynamic mode (#2214)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-12 16:00:39 -04:00
Jared Van Bortel
459289b94c
embed4all: small fixes related to nomic client local embeddings (#2213)
* actually submit larger batches with increased n_ctx
* fix crash when llama_tokenize returns no tokens

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-12 10:54:15 -04:00
Jared Van Bortel
1b84a48c47
python: add list_gpus to the GPT4All API (#2194)
Other changes:
* fix memory leak in llmodel_available_gpu_devices
* drop model argument from llmodel_available_gpu_devices
* breaking: make GPT4All/Embed4All arguments past model_name keyword-only

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-04 14:52:13 -04:00
Jared Van Bortel
3313c7de0d
python: implement close() and context manager interface (#2177)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-28 16:48:07 -04:00
Jacob Nguyen
55f3b056b7
typescript!: chatSessions, fixes, tokenStreams (#2045)
Signed-off-by: jacob <jacoobes@sern.dev>
Signed-off-by: limez <limez@protonmail.com>
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: limez <limez@protonmail.com>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2024-03-28 12:08:23 -04:00
Jared Van Bortel
b743c588e8
python: bump version to 2.3.2 to include *all* of the bugfixes (#2171)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-26 15:26:08 -04:00
Jared Van Bortel
8d09b2c264 python: bump version
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-25 22:16:50 -07:00
Jared Van Bortel
446668674e python: use TypedDict from typing_extensions on python 3.9 and 3.10
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-25 22:16:50 -07:00
Jared Van Bortel
71db8bdc80
python: also delete partial file on KeyboardInterrupt/SystemExit (#2154)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-21 12:59:35 -04:00
Jared Van Bortel
71d7f34d1a
python: improve handling of incomplete downloads (#2152)
* make sure encoding is identity for Range requests
* use a .part file for partial downloads
* verify using file size and MD5 from models3.json

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-21 11:33:41 -04:00
Jared Van Bortel
0455b80b7f
Embed4All: optionally count tokens, misc fixes (#2145)
Key changes:
* python: optionally return token count in Embed4All.embed
* python and docs: models2.json -> models3.json
* Embed4All: require explicit prefix for unknown models
* llamamodel: fix shouldAddBOS for Bert and Nomic Bert

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-20 11:24:02 -04:00
Jared Van Bortel
a1bb6084ed
python: documentation update and typing improvements (#2129)
Key changes:
* revert "python: tweak constructor docstrings"
* docs: update python GPT4All and Embed4All documentation
* breaking: require keyword args to GPT4All.generate

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-19 17:25:22 -04:00
Jared Van Bortel
699410014a
fix non-AVX CPU detection (#2141)
* chat: fix non-AVX CPU detection on Windows
* bindings: throw exception instead of logging to console

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-19 10:56:14 -04:00
Jared Van Bortel
255568fb9a
python: various fixes for GPT4All and Embed4All (#2130)
Key changes:
* honor empty system prompt argument
* current_chat_session is now read-only and defaults to None
* deprecate fallback prompt template for unknown models
* fix mistakes from #2086

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-15 11:49:58 -04:00
Jared Van Bortel
406e88b59a
implement local Nomic Embed via llama.cpp (#2086)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-13 18:09:24 -04:00
Jared Van Bortel
d8c842263f
python: more fixes for new prompt templates (#2044)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-06 14:22:08 -05:00