Jared Van Bortel
be66ec8ab5
chat: faster KV shift, continue generating, fix stop sequences ( #2781 )
...
* Don't stop generating at end of context
* Use llama_kv_cache ops to shift context
* Fix and improve reverse prompt detection
* Replace prompt recalc callback with a flag to disallow context shift
2024-08-07 11:25:24 -04:00
Jared Van Bortel
51bd01ae05
backend: fix extra spaces in tokenization and a CUDA crash ( #2778 )
...
Also potentially improves accuracy of BOS insertion, token cache, and logit indexing.
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-08-01 10:46:36 -04:00
不知火 Shiranui
f9cd2e321c
feat: add openai-compatible api models ( #2683 )
...
Signed-off-by: Shiranui <supersonic@livemail.tw>
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2024-07-25 10:02:52 -04:00
AT
765e055597
Change the timeout for circle ci and add a fixme. ( #2722 )
...
Signed-off-by: Adam Treat <treat.adam@gmail.com>
2024-07-23 17:01:46 -04:00
Jared Van Bortel
56d5a23001
chatllm: fix loading of chats after #2676 ( #2693 )
...
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-07-18 21:03:18 -04:00
AT
ca72428783
Remove support for GPT-J models. ( #2676 )
...
Signed-off-by: Adam Treat <treat.adam@gmail.com>
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2024-07-17 16:07:37 -04:00
AT
66bc04aa8e
chat: generate follow-up questions after response ( #2634 )
...
* user can configure the prompt and when they appear
* also make the name generation prompt configurable
Signed-off-by: Adam Treat <treat.adam@gmail.com>
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2024-07-10 15:45:20 -04:00
Jared Van Bortel
2c8d634b5b
UI and embedding device changes for GPT4All v3.0.0-rc3 ( #2477 )
...
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-06-28 12:57:57 -04:00
Jared Van Bortel
01870b4a46
chat: fix blank device in UI and improve Mixpanel reporting ( #2409 )
...
Also remove LLModel::hasGPUDevice.
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-06-26 15:26:27 -04:00
AT
9273b49b62
chat: major UI redesign for v3.0.0 ( #2396 )
...
Signed-off-by: Adam Treat <treat.adam@gmail.com>
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2024-06-24 18:49:23 -04:00
Jared Van Bortel
41c9013fa4
chat: don't use incomplete types with signals/slots/Q_INVOKABLE ( #2408 )
...
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-06-06 11:59:28 -04:00
Jared Van Bortel
d3d777bc51
chat: fix #includes with include-what-you-use ( #2401 )
...
Also use qGuiApp instead of qApp.
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-06-04 14:47:11 -04:00
Jared Van Bortel
d2a99d9bc6
support the llama.cpp CUDA backend ( #2310 )
...
* rebase onto llama.cpp commit ggerganov/llama.cpp@d46dbc76f
* support for CUDA backend (enabled by default)
* partial support for Occam's Vulkan backend (disabled by default)
* partial support for HIP/ROCm backend (disabled by default)
* sync llama.cpp.cmake with upstream llama.cpp CMakeLists.txt
* changes to GPT4All backend, bindings, and chat UI to handle choice of llama.cpp backend (Kompute or CUDA)
* ship CUDA runtime with installed version
* make device selection in the UI on macOS actually do something
* model whitelist: remove dbrx, mamba, persimmon, plamo; add internlm and starcoder2
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-05-15 15:27:50 -04:00
Jared Van Bortel
7e1e00f331
chat: fix issues with quickly switching between multiple chats ( #2343 )
...
* prevent load progress from getting out of sync with the current chat
* fix memory leak on exit if the LLModelStore contains a model
* do not report cancellation as a failure in console/Mixpanel
* show "waiting for model" separately from "switching context" in UI
* do not show lower "reload" button on error
* skip context switch if unload is pending
* skip unnecessary calls to LLModel::saveState
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-05-15 14:07:03 -04:00
Jared Van Bortel
7f1c3d4275
chatllm: fix model loading progress showing "Reload" sometimes ( #2337 )
...
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-05-15 13:57:53 -04:00
Jared Van Bortel
5fb9d17c00
chatllm: use a better prompt for the generated chat name ( #2322 )
...
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-05-09 09:38:19 -04:00
Jared Van Bortel
adaecb7a72
mixpanel: improved GPU device statistics (plus GPU sort order fix) ( #2297 )
...
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-05-01 16:15:48 -04:00
Jared Van Bortel
c622921894
improve mixpanel usage statistics ( #2238 )
...
Other changes:
- Always display first start dialog if privacy options are unset (e.g. if the user closed GPT4All without selecting them)
- LocalDocs scanQueue is now always deferred
- Fix a potential crash in magic_match
- LocalDocs indexing is now started after the first start dialog is dismissed so usage stats are included
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-25 13:16:52 -04:00
Jared Van Bortel
271d752701
localdocs: small but important fixes to local docs ( #2236 )
...
* chat: use .rmodel extension for Nomic Embed
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
* database: fix order of SQL arguments in updateDocument
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
---------
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-18 14:51:13 -04:00
Jared Van Bortel
ac498f79ac
fix regressions in system prompt handling ( #2219 )
...
* python: fix system prompt being ignored
* fix unintended whitespace after system prompt
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-15 11:39:48 -04:00
Olyxz16
2c0a660e6e
feat: Add support for Mistral API models ( #2053 )
...
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Signed-off-by: Cédric Sazos <cedric.sazos@tutanota.com>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2024-03-13 18:23:57 -04:00
Jared Van Bortel
406e88b59a
implement local Nomic Embed via llama.cpp ( #2086 )
...
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-13 18:09:24 -04:00
Xu Zhen
0072860d24
Fix compatibility with Qt 6.4
...
Signed-off-by: Xu Zhen <xuzhen@users.noreply.github.com>
2024-03-12 07:42:22 -05:00
Adam Treat
17dee02287
Fix for issue #2080 where the GUI appears to hang when a chat with a large
...
model is deleted. There is no reason to save the context for a chat that
is being deleted.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
2024-03-06 16:52:17 -06:00
Jared Van Bortel
44717682a7
chat: implement display of model loading warnings ( #2034 )
...
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-06 17:14:54 -05:00
Jared Van Bortel
a0bd96f75d
chat: join ChatLLM threads without calling destructors ( #2043 )
...
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-06 16:42:59 -05:00
Jared Van Bortel
2a91ffd73f
chatllm: fix undefined behavior in resetContext
...
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-06 12:54:19 -06:00
chrisbarrera
f8b1069a1c
add min_p sampling parameter ( #2014 )
...
Signed-off-by: Christopher Barrera <cb@arda.tx.rr.com>
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
2024-02-24 17:51:34 -05:00
Adam Treat
67bbce43ab
Fix state issues with reloading model.
...
Signed-off-by: Adam Treat <treat.adam@gmail.com>
2024-02-21 16:05:49 -05:00
Jared Van Bortel
4fc4d94be4
fix chat-style prompt templates ( #1970 )
...
Also use a new version of Mistral OpenOrca.
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-02-21 15:45:32 -05:00
Adam Treat
fa0a2129dc
Don't try and detect model load error on startup.
...
Signed-off-by: Adam Treat <treat.adam@gmail.com>
2024-02-21 10:15:20 -06:00
Adam Treat
67099f80ba
Add comment to make this clear.
...
Signed-off-by: Adam Treat <treat.adam@gmail.com>
2024-02-21 10:15:20 -06:00
Adam Treat
d948a4f2ee
Complete revamp of model loading to allow for more discreet control by
...
the user of the models loading behavior.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
2024-02-21 10:15:20 -06:00
Adam Treat
4461af35c7
Fix includes.
...
Signed-off-by: Adam Treat <treat.adam@gmail.com>
2024-02-05 16:46:16 -05:00
Jared Van Bortel
10e3f7bbf5
Fix VRAM leak when model loading fails ( #1901 )
...
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-02-01 15:45:45 -05:00
Adam Treat
d14b95f4bd
Add Nomic Embed model for atlas with localdocs.
2024-01-31 22:22:08 -05:00
Jared Van Bortel
061d1969f8
expose n_gpu_layers parameter of llama.cpp ( #1890 )
...
Also dynamically limit the GPU layers and context length fields to the maximum supported by the model.
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-01-31 14:17:44 -05:00
Jared Van Bortel
c7ea283f1f
chatllm: fix deserialization version mismatch ( #1859 )
...
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-01-22 10:01:31 -05:00
Jared Van Bortel
d1c56b8b28
Implement configurable context length ( #1749 )
2023-12-16 17:58:15 -05:00
Jared Van Bortel
0600f551b3
chatllm: do not attempt to serialize incompatible state ( #1742 )
2023-12-12 11:45:03 -05:00
Adam Treat
fb3b1ceba2
Do not attempt to do a blocking retrieval if we don't have any collections.
2023-12-04 12:58:40 -05:00
Moritz Tim W
012f399639
fix typo ( #1697 )
2023-11-30 12:37:52 -05:00
Adam Treat
9e27a118ed
Fix system prompt.
2023-11-21 10:42:12 -05:00
Adam Treat
5c0d077f74
Remove leading whitespace in responses.
2023-10-28 16:53:42 -04:00
Adam Treat
dc2e7d6e9b
Don't start recalculating context immediately upon switching to a new chat
...
but rather wait until the first prompt. This allows users to switch between
chats fast and to delete chats more easily.
Fixes issue #1545
2023-10-28 16:41:23 -04:00
cebtenzzre
4338e72a51
MPT: use upstream llama.cpp implementation ( #1515 )
2023-10-19 15:25:17 -04:00
cebtenzzre
04499d1c7d
chatllm: do not write uninitialized data to stream ( #1486 )
2023-10-11 11:31:34 -04:00
Adam Treat
f0742c22f4
Restore state from text if necessary.
2023-10-11 09:16:02 -04:00
Adam Treat
b2cd3bdb3f
Fix crasher with an empty string for prompt template.
2023-10-06 12:44:53 -04:00
Cebtenzzre
5fe685427a
chat: clearer CPU fallback messages
2023-10-06 11:35:14 -04:00