Adam Treat
d3ec333314
Allow these to load for gptj too.
2023-05-08 18:31:20 -04:00
Aaron Miller
5002614b20
mpt: allow q4_2 quantized models to load
2023-05-08 18:23:36 -04:00
Aaron Miller
832720dd27
mpt tokenizer: better special token handling
Closer to the behavior of huggingface `tokenizers`:
do not attempt to handle additional tokens as if they were part
of the original vocabulary, as this cannot prevent them from being
split into smaller chunks. Handle added tokens *before*
the regular tokenizing pass.
Note this is still necessary even with a "proper" tokenizer implementation.
2023-05-08 18:23:36 -04:00
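The idea in the commit above (handle added tokens before the regular tokenizing pass) can be sketched roughly as follows. This is an illustrative Python sketch, not the project's C++ code: the function names, the `base_tokenize` callback, and the token ids are all hypothetical.

```python
import re
from typing import Callable, Dict, List


def split_on_added_tokens(text: str, added_tokens: List[str]) -> List[str]:
    """Split text so that every added token becomes its own chunk."""
    if not added_tokens:
        return [text] if text else []
    # Try longer tokens first so a token is not shadowed by one of its prefixes.
    ordered = sorted(added_tokens, key=len, reverse=True)
    pattern = "(" + "|".join(re.escape(tok) for tok in ordered) + ")"
    # The capture group makes re.split keep the matched tokens in the result.
    return [chunk for chunk in re.split(pattern, text) if chunk]


def tokenize_with_added_tokens(
    text: str,
    added_tokens: Dict[str, int],
    base_tokenize: Callable[[str], List[int]],
) -> List[int]:
    """Map added tokens straight to their ids; tokenize everything else normally."""
    ids: List[int] = []
    for chunk in split_on_added_tokens(text, list(added_tokens)):
        if chunk in added_tokens:
            # Added tokens never reach the regular pass, so they cannot be split.
            ids.append(added_tokens[chunk])
        else:
            ids.extend(base_tokenize(chunk))
    return ids
```

Because the added tokens are cut out first, the regular pass only ever sees the text between them, which is why the step is needed regardless of how the base tokenizer is implemented.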
Adam Treat
8c4b8f215f
Fix gptj to have lower memory requirements for kv cache and add versioning to the internal state to smoothly handle such a fix in the future.
2023-05-08 17:23:02 -04:00
Adam Treat
ccbd16cf18
Fix the version.
2023-05-08 16:50:21 -04:00
Adam Treat
a549871220
Remove as upstream has removed.
2023-05-08 15:09:23 -04:00
Adam Treat
dfe85386b5
This shouldn't have snuck in.
2023-05-08 15:09:23 -04:00
Adam Treat
992e553cfa
Update to the alibi version that Zach made.
2023-05-08 12:27:01 -04:00
Adam Treat
98aedd2173
Match Helly's impl of kv cache.
2023-05-08 12:21:30 -04:00
Adam Treat
eb77d5157b
Use F16 for kv cache on mpt.
2023-05-08 12:21:30 -04:00
Adam Treat
dc559c1575
Fix for special tokens.
2023-05-08 12:21:30 -04:00
Adam Treat
b6886c0e31
Fix up mpt.
2023-05-08 12:21:30 -04:00
Zach Nussbaum
61e2aabadb
fix: helly changes
2023-05-08 12:21:30 -04:00
Zach Nussbaum
d30be81506
fix: model loading
2023-05-08 12:21:30 -04:00
Zach Nussbaum
f732ba2d56
fix: convert script working
2023-05-08 12:21:30 -04:00
Zach Nussbaum
6a56bcaf06
feat: load model
2023-05-08 12:21:30 -04:00
Zach Nussbaum
58069dc8b9
chore: import for mpt
2023-05-08 12:21:30 -04:00
Zach Nussbaum
03bde18e49
feat: mpt convert from hf to ggml
2023-05-08 12:21:30 -04:00
Zach Nussbaum
2f6ecbe798
feat: build works + tokenizer
2023-05-08 12:21:30 -04:00
Zach Nussbaum
525b703984
feat: add ln 2, rename vars
2023-05-08 12:21:30 -04:00
Zach Nussbaum
aef524b460
feat: mpt wip
2023-05-08 12:21:30 -04:00
Adam Treat
159053be5a
Scaffolding for the mpt <-> ggml project.
2023-05-08 12:21:30 -04:00
Adam Treat
40b976436a
Only generate three words max.
2023-05-08 12:21:30 -04:00
Adam Treat
49a6a6ed65
Restore defaults for repeat penalty too.
2023-05-08 12:21:30 -04:00
Adam Treat
c054efa6ac
Send info on how many are running into this error.
2023-05-08 08:31:35 -04:00
Adam Treat
6d943917f1
Fail early and gracefully if incompatible hardware is detected, and default to universal builds on Mac.
2023-05-08 08:23:00 -04:00
Adam Treat
3c30310539
Convert the old format properly.
2023-05-08 05:53:16 -04:00
Adam Treat
7b66cb7119
Add debug for chatllm model loading and fix order of getting rid of the dummy chat when no models are restored.
2023-05-07 14:40:02 -04:00
Adam Treat
9bd5609ba0
Deserialize one at a time and don't block gui until all of them are done.
2023-05-07 09:20:09 -04:00
Adam Treat
86da175e1c
Use last lts for this.
2023-05-07 06:39:32 -04:00
Adam Treat
ab13148430
The GUI should come up immediately and not wait on deserializing from disk.
2023-05-06 20:01:14 -04:00
Adam Treat
eb7b61a76d
Move the location of the chat files to the model download directory and add a magic+version.
2023-05-06 18:51:49 -04:00
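A magic number plus a version at the start of a file lets a reader reject foreign files and handle older layouts. The sketch below shows the general pattern only; the magic value, the version number, and the little-endian 32-bit layout are assumptions and are not taken from the actual chat file format.

```python
import io
import struct

MAGIC = 0x43484154       # hypothetical magic value, not the project's
CURRENT_VERSION = 1      # hypothetical version number
HEADER = struct.Struct("<II")  # assumed layout: two little-endian uint32 fields


def write_header(stream, version: int = CURRENT_VERSION) -> None:
    """Write the magic and version before any payload."""
    stream.write(HEADER.pack(MAGIC, version))


def read_header(stream) -> int:
    """Validate the magic and return the stored version."""
    raw = stream.read(HEADER.size)
    if len(raw) != HEADER.size:
        raise ValueError("file too short to contain a header")
    magic, version = HEADER.unpack(raw)
    if magic != MAGIC:
        raise ValueError("not a chat file (bad magic)")
    if version > CURRENT_VERSION:
        raise ValueError("chat file was written by a newer version")
    return version


# Round trip through an in-memory buffer.
buf = io.BytesIO()
write_header(buf)
buf.seek(0)
stored_version = read_header(buf)
```

A reader can branch on the returned version to convert older files instead of failing on them.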
Aaron Miller
7a8f437f8f
add name to LICENSE
2023-05-06 13:11:39 -04:00
Adam Treat
e397fda250
Bump the version and save up to an order of magnitude of disk space for chat files.
2023-05-05 20:12:00 -04:00
Adam Treat
8d2c8c8cb0
Turn off saving chats to disk by default as it eats so much disk space.
2023-05-05 12:30:11 -04:00
Adam Treat
6d4d86d07c
Bump the version.
2023-05-05 11:43:25 -04:00
Adam Treat
d0d5d84e06
Add reverse prompt support for gptj too.
2023-05-05 11:16:24 -04:00
Adam Treat
06bb6960d4
Add about dialog.
2023-05-05 10:47:05 -04:00
Adam Treat
659442394f
Persistent state for gpt-j models too.
2023-05-05 10:00:17 -04:00
Adam Treat
5b71d39024
Don't crash if state has not been set.
2023-05-05 10:00:17 -04:00
Richard Guo
7ab7d948b5
Update monorepo_plan.md
2023-05-05 09:32:45 -04:00
Aaron Miller
019f6d0103
include <cstdint> in llmodel.h
2023-05-04 20:36:19 -04:00
Adam Treat
f291853e51
First attempt at providing a persistent chat list experience.
Limitations:
1) Context is not restored for gpt-j models
2) When you switch between different model types in an existing chat,
the context and the entire conversation are lost
3) The settings are not chat or conversation specific
4) The sizes of the chat persisted files are very large due to how much
data the llama.cpp backend tries to persist. Need to investigate how
we can shrink this.
2023-05-04 15:31:41 -04:00
Adam Treat
081d32bd97
Restore the model when switching chats.
2023-05-03 12:45:14 -04:00
Adam Treat
0bb52fc5fe
Experiment with a much shorter default prompt template.
2023-05-03 12:19:14 -04:00
Adam Treat
82c1d08b33
Add reverse prompts for llama models.
2023-05-03 11:58:26 -04:00
Adam Treat
01accf9e33
Don't exceed the window size for dialogs.
2023-05-03 08:37:45 -04:00
Adam Treat
0f70289ba4
Changes the datalake feature so all conversations are captured when opted-in.
2023-05-03 07:54:45 -04:00
Aaron Miller
edad3baa99
download: make model downloads resumable
* save files as `incomplete-{filename}` in the dest folder
* rename into place after the hash is confirmed, or delete if the hash is bad
* resume downloads using the HTTP `Range` header
* if a download is resumed from a different app session, rewind a bit;
this deals with the case where the file size changes before
the content is fully flushed out
* flush the dest file at the end of readyRead; this mitigates the above
and provides backpressure on the download if the destination disk
is slower than the network connection
2023-05-02 20:36:25 -04:00
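The bookkeeping described in the commit above can be sketched without any network code. This is an illustrative Python sketch, not the application's Qt implementation: the helper names, the 4096-byte rewind, and the use of SHA-256 are assumptions (the commit only says "rewind a bit" and "hash").

```python
import hashlib
import os

REWIND_BYTES = 4096  # assumed amount; the commit does not state a value


def incomplete_path(dest_dir: str, filename: str) -> str:
    """Partial downloads live next to the final file under a prefixed name."""
    return os.path.join(dest_dir, "incomplete-" + filename)


def resume_offset(path: str, same_session: bool, rewind: int = REWIND_BYTES) -> int:
    """Return the byte offset from which the download should continue."""
    if not os.path.exists(path):
        return 0
    size = os.path.getsize(path)
    # Across sessions the tail may not have been fully flushed, so back up a bit.
    return size if same_session else max(0, size - rewind)


def range_header(offset: int) -> dict:
    """Build the HTTP Range request header for resuming at offset."""
    return {"Range": f"bytes={offset}-"} if offset > 0 else {}


def prepare_resume(path: str, same_session: bool) -> dict:
    """Drop the untrusted tail of the partial file and return request headers."""
    offset = resume_offset(path, same_session)
    if os.path.exists(path):
        with open(path, "r+b") as f:
            f.truncate(offset)
    return range_header(offset)


def finalize(dest_dir: str, filename: str, expected_hex: str) -> bool:
    """Rename into place if the hash matches, otherwise delete the partial file."""
    tmp = incomplete_path(dest_dir, filename)
    digest = hashlib.sha256()  # assumed algorithm
    with open(tmp, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            digest.update(block)
    if digest.hexdigest() == expected_hex:
        os.replace(tmp, os.path.join(dest_dir, filename))
        return True
    os.remove(tmp)
    return False
```

A caller would append the response body to the partial file and flush after each received chunk, so the size on disk stays close to what was actually received.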
Adam Treat
4a09f0f0ec
More extensive usage stats to help diagnose errors and problems in the ui.
2023-05-02 20:31:17 -04:00