Adam Treat
a13dcfb13b
Move this script and rename.
2023-05-09 11:48:32 -04:00
Adam Treat
8b80345c98
Copy pasta.
2023-05-08 19:10:22 -04:00
Adam Treat
af4a67c109
Fix for special im_end token in mpt-7b-chat model.
2023-05-08 18:57:40 -04:00
Adam Treat
d3ec333314
Allow these to load for gptj too.
2023-05-08 18:31:20 -04:00
Aaron Miller
5002614b20
mpt: allow q4_2 quantized models to load
2023-05-08 18:23:36 -04:00
Aaron Miller
832720dd27
mpt tokenizer: better special token handling
Closer to the behavior of huggingface `tokenizers`: do not attempt to
handle added tokens as if they were part of the original vocabulary,
since that cannot prevent them from being split into smaller chunks.
Instead, handle added tokens *before* the regular tokenizing pass.
Note that this is still necessary even with a "proper" tokenizer implementation.
2023-05-08 18:23:36 -04:00
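The commit above describes an ordering change: added special tokens are matched before the regular vocabulary pass so they can never be broken apart. Below is a minimal sketch of that ordering; the names (`tokenize_with_added_tokens`, the `tokenize_regular` callback) are illustrative assumptions, not the actual gpt4all tokenizer code.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Handle added tokens *before* the regular pass so they can never be split
// into smaller vocabulary chunks. `tokenize_regular` stands in for the
// ordinary vocabulary-based tokenizer.
std::vector<int32_t> tokenize_with_added_tokens(
        const std::string &text,
        const std::map<std::string, int32_t> &added_tokens,
        const std::function<std::vector<int32_t>(const std::string &)> &tokenize_regular) {
    std::vector<int32_t> out;
    std::size_t pos = 0;
    while (pos < text.size()) {
        // Find the earliest occurrence of any added token at or after `pos`.
        std::size_t best_at = std::string::npos;
        auto best = added_tokens.end();
        for (auto it = added_tokens.begin(); it != added_tokens.end(); ++it) {
            const std::size_t at = text.find(it->first, pos);
            if (at < best_at) {  // npos compares greater than any real position
                best_at = at;
                best = it;
            }
        }
        if (best == added_tokens.end()) {
            // No added token remains: run the regular pass on the rest.
            const auto tail = tokenize_regular(text.substr(pos));
            out.insert(out.end(), tail.begin(), tail.end());
            break;
        }
        if (best_at > pos) {
            // Regular pass on the plain text preceding the added token.
            const auto seg = tokenize_regular(text.substr(pos, best_at - pos));
            out.insert(out.end(), seg.begin(), seg.end());
        }
        out.push_back(best->second);        // emit the added token's id as-is
        pos = best_at + best->first.size();
    }
    return out;
}
```

Running the added-token scan first is what keeps a marker such as an im_end token intact; the regular pass only ever sees the text between markers.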
Adam Treat
8c4b8f215f
Fix gptj to have lower memory requirements for the kv cache, and add versioning to the internal state so that a fix like this can be handled smoothly in the future.
2023-05-08 17:23:02 -04:00
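Because the kv cache fix changes the layout of the saved internal state, the commit above also versions that state. A sketch of what such versioning could look like follows; the constant, function names, and layout are assumptions for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical layout: a version word precedes the serialized kv cache so a
// layout change (such as a smaller kv cache) can be detected on restore.
constexpr uint32_t kInternalStateVersion = 2;

size_t save_internal_state(const std::vector<uint8_t> &kv_bytes, uint8_t *dst) {
    std::memcpy(dst, &kInternalStateVersion, sizeof(kInternalStateVersion));
    std::memcpy(dst + sizeof(kInternalStateVersion), kv_bytes.data(), kv_bytes.size());
    return sizeof(kInternalStateVersion) + kv_bytes.size();
}

bool restore_internal_state(const uint8_t *src, size_t n, std::vector<uint8_t> &kv_bytes) {
    uint32_t version = 0;
    if (n < sizeof(version)) return false;
    std::memcpy(&version, src, sizeof(version));
    if (version != kInternalStateVersion)
        return false;  // older layout: rebuild the state rather than misread it
    kv_bytes.assign(src + sizeof(version), src + n);
    return true;
}
```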
Adam Treat
a549871220
Remove, as upstream has removed it.
2023-05-08 15:09:23 -04:00
Adam Treat
992e553cfa
Update to the alibi version that Zach made.
2023-05-08 12:27:01 -04:00
Adam Treat
98aedd2173
Match Helly's impl of kv cache.
2023-05-08 12:21:30 -04:00
Adam Treat
eb77d5157b
Use F16 for kv cache on mpt.
2023-05-08 12:21:30 -04:00
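Storing the kv cache in F16 rather than F32 halves its memory footprint. A sketch of the allocation, assuming the ggml tensor API these backends use (`ggml_new_tensor_1d`, `GGML_TYPE_F16`); the wrapper function and its sizing formula are illustrative.

```cpp
#include <cstdint>
#include "ggml.h"

// Allocate the key/value cache tensors as F16 instead of F32, halving the
// memory the cache needs (the sizing formula here is illustrative).
void alloc_kv_cache(struct ggml_context *ctx,
                    struct ggml_tensor **memory_k,
                    struct ggml_tensor **memory_v,
                    int64_t n_layer, int64_t n_ctx, int64_t n_embd) {
    const int64_t n_elements = n_layer * n_ctx * n_embd;
    *memory_k = ggml_new_tensor_1d(ctx, GGML_TYPE_F16, n_elements);
    *memory_v = ggml_new_tensor_1d(ctx, GGML_TYPE_F16, n_elements);
}
```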
Adam Treat
dc559c1575
Fix for special tokens.
2023-05-08 12:21:30 -04:00
Adam Treat
b6886c0e31
Fix up mpt.
2023-05-08 12:21:30 -04:00
Zach Nussbaum
61e2aabadb
fix: helly changes
2023-05-08 12:21:30 -04:00
Zach Nussbaum
d30be81506
fix: model loading
2023-05-08 12:21:30 -04:00
Zach Nussbaum
6a56bcaf06
feat: load model
2023-05-08 12:21:30 -04:00
Zach Nussbaum
2f6ecbe798
feat: build works + tokenizer
2023-05-08 12:21:30 -04:00
Zach Nussbaum
525b703984
feat: add ln 2, rename vars
2023-05-08 12:21:30 -04:00
Zach Nussbaum
aef524b460
feat: mpt wip
2023-05-08 12:21:30 -04:00
Adam Treat
159053be5a
Scaffolding for the mpt <-> ggml project.
2023-05-08 12:21:30 -04:00
Adam Treat
6d943917f1
Fail early/gracefully if incompatible hardware is detected, and default to universal builds on Mac.
2023-05-08 08:23:00 -04:00
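A sketch of an early hardware check, assuming the incompatibility in question is a missing AVX instruction set on x86; the message and exit policy are illustrative, and the builtins shown are GCC/Clang.

```cpp
#include <cstdio>
#include <cstdlib>

// Check required CPU features at startup and exit with a clear message
// instead of crashing later on an illegal instruction.
static void check_required_cpu_features() {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (!__builtin_cpu_supports("avx")) {
        std::fprintf(stderr,
                     "This build requires a CPU with AVX support; exiting.\n");
        std::exit(EXIT_FAILURE);
    }
#endif
}
```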
Adam Treat
7b66cb7119
Add debug output for chatllm model loading and fix the order of removing the dummy chat when no models are restored.
2023-05-07 14:40:02 -04:00
Adam Treat
d0d5d84e06
Add reverse prompt support for gptj too.
2023-05-05 11:16:24 -04:00
Adam Treat
659442394f
Persistent state for gpt-j models too.
2023-05-05 10:00:17 -04:00
Aaron Miller
019f6d0103
include <cstdint> in llmodel.h
2023-05-04 20:36:19 -04:00
Adam Treat
f291853e51
First attempt at providing a persistent chat list experience.
Limitations:
1) Context is not restored for gpt-j models.
2) When you switch between different model types in an existing chat,
   the context and the entire conversation are lost.
3) The settings are not chat- or conversation-specific.
4) The persisted chat files are very large because of how much data the
   llama.cpp backend tries to persist. Need to investigate how we can
   shrink this.
2023-05-04 15:31:41 -04:00
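A sketch of how a chat could be persisted, assuming the Qt-based chat client: the conversation items plus an opaque blob of backend state written through QDataStream. The struct and function names are hypothetical; the opaque state blob is what makes the persisted files large (limitation 4 above).

```cpp
#include <QByteArray>
#include <QDataStream>
#include <QFile>
#include <QString>
#include <QVector>

// Hypothetical chat item; the real structure in the chat client may differ.
struct ChatItem {
    QString prompt;
    QString response;
};

// Persist the conversation plus the backend's raw state blob.
bool saveChat(const QString &path, const QVector<ChatItem> &items,
              const QByteArray &modelState) {
    QFile file(path);
    if (!file.open(QIODevice::WriteOnly))
        return false;
    QDataStream out(&file);
    out << qint32(items.size());
    for (const ChatItem &item : items)
        out << item.prompt << item.response;
    out << modelState;  // opaque llama.cpp / gpt-j state; dominates file size
    return out.status() == QDataStream::Ok;
}
```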
Adam Treat
82c1d08b33
Add reverse prompts for llama models.
2023-05-03 11:58:26 -04:00
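Reverse prompts act as stop strings: generation halts once the output so far ends with one of them. A minimal sketch of that check (names are illustrative):

```cpp
#include <string>
#include <vector>

// Returns true when the generated text ends with any reverse prompt,
// signalling that generation should stop and control return to the user.
bool hitReversePrompt(const std::string &generated,
                      const std::vector<std::string> &reversePrompts) {
    for (const std::string &rp : reversePrompts) {
        if (!rp.empty() && rp.size() <= generated.size() &&
            generated.compare(generated.size() - rp.size(), rp.size(), rp) == 0)
            return true;
    }
    return false;
}
```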
Adam Treat
8fe60c29fb
Don't set the app version in the llmodel.
2023-04-29 10:31:12 -04:00
Adam Treat
69f92d8ea8
Load models from filepath only.
2023-04-28 20:15:10 -04:00
Adam Treat
d982dc0529
Update to latest llama.cpp
2023-04-28 11:03:16 -04:00
Adam Treat
5a7d40f604
Move the saving of the tokens into the impl rather than making it the callback's responsibility.
2023-04-27 11:16:51 -04:00
Adam Treat
ba4b28fcd5
Move the promptCallback to its own function.
2023-04-27 11:08:15 -04:00
Adam Treat
0e9f85bcda
Provide an initial impl. of the C interface. NOTE: has not been tested.
2023-04-27 09:43:24 -04:00
Adam Treat
b19d2f2c21
Add this and unbreak the build.
2023-04-26 22:45:10 -04:00
Adam Treat
ee5c58c26c
Initial support for opt-in telemetry.
2023-04-26 22:05:56 -04:00
Adam Treat
a3d97fa009
Don't crash when the prompt is too large.
2023-04-26 19:08:37 -04:00
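One way to avoid crashing on an oversized prompt is to clamp it to the context window before it reaches the backend. The sketch below assumes a hypothetical `n_ctx` limit and a truncation policy chosen only for illustration.

```cpp
#include <cstdio>
#include <vector>

// Check the tokenized prompt against the context window up front instead of
// letting the backend overrun its kv cache.
bool preparePrompt(std::vector<int> &promptTokens, int n_ctx) {
    const int maxPromptTokens = n_ctx - 4;  // leave room for generated tokens
    if (maxPromptTokens <= 0)
        return false;
    if ((int)promptTokens.size() > maxPromptTokens) {
        std::fprintf(stderr,
                     "warning: prompt of %zu tokens exceeds limit of %d; truncating\n",
                     promptTokens.size(), maxPromptTokens);
        promptTokens.resize(maxPromptTokens);
    }
    return true;
}
```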
Adam Treat
7da3bc07cc
Update llama.cpp submodule to latest.
2023-04-26 11:50:05 -04:00
Adam Treat
fd0f92a94e
Clean up the docs a bit more still.
2023-04-26 08:22:38 -04:00
Adam Treat
c89096ccb4
Clean up the docs a bit more.
2023-04-26 08:22:38 -04:00
Adam Treat
ac7ecd2cef
Clean up the docs a bit.
2023-04-26 08:22:38 -04:00
Adam Treat
832b5d1a96
Only need one opaque pointer.
2023-04-26 08:22:38 -04:00
Adam Treat
102f68b18c
Fix up the API a bit.
2023-04-26 08:22:38 -04:00
Adam Treat
3c9139b5d2
Move the backend code into its own subdirectory and make it a shared library. Begin fleshing out the C API wrapper that bindings can use.
2023-04-26 08:22:38 -04:00
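The point of the C API wrapper is that language bindings only ever see a C ABI and a single opaque pointer hiding the C++ backend. A sketch of that pattern follows; every name here is illustrative, not the actual llmodel C header.

```cpp
#include <string>

// Stand-in for the real C++ backend class.
struct LLModelBackend {
    bool load(const std::string &path) { return !path.empty(); }
};

extern "C" {

typedef void *llmodel_handle;  // the one opaque pointer bindings hold on to

llmodel_handle llmodel_create(const char *model_path) {
    auto *m = new LLModelBackend();
    if (!m->load(model_path ? model_path : "")) {
        delete m;
        return nullptr;
    }
    return m;  // C callers never see the C++ type behind this pointer
}

void llmodel_destroy(llmodel_handle h) {
    delete static_cast<LLModelBackend *>(h);
}

}  // extern "C"
```

Building this as a shared library lets bindings in other languages load it through their native FFI without dealing with C++ name mangling.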