Commit Graph

1140 Commits

Author SHA1 Message Date
Andriy Mulyar
633e2a2137
GPT4All API Scaffolding. Matches OpenAI OpenAPI spec for chats and completions (#839)
* GPT4All API Scaffolding. Matches OpenAI OpenAPI spec for engines, chats and completions

* Edits for docker building

* FastAPI app builds and pydantic models are accurate

* Added groovy download into dockerfile

* improved dockerfile

* Chat completions endpoint edits

* API unit test sketch

* Working example of groovy inference with OpenAI API

* Added lines to test

* Set default to mpt
2023-06-28 14:28:52 -04:00
Andriy Mulyar
6b8456bf99
Update README.md (#1086)
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-06-28 12:15:05 -04:00
Adam Treat
e70899a26c Make the retrieval/parsing of models.json sync on startup. We were jumping through too many hoops to mitigate the async behavior. 2023-06-28 12:32:22 -03:00
Adam Treat
9560336490 Match on the filename too for server mode. 2023-06-28 09:20:05 -04:00
Aaron Miller
28d41d4f6d
falcon: use *model-local* eval & scratch bufs (#1079)
fixes memory leaks copied from ggml/examples based implementation
2023-06-27 16:09:11 -07:00
Adam Treat
58cd346686 Bump release again and new release notes. 2023-06-27 18:01:23 -04:00
Adam Treat
0f8f364d76 Fix mac again for falcon. 2023-06-27 17:20:40 -04:00
Adam Treat
8aae4e52b3 Fix for falcon on mac. 2023-06-27 17:13:13 -04:00
Adam Treat
9375c71aa7 New release notes for 2.4.9 and bump version. 2023-06-27 17:01:49 -04:00
Adam Treat
71449bbc4b Fix this correctly? 2023-06-27 16:01:11 -04:00
Adam Treat
07a5405618 Make it clear this is our finetune. 2023-06-27 15:33:38 -04:00
Adam Treat
189ac82277 Fix server mode. 2023-06-27 15:01:16 -04:00
Adam Treat
b56cc61ca2 Don't allow setting an invalid prompt template. 2023-06-27 14:52:44 -04:00
Adam Treat
0780393d00 Don't use local. 2023-06-27 14:13:42 -04:00
Adam Treat
924efd9e25 Add falcon to our models.json 2023-06-27 13:56:16 -04:00
Adam Treat
d3b8234106 Fix spelling. 2023-06-27 14:23:56 -03:00
Adam Treat
42c0a6673a Don't persist the force metal setting. 2023-06-27 14:23:56 -03:00
Adam Treat
267601d670 Enable the force metal setting. 2023-06-27 14:23:56 -03:00
Zach Nussbaum
2565f6a94a feat: add conversion script 2023-06-27 14:06:39 -03:00
Aaron Miller
e22dd164d8 add falcon to chatllm::serialize 2023-06-27 14:06:39 -03:00
Aaron Miller
198b5e4832 add Falcon 7B model
Tested with https://huggingface.co/TheBloke/falcon-7b-instruct-GGML/blob/main/falcon7b-instruct.ggmlv3.q4_0.bin
2023-06-27 14:06:39 -03:00
AMOGUS
b8464073b8
Update gpt4all_chat.md (#1050)
* Update gpt4all_chat.md

Cleaned up and made the sideloading part more readable, also moved Replit architecture to supported ones. (+ renamed all "ggML" to "GGML" because who calls it "ggML"??)

Signed-off-by: AMOGUS <137312610+Amogus8P@users.noreply.github.com>

* Removed the prefixing part

Signed-off-by: AMOGUS <137312610+Amogus8P@users.noreply.github.com>

* Bump version

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>

---------

Signed-off-by: AMOGUS <137312610+Amogus8P@users.noreply.github.com>
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
Co-authored-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-06-27 10:49:45 -04:00
Adam Treat
985d3bbfa4 Add Orca models to list. 2023-06-27 09:38:43 -04:00
Adam Treat
8558fb4297 Fix models.json for spanning multiple lines with string. 2023-06-26 21:35:56 -04:00
Adam Treat
c24ad02a6a Wait just a bit to set the model name so that we can display the proper name instead of filename. 2023-06-26 21:00:09 -04:00
Aaron Miller
db34a2f670 llmodel: skip attempting Metal if model+kvcache > 53% of system ram 2023-06-26 19:46:49 -03:00
Adam Treat
57fa8644d6 Make spelling check happy. 2023-06-26 17:56:56 -04:00
Adam Treat
d0a3e82ffc Restore feature I accidentally erased in modellist update. 2023-06-26 17:50:45 -04:00
Aaron Miller
b19a3e5b2c add requiredMem method to llmodel impls
Most of these can just shortcut out of the model loading logic. llama is a bit worse to deal with because we submodule it, so I have to at least parse the hparams, and then I just use the size on disk as an estimate for the mem size (which seems reasonable since we mmap() the llama files anyway).
2023-06-26 18:27:58 -03:00
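The two memory-related commits above (b19a3e5b2c and db34a2f670) can be illustrated with a minimal sketch. This is not the llmodel code, which lives in the C++ backend; the function names `required_mem_estimate` and `should_try_metal` and their parameters are hypothetical. Only two details come from the commit messages: the size on disk is used as the memory estimate for mmap()ed llama files, and Metal is skipped when model plus KV cache exceed 53% of system RAM.

```python
import os

# Ratio taken from the db34a2f670 commit message; everything else is assumed.
METAL_RAM_FRACTION = 0.53


def required_mem_estimate(model_path: str) -> int:
    """Hypothetical helper: use the file size on disk as the memory estimate.

    The b19a3e5b2c message argues this is reasonable for llama files
    because they are mmap()ed anyway.
    """
    return os.path.getsize(model_path)


def should_try_metal(model_bytes: int, kvcache_bytes: int,
                     system_ram_bytes: int) -> bool:
    """Hypothetical helper: False when model + KV cache exceed 53% of RAM."""
    if system_ram_bytes <= 0:
        raise ValueError("system_ram_bytes must be positive")
    needed = model_bytes + kvcache_bytes
    return not (needed > METAL_RAM_FRACTION * system_ram_bytes)
```

With these assumptions, a 4 GiB model plus a 1 GiB cache on a 16 GiB machine stays under the threshold, while an 8 GiB model with the same cache does not.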
Adam Treat
dead954134 Fix save chats setting. 2023-06-26 16:43:37 -04:00
Adam Treat
26c9193227 Sigh. Windows. 2023-06-26 16:34:35 -04:00
Adam Treat
5deec2afe1 Change this back now that it is ready. 2023-06-26 16:21:09 -04:00
Adam Treat
676248fe8f Update the language. 2023-06-26 14:14:49 -04:00
Adam Treat
ef92492d8c Add better warnings and links. 2023-06-26 14:14:49 -04:00
Adam Treat
71c972f8fa Provide a more stark warning for localdocs and add more size to dialogs. 2023-06-26 14:14:49 -04:00
Adam Treat
1b5aa4617f Enable the add button always, but show an error in placeholder text. 2023-06-26 14:14:49 -04:00
Adam Treat
a0f80453e5 Use sysinfo in backend. 2023-06-26 14:14:49 -04:00
Adam Treat
5e520bb775 Fix so that models are searched in subdirectories. 2023-06-26 14:14:49 -04:00
Adam Treat
64e98b8ea9 Fix bug with model loading on initial load. 2023-06-26 14:14:49 -04:00
Adam Treat
3ca9e8692c Don't try and load incomplete files. 2023-06-26 14:14:49 -04:00
Adam Treat
27f25d5878 Get rid of recursive mutex. 2023-06-26 14:14:49 -04:00
Adam Treat
7f01b153b3 Modellist temp 2023-06-26 14:14:46 -04:00
Adam Treat
c1794597a7 Revert "Enable Wayland in build"
This reverts commit d686a583f9.
2023-06-26 14:10:27 -04:00
Akarshan Biswas
d686a583f9 Enable Wayland in build
# Describe your changes
The patch includes support for running natively on a Linux Wayland display server/compositor, which is the successor to the old Xorg.
The CMakeLists was missing WaylandClient, so it was added back.

Will fix #1047.

Signed-off-by: Akarshan Biswas <akarshan.biswas@gmail.com>
2023-06-26 14:58:23 -03:00
niansa/tuxifan
47323f8591 Update replit.cpp
replit_tokenizer_detokenize returns std::string now

Signed-off-by: niansa/tuxifan <tuxifan@posteo.de>
2023-06-26 14:49:58 -03:00
niansa
0855c0df1d Fixed Replit implementation compile warnings 2023-06-26 14:49:58 -03:00
Aaron Miller
1290b32451 update to latest mainline llama.cpp
add max_size param to ggml_metal_add_buffer - introduced in https://github.com/ggerganov/llama.cpp/pull/1826
2023-06-26 14:40:52 -03:00
AMOGUS
3417a37c54
Change "web server" to "API server" for less confusion (#1039)
* Change "Web server" to "API server"

* Changed "API server" to "OpenAPI server"

* Reversed back to "API server" and updated tooltip
2023-06-23 16:28:52 -04:00
cosmic-snow
ee26e8f271
CLI Improvements (#1021)
* Add gpt4all-bindings/cli/README.md

* Unify version information
- Was previously split; base one on the other
- Add VERSION_INFO as the "source of truth":
  - Modelled after sys.version_info.
  - Implemented as a tuple, because it's much easier for (partial)
    programmatic comparison.
- Previous API is kept intact.

* Add gpt4all-bindings/cli/developer_notes.md
- A few notes on what's what, especially regarding docs

* Add gpt4all-bindings/python/docs/gpt4all_cli.md
- The CLI user documentation

* Bump CLI version to 0.3.5

* Finalise docs & add to index.md
- Amend where necessary
- Fix typo in gpt4all_cli.md
- Mention and add link to CLI doc in index.md

* Add docstrings to gpt4all-bindings/cli/app.py

* Better 'groovy' link & fix typo
- Documentation: point to the Hugging Face model card for 'groovy'
- Correct typo in app.py
2023-06-23 12:09:31 -07:00
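The "Unify version information" item in ee26e8f271 describes a tuple-based `VERSION_INFO` modelled after `sys.version_info`. A minimal sketch of that idea follows. Only the name `VERSION_INFO` and the version 0.3.5 come from the commit message; the derived `VERSION` string and the helper `at_least` are hypothetical and may not match the actual app.py.

```python
# Single source of truth, as a tuple (value from the "Bump CLI version" item).
VERSION_INFO = (0, 3, 5)

# Hypothetical derived string, so the two representations cannot drift apart.
VERSION = ".".join(str(part) for part in VERSION_INFO)


def at_least(*minimum: int) -> bool:
    """Hypothetical helper: tuples compare element by element, so a
    partial version such as (0, 3) can be checked directly."""
    return VERSION_INFO >= minimum
```

Tuple comparison is what makes the partial check work: `at_least(0, 3)` compares against a shorter tuple without any string parsing.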
EKal-aa
aed7b43143
set n_threads in GPT4All python bindings (#1042)
* set n_threads in GPT4All

* changed default n_threads to None
2023-06-23 01:16:35 -07:00
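The `n_threads` change in aed7b43143 defaults the parameter to None. One common way to treat such a default is sketched below, assuming that None means "leave the backend's own thread count untouched". The classes `FakeBackend` and `Model` and the method `set_thread_count` are hypothetical stand-ins for illustration, not the GPT4All bindings.

```python
from typing import Optional


class FakeBackend:
    """Hypothetical stand-in for a native backend with its own default."""

    def __init__(self) -> None:
        self.thread_count = 4  # assumed backend default, for illustration

    def set_thread_count(self, n: int) -> None:
        if n < 1:
            raise ValueError("thread count must be at least 1")
        self.thread_count = n


class Model:
    """Hypothetical wrapper: only overrides the thread count when asked."""

    def __init__(self, n_threads: Optional[int] = None) -> None:
        self.backend = FakeBackend()
        # None means: keep whatever the backend decided on its own.
        if n_threads is not None:
            self.backend.set_thread_count(n_threads)
```

Using None rather than a fixed number as the default keeps the wrapper from overriding a backend choice the caller never asked to change.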