Commit Graph

18 Commits

Author SHA1 Message Date
johannesploetner
c951a5b1d3 Update gpt4all-api/gpt4all_api/app/api_v1/routes/chat.py
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
Signed-off-by: johannesploetner <52075191+johannesploetner@users.noreply.github.com>
2024-03-11 09:58:47 -05:00
Johannes Plötner
026ee4e46b Implement /v1/chat/completions endpoint for CPU mode
Signed-off-by: Johannes Plötner <johannes.w.m.ploetner@gmail.com>
2024-03-11 09:58:47 -05:00
dsalvatierra
76413e1d03 Refactor engines module to fetch engine details from API

Update chat.py

Signed-off-by: Daniel Salvatierra <dsalvat1@gmail.com>
2023-11-21 10:46:51 -05:00
dsalvatierra
db70f1752a Update .gitignore and Dockerfile, add .env file and modify test batch
2023-11-21 10:46:51 -05:00
dsalvat1
f3eaa33ce7 Fixing API problem - bin files are deprecated
2023-11-21 10:46:51 -05:00
Thomas
34daf240f9 Update Dockerfile.buildkit (#1542)
corrected model download directory

Signed-off-by: Thomas <tvhdev@vonhaugwitz-softwaresolutions.de>
2023-10-21 14:56:06 -04:00
cebtenzzre
245c5ce5ea update default model URLs (#1538)
2023-10-19 15:25:37 -04:00
Adam Treat
ea66669cef Switch to new models2.json for new gguf release and bump our version to 2.5.0.
2023-10-05 18:16:19 -04:00
Andriy Mulyar
a9668eb2e4 Added optional top_p and top_k
2023-08-15 12:06:49 -04:00
David Okpare
889c8d1758 Add embeddings endpoint for gpt4all-api (#1314)
* Add embeddings endpoint

* Add test for embedding endpoint
2023-08-10 10:43:07 -04:00
Andriy Mulyar
14f4b522d5 Allow you to monitor GPT4All-API with Sentry (#1271)
2023-07-25 12:47:41 -04:00
Zach Nussbaum
b3f84c56e7 fix: don't pass around the same dict object (#1264)
2023-07-24 15:28:12 -04:00
Andriy Mulyar
2befff83d6 top_p error in gpt4all-api
2023-07-24 12:01:37 -04:00
Andriy Mulyar
3d10110314 Moved model check into cpu only paths
2023-07-24 11:34:50 -04:00
Zach Nussbaum
8aba2c9009 GPU Inference Server (#1112)
* feat: local inference server

* fix: source to use bash + vars

* chore: isort and black

* fix: make file + inference mode

* chore: logging

* refactor: remove old links

* fix: add new env vars

* feat: hf inference server

* refactor: remove old links

* test: batch and single response

* chore: black + isort

* separate gpu and cpu dockerfiles

* moved gpu to separate dockerfile

* Fixed test endpoints

* Edits to API. server won't start due to failed instantiation error

* Method signature

* fix: gpu_infer

* tests: fix tests

---------

Co-authored-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-07-21 15:13:29 -04:00
Andriy Mulyar
58f0fcab57 Added health endpoint
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-07-20 21:23:29 -04:00
Brandon Beiler
fb576fbd7e Update to gpt4all version 1.0.1. Implement the Streaming version of the completions endpoint. Implemented an openai python client test for the new streaming functionality. (#1129)
Co-authored-by: Brandon <bbeiler@ridgelineintl.com>
2023-07-05 23:17:30 -04:00
Andriy Mulyar
633e2a2137 GPT4All API Scaffolding. Matches OpenAI OpenAPI spec for chats and completions (#839)
* GPT4All API Scaffolding. Matches OpenAI OpenAPI spec for engines, chats and completions

* Edits for docker building

* FastAPI app builds and pydantic models are accurate

* Added groovy download into dockerfile

* improved dockerfile

* Chat completions endpoint edits

* API unit test sketch

* Working example of groovy inference with open ai api

* Added lines to test

* Set default to mpt
2023-06-28 14:28:52 -04:00