Mirror of https://github.com/nomic-ai/gpt4all.git
typescript: publish alpha on npm and lots of cleanup, documentation, and more (#913)
* fix typo so padding can be accessed * Small cleanups for settings dialog. * Fix the build. * localdocs * Fixup the rescan. Fix debug output. * Add remove folder implementation. * Remove this signal as unnecessary for now. * Cleanup of the database, better chunking, better matching. * Add new reverse prompt for new localdocs context feature. * Add a new muted text color. * Turn off the debugging messages by default. * Add prompt processing and localdocs to the busy indicator in UI. * Specify a large number of suffixes we will search for now. * Add a collection list to support a UI. * Add a localdocs tab. * Start fleshing out the localdocs ui. * Begin implementing the localdocs ui in earnest. * Clean up the settings dialog for localdocs a bit. * Add more of the UI for selecting collections for chats. * Complete the settings for localdocs. * Adds the collections to serialize and implement references for localdocs. * Store the references separately so they are not sent to datalake. * Add context link to references. * Don't use the full path in reference text. * Various fixes to remove unnecessary warnings. * Add a newline * ignore rider and vscode dirs * create test project and basic model loading tests * make sample print usage and cleaner * Get the backend as well as the client building/working with msvc. * Libraries named differently on msvc. * Bump the version number. * This time remember to bump the version right after a release. * rm redundant json * More precise condition * Nicer handling of missing model directory. Correct exception message. * Log where the model was found * Concise model matching * reduce nesting, better error reporting * convert to f-strings * less magic number * 1. Cleanup the interrupted download 2. with-syntax * Redundant else * Do not ignore explicitly passed 4 threads * Correct return type * Add optional verbosity * Correct indentation of the multiline error message * one funcion to append .bin suffix * hotfix default verbose optioin * export hidden types and fix prompt() type * tiny typo (#739) * Update README.md (#738) * Update README.md fix golang gpt4all import path Signed-off-by: Nandakumar <nandagunasekaran@gmail.com> * Update README.md Signed-off-by: Nandakumar <nandagunasekaran@gmail.com> --------- Signed-off-by: Nandakumar <nandagunasekaran@gmail.com> * fix(training instructions): model repo name (#728) Signed-off-by: Chase McDougall <chasemcdougall@hotmail.com> * C# Bindings - Prompt formatting (#712) * Added support for custom prompt formatting * more docs added * bump version * clean up cc files and revert things * LocalDocs documentation initial (#761) * LocalDocs documentation initial * Improved localdocs documentation (#762) * Improved localdocs documentation * Improved localdocs documentation * Improved localdocs documentation * Improved localdocs documentation * New tokenizer implementation for MPT and GPT-J Improves output quality by making these tokenizers more closely match the behavior of the huggingface `tokenizers` based BPE tokenizers these models were trained with. Featuring: * Fixed unicode handling (via ICU) * Fixed BPE token merge handling * Complete added vocabulary handling * buf_ref.into() can be const now * add tokenizer readme w/ instructions for convert script * Revert "add tokenizer readme w/ instructions for convert script" This reverts commit9c15d1f83e
. * Revert "buf_ref.into() can be const now" This reverts commit840e011b75
. * Revert "New tokenizer implementation for MPT and GPT-J" This reverts commitee3469ba6c
. * Fix remove model from model download for regular models. * Fixed formatting of localdocs docs (#770) * construct and return the correct reponse when the request is a chat completion * chore: update typings to keep consistent with python api * progress, updating createCompletion to mirror py api * update spec, unfinished backend * prebuild binaries for package distribution using prebuildify/node-gyp-build * Get rid of blocking behavior for regenerate response. * Add a label to the model loading visual indicator. * Use the new MyButton for the regenerate response button. * Add a hover and pressed to the visual indication of MyButton. * Fix wording of this accessible description. * Some color and theme enhancements to make the UI contrast a bit better. * Make the comboboxes align in UI. * chore: update namespace and fix prompt bug * fix linux build * add roadmap * Fix offset of prompt/response icons for smaller text. * Dlopen backend 5 (#779) Major change to the backend that allows for pluggable versions of llama.cpp/ggml. This was squashed merged from dlopen_backend_5 where the history is preserved. * Add a custom busy indicator to further align look and feel across platforms. * Draw the indicator for combobox to ensure it looks the same on all platforms. * Fix warning. * Use the proper text color for sending messages. * Fixup the plus new chat button. * Make all the toolbuttons highlight on hover. * Advanced avxonly autodetection (#744) * Advanced avxonly requirement detection * chore: support llamaversion >= 3 and ggml default * Dlopen better implementation management (Version 2) * Add fixme's and clean up a bit. * Documentation improvements on LocalDocs (#790) * Update gpt4all_chat.md Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * typo Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> --------- Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * Adapt code * Makefile changes (WIP to test) * Debug * Adapt makefile * Style * Implemented logging mechanism (#785) * Cleaned up implementation management (#787) * Cleaned up implementation management * Initialize LLModel::m_implementation to nullptr * llmodel.h: Moved dlhandle fwd declare above LLModel class * Fix compile * Fixed double-free in LLModel::Implementation destructor * Allow user to specify custom search path via $GPT4ALL_IMPLEMENTATIONS_PATH (#789) * Drop leftover include * Add ldl in gpt4all.go for dynamic linking (#797) * Logger should also output to stderr * Fix MSVC Build, Update C# Binding Scripts * Update gpt4all_chat.md (#800) Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * C# Bindings - improved logging (#714) * added optional support for .NET logging * bump version and add missing alpha suffix * avoid creating additional namespace for extensions * prefer NullLogger/NullLoggerFactory over null-conditional ILogger to avoid errors --------- Signed-off-by: mvenditto <venditto.matteo@gmail.com> * Make localdocs work with server mode. * Better name for database results. * Fix for stale references after we regenerate. * Don't hardcode these. * Fix bug with resetting context with chatgpt model. * Trying to shrink the copy+paste code and do more code sharing between backend model impl. * Remove this as it is no longer useful. * Try and fix build on mac. * Fix mac build again. 
* Add models/release.json to github repo to allow PRs * Fixed spelling error in models.json to make CI happy Signed-off-by: niansa/tuxifan <tuxifan@posteo.de> * updated bindings code for updated C api * load all model libs * model creation is failing... debugging * load libs correctly * fixed finding model libs * cleanup * cleanup * more cleanup * small typo fix * updated binding.gyp * Fixed model type for GPT-J (#815) Signed-off-by: niansa/tuxifan <tuxifan@posteo.de> * Fixed tons of warnings and clazy findings (#811) * Some tweaks to UI to make window resizing smooth and flow nicely. * Min constraints on about dialog. * Prevent flashing of white on resize. * Actually use the theme dark color for window background. * Add the ability to change the directory via text field not just 'browse' button. * add scripts to build dlls * markdown doc gen * add scripts, nearly done moving breaking changes * merge with main * oops, fixed comment * more meaningful name * leave for testing * Only default mlock on macOS where swap seems to be a problem Repeating the change that once was done in https://github.com/nomic-ai/gpt4all/pull/663 but then was overriden by9c6c09cbd2
Signed-off-by: Peter Gagarinov <pgagarinov@users.noreply.github.com> * Add a collection immediately and show a placeholder + busy indicator in localdocs settings. * some tweaks to optional types and defaults * mingw script for windows compilation * Update README.md huggingface -> Hugging Face Signed-off-by: Ikko Eltociear Ashimine <eltociear@gmail.com> * Backend prompt dedup (#822) * Deduplicated prompt() function code * Better error handling when the model fails to load. * We no longer have an avx_only repository and better error handling for minimum hardware requirements. (#833) * Update build_and_run.md (#834) Signed-off-by: AT <manyoso@users.noreply.github.com> * Trying out a new feature to download directly from huggingface. * Try again with the url. * Allow for download of models hosted on third party hosts. * Fix up for newer models on reset context. This fixes the model from totally failing after a reset context. * Update to latest llama.cpp * Remove older models that are not as popular. (#837) * Remove older models that are not as popular. * Update models.json Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> --------- Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> Co-authored-by: Andriy Mulyar <andriy.mulyar@gmail.com> * Update models.json (#838) Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * Update models.json Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * feat: finalyl compiled on windows (MSVC) goadman * update README and spec and promisfy createCompletion * update d.ts * Make installers work with mac/windows for big backend change. * Need this so the linux installer packages it as a dependency. * Try and fix mac. * Fix compile on mac. * These need to be installed for them to be packaged and work for both mac and windows. * Fix installers for windows and linux. * Fix symbol resolution on windows. * updated pypi version * Release notes for version 2.4.5 (#853) * Update README.md (#854) Signed-off-by: AT <manyoso@users.noreply.github.com> * Documentation for model sideloading (#851) * Documentation for model sideloading Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * Update gpt4all_chat.md Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> --------- Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * Speculative fix for windows llama models with installer. * Revert "Speculative fix for windows llama models with installer." This reverts commitadd725d1eb
. * Revert "Fix bug with resetting context with chatgpt model." (#859) This reverts commite0dcf6a14f
. * Fix llama models on linux and windows. * Bump the version. * New release notes * Set thread counts after loading model (#836) * Update gpt4all_faq.md (#861) Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * Supports downloading officially supported models not hosted on gpt4all R2 * Replit Model (#713) * porting over replit code model to gpt4all * replaced memory with kv_self struct * continuing debug * welp it built but lot of sus things * working model loading and somewhat working generate.. need to format response? * revert back to semi working version * finally got rid of weird formatting * figured out problem is with python bindings - this is good to go for testing * addressing PR feedback * output refactor * fixed prompt reponse collection * cleanup * addressing PR comments * building replit backend with new ggmlver code * chatllm replit and clean python files * cleanup * updated replit to match new llmodel api * match llmodel api and change size_t to Token * resolve PR comments * replit model commit comment * Synced llama.cpp.cmake with upstream (#887) * Fix for windows. * fix: build script * Revert "Synced llama.cpp.cmake with upstream (#887)" This reverts commit5c5e10c1f5
. * Update README.md (#906) Add PyPI link and add clickable, more specific link to documentation Signed-off-by: Claudius Ellsel <claudius.ellsel@live.de> * Update CollectionsDialog.qml (#856) Phrasing for localdocs Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * sampling: remove incorrect offset for n_vocab (#900) no effect, but avoids a *potential* bug later if we use actualVocabSize - which is for when a model has a larger embedding tensor/# of output logits than actually trained token to allow room for adding extras in finetuning - presently all of our models have had "placeholder" tokens in the vocab so this hasn't broken anything, but if the sizes did differ we want the equivalent of `logits[actualVocabSize:]` (the start point is unchanged), not `logits[-actualVocabSize:]` (this.) * non-llama: explicitly greedy sampling for temp<=0 (#901) copied directly from llama.cpp - without this temp=0.0 will just scale all the logits to infinity and give bad output * work on thread safety and cleaning up, adding object option * chore: cleanup tests and spec * refactor for object based startup * more docs * Circleci builds for Linux, Windows, and macOS for gpt4all-chat. * more docs * Synced llama.cpp.cmake with upstream * add lock file to ignore codespell * Move usage in Python bindings readme to own section (#907) Have own section for short usage example, as it is not specific to local build Signed-off-by: Claudius Ellsel <claudius.ellsel@live.de> * Always sync for circleci. * update models json with replit model * Forgot to bump. * Change the default values for generation in GUI * Removed double-static from variables in replit.cpp The anonymous namespace already makes it static. Signed-off-by: niansa/tuxifan <tuxifan@posteo.de> * Generator in Python Bindings - streaming yields tokens at a time (#895) * generator method * cleanup * bump version number for clarity * added replace in decode to avoid unicodedecode exception * revert back to _build_prompt * Do auto detection by default in C++ API Signed-off-by: niansa/tuxifan <tuxifan@posteo.de> * remove comment * add comments for index.h * chore: add new models and edit ignore files and documentation * llama on Metal (#885) Support latest llama with Metal --------- Co-authored-by: Adam Treat <adam@nomic.ai> Co-authored-by: niansa/tuxifan <tuxifan@posteo.de> * Revert "llama on Metal (#885)" This reverts commitb59ce1c6e7
. * add more readme stuff and debug info * spell * Metal+LLama take two (#929) Support latest llama with Metal --------- Co-authored-by: Adam Treat <adam@nomic.ai> Co-authored-by: niansa/tuxifan <tuxifan@posteo.de> * add prebuilts for windows * Add new solution for context links that does not force regular markdown (#938) in responses which is disruptive to code completions in responses. * add prettier * split out non llm related methods into util.js, add listModels method * add prebuild script for creating all platforms bindings at once * check in prebuild linux/so libs and allow distribution of napi prebuilds * apply autoformatter * move constants in config.js, add loadModel and retrieveModel methods * Clean up the context links a bit. * Don't interfere with selection. * Add code blocks and python syntax highlighting. * Spelling error. * Add c++/c highighting support. * Fix some bugs with bash syntax and add some C23 keywords. * Bugfixes for prompt syntax highlighting. * Try and fix a false positive from codespell. * When recalculating context we can't erase the BOS. * Fix Windows MSVC AVX builds - bug introduced in557c82b5ed
- currently getting: `warning C5102: ignoring invalid command-line macro definition '/arch:AVX2'` - solution is to use `_options(...)` not `_definitions(...)` * remove .so unneeded path --------- Signed-off-by: Nandakumar <nandagunasekaran@gmail.com> Signed-off-by: Chase McDougall <chasemcdougall@hotmail.com> Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> Signed-off-by: mvenditto <venditto.matteo@gmail.com> Signed-off-by: niansa/tuxifan <tuxifan@posteo.de> Signed-off-by: Peter Gagarinov <pgagarinov@users.noreply.github.com> Signed-off-by: Ikko Eltociear Ashimine <eltociear@gmail.com> Signed-off-by: AT <manyoso@users.noreply.github.com> Signed-off-by: Claudius Ellsel <claudius.ellsel@live.de> Co-authored-by: Justin Wang <justinwang46@gmail.com> Co-authored-by: Adam Treat <treat.adam@gmail.com> Co-authored-by: redthing1 <redthing1@alt.icu> Co-authored-by: Konstantin Gukov <gukkos@gmail.com> Co-authored-by: Richard Guo <richardg7890@gmail.com> Co-authored-by: Joseph Mearman <joseph@mearman.co.uk> Co-authored-by: Nandakumar <nandagunasekaran@gmail.com> Co-authored-by: Chase McDougall <chasemcdougall@hotmail.com> Co-authored-by: mvenditto <venditto.matteo@gmail.com> Co-authored-by: Andriy Mulyar <andriy.mulyar@gmail.com> Co-authored-by: Aaron Miller <apage43@ninjawhale.com> Co-authored-by: FoivosC <christoulakis.foivos@adlittle.com> Co-authored-by: limez <limez@protonmail.com> Co-authored-by: AT <manyoso@users.noreply.github.com> Co-authored-by: niansa/tuxifan <tuxifan@posteo.de> Co-authored-by: niansa <anton-sa@web.de> Co-authored-by: mudler <mudler@mocaccino.org> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Co-authored-by: Tim Miller <innerlogic4321@gmail.com> Co-authored-by: Peter Gagarinov <pgagarinov@users.noreply.github.com> Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com> Co-authored-by: Claudius Ellsel <claudius.ellsel@live.de> Co-authored-by: pingpongching <golololologol02@gmail.com> Co-authored-by: Adam Treat <adam@nomic.ai> Co-authored-by: Cosmic Snow <cosmic-snow@mailfence.com>
This commit is contained in:
parent 44bf91855d
commit 8d53614444
@@ -1,3 +1,3 @@
 [codespell]
 ignore-words-list = blong, belong
-skip = .git,*.pdf,*.svg
+skip = .git,*.pdf,*.svg,*.lock
gpt4all-bindings/typescript/.gitignore (vendored)
@@ -1,2 +1,3 @@
 node_modules/
 build/
+prebuilds/
@@ -1,3 +1,4 @@
 test/
 spec/
+scripts/
 build
gpt4all-bindings/typescript/README.md
@@ -2,12 +2,32 @@
 The original [GPT4All typescript bindings](https://github.com/nomic-ai/gpt4all-ts) are now out of date.
 
 - created by [jacoobes](https://github.com/jacoobes) and [nomic ai](https://home.nomic.ai) :D, for all to use.
-- will maintain this repository when possible, new feature requests will be handled through nomic
 
+### Code (alpha)
+```js
+import { LLModel, createCompletion, DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY } from '../src/gpt4all.js'
+
+const ll = new LLModel({
+    model_name: 'ggml-vicuna-7b-1.1-q4_2.bin',
+    model_path: './',
+    library_path: DEFAULT_LIBRARIES_DIRECTORY
+});
+
+const response = await createCompletion(ll, [
+    { role : 'system', content: 'You are meant to be annoying and unhelpful.' },
+    { role : 'user', content: 'What is 1 + 1?' }
+]);
+
+```
+### API
+- The nodejs api has made strides to mirror the python api. It is not 100% mirrored, but many pieces of the api resemble its python counterpart.
+- [docs](./docs/api.md)
 ### Build Instructions
 
-- As of 05/21/2023, Tested on windows (MSVC) only. (somehow got it to work on MSVC 🤯)
+- As of 05/21/2023, Tested on windows (MSVC). (somehow got it to work on MSVC 🤯)
 - binding.gyp is compile config
+- Tested on Ubuntu. Everything seems to work fine
+- MingW works as well to build the gpt4all-backend. HOWEVER, this package works only with MSVC built dlls.
 
 ### Requirements
 - git
@@ -31,6 +51,15 @@ cd gpt4all-bindings/typescript
 ```sh
 git submodule update --init --depth 1 --recursive
 ```
+**AS OF NEW BACKEND** to build the backend,
+```sh
+yarn build:backend
+```
+This will build platform-dependent dynamic libraries, and will be located in runtimes/(platform)/native The only current way to use them is to put them in the current working directory of your application. That is, **WHEREVER YOU RUN YOUR NODE APPLICATION**
+- llama-xxxx.dll is required.
+- According to whatever model you are using, you'll need to select the proper model loader.
+- For example, if you running an Mosaic MPT model, you will need to select the mpt-(buildvariant).(dynamiclibrary)
+
 ### Test
 ```sh
 yarn test
@@ -48,9 +77,22 @@ yarn test
 
 #### spec/
 - Average look and feel of the api
-- Should work assuming a model is installed locally in working directory
+- Should work assuming a model and libraries are installed locally in working directory
 
 #### index.cc
 - The bridge between nodejs and c. Where the bindings are.
+#### prompt.cc
+- Handling prompting and inference of models in a threadsafe, asynchronous way.
+#### docs/
+- Autogenerated documentation using the script `yarn docs:build`
 
+### Roadmap
+This package is in active development, and breaking changes may happen until the api stabilizes. Here's what's the todo list:
+
+- [x] prompt models via a threadsafe function in order to have proper non blocking behavior in nodejs
+- [ ] createTokenStream, an async iterator that streams each token emitted from the model. Planning on following this [example](https://github.com/nodejs/node-addon-examples/tree/main/threadsafe-async-iterator)
+- [ ] proper unit testing (integrate with circle ci)
+- [ ] publish to npm under alpha tag `gpt4all@alpha`
+- [ ] have more people test on other platforms (mac tester needed)
+- [x] switch to new pluggable backend
 
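One of the roadmap items above is threadsafe, non-blocking prompting. A minimal sketch of what that buys in practice: a timer keeps firing while a completion is generated. The imports and the model file mirror the README example and are placeholders, not fixed requirements.

```js
import { LLModel, createCompletion, DEFAULT_LIBRARIES_DIRECTORY } from '../src/gpt4all.js'

const model = new LLModel({
  model_name: 'ggml-vicuna-7b-1.1-q4_2.bin', // placeholder model in the working directory
  model_path: './',
  library_path: DEFAULT_LIBRARIES_DIRECTORY,
})

// Because prompting runs off the main thread, the event loop stays responsive.
const heartbeat = setInterval(() => console.log('event loop is alive'), 500)

const completion = await createCompletion(model, [
  { role: 'user', content: 'Write one sentence about llamas.' },
])

clearInterval(heartbeat)
console.log(completion.choices[0].message.content)
```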
binding.gyp
@@ -1,45 +1,55 @@
 {
   "targets": [
     {
-      "target_name": "gpt4allts", # gpt4all-ts will cause compile error
-      "cflags!": [ "-fno-exceptions" ],
-      "cflags_cc!": [ "-fno-exceptions" ],
+      "target_name": "gpt4all", # gpt4all-ts will cause compile error
+      "cflags_cc!": [ "-fno-exceptions"],
       "include_dirs": [
         "<!@(node -p \"require('node-addon-api').include\")",
-        "../../gpt4all-backend/llama.cpp/", # need to include llama.cpp because the include paths for examples/common.h include llama.h relatively
         "../../gpt4all-backend",
       ],
-      "sources": [ # is there a better way to do this
-        "../../gpt4all-backend/llama.cpp/examples/common.cpp",
-        "../../gpt4all-backend/llama.cpp/ggml.c",
-        "../../gpt4all-backend/llama.cpp/llama.cpp",
-        "../../gpt4all-backend/utils.cpp",
+      "sources": [
+        # PREVIOUS VERSION: had to required the sources, but with newest changes do not need to
+        #"../../gpt4all-backend/llama.cpp/examples/common.cpp",
+        #"../../gpt4all-backend/llama.cpp/ggml.c",
+        #"../../gpt4all-backend/llama.cpp/llama.cpp",
+        # "../../gpt4all-backend/utils.cpp",
         "../../gpt4all-backend/llmodel_c.cpp",
-        "../../gpt4all-backend/gptj.cpp",
-        "../../gpt4all-backend/llamamodel.cpp",
-        "../../gpt4all-backend/mpt.cpp",
-        "stdcapture.cc",
+        "../../gpt4all-backend/llmodel.cpp",
+        "prompt.cc",
         "index.cc",
       ],
       "conditions": [
         ['OS=="mac"', {
           'defines': [
-            'NAPI_CPP_EXCEPTIONS'
-          ],
+            'LIB_FILE_EXT=".dylib"',
+            'NAPI_CPP_EXCEPTIONS',
+          ]
         }],
         ['OS=="win"', {
           'defines': [
+            'LIB_FILE_EXT=".dll"',
             'NAPI_CPP_EXCEPTIONS',
-            "__AVX2__" # allows SIMD: https://discord.com/channels/1076964370942267462/1092290790388150272/1107564673957630023
           ],
           "msvs_settings": {
             "VCCLCompilerTool": {
               "AdditionalOptions": [
                 "/std:c++20",
-                "/EHsc"
+                "/EHsc",
               ],
             },
           },
+        }],
+        ['OS=="linux"', {
+          'defines': [
+            'LIB_FILE_EXT=".so"',
+            'NAPI_CPP_EXCEPTIONS',
+          ],
+          'cflags_cc!': [
+            '-fno-rtti',
+          ],
+          'cflags_cc': [
+            '-std=c++20'
+          ]
         }]
       ]
     }]
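The `LIB_FILE_EXT` defines above encode the backend library suffix per platform. Expressed from the JavaScript side, the same mapping might look like the sketch below; the helper name and the base library name are made up for illustration (the README only promises `llama-xxxx.dll` / `mpt-(buildvariant)` style names).

```js
// Hypothetical helper mirroring the LIB_FILE_EXT defines in binding.gyp.
function backendLibraryName(base) {
  const ext = { win32: '.dll', darwin: '.dylib', linux: '.so' }[process.platform]
  if (!ext) throw new Error(`unsupported platform: ${process.platform}`)
  return `${base}${ext}`
}

console.log(backendLibraryName('llama-default')) // e.g. 'llama-default.dll' on Windows
```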
gpt4all-bindings/typescript/docs/api.md (new file, 623 lines)

<!-- Generated by documentation.js. Update this documentation by updating the source code. -->

### Table of Contents

* [download][1]
  * [Parameters][2]
  * [Examples][3]
* [DownloadOptions][4]
  * [location][5]
  * [debug][6]
  * [url][7]
* [DownloadController][8]
  * [cancel][9]
  * [promise][10]
* [ModelType][11]
* [ModelFile][12]
  * [gptj][13]
  * [llama][14]
  * [mpt][15]
* [type][16]
* [LLModel][17]
  * [constructor][18]
    * [Parameters][19]
  * [type][20]
  * [name][21]
  * [stateSize][22]
  * [threadCount][23]
  * [setThreadCount][24]
    * [Parameters][25]
  * [raw\_prompt][26]
    * [Parameters][27]
  * [isModelLoaded][28]
  * [setLibraryPath][29]
    * [Parameters][30]
  * [getLibraryPath][31]
* [createCompletion][32]
  * [Parameters][33]
  * [Examples][34]
* [CompletionOptions][35]
  * [verbose][36]
  * [hasDefaultHeader][37]
  * [hasDefaultFooter][38]
* [PromptMessage][39]
  * [role][40]
  * [content][41]
* [prompt\_tokens][42]
* [completion\_tokens][43]
* [total\_tokens][44]
* [CompletionReturn][45]
  * [model][46]
  * [usage][47]
  * [choices][48]
* [CompletionChoice][49]
  * [message][50]
* [LLModelPromptContext][51]
  * [logits\_size][52]
  * [tokens\_size][53]
  * [n\_past][54]
  * [n\_ctx][55]
  * [n\_predict][56]
  * [top\_k][57]
  * [top\_p][58]
  * [temp][59]
  * [n\_batch][60]
  * [repeat\_penalty][61]
  * [repeat\_last\_n][62]
  * [context\_erase][63]
* [createTokenStream][64]
  * [Parameters][65]
* [DEFAULT\_DIRECTORY][66]
* [DEFAULT\_LIBRARIES\_DIRECTORY][67]

## download

Initiates the download of a model file of a specific model type.
By default this downloads without waiting. use the controller returned to alter this behavior.

### Parameters

* `model` **[ModelFile][12]** The model file to be downloaded.
* `options` **[DownloadOptions][4]** to pass into the downloader. Default is { location: (cwd), debug: false }.

### Examples

```javascript
const controller = download('ggml-gpt4all-j-v1.3-groovy.bin')
controller.promise().then(() => console.log('Downloaded!'))
```

* Throws **[Error][68]** If the model already exists in the specified location.
* Throws **[Error][68]** If the model cannot be found at the specified url.

Returns **[DownloadController][8]** object that allows controlling the download process.

## DownloadOptions

Options for the model download process.

### location

location to download the model.
Default is process.cwd(), or the current working directory

Type: [string][69]

### debug

Debug mode -- check how long it took to download in seconds

Type: [boolean][70]

### url

Remote download url. Defaults to `https://gpt4all.io/models`

Type: [string][69]

## DownloadController

Model download controller.

### cancel

Cancel the request to download from gpt4all website if this is called.

Type: function (): void

### promise

Convert the downloader into a promise, allowing people to await and manage its lifetime

Type: function (): [Promise][71]\<void>
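The generated reference never shows `DownloadController` in action. Below is a minimal sketch using only the `cancel` and `promise` members documented above; the model filename, the timeout, and the bare `gpt4all` import specifier are illustrative assumptions.

```js
import { download } from 'gpt4all' // or '../src/gpt4all.js' inside this repo

// Kick off a download without awaiting it; per the docs this returns a DownloadController.
const controller = download('ggml-gpt4all-j-v1.3-groovy.bin', { debug: true })

// Illustrative safety valve: give up after 60 seconds.
const timer = setTimeout(() => controller.cancel(), 60_000)

try {
  await controller.promise() // resolves once the file has finished downloading
  console.log('Downloaded!')
} finally {
  clearTimeout(timer)
}
```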
## ModelType

Type of the model

Type: (`"gptj"` | `"llama"` | `"mpt"`)

## ModelFile

Full list of models available

### gptj

List of GPT-J Models

Type: (`"ggml-gpt4all-j-v1.3-groovy.bin"` | `"ggml-gpt4all-j-v1.2-jazzy.bin"` | `"ggml-gpt4all-j-v1.1-breezy.bin"` | `"ggml-gpt4all-j.bin"`)

### llama

List Llama Models

Type: (`"ggml-gpt4all-l13b-snoozy.bin"` | `"ggml-vicuna-7b-1.1-q4_2.bin"` | `"ggml-vicuna-13b-1.1-q4_2.bin"` | `"ggml-wizardLM-7B.q4_2.bin"` | `"ggml-stable-vicuna-13B.q4_2.bin"` | `"ggml-nous-gpt4-vicuna-13b.bin"`)

### mpt

List of MPT Models

Type: (`"ggml-mpt-7b-base.bin"` | `"ggml-mpt-7b-chat.bin"` | `"ggml-mpt-7b-instruct.bin"`)

## type

Model architecture. This argument currently does not have any functionality and is just used as descriptive identifier for user.

Type: [ModelType][11]

## LLModel

LLModel class representing a language model.
This is a base class that provides common functionality for different types of language models.

### constructor

Initialize a new LLModel.

#### Parameters

* `path` **[string][69]** Absolute path to the model file.

<!---->

* Throws **[Error][68]** If the model file does not exist.

### type

either 'gpt', mpt', or 'llama' or undefined

Returns **([ModelType][11] | [undefined][72])**

### name

The name of the model.

Returns **[ModelFile][12]**

### stateSize

Get the size of the internal state of the model.
NOTE: This state data is specific to the type of model you have created.

Returns **[number][73]** the size in bytes of the internal state of the model

### threadCount

Get the number of threads used for model inference.
The default is the number of physical cores your computer has.

Returns **[number][73]** The number of threads used for model inference.

### setThreadCount

Set the number of threads used for model inference.

#### Parameters

* `newNumber` **[number][73]** The new number of threads.

Returns **void**

### raw\_prompt

Prompt the model with a given input and optional parameters.
This is the raw output from std out.
Use the prompt function exported for a value

#### Parameters

* `q` **[string][69]** The prompt input.
* `params` **Partial<[LLModelPromptContext][51]>?** Optional parameters for the prompt context.

Returns **any** The result of the model prompt.

### isModelLoaded

Whether the model is loaded or not.

Returns **[boolean][70]**

### setLibraryPath

Where to search for the pluggable backend libraries

#### Parameters

* `s` **[string][69]**

Returns **void**

### getLibraryPath

Where to get the pluggable backend libraries

Returns **[string][69]**
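A short sketch tying the `LLModel` accessors above together. It assumes the object-style constructor shown in the package README; the model file name is a placeholder and the `gpt4all` import specifier is an assumption.

```js
import { LLModel, DEFAULT_LIBRARIES_DIRECTORY } from 'gpt4all'

const model = new LLModel({
  model_name: 'ggml-mpt-7b-chat.bin',        // placeholder model file inside model_path
  model_path: './',
  library_path: DEFAULT_LIBRARIES_DIRECTORY, // where the pluggable backends are searched
})

console.log(model.isModelLoaded())  // true once construction succeeded
console.log(model.type())           // e.g. 'mpt', or undefined if no model_type was given
console.log(model.name())           // 'ggml-mpt-7b-chat.bin'
console.log(model.stateSize())      // size of the internal model state, in bytes
model.setThreadCount(4)             // use 4 inference threads
console.log(model.threadCount())    // 4
console.log(model.getLibraryPath()) // current backend library search path
```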
## createCompletion

The nodejs equivalent to python binding's chat\_completion

### Parameters

* `llmodel` **[LLModel][17]** The language model object.
* `messages` **[Array][74]<[PromptMessage][39]>** The array of messages for the conversation.
* `options` **[CompletionOptions][35]** The options for creating the completion.

### Examples

```javascript
const llmodel = new LLModel(model)
const messages = [
  { role: 'system', message: 'You are a weather forecaster.' },
  { role: 'user', message: 'should i go out today?' } ]
const completion = await createCompletion(llmodel, messages, {
  verbose: true,
  temp: 0.9,
})
console.log(completion.choices[0].message.content)
// No, it's going to be cold and rainy.
```

Returns **[CompletionReturn][45]** The completion result.

## CompletionOptions

**Extends Partial\<LLModelPromptContext>**

The options for creating the completion.

### verbose

Indicates if verbose logging is enabled.

Type: [boolean][70]

### hasDefaultHeader

Indicates if the default header is included in the prompt.

Type: [boolean][70]

### hasDefaultFooter

Indicates if the default footer is included in the prompt.

Type: [boolean][70]
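Continuing the `createCompletion` example above, a hedged sketch of passing these options; the flag values are illustrative, not recommendations.

```js
const completion = await createCompletion(llmodel, messages, {
  verbose: true,            // log extra detail while generating
  hasDefaultHeader: false,  // leave out the default prompt header
  hasDefaultFooter: false,  // leave out the default prompt footer
  temp: 0.2,                // any LLModelPromptContext field is also accepted
})
```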
## PromptMessage

A message in the conversation, identical to OpenAI's chat message.

### role

The role of the message.

Type: (`"system"` | `"assistant"` | `"user"`)

### content

The message content.

Type: [string][69]

## prompt\_tokens

The number of tokens used in the prompt.

Type: [number][73]

## completion\_tokens

The number of tokens used in the completion.

Type: [number][73]

## total\_tokens

The total number of tokens used.

Type: [number][73]

## CompletionReturn

The result of the completion, similar to OpenAI's format.

### model

The model name.

Type: [ModelFile][12]

### usage

Token usage report.

Type: {prompt\_tokens: [number][73], completion\_tokens: [number][73], total\_tokens: [number][73]}

### choices

The generated completions.

Type: [Array][74]<[CompletionChoice][49]>
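To make the completion result shape concrete, a small sketch of reading the fields documented above (assuming `model` and `messages` from the earlier example):

```js
const result = await createCompletion(model, messages)

console.log(result.model)                   // model name, e.g. 'ggml-vicuna-7b-1.1-q4_2.bin'
console.log(result.usage.prompt_tokens)     // tokens consumed by the prompt
console.log(result.usage.completion_tokens) // tokens generated in the reply
console.log(result.usage.total_tokens)      // sum of the two

for (const choice of result.choices) {
  // each CompletionChoice carries a PromptMessage under `message`
  console.log(choice.message.role, choice.message.content)
}
```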
## CompletionChoice

A completion choice, similar to OpenAI's format.

### message

Response message

Type: [PromptMessage][39]

## LLModelPromptContext

Model inference arguments for generating completions.

### logits\_size

The size of the raw logits vector.

Type: [number][73]

### tokens\_size

The size of the raw tokens vector.

Type: [number][73]

### n\_past

The number of tokens in the past conversation.

Type: [number][73]

### n\_ctx

The number of tokens possible in the context window.

Type: [number][73]

### n\_predict

The number of tokens to predict.

Type: [number][73]

### top\_k

The top-k logits to sample from.

Type: [number][73]

### top\_p

The nucleus sampling probability threshold.

Type: [number][73]

### temp

The temperature to adjust the model's output distribution.

Type: [number][73]

### n\_batch

The number of predictions to generate in parallel.

Type: [number][73]

### repeat\_penalty

The penalty factor for repeated tokens.

Type: [number][73]

### repeat\_last\_n

The number of last tokens to penalize.

Type: [number][73]

### context\_erase

The percentage of context to erase if the context window is exceeded.

Type: [number][73]
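Because `CompletionOptions` extends `Partial<LLModelPromptContext>`, these inference arguments can be passed straight to `createCompletion`. The values below are illustrative only:

```js
const response = await createCompletion(model, messages, {
  n_predict: 128,       // cap the number of generated tokens
  n_ctx: 2048,          // context window size
  top_k: 40,            // sample from the 40 most likely tokens
  top_p: 0.9,           // nucleus sampling threshold
  temp: 0.7,            // output temperature
  n_batch: 8,           // prompt tokens processed in parallel
  repeat_penalty: 1.18, // penalty factor for repeated tokens
  repeat_last_n: 64,    // how many recent tokens the penalty considers
  context_erase: 0.5,   // fraction of context dropped when the window overflows
})
```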
## createTokenStream

TODO: Help wanted to implement this

### Parameters

* `llmodel` **[LLModel][17]**
* `messages` **[Array][74]<[PromptMessage][39]>**
* `options` **[CompletionOptions][35]**

Returns **function (ll: [LLModel][17]): AsyncGenerator<[string][69]>**
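`createTokenStream` is not implemented yet (see the TODO above). If it ends up matching the documented return type, consuming it could look like this purely hypothetical sketch:

```js
// Hypothetical usage: the function is still on the roadmap.
const stream = createTokenStream(model, messages, { n_predict: 256 })

// The documented return value is a function of an LLModel that yields tokens.
for await (const token of stream(model)) {
  process.stdout.write(token)
}
```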
## DEFAULT\_DIRECTORY

From python api:
models will be stored in (homedir)/.cache/gpt4all/\`

Type: [string][69]

## DEFAULT\_LIBRARIES\_DIRECTORY

From python api:
The default path for dynamic libraries to be stored.
You may separate paths by a semicolon to search in multiple areas.
This searches DEFAULT\_DIRECTORY/libraries, cwd/libraries, and finally cwd.

Type: [string][69]
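A small sketch of widening the backend search path with the semicolon convention described above; the extra runtimes folder is an illustrative path, not a fixed location, and the `gpt4all` import specifier is an assumption.

```js
import { LLModel, DEFAULT_LIBRARIES_DIRECTORY } from 'gpt4all'

// Look in a project-local runtimes folder first, then fall back to the defaults.
const librarySearchPath = ['./runtimes/linux-x64/native', DEFAULT_LIBRARIES_DIRECTORY].join(';')

const model = new LLModel({
  model_name: 'ggml-gpt4all-j-v1.3-groovy.bin',
  model_path: './',
  library_path: librarySearchPath,
})
```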
[1]: #download
[2]: #parameters
[3]: #examples
[4]: #downloadoptions
[5]: #location
[6]: #debug
[7]: #url
[8]: #downloadcontroller
[9]: #cancel
[10]: #promise
[11]: #modeltype
[12]: #modelfile
[13]: #gptj
[14]: #llama
[15]: #mpt
[16]: #type
[17]: #llmodel
[18]: #constructor
[19]: #parameters-1
[20]: #type-1
[21]: #name
[22]: #statesize
[23]: #threadcount
[24]: #setthreadcount
[25]: #parameters-2
[26]: #raw_prompt
[27]: #parameters-3
[28]: #ismodelloaded
[29]: #setlibrarypath
[30]: #parameters-4
[31]: #getlibrarypath
[32]: #createcompletion
[33]: #parameters-5
[34]: #examples-1
[35]: #completionoptions
[36]: #verbose
[37]: #hasdefaultheader
[38]: #hasdefaultfooter
[39]: #promptmessage
[40]: #role
[41]: #content
[42]: #prompt_tokens
[43]: #completion_tokens
[44]: #total_tokens
[45]: #completionreturn
[46]: #model
[47]: #usage
[48]: #choices
[49]: #completionchoice
[50]: #message
[51]: #llmodelpromptcontext
[52]: #logits_size
[53]: #tokens_size
[54]: #n_past
[55]: #n_ctx
[56]: #n_predict
[57]: #top_k
[58]: #top_p
[59]: #temp
[60]: #n_batch
[61]: #repeat_penalty
[62]: #repeat_last_n
[63]: #context_erase
[64]: #createtokenstream
[65]: #parameters-6
[66]: #default_directory
[67]: #default_libraries_directory
[68]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error
[69]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String
[70]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean
[71]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise
[72]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/undefined
[73]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number
[74]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array
index.cc
@@ -1,68 +1,95 @@
-#include <napi.h>
-#include <iostream>
-#include "llmodel_c.h"
-#include "llmodel.h"
-#include "gptj.h"
-#include "llamamodel.h"
-#include "mpt.h"
-#include "stdcapture.h"
+#include "index.h"
 
-class NodeModelWrapper : public Napi::ObjectWrap<NodeModelWrapper> {
-public:
-  static Napi::Object Init(Napi::Env env, Napi::Object exports) {
-    Napi::Function func = DefineClass(env, "LLModel", {
+Napi::FunctionReference NodeModelWrapper::constructor;
+
+Napi::Function NodeModelWrapper::GetClass(Napi::Env env) {
+  Napi::Function self = DefineClass(env, "LLModel", {
     InstanceMethod("type", &NodeModelWrapper::getType),
+    InstanceMethod("isModelLoaded", &NodeModelWrapper::IsModelLoaded),
     InstanceMethod("name", &NodeModelWrapper::getName),
     InstanceMethod("stateSize", &NodeModelWrapper::StateSize),
     InstanceMethod("raw_prompt", &NodeModelWrapper::Prompt),
     InstanceMethod("setThreadCount", &NodeModelWrapper::SetThreadCount),
     InstanceMethod("threadCount", &NodeModelWrapper::ThreadCount),
+    InstanceMethod("getLibraryPath", &NodeModelWrapper::GetLibraryPath),
   });
-    Napi::FunctionReference* constructor = new Napi::FunctionReference();
-    *constructor = Napi::Persistent(func);
-    env.SetInstanceData(constructor);
-    exports.Set("LLModel", func);
-    return exports;
+  // Keep a static reference to the constructor
+  //
+  constructor = Napi::Persistent(self);
+  constructor.SuppressDestruct();
+  return self;
 }
 
-  Napi::Value getType(const Napi::CallbackInfo& info)
+Napi::Value NodeModelWrapper::getType(const Napi::CallbackInfo& info)
 {
+  if(type.empty()) {
+    return info.Env().Undefined();
+  }
   return Napi::String::New(info.Env(), type);
 }
 
-  NodeModelWrapper(const Napi::CallbackInfo& info) : Napi::ObjectWrap<NodeModelWrapper>(info)
+NodeModelWrapper::NodeModelWrapper(const Napi::CallbackInfo& info) : Napi::ObjectWrap<NodeModelWrapper>(info)
 {
   auto env = info.Env();
-  std::string weights_path = info[0].As<Napi::String>().Utf8Value();
+  fs::path model_path;
 
-  const char *c_weights_path = weights_path.c_str();
-  inference_ = create_model_set_type(c_weights_path);
+  std::string full_weight_path;
+  //todo
+  std::string library_path = ".";
+  std::string model_name;
+  if(info[0].IsString()) {
+    model_path = info[0].As<Napi::String>().Utf8Value();
+    full_weight_path = model_path.string();
+    std::cout << "DEPRECATION: constructor accepts object now. Check docs for more.\n";
+  } else {
+    auto config_object = info[0].As<Napi::Object>();
+    model_name = config_object.Get("model_name").As<Napi::String>();
+    model_path = config_object.Get("model_path").As<Napi::String>().Utf8Value();
+    if(config_object.Has("model_type")) {
+      type = config_object.Get("model_type").As<Napi::String>();
+    }
+    full_weight_path = (model_path / fs::path(model_name)).string();
+
+    if(config_object.Has("library_path")) {
+      library_path = config_object.Get("library_path").As<Napi::String>();
+    } else {
+      library_path = ".";
+    }
+  }
+  llmodel_set_implementation_search_path(library_path.c_str());
+  llmodel_error* e = nullptr;
+  inference_ = std::make_shared<llmodel_model>(llmodel_model_create2(full_weight_path.c_str(), "auto", e));
+  if(e != nullptr) {
+    Napi::Error::New(env, e->message).ThrowAsJavaScriptException();
+    return;
+  }
+  if(GetInference() == nullptr) {
+    std::cerr << "Tried searching libraries in \"" << library_path << "\"" << std::endl;
+    std::cerr << "Tried searching for model weight in \"" << full_weight_path << "\"" << std::endl;
+    Napi::Error::New(env, "Had an issue creating llmodel object, inference is null").ThrowAsJavaScriptException();
+    return;
+  }
 
-  auto success = llmodel_loadModel(inference_, c_weights_path);
+  auto success = llmodel_loadModel(GetInference(), full_weight_path.c_str());
   if(!success) {
     Napi::Error::New(env, "Failed to load model at given path").ThrowAsJavaScriptException();
     return;
   }
-  name = weights_path.substr(weights_path.find_last_of("/\\") + 1);
+  name = model_name.empty() ? model_path.filename().string() : model_name;
 
 };
-  ~NodeModelWrapper() {
-    // destroying the model manually causes exit code 3221226505, why?
-    // However, bindings seem to operate fine without destructing pointer
-    //llmodel_model_destroy(inference_);
+//NodeModelWrapper::~NodeModelWrapper() {
+//GetInference().reset();
+//}
+
+Napi::Value NodeModelWrapper::IsModelLoaded(const Napi::CallbackInfo& info) {
+  return Napi::Boolean::New(info.Env(), llmodel_isModelLoaded(GetInference()));
 }
 
-  Napi::Value IsModelLoaded(const Napi::CallbackInfo& info) {
-    return Napi::Boolean::New(info.Env(), llmodel_isModelLoaded(inference_));
-  }
-
-  Napi::Value StateSize(const Napi::CallbackInfo& info) {
+Napi::Value NodeModelWrapper::StateSize(const Napi::CallbackInfo& info) {
   // Implement the binding for the stateSize method
-    return Napi::Number::New(info.Env(), static_cast<int64_t>(llmodel_get_state_size(inference_)));
+  return Napi::Number::New(info.Env(), static_cast<int64_t>(llmodel_get_state_size(GetInference())));
 }
 
 /**
  * Generate a response using the model.
@@ -73,16 +100,14 @@ public:
  * @param recalculate_callback A callback function for handling recalculation requests.
  * @param ctx A pointer to the llmodel_prompt_context structure.
  */
-  Napi::Value Prompt(const Napi::CallbackInfo& info) {
-
+Napi::Value NodeModelWrapper::Prompt(const Napi::CallbackInfo& info) {
   auto env = info.Env();
-
   std::string question;
   if(info[0].IsString()) {
     question = info[0].As<Napi::String>().Utf8Value();
   } else {
-    Napi::Error::New(env, "invalid string argument").ThrowAsJavaScriptException();
-    return env.Undefined();
+    Napi::Error::New(info.Env(), "invalid string argument").ThrowAsJavaScriptException();
+    return info.Env().Undefined();
   }
   //defaults copied from python bindings
   llmodel_prompt_context promptContext = {
@@ -101,127 +126,90 @@ public:
   };
   if(info[1].IsObject())
   {
     auto inputObject = info[1].As<Napi::Object>();
 
     // Extract and assign the properties
     if (inputObject.Has("logits") || inputObject.Has("tokens")) {
-      Napi::Error::New(env, "Invalid input: 'logits' or 'tokens' properties are not allowed").ThrowAsJavaScriptException();
-      return env.Undefined();
+      Napi::Error::New(info.Env(), "Invalid input: 'logits' or 'tokens' properties are not allowed").ThrowAsJavaScriptException();
+      return info.Env().Undefined();
     }
     // Assign the remaining properties
-    if(inputObject.Has("n_past")) {
+    if(inputObject.Has("n_past"))
       promptContext.n_past = inputObject.Get("n_past").As<Napi::Number>().Int32Value();
-    }
-    if(inputObject.Has("n_ctx")) {
-      promptContext.n_ctx = inputObject.Get("n_ctx").As<Napi::Number>().Int32Value();
-    }
-    if(inputObject.Has("n_predict")) {
-      promptContext.n_predict = inputObject.Get("n_predict").As<Napi::Number>().Int32Value();
-    }
-    if(inputObject.Has("top_k")) {
-      promptContext.top_k = inputObject.Get("top_k").As<Napi::Number>().Int32Value();
-    }
-    if(inputObject.Has("top_p")) {
-      promptContext.top_p = inputObject.Get("top_p").As<Napi::Number>().FloatValue();
-    }
-    if(inputObject.Has("temp")) {
-      promptContext.temp = inputObject.Get("temp").As<Napi::Number>().FloatValue();
-    }
-    if(inputObject.Has("n_batch")) {
-      promptContext.n_batch = inputObject.Get("n_batch").As<Napi::Number>().Int32Value();
-    }
-    if(inputObject.Has("repeat_penalty")) {
-      promptContext.repeat_penalty = inputObject.Get("repeat_penalty").As<Napi::Number>().FloatValue();
-    }
-    if(inputObject.Has("repeat_last_n")) {
-      promptContext.repeat_last_n = inputObject.Get("repeat_last_n").As<Napi::Number>().Int32Value();
-    }
-    if(inputObject.Has("context_erase")) {
-      promptContext.context_erase = inputObject.Get("context_erase").As<Napi::Number>().FloatValue();
-    }
+    if(inputObject.Has("n_ctx"))
+      promptContext.n_ctx = inputObject.Get("n_ctx").As<Napi::Number>().Int32Value();
+    if(inputObject.Has("n_predict"))
+      promptContext.n_predict = inputObject.Get("n_predict").As<Napi::Number>().Int32Value();
+    if(inputObject.Has("top_k"))
+      promptContext.top_k = inputObject.Get("top_k").As<Napi::Number>().Int32Value();
+    if(inputObject.Has("top_p"))
+      promptContext.top_p = inputObject.Get("top_p").As<Napi::Number>().FloatValue();
+    if(inputObject.Has("temp"))
+      promptContext.temp = inputObject.Get("temp").As<Napi::Number>().FloatValue();
+    if(inputObject.Has("n_batch"))
+      promptContext.n_batch = inputObject.Get("n_batch").As<Napi::Number>().Int32Value();
+    if(inputObject.Has("repeat_penalty"))
+      promptContext.repeat_penalty = inputObject.Get("repeat_penalty").As<Napi::Number>().FloatValue();
+    if(inputObject.Has("repeat_last_n"))
+      promptContext.repeat_last_n = inputObject.Get("repeat_last_n").As<Napi::Number>().Int32Value();
+    if(inputObject.Has("context_erase"))
+      promptContext.context_erase = inputObject.Get("context_erase").As<Napi::Number>().FloatValue();
   }
-  // custom callbacks are weird with the gpt4all c bindings: I need to turn Napi::Functions into raw c function pointers,
-  // but it doesn't seem like its possible? (TODO, is it possible?)
+  //copy to protect llmodel resources when splitting to new thread
 
-  // if(info[1].IsFunction()) {
-  //   Napi::Callback cb = *info[1].As<Napi::Function>();
-  // }
-
-  // For now, simple capture of stdout
-  // possible TODO: put this on a libuv async thread. (AsyncWorker)
-  CoutRedirect cr;
-  llmodel_prompt(inference_, question.c_str(), &prompt_callback, &response_callback, &recalculate_callback, &promptContext);
-  return Napi::String::New(env, cr.getString());
+  llmodel_prompt_context copiedPrompt = promptContext;
+  std::string copiedQuestion = question;
+  PromptWorkContext pc = {
+    copiedQuestion,
+    inference_.load(),
+    copiedPrompt,
+  };
+  auto threadSafeContext = new TsfnContext(env, pc);
+  threadSafeContext->tsfn = Napi::ThreadSafeFunction::New(
+    env,                           // Environment
+    info[2].As<Napi::Function>(),  // JS function from caller
+    "PromptCallback",              // Resource name
+    0,                             // Max queue size (0 = unlimited).
+    1,                             // Initial thread count
+    threadSafeContext,             // Context,
+    FinalizerCallback,             // Finalizer
+    (void*)nullptr                 // Finalizer data
+  );
+  threadSafeContext->nativeThread = std::thread(threadEntry, threadSafeContext);
+  return threadSafeContext->deferred_.Promise();
 }
 
-  void SetThreadCount(const Napi::CallbackInfo& info) {
+void NodeModelWrapper::SetThreadCount(const Napi::CallbackInfo& info) {
   if(info[0].IsNumber()) {
-    llmodel_setThreadCount(inference_, info[0].As<Napi::Number>().Int64Value());
+    llmodel_setThreadCount(GetInference(), info[0].As<Napi::Number>().Int64Value());
   } else {
     Napi::Error::New(info.Env(), "Could not set thread count: argument 1 is NaN").ThrowAsJavaScriptException();
     return;
   }
 }
-  Napi::Value getName(const Napi::CallbackInfo& info) {
+
+Napi::Value NodeModelWrapper::getName(const Napi::CallbackInfo& info) {
   return Napi::String::New(info.Env(), name);
 }
-  Napi::Value ThreadCount(const Napi::CallbackInfo& info) {
-    return Napi::Number::New(info.Env(), llmodel_threadCount(inference_));
+Napi::Value NodeModelWrapper::ThreadCount(const Napi::CallbackInfo& info) {
+  return Napi::Number::New(info.Env(), llmodel_threadCount(GetInference()));
 }
 
-private:
-  llmodel_model inference_;
-  std::string type;
-  std::string name;
-
-  //wrapper cb to capture output into stdout.then, CoutRedirect captures this
-  // and writes it to a file
-  static bool response_callback(int32_t tid, const char* resp)
-  {
-    if(tid != -1) {
-      std::cout<<std::string(resp);
-      return true;
-    }
-    return false;
+Napi::Value NodeModelWrapper::GetLibraryPath(const Napi::CallbackInfo& info) {
+  return Napi::String::New(info.Env(),
+    llmodel_get_implementation_search_path());
 }
 
-  static bool prompt_callback(int32_t tid) { return true; }
-  static bool recalculate_callback(bool isrecalculating) { return isrecalculating; }
-  // Had to use this instead of the c library in order
-  // set the type of the model loaded.
-  // causes side effect: type is mutated;
-  llmodel_model create_model_set_type(const char* c_weights_path)
-  {
-    uint32_t magic;
-    llmodel_model model;
-    FILE *f = fopen(c_weights_path, "rb");
-    fread(&magic, sizeof(magic), 1, f);
-
-    if (magic == 0x67676d6c) {
-      model = llmodel_gptj_create();
-      type = "gptj";
-    }
-    else if (magic == 0x67676a74) {
-      model = llmodel_llama_create();
-      type = "llama";
-    }
-    else if (magic == 0x67676d6d) {
-      model = llmodel_mpt_create();
-      type = "mpt";
-    }
-    else {fprintf(stderr, "Invalid model file\n");}
-    fclose(f);
-
-    return model;
+llmodel_model NodeModelWrapper::GetInference() {
+  return *inference_.load();
 }
-};
 
 //Exports Bindings
 Napi::Object Init(Napi::Env env, Napi::Object exports) {
-  return NodeModelWrapper::Init(env, exports);
+  exports["LLModel"] = NodeModelWrapper::GetClass(env);
+  return exports;
 }
 
 NODE_API_MODULE(NODE_GYP_MODULE_NAME, Init)
gpt4all-bindings/typescript/index.h (new file)
@ -0,0 +1,45 @@
#include <napi.h>
#include "llmodel.h"
#include <iostream>
#include "llmodel_c.h"
#include "prompt.h"
#include <atomic>
#include <memory>
#include <filesystem>
namespace fs = std::filesystem;

class NodeModelWrapper: public Napi::ObjectWrap<NodeModelWrapper> {
public:
    NodeModelWrapper(const Napi::CallbackInfo &);
    //~NodeModelWrapper();
    Napi::Value getType(const Napi::CallbackInfo& info);
    Napi::Value IsModelLoaded(const Napi::CallbackInfo& info);
    Napi::Value StateSize(const Napi::CallbackInfo& info);
    /**
     * Prompting the model. This entails spawning a new thread and adding the response tokens
     * into a thread local string variable.
     */
    Napi::Value Prompt(const Napi::CallbackInfo& info);
    void SetThreadCount(const Napi::CallbackInfo& info);
    Napi::Value getName(const Napi::CallbackInfo& info);
    Napi::Value ThreadCount(const Napi::CallbackInfo& info);
    /*
     * The path that is used to search for the dynamic libraries
     */
    Napi::Value GetLibraryPath(const Napi::CallbackInfo& info);
    /**
     * Creates the LLModel class
     */
    static Napi::Function GetClass(Napi::Env);
    llmodel_model GetInference();

private:
    /**
     * The underlying inference that interfaces with the C interface
     */
    std::atomic<std::shared_ptr<llmodel_model>> inference_;

    std::string type;
    // corresponds to LLModel::name() in typescript
    std::string name;
    static Napi::FunctionReference constructor;
};
@ -1,19 +1,32 @@
{
    "name": "gpt4all",
    "version": "2.0.0",
    "packageManager": "yarn@3.5.1",
    "main": "src/gpt4all.js",
    "repository": "nomic-ai/gpt4all",
    "scripts": {
        "test": "node ./test/index.mjs",
        "build:backend": "node scripts/build.js",
        "install": "node-gyp-build",
        "prebuild": "node scripts/prebuild.js",
        "docs:build": "documentation build ./src/gpt4all.d.ts --parse-extension d.ts --format md --output docs/api.md"
    },
    "dependencies": {
        "mkdirp": "^3.0.1",
        "node-addon-api": "^6.1.0",
        "node-gyp-build": "^4.6.0"
    },
    "devDependencies": {
        "@types/node": "^20.1.5",
        "documentation": "^14.0.2",
        "prebuildify": "^5.0.1",
        "prettier": "^2.8.8"
    },
    "engines": {
        "node": ">= 18.x.x"
    },
    "prettier": {
        "endOfLine": "lf",
        "tabWidth": 4
    }
}
gpt4all-bindings/typescript/prompt.cc (new file)
@ -0,0 +1,62 @@
#include "prompt.h"

TsfnContext::TsfnContext(Napi::Env env, const PromptWorkContext& pc)
    : deferred_(Napi::Promise::Deferred::New(env)), pc(pc) {
}

std::mutex mtx;
static thread_local std::string res;
bool response_callback(int32_t token_id, const char *response) {
    res += response;
    return token_id != -1;
}
bool recalculate_callback (bool isrecalculating) {
    return isrecalculating;
};
bool prompt_callback (int32_t tid) {
    return true;
};

// The thread entry point. This takes as its arguments the specific
// threadsafe-function context created inside the main thread.
void threadEntry(TsfnContext* context) {
    std::lock_guard<std::mutex> lock(mtx);
    // Perform a call into JavaScript.
    napi_status status =
        context->tsfn.NonBlockingCall(&context->pc,
            [](Napi::Env env, Napi::Function jsCallback, PromptWorkContext* pc) {
                llmodel_prompt(
                    *pc->inference_,
                    pc->question.c_str(),
                    &prompt_callback,
                    &response_callback,
                    &recalculate_callback,
                    &pc->prompt_params
                );
                jsCallback.Call({ Napi::String::New(env, res) });
                res.clear();
            });

    if (status != napi_ok) {
        Napi::Error::Fatal(
            "ThreadEntry",
            "Napi::ThreadSafeNapi::Function.NonBlockingCall() failed");
    }

    // Release the thread-safe function. This decrements the internal thread
    // count, and will perform finalization since the count will reach 0.
    context->tsfn.Release();
}

void FinalizerCallback(Napi::Env env,
                       void* finalizeData,
                       TsfnContext* context) {
    // Join the thread
    context->nativeThread.join();
    // Resolve the Promise previously returned to JS via the CreateTSFN method.
    context->deferred_.Resolve(Napi::Boolean::New(env, true));
    delete context;
}
gpt4all-bindings/typescript/prompt.h (new file)
@ -0,0 +1,42 @@
#ifndef TSFN_CONTEXT_H
#define TSFN_CONTEXT_H

#include "napi.h"
#include "llmodel_c.h"
#include <thread>
#include <mutex>
#include <iostream>
#include <atomic>
#include <memory>
struct PromptWorkContext {
    std::string question;
    std::shared_ptr<llmodel_model> inference_;
    llmodel_prompt_context prompt_params;
};

struct TsfnContext {
public:
    TsfnContext(Napi::Env env, const PromptWorkContext &pc);
    std::thread nativeThread;
    Napi::Promise::Deferred deferred_;
    PromptWorkContext pc;
    Napi::ThreadSafeFunction tsfn;

    // Some data to pass around
    // int ints[ARRAY_LENGTH];
};

// The thread entry point. This takes as its arguments the specific
// threadsafe-function context created inside the main thread.
void threadEntry(TsfnContext* context);

// The thread-safe function finalizer callback. This callback executes
// at destruction of thread-safe function, taking as arguments the finalizer
// data and threadsafe-function context.
void FinalizerCallback(Napi::Env env, void* finalizeData, TsfnContext* context);

bool response_callback(int32_t token_id, const char *response);
bool recalculate_callback (bool isrecalculating);
bool prompt_callback (int32_t tid);

#endif // TSFN_CONTEXT_H
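Taken together, TsfnContext, threadEntry, and FinalizerCallback let the native Prompt binding run llmodel_prompt off the main thread, hand the accumulated response back through a JavaScript callback, and resolve the returned promise when the worker finishes. As a hedged illustration (not part of the diff), the JavaScript side consumes this roughly as follows; createCompletion in src/gpt4all.js below does essentially the same thing:

// raw_prompt(q, params, callback) is declared in src/gpt4all.d.ts; the promise
// returned by the native Prompt binding resolves once FinalizerCallback runs.
function promptOnce(llmodel, text, params = {}) {
    return new Promise((resolve) => {
        // The callback receives the full response string accumulated by
        // response_callback in the thread-local `res` buffer above.
        llmodel.raw_prompt(text, params, (response) => resolve(response));
    });
}

// const answer = await promptOnce(llmodel, 'Hello!', { temp: 0.7 });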
gpt4all-bindings/typescript/scripts/build.js (new file)
@ -0,0 +1,17 @@
const { spawn } = require("node:child_process");
const { resolve } = require("path");
const args = process.argv.slice(2);
const platform = process.platform;

//windows 64bit or 32
if (platform === "win32") {
    const path = "scripts/build_msvc.bat";
    spawn(resolve(path), ["/Y", ...args], { shell: true, stdio: "inherit" });
    process.on("data", (s) => console.log(s.toString()));
} else if (platform === "linux" || platform === "darwin") {
    const path = "scripts/build_unix.sh";
    const bash = spawn(`sh`, [path, ...args]);
    bash.stdout.on("data", (s) => console.log(s.toString()), {
        stdio: "inherit",
    });
}
gpt4all-bindings/typescript/scripts/build_mingw.ps1 (new file)
@ -0,0 +1,16 @@
$ROOT_DIR = '.\runtimes\win-x64'
$BUILD_DIR = '.\runtimes\win-x64\build\mingw'
$LIBS_DIR = '.\runtimes\win-x64\native'

# cleanup env
Remove-Item -Force -Recurse $ROOT_DIR -ErrorAction SilentlyContinue | Out-Null
mkdir $BUILD_DIR | Out-Null
mkdir $LIBS_DIR | Out-Null

# build
cmake -G "MinGW Makefiles" -S ..\..\gpt4all-backend -B $BUILD_DIR -DLLAMA_AVX2=ON
cmake --build $BUILD_DIR --parallel --config Release

# copy native dlls
# cp "C:\ProgramData\chocolatey\lib\mingw\tools\install\mingw64\bin\*dll" $LIBS_DIR
cp "$BUILD_DIR\bin\*.dll" $LIBS_DIR
gpt4all-bindings/typescript/scripts/build_unix.sh (new file)
@ -0,0 +1,31 @@
#!/bin/sh

SYSNAME=$(uname -s)

if [ "$SYSNAME" = "Linux" ]; then
    BASE_DIR="runtimes/linux-x64"
    LIB_EXT="so"
elif [ "$SYSNAME" = "Darwin" ]; then
    BASE_DIR="runtimes/osx"
    LIB_EXT="dylib"
elif [ -n "$SYSNAME" ]; then
    echo "Unsupported system: $SYSNAME" >&2
    exit 1
else
    echo "\"uname -s\" failed" >&2
    exit 1
fi

NATIVE_DIR="$BASE_DIR/native"
BUILD_DIR="$BASE_DIR/build"

rm -rf "$BASE_DIR"
mkdir -p "$NATIVE_DIR" "$BUILD_DIR"

cmake -S ../../gpt4all-backend -B "$BUILD_DIR" &&
cmake --build "$BUILD_DIR" -j --config Release && {
    cp "$BUILD_DIR"/libllmodel.$LIB_EXT "$NATIVE_DIR"/
    cp "$BUILD_DIR"/libgptj*.$LIB_EXT "$NATIVE_DIR"/
    cp "$BUILD_DIR"/libllama*.$LIB_EXT "$NATIVE_DIR"/
    cp "$BUILD_DIR"/libmpt*.$LIB_EXT "$NATIVE_DIR"/
}
gpt4all-bindings/typescript/scripts/prebuild.js (new file)
@ -0,0 +1,50 @@
const prebuildify = require("prebuildify");

async function createPrebuilds(combinations) {
    for (const { platform, arch } of combinations) {
        const opts = {
            platform,
            arch,
            napi: true,
        };
        try {
            await createPrebuild(opts);
            console.log(
                `Build succeeded for platform ${opts.platform} and architecture ${opts.arch}`
            );
        } catch (err) {
            console.error(
                `Error building for platform ${opts.platform} and architecture ${opts.arch}:`,
                err
            );
        }
    }
}

function createPrebuild(opts) {
    return new Promise((resolve, reject) => {
        prebuildify(opts, (err) => {
            if (err) {
                reject(err);
            } else {
                resolve();
            }
        });
    });
}

const prebuildConfigs = [
    { platform: "win32", arch: "x64" },
    { platform: "win32", arch: "arm64" },
    // { platform: 'win32', arch: 'armv7' },
    { platform: "darwin", arch: "x64" },
    { platform: "darwin", arch: "arm64" },
    // { platform: 'darwin', arch: 'armv7' },
    { platform: "linux", arch: "x64" },
    { platform: "linux", arch: "arm64" },
    { platform: "linux", arch: "armv7" },
];

createPrebuilds(prebuildConfigs)
    .then(() => console.log("All builds succeeded"))
    .catch((err) => console.error("Error building:", err));
@ -1,14 +1,15 @@
import { LLModel, createCompletion, DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY } from '../src/gpt4all.js'

const ll = new LLModel({
    model_name: 'ggml-vicuna-7b-1.1-q4_2.bin',
    model_path: './',
    library_path: DEFAULT_LIBRARIES_DIRECTORY
});

try {
    class Extended extends LLModel {

    }
} catch(e) {
    console.log("Extending from native class gone wrong " + e)
}

@ -20,13 +21,26 @@ ll.setThreadCount(5);
console.log("thread count " + ll.threadCount());
ll.setThreadCount(4);
console.log("thread count " + ll.threadCount());
console.log("name " + ll.name());
console.log("type: " + ll.type());
console.log("Default directory for models", DEFAULT_DIRECTORY);
console.log("Default directory for libraries", DEFAULT_LIBRARIES_DIRECTORY);

console.log(await createCompletion(
    ll,
    [
        { role : 'system', content: 'You are a girl who likes playing league of legends.' },
        { role : 'user', content: 'What is the best top laner to play right now?' },
    ],
    { verbose: false }
));

console.log(await createCompletion(
    ll,
    [
        { role : 'user', content: 'What is the best bottom laner to play right now?' },
    ],
))
gpt4all-bindings/typescript/src/config.js (new file)
@ -0,0 +1,22 @@
const os = require("node:os");
const path = require("node:path");

const DEFAULT_DIRECTORY = path.resolve(os.homedir(), ".cache/gpt4all");

const librarySearchPaths = [
    path.join(DEFAULT_DIRECTORY, "libraries"),
    path.resolve("./libraries"),
    path.resolve(
        __dirname,
        "..",
        `runtimes/${process.platform}-${process.arch}/native`
    ),
    process.cwd(),
];

const DEFAULT_LIBRARIES_DIRECTORY = librarySearchPaths.join(";");

module.exports = {
    DEFAULT_DIRECTORY,
    DEFAULT_LIBRARIES_DIRECTORY,
};
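As a quick, hedged sketch of how these defaults are meant to be consumed (loadModel is defined in src/gpt4all.js further down; the model name is only an example), both values can be overridden per call:

import { loadModel, DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY } from 'gpt4all'

// These are already the built-in defaults; they are spelled out here for clarity.
const model = await loadModel('ggml-gpt4all-j-v1.3-groovy.bin', {
    modelPath: DEFAULT_DIRECTORY,               // ~/.cache/gpt4all
    librariesPath: DEFAULT_LIBRARIES_DIRECTORY, // semicolon-separated search list
    allowDownload: true,
    verbose: false,
})
console.log(model.name())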
gpt4all-bindings/typescript/src/gpt4all.d.ts (vendored)
@ -1,162 +1,310 @@
/// <reference types="node" />
declare module "gpt4all";

export * from "./util.d.ts";

/** Type of the model */
type ModelType = "gptj" | "llama" | "mpt" | "replit";

/**
 * Full list of models available
 */
interface ModelFile {
    /** List of GPT-J Models */
    gptj:
        | "ggml-gpt4all-j-v1.3-groovy.bin"
        | "ggml-gpt4all-j-v1.2-jazzy.bin"
        | "ggml-gpt4all-j-v1.1-breezy.bin"
        | "ggml-gpt4all-j.bin";
    /** List of Llama Models */
    llama:
        | "ggml-gpt4all-l13b-snoozy.bin"
        | "ggml-vicuna-7b-1.1-q4_2.bin"
        | "ggml-vicuna-13b-1.1-q4_2.bin"
        | "ggml-wizardLM-7B.q4_2.bin"
        | "ggml-stable-vicuna-13B.q4_2.bin"
        | "ggml-nous-gpt4-vicuna-13b.bin"
        | "ggml-v3-13b-hermes-q5_1.bin";
    /** List of MPT Models */
    mpt:
        | "ggml-mpt-7b-base.bin"
        | "ggml-mpt-7b-chat.bin"
        | "ggml-mpt-7b-instruct.bin";
    /** List of Replit Models */
    replit: "ggml-replit-code-v1-3b.bin";
}

//mirrors py options
interface LLModelOptions {
    /**
     * Model architecture. This argument currently does not have any functionality and is just used as a descriptive identifier for the user.
     */
    type?: ModelType;
    model_name: ModelFile[ModelType];
    model_path: string;
    library_path?: string;
}

/**
 * LLModel class representing a language model.
 * This is a base class that provides common functionality for different types of language models.
 */
declare class LLModel {
    /**
     * Initialize a new LLModel.
     * @param path Absolute path to the model file.
     * @throws {Error} If the model file does not exist.
     */
    constructor(path: string);
    constructor(options: LLModelOptions);

    /** either 'gptj', 'mpt', or 'llama', or undefined */
    type(): ModelType | undefined;

    /** The name of the model. */
    name(): ModelFile;

    /**
     * Get the size of the internal state of the model.
     * NOTE: This state data is specific to the type of model you have created.
     * @return the size in bytes of the internal state of the model
     */
    stateSize(): number;

    /**
     * Get the number of threads used for model inference.
     * The default is the number of physical cores your computer has.
     * @returns The number of threads used for model inference.
     */
    threadCount(): number;

    /**
     * Set the number of threads used for model inference.
     * @param newNumber The new number of threads.
     */
    setThreadCount(newNumber: number): void;

    /**
     * Prompt the model with a given input and optional parameters.
     * The raw response string is delivered to the provided callback;
     * use createCompletion for a structured result.
     * @param q The prompt input.
     * @param params Optional parameters for the prompt context.
     * @param callback Receives the raw model response.
     */
    raw_prompt(q: string, params: Partial<LLModelPromptContext>, callback: (res: string) => void): void; // TODO work on return type

    /**
     * Whether the model is loaded or not.
     */
    isModelLoaded(): boolean;

    /**
     * Where to search for the pluggable backend libraries
     */
    setLibraryPath(s: string): void;
    /**
     * Where to get the pluggable backend libraries
     */
    getLibraryPath(): string;
}

interface LoadModelOptions {
    modelPath?: string;
    librariesPath?: string;
    allowDownload?: boolean;
    verbose?: boolean;
}

declare function loadModel(
    modelName: string,
    options?: LoadModelOptions
): Promise<LLModel>;

/**
 * The nodejs equivalent to python binding's chat_completion
 * @param {LLModel} llmodel - The language model object.
 * @param {PromptMessage[]} messages - The array of messages for the conversation.
 * @param {CompletionOptions} options - The options for creating the completion.
 * @returns {CompletionReturn} The completion result.
 * @example
 * const llmodel = new LLModel(model)
 * const messages = [
 * { role: 'system', content: 'You are a weather forecaster.' },
 * { role: 'user', content: 'should i go out today?' } ]
 * const completion = await createCompletion(llmodel, messages, {
 *  verbose: true,
 *  temp: 0.9,
 * })
 * console.log(completion.choices[0].message.content)
 * // No, it's going to be cold and rainy.
 */
declare function createCompletion(
    llmodel: LLModel,
    messages: PromptMessage[],
    options?: CompletionOptions
): Promise<CompletionReturn>;

/**
 * The options for creating the completion.
 */
interface CompletionOptions extends Partial<LLModelPromptContext> {
    /**
     * Indicates if verbose logging is enabled.
     * @default true
     */
    verbose?: boolean;

    /**
     * Indicates if the default header is included in the prompt.
     * @default true
     */
    hasDefaultHeader?: boolean;

    /**
     * Indicates if the default footer is included in the prompt.
     * @default true
     */
    hasDefaultFooter?: boolean;
}

/**
 * A message in the conversation, identical to OpenAI's chat message.
 */
interface PromptMessage {
    /** The role of the message. */
    role: "system" | "assistant" | "user";

    /** The message content. */
    content: string;
}

/**
 * The result of the completion, similar to OpenAI's format.
 */
interface CompletionReturn {
    /** The model name.
     * @type {ModelFile}
     */
    model: ModelFile[ModelType];

    /** Token usage report. */
    usage: {
        /** The number of tokens used in the prompt. */
        prompt_tokens: number;

        /** The number of tokens used in the completion. */
        completion_tokens: number;

        /** The total number of tokens used. */
        total_tokens: number;
    };

    /** The generated completions. */
    choices: CompletionChoice[];
}

/**
 * A completion choice, similar to OpenAI's format.
 */
interface CompletionChoice {
    /** Response message */
    message: PromptMessage;
}

/**
 * Model inference arguments for generating completions.
 */
interface LLModelPromptContext {
    /** The size of the raw logits vector. */
    logits_size: number;

    /** The size of the raw tokens vector. */
    tokens_size: number;

    /** The number of tokens in the past conversation. */
    n_past: number;

    /** The number of tokens possible in the context window.
     * @default 1024
     */
    n_ctx: number;

    /** The number of tokens to predict.
     * @default 128
     */
    n_predict: number;

    /** The top-k logits to sample from.
     * @default 40
     */
    top_k: number;

    /** The nucleus sampling probability threshold.
     * @default 0.9
     */
    top_p: number;

    /** The temperature to adjust the model's output distribution.
     * @default 0.72
     */
    temp: number;

    /** The number of predictions to generate in parallel.
     * @default 8
     */
    n_batch: number;

    /** The penalty factor for repeated tokens.
     * @default 1
     */
    repeat_penalty: number;

    /** The number of last tokens to penalize.
     * @default 10
     */
    repeat_last_n: number;

    /** The percentage of context to erase if the context window is exceeded.
     * @default 0.5
     */
    context_erase: number;
}
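Because CompletionOptions (declared above) extends Partial<LLModelPromptContext>, any of the sampling fields documented in LLModelPromptContext can be passed straight to createCompletion. A hedged usage sketch, not part of the diff (model name and values are illustrative only):

import { loadModel, createCompletion } from 'gpt4all'

const model = await loadModel('ggml-gpt4all-j-v1.3-groovy.bin')
const completion = await createCompletion(model, [
    { role: 'user', content: 'Name three uses for a brick.' },
], {
    temp: 0.5,      // overrides the 0.72 default
    top_k: 40,
    top_p: 0.9,
    n_predict: 64,
    verbose: false,
})
console.log(completion.choices[0].message.content)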
/**
 * TODO: Help wanted to implement this
 */
declare function createTokenStream(
    llmodel: LLModel,
    messages: PromptMessage[],
    options: CompletionOptions
): (ll: LLModel) => AsyncGenerator<string>;

/**
 * From python api:
 * models will be stored in (homedir)/.cache/gpt4all/
 */
declare const DEFAULT_DIRECTORY: string;

/**
 * From python api:
 * The default path for dynamic libraries to be stored.
 * You may separate paths by a semicolon to search in multiple areas.
 * This searches DEFAULT_DIRECTORY/libraries, cwd/libraries, and finally cwd.
 */
declare const DEFAULT_LIBRARIES_DIRECTORY: string;

export {
    ModelType,
    ModelFile,
    LLModel,
    LLModelPromptContext,
    PromptMessage,
    CompletionOptions,
    LoadModelOptions,
    loadModel,
    createCompletion,
    createTokenStream,
    DEFAULT_DIRECTORY,
    DEFAULT_LIBRARIES_DIRECTORY,
};
@ -1,112 +1,138 @@
"use strict";
/// This file implements the gpt4all.d.ts file endings.
/// Written in commonjs to support both ESM and CJS projects.
const { existsSync } = require("fs");
const path = require("node:path");
const { LLModel } = require("node-gyp-build")(path.resolve(__dirname, ".."));
const {
    retrieveModel,
    downloadModel,
    appendBinSuffixIfMissing,
} = require("./util.js");
const config = require("./config.js");

async function loadModel(modelName, options = {}) {
    const loadOptions = {
        modelPath: config.DEFAULT_DIRECTORY,
        librariesPath: config.DEFAULT_LIBRARIES_DIRECTORY,
        allowDownload: true,
        verbose: true,
        ...options,
    };

    await retrieveModel(modelName, {
        modelPath: loadOptions.modelPath,
        allowDownload: loadOptions.allowDownload,
        verbose: loadOptions.verbose,
    });

    const libSearchPaths = loadOptions.librariesPath.split(";");

    let libPath = null;

    for (const searchPath of libSearchPaths) {
        if (existsSync(searchPath)) {
            libPath = searchPath;
            break;
        }
    }

    const llmOptions = {
        model_name: appendBinSuffixIfMissing(modelName),
        model_path: loadOptions.modelPath,
        library_path: libPath,
    };

    if (loadOptions.verbose) {
        console.log("Creating LLModel with options:", llmOptions);
    }
    const llmodel = new LLModel(llmOptions);

    return llmodel;
}

function createPrompt(messages, hasDefaultHeader, hasDefaultFooter) {
    let fullPrompt = "";

    for (const message of messages) {
        if (message.role === "system") {
            const systemMessage = message.content + "\n";
            fullPrompt += systemMessage;
        }
    }
    if (hasDefaultHeader) {
        fullPrompt += `### Instruction:
The prompt below is a question to answer, a task to complete, or a conversation
to respond to; decide which and write an appropriate response.
\n### Prompt:
`;
    }
    for (const message of messages) {
        if (message.role === "user") {
            const user_message = "\n" + message["content"];
            fullPrompt += user_message;
        }
        if (message["role"] == "assistant") {
            const assistant_message = "\nResponse: " + message["content"];
            fullPrompt += assistant_message;
        }
    }
    if (hasDefaultFooter) {
        fullPrompt += "\n### Response:";
    }

    return fullPrompt;
}
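To make the prompt format concrete, here is a small illustrative sketch (not part of the diff) of what createPrompt produces; createPrompt is module-internal, so assume it is in scope:

const messages = [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is 2 + 2?" },
    { role: "assistant", content: "4" },
    { role: "user", content: "And doubled?" },
];

// With hasDefaultHeader = true and hasDefaultFooter = true this yields, roughly:
//   "You are a helpful assistant.\n"
// + "### Instruction: ...decide which and write an appropriate response.\n### Prompt:\n"
// + "\nWhat is 2 + 2?"
// + "\nResponse: 4"
// + "\nAnd doubled?"
// + "\n### Response:"
const fullPrompt = createPrompt(messages, true, true);
console.log(fullPrompt);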

async function createCompletion(
    llmodel,
    messages,
    options = {
        hasDefaultHeader: true,
        hasDefaultFooter: false,
        verbose: true,
    }
) {
    // build the full prompt string from the message list
    const fullPrompt = createPrompt(
        messages,
        options.hasDefaultHeader ?? true,
        options.hasDefaultFooter
    );
    if (options.verbose) {
        console.log("Sent: " + fullPrompt);
    }
    const promisifiedRawPrompt = new Promise((resolve, rej) => {
        llmodel.raw_prompt(fullPrompt, options, (s) => {
            resolve(s);
        });
    });
    return promisifiedRawPrompt.then((response) => {
        return {
            llmodel: llmodel.name(),
            usage: {
                prompt_tokens: fullPrompt.length,
                completion_tokens: response.length, //TODO
                total_tokens: fullPrompt.length + response.length, //TODO
            },
            choices: [
                {
                    message: {
                        role: "assistant",
                        content: response,
                    },
                },
            ],
        };
    });
}

module.exports = {
    ...config,
    LLModel,
    createCompletion,
    downloadModel,
    retrieveModel,
    loadModel,
};
gpt4all-bindings/typescript/src/util.d.ts (vendored, new file)
@ -0,0 +1,69 @@
/// <reference types="node" />
declare module "gpt4all";

/**
 * Initiates the download of a model file of a specific model type.
 * By default this downloads without waiting. Use the controller returned to alter this behavior.
 * @param {string} modelName - The model file to be downloaded.
 * @param {DownloadModelOptions} options - to pass into the downloader. Default is { modelPath: DEFAULT_DIRECTORY, debug: false }.
 * @returns {DownloadController} object that allows controlling the download process.
 *
 * @throws {Error} If the model already exists in the specified location.
 * @throws {Error} If the model cannot be found at the specified url.
 *
 * @example
 * const controller = download('ggml-gpt4all-j-v1.3-groovy.bin')
 * controller.promise().then(() => console.log('Downloaded!'))
 */
declare function downloadModel(
    modelName: string,
    options?: DownloadModelOptions
): DownloadController;

/**
 * Options for the model download process.
 */
export interface DownloadModelOptions {
    /**
     * Location to download the model.
     * Default is DEFAULT_DIRECTORY ((homedir)/.cache/gpt4all).
     */
    modelPath?: string;

    /**
     * Debug mode -- check how long it took to download in seconds
     * @default false
     */
    debug?: boolean;

    /**
     * Remote download url. Defaults to `https://gpt4all.io/models`
     * @default https://gpt4all.io/models
     */
    url?: string;
}

declare function listModels(): Promise<Record<string, string>[]>;

interface RetrieveModelOptions {
    allowDownload?: boolean;
    verbose?: boolean;
    modelPath?: string;
}

declare function retrieveModel(
    model: string,
    options?: RetrieveModelOptions
): Promise<string>;

/**
 * Model download controller.
 */
interface DownloadController {
    /** Cancel the request to download from gpt4all website if this is called. */
    cancel: () => void;
    /** Convert the downloader into a promise, allowing people to await and manage its lifetime */
    promise: () => Promise<void>;
}

export { downloadModel, DownloadModelOptions, DownloadController, listModels, retrieveModel, RetrieveModelOptions };
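retrieveModel ties listModels and downloadModel together (see src/util.js below): it resolves to a local model path, downloading the file first when it is missing and allowDownload permits it. A hedged usage sketch, not part of the diff (the model name is only an example):

import { retrieveModel } from 'gpt4all'

// Resolves to the local path of the model file, downloading it first
// if allowDownload is true and the file is not already present.
const modelPath = await retrieveModel('ggml-gpt4all-j-v1.3-groovy.bin', {
    allowDownload: true,
    verbose: true,
})
console.log(modelPath)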
gpt4all-bindings/typescript/src/util.js (new file)
@ -0,0 +1,156 @@
const { createWriteStream, existsSync } = require("fs");
const { performance } = require("node:perf_hooks");
const path = require("node:path");
const { mkdirp } = require("mkdirp");
const { DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY } = require("./config.js");

async function listModels() {
    const res = await fetch("https://gpt4all.io/models/models.json");
    const modelList = await res.json();
    return modelList;
}

function appendBinSuffixIfMissing(name) {
    if (!name.endsWith(".bin")) {
        return name + ".bin";
    }
    return name;
}

// readChunks() reads from the provided reader and yields the results into an async iterable
// https://css-tricks.com/web-streams-everywhere-and-fetch-for-node-js/
function readChunks(reader) {
    return {
        async *[Symbol.asyncIterator]() {
            let readResult = await reader.read();
            while (!readResult.done) {
                yield readResult.value;
                readResult = await reader.read();
            }
        },
    };
}

function downloadModel(
    modelName,
    options = {}
) {
    const downloadOptions = {
        modelPath: DEFAULT_DIRECTORY,
        debug: false,
        url: "https://gpt4all.io/models",
        ...options,
    };

    const modelFileName = appendBinSuffixIfMissing(modelName);
    const fullModelPath = path.join(downloadOptions.modelPath, modelFileName);
    const modelUrl = `${downloadOptions.url}/${modelFileName}`

    if (existsSync(fullModelPath)) {
        throw Error(`Model already exists at ${fullModelPath}`);
    }

    const abortController = new AbortController();
    const signal = abortController.signal;

    //wrapper function to get the readable stream from request
    // const baseUrl = options.url ?? "https://gpt4all.io/models";
    const fetchModel = () =>
        fetch(modelUrl, {
            signal,
        }).then((res) => {
            if (!res.ok) {
                throw Error(`Failed to download model from ${modelUrl} - ${res.statusText}`);
            }
            return res.body.getReader();
        });

    //a promise that executes and writes to a stream. Resolves when done writing.
    const res = new Promise((resolve, reject) => {
        fetchModel()
            //Resolves an array of a reader and writestream.
            .then((reader) => [reader, createWriteStream(fullModelPath)])
            .then(async ([readable, wstream]) => {
                console.log("Downloading @ ", fullModelPath);
                let perf;
                if (options.debug) {
                    perf = performance.now();
                }
                for await (const chunk of readChunks(readable)) {
                    wstream.write(chunk);
                }
                if (options.debug) {
                    console.log(
                        "Time taken: ",
                        (performance.now() - perf).toFixed(2),
                        " ms"
                    );
                }
                resolve(fullModelPath);
            })
            .catch(reject);
    });

    return {
        cancel: () => abortController.abort(),
        promise: () => res,
    };
};

async function retrieveModel (
    modelName,
    options = {}
) {
    const retrieveOptions = {
        modelPath: DEFAULT_DIRECTORY,
        allowDownload: true,
        verbose: true,
        ...options,
    };

    await mkdirp(retrieveOptions.modelPath);

    const modelFileName = appendBinSuffixIfMissing(modelName);
    const fullModelPath = path.join(retrieveOptions.modelPath, modelFileName);
    const modelExists = existsSync(fullModelPath);

    if (modelExists) {
        return fullModelPath;
    }

    if (!retrieveOptions.allowDownload) {
        throw Error(`Model does not exist at ${fullModelPath}`);
    }

    const availableModels = await listModels();
    const foundModel = availableModels.find((model) => model.filename === modelFileName);

    if (!foundModel) {
        throw Error(`Model "${modelName}" is not available.`);
    }

    if (retrieveOptions.verbose) {
        console.log(`Downloading ${modelName}...`);
    }

    const downloadController = downloadModel(modelName, {
        modelPath: retrieveOptions.modelPath,
        debug: retrieveOptions.verbose,
    });

    const downloadPath = await downloadController.promise();

    if (retrieveOptions.verbose) {
        console.log(`Model downloaded to ${downloadPath}`);
    }

    return downloadPath
}

module.exports = {
    appendBinSuffixIfMissing,
    downloadModel,
    retrieveModel,
};
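Since downloadModel returns a controller backed by an AbortController, an in-flight download can be cancelled as well as awaited. A hedged sketch, not part of the diff (the model name is an example; the file must not already exist at the target path, or downloadModel throws immediately):

import { downloadModel } from 'gpt4all'

const controller = downloadModel('ggml-mpt-7b-chat.bin', { debug: true })

// Give up after five seconds; the underlying fetch is aborted through the
// AbortController held inside downloadModel.
const timer = setTimeout(() => controller.cancel(), 5000)

try {
    const modelPath = await controller.promise()
    console.log('Saved to', modelPath)
} catch (err) {
    console.error('Download did not complete:', err)
} finally {
    clearTimeout(timer)
}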
@ -1,14 +0,0 @@ (file deleted)
#include "stdcapture.h"

CoutRedirect::CoutRedirect() {
    old = std::cout.rdbuf(buffer.rdbuf()); // redirect cout to buffer stream
}

std::string CoutRedirect::getString() {
    return buffer.str(); // get string
}

CoutRedirect::~CoutRedirect() {
    std::cout.rdbuf(old); // reverse redirect
}
@ -1,21 +0,0 @@ (file deleted)
//https://stackoverflow.com/questions/5419356/redirect-stdout-stderr-to-a-string
#ifndef COUTREDIRECT_H
#define COUTREDIRECT_H

#include <iostream>
#include <streambuf>
#include <string>
#include <sstream>

class CoutRedirect {
public:
    CoutRedirect();
    std::string getString();
    ~CoutRedirect();

private:
    std::stringstream buffer;
    std::streambuf* old;
};

#endif // COUTREDIRECT_H
@ -1,38 +1,5 @@
import * as assert from 'node:assert'
import { download } from '../src/gpt4all.js'

assert.rejects(async () => download('poo.bin').promise());
File diff suppressed because it is too large