typescript: publish alpha on npm and lots of cleanup, documentation, and more (#913)

* fix typo so padding can be accessed

* Small cleanups for settings dialog.

* Fix the build.

* localdocs

* Fixup the rescan. Fix debug output.

* Add remove folder implementation.

* Remove this signal as unnecessary for now.

* Cleanup of the database, better chunking, better matching.

* Add new reverse prompt for new localdocs context feature.

* Add a new muted text color.

* Turn off the debugging messages by default.

* Add prompt processing and localdocs to the busy indicator in UI.

* Specify a large number of suffixes we will search for now.

* Add a collection list to support a UI.

* Add a localdocs tab.

* Start fleshing out the localdocs ui.

* Begin implementing the localdocs ui in earnest.

* Clean up the settings dialog for localdocs a bit.

* Add more of the UI for selecting collections for chats.

* Complete the settings for localdocs.

* Adds the collections to serialize and implement references for localdocs.

* Store the references separately so they are not sent to datalake.

* Add context link to references.

* Don't use the full path in reference text.

* Various fixes to remove unnecessary warnings.

* Add a newline

* ignore rider and vscode dirs

* create test project and basic model loading tests

* make sample print usage and cleaner

* Get the backend as well as the client building/working with msvc.

* Libraries named differently on msvc.

* Bump the version number.

* This time remember to bump the version right after a release.

* rm redundant json

* More precise condition

* Nicer handling of missing model directory.
Correct exception message.

* Log where the model was found

* Concise model matching

* reduce nesting, better error reporting

* convert to f-strings

* less magic number

* 1. Cleanup the interrupted download
2. with-syntax

* Redundant else

* Do not ignore explicitly passed 4 threads

* Correct return type

* Add optional verbosity

* Correct indentation of the multiline error message

* one function to append .bin suffix

* hotfix default verbose option

* export hidden types and fix prompt() type

* tiny typo (#739)

* Update README.md (#738)

* Update README.md

fix golang gpt4all import path

Signed-off-by: Nandakumar <nandagunasekaran@gmail.com>

* Update README.md

Signed-off-by: Nandakumar <nandagunasekaran@gmail.com>

---------

Signed-off-by: Nandakumar <nandagunasekaran@gmail.com>

* fix(training instructions): model repo name (#728)

Signed-off-by: Chase McDougall <chasemcdougall@hotmail.com>

* C# Bindings - Prompt formatting (#712)

* Added support for custom prompt formatting

* more docs added

* bump version

* clean up cc files and revert things

* LocalDocs documentation initial (#761)

* LocalDocs documentation initial

* Improved localdocs documentation (#762)

* Improved localdocs documentation

* Improved localdocs documentation

* Improved localdocs documentation

* Improved localdocs documentation

* New tokenizer implementation for MPT and GPT-J

Improves output quality by making these tokenizers more closely
match the behavior of the huggingface `tokenizers` based BPE
tokenizers these models were trained with.

Featuring:
 * Fixed unicode handling (via ICU)
 * Fixed BPE token merge handling
 * Complete added vocabulary handling

* buf_ref.into() can be const now

* add tokenizer readme w/ instructions for convert script

* Revert "add tokenizer readme w/ instructions for convert script"

This reverts commit 9c15d1f83e.

* Revert "buf_ref.into() can be const now"

This reverts commit 840e011b75.

* Revert "New tokenizer implementation for MPT and GPT-J"

This reverts commit ee3469ba6c.

* Fix remove model from model download for regular models.

* Fixed formatting of localdocs docs (#770)

* construct and return the correct response when the request is a chat completion

* chore: update typings to keep consistent with python api

* progress, updating createCompletion to mirror py api

* update spec, unfinished backend

* prebuild binaries for package distribution using prebuildify/node-gyp-build

* Get rid of blocking behavior for regenerate response.

* Add a label to the model loading visual indicator.

* Use the new MyButton for the regenerate response button.

* Add a hover and pressed to the visual indication of MyButton.

* Fix wording of this accessible description.

* Some color and theme enhancements to make the UI contrast a bit better.

* Make the comboboxes align in UI.

* chore: update namespace and fix prompt bug

* fix linux build

* add roadmap

* Fix offset of prompt/response icons for smaller text.

* Dlopen backend 5 (#779)

Major change to the backend that allows for pluggable versions of llama.cpp/ggml. This was squash-merged from dlopen_backend_5, where the history is preserved.

* Add a custom busy indicator to further align look and feel across platforms.

* Draw the indicator for combobox to ensure it looks the same on all platforms.

* Fix warning.

* Use the proper text color for sending messages.

* Fixup the plus new chat button.

* Make all the toolbuttons highlight on hover.

* Advanced avxonly autodetection (#744)

* Advanced avxonly requirement detection

* chore: support llamaversion >= 3 and ggml default

* Dlopen better implementation management (Version 2)

* Add fixme's and clean up a bit.

* Documentation improvements on LocalDocs (#790)

* Update gpt4all_chat.md

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>

* typo

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>

---------

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>

* Adapt code

* Makefile changes (WIP to test)

* Debug

* Adapt makefile

* Style

* Implemented logging mechanism (#785)

* Cleaned up implementation management (#787)

* Cleaned up implementation management

* Initialize LLModel::m_implementation to nullptr

* llmodel.h: Moved dlhandle fwd declare above LLModel class

* Fix compile

* Fixed double-free in LLModel::Implementation destructor

* Allow user to specify custom search path via $GPT4ALL_IMPLEMENTATIONS_PATH (#789)

* Drop leftover include

* Add ldl in gpt4all.go for dynamic linking (#797)

* Logger should also output to stderr

* Fix MSVC Build, Update C# Binding Scripts

* Update gpt4all_chat.md (#800)

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>

* C# Bindings - improved logging (#714)

* added optional support for .NET logging

* bump version and add missing alpha suffix

* avoid creating additional namespace for extensions

* prefer NullLogger/NullLoggerFactory over null-conditional ILogger to avoid errors

---------

Signed-off-by: mvenditto <venditto.matteo@gmail.com>

* Make localdocs work with server mode.

* Better name for database results.

* Fix for stale references after we regenerate.

* Don't hardcode these.

* Fix bug with resetting context with chatgpt model.

* Trying to shrink the copy+paste code and do more code sharing between backend model impl.

* Remove this as it is no longer useful.

* Try and fix build on mac.

* Fix mac build again.

* Add models/release.json to github repo to allow PRs

* Fixed spelling error in models.json

to make CI happy

Signed-off-by: niansa/tuxifan <tuxifan@posteo.de>

* updated bindings code for updated C api

* load all model libs

* model creation is failing... debugging

* load libs correctly

* fixed finding model libs

* cleanup

* cleanup

* more cleanup

* small typo fix

* updated binding.gyp

* Fixed model type for GPT-J (#815)

Signed-off-by: niansa/tuxifan <tuxifan@posteo.de>

* Fixed tons of warnings and clazy findings (#811)

* Some tweaks to UI to make window resizing smooth and flow nicely.

* Min constraints on about dialog.

* Prevent flashing of white on resize.

* Actually use the theme dark color for window background.

* Add the ability to change the directory via text field not just 'browse' button.

* add scripts to build dlls

* markdown doc gen

* add scripts, nearly done moving breaking changes

* merge with main

* oops, fixed comment

* more meaningful name

* leave for testing

* Only default mlock on macOS where swap seems to be a problem

Repeating the change that was once made in https://github.com/nomic-ai/gpt4all/pull/663 but was then overridden by 9c6c09cbd2

Signed-off-by: Peter Gagarinov <pgagarinov@users.noreply.github.com>

* Add a collection immediately and show a placeholder + busy indicator in localdocs settings.

* some tweaks to optional types and defaults

* mingw script for windows compilation

* Update README.md

huggingface -> Hugging Face

Signed-off-by: Ikko Eltociear Ashimine <eltociear@gmail.com>

* Backend prompt dedup (#822)

* Deduplicated prompt() function code

* Better error handling when the model fails to load.

* We no longer have an avx_only repository and better error handling for minimum hardware requirements. (#833)

* Update build_and_run.md (#834)

Signed-off-by: AT <manyoso@users.noreply.github.com>

* Trying out a new feature to download directly from huggingface.

* Try again with the url.

* Allow for download of models hosted on third party hosts.

* Fix up for newer models on reset context. This fixes the model from totally failing after a reset context.

* Update to latest llama.cpp

* Remove older models that are not as popular. (#837)

* Remove older models that are not as popular.

* Update models.json

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>

---------

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
Co-authored-by: Andriy Mulyar <andriy.mulyar@gmail.com>

* Update models.json (#838)

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>

* Update models.json

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>

* feat: finally compiled on windows (MSVC)

* update README and spec and promisify createCompletion

* update d.ts

* Make installers work with mac/windows for big backend change.

* Need this so the linux installer packages it as a dependency.

* Try and fix mac.

* Fix compile on mac.

* These need to be installed for them to be packaged and work for both mac and windows.

* Fix installers for windows and linux.

* Fix symbol resolution on windows.

* updated pypi version

* Release notes for version 2.4.5 (#853)

* Update README.md (#854)

Signed-off-by: AT <manyoso@users.noreply.github.com>

* Documentation for model sideloading (#851)

* Documentation for model sideloading

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>

* Update gpt4all_chat.md

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>

---------

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>

* Speculative fix for windows llama models with installer.

* Revert "Speculative fix for windows llama models with installer."

This reverts commit add725d1eb.

* Revert "Fix bug with resetting context with chatgpt model." (#859)

This reverts commit e0dcf6a14f.

* Fix llama models on linux and windows.

* Bump the version.

* New release notes

* Set thread counts after loading model (#836)

* Update gpt4all_faq.md (#861)

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>

* Supports downloading officially supported models not hosted on gpt4all R2

* Replit Model (#713)

* porting over replit code model to gpt4all

* replaced memory with kv_self struct

* continuing debug

* welp it built but lot of sus things

* working model loading and somewhat working generate.. need to format response?

* revert back to semi working version

* finally got rid of weird formatting

* figured out problem is with python bindings - this is good to go for testing

* addressing PR feedback

* output refactor

* fixed prompt response collection

* cleanup

* addressing PR comments

* building replit backend with new ggmlver code

* chatllm replit and clean python files

* cleanup

* updated replit to match new llmodel api

* match llmodel api and change size_t to Token

* resolve PR comments

* replit model commit comment

* Synced llama.cpp.cmake with upstream (#887)

* Fix for windows.

* fix: build script

* Revert "Synced llama.cpp.cmake with upstream (#887)"

This reverts commit 5c5e10c1f5.

* Update README.md (#906)

Add PyPI link and add clickable, more specific link to documentation

Signed-off-by: Claudius Ellsel <claudius.ellsel@live.de>

* Update CollectionsDialog.qml (#856)

Phrasing for localdocs

Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>

* sampling: remove incorrect offset for n_vocab (#900)

No effect today, but this avoids a *potential* bug later if we start
using actualVocabSize, which applies when a model has a larger
embedding tensor / number of output logits than actually trained
tokens, to leave room for adding extras during finetuning. Presently
all of our models have had "placeholder" tokens in the vocab, so this
hasn't broken anything; but if the sizes did differ, we would want to
keep the start of the logits buffer unchanged and only limit the count
(the equivalent of `logits[:actualVocabSize]`), not a window shifted to
the end of the buffer (`logits[-actualVocabSize:]`, which is what the
removed offset produced).
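
A minimal TypeScript sketch of the indexing difference described above, using a plain array as a stand-in for the raw logit buffer (the function names are illustrative and not part of the codebase):

```ts
// Assume logits.length === n_vocab >= actualVocabSize, with "placeholder" slots at the end.

// Intended: keep the start point unchanged and take only the trained entries.
const keepTrained = (logits: number[], actualVocabSize: number) =>
    logits.slice(0, actualVocabSize);   // logits[:actualVocabSize]

// What the removed offset amounted to: a window shifted toward the end of the buffer.
const shiftedWindow = (logits: number[], actualVocabSize: number) =>
    logits.slice(-actualVocabSize);     // logits[-actualVocabSize:]

// With logits = [1, 2, 3, 4] and actualVocabSize = 3:
// keepTrained   -> [1, 2, 3]   (trained vocabulary only)
// shiftedWindow -> [2, 3, 4]   (off by the placeholder count whenever the sizes differ)
```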

* non-llama: explicitly greedy sampling for temp<=0 (#901)

Copied directly from llama.cpp; without this, temp=0.0 just scales all
the logits to infinity and gives bad output.
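
A rough TypeScript sketch of the behavior this adds; the real change lives in the C++ sampling code copied from llama.cpp, so treat this purely as an illustration of the temp <= 0 branch:

```ts
// For temp <= 0, pick the argmax instead of dividing by temperature, which would
// push the scaled logits toward +/- infinity and wreck the resulting distribution.
function sampleToken(logits: number[], temp: number): number {
    if (temp <= 0) {
        // Explicit greedy sampling: index of the largest logit.
        return logits.reduce((best, v, i) => (v > logits[best] ? i : best), 0);
    }
    // Otherwise: softmax over logits / temp, then draw from that distribution.
    const scaled = logits.map((l) => l / temp);
    const max = Math.max(...scaled);
    const exps = scaled.map((l) => Math.exp(l - max));
    const sum = exps.reduce((a, b) => a + b, 0);
    let r = Math.random() * sum;
    for (let i = 0; i < exps.length; i++) {
        r -= exps[i];
        if (r <= 0) return i;
    }
    return exps.length - 1;
}
```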

* work on thread safety and cleaning up, adding object option

* chore: cleanup tests and spec

* refactor for object based startup

* more docs

* Circleci builds for Linux, Windows, and macOS for gpt4all-chat.

* more docs

* Synced llama.cpp.cmake with upstream

* add lock file to ignore codespell

* Move usage in Python bindings readme to own section (#907)

Have own section for short usage example, as it is not specific to local build

Signed-off-by: Claudius Ellsel <claudius.ellsel@live.de>

* Always sync for circleci.

* update models json with replit model

* Forgot to bump.

* Change the default values for generation in GUI

* Removed double-static from variables in replit.cpp

The anonymous namespace already makes it static.

Signed-off-by: niansa/tuxifan <tuxifan@posteo.de>

* Generator in Python Bindings - streaming yields tokens at a time (#895)

* generator method

* cleanup

* bump version number for clarity

* added replace in decode to avoid unicodedecode exception

* revert back to _build_prompt

* Do auto detection by default in C++ API

Signed-off-by: niansa/tuxifan <tuxifan@posteo.de>

* remove comment

* add comments for index.h

* chore: add new models and edit ignore files and documentation

* llama on Metal (#885)

Support latest llama with Metal

---------

Co-authored-by: Adam Treat <adam@nomic.ai>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>

* Revert "llama on Metal (#885)"

This reverts commit b59ce1c6e7.

* add more readme stuff and debug info

* spell

* Metal+LLama take two (#929)

Support latest llama with Metal

---------

Co-authored-by: Adam Treat <adam@nomic.ai>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>

* add prebuilts for windows

* Add new solution for context links that does not force regular markdown (#938)

in responses which is disruptive to code completions in responses.

* add prettier

* split out non llm related methods into util.js, add listModels method

* add prebuild script for creating all platforms bindings at once

* check in prebuild linux/so libs and allow distribution of napi prebuilds

* apply autoformatter

* move constants in config.js, add loadModel and retrieveModel methods

* Clean up the context links a bit.

* Don't interfere with selection.

* Add code blocks and python syntax highlighting.

* Spelling error.

* Add c++/c highlighting support.

* Fix some bugs with bash syntax and add some C23 keywords.

* Bugfixes for prompt syntax highlighting.

* Try and fix a false positive from codespell.

* When recalculating context we can't erase the BOS.

* Fix Windows MSVC AVX builds
- bug introduced in 557c82b5ed
- currently getting: `warning C5102: ignoring invalid command-line macro definition '/arch:AVX2'`
- solution is to use `_options(...)` not `_definitions(...)`

* remove .so unneeded path

---------

Signed-off-by: Nandakumar <nandagunasekaran@gmail.com>
Signed-off-by: Chase McDougall <chasemcdougall@hotmail.com>
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
Signed-off-by: mvenditto <venditto.matteo@gmail.com>
Signed-off-by: niansa/tuxifan <tuxifan@posteo.de>
Signed-off-by: Peter Gagarinov <pgagarinov@users.noreply.github.com>
Signed-off-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
Signed-off-by: AT <manyoso@users.noreply.github.com>
Signed-off-by: Claudius Ellsel <claudius.ellsel@live.de>
Co-authored-by: Justin Wang <justinwang46@gmail.com>
Co-authored-by: Adam Treat <treat.adam@gmail.com>
Co-authored-by: redthing1 <redthing1@alt.icu>
Co-authored-by: Konstantin Gukov <gukkos@gmail.com>
Co-authored-by: Richard Guo <richardg7890@gmail.com>
Co-authored-by: Joseph Mearman <joseph@mearman.co.uk>
Co-authored-by: Nandakumar <nandagunasekaran@gmail.com>
Co-authored-by: Chase McDougall <chasemcdougall@hotmail.com>
Co-authored-by: mvenditto <venditto.matteo@gmail.com>
Co-authored-by: Andriy Mulyar <andriy.mulyar@gmail.com>
Co-authored-by: Aaron Miller <apage43@ninjawhale.com>
Co-authored-by: FoivosC <christoulakis.foivos@adlittle.com>
Co-authored-by: limez <limez@protonmail.com>
Co-authored-by: AT <manyoso@users.noreply.github.com>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>
Co-authored-by: niansa <anton-sa@web.de>
Co-authored-by: mudler <mudler@mocaccino.org>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Tim Miller <innerlogic4321@gmail.com>
Co-authored-by: Peter Gagarinov <pgagarinov@users.noreply.github.com>
Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
Co-authored-by: Claudius Ellsel <claudius.ellsel@live.de>
Co-authored-by: pingpongching <golololologol02@gmail.com>
Co-authored-by: Adam Treat <adam@nomic.ai>
Co-authored-by: Cosmic Snow <cosmic-snow@mailfence.com>
Commit 8d53614444 (parent 44bf91855d), authored by Jacob Nguyen and committed via GitHub on 2023-06-12 14:00:20 -05:00.
25 changed files with 4073 additions and 486 deletions


@@ -1,3 +1,3 @@
 [codespell]
 ignore-words-list = blong, belong
-skip = .git,*.pdf,*.svg
+skip = .git,*.pdf,*.svg,*.lock


@@ -1,2 +1,3 @@
 node_modules/
 build/
+prebuilds/


@@ -1,3 +1,4 @@
 test/
 spec/
+scripts/
+build


@@ -2,12 +2,32 @@
 The original [GPT4All typescript bindings](https://github.com/nomic-ai/gpt4all-ts) are now out of date.
 
 - created by [jacoobes](https://github.com/jacoobes) and [nomic ai](https://home.nomic.ai) :D, for all to use.
+- will maintain this repository when possible, new feature requests will be handled through nomic
+
+### Code (alpha)
+
+```js
+import { LLModel, createCompletion, DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY } from '../src/gpt4all.js'
+
+const ll = new LLModel({
+    model_name: 'ggml-vicuna-7b-1.1-q4_2.bin',
+    model_path: './',
+    library_path: DEFAULT_LIBRARIES_DIRECTORY
+});
+
+const response = await createCompletion(ll, [
+    { role : 'system', content: 'You are meant to be annoying and unhelpful.' },
+    { role : 'user', content: 'What is 1 + 1?' }
+]);
+```
+
+### API
+
+- The nodejs api has made strides to mirror the python api. It is not 100% mirrored, but many pieces of the api resemble its python counterpart.
+- [docs](./docs/api.md)
+
 ### Build Instructions
 
-- As of 05/21/2023, Tested on windows (MSVC) only. (somehow got it to work on MSVC 🤯)
+- As of 05/21/2023, Tested on windows (MSVC). (somehow got it to work on MSVC 🤯)
 - binding.gyp is compile config
+- Tested on Ubuntu. Everything seems to work fine
+- MingW works as well to build the gpt4all-backend. HOWEVER, this package works only with MSVC built dlls.
 
 ### Requirements
 
 - git
@@ -31,6 +51,15 @@ cd gpt4all-bindings/typescript
 ```sh
 git submodule update --init --depth 1 --recursive
 ```
+
+**AS OF NEW BACKEND** to build the backend,
+```sh
+yarn build:backend
+```
+This will build platform-dependent dynamic libraries, and will be located in runtimes/(platform)/native The only current way to use them is to put them in the current working directory of your application. That is, **WHEREVER YOU RUN YOUR NODE APPLICATION**
+- llama-xxxx.dll is required.
+- According to whatever model you are using, you'll need to select the proper model loader.
+    - For example, if you running an Mosaic MPT model, you will need to select the mpt-(buildvariant).(dynamiclibrary)
 
 ### Test
 ```sh
 yarn test
@@ -48,9 +77,22 @@ yarn test
 #### spec/
 - Average look and feel of the api
-- Should work assuming a model is installed locally in working directory
+- Should work assuming a model and libraries are installed locally in working directory
 
 #### index.cc
 - The bridge between nodejs and c. Where the bindings are.
+
+#### prompt.cc
+- Handling prompting and inference of models in a threadsafe, asynchronous way.
+
+#### docs/
+- Autogenerated documentation using the script `yarn docs:build`
+
+### Roadmap
+This package is in active development, and breaking changes may happen until the api stabilizes. Here's what's the todo list:
+- [x] prompt models via a threadsafe function in order to have proper non blocking behavior in nodejs
+- [ ] createTokenStream, an async iterator that streams each token emitted from the model. Planning on following this [example](https://github.com/nodejs/node-addon-examples/tree/main/threadsafe-async-iterator)
+- [ ] proper unit testing (integrate with circle ci)
+- [ ] publish to npm under alpha tag `gpt4all@alpha`
+- [ ] have more people test on other platforms (mac tester needed)
+- [x] switch to new pluggable backend
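
The roadmap above lists createTokenStream as not yet implemented. Purely as a hypothetical sketch of how the planned async-iterator API might be consumed — based on the current type stub, which returns a function yielding an AsyncGenerator<string>, and on the constructor shown in the example above — it could look roughly like this (the `gpt4all` import path and the option values are assumptions):

```ts
import { LLModel, createTokenStream, DEFAULT_LIBRARIES_DIRECTORY } from 'gpt4all';

const model = new LLModel({
    model_name: 'ggml-vicuna-7b-1.1-q4_2.bin',
    model_path: './',
    library_path: DEFAULT_LIBRARIES_DIRECTORY,
});

// createTokenStream(llmodel, messages, options) is typed to return
// (ll: LLModel) => AsyncGenerator<string>, so tokens would stream like this:
const stream = createTokenStream(model, [{ role: 'user', content: 'Tell me a story.' }], {});
for await (const token of stream(model)) {
    process.stdout.write(token);
}
```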


@@ -1,45 +1,55 @@
 {
     "targets": [
         {
-            "target_name": "gpt4allts", # gpt4all-ts will cause compile error
-            "cflags!": [ "-fno-exceptions" ],
-            "cflags_cc!": [ "-fno-exceptions" ],
+            "target_name": "gpt4all", # gpt4all-ts will cause compile error
+            "cflags_cc!": [ "-fno-exceptions"],
             "include_dirs": [
                 "<!@(node -p \"require('node-addon-api').include\")",
-                "../../gpt4all-backend/llama.cpp/", # need to include llama.cpp because the include paths for examples/common.h include llama.h relatively
                 "../../gpt4all-backend",
             ],
-            "sources": [ # is there a better way to do this
-                "../../gpt4all-backend/llama.cpp/examples/common.cpp",
-                "../../gpt4all-backend/llama.cpp/ggml.c",
-                "../../gpt4all-backend/llama.cpp/llama.cpp",
-                "../../gpt4all-backend/utils.cpp",
+            "sources": [
+                # PREVIOUS VERSION: had to required the sources, but with newest changes do not need to
+                #"../../gpt4all-backend/llama.cpp/examples/common.cpp",
+                #"../../gpt4all-backend/llama.cpp/ggml.c",
+                #"../../gpt4all-backend/llama.cpp/llama.cpp",
+                # "../../gpt4all-backend/utils.cpp",
                 "../../gpt4all-backend/llmodel_c.cpp",
-                "../../gpt4all-backend/gptj.cpp",
-                "../../gpt4all-backend/llamamodel.cpp",
-                "../../gpt4all-backend/mpt.cpp",
-                "stdcapture.cc",
+                "../../gpt4all-backend/llmodel.cpp",
+                "prompt.cc",
                 "index.cc",
             ],
             "conditions": [
                 ['OS=="mac"', {
                     'defines': [
-                        'NAPI_CPP_EXCEPTIONS'
-                    ],
+                        'LIB_FILE_EXT=".dylib"',
+                        'NAPI_CPP_EXCEPTIONS',
+                    ]
                 }],
                 ['OS=="win"', {
                     'defines': [
+                        'LIB_FILE_EXT=".dll"',
                         'NAPI_CPP_EXCEPTIONS',
-                        "__AVX2__" # allows SIMD: https://discord.com/channels/1076964370942267462/1092290790388150272/1107564673957630023
                     ],
                     "msvs_settings": {
                         "VCCLCompilerTool": {
                             "AdditionalOptions": [
                                 "/std:c++20",
-                                "/EHsc"
+                                "/EHsc",
                             ],
                         },
                     },
+                }],
+                ['OS=="linux"', {
+                    'defines': [
+                        'LIB_FILE_EXT=".so"',
+                        'NAPI_CPP_EXCEPTIONS',
+                    ],
+                    'cflags_cc!': [
+                        '-fno-rtti',
+                    ],
+                    'cflags_cc': [
+                        '-std=c++20'
+                    ]
                 }]
             ]
         }]


@@ -0,0 +1,623 @@
<!-- Generated by documentation.js. Update this documentation by updating the source code. -->
### Table of Contents
* [download][1]
* [Parameters][2]
* [Examples][3]
* [DownloadOptions][4]
* [location][5]
* [debug][6]
* [url][7]
* [DownloadController][8]
* [cancel][9]
* [promise][10]
* [ModelType][11]
* [ModelFile][12]
* [gptj][13]
* [llama][14]
* [mpt][15]
* [type][16]
* [LLModel][17]
* [constructor][18]
* [Parameters][19]
* [type][20]
* [name][21]
* [stateSize][22]
* [threadCount][23]
* [setThreadCount][24]
* [Parameters][25]
* [raw\_prompt][26]
* [Parameters][27]
* [isModelLoaded][28]
* [setLibraryPath][29]
* [Parameters][30]
* [getLibraryPath][31]
* [createCompletion][32]
* [Parameters][33]
* [Examples][34]
* [CompletionOptions][35]
* [verbose][36]
* [hasDefaultHeader][37]
* [hasDefaultFooter][38]
* [PromptMessage][39]
* [role][40]
* [content][41]
* [prompt\_tokens][42]
* [completion\_tokens][43]
* [total\_tokens][44]
* [CompletionReturn][45]
* [model][46]
* [usage][47]
* [choices][48]
* [CompletionChoice][49]
* [message][50]
* [LLModelPromptContext][51]
* [logits\_size][52]
* [tokens\_size][53]
* [n\_past][54]
* [n\_ctx][55]
* [n\_predict][56]
* [top\_k][57]
* [top\_p][58]
* [temp][59]
* [n\_batch][60]
* [repeat\_penalty][61]
* [repeat\_last\_n][62]
* [context\_erase][63]
* [createTokenStream][64]
* [Parameters][65]
* [DEFAULT\_DIRECTORY][66]
* [DEFAULT\_LIBRARIES\_DIRECTORY][67]
## download
Initiates the download of a model file of a specific model type.
By default this downloads without waiting. use the controller returned to alter this behavior.
### Parameters
* `model` **[ModelFile][12]** The model file to be downloaded.
* `options` **[DownloadOptions][4]** to pass into the downloader. Default is { location: (cwd), debug: false }.
### Examples
```javascript
const controller = download('ggml-gpt4all-j-v1.3-groovy.bin')
controller.promise().then(() => console.log('Downloaded!'))
```
* Throws **[Error][68]** If the model already exists in the specified location.
* Throws **[Error][68]** If the model cannot be found at the specified url.
Returns **[DownloadController][8]** object that allows controlling the download process.
## DownloadOptions
Options for the model download process.
### location
location to download the model.
Default is process.cwd(), or the current working directory
Type: [string][69]
### debug
Debug mode -- check how long it took to download in seconds
Type: [boolean][70]
### url
Remote download url. Defaults to `https://gpt4all.io/models`
Type: [string][69]
## DownloadController
Model download controller.
### cancel
Cancel the request to download from gpt4all website if this is called.
Type: function (): void
### promise
Convert the downloader into a promise, allowing people to await and manage its lifetime
Type: function (): [Promise][71]\<void>
## ModelType
Type of the model
Type: (`"gptj"` | `"llama"` | `"mpt"`)
## ModelFile
Full list of models available
### gptj
List of GPT-J Models
Type: (`"ggml-gpt4all-j-v1.3-groovy.bin"` | `"ggml-gpt4all-j-v1.2-jazzy.bin"` | `"ggml-gpt4all-j-v1.1-breezy.bin"` | `"ggml-gpt4all-j.bin"`)
### llama
List Llama Models
Type: (`"ggml-gpt4all-l13b-snoozy.bin"` | `"ggml-vicuna-7b-1.1-q4_2.bin"` | `"ggml-vicuna-13b-1.1-q4_2.bin"` | `"ggml-wizardLM-7B.q4_2.bin"` | `"ggml-stable-vicuna-13B.q4_2.bin"` | `"ggml-nous-gpt4-vicuna-13b.bin"`)
### mpt
List of MPT Models
Type: (`"ggml-mpt-7b-base.bin"` | `"ggml-mpt-7b-chat.bin"` | `"ggml-mpt-7b-instruct.bin"`)
## type
Model architecture. This argument currently does not have any functionality and is just used as descriptive identifier for user.
Type: [ModelType][11]
## LLModel
LLModel class representing a language model.
This is a base class that provides common functionality for different types of language models.
### constructor
Initialize a new LLModel.
#### Parameters
* `path` **[string][69]** Absolute path to the model file.
<!---->
* Throws **[Error][68]** If the model file does not exist.
### type
either 'gpt', mpt', or 'llama' or undefined
Returns **([ModelType][11] | [undefined][72])**&#x20;
### name
The name of the model.
Returns **[ModelFile][12]**&#x20;
### stateSize
Get the size of the internal state of the model.
NOTE: This state data is specific to the type of model you have created.
Returns **[number][73]** the size in bytes of the internal state of the model
### threadCount
Get the number of threads used for model inference.
The default is the number of physical cores your computer has.
Returns **[number][73]** The number of threads used for model inference.
### setThreadCount
Set the number of threads used for model inference.
#### Parameters
* `newNumber` **[number][73]** The new number of threads.
Returns **void**&#x20;
### raw\_prompt
Prompt the model with a given input and optional parameters.
This is the raw output from std out.
Use the prompt function exported for a value
#### Parameters
* `q` **[string][69]** The prompt input.
* `params` **Partial<[LLModelPromptContext][51]>?** Optional parameters for the prompt context.
Returns **any** The result of the model prompt.
### isModelLoaded
Whether the model is loaded or not.
Returns **[boolean][70]**&#x20;
### setLibraryPath
Where to search for the pluggable backend libraries
#### Parameters
* `s` **[string][69]**&#x20;
Returns **void**&#x20;
### getLibraryPath
Where to get the pluggable backend libraries
Returns **[string][69]**&#x20;
## createCompletion
The nodejs equivalent to python binding's chat\_completion
### Parameters
* `llmodel` **[LLModel][17]** The language model object.
* `messages` **[Array][74]<[PromptMessage][39]>** The array of messages for the conversation.
* `options` **[CompletionOptions][35]** The options for creating the completion.
### Examples
```javascript
const llmodel = new LLModel(model)
const messages = [
{ role: 'system', message: 'You are a weather forecaster.' },
{ role: 'user', message: 'should i go out today?' } ]
const completion = await createCompletion(llmodel, messages, {
verbose: true,
temp: 0.9,
})
console.log(completion.choices[0].message.content)
// No, it's going to be cold and rainy.
```
Returns **[CompletionReturn][45]** The completion result.
## CompletionOptions
**Extends Partial\<LLModelPromptContext>**
The options for creating the completion.
### verbose
Indicates if verbose logging is enabled.
Type: [boolean][70]
### hasDefaultHeader
Indicates if the default header is included in the prompt.
Type: [boolean][70]
### hasDefaultFooter
Indicates if the default footer is included in the prompt.
Type: [boolean][70]
## PromptMessage
A message in the conversation, identical to OpenAI's chat message.
### role
The role of the message.
Type: (`"system"` | `"assistant"` | `"user"`)
### content
The message content.
Type: [string][69]
## prompt\_tokens
The number of tokens used in the prompt.
Type: [number][73]
## completion\_tokens
The number of tokens used in the completion.
Type: [number][73]
## total\_tokens
The total number of tokens used.
Type: [number][73]
## CompletionReturn
The result of the completion, similar to OpenAI's format.
### model
The model name.
Type: [ModelFile][12]
### usage
Token usage report.
Type: {prompt\_tokens: [number][73], completion\_tokens: [number][73], total\_tokens: [number][73]}
### choices
The generated completions.
Type: [Array][74]<[CompletionChoice][49]>
## CompletionChoice
A completion choice, similar to OpenAI's format.
### message
Response message
Type: [PromptMessage][39]
## LLModelPromptContext
Model inference arguments for generating completions.
### logits\_size
The size of the raw logits vector.
Type: [number][73]
### tokens\_size
The size of the raw tokens vector.
Type: [number][73]
### n\_past
The number of tokens in the past conversation.
Type: [number][73]
### n\_ctx
The number of tokens possible in the context window.
Type: [number][73]
### n\_predict
The number of tokens to predict.
Type: [number][73]
### top\_k
The top-k logits to sample from.
Type: [number][73]
### top\_p
The nucleus sampling probability threshold.
Type: [number][73]
### temp
The temperature to adjust the model's output distribution.
Type: [number][73]
### n\_batch
The number of predictions to generate in parallel.
Type: [number][73]
### repeat\_penalty
The penalty factor for repeated tokens.
Type: [number][73]
### repeat\_last\_n
The number of last tokens to penalize.
Type: [number][73]
### context\_erase
The percentage of context to erase if the context window is exceeded.
Type: [number][73]
## createTokenStream
TODO: Help wanted to implement this
### Parameters
* `llmodel` **[LLModel][17]**&#x20;
* `messages` **[Array][74]<[PromptMessage][39]>**&#x20;
* `options` **[CompletionOptions][35]**&#x20;
Returns **function (ll: [LLModel][17]): AsyncGenerator<[string][69]>**&#x20;
## DEFAULT\_DIRECTORY
From python api:
models will be stored in (homedir)/.cache/gpt4all/\`
Type: [string][69]
## DEFAULT\_LIBRARIES\_DIRECTORY
From python api:
The default path for dynamic libraries to be stored.
You may separate paths by a semicolon to search in multiple areas.
This searches DEFAULT\_DIRECTORY/libraries, cwd/libraries, and finally cwd.
Type: [string][69]
[1]: #download
[2]: #parameters
[3]: #examples
[4]: #downloadoptions
[5]: #location
[6]: #debug
[7]: #url
[8]: #downloadcontroller
[9]: #cancel
[10]: #promise
[11]: #modeltype
[12]: #modelfile
[13]: #gptj
[14]: #llama
[15]: #mpt
[16]: #type
[17]: #llmodel
[18]: #constructor
[19]: #parameters-1
[20]: #type-1
[21]: #name
[22]: #statesize
[23]: #threadcount
[24]: #setthreadcount
[25]: #parameters-2
[26]: #raw_prompt
[27]: #parameters-3
[28]: #ismodelloaded
[29]: #setlibrarypath
[30]: #parameters-4
[31]: #getlibrarypath
[32]: #createcompletion
[33]: #parameters-5
[34]: #examples-1
[35]: #completionoptions
[36]: #verbose
[37]: #hasdefaultheader
[38]: #hasdefaultfooter
[39]: #promptmessage
[40]: #role
[41]: #content
[42]: #prompt_tokens
[43]: #completion_tokens
[44]: #total_tokens
[45]: #completionreturn
[46]: #model
[47]: #usage
[48]: #choices
[49]: #completionchoice
[50]: #message
[51]: #llmodelpromptcontext
[52]: #logits_size
[53]: #tokens_size
[54]: #n_past
[55]: #n_ctx
[56]: #n_predict
[57]: #top_k
[58]: #top_p
[59]: #temp
[60]: #n_batch
[61]: #repeat_penalty
[62]: #repeat_last_n
[63]: #context_erase
[64]: #createtokenstream
[65]: #parameters-6
[66]: #default_directory
[67]: #default_libraries_directory
[68]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error
[69]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String
[70]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean
[71]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise
[72]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/undefined
[73]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number
[74]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array
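
A short end-to-end sketch tying together the pieces documented above (download, LLModel, createCompletion, DEFAULT_LIBRARIES_DIRECTORY). The `gpt4all` import path follows package.json's `main` entry; the exact export surface may still shift while the bindings are in alpha:

```ts
import { download, LLModel, createCompletion, DEFAULT_LIBRARIES_DIRECTORY } from 'gpt4all';

async function main() {
    // Download a model into the current working directory and wait for it to finish.
    const controller = download('ggml-gpt4all-j-v1.3-groovy.bin', { debug: true });
    await controller.promise();

    // Load it through the pluggable backend libraries.
    const model = new LLModel({
        model_name: 'ggml-gpt4all-j-v1.3-groovy.bin',
        model_path: './',
        library_path: DEFAULT_LIBRARIES_DIRECTORY,
    });

    // Ask for a chat completion using the OpenAI-style message format.
    const completion = await createCompletion(model, [
        { role: 'user', content: 'Name three uses for a paperclip.' },
    ]);
    console.log(completion.choices[0].message.content);
}

main();
```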


@@ -1,68 +1,95 @@
-#include <napi.h>
-#include <iostream>
-#include "llmodel_c.h"
-#include "llmodel.h"
-#include "gptj.h"
-#include "llamamodel.h"
-#include "mpt.h"
-#include "stdcapture.h"
+#include "index.h"
 
-class NodeModelWrapper : public Napi::ObjectWrap<NodeModelWrapper> {
-public:
-  static Napi::Object Init(Napi::Env env, Napi::Object exports) {
-    Napi::Function func = DefineClass(env, "LLModel", {
+Napi::FunctionReference NodeModelWrapper::constructor;
+
+Napi::Function NodeModelWrapper::GetClass(Napi::Env env) {
+    Napi::Function self = DefineClass(env, "LLModel", {
       InstanceMethod("type", &NodeModelWrapper::getType),
+      InstanceMethod("isModelLoaded", &NodeModelWrapper::IsModelLoaded),
       InstanceMethod("name", &NodeModelWrapper::getName),
       InstanceMethod("stateSize", &NodeModelWrapper::StateSize),
       InstanceMethod("raw_prompt", &NodeModelWrapper::Prompt),
       InstanceMethod("setThreadCount", &NodeModelWrapper::SetThreadCount),
       InstanceMethod("threadCount", &NodeModelWrapper::ThreadCount),
+      InstanceMethod("getLibraryPath", &NodeModelWrapper::GetLibraryPath),
     });
-    Napi::FunctionReference* constructor = new Napi::FunctionReference();
-    *constructor = Napi::Persistent(func);
-    env.SetInstanceData(constructor);
-    exports.Set("LLModel", func);
-    return exports;
+    // Keep a static reference to the constructor
+    //
+    constructor = Napi::Persistent(self);
+    constructor.SuppressDestruct();
+    return self;
 }
 
-  Napi::Value getType(const Napi::CallbackInfo& info)
+Napi::Value NodeModelWrapper::getType(const Napi::CallbackInfo& info)
 {
+    if(type.empty()) {
+        return info.Env().Undefined();
+    }
     return Napi::String::New(info.Env(), type);
 }
 
-  NodeModelWrapper(const Napi::CallbackInfo& info) : Napi::ObjectWrap<NodeModelWrapper>(info)
+NodeModelWrapper::NodeModelWrapper(const Napi::CallbackInfo& info) : Napi::ObjectWrap<NodeModelWrapper>(info)
 {
     auto env = info.Env();
-    std::string weights_path = info[0].As<Napi::String>().Utf8Value();
-    const char *c_weights_path = weights_path.c_str();
-    inference_ = create_model_set_type(c_weights_path);
+    fs::path model_path;
+    std::string full_weight_path;
+    //todo
+    std::string library_path = ".";
+    std::string model_name;
+    if(info[0].IsString()) {
+        model_path = info[0].As<Napi::String>().Utf8Value();
+        full_weight_path = model_path.string();
+        std::cout << "DEPRECATION: constructor accepts object now. Check docs for more.\n";
+    } else {
+        auto config_object = info[0].As<Napi::Object>();
+        model_name = config_object.Get("model_name").As<Napi::String>();
+        model_path = config_object.Get("model_path").As<Napi::String>().Utf8Value();
+        if(config_object.Has("model_type")) {
+            type = config_object.Get("model_type").As<Napi::String>();
+        }
+        full_weight_path = (model_path / fs::path(model_name)).string();
+        if(config_object.Has("library_path")) {
+            library_path = config_object.Get("library_path").As<Napi::String>();
+        } else {
+            library_path = ".";
+        }
+    }
+    llmodel_set_implementation_search_path(library_path.c_str());
+    llmodel_error* e = nullptr;
+    inference_ = std::make_shared<llmodel_model>(llmodel_model_create2(full_weight_path.c_str(), "auto", e));
+    if(e != nullptr) {
+        Napi::Error::New(env, e->message).ThrowAsJavaScriptException();
+        return;
+    }
+    if(GetInference() == nullptr) {
+        std::cerr << "Tried searching libraries in \"" << library_path << "\"" << std::endl;
+        std::cerr << "Tried searching for model weight in \"" << full_weight_path << "\"" << std::endl;
+        Napi::Error::New(env, "Had an issue creating llmodel object, inference is null").ThrowAsJavaScriptException();
+        return;
+    }
-    auto success = llmodel_loadModel(inference_, c_weights_path);
+    auto success = llmodel_loadModel(GetInference(), full_weight_path.c_str());
     if(!success) {
         Napi::Error::New(env, "Failed to load model at given path").ThrowAsJavaScriptException();
         return;
     }
-    name = weights_path.substr(weights_path.find_last_of("/\\") + 1);
+    name = model_name.empty() ? model_path.filename().string() : model_name;
 };
 
-  ~NodeModelWrapper() {
-    // destroying the model manually causes exit code 3221226505, why?
-    // However, bindings seem to operate fine without destructing pointer
-    //llmodel_model_destroy(inference_);
-  }
+//NodeModelWrapper::~NodeModelWrapper() {
+//GetInference().reset();
+//}
 
-  Napi::Value IsModelLoaded(const Napi::CallbackInfo& info) {
-    return Napi::Boolean::New(info.Env(), llmodel_isModelLoaded(inference_));
-  }
+Napi::Value NodeModelWrapper::IsModelLoaded(const Napi::CallbackInfo& info) {
+    return Napi::Boolean::New(info.Env(), llmodel_isModelLoaded(GetInference()));
+}
 
-  Napi::Value StateSize(const Napi::CallbackInfo& info) {
+Napi::Value NodeModelWrapper::StateSize(const Napi::CallbackInfo& info) {
     // Implement the binding for the stateSize method
-    return Napi::Number::New(info.Env(), static_cast<int64_t>(llmodel_get_state_size(inference_)));
+    return Napi::Number::New(info.Env(), static_cast<int64_t>(llmodel_get_state_size(GetInference())));
 }
 
 /**
  * Generate a response using the model.
@@ -73,16 +100,14 @@
  * @param recalculate_callback A callback function for handling recalculation requests.
  * @param ctx A pointer to the llmodel_prompt_context structure.
  */
-  Napi::Value Prompt(const Napi::CallbackInfo& info) {
+Napi::Value NodeModelWrapper::Prompt(const Napi::CallbackInfo& info) {
     auto env = info.Env();
     std::string question;
    if(info[0].IsString()) {
        question = info[0].As<Napi::String>().Utf8Value();
    } else {
-        Napi::Error::New(env, "invalid string argument").ThrowAsJavaScriptException();
-        return env.Undefined();
+        Napi::Error::New(info.Env(), "invalid string argument").ThrowAsJavaScriptException();
+        return info.Env().Undefined();
    }
    //defaults copied from python bindings
    llmodel_prompt_context promptContext = {
@@ -101,127 +126,90 @@
    };
    if(info[1].IsObject())
    {
        auto inputObject = info[1].As<Napi::Object>();
        // Extract and assign the properties
        if (inputObject.Has("logits") || inputObject.Has("tokens")) {
-            Napi::Error::New(env, "Invalid input: 'logits' or 'tokens' properties are not allowed").ThrowAsJavaScriptException();
-            return env.Undefined();
+            Napi::Error::New(info.Env(), "Invalid input: 'logits' or 'tokens' properties are not allowed").ThrowAsJavaScriptException();
+            return info.Env().Undefined();
        }
        // Assign the remaining properties
-        if(inputObject.Has("n_past")) {
-            promptContext.n_past = inputObject.Get("n_past").As<Napi::Number>().Int32Value();
-        }
-        if(inputObject.Has("n_ctx")) {
-            promptContext.n_ctx = inputObject.Get("n_ctx").As<Napi::Number>().Int32Value();
-        }
-        if(inputObject.Has("n_predict")) {
-            promptContext.n_predict = inputObject.Get("n_predict").As<Napi::Number>().Int32Value();
-        }
-        if(inputObject.Has("top_k")) {
-            promptContext.top_k = inputObject.Get("top_k").As<Napi::Number>().Int32Value();
-        }
-        if(inputObject.Has("top_p")) {
-            promptContext.top_p = inputObject.Get("top_p").As<Napi::Number>().FloatValue();
-        }
-        if(inputObject.Has("temp")) {
-            promptContext.temp = inputObject.Get("temp").As<Napi::Number>().FloatValue();
-        }
-        if(inputObject.Has("n_batch")) {
-            promptContext.n_batch = inputObject.Get("n_batch").As<Napi::Number>().Int32Value();
-        }
-        if(inputObject.Has("repeat_penalty")) {
-            promptContext.repeat_penalty = inputObject.Get("repeat_penalty").As<Napi::Number>().FloatValue();
-        }
-        if(inputObject.Has("repeat_last_n")) {
-            promptContext.repeat_last_n = inputObject.Get("repeat_last_n").As<Napi::Number>().Int32Value();
-        }
-        if(inputObject.Has("context_erase")) {
-            promptContext.context_erase = inputObject.Get("context_erase").As<Napi::Number>().FloatValue();
-        }
+        if(inputObject.Has("n_past"))
+            promptContext.n_past = inputObject.Get("n_past").As<Napi::Number>().Int32Value();
+        if(inputObject.Has("n_ctx"))
+            promptContext.n_ctx = inputObject.Get("n_ctx").As<Napi::Number>().Int32Value();
+        if(inputObject.Has("n_predict"))
+            promptContext.n_predict = inputObject.Get("n_predict").As<Napi::Number>().Int32Value();
+        if(inputObject.Has("top_k"))
+            promptContext.top_k = inputObject.Get("top_k").As<Napi::Number>().Int32Value();
+        if(inputObject.Has("top_p"))
+            promptContext.top_p = inputObject.Get("top_p").As<Napi::Number>().FloatValue();
+        if(inputObject.Has("temp"))
+            promptContext.temp = inputObject.Get("temp").As<Napi::Number>().FloatValue();
+        if(inputObject.Has("n_batch"))
+            promptContext.n_batch = inputObject.Get("n_batch").As<Napi::Number>().Int32Value();
+        if(inputObject.Has("repeat_penalty"))
+            promptContext.repeat_penalty = inputObject.Get("repeat_penalty").As<Napi::Number>().FloatValue();
+        if(inputObject.Has("repeat_last_n"))
+            promptContext.repeat_last_n = inputObject.Get("repeat_last_n").As<Napi::Number>().Int32Value();
+        if(inputObject.Has("context_erase"))
+            promptContext.context_erase = inputObject.Get("context_erase").As<Napi::Number>().FloatValue();
    }
-    // custom callbacks are weird with the gpt4all c bindings: I need to turn Napi::Functions into raw c function pointers,
-    // but it doesn't seem like its possible? (TODO, is it possible?)
-    // if(info[1].IsFunction()) {
-    //     Napi::Callback cb = *info[1].As<Napi::Function>();
-    // }
-    // For now, simple capture of stdout
-    // possible TODO: put this on a libuv async thread. (AsyncWorker)
-    CoutRedirect cr;
-    llmodel_prompt(inference_, question.c_str(), &prompt_callback, &response_callback, &recalculate_callback, &promptContext);
-    return Napi::String::New(env, cr.getString());
+    //copy to protect llmodel resources when splitting to new thread
+    llmodel_prompt_context copiedPrompt = promptContext;
+    std::string copiedQuestion = question;
+    PromptWorkContext pc = {
+        copiedQuestion,
+        inference_.load(),
+        copiedPrompt,
+    };
+    auto threadSafeContext = new TsfnContext(env, pc);
+    threadSafeContext->tsfn = Napi::ThreadSafeFunction::New(
+        env,                          // Environment
+        info[2].As<Napi::Function>(), // JS function from caller
+        "PromptCallback",             // Resource name
+        0,                            // Max queue size (0 = unlimited).
+        1,                            // Initial thread count
+        threadSafeContext,            // Context,
+        FinalizerCallback,            // Finalizer
+        (void*)nullptr                // Finalizer data
+    );
+    threadSafeContext->nativeThread = std::thread(threadEntry, threadSafeContext);
+    return threadSafeContext->deferred_.Promise();
 }
 
-  void SetThreadCount(const Napi::CallbackInfo& info) {
+void NodeModelWrapper::SetThreadCount(const Napi::CallbackInfo& info) {
    if(info[0].IsNumber()) {
-        llmodel_setThreadCount(inference_, info[0].As<Napi::Number>().Int64Value());
+        llmodel_setThreadCount(GetInference(), info[0].As<Napi::Number>().Int64Value());
    } else {
        Napi::Error::New(info.Env(), "Could not set thread count: argument 1 is NaN").ThrowAsJavaScriptException();
        return;
    }
 }
 
-  Napi::Value getName(const Napi::CallbackInfo& info) {
+Napi::Value NodeModelWrapper::getName(const Napi::CallbackInfo& info) {
    return Napi::String::New(info.Env(), name);
 }
-  Napi::Value ThreadCount(const Napi::CallbackInfo& info) {
-    return Napi::Number::New(info.Env(), llmodel_threadCount(inference_));
+Napi::Value NodeModelWrapper::ThreadCount(const Napi::CallbackInfo& info) {
+    return Napi::Number::New(info.Env(), llmodel_threadCount(GetInference()));
 }
 
-private:
-  llmodel_model inference_;
-  std::string type;
-  std::string name;
-  //wrapper cb to capture output into stdout.then, CoutRedirect captures this
-  // and writes it to a file
-  static bool response_callback(int32_t tid, const char* resp)
-  {
-    if(tid != -1) {
-        std::cout<<std::string(resp);
-        return true;
-    }
-    return false;
-  }
-  static bool prompt_callback(int32_t tid) { return true; }
-  static bool recalculate_callback(bool isrecalculating) { return isrecalculating; }
-  // Had to use this instead of the c library in order
-  // set the type of the model loaded.
-  // causes side effect: type is mutated;
-  llmodel_model create_model_set_type(const char* c_weights_path)
-  {
-    uint32_t magic;
-    llmodel_model model;
-    FILE *f = fopen(c_weights_path, "rb");
-    fread(&magic, sizeof(magic), 1, f);
-    if (magic == 0x67676d6c) {
-        model = llmodel_gptj_create();
-        type = "gptj";
-    }
-    else if (magic == 0x67676a74) {
-        model = llmodel_llama_create();
-        type = "llama";
-    }
-    else if (magic == 0x67676d6d) {
-        model = llmodel_mpt_create();
-        type = "mpt";
-    }
-    else {fprintf(stderr, "Invalid model file\n");}
-    fclose(f);
-    return model;
-  }
-};
+Napi::Value NodeModelWrapper::GetLibraryPath(const Napi::CallbackInfo& info) {
+    return Napi::String::New(info.Env(),
+        llmodel_get_implementation_search_path());
+}
+
+llmodel_model NodeModelWrapper::GetInference() {
+    return *inference_.load();
+}
 
 //Exports Bindings
 Napi::Object Init(Napi::Env env, Napi::Object exports) {
-    return NodeModelWrapper::Init(env, exports);
+    exports["LLModel"] = NodeModelWrapper::GetClass(env);
+    return exports;
 }
 
 NODE_API_MODULE(NODE_GYP_MODULE_NAME, Init)


@@ -0,0 +1,45 @@
#include <napi.h>
#include "llmodel.h"
#include <iostream>
#include "llmodel_c.h"
#include "prompt.h"
#include <atomic>
#include <memory>
#include <filesystem>
namespace fs = std::filesystem;
class NodeModelWrapper: public Napi::ObjectWrap<NodeModelWrapper> {
public:
NodeModelWrapper(const Napi::CallbackInfo &);
//~NodeModelWrapper();
Napi::Value getType(const Napi::CallbackInfo& info);
Napi::Value IsModelLoaded(const Napi::CallbackInfo& info);
Napi::Value StateSize(const Napi::CallbackInfo& info);
/**
* Prompting the model. This entails spawning a new thread and adding the response tokens
* into a thread local string variable.
*/
Napi::Value Prompt(const Napi::CallbackInfo& info);
void SetThreadCount(const Napi::CallbackInfo& info);
Napi::Value getName(const Napi::CallbackInfo& info);
Napi::Value ThreadCount(const Napi::CallbackInfo& info);
/*
* The path that is used to search for the dynamic libraries
*/
Napi::Value GetLibraryPath(const Napi::CallbackInfo& info);
/**
* Creates the LLModel class
*/
static Napi::Function GetClass(Napi::Env);
llmodel_model GetInference();
private:
/**
* The underlying inference that interfaces with the C interface
*/
std::atomic<std::shared_ptr<llmodel_model>> inference_;
std::string type;
// corresponds to LLModel::name() in typescript
std::string name;
static Napi::FunctionReference constructor;
};


@@ -1,19 +1,32 @@
 {
-    "name": "gpt4all-ts",
+    "name": "gpt4all",
+    "version": "2.0.0",
     "packageManager": "yarn@3.5.1",
-    "gypfile": true,
+    "main": "src/gpt4all.js",
+    "repository": "nomic-ai/gpt4all",
     "scripts": {
-        "test": "node ./test/index.mjs"
+        "test": "node ./test/index.mjs",
+        "build:backend": "node scripts/build.js",
+        "install": "node-gyp-build",
+        "prebuild": "node scripts/prebuild.js",
+        "docs:build": "documentation build ./src/gpt4all.d.ts --parse-extension d.ts --format md --output docs/api.md"
     },
     "dependencies": {
-        "bindings": "^1.5.0",
-        "node-addon-api": "^6.1.0"
+        "mkdirp": "^3.0.1",
+        "node-addon-api": "^6.1.0",
+        "node-gyp-build": "^4.6.0"
     },
     "devDependencies": {
-        "@types/node": "^20.1.5"
+        "@types/node": "^20.1.5",
+        "documentation": "^14.0.2",
+        "prebuildify": "^5.0.1",
+        "prettier": "^2.8.8"
     },
     "engines": {
         "node": ">= 18.x.x"
+    },
+    "prettier": {
+        "endOfLine": "lf",
+        "tabWidth": 4
     }
 }


@@ -0,0 +1,62 @@
#include "prompt.h"
TsfnContext::TsfnContext(Napi::Env env, const PromptWorkContext& pc)
: deferred_(Napi::Promise::Deferred::New(env)), pc(pc) {
}
std::mutex mtx;
static thread_local std::string res;
bool response_callback(int32_t token_id, const char *response) {
res+=response;
return token_id != -1;
}
bool recalculate_callback (bool isrecalculating) {
return isrecalculating;
};
bool prompt_callback (int32_t tid) {
return true;
};
// The thread entry point. This takes as its arguments the specific
// threadsafe-function context created inside the main thread.
void threadEntry(TsfnContext* context) {
std::lock_guard<std::mutex> lock(mtx);
// Perform a call into JavaScript.
napi_status status =
context->tsfn.NonBlockingCall(&context->pc,
[](Napi::Env env, Napi::Function jsCallback, PromptWorkContext* pc) {
llmodel_prompt(
*pc->inference_,
pc->question.c_str(),
&prompt_callback,
&response_callback,
&recalculate_callback,
&pc->prompt_params
);
jsCallback.Call({ Napi::String::New(env, res)} );
res.clear();
});
if (status != napi_ok) {
Napi::Error::Fatal(
"ThreadEntry",
"Napi::ThreadSafeNapi::Function.NonBlockingCall() failed");
}
// Release the thread-safe function. This decrements the internal thread
// count, and will perform finalization since the count will reach 0.
context->tsfn.Release();
}
void FinalizerCallback(Napi::Env env,
void* finalizeData,
TsfnContext* context) {
// Join the thread
context->nativeThread.join();
// Resolve the Promise previously returned to JS via the CreateTSFN method.
context->deferred_.Resolve(Napi::Boolean::New(env, true));
delete context;
}
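
Given the thread-safe function plumbing in prompt.cc above and the Prompt binding in index.cc, a hedged sketch of how the low-level raw_prompt call might be driven from JavaScript. This is illustrative only: the higher-level createCompletion wrapper is the intended entry point, and the third callback argument is inferred from the native code rather than from the documented two-parameter signature:

```ts
// `llmodel` stands for an LLModel instance constructed as in the earlier examples.
// The native Prompt binding reads info[0] = prompt string, info[1] = prompt-context
// overrides, info[2] = a callback receiving the accumulated response text; the call
// itself returns a Promise that resolves once the worker thread is finalized.
declare const llmodel: {
    raw_prompt(q: string, params: object, cb: (text: string) => void): Promise<unknown>;
};

const text: string = await new Promise<string>((resolve) => {
    llmodel.raw_prompt('What is 1 + 1?', { temp: 0.7, n_predict: 64 }, resolve);
});
console.log(text);
```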


@@ -0,0 +1,42 @@
#ifndef TSFN_CONTEXT_H
#define TSFN_CONTEXT_H
#include "napi.h"
#include "llmodel_c.h"
#include <thread>
#include <mutex>
#include <iostream>
#include <atomic>
#include <memory>
struct PromptWorkContext {
std::string question;
std::shared_ptr<llmodel_model> inference_;
llmodel_prompt_context prompt_params;
};
struct TsfnContext {
public:
TsfnContext(Napi::Env env, const PromptWorkContext &pc);
std::thread nativeThread;
Napi::Promise::Deferred deferred_;
PromptWorkContext pc;
Napi::ThreadSafeFunction tsfn;
// Some data to pass around
// int ints[ARRAY_LENGTH];
};
// The thread entry point. This takes as its arguments the specific
// threadsafe-function context created inside the main thread.
void threadEntry(TsfnContext* context);
// The thread-safe function finalizer callback. This callback executes
// at destruction of thread-safe function, taking as arguments the finalizer
// data and threadsafe-function context.
void FinalizerCallback(Napi::Env env, void* finalizeData, TsfnContext* context);
bool response_callback(int32_t token_id, const char *response);
bool recalculate_callback (bool isrecalculating);
bool prompt_callback (int32_t tid);
#endif // TSFN_CONTEXT_H


@@ -0,0 +1,17 @@
const { spawn } = require("node:child_process");
const { resolve } = require("path");
const args = process.argv.slice(2);
const platform = process.platform;
//windows 64bit or 32
if (platform === "win32") {
const path = "scripts/build_msvc.bat";
spawn(resolve(path), ["/Y", ...args], { shell: true, stdio: "inherit" });
process.on("data", (s) => console.log(s.toString()));
} else if (platform === "linux" || platform === "darwin") {
const path = "scripts/build_unix.sh";
const bash = spawn(`sh`, [path, ...args]);
bash.stdout.on("data", (s) => console.log(s.toString()), {
stdio: "inherit",
});
}


@@ -0,0 +1,16 @@
$ROOT_DIR = '.\runtimes\win-x64'
$BUILD_DIR = '.\runtimes\win-x64\build\mingw'
$LIBS_DIR = '.\runtimes\win-x64\native'
# cleanup env
Remove-Item -Force -Recurse $ROOT_DIR -ErrorAction SilentlyContinue | Out-Null
mkdir $BUILD_DIR | Out-Null
mkdir $LIBS_DIR | Out-Null
# build
cmake -G "MinGW Makefiles" -S ..\..\gpt4all-backend -B $BUILD_DIR -DLLAMA_AVX2=ON
cmake --build $BUILD_DIR --parallel --config Release
# copy native dlls
# cp "C:\ProgramData\chocolatey\lib\mingw\tools\install\mingw64\bin\*dll" $LIBS_DIR
cp "$BUILD_DIR\bin\*.dll" $LIBS_DIR


@@ -0,0 +1,31 @@
#!/bin/sh
SYSNAME=$(uname -s)
if [ "$SYSNAME" = "Linux" ]; then
BASE_DIR="runtimes/linux-x64"
LIB_EXT="so"
elif [ "$SYSNAME" = "Darwin" ]; then
BASE_DIR="runtimes/osx"
LIB_EXT="dylib"
elif [ -n "$SYSNAME" ]; then
echo "Unsupported system: $SYSNAME" >&2
exit 1
else
echo "\"uname -s\" failed" >&2
exit 1
fi
NATIVE_DIR="$BASE_DIR/native"
BUILD_DIR="$BASE_DIR/build"
rm -rf "$BASE_DIR"
mkdir -p "$NATIVE_DIR" "$BUILD_DIR"
cmake -S ../../gpt4all-backend -B "$BUILD_DIR" &&
cmake --build "$BUILD_DIR" -j --config Release && {
cp "$BUILD_DIR"/libllmodel.$LIB_EXT "$NATIVE_DIR"/
cp "$BUILD_DIR"/libgptj*.$LIB_EXT "$NATIVE_DIR"/
cp "$BUILD_DIR"/libllama*.$LIB_EXT "$NATIVE_DIR"/
cp "$BUILD_DIR"/libmpt*.$LIB_EXT "$NATIVE_DIR"/
}


@@ -0,0 +1,50 @@
const prebuildify = require("prebuildify");
async function createPrebuilds(combinations) {
for (const { platform, arch } of combinations) {
const opts = {
platform,
arch,
napi: true,
};
try {
await createPrebuild(opts);
console.log(
`Build succeeded for platform ${opts.platform} and architecture ${opts.arch}`
);
} catch (err) {
console.error(
`Error building for platform ${opts.platform} and architecture ${opts.arch}:`,
err
);
}
}
}
function createPrebuild(opts) {
return new Promise((resolve, reject) => {
prebuildify(opts, (err) => {
if (err) {
reject(err);
} else {
resolve();
}
});
});
}
const prebuildConfigs = [
{ platform: "win32", arch: "x64" },
{ platform: "win32", arch: "arm64" },
// { platform: 'win32', arch: 'armv7' },
{ platform: "darwin", arch: "x64" },
{ platform: "darwin", arch: "arm64" },
// { platform: 'darwin', arch: 'armv7' },
{ platform: "linux", arch: "x64" },
{ platform: "linux", arch: "arm64" },
{ platform: "linux", arch: "armv7" },
];
createPrebuilds(prebuildConfigs)
.then(() => console.log("All builds succeeded"))
.catch((err) => console.error("Error building:", err));
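
The prebuilds produced here are what node-gyp-build resolves at require time. A small sketch of that lookup, mirroring the require used in the CommonJS implementation below (the relative path is illustrative):

const path = require("node:path");
// node-gyp-build checks prebuilds/<platform>-<arch> (or a local build/ directory)
// and returns the exports of the first compatible native addon it finds.
const { LLModel } = require("node-gyp-build")(path.resolve(__dirname, ".."));
console.log(typeof LLModel); // "function" when a matching prebuild exists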

View File

@ -1,14 +1,15 @@
import { LLModel, createCompletion, DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY } from '../src/gpt4all.js'

const ll = new LLModel({
    model_name: 'ggml-vicuna-7b-1.1-q4_2.bin',
    model_path: './',
    library_path: DEFAULT_LIBRARIES_DIRECTORY
});

try {
    class Extended extends LLModel {
    }
} catch(e) {
    console.log("Extending from native class gone wrong " + e)
}
@ -20,13 +21,26 @@ ll.setThreadCount(5);
console.log("thread count " + ll.threadCount()); console.log("thread count " + ll.threadCount());
ll.setThreadCount(4); ll.setThreadCount(4);
console.log("thread count " + ll.threadCount()); console.log("thread count " + ll.threadCount());
console.log("name " + ll.name());
console.log("type: " + ll.type());
console.log("Default directory for models", DEFAULT_DIRECTORY);
console.log("Default directory for libraries", DEFAULT_LIBRARIES_DIRECTORY);
console.log(await createCompletion(
console.log(createCompletion(
ll, ll,
prompt`${"header"} ${"prompt"}`, { [
verbose: true, { role : 'system', content: 'You are a girl who likes playing league of legends.' },
prompt: 'hello! Say something thought provoking.' { role : 'user', content: 'What is the best top laner to play right now?' },
} ],
{ verbose: false}
)); ));
console.log(await createCompletion(
ll,
[
{ role : 'user', content: 'What is the best bottom laner to play right now?' },
],
))

View File

@ -0,0 +1,22 @@
const os = require("node:os");
const path = require("node:path");
const DEFAULT_DIRECTORY = path.resolve(os.homedir(), ".cache/gpt4all");
const librarySearchPaths = [
path.join(DEFAULT_DIRECTORY, "libraries"),
path.resolve("./libraries"),
path.resolve(
__dirname,
"..",
`runtimes/${process.platform}-${process.arch}/native`
),
process.cwd(),
];
const DEFAULT_LIBRARIES_DIRECTORY = librarySearchPaths.join(";");
module.exports = {
DEFAULT_DIRECTORY,
DEFAULT_LIBRARIES_DIRECTORY,
};
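
DEFAULT_LIBRARIES_DIRECTORY is a semicolon-joined search string rather than a single path; consumers split it and take the first directory that exists, as loadModel does further down in this diff. A standalone sketch of that lookup (the require path is illustrative):

const { existsSync } = require("node:fs");
const { DEFAULT_LIBRARIES_DIRECTORY } = require("./config.js");

// pick the first existing candidate out of the ";"-separated search paths
const libPath = DEFAULT_LIBRARIES_DIRECTORY
    .split(";")
    .find((candidate) => existsSync(candidate)) ?? null;
console.log("backend libraries resolved to:", libPath);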

View File

@ -1,162 +1,310 @@
/// <reference types="node" />
declare module "gpt4all";
export * from "./util.d.ts";
/** Type of the model */
type ModelType = "gptj" | "llama" | "mpt" | "replit";
/**
* Full list of models available
*/
interface ModelFile {
/** List of GPT-J Models */
gptj:
| "ggml-gpt4all-j-v1.3-groovy.bin"
| "ggml-gpt4all-j-v1.2-jazzy.bin"
| "ggml-gpt4all-j-v1.1-breezy.bin"
| "ggml-gpt4all-j.bin";
/** List Llama Models */
llama:
| "ggml-gpt4all-l13b-snoozy.bin"
| "ggml-vicuna-7b-1.1-q4_2.bin"
| "ggml-vicuna-13b-1.1-q4_2.bin"
| "ggml-wizardLM-7B.q4_2.bin"
| "ggml-stable-vicuna-13B.q4_2.bin"
| "ggml-nous-gpt4-vicuna-13b.bin"
| "ggml-v3-13b-hermes-q5_1.bin";
/** List of MPT Models */
mpt:
| "ggml-mpt-7b-base.bin"
| "ggml-mpt-7b-chat.bin"
| "ggml-mpt-7b-instruct.bin";
/** List of Replit Models */
replit: "ggml-replit-code-v1-3b.bin";
}
//mirrors py options
interface LLModelOptions {
/**
* Model architecture. This argument currently does not have any functionality and is just used as descriptive identifier for user.
*/
type?: ModelType;
model_name: ModelFile[ModelType];
model_path: string;
library_path?: string;
}
/**
* LLModel class representing a language model.
* This is a base class that provides common functionality for different types of language models.
*/
declare class LLModel {
/**
* Initialize a new LLModel.
* @param path Absolute path to the model file.
* @throws {Error} If the model file does not exist.
*/
constructor(path: string);
constructor(options: LLModelOptions);
/** either 'gptj', 'mpt', or 'llama', or undefined */
type(): ModelType | undefined;
/** The name of the model. */
name(): ModelFile;
/**
* Get the size of the internal state of the model.
* NOTE: This state data is specific to the type of model you have created.
* @return the size in bytes of the internal state of the model
*/
stateSize(): number;
/**
* Get the number of threads used for model inference.
* The default is the number of physical cores your computer has.
* @returns The number of threads used for model inference.
*/
threadCount(): number;
/**
* Set the number of threads used for model inference.
* @param newNumber The new number of threads.
*/
setThreadCount(newNumber: number): void;
/**
* Prompt the model with a given input and optional parameters.
* This is the raw output from std out.
* Use the prompt function exported for a value
* @param q The prompt input.
* @param params Optional parameters for the prompt context.
* @returns The result of the model prompt.
*/
raw_prompt(q: string, params: Partial<LLModelPromptContext>, callback: (res: string) => void): void; // TODO work on return type
/**
* Whether the model is loaded or not.
*/
isModelLoaded(): boolean;
/**
* Where to search for the pluggable backend libraries
*/
setLibraryPath(s: string): void;
/**
* Where to get the pluggable backend libraries
*/
getLibraryPath(): string;
}
interface LoadModelOptions {
modelPath?: string;
librariesPath?: string;
allowDownload?: boolean;
verbose?: boolean;
}
declare function loadModel(
modelName: string,
options?: LoadModelOptions
): Promise<LLModel>;
/**
* The nodejs equivalent to python binding's chat_completion
* @param {LLModel} llmodel - The language model object.
* @param {PromptMessage[]} messages - The array of messages for the conversation.
* @param {CompletionOptions} options - The options for creating the completion.
* @returns {CompletionReturn} The completion result.
* @example
* const llmodel = new LLModel(model)
* const messages = [
* { role: 'system', content: 'You are a weather forecaster.' },
* { role: 'user', content: 'should i go out today?' } ]
* const completion = await createCompletion(llmodel, messages, {
* verbose: true,
* temp: 0.9,
* })
* console.log(completion.choices[0].message.content)
* // No, it's going to be cold and rainy.
*/
declare function createCompletion(
llmodel: LLModel,
messages: PromptMessage[],
options?: CompletionOptions
): Promise<CompletionReturn>;
/**
* The options for creating the completion.
*/
interface CompletionOptions extends Partial<LLModelPromptContext> {
/**
* Indicates if verbose logging is enabled.
* @default true
*/
verbose?: boolean;
/**
* Indicates if the default header is included in the prompt.
* @default true
*/
hasDefaultHeader?: boolean;
/**
* Indicates if the default footer is included in the prompt.
* @default true
*/
hasDefaultFooter?: boolean;
}
/**
* A message in the conversation, identical to OpenAI's chat message.
*/
interface PromptMessage {
/** The role of the message. */
role: "system" | "assistant" | "user";
/** The message content. */
content: string;
}
/**
* The result of the completion, similar to OpenAI's format.
*/
interface CompletionReturn {
/** The model name.
* @type {ModelFile}
*/
model: ModelFile[ModelType];
/** Token usage report. */
usage: {
/** The number of tokens used in the prompt. */
prompt_tokens: number;
/** The number of tokens used in the completion. */
completion_tokens: number;
/** The total number of tokens used. */
total_tokens: number;
};
/** The generated completions. */
choices: CompletionChoice[];
}
/**
* A completion choice, similar to OpenAI's format.
*/
interface CompletionChoice {
/** Response message */
message: PromptMessage;
}
/**
* Model inference arguments for generating completions.
*/
interface LLModelPromptContext {
/** The size of the raw logits vector. */
logits_size: number;
/** The size of the raw tokens vector. */
tokens_size: number;
/** The number of tokens in the past conversation. */
n_past: number;
/** The number of tokens possible in the context window.
* @default 1024
*/
n_ctx: number;
/** The number of tokens to predict.
* @default 128
* */
n_predict: number;
/** The top-k logits to sample from.
* @default 40
* */
top_k: number;
/** The nucleus sampling probability threshold.
* @default 0.9
* */
top_p: number;
/** The temperature to adjust the model's output distribution.
* @default 0.72
* */
temp: number;
/** The number of predictions to generate in parallel.
* @default 8
* */
n_batch: number;
/** The penalty factor for repeated tokens.
* @default 1
* */
repeat_penalty: number;
/** The number of last tokens to penalize.
* @default 10
* */
repeat_last_n: number;
/** The percentage of context to erase if the context window is exceeded.
* @default 0.5
* */
context_erase: number;
}
/**
* TODO: Help wanted to implement this
*/
declare function createTokenStream(
llmodel: LLModel,
messages: PromptMessage[],
options: CompletionOptions
): (ll: LLModel) => AsyncGenerator<string>;
/**
* From python api:
* models will be stored in (homedir)/.cache/gpt4all/`
*/
declare const DEFAULT_DIRECTORY: string;
/**
* From python api:
* The default path for dynamic libraries to be stored.
* You may separate paths by a semicolon to search in multiple areas.
* This searches DEFAULT_DIRECTORY/libraries, cwd/libraries, and finally cwd.
*/
declare const DEFAULT_LIBRARIES_DIRECTORY: string;
export {
ModelType,
ModelFile,
LLModel,
LLModelPromptContext,
PromptMessage,
CompletionOptions,
LoadModelOptions,
loadModel,
createCompletion,
createTokenStream,
DEFAULT_DIRECTORY,
DEFAULT_LIBRARIES_DIRECTORY,
};
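
Taken together, the typical flow declared above is loadModel followed by createCompletion. A minimal end-to-end sketch (the model file name is one of the ModelFile entries; first use downloads it while allowDownload stays at its default):

import { loadModel, createCompletion } from "gpt4all";

const model = await loadModel("ggml-gpt4all-j-v1.3-groovy.bin", { verbose: false });
const completion = await createCompletion(model, [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is 1 + 1?" },
]);
// CompletionReturn mirrors the OpenAI-style shape declared above
console.log(completion.choices[0].message.content);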

View File

@ -1,112 +1,138 @@
"use strict";
/// This file implements the gpt4all.d.ts file endings. /// This file implements the gpt4all.d.ts file endings.
/// Written in commonjs to support both ESM and CJS projects. /// Written in commonjs to support both ESM and CJS projects.
const { existsSync } = require("fs");
const path = require("node:path");
const { LLModel } = require("node-gyp-build")(path.resolve(__dirname, ".."));
const {
retrieveModel,
downloadModel,
appendBinSuffixIfMissing,
} = require("./util.js");
const config = require("./config.js");
const { LLModel } = require('bindings')('../build/Release/gpt4allts'); async function loadModel(modelName, options = {}) {
const { createWriteStream, existsSync } = require('fs'); const loadOptions = {
const { join } = require('path'); modelPath: config.DEFAULT_DIRECTORY,
const { performance } = require('node:perf_hooks'); librariesPath: config.DEFAULT_LIBRARIES_DIRECTORY,
allowDownload: true,
verbose: true,
...options,
// readChunks() reads from the provided reader and yields the results into an async iterable
// https://css-tricks.com/web-streams-everywhere-and-fetch-for-node-js/
function readChunks(reader) {
return {
async* [Symbol.asyncIterator]() {
let readResult = await reader.read();
while (!readResult.done) {
yield readResult.value;
readResult = await reader.read();
}
},
}; };
await retrieveModel(modelName, {
modelPath: loadOptions.modelPath,
allowDownload: loadOptions.allowDownload,
verbose: loadOptions.verbose,
});
const libSearchPaths = loadOptions.librariesPath.split(";");
let libPath = null;
for (const searchPath of libSearchPaths) {
if (existsSync(searchPath)) {
libPath = searchPath;
break;
}
}
const llmOptions = {
model_name: appendBinSuffixIfMissing(modelName),
model_path: loadOptions.modelPath,
library_path: libPath,
};
if (loadOptions.verbose) {
console.log("Creating LLModel with options:", llmOptions);
}
const llmodel = new LLModel(llmOptions);
return llmodel;
} }
exports.LLModel = LLModel; function createPrompt(messages, hasDefaultHeader, hasDefaultFooter) {
let fullPrompt = "";
for (const message of messages) {
if (message.role === "system") {
const systemMessage = message.content + "\n";
fullPrompt += systemMessage;
}
}
if (hasDefaultHeader) {
fullPrompt += `### Instruction:
The prompt below is a question to answer, a task to complete, or a conversation
to respond to; decide which and write an appropriate response.
\n### Prompt:
`;
}
for (const message of messages) {
if (message.role === "user") {
const user_message = "\n" + message["content"];
fullPrompt += user_message;
}
if (message["role"] == "assistant") {
const assistant_message = "\nResponse: " + message["content"];
fullPrompt += assistant_message;
}
}
if (hasDefaultFooter) {
fullPrompt += "\n### Response:";
}
exports.download = function ( return fullPrompt;
name, }
options = { debug: false, location: process.cwd(), link: undefined }
async function createCompletion(
llmodel,
messages,
options = {
hasDefaultHeader: true,
hasDefaultFooter: false,
verbose: true,
}
) { ) {
const abortController = new AbortController();
const signal = abortController.signal;
const pathToModel = join(options.location, name);
if(existsSync(pathToModel)) {
throw Error("Path to model already exists");
}
//wrapper function to get the readable stream from request
const fetcher = (name) => fetch(options.link ?? `https://gpt4all.io/models/${name}`, {
signal,
})
.then(res => {
if(!res.ok) {
throw Error("Could not find "+ name + " from " + `https://gpt4all.io/models/` )
}
return res.body.getReader()
})
//a promise that executes and writes to a stream. Resolves when done writing.
const res = new Promise((resolve, reject) => {
fetcher(name)
//Resolves an array of a reader and writestream.
.then(reader => [reader, createWriteStream(pathToModel)])
.then(
async ([readable, wstream]) => {
console.log('(CLI might hang) Downloading @ ', pathToModel);
let perf;
if(options.debug) {
perf = performance.now();
}
for await (const chunk of readChunks(readable)) {
wstream.write(chunk);
}
if(options.debug) {
console.log("Time taken: ", (performance.now()-perf).toFixed(2), " ms");
}
resolve();
}
).catch(reject);
});
return {
cancel : () => abortController.abort(),
promise: () => res
}
}
//https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals#tagged_templates
exports.prompt = function prompt(strings, ...keys) {
return (...values) => {
const dict = values[values.length - 1] || {};
const result = [strings[0]];
keys.forEach((key, i) => {
const value = Number.isInteger(key) ? values[key] : dict[key];
result.push(value, strings[i + 1]);
});
return result.join("");
};
}
exports.createCompletion = function (llmodel, promptMaker, options) {
//creating the keys to insert into promptMaker. //creating the keys to insert into promptMaker.
const entries = { const fullPrompt = createPrompt(
system: options.system ?? '', messages,
header: options.header ?? "### Instruction: The prompt below is a question to answer, a task to complete, or a conversation to respond to; decide which and write an appropriate response.\n### Prompt: ", options.hasDefaultHeader ?? true,
prompt: options.prompt, options.hasDefaultFooter
...(options.promptEntries ?? {}) );
}; if (options.verbose) {
console.log("Sent: " + fullPrompt);
const fullPrompt = promptMaker(entries)+'\n### Response:';
if(options.verbose) {
console.log("sending prompt: " + `"${fullPrompt}"`)
} }
const promisifiedRawPrompt = new Promise((resolve, rej) => {
return llmodel.raw_prompt(fullPrompt, options); llmodel.raw_prompt(fullPrompt, options, (s) => {
resolve(s);
});
});
return promisifiedRawPrompt.then((response) => {
return {
llmodel: llmodel.name(),
usage: {
prompt_tokens: fullPrompt.length,
completion_tokens: response.length, //TODO
total_tokens: fullPrompt.length + response.length, //TODO
},
choices: [
{
message: {
role: "assistant",
content: response,
},
},
],
};
});
} }
module.exports = {
...config,
LLModel,
createCompletion,
downloadModel,
retrieveModel,
loadModel,
};
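
Because loadModel defaults to the cache directory and to auto-downloading, pointing it at a local file is just a matter of overriding those options. A short sketch (require path and file name illustrative); retrieveModel will throw if the file is missing and downloads are disallowed:

const { loadModel } = require("./src/gpt4all.js");

loadModel("ggml-vicuna-7b-1.1-q4_2.bin", {
    modelPath: "./",       // look in the working directory instead of ~/.cache/gpt4all
    allowDownload: false,  // throw instead of downloading when the file is absent
    verbose: true,
}).then((model) => console.log("loaded", model.name()));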

View File

@ -0,0 +1,69 @@
/// <reference types="node" />
declare module "gpt4all";
/**
* Initiates the download of a model file of a specific model type.
* By default this downloads without waiting. Use the controller returned to alter this behavior.
* @param {string} modelName - The model file to be downloaded.
* @param {DownloadModelOptions} options - Options to pass into the downloader. Default is { modelPath: DEFAULT_DIRECTORY, debug: false }.
* @returns {DownloadController} object that allows controlling the download process.
*
* @throws {Error} If the model already exists in the specified location.
* @throws {Error} If the model cannot be found at the specified url.
*
* @example
* const controller = downloadModel('ggml-gpt4all-j-v1.3-groovy.bin')
* controller.promise().then(() => console.log('Downloaded!'))
*/
declare function downloadModel(
modelName: string,
options?: DownloadModelOptions
): DownloadController;
/**
* Options for the model download process.
*/
export interface DownloadModelOptions {
/**
* location to download the model.
* Default is process.cwd(), or the current working directory
*/
modelPath?: string;
/**
* Debug mode -- check how long it took to download in seconds
* @default false
*/
debug?: boolean;
/**
* Remote download url. Defaults to `https://gpt4all.io/models`
* @default https://gpt4all.io/models
*/
url?: string;
}
declare function listModels(): Promise<Record<string, string>[]>;
interface RetrieveModelOptions {
allowDownload?: boolean;
verbose?: boolean;
modelPath?: string;
}
declare function retrieveModel(
model: string,
options?: RetrieveModelOptions
): Promise<string>;
/**
* Model download controller.
*/
interface DownloadController {
/** Cancel the request to download from gpt4all website if this is called. */
cancel: () => void;
/** Convert the downloader into a promise, allowing people to await and manage its lifetime */
promise: () => Promise<void>;
}
export { downloadModel, DownloadModelOptions, DownloadController, listModels, retrieveModel, RetrieveModelOptions };
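
For the helpers declared here, the usual entry point is retrieveModel, which wraps downloadModel and resolves to the on-disk path. A brief sketch (require path and model name illustrative):

const { retrieveModel } = require("./util.js");

async function ensureModel() {
    // downloads into the default model directory on first use, then becomes a no-op
    const modelPath = await retrieveModel("ggml-gpt4all-j-v1.3-groovy.bin", {
        verbose: true,
    });
    console.log("model available at", modelPath);
}

ensureModel().catch(console.error);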

View File

@ -0,0 +1,156 @@
const { createWriteStream, existsSync } = require("fs");
const { performance } = require("node:perf_hooks");
const path = require("node:path");
const {mkdirp} = require("mkdirp");
const { DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY } = require("./config.js");
async function listModels() {
const res = await fetch("https://gpt4all.io/models/models.json");
const modelList = await res.json();
return modelList;
}
function appendBinSuffixIfMissing(name) {
if (!name.endsWith(".bin")) {
return name + ".bin";
}
return name;
}
// readChunks() reads from the provided reader and yields the results into an async iterable
// https://css-tricks.com/web-streams-everywhere-and-fetch-for-node-js/
function readChunks(reader) {
return {
async *[Symbol.asyncIterator]() {
let readResult = await reader.read();
while (!readResult.done) {
yield readResult.value;
readResult = await reader.read();
}
},
};
}
function downloadModel(
modelName,
options = {}
) {
const downloadOptions = {
modelPath: DEFAULT_DIRECTORY,
debug: false,
url: "https://gpt4all.io/models",
...options,
};
const modelFileName = appendBinSuffixIfMissing(modelName);
const fullModelPath = path.join(downloadOptions.modelPath, modelFileName);
const modelUrl = `${downloadOptions.url}/${modelFileName}`
if (existsSync(fullModelPath)) {
throw Error(`Model already exists at ${fullModelPath}`);
}
const abortController = new AbortController();
const signal = abortController.signal;
//wrapper function to get the readable stream from request
// const baseUrl = options.url ?? "https://gpt4all.io/models";
const fetchModel = () =>
fetch(modelUrl, {
signal,
}).then((res) => {
if (!res.ok) {
throw Error(`Failed to download model from ${modelUrl} - ${res.statusText}`);
}
return res.body.getReader();
});
//a promise that executes and writes to a stream. Resolves when done writing.
const res = new Promise((resolve, reject) => {
fetchModel()
//Resolves an array of a reader and writestream.
.then((reader) => [reader, createWriteStream(fullModelPath)])
.then(async ([readable, wstream]) => {
console.log("Downloading @ ", fullModelPath);
let perf;
if (options.debug) {
perf = performance.now();
}
for await (const chunk of readChunks(readable)) {
wstream.write(chunk);
}
if (options.debug) {
console.log(
"Time taken: ",
(performance.now() - perf).toFixed(2),
" ms"
);
}
resolve(fullModelPath);
})
.catch(reject);
});
return {
cancel: () => abortController.abort(),
promise: () => res,
};
};
async function retrieveModel (
modelName,
options = {}
) {
const retrieveOptions = {
modelPath: DEFAULT_DIRECTORY,
allowDownload: true,
verbose: true,
...options,
};
await mkdirp(retrieveOptions.modelPath);
const modelFileName = appendBinSuffixIfMissing(modelName);
const fullModelPath = path.join(retrieveOptions.modelPath, modelFileName);
const modelExists = existsSync(fullModelPath);
if (modelExists) {
return fullModelPath;
}
if (!retrieveOptions.allowDownload) {
throw Error(`Model does not exist at ${fullModelPath}`);
}
const availableModels = await listModels();
const foundModel = availableModels.find((model) => model.filename === modelFileName);
if (!foundModel) {
throw Error(`Model "${modelName}" is not available.`);
}
if (retrieveOptions.verbose) {
console.log(`Downloading ${modelName}...`);
}
const downloadController = downloadModel(modelName, {
modelPath: retrieveOptions.modelPath,
debug: retrieveOptions.verbose,
});
const downloadPath = await downloadController.promise();
if (retrieveOptions.verbose) {
console.log(`Model downloaded to ${downloadPath}`);
}
return downloadPath
}
module.exports = {
appendBinSuffixIfMissing,
downloadModel,
retrieveModel,
};
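
Since downloadModel exposes an AbortController-backed cancel(), callers can bound the download themselves. A small sketch of a timeout wrapper (the model name and the 10-minute cutoff are arbitrary; downloadModel throws synchronously if the file already exists):

const { downloadModel } = require("./util.js");

const controller = downloadModel("ggml-mpt-7b-chat.bin", { debug: true });
const timer = setTimeout(() => controller.cancel(), 10 * 60 * 1000);

controller.promise()
    .then((modelPath) => console.log("saved to", modelPath))
    .catch((err) => console.error("download failed or was cancelled:", err))
    .finally(() => clearTimeout(timer));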

View File

@ -1,14 +0,0 @@
#include "stdcapture.h"
CoutRedirect::CoutRedirect() {
old = std::cout.rdbuf(buffer.rdbuf()); // redirect cout to buffer stream
}
std::string CoutRedirect::getString() {
return buffer.str(); // get string
}
CoutRedirect::~CoutRedirect() {
std::cout.rdbuf(old); // reverse redirect
}

View File

@ -1,21 +0,0 @@
//https://stackoverflow.com/questions/5419356/redirect-stdout-stderr-to-a-string
#ifndef COUTREDIRECT_H
#define COUTREDIRECT_H
#include <iostream>
#include <streambuf>
#include <string>
#include <sstream>
class CoutRedirect {
public:
CoutRedirect();
std::string getString();
~CoutRedirect();
private:
std::stringstream buffer;
std::streambuf* old;
};
#endif // COUTREDIRECT_H

View File

@ -1,38 +1,5 @@
import * as assert from 'node:assert'
import { download } from '../src/gpt4all.js'

assert.rejects(async () => download('poo.bin').promise());

File diff suppressed because it is too large