Mirror of https://github.com/nomic-ai/gpt4all.git, synced 2024-10-01 01:06:10 -04:00
typescript: publish alpha on npm and lots of cleanup, documentation, and more (#913)
* fix typo so padding can be accessed * Small cleanups for settings dialog. * Fix the build. * localdocs * Fixup the rescan. Fix debug output. * Add remove folder implementation. * Remove this signal as unnecessary for now. * Cleanup of the database, better chunking, better matching. * Add new reverse prompt for new localdocs context feature. * Add a new muted text color. * Turn off the debugging messages by default. * Add prompt processing and localdocs to the busy indicator in UI. * Specify a large number of suffixes we will search for now. * Add a collection list to support a UI. * Add a localdocs tab. * Start fleshing out the localdocs ui. * Begin implementing the localdocs ui in earnest. * Clean up the settings dialog for localdocs a bit. * Add more of the UI for selecting collections for chats. * Complete the settings for localdocs. * Adds the collections to serialize and implement references for localdocs. * Store the references separately so they are not sent to datalake. * Add context link to references. * Don't use the full path in reference text. * Various fixes to remove unnecessary warnings. * Add a newline * ignore rider and vscode dirs * create test project and basic model loading tests * make sample print usage and cleaner * Get the backend as well as the client building/working with msvc. * Libraries named differently on msvc. * Bump the version number. * This time remember to bump the version right after a release. * rm redundant json * More precise condition * Nicer handling of missing model directory. Correct exception message. * Log where the model was found * Concise model matching * reduce nesting, better error reporting * convert to f-strings * less magic number * 1. Cleanup the interrupted download 2. with-syntax * Redundant else * Do not ignore explicitly passed 4 threads * Correct return type * Add optional verbosity * Correct indentation of the multiline error message * one funcion to append .bin suffix * hotfix default verbose optioin * export hidden types and fix prompt() type * tiny typo (#739) * Update README.md (#738) * Update README.md fix golang gpt4all import path Signed-off-by: Nandakumar <nandagunasekaran@gmail.com> * Update README.md Signed-off-by: Nandakumar <nandagunasekaran@gmail.com> --------- Signed-off-by: Nandakumar <nandagunasekaran@gmail.com> * fix(training instructions): model repo name (#728) Signed-off-by: Chase McDougall <chasemcdougall@hotmail.com> * C# Bindings - Prompt formatting (#712) * Added support for custom prompt formatting * more docs added * bump version * clean up cc files and revert things * LocalDocs documentation initial (#761) * LocalDocs documentation initial * Improved localdocs documentation (#762) * Improved localdocs documentation * Improved localdocs documentation * Improved localdocs documentation * Improved localdocs documentation * New tokenizer implementation for MPT and GPT-J Improves output quality by making these tokenizers more closely match the behavior of the huggingface `tokenizers` based BPE tokenizers these models were trained with. Featuring: * Fixed unicode handling (via ICU) * Fixed BPE token merge handling * Complete added vocabulary handling * buf_ref.into() can be const now * add tokenizer readme w/ instructions for convert script * Revert "add tokenizer readme w/ instructions for convert script" This reverts commit9c15d1f83e
. * Revert "buf_ref.into() can be const now" This reverts commit840e011b75
. * Revert "New tokenizer implementation for MPT and GPT-J" This reverts commitee3469ba6c
. * Fix remove model from model download for regular models. * Fixed formatting of localdocs docs (#770) * construct and return the correct reponse when the request is a chat completion * chore: update typings to keep consistent with python api * progress, updating createCompletion to mirror py api * update spec, unfinished backend * prebuild binaries for package distribution using prebuildify/node-gyp-build * Get rid of blocking behavior for regenerate response. * Add a label to the model loading visual indicator. * Use the new MyButton for the regenerate response button. * Add a hover and pressed to the visual indication of MyButton. * Fix wording of this accessible description. * Some color and theme enhancements to make the UI contrast a bit better. * Make the comboboxes align in UI. * chore: update namespace and fix prompt bug * fix linux build * add roadmap * Fix offset of prompt/response icons for smaller text. * Dlopen backend 5 (#779) Major change to the backend that allows for pluggable versions of llama.cpp/ggml. This was squashed merged from dlopen_backend_5 where the history is preserved. * Add a custom busy indicator to further align look and feel across platforms. * Draw the indicator for combobox to ensure it looks the same on all platforms. * Fix warning. * Use the proper text color for sending messages. * Fixup the plus new chat button. * Make all the toolbuttons highlight on hover. * Advanced avxonly autodetection (#744) * Advanced avxonly requirement detection * chore: support llamaversion >= 3 and ggml default * Dlopen better implementation management (Version 2) * Add fixme's and clean up a bit. * Documentation improvements on LocalDocs (#790) * Update gpt4all_chat.md Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * typo Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> --------- Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * Adapt code * Makefile changes (WIP to test) * Debug * Adapt makefile * Style * Implemented logging mechanism (#785) * Cleaned up implementation management (#787) * Cleaned up implementation management * Initialize LLModel::m_implementation to nullptr * llmodel.h: Moved dlhandle fwd declare above LLModel class * Fix compile * Fixed double-free in LLModel::Implementation destructor * Allow user to specify custom search path via $GPT4ALL_IMPLEMENTATIONS_PATH (#789) * Drop leftover include * Add ldl in gpt4all.go for dynamic linking (#797) * Logger should also output to stderr * Fix MSVC Build, Update C# Binding Scripts * Update gpt4all_chat.md (#800) Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * C# Bindings - improved logging (#714) * added optional support for .NET logging * bump version and add missing alpha suffix * avoid creating additional namespace for extensions * prefer NullLogger/NullLoggerFactory over null-conditional ILogger to avoid errors --------- Signed-off-by: mvenditto <venditto.matteo@gmail.com> * Make localdocs work with server mode. * Better name for database results. * Fix for stale references after we regenerate. * Don't hardcode these. * Fix bug with resetting context with chatgpt model. * Trying to shrink the copy+paste code and do more code sharing between backend model impl. * Remove this as it is no longer useful. * Try and fix build on mac. * Fix mac build again. 
* Add models/release.json to github repo to allow PRs * Fixed spelling error in models.json to make CI happy Signed-off-by: niansa/tuxifan <tuxifan@posteo.de> * updated bindings code for updated C api * load all model libs * model creation is failing... debugging * load libs correctly * fixed finding model libs * cleanup * cleanup * more cleanup * small typo fix * updated binding.gyp * Fixed model type for GPT-J (#815) Signed-off-by: niansa/tuxifan <tuxifan@posteo.de> * Fixed tons of warnings and clazy findings (#811) * Some tweaks to UI to make window resizing smooth and flow nicely. * Min constraints on about dialog. * Prevent flashing of white on resize. * Actually use the theme dark color for window background. * Add the ability to change the directory via text field not just 'browse' button. * add scripts to build dlls * markdown doc gen * add scripts, nearly done moving breaking changes * merge with main * oops, fixed comment * more meaningful name * leave for testing * Only default mlock on macOS where swap seems to be a problem Repeating the change that once was done in https://github.com/nomic-ai/gpt4all/pull/663 but then was overriden by9c6c09cbd2
Signed-off-by: Peter Gagarinov <pgagarinov@users.noreply.github.com> * Add a collection immediately and show a placeholder + busy indicator in localdocs settings. * some tweaks to optional types and defaults * mingw script for windows compilation * Update README.md huggingface -> Hugging Face Signed-off-by: Ikko Eltociear Ashimine <eltociear@gmail.com> * Backend prompt dedup (#822) * Deduplicated prompt() function code * Better error handling when the model fails to load. * We no longer have an avx_only repository and better error handling for minimum hardware requirements. (#833) * Update build_and_run.md (#834) Signed-off-by: AT <manyoso@users.noreply.github.com> * Trying out a new feature to download directly from huggingface. * Try again with the url. * Allow for download of models hosted on third party hosts. * Fix up for newer models on reset context. This fixes the model from totally failing after a reset context. * Update to latest llama.cpp * Remove older models that are not as popular. (#837) * Remove older models that are not as popular. * Update models.json Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> --------- Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> Co-authored-by: Andriy Mulyar <andriy.mulyar@gmail.com> * Update models.json (#838) Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * Update models.json Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * feat: finalyl compiled on windows (MSVC) goadman * update README and spec and promisfy createCompletion * update d.ts * Make installers work with mac/windows for big backend change. * Need this so the linux installer packages it as a dependency. * Try and fix mac. * Fix compile on mac. * These need to be installed for them to be packaged and work for both mac and windows. * Fix installers for windows and linux. * Fix symbol resolution on windows. * updated pypi version * Release notes for version 2.4.5 (#853) * Update README.md (#854) Signed-off-by: AT <manyoso@users.noreply.github.com> * Documentation for model sideloading (#851) * Documentation for model sideloading Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * Update gpt4all_chat.md Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> --------- Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * Speculative fix for windows llama models with installer. * Revert "Speculative fix for windows llama models with installer." This reverts commitadd725d1eb
. * Revert "Fix bug with resetting context with chatgpt model." (#859) This reverts commite0dcf6a14f
. * Fix llama models on linux and windows. * Bump the version. * New release notes * Set thread counts after loading model (#836) * Update gpt4all_faq.md (#861) Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * Supports downloading officially supported models not hosted on gpt4all R2 * Replit Model (#713) * porting over replit code model to gpt4all * replaced memory with kv_self struct * continuing debug * welp it built but lot of sus things * working model loading and somewhat working generate.. need to format response? * revert back to semi working version * finally got rid of weird formatting * figured out problem is with python bindings - this is good to go for testing * addressing PR feedback * output refactor * fixed prompt reponse collection * cleanup * addressing PR comments * building replit backend with new ggmlver code * chatllm replit and clean python files * cleanup * updated replit to match new llmodel api * match llmodel api and change size_t to Token * resolve PR comments * replit model commit comment * Synced llama.cpp.cmake with upstream (#887) * Fix for windows. * fix: build script * Revert "Synced llama.cpp.cmake with upstream (#887)" This reverts commit5c5e10c1f5
. * Update README.md (#906) Add PyPI link and add clickable, more specific link to documentation Signed-off-by: Claudius Ellsel <claudius.ellsel@live.de> * Update CollectionsDialog.qml (#856) Phrasing for localdocs Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> * sampling: remove incorrect offset for n_vocab (#900) no effect, but avoids a *potential* bug later if we use actualVocabSize - which is for when a model has a larger embedding tensor/# of output logits than actually trained token to allow room for adding extras in finetuning - presently all of our models have had "placeholder" tokens in the vocab so this hasn't broken anything, but if the sizes did differ we want the equivalent of `logits[actualVocabSize:]` (the start point is unchanged), not `logits[-actualVocabSize:]` (this.) * non-llama: explicitly greedy sampling for temp<=0 (#901) copied directly from llama.cpp - without this temp=0.0 will just scale all the logits to infinity and give bad output * work on thread safety and cleaning up, adding object option * chore: cleanup tests and spec * refactor for object based startup * more docs * Circleci builds for Linux, Windows, and macOS for gpt4all-chat. * more docs * Synced llama.cpp.cmake with upstream * add lock file to ignore codespell * Move usage in Python bindings readme to own section (#907) Have own section for short usage example, as it is not specific to local build Signed-off-by: Claudius Ellsel <claudius.ellsel@live.de> * Always sync for circleci. * update models json with replit model * Forgot to bump. * Change the default values for generation in GUI * Removed double-static from variables in replit.cpp The anonymous namespace already makes it static. Signed-off-by: niansa/tuxifan <tuxifan@posteo.de> * Generator in Python Bindings - streaming yields tokens at a time (#895) * generator method * cleanup * bump version number for clarity * added replace in decode to avoid unicodedecode exception * revert back to _build_prompt * Do auto detection by default in C++ API Signed-off-by: niansa/tuxifan <tuxifan@posteo.de> * remove comment * add comments for index.h * chore: add new models and edit ignore files and documentation * llama on Metal (#885) Support latest llama with Metal --------- Co-authored-by: Adam Treat <adam@nomic.ai> Co-authored-by: niansa/tuxifan <tuxifan@posteo.de> * Revert "llama on Metal (#885)" This reverts commitb59ce1c6e7
. * add more readme stuff and debug info * spell * Metal+LLama take two (#929) Support latest llama with Metal --------- Co-authored-by: Adam Treat <adam@nomic.ai> Co-authored-by: niansa/tuxifan <tuxifan@posteo.de> * add prebuilts for windows * Add new solution for context links that does not force regular markdown (#938) in responses which is disruptive to code completions in responses. * add prettier * split out non llm related methods into util.js, add listModels method * add prebuild script for creating all platforms bindings at once * check in prebuild linux/so libs and allow distribution of napi prebuilds * apply autoformatter * move constants in config.js, add loadModel and retrieveModel methods * Clean up the context links a bit. * Don't interfere with selection. * Add code blocks and python syntax highlighting. * Spelling error. * Add c++/c highighting support. * Fix some bugs with bash syntax and add some C23 keywords. * Bugfixes for prompt syntax highlighting. * Try and fix a false positive from codespell. * When recalculating context we can't erase the BOS. * Fix Windows MSVC AVX builds - bug introduced in557c82b5ed
- currently getting: `warning C5102: ignoring invalid command-line macro definition '/arch:AVX2'` - solution is to use `_options(...)` not `_definitions(...)` * remove .so unneeded path --------- Signed-off-by: Nandakumar <nandagunasekaran@gmail.com> Signed-off-by: Chase McDougall <chasemcdougall@hotmail.com> Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com> Signed-off-by: mvenditto <venditto.matteo@gmail.com> Signed-off-by: niansa/tuxifan <tuxifan@posteo.de> Signed-off-by: Peter Gagarinov <pgagarinov@users.noreply.github.com> Signed-off-by: Ikko Eltociear Ashimine <eltociear@gmail.com> Signed-off-by: AT <manyoso@users.noreply.github.com> Signed-off-by: Claudius Ellsel <claudius.ellsel@live.de> Co-authored-by: Justin Wang <justinwang46@gmail.com> Co-authored-by: Adam Treat <treat.adam@gmail.com> Co-authored-by: redthing1 <redthing1@alt.icu> Co-authored-by: Konstantin Gukov <gukkos@gmail.com> Co-authored-by: Richard Guo <richardg7890@gmail.com> Co-authored-by: Joseph Mearman <joseph@mearman.co.uk> Co-authored-by: Nandakumar <nandagunasekaran@gmail.com> Co-authored-by: Chase McDougall <chasemcdougall@hotmail.com> Co-authored-by: mvenditto <venditto.matteo@gmail.com> Co-authored-by: Andriy Mulyar <andriy.mulyar@gmail.com> Co-authored-by: Aaron Miller <apage43@ninjawhale.com> Co-authored-by: FoivosC <christoulakis.foivos@adlittle.com> Co-authored-by: limez <limez@protonmail.com> Co-authored-by: AT <manyoso@users.noreply.github.com> Co-authored-by: niansa/tuxifan <tuxifan@posteo.de> Co-authored-by: niansa <anton-sa@web.de> Co-authored-by: mudler <mudler@mocaccino.org> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Co-authored-by: Tim Miller <innerlogic4321@gmail.com> Co-authored-by: Peter Gagarinov <pgagarinov@users.noreply.github.com> Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com> Co-authored-by: Claudius Ellsel <claudius.ellsel@live.de> Co-authored-by: pingpongching <golololologol02@gmail.com> Co-authored-by: Adam Treat <adam@nomic.ai> Co-authored-by: Cosmic Snow <cosmic-snow@mailfence.com>
This commit is contained in:
parent 44bf91855d
commit 8d53614444
@@ -1,3 +1,3 @@
[codespell]
ignore-words-list = blong, belong
skip = .git,*.pdf,*.svg,*.lock
gpt4all-bindings/typescript/.gitignore (vendored)
@@ -1,2 +1,3 @@
node_modules/
build/
prebuilds/

@@ -1,3 +1,4 @@
test/
spec/
scripts/
build
|
@@ -2,12 +2,32 @@
The original [GPT4All typescript bindings](https://github.com/nomic-ai/gpt4all-ts) are now out of date.

- created by [jacoobes](https://github.com/jacoobes) and [nomic ai](https://home.nomic.ai) :D, for all to use.
- we will maintain this repository when possible; new feature requests will be handled through nomic

### Code (alpha)

```js
import { LLModel, createCompletion, DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY } from '../src/gpt4all.js'

const ll = new LLModel({
    model_name: 'ggml-vicuna-7b-1.1-q4_2.bin',
    model_path: './',
    library_path: DEFAULT_LIBRARIES_DIRECTORY
});

const response = await createCompletion(ll, [
    { role: 'system', content: 'You are meant to be annoying and unhelpful.' },
    { role: 'user', content: 'What is 1 + 1?' }
]);
```

### API

- The Node.js API aims to mirror the Python API. It is not a 1:1 match yet, but many pieces of the API resemble their Python counterparts.
- [docs](./docs/api.md)

### Build Instructions

- As of 05/21/2023, tested on Windows (MSVC). (somehow got it to work on MSVC 🤯)
- binding.gyp is the compile config
- Tested on Ubuntu. Everything seems to work fine
- MinGW can also build the gpt4all-backend. HOWEVER, this package works only with MSVC-built dlls.

### Requirements

- git
@@ -31,6 +51,15 @@ cd gpt4all-bindings/typescript

```sh
git submodule update --init --depth 1 --recursive
```

**AS OF NEW BACKEND**: to build the backend,

```sh
yarn build:backend
```

This builds the platform-dependent dynamic libraries into runtimes/(platform)/native. The only current way to use them is to put them in the current working directory of your application, that is, **WHEREVER YOU RUN YOUR NODE APPLICATION**.

- llama-xxxx.dll is required.
- Depending on the model you are using, you'll need to select the proper model loader.
- For example, if you are running a Mosaic MPT model, you will need to select the mpt-(buildvariant).(dynamiclibrary); a rough copy sketch follows this list.
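A rough illustration of that copy step (the runtimes layout and library names here are assumptions based on the description above, not a shipped helper):

```js
// Copy the freshly built backend libraries next to wherever your app runs.
const fs = require('fs');
const path = require('path');

// Assumed layout: runtimes/<platform>/native, e.g. runtimes/win32/native
const built = path.resolve(__dirname, 'runtimes', process.platform, 'native');
for (const lib of fs.readdirSync(built)) {
    // e.g. llama-default.dll plus the loader matching your model, such as mpt-default.dll
    fs.copyFileSync(path.join(built, lib), path.join(process.cwd(), lib));
}
```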
### Test

```sh
yarn test
@@ -48,9 +77,22 @@ yarn test

#### spec/
- Average look and feel of the api
- Should work assuming a model and libraries are installed locally in the working directory

#### index.cc
- The bridge between nodejs and c. Where the bindings are.

#### prompt.cc
- Handles prompting and inference of models in a threadsafe, asynchronous way.

#### docs/
- Autogenerated documentation using the script `yarn docs:build`

### Roadmap

This package is in active development, and breaking changes may happen until the api stabilizes. Here's the current todo list:

- [x] prompt models via a threadsafe function in order to have proper non-blocking behavior in nodejs
- [ ] createTokenStream, an async iterator that streams each token emitted from the model (a rough sketch follows this list). Planning on following this [example](https://github.com/nodejs/node-addon-examples/tree/main/threadsafe-async-iterator)
- [ ] proper unit testing (integrate with circle ci)
- [ ] publish to npm under alpha tag `gpt4all@alpha`
- [ ] have more people test on other platforms (mac tester needed)
- [x] switch to new pluggable backend
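A rough sketch of the createTokenStream item above (purely hypothetical: the current binding's callback receives the whole response at once, so a real implementation would need a per-token threadsafe callback):

```js
// Hypothetical shape: bridge a per-token callback into an async iterator.
async function* createTokenStream(llmodel, prompt, options = {}) {
    const queue = [];
    let finished = false;
    let wake = () => {};

    // Assumes raw_prompt calls back once per generated token and resolves when done.
    const run = llmodel
        .raw_prompt(prompt, options, (token) => { queue.push(token); wake(); })
        .then(() => { finished = true; wake(); });

    while (!finished || queue.length > 0) {
        if (queue.length === 0) {
            // Wait until the callback pushes another token or generation finishes.
            await new Promise((resolve) => { wake = resolve; });
        } else {
            yield queue.shift();
        }
    }
    await run;
}
```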
@ -1,45 +1,55 @@
|
||||
{
|
||||
"targets": [
|
||||
{
|
||||
"target_name": "gpt4allts", # gpt4all-ts will cause compile error
|
||||
"cflags!": [ "-fno-exceptions" ],
|
||||
"cflags_cc!": [ "-fno-exceptions" ],
|
||||
"target_name": "gpt4all", # gpt4all-ts will cause compile error
|
||||
"cflags_cc!": [ "-fno-exceptions"],
|
||||
"include_dirs": [
|
||||
"<!@(node -p \"require('node-addon-api').include\")",
|
||||
"../../gpt4all-backend/llama.cpp/", # need to include llama.cpp because the include paths for examples/common.h include llama.h relatively
|
||||
"../../gpt4all-backend",
|
||||
],
|
||||
"sources": [ # is there a better way to do this
|
||||
"../../gpt4all-backend/llama.cpp/examples/common.cpp",
|
||||
"../../gpt4all-backend/llama.cpp/ggml.c",
|
||||
"../../gpt4all-backend/llama.cpp/llama.cpp",
|
||||
"../../gpt4all-backend/utils.cpp",
|
||||
"sources": [
|
||||
# PREVIOUS VERSION: had to required the sources, but with newest changes do not need to
|
||||
#"../../gpt4all-backend/llama.cpp/examples/common.cpp",
|
||||
#"../../gpt4all-backend/llama.cpp/ggml.c",
|
||||
#"../../gpt4all-backend/llama.cpp/llama.cpp",
|
||||
# "../../gpt4all-backend/utils.cpp",
|
||||
"../../gpt4all-backend/llmodel_c.cpp",
|
||||
"../../gpt4all-backend/gptj.cpp",
|
||||
"../../gpt4all-backend/llamamodel.cpp",
|
||||
"../../gpt4all-backend/mpt.cpp",
|
||||
"stdcapture.cc",
|
||||
"../../gpt4all-backend/llmodel.cpp",
|
||||
"prompt.cc",
|
||||
"index.cc",
|
||||
],
|
||||
"conditions": [
|
||||
['OS=="mac"', {
|
||||
'defines': [
|
||||
'NAPI_CPP_EXCEPTIONS'
|
||||
],
|
||||
'LIB_FILE_EXT=".dylib"',
|
||||
'NAPI_CPP_EXCEPTIONS',
|
||||
]
|
||||
}],
|
||||
['OS=="win"', {
|
||||
'defines': [
|
||||
'LIB_FILE_EXT=".dll"',
|
||||
'NAPI_CPP_EXCEPTIONS',
|
||||
"__AVX2__" # allows SIMD: https://discord.com/channels/1076964370942267462/1092290790388150272/1107564673957630023
|
||||
],
|
||||
"msvs_settings": {
|
||||
"VCCLCompilerTool": {
|
||||
"AdditionalOptions": [
|
||||
"/std:c++20",
|
||||
"/EHsc"
|
||||
],
|
||||
},
|
||||
"/EHsc",
|
||||
],
|
||||
},
|
||||
},
|
||||
}],
|
||||
['OS=="linux"', {
|
||||
'defines': [
|
||||
'LIB_FILE_EXT=".so"',
|
||||
'NAPI_CPP_EXCEPTIONS',
|
||||
],
|
||||
'cflags_cc!': [
|
||||
'-fno-rtti',
|
||||
],
|
||||
'cflags_cc': [
|
||||
'-std=c++20'
|
||||
]
|
||||
}]
|
||||
]
|
||||
}]
|
||||
|
gpt4all-bindings/typescript/docs/api.md (new file, 623 lines)
@ -0,0 +1,623 @@
|
||||
<!-- Generated by documentation.js. Update this documentation by updating the source code. -->
|
||||
|
||||
### Table of Contents
|
||||
|
||||
* [download][1]
|
||||
* [Parameters][2]
|
||||
* [Examples][3]
|
||||
* [DownloadOptions][4]
|
||||
* [location][5]
|
||||
* [debug][6]
|
||||
* [url][7]
|
||||
* [DownloadController][8]
|
||||
* [cancel][9]
|
||||
* [promise][10]
|
||||
* [ModelType][11]
|
||||
* [ModelFile][12]
|
||||
* [gptj][13]
|
||||
* [llama][14]
|
||||
* [mpt][15]
|
||||
* [type][16]
|
||||
* [LLModel][17]
|
||||
* [constructor][18]
|
||||
* [Parameters][19]
|
||||
* [type][20]
|
||||
* [name][21]
|
||||
* [stateSize][22]
|
||||
* [threadCount][23]
|
||||
* [setThreadCount][24]
|
||||
* [Parameters][25]
|
||||
* [raw\_prompt][26]
|
||||
* [Parameters][27]
|
||||
* [isModelLoaded][28]
|
||||
* [setLibraryPath][29]
|
||||
* [Parameters][30]
|
||||
* [getLibraryPath][31]
|
||||
* [createCompletion][32]
|
||||
* [Parameters][33]
|
||||
* [Examples][34]
|
||||
* [CompletionOptions][35]
|
||||
* [verbose][36]
|
||||
* [hasDefaultHeader][37]
|
||||
* [hasDefaultFooter][38]
|
||||
* [PromptMessage][39]
|
||||
* [role][40]
|
||||
* [content][41]
|
||||
* [prompt\_tokens][42]
|
||||
* [completion\_tokens][43]
|
||||
* [total\_tokens][44]
|
||||
* [CompletionReturn][45]
|
||||
* [model][46]
|
||||
* [usage][47]
|
||||
* [choices][48]
|
||||
* [CompletionChoice][49]
|
||||
* [message][50]
|
||||
* [LLModelPromptContext][51]
|
||||
* [logits\_size][52]
|
||||
* [tokens\_size][53]
|
||||
* [n\_past][54]
|
||||
* [n\_ctx][55]
|
||||
* [n\_predict][56]
|
||||
* [top\_k][57]
|
||||
* [top\_p][58]
|
||||
* [temp][59]
|
||||
* [n\_batch][60]
|
||||
* [repeat\_penalty][61]
|
||||
* [repeat\_last\_n][62]
|
||||
* [context\_erase][63]
|
||||
* [createTokenStream][64]
|
||||
* [Parameters][65]
|
||||
* [DEFAULT\_DIRECTORY][66]
|
||||
* [DEFAULT\_LIBRARIES\_DIRECTORY][67]
|
||||
|
||||
## download
|
||||
|
||||
Initiates the download of a model file of a specific model type.
|
||||
By default this downloads without waiting. Use the returned controller to alter this behavior.
|
||||
|
||||
### Parameters
|
||||
|
||||
* `model` **[ModelFile][12]** The model file to be downloaded.
|
||||
* `options` **[DownloadOptions][4]** to pass into the downloader. Default is { location: (cwd), debug: false }.
|
||||
|
||||
### Examples
|
||||
|
||||
```javascript
|
||||
const controller = download('ggml-gpt4all-j-v1.3-groovy.bin')
|
||||
controller.promise().then(() => console.log('Downloaded!'))
|
||||
```
|
||||
|
||||
* Throws **[Error][68]** If the model already exists in the specified location.
|
||||
* Throws **[Error][68]** If the model cannot be found at the specified url.
|
||||
|
||||
Returns **[DownloadController][8]** object that allows controlling the download process.
|
||||
|
||||
## DownloadOptions
|
||||
|
||||
Options for the model download process.
|
||||
|
||||
### location
|
||||
|
||||
location to download the model.
|
||||
Default is process.cwd(), or the current working directory
|
||||
|
||||
Type: [string][69]
|
||||
|
||||
### debug
|
||||
|
||||
Debug mode -- reports how long the download took, in seconds.
|
||||
|
||||
Type: [boolean][70]
|
||||
|
||||
### url
|
||||
|
||||
Remote download url. Defaults to `https://gpt4all.io/models`
|
||||
|
||||
Type: [string][69]
|
||||
|
||||
## DownloadController
|
||||
|
||||
Model download controller.
|
||||
|
||||
### cancel
|
||||
|
||||
Cancels the download request to the gpt4all website when called.
|
||||
|
||||
Type: function (): void
|
||||
|
||||
### promise
|
||||
|
||||
Converts the download into a promise, allowing callers to await it and manage its lifetime.
|
||||
|
||||
Type: function (): [Promise][71]\<void>
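
For example (model name as in the download example above; whether a cancelled download rejects this promise is not specified here), the two members can be combined to give up after a timeout:

```javascript
const controller = download('ggml-gpt4all-j-v1.3-groovy.bin')
// Cancel if the download has not finished within ten minutes.
const timer = setTimeout(() => controller.cancel(), 10 * 60 * 1000)
await controller.promise()
clearTimeout(timer)
```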
|
||||
|
||||
## ModelType
|
||||
|
||||
Type of the model
|
||||
|
||||
Type: (`"gptj"` | `"llama"` | `"mpt"`)
|
||||
|
||||
## ModelFile
|
||||
|
||||
Full list of models available
|
||||
|
||||
### gptj
|
||||
|
||||
List of GPT-J Models
|
||||
|
||||
Type: (`"ggml-gpt4all-j-v1.3-groovy.bin"` | `"ggml-gpt4all-j-v1.2-jazzy.bin"` | `"ggml-gpt4all-j-v1.1-breezy.bin"` | `"ggml-gpt4all-j.bin"`)
|
||||
|
||||
### llama
|
||||
|
||||
List of LLaMA Models
|
||||
|
||||
Type: (`"ggml-gpt4all-l13b-snoozy.bin"` | `"ggml-vicuna-7b-1.1-q4_2.bin"` | `"ggml-vicuna-13b-1.1-q4_2.bin"` | `"ggml-wizardLM-7B.q4_2.bin"` | `"ggml-stable-vicuna-13B.q4_2.bin"` | `"ggml-nous-gpt4-vicuna-13b.bin"`)
|
||||
|
||||
### mpt
|
||||
|
||||
List of MPT Models
|
||||
|
||||
Type: (`"ggml-mpt-7b-base.bin"` | `"ggml-mpt-7b-chat.bin"` | `"ggml-mpt-7b-instruct.bin"`)
|
||||
|
||||
## type
|
||||
|
||||
Model architecture. This argument currently has no functionality and is only used as a descriptive identifier for the user.
|
||||
|
||||
Type: [ModelType][11]
|
||||
|
||||
## LLModel
|
||||
|
||||
LLModel class representing a language model.
|
||||
This is a base class that provides common functionality for different types of language models.
|
||||
|
||||
### constructor
|
||||
|
||||
Initialize a new LLModel.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* `path` **[string][69]** Absolute path to the model file.
|
||||
|
||||
<!---->
|
||||
|
||||
* Throws **[Error][68]** If the model file does not exist.
|
||||
|
||||
### type
|
||||
|
||||
either 'gptj', 'mpt', or 'llama', or undefined
|
||||
|
||||
Returns **([ModelType][11] | [undefined][72])** 
|
||||
|
||||
### name
|
||||
|
||||
The name of the model.
|
||||
|
||||
Returns **[ModelFile][12]** 
|
||||
|
||||
### stateSize
|
||||
|
||||
Get the size of the internal state of the model.
|
||||
NOTE: This state data is specific to the type of model you have created.
|
||||
|
||||
Returns **[number][73]** the size in bytes of the internal state of the model
|
||||
|
||||
### threadCount
|
||||
|
||||
Get the number of threads used for model inference.
|
||||
The default is the number of physical cores your computer has.
|
||||
|
||||
Returns **[number][73]** The number of threads used for model inference.
|
||||
|
||||
### setThreadCount
|
||||
|
||||
Set the number of threads used for model inference.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* `newNumber` **[number][73]** The new number of threads.
|
||||
|
||||
Returns **void** 
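
A minimal usage sketch (the `llmodel` instance is from the constructor above):

```javascript
// Drop to half of the currently reported threads, but never below one.
llmodel.setThreadCount(Math.max(1, Math.floor(llmodel.threadCount() / 2)))
```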
|
||||
|
||||
### raw\_prompt
|
||||
|
||||
Prompt the model with a given input and optional parameters.
|
||||
This is the raw output captured from stdout.
|
||||
Use the exported prompt function for a processed value.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* `q` **[string][69]** The prompt input.
|
||||
* `params` **Partial<[LLModelPromptContext][51]>?** Optional parameters for the prompt context.
|
||||
|
||||
Returns **any** The result of the model prompt.
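
A hedged sketch of a call, inferred from the binding code elsewhere in this commit (index.cc and prompt.cc forward the prompt string, a prompt context, and a callback to a worker thread); prefer createCompletion for everyday use:

```javascript
// Inferred, not guaranteed: the callback receives the captured response text and the
// returned promise resolves once generation has finished.
await llmodel.raw_prompt(
    '### Human:\nWhat is 1 + 1?\n### Assistant:\n',
    { n_predict: 48, temp: 0.7 },
    (text) => process.stdout.write(text)
)
```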
|
||||
|
||||
### isModelLoaded
|
||||
|
||||
Whether the model is loaded or not.
|
||||
|
||||
Returns **[boolean][70]** 
|
||||
|
||||
### setLibraryPath
|
||||
|
||||
Where to search for the pluggable backend libraries
|
||||
|
||||
#### Parameters
|
||||
|
||||
* `s` **[string][69]** 
|
||||
|
||||
Returns **void** 
|
||||
|
||||
### getLibraryPath
|
||||
|
||||
Where to get the pluggable backend libraries
|
||||
|
||||
Returns **[string][69]** 
|
||||
|
||||
## createCompletion
|
||||
|
||||
The Node.js equivalent of the Python binding's chat\_completion.
|
||||
|
||||
### Parameters
|
||||
|
||||
* `llmodel` **[LLModel][17]** The language model object.
|
||||
* `messages` **[Array][74]<[PromptMessage][39]>** The array of messages for the conversation.
|
||||
* `options` **[CompletionOptions][35]** The options for creating the completion.
|
||||
|
||||
### Examples
|
||||
|
||||
```javascript
|
||||
const llmodel = new LLModel(model)
|
||||
const messages = [
|
||||
{ role: 'system', content: 'You are a weather forecaster.' },
|
||||
{ role: 'user', content: 'should i go out today?' } ]
|
||||
const completion = await createCompletion(llmodel, messages, {
|
||||
verbose: true,
|
||||
temp: 0.9,
|
||||
})
|
||||
console.log(completion.choices[0].message.content)
|
||||
// No, it's going to be cold and rainy.
|
||||
```
|
||||
|
||||
Returns **[CompletionReturn][45]** The completion result.
|
||||
|
||||
## CompletionOptions
|
||||
|
||||
**Extends Partial\<LLModelPromptContext>**
|
||||
|
||||
The options for creating the completion.
|
||||
|
||||
### verbose
|
||||
|
||||
Indicates if verbose logging is enabled.
|
||||
|
||||
Type: [boolean][70]
|
||||
|
||||
### hasDefaultHeader
|
||||
|
||||
Indicates if the default header is included in the prompt.
|
||||
|
||||
Type: [boolean][70]
|
||||
|
||||
### hasDefaultFooter
|
||||
|
||||
Indicates if the default footer is included in the prompt.
|
||||
|
||||
Type: [boolean][70]
|
||||
|
||||
## PromptMessage
|
||||
|
||||
A message in the conversation, identical to OpenAI's chat message.
|
||||
|
||||
### role
|
||||
|
||||
The role of the message.
|
||||
|
||||
Type: (`"system"` | `"assistant"` | `"user"`)
|
||||
|
||||
### content
|
||||
|
||||
The message content.
|
||||
|
||||
Type: [string][69]
|
||||
|
||||
## prompt\_tokens
|
||||
|
||||
The number of tokens used in the prompt.
|
||||
|
||||
Type: [number][73]
|
||||
|
||||
## completion\_tokens
|
||||
|
||||
The number of tokens used in the completion.
|
||||
|
||||
Type: [number][73]
|
||||
|
||||
## total\_tokens
|
||||
|
||||
The total number of tokens used.
|
||||
|
||||
Type: [number][73]
|
||||
|
||||
## CompletionReturn
|
||||
|
||||
The result of the completion, similar to OpenAI's format.
|
||||
|
||||
### model
|
||||
|
||||
The model name.
|
||||
|
||||
Type: [ModelFile][12]
|
||||
|
||||
### usage
|
||||
|
||||
Token usage report.
|
||||
|
||||
Type: {prompt\_tokens: [number][73], completion\_tokens: [number][73], total\_tokens: [number][73]}
|
||||
|
||||
### choices
|
||||
|
||||
The generated completions.
|
||||
|
||||
Type: [Array][74]<[CompletionChoice][49]>
|
||||
|
||||
## CompletionChoice
|
||||
|
||||
A completion choice, similar to OpenAI's format.
|
||||
|
||||
### message
|
||||
|
||||
Response message
|
||||
|
||||
Type: [PromptMessage][39]
|
||||
|
||||
## LLModelPromptContext
|
||||
|
||||
Model inference arguments for generating completions.
|
||||
|
||||
### logits\_size
|
||||
|
||||
The size of the raw logits vector.
|
||||
|
||||
Type: [number][73]
|
||||
|
||||
### tokens\_size
|
||||
|
||||
The size of the raw tokens vector.
|
||||
|
||||
Type: [number][73]
|
||||
|
||||
### n\_past
|
||||
|
||||
The number of tokens in the past conversation.
|
||||
|
||||
Type: [number][73]
|
||||
|
||||
### n\_ctx
|
||||
|
||||
The number of tokens possible in the context window.
|
||||
|
||||
Type: [number][73]
|
||||
|
||||
### n\_predict
|
||||
|
||||
The number of tokens to predict.
|
||||
|
||||
Type: [number][73]
|
||||
|
||||
### top\_k
|
||||
|
||||
The top-k logits to sample from.
|
||||
|
||||
Type: [number][73]
|
||||
|
||||
### top\_p
|
||||
|
||||
The nucleus sampling probability threshold.
|
||||
|
||||
Type: [number][73]
|
||||
|
||||
### temp
|
||||
|
||||
The temperature to adjust the model's output distribution.
|
||||
|
||||
Type: [number][73]
|
||||
|
||||
### n\_batch
|
||||
|
||||
The number of predictions to generate in parallel.
|
||||
|
||||
Type: [number][73]
|
||||
|
||||
### repeat\_penalty
|
||||
|
||||
The penalty factor for repeated tokens.
|
||||
|
||||
Type: [number][73]
|
||||
|
||||
### repeat\_last\_n
|
||||
|
||||
The number of last tokens to penalize.
|
||||
|
||||
Type: [number][73]
|
||||
|
||||
### context\_erase
|
||||
|
||||
The percentage of context to erase if the context window is exceeded.
|
||||
|
||||
Type: [number][73]
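
Because CompletionOptions extends Partial\<LLModelPromptContext>, any of these fields can be passed straight to createCompletion; a small sketch reusing the messages from the earlier example:

```javascript
const completion = await createCompletion(llmodel, messages, {
    // sampling/context knobs documented above, all optional
    n_predict: 128,
    top_k: 40,
    top_p: 0.9,
    temp: 0.7,
    repeat_penalty: 1.18
})
```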
|
||||
|
||||
## createTokenStream
|
||||
|
||||
TODO: Help wanted to implement this
|
||||
|
||||
### Parameters
|
||||
|
||||
* `llmodel` **[LLModel][17]** 
|
||||
* `messages` **[Array][74]<[PromptMessage][39]>** 
|
||||
* `options` **[CompletionOptions][35]** 
|
||||
|
||||
Returns **function (ll: [LLModel][17]): AsyncGenerator<[string][69]>** 
|
||||
|
||||
## DEFAULT\_DIRECTORY
|
||||
|
||||
From python api:
|
||||
models will be stored in `(homedir)/.cache/gpt4all/`
|
||||
|
||||
Type: [string][69]
|
||||
|
||||
## DEFAULT\_LIBRARIES\_DIRECTORY
|
||||
|
||||
From python api:
|
||||
The default path for dynamic libraries to be stored.
|
||||
You may separate paths by a semicolon to search in multiple areas.
|
||||
This searches DEFAULT\_DIRECTORY/libraries, cwd/libraries, and finally cwd.
|
||||
|
||||
Type: [string][69]
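
For example (the extra folder is illustrative), several search locations can be joined with semicolons and handed to the LLModel constructor:

```javascript
const model = new LLModel({
    model_name: 'ggml-vicuna-7b-1.1-q4_2.bin',
    model_path: './',
    // search the default library locations first, then a project-local folder
    library_path: DEFAULT_LIBRARIES_DIRECTORY + ';./backend-libs'
})
```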
|
||||
|
||||
[1]: #download
|
||||
|
||||
[2]: #parameters
|
||||
|
||||
[3]: #examples
|
||||
|
||||
[4]: #downloadoptions
|
||||
|
||||
[5]: #location
|
||||
|
||||
[6]: #debug
|
||||
|
||||
[7]: #url
|
||||
|
||||
[8]: #downloadcontroller
|
||||
|
||||
[9]: #cancel
|
||||
|
||||
[10]: #promise
|
||||
|
||||
[11]: #modeltype
|
||||
|
||||
[12]: #modelfile
|
||||
|
||||
[13]: #gptj
|
||||
|
||||
[14]: #llama
|
||||
|
||||
[15]: #mpt
|
||||
|
||||
[16]: #type
|
||||
|
||||
[17]: #llmodel
|
||||
|
||||
[18]: #constructor
|
||||
|
||||
[19]: #parameters-1
|
||||
|
||||
[20]: #type-1
|
||||
|
||||
[21]: #name
|
||||
|
||||
[22]: #statesize
|
||||
|
||||
[23]: #threadcount
|
||||
|
||||
[24]: #setthreadcount
|
||||
|
||||
[25]: #parameters-2
|
||||
|
||||
[26]: #raw_prompt
|
||||
|
||||
[27]: #parameters-3
|
||||
|
||||
[28]: #ismodelloaded
|
||||
|
||||
[29]: #setlibrarypath
|
||||
|
||||
[30]: #parameters-4
|
||||
|
||||
[31]: #getlibrarypath
|
||||
|
||||
[32]: #createcompletion
|
||||
|
||||
[33]: #parameters-5
|
||||
|
||||
[34]: #examples-1
|
||||
|
||||
[35]: #completionoptions
|
||||
|
||||
[36]: #verbose
|
||||
|
||||
[37]: #hasdefaultheader
|
||||
|
||||
[38]: #hasdefaultfooter
|
||||
|
||||
[39]: #promptmessage
|
||||
|
||||
[40]: #role
|
||||
|
||||
[41]: #content
|
||||
|
||||
[42]: #prompt_tokens
|
||||
|
||||
[43]: #completion_tokens
|
||||
|
||||
[44]: #total_tokens
|
||||
|
||||
[45]: #completionreturn
|
||||
|
||||
[46]: #model
|
||||
|
||||
[47]: #usage
|
||||
|
||||
[48]: #choices
|
||||
|
||||
[49]: #completionchoice
|
||||
|
||||
[50]: #message
|
||||
|
||||
[51]: #llmodelpromptcontext
|
||||
|
||||
[52]: #logits_size
|
||||
|
||||
[53]: #tokens_size
|
||||
|
||||
[54]: #n_past
|
||||
|
||||
[55]: #n_ctx
|
||||
|
||||
[56]: #n_predict
|
||||
|
||||
[57]: #top_k
|
||||
|
||||
[58]: #top_p
|
||||
|
||||
[59]: #temp
|
||||
|
||||
[60]: #n_batch
|
||||
|
||||
[61]: #repeat_penalty
|
||||
|
||||
[62]: #repeat_last_n
|
||||
|
||||
[63]: #context_erase
|
||||
|
||||
[64]: #createtokenstream
|
||||
|
||||
[65]: #parameters-6
|
||||
|
||||
[66]: #default_directory
|
||||
|
||||
[67]: #default_libraries_directory
|
||||
|
||||
[68]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Error
|
||||
|
||||
[69]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String
|
||||
|
||||
[70]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean
|
||||
|
||||
[71]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise
|
||||
|
||||
[72]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/undefined
|
||||
|
||||
[73]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number
|
||||
|
||||
[74]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array
|
@ -1,68 +1,95 @@
|
||||
#include <napi.h>
|
||||
#include <iostream>
|
||||
#include "llmodel_c.h"
|
||||
#include "llmodel.h"
|
||||
#include "gptj.h"
|
||||
#include "llamamodel.h"
|
||||
#include "mpt.h"
|
||||
#include "stdcapture.h"
|
||||
#include "index.h"
|
||||
|
||||
class NodeModelWrapper : public Napi::ObjectWrap<NodeModelWrapper> {
|
||||
public:
|
||||
static Napi::Object Init(Napi::Env env, Napi::Object exports) {
|
||||
Napi::Function func = DefineClass(env, "LLModel", {
|
||||
InstanceMethod("type", &NodeModelWrapper::getType),
|
||||
InstanceMethod("name", &NodeModelWrapper::getName),
|
||||
InstanceMethod("stateSize", &NodeModelWrapper::StateSize),
|
||||
InstanceMethod("raw_prompt", &NodeModelWrapper::Prompt),
|
||||
InstanceMethod("setThreadCount", &NodeModelWrapper::SetThreadCount),
|
||||
InstanceMethod("threadCount", &NodeModelWrapper::ThreadCount),
|
||||
Napi::FunctionReference NodeModelWrapper::constructor;
|
||||
|
||||
Napi::Function NodeModelWrapper::GetClass(Napi::Env env) {
|
||||
Napi::Function self = DefineClass(env, "LLModel", {
|
||||
InstanceMethod("type", &NodeModelWrapper::getType),
|
||||
InstanceMethod("isModelLoaded", &NodeModelWrapper::IsModelLoaded),
|
||||
InstanceMethod("name", &NodeModelWrapper::getName),
|
||||
InstanceMethod("stateSize", &NodeModelWrapper::StateSize),
|
||||
InstanceMethod("raw_prompt", &NodeModelWrapper::Prompt),
|
||||
InstanceMethod("setThreadCount", &NodeModelWrapper::SetThreadCount),
|
||||
InstanceMethod("threadCount", &NodeModelWrapper::ThreadCount),
|
||||
InstanceMethod("getLibraryPath", &NodeModelWrapper::GetLibraryPath),
|
||||
});
|
||||
|
||||
Napi::FunctionReference* constructor = new Napi::FunctionReference();
|
||||
*constructor = Napi::Persistent(func);
|
||||
env.SetInstanceData(constructor);
|
||||
|
||||
exports.Set("LLModel", func);
|
||||
return exports;
|
||||
// Keep a static reference to the constructor
|
||||
//
|
||||
constructor = Napi::Persistent(self);
|
||||
constructor.SuppressDestruct();
|
||||
return self;
|
||||
}
|
||||
|
||||
Napi::Value getType(const Napi::CallbackInfo& info)
|
||||
|
||||
Napi::Value NodeModelWrapper::getType(const Napi::CallbackInfo& info)
|
||||
{
|
||||
if(type.empty()) {
|
||||
return info.Env().Undefined();
|
||||
}
|
||||
return Napi::String::New(info.Env(), type);
|
||||
}
|
||||
|
||||
NodeModelWrapper(const Napi::CallbackInfo& info) : Napi::ObjectWrap<NodeModelWrapper>(info)
|
||||
NodeModelWrapper::NodeModelWrapper(const Napi::CallbackInfo& info) : Napi::ObjectWrap<NodeModelWrapper>(info)
|
||||
{
|
||||
auto env = info.Env();
|
||||
std::string weights_path = info[0].As<Napi::String>().Utf8Value();
|
||||
fs::path model_path;
|
||||
|
||||
const char *c_weights_path = weights_path.c_str();
|
||||
|
||||
inference_ = create_model_set_type(c_weights_path);
|
||||
std::string full_weight_path;
|
||||
//todo
|
||||
std::string library_path = ".";
|
||||
std::string model_name;
|
||||
if(info[0].IsString()) {
|
||||
model_path = info[0].As<Napi::String>().Utf8Value();
|
||||
full_weight_path = model_path.string();
|
||||
std::cout << "DEPRECATION: constructor accepts object now. Check docs for more.\n";
|
||||
} else {
|
||||
auto config_object = info[0].As<Napi::Object>();
|
||||
model_name = config_object.Get("model_name").As<Napi::String>();
|
||||
model_path = config_object.Get("model_path").As<Napi::String>().Utf8Value();
|
||||
if(config_object.Has("model_type")) {
|
||||
type = config_object.Get("model_type").As<Napi::String>();
|
||||
}
|
||||
full_weight_path = (model_path / fs::path(model_name)).string();
|
||||
|
||||
if(config_object.Has("library_path")) {
|
||||
library_path = config_object.Get("library_path").As<Napi::String>();
|
||||
} else {
|
||||
library_path = ".";
|
||||
}
|
||||
}
|
||||
llmodel_set_implementation_search_path(library_path.c_str());
|
||||
llmodel_error* e = nullptr;
|
||||
inference_ = std::make_shared<llmodel_model>(llmodel_model_create2(full_weight_path.c_str(), "auto", e));
|
||||
if(e != nullptr) {
|
||||
Napi::Error::New(env, e->message).ThrowAsJavaScriptException();
|
||||
return;
|
||||
}
|
||||
if(GetInference() == nullptr) {
|
||||
std::cerr << "Tried searching libraries in \"" << library_path << "\"" << std::endl;
|
||||
std::cerr << "Tried searching for model weight in \"" << full_weight_path << "\"" << std::endl;
|
||||
Napi::Error::New(env, "Had an issue creating llmodel object, inference is null").ThrowAsJavaScriptException();
|
||||
return;
|
||||
}
|
||||
|
||||
auto success = llmodel_loadModel(inference_, c_weights_path);
|
||||
auto success = llmodel_loadModel(GetInference(), full_weight_path.c_str());
|
||||
if(!success) {
|
||||
Napi::Error::New(env, "Failed to load model at given path").ThrowAsJavaScriptException();
|
||||
return;
|
||||
}
|
||||
name = weights_path.substr(weights_path.find_last_of("/\\") + 1);
|
||||
|
||||
name = model_name.empty() ? model_path.filename().string() : model_name;
|
||||
};
|
||||
~NodeModelWrapper() {
|
||||
// destroying the model manually causes exit code 3221226505, why?
|
||||
// However, bindings seem to operate fine without destructing pointer
|
||||
//llmodel_model_destroy(inference_);
|
||||
//NodeModelWrapper::~NodeModelWrapper() {
|
||||
//GetInference().reset();
|
||||
//}
|
||||
|
||||
Napi::Value NodeModelWrapper::IsModelLoaded(const Napi::CallbackInfo& info) {
|
||||
return Napi::Boolean::New(info.Env(), llmodel_isModelLoaded(GetInference()));
|
||||
}
|
||||
|
||||
Napi::Value IsModelLoaded(const Napi::CallbackInfo& info) {
|
||||
return Napi::Boolean::New(info.Env(), llmodel_isModelLoaded(inference_));
|
||||
}
|
||||
|
||||
Napi::Value StateSize(const Napi::CallbackInfo& info) {
|
||||
Napi::Value NodeModelWrapper::StateSize(const Napi::CallbackInfo& info) {
|
||||
// Implement the binding for the stateSize method
|
||||
return Napi::Number::New(info.Env(), static_cast<int64_t>(llmodel_get_state_size(inference_)));
|
||||
return Napi::Number::New(info.Env(), static_cast<int64_t>(llmodel_get_state_size(GetInference())));
|
||||
}
|
||||
|
||||
|
||||
/**
|
||||
* Generate a response using the model.
|
||||
@ -73,16 +100,14 @@ public:
|
||||
* @param recalculate_callback A callback function for handling recalculation requests.
|
||||
* @param ctx A pointer to the llmodel_prompt_context structure.
|
||||
*/
|
||||
Napi::Value Prompt(const Napi::CallbackInfo& info) {
|
||||
|
||||
Napi::Value NodeModelWrapper::Prompt(const Napi::CallbackInfo& info) {
|
||||
auto env = info.Env();
|
||||
|
||||
std::string question;
|
||||
if(info[0].IsString()) {
|
||||
question = info[0].As<Napi::String>().Utf8Value();
|
||||
} else {
|
||||
Napi::Error::New(env, "invalid string argument").ThrowAsJavaScriptException();
|
||||
return env.Undefined();
|
||||
Napi::Error::New(info.Env(), "invalid string argument").ThrowAsJavaScriptException();
|
||||
return info.Env().Undefined();
|
||||
}
|
||||
//defaults copied from python bindings
|
||||
llmodel_prompt_context promptContext = {
|
||||
@ -101,127 +126,90 @@ public:
|
||||
};
|
||||
if(info[1].IsObject())
|
||||
{
|
||||
auto inputObject = info[1].As<Napi::Object>();
|
||||
auto inputObject = info[1].As<Napi::Object>();
|
||||
|
||||
// Extract and assign the properties
|
||||
if (inputObject.Has("logits") || inputObject.Has("tokens")) {
|
||||
Napi::Error::New(env, "Invalid input: 'logits' or 'tokens' properties are not allowed").ThrowAsJavaScriptException();
|
||||
return env.Undefined();
|
||||
}
|
||||
if (inputObject.Has("logits") || inputObject.Has("tokens")) {
|
||||
Napi::Error::New(info.Env(), "Invalid input: 'logits' or 'tokens' properties are not allowed").ThrowAsJavaScriptException();
|
||||
return info.Env().Undefined();
|
||||
}
|
||||
// Assign the remaining properties
|
||||
if(inputObject.Has("n_past")) {
|
||||
promptContext.n_past = inputObject.Get("n_past").As<Napi::Number>().Int32Value();
|
||||
}
|
||||
if(inputObject.Has("n_ctx")) {
|
||||
promptContext.n_ctx = inputObject.Get("n_ctx").As<Napi::Number>().Int32Value();
|
||||
}
|
||||
if(inputObject.Has("n_predict")) {
|
||||
promptContext.n_predict = inputObject.Get("n_predict").As<Napi::Number>().Int32Value();
|
||||
}
|
||||
if(inputObject.Has("top_k")) {
|
||||
promptContext.top_k = inputObject.Get("top_k").As<Napi::Number>().Int32Value();
|
||||
}
|
||||
if(inputObject.Has("top_p")) {
|
||||
promptContext.top_p = inputObject.Get("top_p").As<Napi::Number>().FloatValue();
|
||||
}
|
||||
if(inputObject.Has("temp")) {
|
||||
promptContext.temp = inputObject.Get("temp").As<Napi::Number>().FloatValue();
|
||||
}
|
||||
if(inputObject.Has("n_batch")) {
|
||||
promptContext.n_batch = inputObject.Get("n_batch").As<Napi::Number>().Int32Value();
|
||||
}
|
||||
if(inputObject.Has("repeat_penalty")) {
|
||||
promptContext.repeat_penalty = inputObject.Get("repeat_penalty").As<Napi::Number>().FloatValue();
|
||||
}
|
||||
if(inputObject.Has("repeat_last_n")) {
|
||||
promptContext.repeat_last_n = inputObject.Get("repeat_last_n").As<Napi::Number>().Int32Value();
|
||||
}
|
||||
if(inputObject.Has("context_erase")) {
|
||||
promptContext.context_erase = inputObject.Get("context_erase").As<Napi::Number>().FloatValue();
|
||||
}
|
||||
if(inputObject.Has("n_past"))
|
||||
promptContext.n_past = inputObject.Get("n_past").As<Napi::Number>().Int32Value();
|
||||
if(inputObject.Has("n_ctx"))
|
||||
promptContext.n_ctx = inputObject.Get("n_ctx").As<Napi::Number>().Int32Value();
|
||||
if(inputObject.Has("n_predict"))
|
||||
promptContext.n_predict = inputObject.Get("n_predict").As<Napi::Number>().Int32Value();
|
||||
if(inputObject.Has("top_k"))
|
||||
promptContext.top_k = inputObject.Get("top_k").As<Napi::Number>().Int32Value();
|
||||
if(inputObject.Has("top_p"))
|
||||
promptContext.top_p = inputObject.Get("top_p").As<Napi::Number>().FloatValue();
|
||||
if(inputObject.Has("temp"))
|
||||
promptContext.temp = inputObject.Get("temp").As<Napi::Number>().FloatValue();
|
||||
if(inputObject.Has("n_batch"))
|
||||
promptContext.n_batch = inputObject.Get("n_batch").As<Napi::Number>().Int32Value();
|
||||
if(inputObject.Has("repeat_penalty"))
|
||||
promptContext.repeat_penalty = inputObject.Get("repeat_penalty").As<Napi::Number>().FloatValue();
|
||||
if(inputObject.Has("repeat_last_n"))
|
||||
promptContext.repeat_last_n = inputObject.Get("repeat_last_n").As<Napi::Number>().Int32Value();
|
||||
if(inputObject.Has("context_erase"))
|
||||
promptContext.context_erase = inputObject.Get("context_erase").As<Napi::Number>().FloatValue();
|
||||
}
|
||||
// custom callbacks are weird with the gpt4all c bindings: I need to turn Napi::Functions into raw c function pointers,
|
||||
// but it doesn't seem like its possible? (TODO, is it possible?)
|
||||
//copy to protect llmodel resources when splitting to new thread
|
||||
|
||||
// if(info[1].IsFunction()) {
|
||||
// Napi::Callback cb = *info[1].As<Napi::Function>();
|
||||
// }
|
||||
|
||||
|
||||
// For now, simple capture of stdout
|
||||
// possible TODO: put this on a libuv async thread. (AsyncWorker)
|
||||
CoutRedirect cr;
|
||||
llmodel_prompt(inference_, question.c_str(), &prompt_callback, &response_callback, &recalculate_callback, &promptContext);
|
||||
return Napi::String::New(env, cr.getString());
|
||||
llmodel_prompt_context copiedPrompt = promptContext;
|
||||
std::string copiedQuestion = question;
|
||||
PromptWorkContext pc = {
|
||||
copiedQuestion,
|
||||
inference_.load(),
|
||||
copiedPrompt,
|
||||
};
|
||||
auto threadSafeContext = new TsfnContext(env, pc);
|
||||
threadSafeContext->tsfn = Napi::ThreadSafeFunction::New(
|
||||
env, // Environment
|
||||
info[2].As<Napi::Function>(), // JS function from caller
|
||||
"PromptCallback", // Resource name
|
||||
0, // Max queue size (0 = unlimited).
|
||||
1, // Initial thread count
|
||||
threadSafeContext, // Context,
|
||||
FinalizerCallback, // Finalizer
|
||||
(void*)nullptr // Finalizer data
|
||||
);
|
||||
threadSafeContext->nativeThread = std::thread(threadEntry, threadSafeContext);
|
||||
return threadSafeContext->deferred_.Promise();
|
||||
}
|
||||
|
||||
void SetThreadCount(const Napi::CallbackInfo& info) {
|
||||
void NodeModelWrapper::SetThreadCount(const Napi::CallbackInfo& info) {
|
||||
if(info[0].IsNumber()) {
|
||||
llmodel_setThreadCount(inference_, info[0].As<Napi::Number>().Int64Value());
|
||||
llmodel_setThreadCount(GetInference(), info[0].As<Napi::Number>().Int64Value());
|
||||
} else {
|
||||
Napi::Error::New(info.Env(), "Could not set thread count: argument 1 is NaN").ThrowAsJavaScriptException();
|
||||
return;
|
||||
}
|
||||
}
|
||||
Napi::Value getName(const Napi::CallbackInfo& info) {
|
||||
|
||||
Napi::Value NodeModelWrapper::getName(const Napi::CallbackInfo& info) {
|
||||
return Napi::String::New(info.Env(), name);
|
||||
}
|
||||
Napi::Value ThreadCount(const Napi::CallbackInfo& info) {
|
||||
return Napi::Number::New(info.Env(), llmodel_threadCount(inference_));
|
||||
Napi::Value NodeModelWrapper::ThreadCount(const Napi::CallbackInfo& info) {
|
||||
return Napi::Number::New(info.Env(), llmodel_threadCount(GetInference()));
|
||||
}
|
||||
|
||||
private:
|
||||
llmodel_model inference_;
|
||||
std::string type;
|
||||
std::string name;
|
||||
|
||||
|
||||
//wrapper cb to capture output into stdout.then, CoutRedirect captures this
|
||||
// and writes it to a file
|
||||
static bool response_callback(int32_t tid, const char* resp)
|
||||
{
|
||||
if(tid != -1) {
|
||||
std::cout<<std::string(resp);
|
||||
return true;
|
||||
}
|
||||
return false;
|
||||
Napi::Value NodeModelWrapper::GetLibraryPath(const Napi::CallbackInfo& info) {
|
||||
return Napi::String::New(info.Env(),
|
||||
llmodel_get_implementation_search_path());
|
||||
}
|
||||
|
||||
static bool prompt_callback(int32_t tid) { return true; }
|
||||
static bool recalculate_callback(bool isrecalculating) { return isrecalculating; }
|
||||
// Had to use this instead of the c library in order
|
||||
// set the type of the model loaded.
|
||||
// causes side effect: type is mutated;
|
||||
llmodel_model create_model_set_type(const char* c_weights_path)
|
||||
{
|
||||
|
||||
uint32_t magic;
|
||||
llmodel_model model;
|
||||
FILE *f = fopen(c_weights_path, "rb");
|
||||
fread(&magic, sizeof(magic), 1, f);
|
||||
|
||||
if (magic == 0x67676d6c) {
|
||||
model = llmodel_gptj_create();
|
||||
type = "gptj";
|
||||
}
|
||||
else if (magic == 0x67676a74) {
|
||||
model = llmodel_llama_create();
|
||||
type = "llama";
|
||||
}
|
||||
else if (magic == 0x67676d6d) {
|
||||
model = llmodel_mpt_create();
|
||||
type = "mpt";
|
||||
}
|
||||
else {fprintf(stderr, "Invalid model file\n");}
|
||||
fclose(f);
|
||||
|
||||
return model;
|
||||
llmodel_model NodeModelWrapper::GetInference() {
|
||||
return *inference_.load();
|
||||
}
|
||||
};
|
||||
|
||||
//Exports Bindings
|
||||
Napi::Object Init(Napi::Env env, Napi::Object exports) {
|
||||
return NodeModelWrapper::Init(env, exports);
|
||||
exports["LLModel"] = NodeModelWrapper::GetClass(env);
|
||||
return exports;
|
||||
}
|
||||
|
||||
|
||||
|
||||
NODE_API_MODULE(NODE_GYP_MODULE_NAME, Init)
|
||||
|
gpt4all-bindings/typescript/index.h (new file, 45 lines)
@ -0,0 +1,45 @@
|
||||
#include <napi.h>
|
||||
#include "llmodel.h"
|
||||
#include <iostream>
|
||||
#include "llmodel_c.h"
|
||||
#include "prompt.h"
|
||||
#include <atomic>
|
||||
#include <memory>
|
||||
#include <filesystem>
|
||||
namespace fs = std::filesystem;
|
||||
|
||||
class NodeModelWrapper: public Napi::ObjectWrap<NodeModelWrapper> {
|
||||
public:
|
||||
NodeModelWrapper(const Napi::CallbackInfo &);
|
||||
//~NodeModelWrapper();
|
||||
Napi::Value getType(const Napi::CallbackInfo& info);
|
||||
Napi::Value IsModelLoaded(const Napi::CallbackInfo& info);
|
||||
Napi::Value StateSize(const Napi::CallbackInfo& info);
|
||||
/**
|
||||
* Prompting the model. This entails spawning a new thread and adding the response tokens
|
||||
* into a thread local string variable.
|
||||
*/
|
||||
Napi::Value Prompt(const Napi::CallbackInfo& info);
|
||||
void SetThreadCount(const Napi::CallbackInfo& info);
|
||||
Napi::Value getName(const Napi::CallbackInfo& info);
|
||||
Napi::Value ThreadCount(const Napi::CallbackInfo& info);
|
||||
/*
|
||||
* The path that is used to search for the dynamic libraries
|
||||
*/
|
||||
Napi::Value GetLibraryPath(const Napi::CallbackInfo& info);
|
||||
/**
|
||||
* Creates the LLModel class
|
||||
*/
|
||||
static Napi::Function GetClass(Napi::Env);
|
||||
llmodel_model GetInference();
|
||||
private:
|
||||
/**
|
||||
* The underlying inference that interfaces with the C interface
|
||||
*/
|
||||
std::atomic<std::shared_ptr<llmodel_model>> inference_;
|
||||
|
||||
std::string type;
|
||||
// corresponds to LLModel::name() in typescript
|
||||
std::string name;
|
||||
static Napi::FunctionReference constructor;
|
||||
};
|
@@ -1,19 +1,32 @@
{
    "name": "gpt4all-ts",
    "name": "gpt4all",
    "version": "2.0.0",
    "packageManager": "yarn@3.5.1",
    "gypfile": true,
    "main": "src/gpt4all.js",
    "repository": "nomic-ai/gpt4all",
    "scripts": {
        "test": "node ./test/index.mjs"
        "test": "node ./test/index.mjs",
        "build:backend": "node scripts/build.js",
        "install": "node-gyp-build",
        "prebuild": "node scripts/prebuild.js",
        "docs:build": "documentation build ./src/gpt4all.d.ts --parse-extension d.ts --format md --output docs/api.md"
    },
    "dependencies": {
        "bindings": "^1.5.0",
        "node-addon-api": "^6.1.0"
        "mkdirp": "^3.0.1",
        "node-addon-api": "^6.1.0",
        "node-gyp-build": "^4.6.0"
    },
    "devDependencies": {
        "@types/node": "^20.1.5"
        "@types/node": "^20.1.5",
        "documentation": "^14.0.2",
        "prebuildify": "^5.0.1",
        "prettier": "^2.8.8"
    },
    "engines": {
        "node": ">= 18.x.x"
    },
    "prettier": {
        "endOfLine": "lf",
        "tabWidth": 4
    }

}

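Worth noting in the scripts block: "install" now defers to node-gyp-build, which is what lets consumers pick up a prebuildify-generated binary for their platform or fall back to a local compile. A small sketch of the lookup the runtime performs (the same call appears in src/gpt4all.js later in this diff):

    // How the native addon is resolved at require time (sketch).
    const path = require("node:path");
    // Resolves either a locally compiled build/Release binary or a matching
    // entry under prebuilds/<platform>-<arch>/.
    const { LLModel } = require("node-gyp-build")(path.resolve(__dirname, ".."));
    console.log(typeof LLModel); // "function" once a native build is found
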
gpt4all-bindings/typescript/prompt.cc  (new file, 62 lines)
@@ -0,0 +1,62 @@
#include "prompt.h"


TsfnContext::TsfnContext(Napi::Env env, const PromptWorkContext& pc)
    : deferred_(Napi::Promise::Deferred::New(env)), pc(pc) {
}

std::mutex mtx;
static thread_local std::string res;
bool response_callback(int32_t token_id, const char *response) {
    res += response;
    return token_id != -1;
}
bool recalculate_callback (bool isrecalculating) {
    return isrecalculating;
};
bool prompt_callback (int32_t tid) {
    return true;
};

// The thread entry point. This takes as its arguments the specific
// threadsafe-function context created inside the main thread.
void threadEntry(TsfnContext* context) {
    std::lock_guard<std::mutex> lock(mtx);
    // Perform a call into JavaScript.
    napi_status status =
        context->tsfn.NonBlockingCall(&context->pc,
        [](Napi::Env env, Napi::Function jsCallback, PromptWorkContext* pc) {
            llmodel_prompt(
                *pc->inference_,
                pc->question.c_str(),
                &prompt_callback,
                &response_callback,
                &recalculate_callback,
                &pc->prompt_params
            );
            jsCallback.Call({ Napi::String::New(env, res) });
            res.clear();
        });

    if (status != napi_ok) {
        Napi::Error::Fatal(
            "ThreadEntry",
            "Napi::ThreadSafeNapi::Function.NonBlockingCall() failed");
    }

    // Release the thread-safe function. This decrements the internal thread
    // count, and will perform finalization since the count will reach 0.
    context->tsfn.Release();
}

void FinalizerCallback(Napi::Env env,
                       void* finalizeData,
                       TsfnContext* context) {
    // Join the thread
    context->nativeThread.join();
    // Resolve the Promise previously returned to JS via the CreateTSFN method.
    context->deferred_.Resolve(Napi::Boolean::New(env, true));
    delete context;
}

gpt4all-bindings/typescript/prompt.h  (new file, 42 lines)
@@ -0,0 +1,42 @@
#ifndef TSFN_CONTEXT_H
#define TSFN_CONTEXT_H

#include "napi.h"
#include "llmodel_c.h"
#include <thread>
#include <mutex>
#include <iostream>
#include <atomic>
#include <memory>
struct PromptWorkContext {
    std::string question;
    std::shared_ptr<llmodel_model> inference_;
    llmodel_prompt_context prompt_params;
};

struct TsfnContext {
public:
    TsfnContext(Napi::Env env, const PromptWorkContext &pc);
    std::thread nativeThread;
    Napi::Promise::Deferred deferred_;
    PromptWorkContext pc;
    Napi::ThreadSafeFunction tsfn;

    // Some data to pass around
    // int ints[ARRAY_LENGTH];

};

// The thread entry point. This takes as its arguments the specific
// threadsafe-function context created inside the main thread.
void threadEntry(TsfnContext* context);

// The thread-safe function finalizer callback. This callback executes
// at destruction of thread-safe function, taking as arguments the finalizer
// data and threadsafe-function context.
void FinalizerCallback(Napi::Env env, void* finalizeData, TsfnContext* context);

bool response_callback(int32_t token_id, const char *response);
bool recalculate_callback (bool isrecalculating);
bool prompt_callback (int32_t tid);
#endif // TSFN_CONTEXT_H
gpt4all-bindings/typescript/scripts/build.js  (new file, 17 lines)
@@ -0,0 +1,17 @@
const { spawn } = require("node:child_process");
const { resolve } = require("path");
const args = process.argv.slice(2);
const platform = process.platform;

//windows 64bit or 32
if (platform === "win32") {
    const path = "scripts/build_msvc.bat";
    spawn(resolve(path), ["/Y", ...args], { shell: true, stdio: "inherit" });
    process.on("data", (s) => console.log(s.toString()));
} else if (platform === "linux" || platform === "darwin") {
    const path = "scripts/build_unix.sh";
    const bash = spawn(`sh`, [path, ...args]);
    bash.stdout.on("data", (s) => console.log(s.toString()), {
        stdio: "inherit",
    });
}

gpt4all-bindings/typescript/scripts/build_mingw.ps1  (new file, 16 lines)
@@ -0,0 +1,16 @@
$ROOT_DIR = '.\runtimes\win-x64'
$BUILD_DIR = '.\runtimes\win-x64\build\mingw'
$LIBS_DIR = '.\runtimes\win-x64\native'

# cleanup env
Remove-Item -Force -Recurse $ROOT_DIR -ErrorAction SilentlyContinue | Out-Null
mkdir $BUILD_DIR | Out-Null
mkdir $LIBS_DIR | Out-Null

# build
cmake -G "MinGW Makefiles" -S ..\..\gpt4all-backend -B $BUILD_DIR -DLLAMA_AVX2=ON
cmake --build $BUILD_DIR --parallel --config Release

# copy native dlls
# cp "C:\ProgramData\chocolatey\lib\mingw\tools\install\mingw64\bin\*dll" $LIBS_DIR
cp "$BUILD_DIR\bin\*.dll" $LIBS_DIR
gpt4all-bindings/typescript/scripts/build_unix.sh  (new file, 31 lines)
@@ -0,0 +1,31 @@
#!/bin/sh

SYSNAME=$(uname -s)

if [ "$SYSNAME" = "Linux" ]; then
    BASE_DIR="runtimes/linux-x64"
    LIB_EXT="so"
elif [ "$SYSNAME" = "Darwin" ]; then
    BASE_DIR="runtimes/osx"
    LIB_EXT="dylib"
elif [ -n "$SYSNAME" ]; then
    echo "Unsupported system: $SYSNAME" >&2
    exit 1
else
    echo "\"uname -s\" failed" >&2
    exit 1
fi

NATIVE_DIR="$BASE_DIR/native"
BUILD_DIR="$BASE_DIR/build"

rm -rf "$BASE_DIR"
mkdir -p "$NATIVE_DIR" "$BUILD_DIR"

cmake -S ../../gpt4all-backend -B "$BUILD_DIR" &&
cmake --build "$BUILD_DIR" -j --config Release && {
    cp "$BUILD_DIR"/libllmodel.$LIB_EXT "$NATIVE_DIR"/
    cp "$BUILD_DIR"/libgptj*.$LIB_EXT "$NATIVE_DIR"/
    cp "$BUILD_DIR"/libllama*.$LIB_EXT "$NATIVE_DIR"/
    cp "$BUILD_DIR"/libmpt*.$LIB_EXT "$NATIVE_DIR"/
}

gpt4all-bindings/typescript/scripts/prebuild.js  (new file, 50 lines)
@@ -0,0 +1,50 @@
const prebuildify = require("prebuildify");

async function createPrebuilds(combinations) {
    for (const { platform, arch } of combinations) {
        const opts = {
            platform,
            arch,
            napi: true,
        };
        try {
            await createPrebuild(opts);
            console.log(
                `Build succeeded for platform ${opts.platform} and architecture ${opts.arch}`
            );
        } catch (err) {
            console.error(
                `Error building for platform ${opts.platform} and architecture ${opts.arch}:`,
                err
            );
        }
    }
}

function createPrebuild(opts) {
    return new Promise((resolve, reject) => {
        prebuildify(opts, (err) => {
            if (err) {
                reject(err);
            } else {
                resolve();
            }
        });
    });
}

const prebuildConfigs = [
    { platform: "win32", arch: "x64" },
    { platform: "win32", arch: "arm64" },
    // { platform: 'win32', arch: 'armv7' },
    { platform: "darwin", arch: "x64" },
    { platform: "darwin", arch: "arm64" },
    // { platform: 'darwin', arch: 'armv7' },
    { platform: "linux", arch: "x64" },
    { platform: "linux", arch: "arm64" },
    { platform: "linux", arch: "armv7" },
];

createPrebuilds(prebuildConfigs)
    .then(() => console.log("All builds succeeded"))
    .catch((err) => console.error("Error building:", err));

@@ -1,14 +1,15 @@
import { LLModel, prompt, createCompletion } from '../src/gpt4all.js'

const ll = new LLModel("./ggml-vicuna-7b-1.1-q4_2.bin");
import { LLModel, createCompletion, DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY } from '../src/gpt4all.js'

const ll = new LLModel({
    model_name: 'ggml-vicuna-7b-1.1-q4_2.bin',
    model_path: './',
    library_path: DEFAULT_LIBRARIES_DIRECTORY
});

try {
    class Extended extends LLModel {

    }

} catch(e) {
    console.log("Extending from native class gone wrong " + e)
}
@@ -20,13 +21,26 @@ ll.setThreadCount(5);
console.log("thread count " + ll.threadCount());
ll.setThreadCount(4);
console.log("thread count " + ll.threadCount());
console.log("name " + ll.name());
console.log("type: " + ll.type());
console.log("Default directory for models", DEFAULT_DIRECTORY);
console.log("Default directory for libraries", DEFAULT_LIBRARIES_DIRECTORY);


console.log(createCompletion(
console.log(await createCompletion(
    ll,
    prompt`${"header"} ${"prompt"}`, {
        verbose: true,
        prompt: 'hello! Say something thought provoking.'
    }
    [
        { role : 'system', content: 'You are a girl who likes playing league of legends.' },
        { role : 'user', content: 'What is the best top laner to play right now?' },
    ],
    { verbose: false }
));


console.log(await createCompletion(
    ll,
    [
        { role : 'user', content: 'What is the best bottom laner to play right now?' },
    ],
))

gpt4all-bindings/typescript/src/config.js  (new file, 22 lines)
@@ -0,0 +1,22 @@
const os = require("node:os");
const path = require("node:path");

const DEFAULT_DIRECTORY = path.resolve(os.homedir(), ".cache/gpt4all");

const librarySearchPaths = [
    path.join(DEFAULT_DIRECTORY, "libraries"),
    path.resolve("./libraries"),
    path.resolve(
        __dirname,
        "..",
        `runtimes/${process.platform}-${process.arch}/native`
    ),
    process.cwd(),
];

const DEFAULT_LIBRARIES_DIRECTORY = librarySearchPaths.join(";");

module.exports = {
    DEFAULT_DIRECTORY,
    DEFAULT_LIBRARIES_DIRECTORY,
};

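DEFAULT_LIBRARIES_DIRECTORY is therefore not a single directory but a semicolon-separated search list. A short sketch of resolving it to the first path that actually exists, which is the same walk loadModel performs in src/gpt4all.js further down in this diff:

    const { existsSync } = require("node:fs");
    const { DEFAULT_LIBRARIES_DIRECTORY } = require("./config.js");

    // First entry that exists on disk wins; null means no backend libraries were found.
    const libPath =
        DEFAULT_LIBRARIES_DIRECTORY.split(";").find((p) => existsSync(p)) ?? null;
    console.log("native libraries resolved to:", libPath);
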
gpt4all-bindings/typescript/src/gpt4all.d.ts  (vendored, 382 lines changed)
@@ -1,162 +1,310 @@
/// <reference types="node" />
declare module 'gpt4all-ts';
declare module "gpt4all";

export * from "./util.d.ts";

/** Type of the model */
type ModelType = "gptj" | "llama" | "mpt" | "replit";


interface LLModelPromptContext {

    // Size of the raw logits vector
    logits_size: number;

    // Size of the raw tokens vector
    tokens_size: number;

    // Number of tokens in past conversation
    n_past: number;

    // Number of tokens possible in context window
    n_ctx: number;

    // Number of tokens to predict
    n_predict: number;

    // Top k logits to sample from
    top_k: number;

    // Nucleus sampling probability threshold
    top_p: number;

    // Temperature to adjust model's output distribution
    temp: number;

    // Number of predictions to generate in parallel
    n_batch: number;

    // Penalty factor for repeated tokens
    repeat_penalty: number;

    // Last n tokens to penalize
    repeat_last_n: number;

    // Percent of context to erase if we exceed the context window
    context_erase: number;
/**
 * Full list of models available
 */
interface ModelFile {
    /** List of GPT-J Models */
    gptj:
        | "ggml-gpt4all-j-v1.3-groovy.bin"
        | "ggml-gpt4all-j-v1.2-jazzy.bin"
        | "ggml-gpt4all-j-v1.1-breezy.bin"
        | "ggml-gpt4all-j.bin";
    /** List Llama Models */
    llama:
        | "ggml-gpt4all-l13b-snoozy.bin"
        | "ggml-vicuna-7b-1.1-q4_2.bin"
        | "ggml-vicuna-13b-1.1-q4_2.bin"
        | "ggml-wizardLM-7B.q4_2.bin"
        | "ggml-stable-vicuna-13B.q4_2.bin"
        | "ggml-nous-gpt4-vicuna-13b.bin"
        | "ggml-v3-13b-hermes-q5_1.bin";
    /** List of MPT Models */
    mpt:
        | "ggml-mpt-7b-base.bin"
        | "ggml-mpt-7b-chat.bin"
        | "ggml-mpt-7b-instruct.bin";
    /** List of Replit Models */
    replit: "ggml-replit-code-v1-3b.bin";
}


//mirrors py options
interface LLModelOptions {
    /**
     * Model architecture. This argument currently does not have any functionality and is just used as descriptive identifier for user.
     */
    type?: ModelType;
    model_name: ModelFile[ModelType];
    model_path: string;
    library_path?: string;
}
/**
 * LLModel class representing a language model.
 * This is a base class that provides common functionality for different types of language models.
 */
declare class LLModel {
    //either 'gpt', mpt', or 'llama'
    type() : ModelType;
    //The name of the model
    name(): ModelFile;
    /**
     * Initialize a new LLModel.
     * @param path Absolute path to the model file.
     * @throws {Error} If the model file does not exist.
     */
    constructor(path: string);
    constructor(options: LLModelOptions);

    /** either 'gpt', mpt', or 'llama' or undefined */
    type(): ModelType | undefined;

    /** The name of the model. */
    name(): ModelFile;

    /**
     * Get the size of the internal state of the model.
     * NOTE: This state data is specific to the type of model you have created.
     * @return the size in bytes of the internal state of the model
     */
    stateSize(): number;

    /**
     * Get the number of threads used for model inference.
     * The default is the number of physical cores your computer has.
     * @returns The number of threads used for model inference.
     */
    threadCount() : number;
    threadCount(): number;

    /**
     * Set the number of threads used for model inference.
     * @param newNumber The new number of threads.
     */
    setThreadCount(newNumber: number): void;
    /**
     * Prompt the model with a given input and optional parameters.
     * This is the raw output from std out.
     * Use the prompt function exported for a value
     * @param q The prompt input.
     * @param params Optional parameters for the prompt context.
     * @returns The result of the model prompt.
     */
    raw_prompt(q: string, params?: Partial<LLModelPromptContext>) : unknown; //todo work on return type

}

interface DownloadController {
    //Cancel the request to download from gpt4all website if this is called.
    cancel: () => void;
    //Convert the downloader into a promise, allowing people to await and manage its lifetime
    promise: () => Promise<void>
}


export interface DownloadConfig {
    /**
     * location to download the model.
     * Default is process.cwd(), or the current working directory
     * Prompt the model with a given input and optional parameters.
     * This is the raw output from std out.
     * Use the prompt function exported for a value
     * @param q The prompt input.
     * @param params Optional parameters for the prompt context.
     * @returns The result of the model prompt.
     */
    location: string;
    raw_prompt(q: string, params: Partial<LLModelPromptContext>, callback: (res: string) => void): void; // TODO work on return type

    /**
     * Debug mode -- check how long it took to download in seconds
     * Whether the model is loaded or not.
     */
    debug: boolean;
    isModelLoaded(): boolean;

    /**
     * Default link = https://gpt4all.io/models`
     * This property overrides the default.
     * Where to search for the pluggable backend libraries
     */
    link?: string
}
/**
 * Initiates the download of a model file of a specific model type.
 * By default this downloads without waiting. use the controller returned to alter this behavior.
 * @param {ModelFile[ModelType]} m - The model file to be downloaded.
 * @param {Record<string, unknown>} op - options to pass into the downloader. Default is { location: (cwd), debug: false }.
 * @returns {DownloadController} A DownloadController object that allows controlling the download process.
 */
declare function download(m: ModelFile[ModelType], op: { location: string, debug: boolean, link?:string }): DownloadController


type ModelType = 'gptj' | 'llama' | 'mpt';

/*
 * A nice interface for intellisense of all possibly models.
 */
interface ModelFile {
    'gptj': | "ggml-gpt4all-j-v1.3-groovy.bin"
            | "ggml-gpt4all-j-v1.2-jazzy.bin"
            | "ggml-gpt4all-j-v1.1-breezy.bin"
            | "ggml-gpt4all-j.bin";
    'llama':| "ggml-gpt4all-l13b-snoozy.bin"
            | "ggml-vicuna-7b-1.1-q4_2.bin"
            | "ggml-vicuna-13b-1.1-q4_2.bin"
            | "ggml-wizardLM-7B.q4_2.bin"
            | "ggml-stable-vicuna-13B.q4_2.bin"
            | "ggml-nous-gpt4-vicuna-13b.bin"
    'mpt':  | "ggml-mpt-7b-base.bin"
            | "ggml-mpt-7b-chat.bin"
            | "ggml-mpt-7b-instruct.bin"
    setLibraryPath(s: string): void;
    /**
     * Where to get the pluggable backend libraries
     */
    getLibraryPath(): string;
}

interface ExtendedOptions {
interface LoadModelOptions {
    modelPath?: string;
    librariesPath?: string;
    allowDownload?: boolean;
    verbose?: boolean;
    system?: string;
    header?: string;
    prompt: string;
    promptEntries?: Record<string, unknown>
}

type PromptTemplate = (...args: string[]) => string;
declare function loadModel(
    modelName: string,
    options?: LoadModelOptions
): Promise<LLModel>;

/**
 * The nodejs equivalent to python binding's chat_completion
 * @param {LLModel} llmodel - The language model object.
 * @param {PromptMessage[]} messages - The array of messages for the conversation.
 * @param {CompletionOptions} options - The options for creating the completion.
 * @returns {CompletionReturn} The completion result.
 * @example
 * const llmodel = new LLModel(model)
 * const messages = [
 * { role: 'system', message: 'You are a weather forecaster.' },
 * { role: 'user', message: 'should i go out today?' } ]
 * const completion = await createCompletion(llmodel, messages, {
 *  verbose: true,
 *  temp: 0.9,
 * })
 * console.log(completion.choices[0].message.content)
 * // No, it's going to be cold and rainy.
 */
declare function createCompletion(
    model: LLModel,
    pt: PromptTemplate,
    options: LLModelPromptContext&ExtendedOptions
) : string
    llmodel: LLModel,
    messages: PromptMessage[],
    options?: CompletionOptions
): Promise<CompletionReturn>;

function prompt(
    strings: TemplateStringsArray
): PromptTemplate
/**
 * The options for creating the completion.
 */
interface CompletionOptions extends Partial<LLModelPromptContext> {
    /**
     * Indicates if verbose logging is enabled.
     * @default true
     */
    verbose?: boolean;

    /**
     * Indicates if the default header is included in the prompt.
     * @default true
     */
    hasDefaultHeader?: boolean;

export { LLModel, LLModelPromptContext, ModelType, download, DownloadController, prompt, ExtendedOptions, createCompletion }
    /**
     * Indicates if the default footer is included in the prompt.
     * @default true
     */
    hasDefaultFooter?: boolean;
}

/**
 * A message in the conversation, identical to OpenAI's chat message.
 */
interface PromptMessage {
    /** The role of the message. */
    role: "system" | "assistant" | "user";

    /** The message content. */
    content: string;
}

/**
 * The result of the completion, similar to OpenAI's format.
 */
interface CompletionReturn {
    /** The model name.
     * @type {ModelFile}
     */
    model: ModelFile[ModelType];

    /** Token usage report. */
    usage: {
        /** The number of tokens used in the prompt. */
        prompt_tokens: number;

        /** The number of tokens used in the completion. */
        completion_tokens: number;

        /** The total number of tokens used. */
        total_tokens: number;
    };

    /** The generated completions. */
    choices: CompletionChoice[];
}

/**
 * A completion choice, similar to OpenAI's format.
 */
interface CompletionChoice {
    /** Response message */
    message: PromptMessage;
}

/**
 * Model inference arguments for generating completions.
 */
interface LLModelPromptContext {
    /** The size of the raw logits vector. */
    logits_size: number;

    /** The size of the raw tokens vector. */
    tokens_size: number;

    /** The number of tokens in the past conversation. */
    n_past: number;

    /** The number of tokens possible in the context window.
     * @default 1024
     */
    n_ctx: number;

    /** The number of tokens to predict.
     * @default 128
     * */
    n_predict: number;

    /** The top-k logits to sample from.
     * @default 40
     * */
    top_k: number;

    /** The nucleus sampling probability threshold.
     * @default 0.9
     * */
    top_p: number;

    /** The temperature to adjust the model's output distribution.
     * @default 0.72
     * */
    temp: number;

    /** The number of predictions to generate in parallel.
     * @default 8
     * */
    n_batch: number;

    /** The penalty factor for repeated tokens.
     * @default 1
     * */
    repeat_penalty: number;

    /** The number of last tokens to penalize.
     * @default 10
     * */
    repeat_last_n: number;

    /** The percentage of context to erase if the context window is exceeded.
     * @default 0.5
     * */
    context_erase: number;
}

/**
 * TODO: Help wanted to implement this
 */
declare function createTokenStream(
    llmodel: LLModel,
    messages: PromptMessage[],
    options: CompletionOptions
): (ll: LLModel) => AsyncGenerator<string>;
/**
 * From python api:
 * models will be stored in (homedir)/.cache/gpt4all/`
 */
declare const DEFAULT_DIRECTORY: string;
/**
 * From python api:
 * The default path for dynamic libraries to be stored.
 * You may separate paths by a semicolon to search in multiple areas.
 * This searches DEFAULT_DIRECTORY/libraries, cwd/libraries, and finally cwd.
 */
declare const DEFAULT_LIBRARIES_DIRECTORY: string;
interface PromptMessage {
    role: "system" | "assistant" | "user";
    content: string;
}
export {
    ModelType,
    ModelFile,
    LLModel,
    LLModelPromptContext,
    PromptMessage,
    CompletionOptions,
    LoadModelOptions,
    loadModel,
    createCompletion,
    createTokenStream,
    DEFAULT_DIRECTORY,
    DEFAULT_LIBRARIES_DIRECTORY,
};

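Put together, the declarations above describe the intended happy path: resolve or download a model, then ask for an OpenAI-style chat completion. A hedged end-to-end sketch against this declared API (the model name and sampling options are illustrative):

    import { loadModel, createCompletion } from "gpt4all";

    // Downloads into DEFAULT_DIRECTORY on first use while allowDownload stays on.
    const model = await loadModel("ggml-gpt4all-j-v1.3-groovy.bin", { verbose: false });

    const completion = await createCompletion(model, [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Name three uses for a paperclip." },
    ], { temp: 0.7 });

    console.log(completion.choices[0].message.content);
    console.log(completion.usage.total_tokens);
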
@@ -1,112 +1,138 @@
"use strict";

/// This file implements the gpt4all.d.ts file endings.
/// Written in commonjs to support both ESM and CJS projects.
const { existsSync } = require("fs");
const path = require("node:path");
const { LLModel } = require("node-gyp-build")(path.resolve(__dirname, ".."));
const {
    retrieveModel,
    downloadModel,
    appendBinSuffixIfMissing,
} = require("./util.js");
const config = require("./config.js");

const { LLModel } = require('bindings')('../build/Release/gpt4allts');
const { createWriteStream, existsSync } = require('fs');
const { join } = require('path');
const { performance } = require('node:perf_hooks');


// readChunks() reads from the provided reader and yields the results into an async iterable
// https://css-tricks.com/web-streams-everywhere-and-fetch-for-node-js/
function readChunks(reader) {
    return {
        async* [Symbol.asyncIterator]() {
            let readResult = await reader.read();
            while (!readResult.done) {
                yield readResult.value;
                readResult = await reader.read();
            }
        },
async function loadModel(modelName, options = {}) {
    const loadOptions = {
        modelPath: config.DEFAULT_DIRECTORY,
        librariesPath: config.DEFAULT_LIBRARIES_DIRECTORY,
        allowDownload: true,
        verbose: true,
        ...options,
    };

    await retrieveModel(modelName, {
        modelPath: loadOptions.modelPath,
        allowDownload: loadOptions.allowDownload,
        verbose: loadOptions.verbose,
    });

    const libSearchPaths = loadOptions.librariesPath.split(";");

    let libPath = null;

    for (const searchPath of libSearchPaths) {
        if (existsSync(searchPath)) {
            libPath = searchPath;
            break;
        }
    }

    const llmOptions = {
        model_name: appendBinSuffixIfMissing(modelName),
        model_path: loadOptions.modelPath,
        library_path: libPath,
    };

    if (loadOptions.verbose) {
        console.log("Creating LLModel with options:", llmOptions);
    }
    const llmodel = new LLModel(llmOptions);

    return llmodel;
}

exports.LLModel = LLModel;
function createPrompt(messages, hasDefaultHeader, hasDefaultFooter) {
    let fullPrompt = "";

    for (const message of messages) {
        if (message.role === "system") {
            const systemMessage = message.content + "\n";
            fullPrompt += systemMessage;
        }
    }
    if (hasDefaultHeader) {
        fullPrompt += `### Instruction:
        The prompt below is a question to answer, a task to complete, or a conversation
        to respond to; decide which and write an appropriate response.
        \n### Prompt:
        `;
    }
    for (const message of messages) {
        if (message.role === "user") {
            const user_message = "\n" + message["content"];
            fullPrompt += user_message;
        }
        if (message["role"] == "assistant") {
            const assistant_message = "\nResponse: " + message["content"];
            fullPrompt += assistant_message;
        }
    }
    if (hasDefaultFooter) {
        fullPrompt += "\n### Response:";
    }

exports.download = function (
    name,
    options = { debug: false, location: process.cwd(), link: undefined }
    return fullPrompt;
}

async function createCompletion(
    llmodel,
    messages,
    options = {
        hasDefaultHeader: true,
        hasDefaultFooter: false,
        verbose: true,
    }
) {
    const abortController = new AbortController();
    const signal = abortController.signal;

    const pathToModel = join(options.location, name);
    if(existsSync(pathToModel)) {
        throw Error("Path to model already exists");
    }

    //wrapper function to get the readable stream from request
    const fetcher = (name) => fetch(options.link ?? `https://gpt4all.io/models/${name}`, {
        signal,
    })
    .then(res => {
        if(!res.ok) {
            throw Error("Could not find "+ name + " from " + `https://gpt4all.io/models/` )
        }
        return res.body.getReader()
    })

    //a promise that executes and writes to a stream. Resolves when done writing.
    const res = new Promise((resolve, reject) => {
        fetcher(name)
        //Resolves an array of a reader and writestream.
        .then(reader => [reader, createWriteStream(pathToModel)])
        .then(
            async ([readable, wstream]) => {
                console.log('(CLI might hang) Downloading @ ', pathToModel);
                let perf;
                if(options.debug) {
                    perf = performance.now();
                }
                for await (const chunk of readChunks(readable)) {
                    wstream.write(chunk);
                }
                if(options.debug) {
                    console.log("Time taken: ", (performance.now()-perf).toFixed(2), " ms");
                }
                resolve();
            }
        ).catch(reject);
    });

    return {
        cancel : () => abortController.abort(),
        promise: () => res
    }
}


//https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals#tagged_templates
exports.prompt = function prompt(strings, ...keys) {
    return (...values) => {
        const dict = values[values.length - 1] || {};
        const result = [strings[0]];
        keys.forEach((key, i) => {
            const value = Number.isInteger(key) ? values[key] : dict[key];
            result.push(value, strings[i + 1]);
        });
        return result.join("");
    };
}


exports.createCompletion = function (llmodel, promptMaker, options) {
    //creating the keys to insert into promptMaker.
    const entries = {
        system: options.system ?? '',
        header: options.header ?? "### Instruction: The prompt below is a question to answer, a task to complete, or a conversation to respond to; decide which and write an appropriate response.\n### Prompt: ",
        prompt: options.prompt,
        ...(options.promptEntries ?? {})
    };

    const fullPrompt = promptMaker(entries)+'\n### Response:';

    if(options.verbose) {
        console.log("sending prompt: " + `"${fullPrompt}"`)
    const fullPrompt = createPrompt(
        messages,
        options.hasDefaultHeader ?? true,
        options.hasDefaultFooter
    );
    if (options.verbose) {
        console.log("Sent: " + fullPrompt);
    }

    return llmodel.raw_prompt(fullPrompt, options);
    const promisifiedRawPrompt = new Promise((resolve, rej) => {
        llmodel.raw_prompt(fullPrompt, options, (s) => {
            resolve(s);
        });
    });
    return promisifiedRawPrompt.then((response) => {
        return {
            llmodel: llmodel.name(),
            usage: {
                prompt_tokens: fullPrompt.length,
                completion_tokens: response.length, //TODO
                total_tokens: fullPrompt.length + response.length, //TODO
            },
            choices: [
                {
                    message: {
                        role: "assistant",
                        content: response,
                    },
                },
            ],
        };
    });
}

module.exports = {
    ...config,
    LLModel,
    createCompletion,
    downloadModel,
    retrieveModel,
    loadModel,
};

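For reference, createPrompt above flattens the chat messages into a single instruction-style string before it reaches raw_prompt; the sketch below spells out roughly what that string looks like for a system + user exchange with the default header on and no footer (whitespace approximate, not part of the diff).

    const messages = [
        { role: "system", content: "You are a weather forecaster." },
        { role: "user", content: "Should I go out today?" },
    ];
    // createPrompt(messages, true, false) yields approximately:
    //   "You are a weather forecaster.\n" +
    //   "### Instruction:\nThe prompt below is a question to answer, a task to complete, or a conversation\n" +
    //   "to respond to; decide which and write an appropriate response.\n\n### Prompt:\n" +
    //   "\nShould I go out today?"
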
gpt4all-bindings/typescript/src/util.d.ts  (vendored, new file, 69 lines)
@@ -0,0 +1,69 @@
/// <reference types="node" />
declare module "gpt4all";

/**
 * Initiates the download of a model file of a specific model type.
 * By default this downloads without waiting. use the controller returned to alter this behavior.
 * @param {ModelFile} model - The model file to be downloaded.
 * @param {DownloadOptions} options - to pass into the downloader. Default is { location: (cwd), debug: false }.
 * @returns {DownloadController} object that allows controlling the download process.
 *
 * @throws {Error} If the model already exists in the specified location.
 * @throws {Error} If the model cannot be found at the specified url.
 *
 * @example
 * const controller = download('ggml-gpt4all-j-v1.3-groovy.bin')
 * controller.promise().then(() => console.log('Downloaded!'))
 */
declare function downloadModel(
    modelName: string,
    options?: DownloadModelOptions
): DownloadController;

/**
 * Options for the model download process.
 */
export interface DownloadModelOptions {
    /**
     * location to download the model.
     * Default is process.cwd(), or the current working directory
     */
    modelPath?: string;

    /**
     * Debug mode -- check how long it took to download in seconds
     * @default false
     */
    debug?: boolean;

    /**
     * Remote download url. Defaults to `https://gpt4all.io/models`
     * @default https://gpt4all.io/models
     */
    url?: string;
}

declare function listModels(): Promise<Record<string, string>[]>;

interface RetrieveModelOptions {
    allowDownload?: boolean;
    verbose?: boolean;
    modelPath?: string;
}

declare async function retrieveModel(
    model: string,
    options?: RetrieveModelOptions
): Promise<string>;

/**
 * Model download controller.
 */
interface DownloadController {
    /** Cancel the request to download from gpt4all website if this is called. */
    cancel: () => void;
    /** Convert the downloader into a promise, allowing people to await and manage its lifetime */
    promise: () => Promise<void>;
}

export { downloadModel, DownloadModelOptions, DownloadController, listModels, retrieveModel, RetrieveModelOptions };

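The DownloadController shape pairs an immediate cancel handle with a lazily awaited promise. A hedged sketch of using it with a timeout-based cancel (the model name and timeout are illustrative):

    import { downloadModel } from "gpt4all";

    const controller = downloadModel("ggml-gpt4all-j-v1.3-groovy.bin", { debug: true });
    const timer = setTimeout(() => controller.cancel(), 60_000); // give up after a minute

    try {
        await controller.promise(); // resolves once the file has been written
        console.log("download finished");
    } catch (err) {
        console.error("download failed or was cancelled:", err);
    } finally {
        clearTimeout(timer);
    }
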
gpt4all-bindings/typescript/src/util.js  (new file, 156 lines)
@@ -0,0 +1,156 @@
const { createWriteStream, existsSync } = require("fs");
const { performance } = require("node:perf_hooks");
const path = require("node:path");
const { mkdirp } = require("mkdirp");
const { DEFAULT_DIRECTORY, DEFAULT_LIBRARIES_DIRECTORY } = require("./config.js");

async function listModels() {
    const res = await fetch("https://gpt4all.io/models/models.json");
    const modelList = await res.json();
    return modelList;
}

function appendBinSuffixIfMissing(name) {
    if (!name.endsWith(".bin")) {
        return name + ".bin";
    }
    return name;
}

// readChunks() reads from the provided reader and yields the results into an async iterable
// https://css-tricks.com/web-streams-everywhere-and-fetch-for-node-js/
function readChunks(reader) {
    return {
        async *[Symbol.asyncIterator]() {
            let readResult = await reader.read();
            while (!readResult.done) {
                yield readResult.value;
                readResult = await reader.read();
            }
        },
    };
}

function downloadModel(
    modelName,
    options = {}
) {
    const downloadOptions = {
        modelPath: DEFAULT_DIRECTORY,
        debug: false,
        url: "https://gpt4all.io/models",
        ...options,
    };

    const modelFileName = appendBinSuffixIfMissing(modelName);
    const fullModelPath = path.join(downloadOptions.modelPath, modelFileName);
    const modelUrl = `${downloadOptions.url}/${modelFileName}`

    if (existsSync(fullModelPath)) {
        throw Error(`Model already exists at ${fullModelPath}`);
    }

    const abortController = new AbortController();
    const signal = abortController.signal;

    //wrapper function to get the readable stream from request
    // const baseUrl = options.url ?? "https://gpt4all.io/models";
    const fetchModel = () =>
        fetch(modelUrl, {
            signal,
        }).then((res) => {
            if (!res.ok) {
                throw Error(`Failed to download model from ${modelUrl} - ${res.statusText}`);
            }
            return res.body.getReader();
        });

    //a promise that executes and writes to a stream. Resolves when done writing.
    const res = new Promise((resolve, reject) => {
        fetchModel()
            //Resolves an array of a reader and writestream.
            .then((reader) => [reader, createWriteStream(fullModelPath)])
            .then(async ([readable, wstream]) => {
                console.log("Downloading @ ", fullModelPath);
                let perf;
                if (options.debug) {
                    perf = performance.now();
                }
                for await (const chunk of readChunks(readable)) {
                    wstream.write(chunk);
                }
                if (options.debug) {
                    console.log(
                        "Time taken: ",
                        (performance.now() - perf).toFixed(2),
                        " ms"
                    );
                }
                resolve(fullModelPath);
            })
            .catch(reject);
    });

    return {
        cancel: () => abortController.abort(),
        promise: () => res,
    };
};

async function retrieveModel (
    modelName,
    options = {}
) {
    const retrieveOptions = {
        modelPath: DEFAULT_DIRECTORY,
        allowDownload: true,
        verbose: true,
        ...options,
    };

    await mkdirp(retrieveOptions.modelPath);

    const modelFileName = appendBinSuffixIfMissing(modelName);
    const fullModelPath = path.join(retrieveOptions.modelPath, modelFileName);
    const modelExists = existsSync(fullModelPath);

    if (modelExists) {
        return fullModelPath;
    }

    if (!retrieveOptions.allowDownload) {
        throw Error(`Model does not exist at ${fullModelPath}`);
    }

    const availableModels = await listModels();
    const foundModel = availableModels.find((model) => model.filename === modelFileName);

    if (!foundModel) {
        throw Error(`Model "${modelName}" is not available.`);
    }

    if (retrieveOptions.verbose) {
        console.log(`Downloading ${modelName}...`);
    }

    const downloadController = downloadModel(modelName, {
        modelPath: retrieveOptions.modelPath,
        debug: retrieveOptions.verbose,
    });

    const downloadPath = await downloadController.promise();

    if (retrieveOptions.verbose) {
        console.log(`Model downloaded to ${downloadPath}`);
    }

    return downloadPath

}


module.exports = {
    appendBinSuffixIfMissing,
    downloadModel,
    retrieveModel,
};

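retrieveModel is the piece loadModel leans on: it creates the model directory, returns early when the file is already on disk, and otherwise consults models.json before delegating to downloadModel. A small usage sketch (the model name is illustrative and must be listed in models.json for the download branch to succeed):

    const { retrieveModel } = require("./util.js");

    retrieveModel("ggml-gpt4all-j-v1.3-groovy.bin", {
        allowDownload: true, // set to false to fail fast instead of downloading
        verbose: true,
    })
        .then((modelPath) => console.log("model available at", modelPath))
        .catch((err) => console.error(err));
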
@@ -1,14 +0,0 @@

#include "stdcapture.h"

CoutRedirect::CoutRedirect() {
    old = std::cout.rdbuf(buffer.rdbuf()); // redirect cout to buffer stream
}

std::string CoutRedirect::getString() {
    return buffer.str(); // get string
}

CoutRedirect::~CoutRedirect() {
    std::cout.rdbuf(old); // reverse redirect
}

@@ -1,21 +0,0 @@
//https://stackoverflow.com/questions/5419356/redirect-stdout-stderr-to-a-string
#ifndef COUTREDIRECT_H
#define COUTREDIRECT_H

#include <iostream>
#include <streambuf>
#include <string>
#include <sstream>

class CoutRedirect {
public:
    CoutRedirect();
    std::string getString();
    ~CoutRedirect();

private:
    std::stringstream buffer;
    std::streambuf* old;
};

#endif // COUTREDIRECT_H

@@ -1,38 +1,5 @@
import * as assert from 'node:assert'
import { prompt, download } from '../src/gpt4all.js'

{

    const somePrompt = prompt`${"header"} Hello joe, my name is Ron. ${"prompt"}`;
    assert.equal(
        somePrompt({ header: 'oompa', prompt: 'holy moly' }),
        'oompa Hello joe, my name is Ron. holy moly'
    );

}

{

    const indexedPrompt = prompt`${0}, ${1} ${0}`;
    assert.equal(
        indexedPrompt('hello', 'world'),
        'hello, world hello'
    );

    assert.notEqual(
        indexedPrompt(['hello', 'world']),
        'hello, world hello'
    );

}

{
    assert.equal(
        (prompt`${"header"} ${"prompt"}`)({ header: 'hello', prompt: 'poo' }), 'hello poo',
        "Template prompt not equal"
    );

}
import { download } from '../src/gpt4all.js'


assert.rejects(async () => download('poo.bin').promise());

File diff suppressed because it is too large