gpt4all/gpt4all-bindings/typescript
Andreas Obersteiner a602f7fde7
typescript bindings maintenance (#2363)
2024-06-03 11:12:55 -05:00

GPT4All Node.js API

Native Node.js LLM bindings for all.

yarn add gpt4all@latest

npm install gpt4all@latest

pnpm install gpt4all@latest

Breaking changes in version 4!!

API Examples

Chat Completion

Use a chat session to keep context between completions. This is useful for efficient back and forth conversations.

import { createCompletion, loadModel } from "../src/gpt4all.js";

const model = await loadModel("orca-mini-3b-gguf2-q4_0.gguf", {
    verbose: true, // logs loaded model configuration
    device: "gpu", // defaults to 'cpu'
    nCtx: 2048, // the maximum context window size of the session
});

// initialize a chat session on the model. a model instance can have only one chat session at a time.
const chat = await model.createChatSession({
    // any completion options set here will be used as default for all completions in this chat session
    temperature: 0.8,
    // a custom systemPrompt can be set here. note that the template depends on the model.
    // if unset, the systemPrompt that comes with the model will be used.
    systemPrompt: "### System:\nYou are an advanced mathematician.\n\n",
});

// create a completion using a string as input
const res1 = await createCompletion(chat, "What is 1 + 1?");
console.debug(res1.choices[0].message);

// multiple messages can be input to the conversation at once.
// note that if the last message is not of role 'user', an empty message will be returned.
await createCompletion(chat, [
    {
        role: "user",
        content: "What is 2 + 2?",
    },
    {
        role: "assistant",
        content: "It's 5.",
    },
]);

const res3 = await createCompletion(chat, "Could you recalculate that?");
console.debug(res3.choices[0].message);

model.dispose();

Stateless usage

You can use the model without a chat session. This is useful for one-off completions.

import { createCompletion, loadModel } from "../src/gpt4all.js";

const model = await loadModel("orca-mini-3b-gguf2-q4_0.gguf");

// createCompletion methods can also be used on the model directly.
// context is not maintained between completions.
const res1 = await createCompletion(model, "What is 1 + 1?");
console.debug(res1.choices[0].message);

// a whole conversation can be input as well.
// note that if the last message is not of role 'user', an error will be thrown.
const res2 = await createCompletion(model, [
    {
        role: "user",
        content: "What is 2 + 2?",
    },
    {
        role: "assistant",
        content: "It's 5.",
    },
    {
        role: "user",
        content: "Could you recalculate that?",
    },
]);
console.debug(res2.choices[0].message);

model.dispose();
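
Because a conversation whose last message is not of role 'user' causes an error in stateless usage, it can help to check the message list before sending it. The following is a minimal sketch; assertEndsWithUserMessage is a hypothetical helper written for this README and is not part of the gpt4all package.

```javascript
// Hypothetical helper, not part of gpt4all: validates a conversation before
// it is passed to createCompletion(model, messages).
function assertEndsWithUserMessage(messages) {
    if (!Array.isArray(messages) || messages.length === 0) {
        throw new TypeError("messages must be a non-empty array");
    }
    const last = messages[messages.length - 1];
    if (last.role !== "user") {
        throw new Error(`last message has role '${last.role}', expected 'user'`);
    }
    // Return the same array so the call can be used inline.
    return messages;
}

// Usage sketch (model and createCompletion as in the example above):
// const res = await createCompletion(model, assertEndsWithUserMessage(messages));
```

Failing early in JavaScript gives a readable error message that names the offending role.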

Embedding

import { loadModel, createEmbedding } from "../src/gpt4all.js";

const embedder = await loadModel("nomic-embed-text-v1.5.f16.gguf", { verbose: true, type: "embedding" });

console.log(createEmbedding(embedder, "Maybe Minecraft was the friends we made along the way"));

embedder.dispose();
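
Embeddings are typically compared with cosine similarity. The sketch below shows one way to do that on two numeric vectors. The vectors here are made-up stand-ins; with real output you would pass in whichever field of the embedding result holds the vector. cosineSimilarity is a hypothetical helper, not something exported by gpt4all.

```javascript
// Hypothetical helper, not part of gpt4all: cosine similarity between two
// equal-length numeric vectors (plain arrays or Float32Array).
function cosineSimilarity(a, b) {
    if (a.length !== b.length) {
        throw new RangeError("vectors must have the same length");
    }
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    // A zero vector has no direction; treat its similarity as 0.
    if (normA === 0 || normB === 0) return 0;
    return dot / Math.sqrt(normA * normB);
}

// Made-up vectors standing in for real embeddings.
const sampleLeft = Float32Array.of(1, 0, 1);
const sampleRight = Float32Array.of(0, 1, 0);
console.log(cosineSimilarity(sampleLeft, sampleLeft)); // identical direction
console.log(cosineSimilarity(sampleLeft, sampleRight)); // orthogonal vectors
```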

Streaming responses

import { loadModel, createCompletionStream } from "../src/gpt4all.js";

const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf", {
    device: "gpu",
});

process.stdout.write("Output: ");
const stream = createCompletionStream(model, "How are you?");
stream.tokens.on("data", (data) => {
    process.stdout.write(data);
});
// Wait until the stream finishes. We cannot continue until it is done.
await stream.result;
process.stdout.write("\n");
model.dispose();
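
If you want the full text as well as the live output, the tokens can be accumulated while they stream. This is a minimal sketch that relies only on what the example above shows: a tokens object that emits "data" events and a result promise that settles when the stream is done. It assumes all "data" events fire before result resolves. collectStream is a hypothetical helper, not part of gpt4all.

```javascript
// Hypothetical helper, not part of gpt4all: gathers every streamed token
// into one string and resolves once the stream's result promise settles.
async function collectStream(stream) {
    let text = "";
    stream.tokens.on("data", (data) => {
        text += String(data); // tokens may arrive as strings or buffers
    });
    await stream.result;
    return text;
}

// Usage sketch (model and createCompletionStream as in the example above):
// const text = await collectStream(createCompletionStream(model, "How are you?"));
```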

Async Generators

import { loadModel, createCompletionGenerator } from "../src/gpt4all.js";

const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf");

process.stdout.write("Output: ");
const gen = createCompletionGenerator(
    model,
    "Redstone in Minecraft is Turing Complete. Let that sink in. (let it in!)"
);
for await (const chunk of gen) {
    process.stdout.write(chunk);
}

process.stdout.write("\n");
model.dispose();
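
Since the generator is consumed with for await, the loop can also be left early, for example to cap the output length. The sketch below uses a stand-in async generator in place of createCompletionGenerator so that it runs without a model. collectUpTo and fakeCompletion are hypothetical names. Whether breaking out of the loop also halts inference in the native backend is not something this sketch verifies.

```javascript
// Hypothetical helper, not part of gpt4all: collects chunks from any async
// iterable and stops once maxChars characters have been gathered.
async function collectUpTo(chunks, maxChars) {
    let text = "";
    for await (const chunk of chunks) {
        text += chunk;
        if (text.length >= maxChars) break; // leaving the loop closes the generator
    }
    return text.slice(0, maxChars);
}

// Stand-in for createCompletionGenerator(model, prompt).
async function* fakeCompletion() {
    yield "Red";
    yield "stone ";
    yield "is ";
    yield "fun.";
}

collectUpTo(fakeCompletion(), 9).then((text) => console.log(text));
```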

Offline usage

Download the model list before going offline:

curl -L https://gpt4all.io/models/models3.json -o ./models3.json

Then point loadModel at the local file:

import { createCompletion, loadModel } from 'gpt4all'

// make sure you downloaded the model itself before going offline!
const model = await loadModel('mistral-7b-openorca.gguf2.Q4_0.gguf', {
    verbose: true,
    device: 'gpu',
    modelConfigFile: "./models3.json"
});

await createCompletion(model, 'What is 1 + 1?', { verbose: true })

model.dispose();

Develop

Build Instructions

  • binding.gyp is the compile configuration.
  • Tested on Ubuntu. Everything seems to work fine.
  • Tested on Windows. Everything works fine.
  • Only sparsely tested on macOS.
  • The MinGW script can build the gpt4all-backend, and we left it there just in case. However, this package only works with MSVC-built DLLs.

Requirements

  • git
  • node.js >= 18.0.0
  • yarn
  • node-gyp
    • all of its requirements.
  • (unix) gcc version 12
  • (win) msvc version 143
    • Can be obtained with visual studio 2022 build tools
  • python 3
  • On Windows and Linux, building GPT4All requires the complete Vulkan SDK. You may download it from here: https://vulkan.lunarg.com/sdk/home
  • macOS users do not need Vulkan, as GPT4All will use Metal instead.
  • CUDA Toolkit >= 11.4 (you can bypass this by adding a custom flag to the build step)
    • Windows: compiling with CUDA is difficult if the Visual Studio IDE is not installed.
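
The Node.js requirement can be checked before starting a build. The following is a small sketch of such a check; meetsMinimumNode is a hypothetical helper and not a script shipped with this package.

```javascript
// Hypothetical pre-build check, not part of this package: verifies that the
// running Node.js version satisfies the ">= 18.0.0" requirement listed above.
function meetsMinimumNode(version, minimumMajor) {
    const major = Number.parseInt(version.split(".")[0], 10);
    return Number.isInteger(major) && major >= minimumMajor;
}

if (meetsMinimumNode(process.versions.node, 18)) {
    console.log(`Node.js ${process.versions.node} satisfies the requirement.`);
} else {
    console.warn(`Node.js ${process.versions.node} is too old; 18.0.0 or newer is required.`);
}
```

Comparing only the major version is enough here because the minimum is exactly 18.0.0.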

Build (from source)

git clone https://github.com/nomic-ai/gpt4all.git
cd gpt4all/gpt4all-bindings/typescript

The llama.cpp git submodule for gpt4all may be absent or outdated. Make sure to run

git submodule update --init --recursive

The shell commands below assume the current working directory is typescript.

Using yarn

yarn install
yarn build

Using npm

npm install
npm run build

The build:runtimes script will create runtime libraries for your platform in runtimes and build:prebuilds will create the bindings in prebuilds. build is a shortcut for both.

Test

yarn test

Source Overview

src/

  • Helper functions that improve the developer experience
  • Typings for the native Node.js addon
  • The JavaScript interface

test/

  • Simple unit tests for some of the exported functions.
  • More advanced AI testing is not covered.

spec/

  • Examples showing the typical look and feel of the API
  • Should work, assuming a model and the libraries are installed locally in the working directory

index.cc

  • The bridge between Node.js and C. This is where the bindings live.

prompt.cc

  • Handles prompting and inference of models in a threadsafe, asynchronous way.

Known Issues

  • Why your model may be spewing bull 💩
    • The downloaded model is broken (just reinstall it or download it from the official site).
  • Your model hangs after a call to generate tokens.
    • Is nPast set too high? This may cause your model to hang (observed 03/16/2024 on Linux Mint and Ubuntu 22.04).
  • Your GPU usage is still high after Node.js exits.
    • Make sure to call model.dispose()!
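
One way to make sure dispose() is never forgotten is to wrap model usage in try/finally. The sketch below assumes only that the model object has a dispose() method, as in the examples in this README. withModel is a hypothetical helper, not part of gpt4all.

```javascript
// Hypothetical helper, not part of gpt4all: loads a model, runs a task with
// it, and always disposes the model afterwards, even if the task throws.
async function withModel(load, task) {
    const model = await load();
    try {
        return await task(model);
    } finally {
        model.dispose();
    }
}

// Usage sketch with the calls shown earlier in this README:
// const res = await withModel(
//     () => loadModel("orca-mini-3b-gguf2-q4_0.gguf"),
//     (model) => createCompletion(model, "What is 1 + 1?")
// );
```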

Roadmap

This package has been stabilizing over the course of development, and breaking changes may still happen until the API stabilizes. Here is the todo list:

  • [x] [Purely offline](#Offline-usage). Per the GUI, which can be run completely offline, the bindings should be as well.
  • [ ] NPM bundle size reduction via optionalDependencies strategy (need help)
    • Should include prebuilds to avoid painful node-gyp errors
  • [x] createChatSession (the Python equivalent of create_chat_session)
  • [x] generateTokens, the new name for createTokenStream. As of 3.2.0, this is released but not 100% tested. Check spec/generator.mjs!
  • [x] ~~createTokenStream, an async iterator that streams each token emitted from the model. Planning on following this [example](https://github.com/nodejs/node-addon-examples/tree/main/threadsafe-async-iterator)~~ May not implement unless someone else can complete
  • [x] prompt models via a threadsafe function in order to have proper non-blocking behavior in Node.js
  • [x] generateTokens is the new name for this^
  • [x] proper unit testing (integrate with CircleCI)
  • [x] publish to npm under alpha tag `gpt4all@alpha`
  • [x] have more people test on other platforms (mac tester needed)
  • [x] switch to new pluggable backend

Changes

This repository serves as the new bindings for nodejs users.

  • If you were a user of the previous bindings, they are now outdated.
  • Version 4 includes the following breaking changes:
    • createEmbedding and EmbeddingModel.embed() return an object, EmbeddingResult, instead of a Float32Array.
    • Removed the deprecated types ModelType and ModelFile.
    • Removed the deprecated initialization of a model by string path only.

API Reference