GPT4All Node.js API
Native Node.js LLM bindings for all.
```sh
yarn add gpt4all@latest

npm install gpt4all@latest

pnpm install gpt4all@latest
```
Breaking changes in version 4!!
- See the Transition notes in the Changes section below
Contents
- See API Reference
- See Examples
- See Developing
- GPT4All Node.js bindings created by jacoobes, limez, and the Nomic AI community, for all to use.
- spare change for a college student? 🤑
API Examples
Chat Completion
Use a chat session to keep context between completions. This is useful for efficient back-and-forth conversations.
```js
import { createCompletion, loadModel } from "../src/gpt4all.js";

const model = await loadModel("orca-mini-3b-gguf2-q4_0.gguf", {
    verbose: true, // logs the loaded model configuration
    device: "gpu", // defaults to 'cpu'
    nCtx: 2048, // the maximum session context window size
});

// initialize a chat session on the model. a model instance can have only one chat session at a time.
const chat = await model.createChatSession({
    // any completion options set here will be used as defaults for all completions in this chat session
    temperature: 0.8,
    // a custom systemPrompt can be set here. note that the template depends on the model.
    // if unset, the systemPrompt that comes with the model will be used.
    systemPrompt: "### System:\nYou are an advanced mathematician.\n\n",
});

// create a completion using a string as input
const res1 = await createCompletion(chat, "What is 1 + 1?");
console.debug(res1.choices[0].message);

// multiple messages can be input to the conversation at once.
// note that if the last message is not of role 'user', an empty message will be returned.
await createCompletion(chat, [
    {
        role: "user",
        content: "What is 2 + 2?",
    },
    {
        role: "assistant",
        content: "It's 5.",
    },
]);

const res3 = await createCompletion(chat, "Could you recalculate that?");
console.debug(res3.choices[0].message);

model.dispose();
```
Stateless usage
You can use the model without a chat session. This is useful for one-off completions.
```js
import { createCompletion, loadModel } from "../src/gpt4all.js";

const model = await loadModel("orca-mini-3b-gguf2-q4_0.gguf");

// createCompletion methods can also be used on the model directly.
// context is not maintained between completions.
const res1 = await createCompletion(model, "What is 1 + 1?");
console.debug(res1.choices[0].message);

// a whole conversation can be input as well.
// note that if the last message is not of role 'user', an error will be thrown.
const res2 = await createCompletion(model, [
    {
        role: "user",
        content: "What is 2 + 2?",
    },
    {
        role: "assistant",
        content: "It's 5.",
    },
    {
        role: "user",
        content: "Could you recalculate that?",
    },
]);
console.debug(res2.choices[0].message);
```
Embedding
```js
import { loadModel, createEmbedding } from "../src/gpt4all.js";

const embedder = await loadModel("nomic-embed-text-v1.5.f16.gguf", {
    verbose: true,
    type: "embedding",
});

console.log(createEmbedding(embedder, "Maybe Minecraft was the friends we made along the way"));
```
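Since version 4, `createEmbedding` returns an `EmbeddingResult` object rather than a raw Float32Array (see the Changes section below). A minimal sketch of reading it; the field names here are assumptions based on the typings, so verify them against the package's type definitions:

```js
const result = createEmbedding(embedder, "Embed me, please.");
console.log(result.embeddings); // the embedding vector (assumed field name)
console.log(result.n_prompt_tokens); // tokens consumed (assumed field name)

embedder.dispose();
```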
Streaming responses
```js
import { loadModel, createCompletionStream } from "../src/gpt4all.js";

const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf", {
    device: "gpu",
});

process.stdout.write("Output: ");

const stream = createCompletionStream(model, "How are you?");
stream.tokens.on("data", (data) => {
    process.stdout.write(data);
});

// wait till the stream finishes. we cannot continue until this one is done.
await stream.result;
process.stdout.write("\n");

model.dispose();
```
Async Generators
```js
import { loadModel, createCompletionGenerator } from "../src/gpt4all.js";

const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf");

process.stdout.write("Output: ");

const gen = createCompletionGenerator(
    model,
    "Redstone in Minecraft is Turing Complete. Let that sink in. (let it in!)"
);
for await (const chunk of gen) {
    process.stdout.write(chunk);
}

process.stdout.write("\n");
model.dispose();
```
Offline usage
Download the model list before going offline:

```sh
curl -L https://gpt4all.io/models/models3.json -o ./models3.json
```

```js
import { createCompletion, loadModel } from "gpt4all";

// make sure you downloaded the model before going offline!
const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf", {
    verbose: true,
    device: "gpu",
    modelConfigFile: "./models3.json",
});

await createCompletion(model, "What is 1 + 1?", { verbose: true });

model.dispose();
```
Develop
Build Instructions
`binding.gyp` is the compile config.

- Tested on Ubuntu. Everything seems to work fine.
- Tested on Windows. Everything works fine.
- Sparse testing on macOS.
- The MinGW script works to build the gpt4all-backend. We left it there just in case. However, this package works only with MSVC-built DLLs.
Requirements
- git
- node.js >= 18.0.0
- yarn
- node-gyp
  - all of its requirements
- (unix) gcc version 12
- (win) msvc version 143
  - can be obtained with Visual Studio 2022 build tools
- python 3
- On Windows and Linux, building GPT4All requires the complete Vulkan SDK. You may download it from here: https://vulkan.lunarg.com/sdk/home
  - macOS users do not need Vulkan, as GPT4All will use Metal instead.
- CUDA Toolkit >= 11.4 (you can bypass this by adding a custom flag to the build step)
  - Windows: compiling with CUDA is difficult if the Visual Studio IDE is NOT present.
Build (from source)
```sh
git clone https://github.com/nomic-ai/gpt4all.git
cd gpt4all/gpt4all-bindings/typescript
```

The llama.cpp git submodule for gpt4all may be absent or outdated. Make sure to run:

```sh
git submodule update --init --recursive
```

The shell commands below assume the current working directory is `typescript`.
Using yarn

```sh
yarn install
yarn build
```

Using npm

```sh
npm install
npm run build
```
The `build:runtimes` script will create runtime libraries for your platform in `runtimes`, and `build:prebuilds` will create the bindings in `prebuilds`. `build` is a shortcut for both; the steps can also be run individually, as shown below.
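For example, with yarn (a sketch assuming the scripts named above are defined in package.json; npm users can invoke them via `npm run` equivalently):

```sh
# build only the native runtime libraries (output in runtimes/)
yarn build:runtimes

# build only the node addon bindings (output in prebuilds/)
yarn build:prebuilds

# or do both in one step
yarn build
```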
Test
```sh
yarn test
```
Source Overview
`src/`
- Extra functions to aid developer experience
- Typings for the native node addon
- The JavaScript interface

`test/`
- Simple unit tests for some of the exported functions
- More advanced AI testing is not handled here

`spec/`
- The average look and feel of the API
- Should work assuming a model and the libraries are installed locally in the working directory

`index.cc`
- The bridge between Node.js and C++, where the bindings live.

`prompt.cc`
- Handles prompting and inference of models in a threadsafe, asynchronous way.
Known Issues
- Why your model may be spewing bull 💩
  - The downloaded model is broken (just reinstall or download it from the official site)
- Your model is hanging after a call to generate tokens.
  - Is `nPast` set too high? This may cause your model to hang (observed 03/16/2024 on Linux Mint and Ubuntu 22.04)
- Your GPU usage is still high after node.js exits.
  - Make sure to call `model.dispose()`!!! (see the sketch after this list)
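A defensive pattern for the dispose issue above, as a minimal sketch (the model name and prompt are placeholders; `loadModel`, `createCompletion`, and `model.dispose()` are used exactly as in the examples above):

```js
import { createCompletion, loadModel } from "gpt4all";

const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf");
try {
    await createCompletion(model, "What is 1 + 1?");
} finally {
    // always release native/GPU resources, even if the completion throws
    model.dispose();
}
```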
Roadmap
This package has been stabilizing over the course of development, and breaking changes may happen until the API stabilizes. Here's the todo list:

- [x] [Purely offline](#Offline-usage). Per the GUI, which can be run completely offline, the bindings should be as well.
- [ ] NPM bundle size reduction via optionalDependencies strategy (need help)
  - Should include prebuilds to avoid painful node-gyp errors
- [x] createChatSession (the python equivalent to create_chat_session)
- [x] generateTokens, the new name for createTokenStream. As of 3.2.0, this is released but not 100% tested. Check spec/generator.mjs!
- [x] ~~createTokenStream, an async iterator that streams each token emitted from the model. Planning on following this [example](https://github.com/nodejs/node-addon-examples/tree/main/threadsafe-async-iterator)~~ May not implement unless someone else can complete
- [x] prompt models via a threadsafe function in order to have proper non-blocking behavior in node.js
  - [x] generateTokens is the new name for this^
- [x] proper unit testing (integrate with circle ci)
- [x] publish to npm under alpha tag `gpt4all@alpha`
- [x] have more people test on other platforms (mac tester needed)
- [x] switch to new pluggable backend
Changes
This repository serves as the new bindings for Node.js users.

- If you were a user of the old bindings, they are outdated; switch to this package.
- Version 4 includes the following breaking changes:
  - `createEmbedding` & `EmbeddingModel.embed()` return an object, `EmbeddingResult`, instead of a Float32Array.
  - Removed deprecated types `ModelType` and `ModelFile`.
  - Removed deprecated initiation of a model by string path only.
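A hedged before/after sketch for the embedding change (the `embeddings` field name is an assumption based on the typings; verify against the package's type definitions):

```js
import { loadModel, createEmbedding } from "gpt4all";

const embedder = await loadModel("nomic-embed-text-v1.5.f16.gguf", { type: "embedding" });

// v3 (old): createEmbedding returned a Float32Array directly
// const vector = createEmbedding(embedder, "some text");

// v4 (new): it returns an EmbeddingResult object instead
const result = createEmbedding(embedder, "some text");
const vector = result.embeddings; // assumed field name; check the typings

embedder.dispose();
```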