AI/gpt4all

mirror of https://github.com/nomic-ai/gpt4all.git synced 2024-10-01 01:06:10 -04:00

GPT4All Node.js API

yarn add gpt4all@alpha

npm install gpt4all@alpha

pnpm install gpt4all@alpha

The original GPT4All TypeScript bindings are now out of date.

New bindings created by jacoobes, limez and the Nomic AI community, for all to use.
The Node.js API has made strides to mirror the Python API. It is not 100% mirrored, but many pieces of the API resemble their Python counterparts.
Everything should work out of the box.
See API Reference

Chat Completion

import { createCompletion, loadModel } from '../src/gpt4all.js'

const model = await loadModel('ggml-vicuna-7b-1.1-q4_2', { verbose: true });

const response = await createCompletion(model, [
    { role : 'system', content: 'You are meant to be annoying and unhelpful.'  },
    { role : 'user', content: 'What is 1 + 1?'  } 
]);
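
To read the reply, the CompletionReturn, CompletionChoice and PromptMessage entries in the API Reference suggest that the generated text sits at choices[n].message.content. The following is a minimal sketch under that assumption. `firstReply` is a hypothetical helper, not part of the package, and `sample` only imitates the documented shape with made-up values:

```javascript
// Hypothetical helper: pull the first reply's text out of a CompletionReturn-shaped object.
// Assumes the shape { choices: [{ message: { role, content } }] } listed in the API Reference.
function firstReply(completion) {
    const choice = completion?.choices?.[0];
    if (!choice || typeof choice.message?.content !== 'string') {
        throw new Error('completion has no choice with a message');
    }
    return choice.message.content;
}

// Stand-in for what createCompletion might resolve to (values are made up).
const sample = {
    model: 'ggml-vicuna-7b-1.1-q4_2',
    choices: [{ message: { role: 'assistant', content: '1 + 1 = 2' } }],
};

console.log(firstReply(sample));
```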

Embedding

import { createEmbedding, loadModel } from '../src/gpt4all.js'

const model = await loadModel('ggml-all-MiniLM-L6-v2-f16', { verbose: true });

const fltArray = createEmbedding(model, "Pain is inevitable, suffering optional");
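
Embeddings are usually compared rather than inspected directly. One common measure is cosine similarity, sketched below. `cosineSimilarity` is a hypothetical helper and not something the package exports, and the toy vectors merely stand in for Float32Array values returned by createEmbedding:

```javascript
// Cosine similarity between two equal-length numeric vectors (e.g. Float32Array embeddings).
function cosineSimilarity(a, b) {
    if (a.length !== b.length) throw new Error('vectors must have the same length');
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    if (normA === 0 || normB === 0) throw new Error('cannot compare a zero vector');
    return dot / Math.sqrt(normA * normB);
}

// Toy vectors standing in for real embeddings.
const same1 = new Float32Array([1, 0, 1]);
const same2 = new Float32Array([1, 0, 1]);
const orthogonal = new Float32Array([0, 1, 0]);

console.log(cosineSimilarity(same1, same2), cosineSimilarity(same1, orthogonal));
```

Values close to 1 indicate vectors pointing in the same direction; values close to 0 indicate unrelated directions.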

Build Instructions

binding.gyp is the compile config.
Tested on Ubuntu. Everything seems to work fine.
Tested on Windows. Everything works fine.
Sparse testing on macOS.
MinGW can also build the gpt4all-backend. HOWEVER, this package works only with MSVC-built DLLs.

Requirements

git
node.js >= 18.0.0
yarn
node-gyp
- all of its requirements.
(unix) gcc version 12
(win) msvc version 143
- Can be obtained with visual studio 2022 build tools
python 3
On Windows and Linux, building GPT4All requires the complete Vulkan SDK. You may download it from here: https://vulkan.lunarg.com/sdk/home
macOS users do not need Vulkan, as GPT4All will use Metal instead.

Build (from source)

git clone https://github.com/nomic-ai/gpt4all.git
cd gpt4all-bindings/typescript

The below shell commands assume the current working directory is typescript.
To Build and Rebuild:

yarn

The llama.cpp git submodule for gpt4all may be absent. If so, run the following in the parent directory of llama.cpp:

git submodule update --init --depth 1 --recursive

As of the new backend, build the backend with:

yarn build:backend

This builds platform-dependent dynamic libraries, located in runtimes/(platform)/native. Currently the only way to use them is to put them in the current working directory of your application, that is, WHEREVER YOU RUN YOUR NODE APPLICATION.

llama-xxxx.dll is required.
Depending on the model you are using, you'll need to select the proper model loader.
- For example, if you are running a Mosaic MPT model, you will need to select the mpt-(buildvariant).(dynamiclibrary)

Test

yarn test

Source Overview

src/

Extra functions to help aid devex
Typings for the native node addon
the javascript interface

test/

Simple unit tests for some of the exported functions.
More advanced AI testing is not handled.

spec/

Average look and feel of the api
Should work assuming a model and libraries are installed locally in working directory

index.cc

The bridge between Node.js and C. This is where the bindings live.

prompt.cc

Handling prompting and inference of models in a threadsafe, asynchronous way.

Known Issues

why your model may be spewing bull 💩
- The downloaded model is broken (just reinstall or download from official site)
- That's it so far

Roadmap

This package is in active development, and breaking changes may happen until the API stabilizes. Here is the todo list:

- [x] prompt models via a threadsafe function in order to have proper non blocking behavior in nodejs
- [ ] ~~createTokenStream, an async iterator that streams each token emitted from the model. Planning on following this [example](https://github.com/nodejs/node-addon-examples/tree/main/threadsafe-async-iterator)~~ May not implement unless someone else can complete
- [x] proper unit testing (integrate with circle ci)
- [x] publish to npm under alpha tag `gpt4all@alpha`
- [x] have more people test on other platforms (mac tester needed)
- [x] switch to new pluggable backend
- [ ] NPM bundle size reduction via optionalDependencies strategy (need help)
  - Should include prebuilds to avoid painful node-gyp errors
- [ ] createChatSession (the python equivalent to create_chat_session)

API Reference

ModelType
ModelFile
- gptj
- llama
- mpt
- replit
type
LLModel
- constructor
  - Parameters
- type
- name
- stateSize
- threadCount
- setThreadCount
  - Parameters
- raw_prompt
  - Parameters
- embed
  - Parameters
- isModelLoaded
- setLibraryPath
  - Parameters
- getLibraryPath
loadModel
- Parameters
createCompletion
- Parameters
createEmbedding
- Parameters
CompletionOptions
- verbose
- systemPromptTemplate
- promptTemplate
- promptHeader
- promptFooter
PromptMessage
- role
- content
prompt_tokens
completion_tokens
total_tokens
CompletionReturn
- model
- usage
- choices
CompletionChoice
- message
LLModelPromptContext
- logitsSize
- tokensSize
- nPast
- nCtx
- nPredict
- topK
- topP
- temp
- nBatch
- repeatPenalty
- repeatLastN
- contextErase
createTokenStream
- Parameters
DEFAULT_DIRECTORY
DEFAULT_LIBRARIES_DIRECTORY
DEFAULT_MODEL_CONFIG
DEFAULT_PROMT_CONTEXT
DEFAULT_MODEL_LIST_URL
downloadModel
- Parameters
- Examples
DownloadModelOptions
- modelPath
- verbose
- url
- md5sum
DownloadController
- cancel
- promise

ModelType

Type of the model

Type: ("gptj" | "llama" | "mpt" | "replit")

ModelFile

Full list of models available.

@deprecated These model names are outdated and this type will not be maintained. Please use a string literal instead.

gptj

List of GPT-J Models

Type: ("ggml-gpt4all-j-v1.3-groovy.bin" | "ggml-gpt4all-j-v1.2-jazzy.bin" | "ggml-gpt4all-j-v1.1-breezy.bin" | "ggml-gpt4all-j.bin")

llama

List of Llama Models

mpt

List of MPT Models

Type: ("ggml-mpt-7b-base.bin" | "ggml-mpt-7b-chat.bin" | "ggml-mpt-7b-instruct.bin")

replit

List of Replit Models

Type: "ggml-replit-code-v1-3b.bin"

type

Model architecture. This argument currently does not have any functionality and is just used as descriptive identifier for user.

Type: ModelType

LLModel

LLModel class representing a language model. This is a base class that provides common functionality for different types of language models.

constructor

Initialize a new LLModel.

Parameters

path string Absolute path to the model file.

Throws Error If the model file does not exist.

type

One of the ModelType values ('gptj', 'llama', 'mpt', 'replit'), or undefined.

Returns (ModelType | undefined)

name

The name of the model.

Returns string

stateSize

Get the size of the internal state of the model. NOTE: This state data is specific to the type of model you have created.

Returns number the size in bytes of the internal state of the model

threadCount

Get the number of threads used for model inference. The default is the number of physical cores your computer has.

Returns number The number of threads used for model inference.

setThreadCount

Set the number of threads used for model inference.

Parameters

newNumber number The new number of threads.

Returns void

raw_prompt

Prompt the model with a given input and optional parameters. This is the raw output from the model. Use the exported prompt function to obtain a value.

Parameters

q string The prompt input.
params Partial<LLModelPromptContext> Optional parameters for the prompt context.
callback function (res: string): void

Returns void The result of the model prompt is passed to callback.
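
Because raw_prompt reports its result through a callback, it can be convenient to wrap it in a Promise. The sketch below is hypothetical: `promptAsPromise` is not part of the package, it assumes the callback fires once with the complete response text, and `stubModel` is a stand-in that only imitates the documented raw_prompt(q, params, callback) signature.

```javascript
// Hypothetical wrapper: turn the callback-style raw_prompt into a Promise.
// Assumes the callback is invoked once with the complete response text.
function promptAsPromise(model, q, params = {}) {
    return new Promise((resolve) => {
        // A synchronous throw from raw_prompt rejects the promise automatically.
        model.raw_prompt(q, params, (res) => resolve(res));
    });
}

// Stub with the documented raw_prompt(q, params, callback) signature; not a real LLModel.
const stubModel = {
    raw_prompt(q, params, callback) {
        setTimeout(() => callback(`echo: ${q}`), 0);
    },
};

promptAsPromise(stubModel, 'What is 1 + 1?', { temp: 0.7 }).then((text) => console.log(text));
```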

embed

Embed text with the model. Keep in mind that not all models can embed text (only bert can embed as of 07/16/2023, mm/dd/yyyy). Use the exported prompt function to obtain a value.

Parameters

text string
q The prompt input.
params Optional parameters for the prompt context.

Returns Float32Array The result of the model prompt.

isModelLoaded

Whether the model is loaded or not.

Returns boolean

setLibraryPath

Where to search for the pluggable backend libraries

Parameters

s string

Returns void

getLibraryPath

Where to get the pluggable backend libraries

Returns string

loadModel

Loads a machine learning model with the specified name. This is the de facto way to create a model. By default, this will download the model from the official GPT4All website if it is not present at the given path.

Parameters

modelName string The name of the model to load.
options (LoadModelOptions | undefined)? (Optional) Additional options for loading the model.

Returns Promise<(InferenceModel | EmbeddingModel)> A promise that resolves to an instance of the loaded LLModel.

createCompletion

The nodejs equivalent to python binding's chat_completion

Parameters

model InferenceModel The language model object.
messages Array<PromptMessage> The array of messages for the conversation.
options CompletionOptions The options for creating the completion.

Returns CompletionReturn The completion result.

createEmbedding

The nodejs moral equivalent to python binding's Embed4All().embed()

Parameters

model EmbeddingModel The language model object.
text string text to embed

Returns Float32Array The embedding of the text.

CompletionOptions

Extends Partial<LLModelPromptContext>

The options for creating the completion.

verbose

Indicates if verbose logging is enabled.

Type: boolean

systemPromptTemplate

Template for the system message. Will be put before the conversation with %1 being replaced by all system messages. Note that if this is not defined, system messages will not be included in the prompt.

Type: string

promptTemplate

Template for user messages, with %1 being replaced by the message.

Type: string

promptHeader

The initial instruction for the model, on top of the prompt

Type: string

promptFooter

The last instruction for the model, appended to the end of the prompt.

Type: string
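
To make the template options concrete, here is a rough sketch of how %1 substitution and the header/footer fields could fit together. This is an illustration only, not the package's actual prompt builder: `fillTemplate` and `buildPrompt` are hypothetical names, the ordering of the parts and the newline separator are assumptions, and the example templates are made up.

```javascript
// Hypothetical illustration of %1 substitution; not the package's implementation.
function fillTemplate(template, text) {
    // split/join replaces every %1 and treats text literally (no "$" patterns).
    return template.split('%1').join(text);
}

// Assemble a prompt from messages, loosely following the option descriptions above.
function buildPrompt(messages, options = {}) {
    const { systemPromptTemplate, promptTemplate = '%1', promptHeader = '', promptFooter = '' } = options;
    const systemText = messages
        .filter((m) => m.role === 'system')
        .map((m) => m.content)
        .join('\n');
    const parts = [];
    if (promptHeader) parts.push(promptHeader);
    // System messages are only included when a system template is defined.
    if (systemPromptTemplate !== undefined) parts.push(fillTemplate(systemPromptTemplate, systemText));
    for (const m of messages) {
        if (m.role === 'user') parts.push(fillTemplate(promptTemplate, m.content));
    }
    if (promptFooter) parts.push(promptFooter);
    return parts.join('\n');
}

const demoMessages = [
    { role: 'system', content: 'Be brief.' },
    { role: 'user', content: 'What is 1 + 1?' },
];

console.log(buildPrompt(demoMessages, { systemPromptTemplate: 'System: %1', promptTemplate: 'User: %1' }));
```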

PromptMessage

A message in the conversation, identical to OpenAI's chat message.

role

The role of the message.

Type: ("system" | "assistant" | "user")

content

The message content.

Type: string
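
Since messages are plain objects, it can help to validate them before passing them to createCompletion. The check below is a hypothetical helper, not part of the package, and it only encodes the role and content types documented above:

```javascript
// Roles documented for PromptMessage.
const ROLES = ['system', 'assistant', 'user'];

// Hypothetical check: does a value look like a PromptMessage?
function isPromptMessage(value) {
    return (
        typeof value === 'object' &&
        value !== null &&
        ROLES.includes(value.role) &&
        typeof value.content === 'string'
    );
}

console.log(isPromptMessage({ role: 'user', content: 'What is 1 + 1?' }));
console.log(isPromptMessage({ role: 'robot', content: 'beep' }));
```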

prompt_tokens

The number of tokens used in the prompt.
