mirror of
https://github.com/nomic-ai/gpt4all.git
synced 2024-10-01 01:06:10 -04:00
8d53614444
216 lines
8.9 KiB
C++
#include "index.h"

Napi::FunctionReference NodeModelWrapper::constructor;

Napi::Function NodeModelWrapper::GetClass(Napi::Env env) {
    Napi::Function self = DefineClass(env, "LLModel", {
        InstanceMethod("type", &NodeModelWrapper::getType),
        InstanceMethod("isModelLoaded", &NodeModelWrapper::IsModelLoaded),
        InstanceMethod("name", &NodeModelWrapper::getName),
        InstanceMethod("stateSize", &NodeModelWrapper::StateSize),
        InstanceMethod("raw_prompt", &NodeModelWrapper::Prompt),
        InstanceMethod("setThreadCount", &NodeModelWrapper::SetThreadCount),
        InstanceMethod("threadCount", &NodeModelWrapper::ThreadCount),
        InstanceMethod("getLibraryPath", &NodeModelWrapper::GetLibraryPath),
    });
    // Keep a static reference to the constructor
    constructor = Napi::Persistent(self);
    constructor.SuppressDestruct();
    return self;
}

Napi::Value NodeModelWrapper::getType(const Napi::CallbackInfo& info)
{
    if(type.empty()) {
        return info.Env().Undefined();
    }
    return Napi::String::New(info.Env(), type);
}

NodeModelWrapper::NodeModelWrapper(const Napi::CallbackInfo& info) : Napi::ObjectWrap<NodeModelWrapper>(info)
{
    auto env = info.Env();
    fs::path model_path;

    std::string full_weight_path;
    //todo
    std::string library_path = ".";
    std::string model_name;
    if(info[0].IsString()) {
        // Deprecated form: a single string holding the full path to the weights.
        model_path = info[0].As<Napi::String>().Utf8Value();
        full_weight_path = model_path.string();
        std::cerr << "DEPRECATION: constructor accepts object now. Check docs for more.\n";
    } else if(info[0].IsObject()) {
        auto config_object = info[0].As<Napi::Object>();
        model_name = config_object.Get("model_name").As<Napi::String>();
        model_path = config_object.Get("model_path").As<Napi::String>().Utf8Value();
        if(config_object.Has("model_type")) {
            type = config_object.Get("model_type").As<Napi::String>();
        }
        full_weight_path = (model_path / fs::path(model_name)).string();

        // library_path keeps its "." default unless the caller overrides it.
        if(config_object.Has("library_path")) {
            library_path = config_object.Get("library_path").As<Napi::String>();
        }
    } else {
        Napi::TypeError::New(env, "Expected a config object or a model path string").ThrowAsJavaScriptException();
        return;
    }
    llmodel_set_implementation_search_path(library_path.c_str());
    // Pass the address of a real llmodel_error. The original passed a null
    // pointer, so the error was never filled in and the check never fired.
    llmodel_error e = {};
    inference_ = std::make_shared<llmodel_model>(llmodel_model_create2(full_weight_path.c_str(), "auto", &e));
    if(e.code != 0) {
        Napi::Error::New(env, e.message).ThrowAsJavaScriptException();
        return;
    }
    if(GetInference() == nullptr) {
        std::cerr << "Tried searching libraries in \"" << library_path << "\"" << std::endl;
        std::cerr << "Tried searching for model weight in \"" << full_weight_path << "\"" << std::endl;
        Napi::Error::New(env, "Had an issue creating llmodel object, inference is null").ThrowAsJavaScriptException();
        return;
    }

    auto success = llmodel_loadModel(GetInference(), full_weight_path.c_str());
    if(!success) {
        Napi::Error::New(env, "Failed to load model at given path").ThrowAsJavaScriptException();
        return;
    }
    name = model_name.empty() ? model_path.filename().string() : model_name;
}

//NodeModelWrapper::~NodeModelWrapper() {
//GetInference().reset();
//}

Napi::Value NodeModelWrapper::IsModelLoaded(const Napi::CallbackInfo& info) {
    return Napi::Boolean::New(info.Env(), llmodel_isModelLoaded(GetInference()));
}

Napi::Value NodeModelWrapper::StateSize(const Napi::CallbackInfo& info) {
    // Implement the binding for the stateSize method
    return Napi::Number::New(info.Env(), static_cast<int64_t>(llmodel_get_state_size(GetInference())));
}

/**
 * Generate a response using the model.
 * JS arguments:
 *   info[0] the input prompt string.
 *   info[1] optional object overriding fields of the default prompt context
 *           ('logits' and 'tokens' are rejected).
 *   info[2] JS callback, wrapped in a thread-safe function for the worker thread.
 * Returns the promise of the deferred owned by the worker context.
 */
Napi::Value NodeModelWrapper::Prompt(const Napi::CallbackInfo& info) {
    auto env = info.Env();
    std::string question;
    if(info[0].IsString()) {
        question = info[0].As<Napi::String>().Utf8Value();
    } else {
        Napi::Error::New(env, "invalid string argument").ThrowAsJavaScriptException();
        return env.Undefined();
    }
    //defaults copied from python bindings
    llmodel_prompt_context promptContext = {
        .logits = nullptr,
        .tokens = nullptr,
        .n_past = 0,
        .n_ctx = 1024,
        .n_predict = 128,
        .top_k = 40,
        .top_p = 0.9f,
        .temp = 0.72f,
        .n_batch = 8,
        .repeat_penalty = 1.0f,
        .repeat_last_n = 10,
        .context_erase = 0.5f
    };
    if(info[1].IsObject())
    {
        auto inputObject = info[1].As<Napi::Object>();

        // 'logits' and 'tokens' are managed natively and cannot be supplied from JS
        if (inputObject.Has("logits") || inputObject.Has("tokens")) {
            Napi::Error::New(env, "Invalid input: 'logits' or 'tokens' properties are not allowed").ThrowAsJavaScriptException();
            return env.Undefined();
        }
        // Assign the remaining properties
        if(inputObject.Has("n_past"))
            promptContext.n_past = inputObject.Get("n_past").As<Napi::Number>().Int32Value();
        if(inputObject.Has("n_ctx"))
            promptContext.n_ctx = inputObject.Get("n_ctx").As<Napi::Number>().Int32Value();
        if(inputObject.Has("n_predict"))
            promptContext.n_predict = inputObject.Get("n_predict").As<Napi::Number>().Int32Value();
        if(inputObject.Has("top_k"))
            promptContext.top_k = inputObject.Get("top_k").As<Napi::Number>().Int32Value();
        if(inputObject.Has("top_p"))
            promptContext.top_p = inputObject.Get("top_p").As<Napi::Number>().FloatValue();
        if(inputObject.Has("temp"))
            promptContext.temp = inputObject.Get("temp").As<Napi::Number>().FloatValue();
        if(inputObject.Has("n_batch"))
            promptContext.n_batch = inputObject.Get("n_batch").As<Napi::Number>().Int32Value();
        if(inputObject.Has("repeat_penalty"))
            promptContext.repeat_penalty = inputObject.Get("repeat_penalty").As<Napi::Number>().FloatValue();
        if(inputObject.Has("repeat_last_n"))
            promptContext.repeat_last_n = inputObject.Get("repeat_last_n").As<Napi::Number>().Int32Value();
        if(inputObject.Has("context_erase"))
            promptContext.context_erase = inputObject.Get("context_erase").As<Napi::Number>().FloatValue();
    }
    // The callback is handed to a ThreadSafeFunction below, so it must be a function
    if(!info[2].IsFunction()) {
        Napi::TypeError::New(env, "invalid callback argument").ThrowAsJavaScriptException();
        return env.Undefined();
    }
    //copy to protect llmodel resources when splitting to new thread

    llmodel_prompt_context copiedPrompt = promptContext;
    std::string copiedQuestion = question;
    PromptWorkContext pc = {
        copiedQuestion,
        inference_.load(),
        copiedPrompt,
    };
    auto threadSafeContext = new TsfnContext(env, pc);
    threadSafeContext->tsfn = Napi::ThreadSafeFunction::New(
        env,                          // Environment
        info[2].As<Napi::Function>(), // JS function from caller
        "PromptCallback",             // Resource name
        0,                            // Max queue size (0 = unlimited).
        1,                            // Initial thread count
        threadSafeContext,            // Context,
        FinalizerCallback,            // Finalizer
        (void*)nullptr                // Finalizer data
    );
    threadSafeContext->nativeThread = std::thread(threadEntry, threadSafeContext);
    return threadSafeContext->deferred_.Promise();
}

void NodeModelWrapper::SetThreadCount(const Napi::CallbackInfo& info) {
    if(info[0].IsNumber()) {
        llmodel_setThreadCount(GetInference(), info[0].As<Napi::Number>().Int64Value());
    } else {
        Napi::Error::New(info.Env(), "Could not set thread count: argument 1 is NaN").ThrowAsJavaScriptException();
        return;
    }
}

Napi::Value NodeModelWrapper::getName(const Napi::CallbackInfo& info) {
    return Napi::String::New(info.Env(), name);
}

Napi::Value NodeModelWrapper::ThreadCount(const Napi::CallbackInfo& info) {
    return Napi::Number::New(info.Env(), llmodel_threadCount(GetInference()));
}

Napi::Value NodeModelWrapper::GetLibraryPath(const Napi::CallbackInfo& info) {
    return Napi::String::New(info.Env(),
        llmodel_get_implementation_search_path());
}

llmodel_model NodeModelWrapper::GetInference() {
    return *inference_.load();
}

//Exports Bindings
Napi::Object Init(Napi::Env env, Napi::Object exports) {
    exports["LLModel"] = NodeModelWrapper::GetClass(env);
    return exports;
}

NODE_API_MODULE(NODE_GYP_MODULE_NAME, Init)