AI/gpt4all

mirror of https://github.com/nomic-ai/gpt4all.git synced 2024-10-01 01:06:10 -04:00

gpt4all: an ecosystem of open-source chatbots trained on a massive collections of clean assistant data including code, stories and dialogue

Go to file

Jacob Nguyen 545c23b4bd typescript: fix final bugs and polishing, circle ci documentation (#960 ) * fix: esm and cjs compatibility Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> * Update prebuild.js Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> * fix gpt4all.js Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> * Fix compile for windows and linux again. PLEASE DON'T REVERT THISgit gui! * version bump * polish up spec and build scripts * lock file refresh * fix: proper resource closing and error handling * check make sure libPath not null * add msvc build script and update readme requirements * python workflows in circleci * dummy python change * no need for main * second hold for pypi deploy * let me deploy pls * bring back when condition * Typo, ignore list (#967) Fix typo in javadoc, Add word to ignore list for codespellrc --------- Co-authored-by: felix <felix@zaslavskiy.net> * llmodel: change tokenToString to not use string_view (#968) fixes a definite use-after-free and likely avoids some other potential ones - std::string will convert to a std::string_view automatically but as soon as the std::string in question goes out of scope it is already freed and the string_view is pointing at freed memory - this is mostly fine if its returning a reference to the tokenizer's internal vocab table but it's, imo, too easy to return a reference to a dynamically constructed string with this as replit is doing (and unfortunately needs to do to convert the internal whitespace replacement symbol back to a space) * Initial Library Loader for .NET Bindings / Update bindings to support newest changes (#763) * Initial Library Loader * Load library as part of Model factory * Dynamically search and find the dlls * Update tests to use locally built runtimes * Fix dylib loading, add macos runtime support for sample/tests * Bypass automatic loading by default. * Only set CMAKE_OSX_ARCHITECTURES if not already set, allow cross-compile * Switch Loading again * Update build scripts for mac/linux * Update bindings to support newest breaking changes * Fix build * Use llmodel for Windows * Actually, it does need to be libllmodel * Name * Remove TFMs, bypass loading by default * Fix script * Delete mac script --------- Co-authored-by: Tim Miller <innerlogic4321@ghmail.com> * bump llama.cpp mainline to latest (#964) * fix prompt context so it's preserved in class * update setup.py * metal replit (#931) metal+replit makes replit work with Metal and removes its use of `mem_per_token` in favor of fixed size scratch buffers (closer to llama.cpp) * update documentation scripts and generation to include readme.md * update readme and documentation for source * begin tests, import jest, fix listModels export * fix typo * chore: update spec * fix: finally, reduced potential of empty string * chore: add stub for createTokenSream * refactor: protecting resources properly * add basic jest tests * update * update readme * refactor: namespace the res variable * circleci integration to automatically build docs * add starter docs * typo * more circle ci typo * forgot to add nodejs circle ci orb * fix circle ci * feat: @iimez verify download and fix prebuild script * fix: oops, option name wrong * fix: gpt4all utils not emitting docs * chore: fix up scripts * fix: update docs and typings for md5 sum * fix: macos compilation * some refactoring * Update index.cc Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> * update readme and enable exceptions on mac * circle ci progress * basic embedding with sbert (not tested & cpp side only) * fix circle ci * fix circle ci * update circle ci script * bruh * fix again * fix * fixed required workflows * fix ci * fix pwd * fix pwd * update ci * revert * fix * prevent rebuild * revmove noop * Update continue_config.yml Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> * Update binding.gyp Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> * fix fs not found * remove cpp 20 standard * fix warnings, safer way to calculate arrsize * readd build backend * basic embeddings and yarn test" * fix circle ci Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> Update continue_config.yml Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> fix macos paths update readme and roadmap split up spec update readme check for url in modelsjson update docs and inline stuff update yarn configuration and readme update readme readd npm publish script add exceptions bruh one space broke the yaml codespell oops forgot to add runtimes folder bump version try code snippet https://support.circleci.com/hc/en-us/articles/8325075309339-How-to-install-NPM-on-Windows-images add fallback for unknown architectures attached to wrong workspace hopefuly fix moving everything under backend to persist should work now * update circle ci script * prevent rebuild * revmove noop * Update continue_config.yml Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> * Update binding.gyp Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> * fix fs not found * remove cpp 20 standard * fix warnings, safer way to calculate arrsize * readd build backend * basic embeddings and yarn test" * fix circle ci Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> Update continue_config.yml Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> fix macos paths update readme and roadmap split up spec update readme check for url in modelsjson update docs and inline stuff update yarn configuration and readme update readme readd npm publish script add exceptions bruh one space broke the yaml codespell oops forgot to add runtimes folder bump version try code snippet https://support.circleci.com/hc/en-us/articles/8325075309339-How-to-install-NPM-on-Windows-images add fallback for unknown architectures attached to wrong workspace hopefuly fix moving everything under backend to persist should work now * Update README.md Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> --------- Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> Co-authored-by: Adam Treat <treat.adam@gmail.com> Co-authored-by: Richard Guo <richardg7890@gmail.com> Co-authored-by: Felix Zaslavskiy <felix.zaslavskiy@gmail.com> Co-authored-by: felix <felix@zaslavskiy.net> Co-authored-by: Aaron Miller <apage43@ninjawhale.com> Co-authored-by: Tim Miller <drasticactions@users.noreply.github.com> Co-authored-by: Tim Miller <innerlogic4321@ghmail.com>		2023-07-25 11:46:40 -04:00
.circleci	typescript: fix final bugs and polishing, circle ci documentation (#960 )	2023-07-25 11:46:40 -04:00
.github	Paginate through all issues for close_issues workflow (#630 )	2023-05-18 14:30:47 -04:00
gpt4all-api	fix: don't pass around the same dict object (#1264 )	2023-07-24 15:28:12 -04:00
gpt4all-backend	Handle edge cases when generating embeddings (#1215 )	2023-07-17 13:21:03 -07:00
gpt4all-bindings	typescript: fix final bugs and polishing, circle ci documentation (#960 )	2023-07-25 11:46:40 -04:00
gpt4all-chat	Update default TopP to 0.4	2023-07-19 10:36:23 -04:00
gpt4all-docker	mono repo structure	2023-05-01 15:45:23 -04:00
gpt4all-training	fix: update train scripts and configs for other models (#1164 )	2023-07-12 15:18:24 -04:00
.codespellrc	Typo, ignore list (#967 )	2023-06-13 00:53:27 -07:00
.gitignore	fix: update train scripts and configs for other models (#1164 )	2023-07-12 15:18:24 -04:00
.gitmodules	move 230511 submodule to nomic fork, fix alibi assert	2023-06-30 21:07:21 -03:00
CONTRIBUTING.md	[DATALAD RUNCMD] run codespell throughout	2023-05-16 11:33:59 -04:00
gpt4all-lora-demo.gif	GIF	2023-03-28 15:54:44 -04:00
LICENSE.txt	Add MIT license.	2023-04-06 11:28:59 -04:00
monorepo_plan.md	Update monorepo_plan.md	2023-05-05 09:32:45 -04:00
README.md	Add AVX/AVX2 requirement to main README.md	2023-07-19 13:05:42 +02:00

README.md

GPT4All

Open-source assistant-style large language models that run locally on your CPU

GPT4All Website

GPT4All Documentation

Discord

🦜️🔗 Official Langchain Backend

GPT4All is made possible by our compute partner Paperspace.

Run on an M1 macOS Device (not sped up!)

GPT4All: An ecosystem of open-source on-edge large language models.

GPT4All is an ecosystem to train and deploy powerful and customized large language models that run locally on consumer grade CPUs. Note that your CPU needs to support AVX or AVX2 instructions.

Learn more in the documentation.

The goal is simple - be the best instruction tuned assistant-style language model that any person or enterprise can freely use, distribute and build on.

A GPT4All model is a 3GB - 8GB file that you can download and plug into the GPT4All open-source ecosystem software. Nomic AI supports and maintains this software ecosystem to enforce quality and security alongside spearheading the effort to allow any person or enterprise to easily train and deploy their own on-edge large language models.

Chat Client

Run any GPT4All model natively on your home desktop with the auto-updating desktop chat client. See GPT4All Website for a full list of open-source models you can run with this powerful desktop application.

Direct Installer Links:

Find the most up-to-date information on the GPT4All Website

Chat Client building and running

Follow the visual instructions on the chat client build_and_run page

Bindings

Contributing

GPT4All welcomes contributions, involvement, and discussion from the open source community! Please see CONTRIBUTING.md and follow the issues, bug reports, and PR markdown templates.

Check project discord, with project owners, or through existing issues/PRs to avoid duplicate work. Please make sure to tag all of the above with relevant project identifiers or your contribution could potentially get lost. Example tags: backend, bindings, python-bindings, documentation, etc.

Technical Reports

📗 Technical Report 3: GPT4All Snoozy and Groovy

📗 Technical Report 2: GPT4All-J

📗 Technical Report 1: GPT4All

Citation

If you utilize this repository, models or data in a downstream project, please consider citing it with:

@misc{gpt4all,
  author = {Yuvanesh Anand and Zach Nussbaum and Brandon Duderstadt and Benjamin Schmidt and Andriy Mulyar},
  title = {GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nomic-ai/gpt4all}},
}

README.md Unescape Escape