gpt4all

mirror of https://github.com/nomic-ai/gpt4all.git synced 2024-09-19 15:25:53 +00:00

Author	SHA1	Message	Date
Jared Van Bortel	7463b2170b	backend(build): set CUDA arch defaults before enable_language(CUDA) (#2855 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-08-13 14:47:48 -04:00
Jared Van Bortel	26113a17fb	don't use ranges::contains due to clang incompatibility (#2812 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-08-08 11:49:01 -04:00
Jared Van Bortel	de7cb36fcc	python: reduce size of wheels built by CI, other build tweaks (#2802 ) * Read CMAKE_CUDA_ARCHITECTURES directly * Disable CUBINs for python build in CI * Search for CUDA 11 as well as CUDA 12 Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-08-07 11:27:50 -04:00
Jared Van Bortel	be66ec8ab5	chat: faster KV shift, continue generating, fix stop sequences (#2781 ) * Don't stop generating at end of context * Use llama_kv_cache ops to shift context * Fix and improve reverse prompt detection * Replace prompt recalc callback with a flag to disallow context shift	2024-08-07 11:25:24 -04:00
Jared Van Bortel	290c629442	backend: rebase llama.cpp submodule on latest upstream (#2694 ) * Adds support for GPT-NeoX, Gemma 2, OpenELM, ChatGLM, and Jais architectures (all with Kompute support) * Also enables Kompute support for StarCoder2, XVERSE, Command R, and OLMo * Includes a number of Kompute resource management fixes Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-07-19 14:52:58 -04:00
AT	ca72428783	Remove support for GPT-J models. (#2676 ) Signed-off-by: Adam Treat <treat.adam@gmail.com> Signed-off-by: Jared Van Bortel <jared@nomic.ai> Co-authored-by: Jared Van Bortel <jared@nomic.ai>	2024-07-17 16:07:37 -04:00
Jared Van Bortel	da1823ed7a	cmake: fix CMAKE_CUDA_ARCHITECTURES default (#2421 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-06-26 14:48:18 -04:00
Jared Van Bortel	55d709862f	Revert "typescript bindings maintenance (#2363 )" As discussed on Discord, this PR was not ready to be merged. CI fails on it. This reverts commit `a602f7fde7`. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-06-03 17:26:19 -04:00
Andreas Obersteiner	a602f7fde7	typescript bindings maintenance (#2363 ) * remove outdated comments Signed-off-by: limez <limez@protonmail.com> * simpler build from source Signed-off-by: limez <limez@protonmail.com> * update unix build script to create .so runtimes correctly Signed-off-by: limez <limez@protonmail.com> * configure ci build type, use RelWithDebInfo for dev build script Signed-off-by: limez <limez@protonmail.com> * add clean script Signed-off-by: limez <limez@protonmail.com> * fix streamed token decoding / emoji Signed-off-by: limez <limez@protonmail.com> * remove deprecated nCtx Signed-off-by: limez <limez@protonmail.com> * update typings Signed-off-by: jacob <jacoobes@sern.dev> update typings Signed-off-by: jacob <jacoobes@sern.dev> * readme,mspell Signed-off-by: jacob <jacoobes@sern.dev> * cuda/backend logic changes + name napi methods like their js counterparts Signed-off-by: limez <limez@protonmail.com> * convert llmodel example into a test, separate test suite that can run in ci Signed-off-by: limez <limez@protonmail.com> * update examples / naming Signed-off-by: limez <limez@protonmail.com> * update deps, remove the need for binding.ci.gyp, make node-gyp-build fallback easier testable Signed-off-by: limez <limez@protonmail.com> * make sure the assert-backend-sources.js script is published, but not the others Signed-off-by: limez <limez@protonmail.com> * build correctly on windows (regression on node-gyp-build) Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> * codespell Signed-off-by: limez <limez@protonmail.com> * make sure dlhandle.cpp gets linked correctly Signed-off-by: limez <limez@protonmail.com> * add include for check_cxx_compiler_flag call during aarch64 builds Signed-off-by: limez <limez@protonmail.com> * x86 > arm64 cross compilation of runtimes and bindings Signed-off-by: limez <limez@protonmail.com> * default to cpu instead of kompute on arm64 Signed-off-by: limez <limez@protonmail.com> * formatting, more minimal example Signed-off-by: limez <limez@protonmail.com> --------- Signed-off-by: limez <limez@protonmail.com> Signed-off-by: jacob <jacoobes@sern.dev> Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> Co-authored-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com> Co-authored-by: jacob <jacoobes@sern.dev>	2024-06-03 11:12:55 -05:00
Jared Van Bortel	636307160e	backend: fix #includes with include-what-you-use (#2371 ) Also fix a PARENT_SCOPE warning when building the backend. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-31 16:34:54 -04:00
Jared Van Bortel	4e89a9c44f	backend: support non-ASCII characters in path to llmodel libs on Windows (#2388 ) * backend: refactor dlhandle.h into oscompat.{cpp,h} Signed-off-by: Jared Van Bortel <jared@nomic.ai> * llmodel: alias std::filesystem Signed-off-by: Jared Van Bortel <jared@nomic.ai> * llmodel: use wide strings for paths on Windows Using the native path representation allows us to manipulate paths and call LoadLibraryEx without mangling non-ASCII characters. Signed-off-by: Jared Van Bortel <jared@nomic.ai> * llmodel: prefer built-in std::filesystem functionality Signed-off-by: Jared Van Bortel <jared@nomic.ai> * oscompat: fix string type error Signed-off-by: Jared Van Bortel <jared@nomic.ai> * backend: rename oscompat back to dlhandle Signed-off-by: Jared Van Bortel <jared@nomic.ai> * dlhandle: fix #includes Signed-off-by: Jared Van Bortel <jared@nomic.ai> * dlhandle: remove another #include Signed-off-by: Jared Van Bortel <jared@nomic.ai> * dlhandle: move dlhandle #include Signed-off-by: Jared Van Bortel <jared@nomic.ai> * dlhandle: remove #includes that are covered by dlhandle.h Signed-off-by: Jared Van Bortel <jared@nomic.ai> * llmodel: fix #include order Signed-off-by: Jared Van Bortel <jared@nomic.ai> --------- Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-31 13:12:28 -04:00
Jared Van Bortel	d2a99d9bc6	support the llama.cpp CUDA backend (#2310 ) * rebase onto llama.cpp commit ggerganov/llama.cpp@d46dbc76f * support for CUDA backend (enabled by default) * partial support for Occam's Vulkan backend (disabled by default) * partial support for HIP/ROCm backend (disabled by default) * sync llama.cpp.cmake with upstream llama.cpp CMakeLists.txt * changes to GPT4All backend, bindings, and chat UI to handle choice of llama.cpp backend (Kompute or CUDA) * ship CUDA runtime with installed version * make device selection in the UI on macOS actually do something * model whitelist: remove dbrx, mamba, persimmon, plamo; add internlm and starcoder2 Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-05-15 15:27:50 -04:00
Jared Van Bortel	406e88b59a	implement local Nomic Embed via llama.cpp (#2086 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-03-13 18:09:24 -04:00
Jared Van Bortel	fc7e5f4a09	ci: fix missing Kompute support in python bindings (#1953 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-09 21:40:32 -05:00
Jared Van Bortel	3acbef14b7	fix AVX support by removing direct linking to AVX2 libs (#1750 )	2023-12-13 12:11:09 -05:00
cebtenzzre	4338e72a51	MPT: use upstream llama.cpp implementation (#1515 )	2023-10-19 15:25:17 -04:00
Adam Treat	a9acdd25de	Push a new version number for llmodel backend now that it is based on gguf.	2023-10-05 18:18:07 -04:00
Cebtenzzre	050e7f076e	backend: port GPT-J to GGUF	2023-10-05 18:16:19 -04:00
Cebtenzzre	ce7be1db48	backend: use llamamodel.cpp for Falcon	2023-10-05 18:16:19 -04:00
Cebtenzzre	6277eac9cc	backend: use llamamodel.cpp for StarCoder	2023-10-05 18:16:19 -04:00
Cebtenzzre	17fc9e3e58	backend: port Replit to GGUF	2023-10-05 18:16:19 -04:00
Cebtenzzre	7c67262a13	backend: port MPT to GGUF	2023-10-05 18:16:19 -04:00
Adam Treat	045f6e6cdc	Link against ggml in bin so we can get the available devices without loading a model.	2023-09-15 14:45:25 -04:00
Adam Treat	85e34598f9	more circleci	2023-08-31 15:29:54 -04:00
Adam Treat	17d3e4976c	Add a comment indicating future work.	2023-08-31 15:29:54 -04:00
Adam Treat	987546c63b	Nomic vulkan backend licensed under the Software for Open Models License (SOM), version 1.0.	2023-08-31 15:29:54 -04:00
Adam Treat	d55cbbee32	Update to newer llama.cpp and disable older forks.	2023-08-31 15:29:54 -04:00
Aaron Miller	0bc2274869	bump llama.cpp version + needed fixes for that	2023-08-31 15:29:54 -04:00
aaron miller	33c22be2aa	starcoder: use ggml_graph_plan	2023-08-31 15:29:54 -04:00
Adam Treat	6d03b3e500	Add starcoder support.	2023-07-27 09:15:16 -04:00
Adam Treat	4963db8f43	Bump the version numbers for both python and c backend.	2023-07-13 14:21:46 -04:00
Adam Treat	ae8eb297ac	Add sbert backend.	2023-07-13 14:21:46 -04:00
Aaron Miller	f0faa23ad5	cmakelists: always export build commands (#1179 ) friendly for using editors with clangd integration that don't also manage the build themselves	2023-07-12 10:49:24 -04:00
Aaron Miller	8d19ef3909	backend: factor out common elements in model code (#1089 ) * backend: factor out common structs in model code prepping to hack on these by hopefully making there be fewer places to fix the same bug rename * use common buffer wrapper instead of manual malloc * fix replit compile warnings	2023-06-28 17:35:07 -07:00
Aaron Miller	198b5e4832	add Falcon 7B model Tested with https://huggingface.co/TheBloke/falcon-7b-instruct-GGML/blob/main/falcon7b-instruct.ggmlv3.q4_0.bin	2023-06-27 14:06:39 -03:00
Aaron Miller	abc081e48d	fix llama.cpp k-quants (#988 ) * enable k-quants on all mainline builds	2023-06-15 14:06:14 -07:00
Aaron Miller	f71d8efc71	metal replit (#931 ) metal+replit makes replit work with Metal and removes its use of `mem_per_token` in favor of fixed size scratch buffers (closer to llama.cpp)	2023-06-13 07:29:14 -07:00
Tim Miller	797891c995	Initial Library Loader for .NET Bindings / Update bindings to support newest changes (#763 ) * Initial Library Loader * Load library as part of Model factory * Dynamically search and find the dlls * Update tests to use locally built runtimes * Fix dylib loading, add macos runtime support for sample/tests * Bypass automatic loading by default. * Only set CMAKE_OSX_ARCHITECTURES if not already set, allow cross-compile * Switch Loading again * Update build scripts for mac/linux * Update bindings to support newest breaking changes * Fix build * Use llmodel for Windows * Actually, it does need to be libllmodel * Name * Remove TFMs, bypass loading by default * Fix script * Delete mac script --------- Co-authored-by: Tim Miller <innerlogic4321@ghmail.com>	2023-06-13 14:05:34 +02:00
Aaron Miller	d3ba1295a7	Metal+LLama take two (#929 ) Support latest llama with Metal --------- Co-authored-by: Adam Treat <adam@nomic.ai> Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>	2023-06-09 16:48:46 -04:00
Adam Treat	b162b5c64e	Revert "llama on Metal (#885 )" This reverts commit `c55f81b860`.	2023-06-09 15:08:46 -04:00
Aaron Miller	c55f81b860	llama on Metal (#885 ) Support latest llama with Metal --------- Co-authored-by: Adam Treat <adam@nomic.ai> Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>	2023-06-09 14:58:12 -04:00
Adam Treat	7e304106cc	Fix for windows.	2023-06-07 12:58:51 -04:00
Richard Guo	c4706d0c14	Replit Model (#713 ) * porting over replit code model to gpt4all * replaced memory with kv_self struct * continuing debug * welp it built but lot of sus things * working model loading and somewhat working generate.. need to format response? * revert back to semi working version * finally got rid of weird formatting * figured out problem is with python bindings - this is good to go for testing * addressing PR feedback * output refactor * fixed prompt reponse collection * cleanup * addressing PR comments * building replit backend with new ggmlver code * chatllm replit and clean python files * cleanup * updated replit to match new llmodel api * match llmodel api and change size_t to Token * resolve PR comments * replit model commit comment	2023-06-06 17:09:00 -04:00
Adam Treat	c5de9634c9	Fix llama models on linux and windows.	2023-06-05 14:31:15 -04:00
AT	bbe195ee02	Backend prompt dedup (#822 ) * Deduplicated prompt() function code	2023-06-04 08:59:24 -04:00
Adam Treat	cec8831e12	Fix mac build again.	2023-06-02 10:51:09 -04:00
Adam Treat	70e3b7e907	Try and fix build on mac.	2023-06-02 10:47:12 -04:00
AT	48275d0dcc	Dlopen backend 5 (#779 ) Major change to the backend that allows for pluggable versions of llama.cpp/ggml. This was squashed merged from dlopen_backend_5 where the history is preserved.	2023-05-31 17:04:01 -04:00
Adam Treat	7f9f91ad94	Revert "New tokenizer implementation for MPT and GPT-J" This reverts commit `bbcee1ced5`.	2023-05-30 12:59:00 -04:00
Aaron Miller	bbcee1ced5	New tokenizer implementation for MPT and GPT-J Improves output quality by making these tokenizers more closely match the behavior of the huggingface `tokenizers` based BPE tokenizers these models were trained with. Featuring: * Fixed unicode handling (via ICU) * Fixed BPE token merge handling * Complete added vocabulary handling	2023-05-30 12:05:57 -04:00

53 Commits