Commit Graph

196 Commits

Author SHA1 Message Date
Jared Van Bortel
7e9786fccf chat: set search path early
This fixes the issues with installed versions of v2.6.0.
2024-01-11 12:04:18 -05:00
AT
96cee4f9ac Explicitly clear the kv cache each time we eval tokens to match n_past. (#1808) 2024-01-03 14:06:08 -05:00
ThiloteE
2d566710e5 Address review 2024-01-03 11:13:07 -06:00
ThiloteE
a0f7d7ae0e Fix for "LLModel ERROR: Could not find CPU LLaMA implementation" v2 2024-01-03 11:13:07 -06:00
ThiloteE
38d81c14d0 Fixes https://github.com/nomic-ai/gpt4all/issues/1760 LLModel ERROR: Could not find CPU LLaMA implementation.
Inspired by Microsoft docs for LoadLibraryExA (https://learn.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibraryexa).
When using LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR, the lpFileName parameter must specify a fully qualified path, and it must use backslashes (\), not forward slashes (/).
2024-01-03 11:13:07 -06:00
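Below is a minimal Win32 sketch of the rule described in the commit above, not the gpt4all code itself; the library path is a hypothetical example.

```cpp
// Minimal sketch (assumed example path, not the actual gpt4all code).
// LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR requires a fully qualified path with
// backslashes; the flag needs Windows 8+ or KB2533623.
#include <windows.h>
#include <cstdio>

int main() {
    const char *path = "C:\\Program Files\\gpt4all\\lib\\llamamodel-mainline-avxonly.dll";

    // hFile must be NULL; the flag adds the DLL's own directory to the
    // search path used when resolving its dependencies.
    HMODULE mod = LoadLibraryExA(path, NULL, LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR);
    if (!mod) {
        std::fprintf(stderr, "LoadLibraryExA failed: error %lu\n", GetLastError());
        return 1;
    }
    FreeLibrary(mod);
    return 0;
}
```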
Jared Van Bortel
d1c56b8b28 Implement configurable context length (#1749) 2023-12-16 17:58:15 -05:00
Jared Van Bortel
3acbef14b7 fix AVX support by removing direct linking to AVX2 libs (#1750) 2023-12-13 12:11:09 -05:00
Jared Van Bortel
0600f551b3 chatllm: do not attempt to serialize incompatible state (#1742) 2023-12-12 11:45:03 -05:00
Jared Van Bortel
1df3da0a88 update llama.cpp for clang warning fix 2023-12-11 13:07:41 -05:00
Jared Van Bortel
dfd8ef0186 backend: use ggml_new_graph for GGML backend v2 (#1719) 2023-12-06 14:38:53 -05:00
Jared Van Bortel
9e28dfac9c Update to latest llama.cpp (#1706) 2023-12-01 16:51:15 -05:00
Adam Treat
cce5fe2045 Fix macos build. 2023-11-17 11:59:31 -05:00
Adam Treat
371e2a5cbc LocalDocs version 2 with text embeddings. 2023-11-17 11:59:31 -05:00
Jared Van Bortel
d4ce9f4a7c llmodel_c: improve quality of error messages (#1625) 2023-11-07 11:20:14 -05:00
cebtenzzre
64101d3af5 update llama.cpp-mainline 2023-11-01 09:47:39 -04:00
Adam Treat
ffef60912f Update to llama.cpp 2023-10-30 11:40:16 -04:00
Adam Treat
f5f22fdbd0 Update llama.cpp for latest bugfixes. 2023-10-28 17:47:55 -04:00
cebtenzzre
7bcd9e8089 update llama.cpp-mainline 2023-10-27 19:29:36 -04:00
cebtenzzre
fd0c501d68 backend: support GGUFv3 (#1582) 2023-10-27 17:07:23 -04:00
Adam Treat
14b410a12a Update to latest version of llama.cpp which fixes issue 1507. 2023-10-27 12:08:35 -04:00
Adam Treat
ab96035bec Update to llama.cpp submodule for some vulkan fixes. 2023-10-26 13:46:38 -04:00
cebtenzzre
e90263c23f make scripts executable (#1555) 2023-10-24 09:28:21 -04:00
Aaron Miller
f414c28589 llmodel: whitelist library name patterns
this fixes some issues seen on installed Windows builds of 2.5.0

only load DLLs that might actually be model implementation DLLs; otherwise we pull all sorts of random junk into the process before it expects to be loaded

Signed-off-by: Aaron Miller <apage43@ninjawhale.com>
2023-10-23 21:40:14 -07:00
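Below is a minimal C++ sketch of the whitelist idea from the commit above; the name pattern and directory scan are illustrative assumptions, not the actual llmodel implementation.

```cpp
// Hypothetical sketch: only consider files whose names look like model
// implementation libraries, instead of handing every DLL in the search
// directory to LoadLibrary / dlopen.
#include <filesystem>
#include <regex>
#include <string>
#include <vector>

std::vector<std::filesystem::path> candidateImplLibs(const std::filesystem::path &dir) {
    // Assumed name pattern (illustrative only); the real whitelist lives in llmodel.
    static const std::regex pattern{R"((llamamodel|gptj|mpt)-.*\.(dll|so|dylib))"};
    std::vector<std::filesystem::path> out;
    for (const auto &entry : std::filesystem::directory_iterator(dir)) {
        if (entry.is_regular_file()
            && std::regex_match(entry.path().filename().string(), pattern))
            out.push_back(entry.path());  // candidate model implementation library
    }
    return out;
}
```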
cebtenzzre
4338e72a51 MPT: use upstream llama.cpp implementation (#1515) 2023-10-19 15:25:17 -04:00
cebtenzzre
0fe2e19691 llamamodel: re-enable error messages by default (#1537) 2023-10-19 13:46:33 -04:00
cebtenzzre
017c3a9649 python: prepare version 2.0.0rc1 (#1529) 2023-10-18 20:24:54 -04:00
cebtenzzre
9a19c740ee kompute: fix library loading issues with kp_logger (#1517) 2023-10-16 16:58:17 -04:00
Aaron Miller
f79557d2aa speedup: just use mat*vec shaders for mat*mat
so far my from-scratch mat*mat kernels are still slower than just running more
invocations of the existing Metal-ported mat*vec shaders. It should be
theoretically possible to make a mat*mat that is faster (for actual
mat*mat cases) than an optimal mat*vec, but it will need to be at
*least* as fast as the mat*vec op and then take special care to be
cache-friendly and save memory bandwidth, since the number of compute ops is the
same
2023-10-16 13:45:51 -04:00
cebtenzzre
22de3c56bd convert scripts: fix AutoConfig typo (#1512) 2023-10-13 14:16:51 -04:00
Aaron Miller
2490977f89 q6k, q4_1 mat*mat 2023-10-12 14:56:54 -04:00
Aaron Miller
afaa291eab python bindings should be quiet by default
* disable llama.cpp logging unless the GPT4ALL_VERBOSE_LLAMACPP environment variable is
  nonempty
* make the verbose flag for retrieve_model default to false (still
  overridable via the gpt4all constructor)

should be able to run a basic test:

```python
import gpt4all
model = gpt4all.GPT4All('/Users/aaron/Downloads/rift-coder-v0-7b-q4_0.gguf')
print(model.generate('def fib(n):'))
```

and see no non-model output when successful
2023-10-11 14:14:36 -07:00
cebtenzzre
7b611b49f2 llmodel: print an error if the CPU does not support AVX (#1499) 2023-10-11 15:09:40 -04:00
Aaron Miller
043617168e do not process prompts on gpu yet 2023-10-11 13:15:50 -04:00
Aaron Miller
64001a480a mat*mat for q4_0, q8_0 2023-10-11 13:15:50 -04:00
cebtenzzre
7a19047329 llmodel: do not call magic_match unless build variant is correct (#1488) 2023-10-11 11:30:48 -04:00
Cebtenzzre
5fe685427a chat: clearer CPU fallback messages 2023-10-06 11:35:14 -04:00
Adam Treat
eec906aa05 Speculative fix for build on mac. 2023-10-05 18:37:33 -04:00
Adam Treat
a9acdd25de Push a new version number for llmodel backend now that it is based on gguf. 2023-10-05 18:18:07 -04:00
Cebtenzzre
8bb6a6c201 rebase on newer llama.cpp 2023-10-05 18:16:19 -04:00
Cebtenzzre
d87573ea75 remove old llama.cpp submodules 2023-10-05 18:16:19 -04:00
Cebtenzzre
cc6db61c93 backend: fix build with Visual Studio generator
Use the $<CONFIG> generator expression instead of CMAKE_BUILD_TYPE. This
is needed because Visual Studio is a multi-configuration generator, so
we do not know what the build type will be until `cmake --build` is
called.

Fixes #1470
2023-10-05 18:16:19 -04:00
Adam Treat
f605a5b686 Add q8_0 kernels to kompute shaders and bump to latest llama/gguf. 2023-10-05 18:16:19 -04:00
Cebtenzzre
672cb850f9 differentiate between init failure and unsupported models 2023-10-05 18:16:19 -04:00
Adam Treat
906699e8e9 Bump to latest llama/gguf branch. 2023-10-05 18:16:19 -04:00
Cebtenzzre
088afada49 llamamodel: fix static vector in LLamaModel::endTokens 2023-10-05 18:16:19 -04:00
Adam Treat
b4d82ea289 Bump to the latest fixes for vulkan in llama. 2023-10-05 18:16:19 -04:00
Adam Treat
12f943e966 Fix regenerate button to be deterministic and bump the llama version to the latest we have for gguf. 2023-10-05 18:16:19 -04:00
Adam Treat
5d346e13d7 Add q6_k kernels for vulkan. 2023-10-05 18:16:19 -04:00
Adam Treat
4eefd386d0 Refactor for subgroups on mat * vec kernel. 2023-10-05 18:16:19 -04:00
Cebtenzzre
3c2aa299d8 gptj: remove unused variables 2023-10-05 18:16:19 -04:00