gpt4all

mirror of https://github.com/nomic-ai/gpt4all.git synced 2024-10-01 01:06:10 -04:00

Author	SHA1	Message	Date
Aaron Miller	f79557d2aa	speedup: just use matvec shaders for matmat so far my from-scratch matmats are still slower than just running more invocations of the existing Metal ported matvec shaders - it should be theoretically possible to make a matmat that's faster (for actual matmat cases) than an optimal matvec, but it will need to be at least* as fast as the mat*vec op and then take special care to be cache-friendly and save memory bandwidth, as the # of compute ops is the same	2023-10-16 13:45:51 -04:00
cebtenzzre	22de3c56bd	convert scripts: fix AutoConfig typo (#1512)	2023-10-13 14:16:51 -04:00
Aaron Miller	2490977f89	q6k, q4_1 mat*mat	2023-10-12 14:56:54 -04:00
Aaron Miller	afaa291eab	python bindings should be quiet by default * disable llama.cpp logging unless GPT4ALL_VERBOSE_LLAMACPP envvar is nonempty * make verbose flag for retrieve_model default false (but also be overridable via gpt4all constructor) should be able to run a basic test: ```python import gpt4all model = gpt4all.GPT4All('/Users/aaron/Downloads/rift-coder-v0-7b-q4_0.gguf') print(model.generate('def fib(n):')) ``` and see no non-model output when successful	2023-10-11 14:14:36 -07:00
cebtenzzre	7b611b49f2	llmodel: print an error if the CPU does not support AVX (#1499)	2023-10-11 15:09:40 -04:00
Aaron Miller	043617168e	do not process prompts on gpu yet	2023-10-11 13:15:50 -04:00
Aaron Miller	64001a480a	mat*mat for q4_0, q8_0	2023-10-11 13:15:50 -04:00
cebtenzzre	7a19047329	llmodel: do not call magic_match unless build variant is correct (#1488)	2023-10-11 11:30:48 -04:00
Cebtenzzre	5fe685427a	chat: clearer CPU fallback messages	2023-10-06 11:35:14 -04:00
Adam Treat	eec906aa05	Speculative fix for build on mac.	2023-10-05 18:37:33 -04:00
Adam Treat	a9acdd25de	Push a new version number for llmodel backend now that it is based on gguf.	2023-10-05 18:18:07 -04:00
Cebtenzzre	8bb6a6c201	rebase on newer llama.cpp	2023-10-05 18:16:19 -04:00
Cebtenzzre	d87573ea75	remove old llama.cpp submodules	2023-10-05 18:16:19 -04:00
Cebtenzzre	cc6db61c93	backend: fix build with Visual Studio generator Use the $<CONFIG> generator expression instead of CMAKE_BUILD_TYPE. This is needed because Visual Studio is a multi-configuration generator, so we do not know what the build type will be until `cmake --build` is called. Fixes #1470	2023-10-05 18:16:19 -04:00
Adam Treat	f605a5b686	Add q8_0 kernels to kompute shaders and bump to latest llama/gguf.	2023-10-05 18:16:19 -04:00
Cebtenzzre	672cb850f9	differentiate between init failure and unsupported models	2023-10-05 18:16:19 -04:00
Adam Treat	906699e8e9	Bump to latest llama/gguf branch.	2023-10-05 18:16:19 -04:00
Cebtenzzre	088afada49	llamamodel: fix static vector in LLamaModel::endTokens	2023-10-05 18:16:19 -04:00
Adam Treat	b4d82ea289	Bump to the latest fixes for vulkan in llama.	2023-10-05 18:16:19 -04:00
Adam Treat	12f943e966	Fix regenerate button to be deterministic and bump the llama version to latest we have for gguf.	2023-10-05 18:16:19 -04:00
Adam Treat	5d346e13d7	Add q6_k kernels for vulkan.	2023-10-05 18:16:19 -04:00
Adam Treat	4eefd386d0	Refactor for subgroups on mat * vec kernel.	2023-10-05 18:16:19 -04:00
Cebtenzzre	3c2aa299d8	gptj: remove unused variables	2023-10-05 18:16:19 -04:00
Cebtenzzre	f9deb87d20	convert scripts: add feed-forward length for better compatibility This GGUF key is used by all llama.cpp models with upstream support.	2023-10-05 18:16:19 -04:00
Cebtenzzre	cc7675d432	convert scripts: make gptj script executable	2023-10-05 18:16:19 -04:00
Cebtenzzre	0493e6eb07	convert scripts: use bytes_to_unicode from transformers	2023-10-05 18:16:19 -04:00
Cebtenzzre	d5d72f0361	gpt-j: update inference to match latest llama.cpp insights - Use F16 KV cache - Store transposed V in the cache - Avoid unnecessary Q copy Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> ggml upstream commit 0265f0813492602fec0e1159fe61de1bf0ccaf78	2023-10-05 18:16:19 -04:00
Cebtenzzre	050e7f076e	backend: port GPT-J to GGUF	2023-10-05 18:16:19 -04:00
Cebtenzzre	8f3abb37ca	fix references to removed model types	2023-10-05 18:16:19 -04:00
Cebtenzzre	4219c0e2e7	convert scripts: make them directly executable	2023-10-05 18:16:19 -04:00
Cebtenzzre	ce7be1db48	backend: use llamamodel.cpp for Falcon	2023-10-05 18:16:19 -04:00
Cebtenzzre	cca9e6ce81	convert_mpt_hf_to_gguf.py: better tokenizer decoding	2023-10-05 18:16:19 -04:00
Cebtenzzre	25297786db	convert scripts: load model as late as possible	2023-10-05 18:16:19 -04:00
Cebtenzzre	fd47088f2b	conversion scripts: cleanup	2023-10-05 18:16:19 -04:00
Cebtenzzre	6277eac9cc	backend: use llamamodel.cpp for StarCoder	2023-10-05 18:16:19 -04:00
Cebtenzzre	17fc9e3e58	backend: port Replit to GGUF	2023-10-05 18:16:19 -04:00
Cebtenzzre	7c67262a13	backend: port MPT to GGUF	2023-10-05 18:16:19 -04:00
Cebtenzzre	42bcb814b3	backend: port BERT to GGUF	2023-10-05 18:16:19 -04:00
Cebtenzzre	1d29e4696c	llamamodel: metal supports all quantization types now	2023-10-05 18:16:19 -04:00
Aaron Miller	507753a37c	macos build fixes	2023-10-05 18:16:19 -04:00
Adam Treat	d90d003a1d	Latest rebase on llama.cpp with gguf support.	2023-10-05 18:16:19 -04:00
Adam Treat	99c106e6b5	Fix a bug seen on AMD RADEON cards with vulkan backend.	2023-09-26 11:59:47 -04:00
Jacob Nguyen	e86c63750d	Update llama.cpp.cmake Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>	2023-09-16 11:42:56 -07:00
Adam Treat	84905aa281	Fix for crashes on systems where vulkan is not installed properly.	2023-09-16 12:19:46 -04:00
Adam Treat	045f6e6cdc	Link against ggml in bin so we can get the available devices without loading a model.	2023-09-15 14:45:25 -04:00
Adam Treat	aa33419c6e	Fallback to CPU more robustly.	2023-09-14 16:53:11 -04:00
Adam Treat	9013a089bd	Bump to new llama with new bugfix.	2023-09-14 10:02:11 -04:00
Adam Treat	3076e0bf26	Only show GPU when we're actually using it.	2023-09-14 09:59:19 -04:00
Adam Treat	cf4eb530ce	Sync to a newer version of llama.cpp with bugfix for vulkan.	2023-09-13 21:01:44 -04:00
Adam Treat	4b9a345aee	Update the submodule.	2023-09-13 17:05:46 -04:00

169 Commits