Commit Graph

23 Commits

Author SHA1 Message Date
Aaron Miller
f79557d2aa speedup: just use mat*vec shaders for mat*mat
so far my from-scratch mat*mats are still slower than just running more
invocations of the existing Metal ported mat*vec shaders - it should be
theoretically possible to make a mat*mat that's faster (for actual
mat*mat cases) than an optimal mat*vec, but it will need to be at
*least* as fast as the mat*vec op and then take special care to be
cache-friendly and save memory bandwidth, as the # of compute ops is the
same
2023-10-16 13:45:51 -04:00
Aaron Miller
2490977f89 q6k, q4_1 mat*mat 2023-10-12 14:56:54 -04:00
Aaron Miller
64001a480a mat*mat for q4_0, q8_0 2023-10-11 13:15:50 -04:00
Cebtenzzre
cc6db61c93 backend: fix build with Visual Studio generator
Use the $<CONFIG> generator expression instead of CMAKE_BUILD_TYPE. This
is needed because Visual Studio is a multi-configuration generator, so
we do not know what the build type will be until `cmake --build` is
called.

Fixes #1470
2023-10-05 18:16:19 -04:00
Adam Treat
f605a5b686 Add q8_0 kernels to kompute shaders and bump to latest llama/gguf. 2023-10-05 18:16:19 -04:00
Adam Treat
5d346e13d7 Add q6_k kernels for vulkan. 2023-10-05 18:16:19 -04:00
Adam Treat
4eefd386d0 Refactor for subgroups on mat * vec kernel. 2023-10-05 18:16:19 -04:00
Aaron Miller
507753a37c macos build fixes 2023-10-05 18:16:19 -04:00
Adam Treat
d90d003a1d Latest rebase on llama.cpp with gguf support. 2023-10-05 18:16:19 -04:00
Jacob Nguyen
e86c63750d Update llama.cpp.cmake
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>
2023-09-16 11:42:56 -07:00
Adam Treat
c953b321b7 Don't link against libvulkan. 2023-09-12 14:26:56 -04:00
Adam Treat
987546c63b Nomic vulkan backend licensed under the Software for Open Models License (SOM), version 1.0. 2023-08-31 15:29:54 -04:00
Adam Treat
d55cbbee32 Update to newer llama.cpp and disable older forks. 2023-08-31 15:29:54 -04:00
Adam Treat
84deebd223 Fix compile for windows and linux again. PLEASE DON'T REVERT THISgit gui! 2023-06-12 17:08:55 -04:00
Cosmic Snow
ae4a275bcd Fix Windows MSVC AVX builds
- bug introduced in 0cb2b86730
- currently getting: `warning C5102: ignoring invalid command-line macro definition '/arch:AVX2'`
- solution is to use `_options(...)` not `_definitions(...)`
2023-06-12 08:55:55 -07:00
Aaron Miller
d3ba1295a7
Metal+LLama take two (#929)
Support latest llama with Metal
---------

Co-authored-by: Adam Treat <adam@nomic.ai>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>
2023-06-09 16:48:46 -04:00
Adam Treat
b162b5c64e Revert "llama on Metal (#885)"
This reverts commit c55f81b860.
2023-06-09 15:08:46 -04:00
Aaron Miller
c55f81b860
llama on Metal (#885)
Support latest llama with Metal

---------

Co-authored-by: Adam Treat <adam@nomic.ai>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>
2023-06-09 14:58:12 -04:00
niansa
0cb2b86730 Synced llama.cpp.cmake with upstream 2023-06-08 18:21:32 -04:00
Adam Treat
010a04d96f Revert "Synced llama.cpp.cmake with upstream (#887)"
This reverts commit 89910c7ca8.
2023-06-08 07:23:41 -04:00
niansa/tuxifan
89910c7ca8
Synced llama.cpp.cmake with upstream (#887) 2023-06-07 09:18:22 -07:00
Adam Treat
c5de9634c9 Fix llama models on linux and windows. 2023-06-05 14:31:15 -04:00
AT
48275d0dcc
Dlopen backend 5 (#779)
Major change to the backend that allows for pluggable versions of llama.cpp/ggml. This was squashed merged from dlopen_backend_5 where the history is preserved.
2023-05-31 17:04:01 -04:00