# Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.
## Unreleased

### Added
- Warn on Windows if the Microsoft Visual C++ runtime libraries are not found (#2920)
## 2.8.2 - 2024-08-14

### Fixed
- Fixed incompatibility with Python 3.8 since v2.7.0 and Python <=3.11 since v2.8.1 (#2871)
## 2.8.1 - 2024-08-13

### Added
- Use greedy sampling when temperature is set to zero (#2854)
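The entry above means a temperature of zero now selects the single most likely token rather than sampling from a near-degenerate distribution. A minimal sketch of the idea (not the GPT4All implementation, which samples in C++ over raw logits):

```python
import math
import random

def sample_token(logits, temperature):
    """Pick a token index from a list of logits.

    temperature == 0 -> greedy: return the argmax.
    temperature > 0  -> sample from the softmax of logits / temperature.
    """
    if temperature == 0:
        # Greedy sampling: deterministic argmax over the logits.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Temperature sampling: softmax weights, then a weighted draw.
    weights = [math.exp(l / temperature) for l in logits]
    total = sum(weights)
    r = random.random() * total
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if acc >= r:
            return i
    return len(logits) - 1
```

With temperature zero the result is fully deterministic, which is what users setting `temp=0` generally expect.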
### Changed
- Search for pip-installed CUDA 11 as well as CUDA 12 (#2802)
- Stop shipping CUBINs to reduce wheel size (#2802)
- Use llama_kv_cache ops to shift context faster (#2781)
- Don't stop generating at end of context (#2781)
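The two context entries above go together: when the token buffer fills, part of the oldest context is evicted (now via llama.cpp's KV-cache shift ops rather than re-evaluating the prompt), and generation continues past the original context limit. A hypothetical sketch of the eviction policy, operating on a plain token list (the real work happens inside the KV cache):

```python
def shift_context(tokens, n_ctx, n_keep=0):
    """Make room in a full context window.

    Keeps the first n_keep tokens (e.g. the system prompt), discards
    half of the remainder's oldest tokens, and keeps the recent tail.
    Returns the token list unchanged if the window isn't full yet.
    """
    if len(tokens) < n_ctx:
        return tokens
    n_discard = (len(tokens) - n_keep) // 2
    return tokens[:n_keep] + tokens[n_keep + n_discard:]
```

Shifting the KV cache in place avoids re-running inference on the kept tokens, which is why the llama.cpp ops are faster than the previous approach.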
### Fixed
- Make reverse prompt detection work more reliably and prevent it from breaking output (#2781)
- Explicitly target macOS 12.6 in CI to fix Metal compatibility on older macOS (#2849)
- Do not initialize Vulkan driver when only using CPU (#2843)
- Fix a segfault on exit when using CPU mode on Linux with NVIDIA and EGL (#2843)
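On the reverse-prompt fix above: the failure mode is a stop string split across token boundaries, which a naive substring check either misses or prints half of. One common technique (a sketch under assumed names, not the actual C++ code) is to hold back any output tail that could still be the prefix of a stop string:

```python
def split_emittable(pending, stops):
    """Split buffered model output into (text safe to emit, held-back tail).

    Returns (text_before_stop, None) if a complete stop string was found,
    signalling that generation should end without printing the stop itself.
    """
    for stop in stops:
        idx = pending.find(stop)
        if idx != -1:
            return pending[:idx], None
    # No complete stop: hold back the longest tail that is a proper
    # prefix of some stop string, since the next token may complete it.
    hold = 0
    for stop in stops:
        for k in range(1, len(stop)):
            if pending.endswith(stop[:k]):
                hold = max(hold, k)
    return pending[: len(pending) - hold], pending[len(pending) - hold:]
```

This way the stop string never leaks into the visible output even when it arrives one character per token.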
## 2.8.0 - 2024-08-05

### Added
- Support GPT-NeoX, Gemma 2, OpenELM, ChatGLM, and Jais architectures (all with Vulkan support) (#2694)
- Enable Vulkan support for StarCoder2, XVERSE, Command R, and OLMo (#2694)
- Support DeepSeek-V2 architecture (no Vulkan support) (#2702)
- Add Llama 3.1 8B Instruct to models3.json (by @3Simplex in #2731 and #2732)
- Support Llama 3.1 RoPE scaling (#2758)
- Add Qwen2-1.5B-Instruct to models3.json (by @ThiloteE in #2759)
- Detect use of a Python interpreter under Rosetta for a clearer error message (#2793)
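On the Rosetta detection entry above: an x86_64 Python running under Rosetta 2 on Apple Silicon reports `platform.machine() == "x86_64"`, so the usual check is the `sysctl.proc_translated` kernel flag. A hedged sketch of that check (the function name is ours; the actual bindings' check may differ):

```python
import platform
import subprocess

def running_under_rosetta():
    """Best-effort check for an x86_64 process translated by Rosetta 2.

    On macOS, sysctl.proc_translated is 1 for translated processes and
    0 (or absent, on Intel Macs) otherwise. Always False off macOS.
    """
    if platform.system() != "Darwin":
        return False
    try:
        out = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True, text=True, check=False,
        ).stdout.strip()
        return out == "1"
    except OSError:
        return False
```

Detecting this case lets the bindings print an actionable error ("install an arm64 Python") instead of a confusing native-library load failure.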
### Changed
- Build against CUDA 11.8 instead of CUDA 12 for better compatibility with older drivers (#2639)
- Update llama.cpp to commit 87e397d00 from July 19th (#2694)
### Removed

- Remove unused internal `llmodel_has_gpu_device` (#2409)
- Remove support for GPT-J models (#2676, #2693)
### Fixed

- Fix debug mode crash on Windows and undefined behavior in `LLamaModel::embedInternal` (#2467)
- Fix CUDA PTX errors with some GPT4All builds (#2421)
- Fix mishandling of inputs longer than n_ctx tokens after #1970 (#2498)
- Fix crash when Kompute falls back to CPU (#2640)
- Fix several Kompute resource management issues (#2694)
- Fix crash/hang when some models stop generating, by showing special tokens (#2701)
- Fix several backend issues (#2778)