Mirror of https://github.com/nomic-ai/gpt4all.git (synced 2024-10-01 01:06:10 -04:00)
Move FAQ entries to general FAQ and adjust, plus minor improvements
Commit 55f96aacc6 (parent e56f977b67)
@@ -9,6 +9,7 @@ Currently, there are five different model architectures that are supported:
3. MPT - Based off of Mosaic ML's MPT architecture with examples found [here](https://huggingface.co/mosaicml/mpt-7b)
4. Replit - Based off of Replit Inc.'s Replit architecture with examples found [here](https://huggingface.co/replit/replit-code-v1-3b)
5. Falcon - Based off of TII's Falcon architecture with examples found [here](https://huggingface.co/tiiuae/falcon-40b)
6. StarCoder - Based off of BigCode's StarCoder architecture with examples found [here](https://huggingface.co/bigcode/starcoder)

## Why so many different architectures? What differentiates them?

@@ -16,7 +17,7 @@ One of the major differences is license. Currently, the LLAMA based models are s

## How does GPT4All make these models available for CPU inference?

By leveraging the ggml library written by Georgi Gerganov and a growing community of developers. There are currently multiple different versions of this library. The original GitHub repo can be found [here](https://github.com/ggerganov/ggml), but the developer of the library has also created a LLAMA based version [here](https://github.com/ggerganov/llama.cpp). Currently, this backend is using the latter as a submodule.

## Does that mean GPT4All is compatible with all llama.cpp models and vice versa?

@@ -35,8 +36,55 @@ Your CPU needs to support [AVX or AVX2 instructions](https://en.wikipedia.org/wi

In newer versions of llama.cpp, there has been some added support for NVIDIA GPUs for inference. We're investigating how to incorporate this into our downloadable installers.

## Ok, so bottom line... how do I make my model on Hugging Face compatible with the GPT4All ecosystem right now?

1. Check to make sure the Hugging Face model is available in one of our three supported architectures (a quick way to check is sketched below)
2. If it is, then you can use the conversion script inside of our pinned llama.cpp submodule for GPTJ and LLAMA based models
3. Or if your model is an MPT model you can use the conversion script located directly in this backend directory under the scripts subdirectory
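
A minimal sketch of the check in step 1, in Python. It only assumes the model repository has been cloned locally and that its `config.json` follows the usual Hugging Face layout; the path is a placeholder:

``` py
import json
from pathlib import Path

# Placeholder path to a locally cloned Hugging Face model repository.
config_path = Path("models/my-hf-model") / "config.json"
config = json.loads(config_path.read_text())

# The model_type / architectures fields identify the model family,
# e.g. "gptj", "llama" or "mpt" -- compare them with the supported
# architectures listed at the top of this FAQ.
print(config.get("model_type"), config.get("architectures"))
```
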
## Language Bindings

#### There's a problem with the download

Some bindings can download a model, if allowed to do so. For example, in Python or TypeScript if `allow_download=True`
or `allowDownload=true` (default), a model is automatically downloaded into `.cache/gpt4all/` in the user's home folder,
unless it already exists.
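
A minimal Python sketch of that behaviour (the model file name is only an example; pick any entry from [models.json]):

``` py
from gpt4all import GPT4All

# First run: the file is downloaded into ~/.cache/gpt4all/ because allow_download=True.
# Later runs: the cached copy is reused and nothing is downloaded again.
model = GPT4All("ggml-gpt4all-j-v1.3-groovy.bin", allow_download=True)
print(model.generate("Name three colors.", temp=0))
```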

In case of connection issues or errors during the download, you might want to manually verify the model file's MD5
checksum by comparing it with the one listed in [models.json].
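
One way to do that check by hand, as a sketch in Python (the file name is a placeholder; point it at the model you actually downloaded):

``` py
import hashlib
from pathlib import Path

# Placeholder file name; use the model file that was actually downloaded.
model_file = Path.home() / ".cache" / "gpt4all" / "ggml-gpt4all-j-v1.3-groovy.bin"

md5 = hashlib.md5()
with open(model_file, "rb") as f:
    # Hash in 1 MiB chunks so multi-gigabyte model files don't need to fit in memory.
    for chunk in iter(lambda: f.read(1024 * 1024), b""):
        md5.update(chunk)

print(md5.hexdigest())  # compare with the md5sum listed for this model in models.json
```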

As an alternative to the basic downloader built into the bindings, you can choose to download from the
<https://gpt4all.io/> website instead. Scroll down to 'Model Explorer' and pick your preferred model.

[models.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models.json

#### I need the chat GUI and bindings to behave the same

The chat GUI and bindings are based on the same backend. You can make them behave the same way by following these steps:

- First of all, ensure that all parameters in the chat GUI settings match those passed to the generating API, e.g.:

    === "Python"

        ``` py
        from gpt4all import GPT4All
        model = GPT4All(...)
        model.generate("prompt text", temp=0, ...) # adjust parameters
        ```

    === "TypeScript"

        ``` ts
        import { createCompletion, loadModel } from '../src/gpt4all.js'
        const ll = await loadModel(...);
        const messages = ...
        const re = await createCompletion(ll, messages, { temp: 0, ... }); // adjust parameters
        ```

- To make comparing the output easier, set _Temperature_ in both to 0 for now. This will make the output deterministic.

- Next you'll have to compare the templates, adjusting them as necessary, based on how you're using the bindings.
    - Specifically, in Python:
        - With simple `generate()` calls, the input has to be surrounded with system and prompt templates.
        - When using a chat session, it depends on whether the bindings are allowed to download [models.json]. If yes,
          and in the chat GUI the default templates are used, it'll be handled automatically. If no, use
          `chat_session()` template parameters to customize them (a sketch follows after this list).

- Once you're done, remember to reset _Temperature_ to its previous value in both chat GUI and your custom code.

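To make the two template-related points above concrete, here is a small Python sketch. It assumes a bindings version whose `chat_session()` accepts `system_prompt` and `prompt_template` arguments; the template strings themselves are placeholders, so copy the exact ones from the chat GUI's model settings:

``` py
from gpt4all import GPT4All

model = GPT4All("ggml-gpt4all-j-v1.3-groovy.bin")  # example model file

# Placeholder templates -- copy the exact strings shown in the chat GUI settings.
system_template = "### System:\nYou are a helpful assistant.\n\n"
prompt_template = "### Human:\n{0}\n\n### Assistant:\n"

# Plain generate(): wrap the input in the templates yourself.
wrapped = system_template + prompt_template.format("Why are apples red?")
print(model.generate(wrapped, temp=0))

# Chat session: hand the templates to the session once and let it apply them.
with model.chat_session(system_prompt=system_template, prompt_template=prompt_template):
    print(model.generate("Why are apples red?", temp=0))
```
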
@@ -433,33 +433,5 @@ If you know exactly when a model should stop responding, you can add a custom ca
```

### FAQ

#### There's a problem with the download

If `allow_download=True` (default), a model is automatically downloaded into `.cache/gpt4all/` in the user's home
folder, unless it already exists.

In case of connection issues or errors during the download, you might want to manually verify the model file's MD5
checksum by comparing it with the one listed in [models.json].

As an alternative to the basic downloader built into the bindings, you can choose to download from the
<https://gpt4all.io/> website instead. Scroll down to 'Model Explorer' and pick your preferred model.

#### I need the chat GUI and bindings to behave the same

The chat GUI and bindings are based on the same backend. You can make them behave the same way by following these steps:

- First of all, ensure that all parameters in the chat GUI settings match those passed to `generate()`.

- To make comparing the output easier, set _Temperature_ in both to 0 for now. This will make the output deterministic.

- Next you'll have to compare the templates, adjusting them as necessary, based on how you're using the bindings:
    - With simple `generate()` calls, the input has to be surrounded with system and prompt templates.
    - When using a chat session, it depends on whether the bindings are allowed to download [models.json]. If yes, and
      in the chat GUI the default templates are used, it'll be handled automatically. If no, use `chat_session()`
      template parameters to customize them.

- Once you're done, remember to reset _Temperature_ to its previous value in both chat GUI and your Python code.

## API Documentation

::: gpt4all.gpt4all.GPT4All