mirror of https://github.com/nomic-ai/gpt4all.git (synced 2024-10-01 01:06:10 -04:00)

Switch to new models2.json for new gguf release and bump our version to 2.5.0.

parent 088afada49
commit ea66669cef

@@ -26,7 +26,7 @@ router = APIRouter(prefix="/engines", tags=["Search Endpoints"])
 async def list_engines():
     '''
     List all available GPT4All models from
-    https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models.json
+    https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models2.json
     '''
     raise NotImplementedError()
     return ListEnginesResponse(data=[])

@@ -7,7 +7,7 @@ It is optimized to run 7-13B parameter LLMs on the CPU's of any computer running
 ## Running LLMs on CPU
 The GPT4All Chat UI supports models from all newer versions of `GGML`, `llama.cpp` including the `LLaMA`, `MPT`, `replit`, `GPT-J` and `falcon` architectures

-GPT4All maintains an official list of recommended models located in [models.json](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models.json). You can pull request new models to it and if accepted they will show up in the official download dialog.
+GPT4All maintains an official list of recommended models located in [models2.json](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models2.json). You can pull request new models to it and if accepted they will show up in the official download dialog.

 #### Sideloading any GGML model
 If a model is compatible with the gpt4all-backend, you can sideload it into GPT4All Chat by:

@@ -61,12 +61,12 @@ or `allowDownload=true` (default), a model is automatically downloaded into `.ca
 unless it already exists.

 In case of connection issues or errors during the download, you might want to manually verify the model file's MD5
-checksum by comparing it with the one listed in [models.json].
+checksum by comparing it with the one listed in [models2.json].

 As an alternative to the basic downloader built into the bindings, you can choose to download from the
 <https://gpt4all.io/> website instead. Scroll down to 'Model Explorer' and pick your preferred model.

-[models.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models.json
+[models2.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models2.json

 #### I need the chat GUI and bindings to behave the same

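The manual MD5 check mentioned in the FAQ text of the hunk above could look roughly like this in Python. This is a minimal sketch and not code from the commit: `file_md5` and `matches_models_entry` are hypothetical helper names, and an entry is assumed to be a dict carrying the `md5sum` field as it appears in models2.json.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Model files are several GB, so avoid loading them into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_models_entry(path: Path, entry: dict) -> bool:
    """Compare a local file against the `md5sum` field of one models2.json entry."""
    return file_md5(path) == entry["md5sum"].lower()
```

A mismatch would suggest an incomplete or corrupted download, in which case the file should be fetched again.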
@@ -93,7 +93,7 @@ The chat GUI and bindings are based on the same backend. You can make them behav
 - Next you'll have to compare the templates, adjusting them as necessary, based on how you're using the bindings.
     - Specifically, in Python:
         - With simple `generate()` calls, the input has to be surrounded with system and prompt templates.
-        - When using a chat session, it depends on whether the bindings are allowed to download [models.json]. If yes,
+        - When using a chat session, it depends on whether the bindings are allowed to download [models2.json]. If yes,
           and in the chat GUI the default templates are used, it'll be handled automatically. If no, use
           `chat_session()` template parameters to customize them.


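To illustrate what "surrounded with system and prompt templates" means for a plain `generate()` call, here is a small sketch. It assumes the `%1` placeholder convention visible in the `promptTemplate` values of this commit's models2.json; the bindings' own template syntax may differ, and `build_prompt` is a hypothetical helper, not part of the gpt4all API.

```python
def build_prompt(user_input: str, prompt_template: str, system_prompt: str = "") -> str:
    """Prefix the system prompt and substitute the user input for `%1`."""
    return system_prompt + prompt_template.replace("%1", user_input)


# Template and system prompt copied from the "Mini Orca (Small)" entry of models2.json.
ORCA_SYSTEM = (
    "### System:\nYou are an AI assistant that follows instruction extremely well. "
    "Help as much as you can.\n\n"
)
ORCA_TEMPLATE = "### User:\n%1\n### Response:\n"

prompt = build_prompt("Name three colors.", ORCA_TEMPLATE, ORCA_SYSTEM)
```

The resulting string is what would be handed to `generate()` in place of the bare user input.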
@@ -8,7 +8,7 @@ import modal

 def download_model():
     import gpt4all
-    #you can use any model from https://gpt4all.io/models/models.json
+    #you can use any model from https://gpt4all.io/models/models2.json
     return gpt4all.GPT4All("ggml-gpt4all-j-v1.3-groovy.bin")

 image=modal.Image.debian_slim().pip_install("gpt4all").run_function(download_model)
@@ -31,4 +31,4 @@ def main():
     model = GPT4All()
     for i in range(10):
         model.generate.call()
-```
+```

@@ -77,10 +77,10 @@ When using GPT4All models in the `chat_session` context:
 - Consecutive chat exchanges are taken into account and not discarded until the session ends; as long as the model has capacity.
 - Internal K/V caches are preserved from previous conversation history, speeding up inference.
 - The model is given a system and prompt template which make it chatty. Depending on `allow_download=True` (default),
-  it will obtain the latest version of [models.json] from the repository, which contains specifically tailored templates
+  it will obtain the latest version of [models2.json] from the repository, which contains specifically tailored templates
   for models. Conversely, if it is not allowed to download, it falls back to default templates instead.

-[models.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models.json
+[models2.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models2.json


 ### Streaming Generations
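The fallback described in the `chat_session` bullets above (model-specific templates when the model list is available, defaults otherwise) can be sketched as follows. This is an illustration under stated assumptions, not the bindings' actual code: `resolve_templates` is a hypothetical name, the default prompt template is borrowed from `DEFAULT_MODEL_CONFIG` in the Node.js hunk of this commit, and the empty default system prompt is an assumption.

```python
from typing import Optional

# Placeholder defaults: the prompt template comes from the Node.js hunk of this
# commit, the empty system prompt is an assumption made for this sketch.
DEFAULT_TEMPLATES = {
    "systemPrompt": "",
    "promptTemplate": "### Human: \n%1\n### Assistant:\n",
}


def resolve_templates(filename: str, model_list: Optional[list]) -> dict:
    """Pick model-specific templates, falling back to the defaults.

    `model_list` is the parsed models2.json, or None when downloading is disallowed.
    """
    templates = dict(DEFAULT_TEMPLATES)
    for entry in model_list or []:
        if entry.get("filename") == filename:
            # Override only the keys this entry actually provides.
            for key in DEFAULT_TEMPLATES:
                if key in entry:
                    templates[key] = entry[key]
            break
    return templates
```

Passing `None` for `model_list` models the `allow_download=False` case, where no list is fetched and the defaults apply unchanged.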
@@ -379,7 +379,7 @@ logging infrastructure offers [many more customization options][py-logging-cookb

 ### Without Online Connectivity
 To prevent GPT4All from accessing online resources, instantiate it with `allow_download=False`. This will disable both
-downloading missing models and [models.json], which contains information about them. As a result, predefined templates
+downloading missing models and [models2.json], which contains information about them. As a result, predefined templates
 are used instead of model-specific system and prompt templates:

 === "GPT4All Default Templates Example"

@@ -38,7 +38,7 @@ The GPT4All software ecosystem is compatible with the following Transformer arch
 - `MPT` (including `Replit`)
 - `GPT-J`

-You can find an exhaustive list of supported models on the [website](https://gpt4all.io) or in the [models directory](https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models.json)
+You can find an exhaustive list of supported models on the [website](https://gpt4all.io) or in the [models directory](https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models2.json)


 GPT4All models are artifacts produced through a process known as neural network quantization.

@@ -108,12 +108,12 @@ class GPT4All:
     @staticmethod
     def list_models() -> List[ConfigType]:
         """
-        Fetch model list from https://gpt4all.io/models/models.json.
+        Fetch model list from https://gpt4all.io/models/models2.json.

         Returns:
             Model list in JSON format.
         """
-        return requests.get("https://gpt4all.io/models/models.json").json()
+        return requests.get("https://gpt4all.io/models/models2.json").json()

     @staticmethod
     def retrieve_model(

@@ -21,7 +21,7 @@ const DEFAULT_MODEL_CONFIG = {
     promptTemplate: "### Human: \n%1\n### Assistant:\n",
 }

-const DEFAULT_MODEL_LIST_URL = "https://gpt4all.io/models/models.json";
+const DEFAULT_MODEL_LIST_URL = "https://gpt4all.io/models/models2.json";

 const DEFAULT_PROMPT_CONTEXT = {
     temp: 0.7,

@@ -236,7 +236,7 @@ async function retrieveModel(modelName, options = {}) {
         file: retrieveOptions.modelConfigFile,
         url:
             retrieveOptions.allowDownload &&
-            "https://gpt4all.io/models/models.json",
+            "https://gpt4all.io/models/models2.json",
     });

     const loadedModelConfig = availableModels.find(

@@ -17,8 +17,8 @@ if(APPLE)
 endif()

 set(APP_VERSION_MAJOR 2)
-set(APP_VERSION_MINOR 4)
-set(APP_VERSION_PATCH 20)
+set(APP_VERSION_MINOR 5)
+set(APP_VERSION_PATCH 0)
 set(APP_VERSION "${APP_VERSION_MAJOR}.${APP_VERSION_MINOR}.${APP_VERSION_PATCH}")

 # Include the binary directory for the generated header file

143
gpt4all-chat/metadata/models2.json
Normal file
@@ -0,0 +1,143 @@
+[
+  {
+    "order": "a",
+    "md5sum": "5aff90007499bce5c64b1c0760c0b186",
+    "name": "Wizard v1.2",
+    "filename": "wizardlm-13b-v1.2.Q4_0.gguf",
+    "filesize": "7365834624",
+    "requires": "2.5.0",
+    "ramrequired": "16",
+    "parameters": "13 billion",
+    "quant": "q4_0",
+    "type": "LLaMA",
+    "systemPrompt": " ",
+    "description": "<strong>Best overall model</strong><br><ul><li>Instruction based<li>Gives very long responses<li>Finetuned with only 1k of high-quality data<li>Trained by Microsoft and Peking University<li>Cannot be used commercially</ul",
+    "url": "https://gpt4all.io/models/gguf/wizardlm-13b-v1.2.Q4_0.gguf"
+  },
+  {
+    "order": "a",
+    "md5sum": "97463be739b50525df56d33b26b00852",
+    "name": "Mistral Instruct",
+    "filename": "mistral-7b-instruct-v0.1.Q4_0.gguf",
+    "filesize": "4108916384",
+    "requires": "2.5.0",
+    "ramrequired": "8",
+    "parameters": "7 billion",
+    "quant": "q4_0",
+    "type": "LLaMA",
+    "systemPrompt": " ",
+    "description": " ",
+    "url": "https://gpt4all.io/models/gguf/mistral-7b-instruct-v0.1.Q4_0.gguf"
+  },
+  {
+    "order": "a",
+    "md5sum": "48de9538c774188eb25a7e9ee024bbd3",
+    "name": "Mistral OpenOrca",
+    "filename": "mistral-7b-openorca.Q4_0.gguf",
+    "filesize": "4108927744",
+    "requires": "2.5.0",
+    "ramrequired": "8",
+    "parameters": "7 billion",
+    "quant": "q4_0",
+    "type": "LLaMA",
+    "systemPrompt": " ",
+    "description": " ",
+    "url": "https://gpt4all.io/models/gguf/mistral-7b-openorca.Q4_0.gguf"
+  },
+  {
+    "order": "b",
+    "md5sum": "31cb6d527bd3bfb5e73c2e9dfbc75033",
+    "name": "GPT4All Falcon",
+    "filename": "gpt4all-falcon-q4_0.gguf",
+    "filesize": "4210419040",
+    "requires": "2.5.0",
+    "ramrequired": "8",
+    "parameters": "7 billion",
+    "quant": "q4_0",
+    "type": "Falcon",
+    "systemPrompt": " ",
+    "description": "<strong>Best overall smaller model</strong><br><ul><li>Fast responses</li><li>Instruction based</li><li>Trained by TII<li>Finetuned by Nomic AI<li>Licensed for commercial use</ul>",
+    "url": "https://gpt4all.io/models/gguf/gpt4all-falcon-q4_0.gguf",
+    "promptTemplate": "### Instruction:\n%1\n### Response:\n"
+  },
+  {
+    "order": "c",
+    "md5sum": "3d12810391d04d1153b692626c0c6e16",
+    "name": "Hermes",
+    "filename": "nous-hermes-llama2-13b.Q4_0.gguf",
+    "filesize": "7366062080",
+    "requires": "2.5.0",
+    "ramrequired": "16",
+    "parameters": "13 billion",
+    "quant": "q4_0",
+    "type": "LLaMA",
+    "systemPrompt": " ",
+    "description": "<strong>Extremely good model</strong><br><ul><li>Instruction based<li>Gives long responses<li>Curated with 300,000 uncensored instructions<li>Trained by Nous Research<li>Cannot be used commercially</ul>",
+    "url": "https://gpt4all.io/models/gguf/nous-hermes-llama2-13b.Q4_0.gguf",
+    "promptTemplate": "### Instruction:\n%1\n### Response:\n"
+  },
+  {
+    "order": "f",
+    "md5sum": "40388eb2f8d16bb5d08c96fdfaac6b2c",
+    "name": "Snoozy",
+    "filename": "gpt4all-13b-snoozy-q4_0.gguf",
+    "filesize": "7365834624",
+    "requires": "2.5.0",
+    "ramrequired": "16",
+    "parameters": "13 billion",
+    "quant": "q4_0",
+    "type": "LLaMA",
+    "systemPrompt": " ",
+    "description": "<strong>Very good overall model</strong><br><ul><li>Instruction based<li>Based on the same dataset as Groovy<li>Slower than Groovy, with higher quality responses<li>Trained by Nomic AI<li>Cannot be used commercially</ul>",
+    "url": "https://gpt4all.io/models/gguf/gpt4all-13b-snoozy-q4_0.gguf"
+  },
+  {
+    "order": "g",
+    "md5sum": "f5bc6a52f72efd9128efb2eeed802c86",
+    "name": "MPT Chat",
+    "filename": "mpt-7b-chat-q4_0.gguf",
+    "filesize": "3911522272",
+    "requires": "2.5.0",
+    "ramrequired": "8",
+    "parameters": "7 billion",
+    "quant": "q4_0",
+    "type": "MPT",
+    "description": "<strong>Best overall smaller model</strong><br><ul><li>Fast responses<li>Chat based<li>Trained by Mosaic ML<li>Cannot be used commercially</ul>",
+    "url": "https://gpt4all.io/models/gguf/mpt-7b-chat-q4_0.gguf",
+    "promptTemplate": "<|im_start|>user\n%1<|im_end|><|im_start|>assistant\n",
+    "systemPrompt": "<|im_start|>system\n- You are a helpful assistant chatbot trained by MosaicML.\n- You answer questions.\n- You are excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.\n- You are more than just an information source, you are also able to write poetry, short stories, and make jokes.<|im_end|>"
+  },
+  {
+    "order": "i",
+    "md5sum": "aae346fe095e60139ca39b3fda4ac7ae",
+    "name": "Mini Orca (Small)",
+    "filename": "orca-mini-3b.q4_0.gguf",
+    "filesize": "1928648352",
+    "requires": "2.5.0",
+    "ramrequired": "4",
+    "parameters": "3 billion",
+    "quant": "q4_0",
+    "type": "OpenLLaMa",
+    "description": "<strong>Small version of new model with novel dataset</strong><br><ul><li>Instruction based<li>Explain tuned datasets<li>Orca Research Paper dataset construction approaches<li>Licensed for commercial use</ul>",
+    "url": "https://gpt4all.io/models/gguf/orca-mini-3b.q4_0.gguf",
+    "promptTemplate": "### User:\n%1\n### Response:\n",
+    "systemPrompt": "### System:\nYou are an AI assistant that follows instruction extremely well. Help as much as you can.\n\n"
+  },
+  {
+    "order": "s",
+    "md5sum": "51c627fac9062e208f9b386f105cbd48",
+    "disableGUI": "true",
+    "name": "Replit",
+    "filename": "replit-code-v1-3b-q4_0.gguf",
+    "filesize": "1532949760",
+    "requires": "2.5.0",
+    "ramrequired": "4",
+    "parameters": "3 billion",
+    "quant": "f16",
+    "type": "Replit",
+    "systemPrompt": " ",
+    "promptTemplate": "%1",
+    "description": "<strong>Trained on subset of the Stack</strong><br><ul><li>Code completion based<li>Licensed for commercial use</ul>",
+    "url": "https://gpt4all.io/models/gguf/replit-code-v1-3b-q4_0.gguf"
+  }
+]

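As a rough illustration of how a client might consume the new models2.json, the sketch below filters entries by memory requirement. Note that numeric fields such as `ramrequired` and `filesize` are stored as strings in the file. The helper name `models_fitting_ram` is hypothetical, the sample is a trimmed copy of three entries from the file above, and reading `ramrequired` as gigabytes is an assumption based on the field name.

```python
import json

# Trimmed copy of three entries from models2.json (only the fields used here).
SAMPLE = json.loads("""
[
  {"name": "Wizard v1.2", "filename": "wizardlm-13b-v1.2.Q4_0.gguf",
   "filesize": "7365834624", "ramrequired": "16"},
  {"name": "Mistral Instruct", "filename": "mistral-7b-instruct-v0.1.Q4_0.gguf",
   "filesize": "4108916384", "ramrequired": "8"},
  {"name": "Mini Orca (Small)", "filename": "orca-mini-3b.q4_0.gguf",
   "filesize": "1928648352", "ramrequired": "4"}
]
""")


def models_fitting_ram(models: list, available_gb: int) -> list:
    """Return entries whose `ramrequired` (a string in the file) fits in `available_gb`."""
    return [m for m in models if int(m["ramrequired"]) <= available_gb]


for model in models_fitting_ram(SAMPLE, 8):
    size_gib = int(model["filesize"]) / 2**30
    print(f'{model["name"]}: {model["filename"]} ({size_gib:.2f} GiB)')
```

Converting the string fields with `int()` at the point of use keeps the parsed data identical to the file contents.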
@@ -834,12 +834,14 @@ void ModelList::updateModelsFromDirectory()
     processDirectory(localPath);
 }

+#define MODELS_VERSION 2
+
 void ModelList::updateModelsFromJson()
 {
 #if defined(USE_LOCAL_MODELSJSON)
-    QUrl jsonUrl("file://" + QDir::homePath() + "/dev/large_language_models/gpt4all/gpt4all-chat/metadata/models.json");
+    QUrl jsonUrl("file://" + QDir::homePath() + QString("/dev/large_language_models/gpt4all/gpt4all-chat/metadata/models%1.json").arg(MODELS_VERSION));
 #else
-    QUrl jsonUrl("http://gpt4all.io/models/models.json");
+    QUrl jsonUrl(QString("http://gpt4all.io/models/models%1.json").arg(MODELS_VERSION));
 #endif
     QNetworkRequest request(jsonUrl);
     QSslConfiguration conf = request.sslConfiguration();
@@ -881,9 +883,9 @@ void ModelList::updateModelsFromJsonAsync()
     emit asyncModelRequestOngoingChanged();

 #if defined(USE_LOCAL_MODELSJSON)
-    QUrl jsonUrl("file://" + QDir::homePath() + "/dev/large_language_models/gpt4all/gpt4all-chat/metadata/models.json");
+    QUrl jsonUrl("file://" + QDir::homePath() + QString("/dev/large_language_models/gpt4all/gpt4all-chat/metadata/models%1.json").arg(MODELS_VERSION));
 #else
-    QUrl jsonUrl("http://gpt4all.io/models/models.json");
+    QUrl jsonUrl(QString("http://gpt4all.io/models/models%1.json").arg(MODELS_VERSION));
 #endif
     QNetworkRequest request(jsonUrl);
     QSslConfiguration conf = request.sslConfiguration();

@@ -47,7 +47,7 @@ MyDialog {
         Layout.fillHeight: true
         horizontalAlignment: Qt.AlignHCenter
         verticalAlignment: Qt.AlignVCenter
-        text: qsTr("Network error: could not retrieve http://gpt4all.io/models/models.json")
+        text: qsTr("Network error: could not retrieve http://gpt4all.io/models/models2.json")
         font.pixelSize: theme.fontSizeLarge
         color: theme.mutedTextColor
     }