Switch to new models2.json for new gguf release and bump our version to 2.5.0.
Adam Treat 2023-10-05 09:56:40 -04:00
parent 088afada49
commit ea66669cef
13 changed files with 167 additions and 22 deletions

View File

@@ -26,7 +26,7 @@ router = APIRouter(prefix="/engines", tags=["Search Endpoints"])
 async def list_engines():
     '''
     List all available GPT4All models from
-    https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models.json
+    https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models2.json
     '''
     raise NotImplementedError()
     return ListEnginesResponse(data=[])

View File

@@ -7,7 +7,7 @@ It is optimized to run 7-13B parameter LLMs on the CPU's of any computer running
 ## Running LLMs on CPU
 The GPT4All Chat UI supports models from all newer versions of `GGML`, `llama.cpp` including the `LLaMA`, `MPT`, `replit`, `GPT-J` and `falcon` architectures
-GPT4All maintains an official list of recommended models located in [models.json](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models.json). You can pull request new models to it and if accepted they will show up in the official download dialog.
+GPT4All maintains an official list of recommended models located in [models2.json](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models2.json). You can pull request new models to it and if accepted they will show up in the official download dialog.
 #### Sideloading any GGML model
 If a model is compatible with the gpt4all-backend, you can sideload it into GPT4All Chat by:

View File

@@ -61,12 +61,12 @@ or `allowDownload=true` (default), a model is automatically downloaded into `.ca
 unless it already exists.
 In case of connection issues or errors during the download, you might want to manually verify the model file's MD5
-checksum by comparing it with the one listed in [models.json].
+checksum by comparing it with the one listed in [models2.json].
 As an alternative to the basic downloader built into the bindings, you can choose to download from the
 <https://gpt4all.io/> website instead. Scroll down to 'Model Explorer' and pick your preferred model.
-[models.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models.json
+[models2.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models2.json
 #### I need the chat GUI and bindings to behave the same
@@ -93,7 +93,7 @@ The chat GUI and bindings are based on the same backend. You can make them behav
 - Next you'll have to compare the templates, adjusting them as necessary, based on how you're using the bindings.
 - Specifically, in Python:
   - With simple `generate()` calls, the input has to be surrounded with system and prompt templates.
-  - When using a chat session, it depends on whether the bindings are allowed to download [models.json]. If yes,
+  - When using a chat session, it depends on whether the bindings are allowed to download [models2.json]. If yes,
     and in the chat GUI the default templates are used, it'll be handled automatically. If no, use
     `chat_session()` template parameters to customize them.
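
The checksum advice above is easy to act on. Here is a minimal sketch (not part of this commit) that computes a downloaded model's MD5 and compares it against the matching entry in models2.json; the local cache path and filename are assumptions for illustration:

```python
import hashlib
from pathlib import Path

import requests

MODELS_URL = "https://gpt4all.io/models/models2.json"
# Assumed default bindings cache location and an example filename.
model_path = Path.home() / ".cache" / "gpt4all" / "mistral-7b-openorca.Q4_0.gguf"

# Hash the file in 1 MiB chunks so multi-GB .gguf files don't exhaust memory.
md5 = hashlib.md5()
with open(model_path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        md5.update(chunk)

# Look up the expected checksum for this filename in models2.json.
entries = requests.get(MODELS_URL, timeout=30).json()
expected = next(e["md5sum"] for e in entries if e["filename"] == model_path.name)

print("OK" if md5.hexdigest() == expected else "checksum mismatch")
```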

View File

@@ -8,7 +8,7 @@ import modal
 def download_model():
     import gpt4all
-    #you can use any model from https://gpt4all.io/models/models.json
+    #you can use any model from https://gpt4all.io/models/models2.json
     return gpt4all.GPT4All("ggml-gpt4all-j-v1.3-groovy.bin")
 image=modal.Image.debian_slim().pip_install("gpt4all").run_function(download_model)
@@ -31,4 +31,4 @@ def main():
     model = GPT4All()
     for i in range(10):
         model.generate.call()
-```
+```
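
For comparison, a minimal non-Modal sketch of the same download-and-generate flow with the Python binding; the model filename is an assumed example taken from the new models2.json:

```python
import gpt4all

# First use downloads the model (any entry from https://gpt4all.io/models/models2.json).
model = gpt4all.GPT4All("mistral-7b-openorca.Q4_0.gguf")
print(model.generate("Name three primary colors.", max_tokens=64))
```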

View File

@@ -77,10 +77,10 @@ When using GPT4All models in the `chat_session` context:
 - Consecutive chat exchanges are taken into account and not discarded until the session ends; as long as the model has capacity.
 - Internal K/V caches are preserved from previous conversation history, speeding up inference.
 - The model is given a system and prompt template which make it chatty. Depending on `allow_download=True` (default),
-  it will obtain the latest version of [models.json] from the repository, which contains specifically tailored templates
+  it will obtain the latest version of [models2.json] from the repository, which contains specifically tailored templates
   for models. Conversely, if it is not allowed to download, it falls back to default templates instead.
-[models.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models.json
+[models2.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models2.json
 ### Streaming Generations
@@ -379,7 +379,7 @@ logging infrastructure offers [many more customization options][py-logging-cookb
 ### Without Online Connectivity
 To prevent GPT4All from accessing online resources, instantiate it with `allow_download=False`. This will disable both
-downloading missing models and [models.json], which contains information about them. As a result, predefined templates
+downloading missing models and [models2.json], which contains information about them. As a result, predefined templates
 are used instead of model-specific system and prompt templates:
 === "GPT4All Default Templates Example"

View File

@@ -38,7 +38,7 @@ The GPT4All software ecosystem is compatible with the following Transformer arch
 - `MPT` (including `Replit`)
 - `GPT-J`
-You can find an exhaustive list of supported models on the [website](https://gpt4all.io) or in the [models directory](https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models.json)
+You can find an exhaustive list of supported models on the [website](https://gpt4all.io) or in the [models directory](https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models2.json)
 GPT4All models are artifacts produced through a process known as neural network quantization.

View File

@@ -108,12 +108,12 @@ class GPT4All:
     @staticmethod
     def list_models() -> List[ConfigType]:
         """
-        Fetch model list from https://gpt4all.io/models/models.json.
+        Fetch model list from https://gpt4all.io/models/models2.json.
         Returns:
             Model list in JSON format.
         """
-        return requests.get("https://gpt4all.io/models/models.json").json()
+        return requests.get("https://gpt4all.io/models/models2.json").json()
     @staticmethod
     def retrieve_model(
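
A quick usage sketch for the updated method; the printed fields follow the entry schema of the models2.json added in this commit:

```python
from gpt4all import GPT4All

# Fetches https://gpt4all.io/models/models2.json and lists each model.
for entry in GPT4All.list_models():
    print(entry["name"], entry["filename"], entry.get("ramrequired", "?") + " GB RAM")
```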

View File

@@ -21,7 +21,7 @@ const DEFAULT_MODEL_CONFIG = {
     promptTemplate: "### Human: \n%1\n### Assistant:\n",
 }
-const DEFAULT_MODEL_LIST_URL = "https://gpt4all.io/models/models.json";
+const DEFAULT_MODEL_LIST_URL = "https://gpt4all.io/models/models2.json";
 const DEFAULT_PROMPT_CONTEXT = {
     temp: 0.7,

View File

@@ -236,7 +236,7 @@ async function retrieveModel(modelName, options = {}) {
         file: retrieveOptions.modelConfigFile,
         url:
             retrieveOptions.allowDownload &&
-            "https://gpt4all.io/models/models.json",
+            "https://gpt4all.io/models/models2.json",
     });
     const loadedModelConfig = availableModels.find(

View File

@@ -17,8 +17,8 @@ if(APPLE)
 endif()
 set(APP_VERSION_MAJOR 2)
-set(APP_VERSION_MINOR 4)
-set(APP_VERSION_PATCH 20)
+set(APP_VERSION_MINOR 5)
+set(APP_VERSION_PATCH 0)
 set(APP_VERSION "${APP_VERSION_MAJOR}.${APP_VERSION_MINOR}.${APP_VERSION_PATCH}")
 # Include the binary directory for the generated header file

View File

@@ -0,0 +1,143 @@
[
{
"order": "a",
"md5sum": "5aff90007499bce5c64b1c0760c0b186",
"name": "Wizard v1.2",
"filename": "wizardlm-13b-v1.2.Q4_0.gguf",
"filesize": "7365834624",
"requires": "2.5.0",
"ramrequired": "16",
"parameters": "13 billion",
"quant": "q4_0",
"type": "LLaMA",
"systemPrompt": " ",
"description": "<strong>Best overall model</strong><br><ul><li>Instruction based<li>Gives very long responses<li>Finetuned with only 1k of high-quality data<li>Trained by Microsoft and Peking University<li>Cannot be used commercially</ul",
"url": "https://gpt4all.io/models/gguf/wizardlm-13b-v1.2.Q4_0.gguf"
},
{
"order": "a",
"md5sum": "97463be739b50525df56d33b26b00852",
"name": "Mistral Instruct",
"filename": "mistral-7b-instruct-v0.1.Q4_0.gguf",
"filesize": "4108916384",
"requires": "2.5.0",
"ramrequired": "8",
"parameters": "7 billion",
"quant": "q4_0",
"type": "LLaMA",
"systemPrompt": " ",
"description": " ",
"url": "https://gpt4all.io/models/gguf/mistral-7b-instruct-v0.1.Q4_0.gguf"
},
{
"order": "a",
"md5sum": "48de9538c774188eb25a7e9ee024bbd3",
"name": "Mistral OpenOrca",
"filename": "mistral-7b-openorca.Q4_0.gguf",
"filesize": "4108927744",
"requires": "2.5.0",
"ramrequired": "8",
"parameters": "7 billion",
"quant": "q4_0",
"type": "LLaMA",
"systemPrompt": " ",
"description": " ",
"url": "https://gpt4all.io/models/gguf/mistral-7b-openorca.Q4_0.gguf"
},
{
"order": "b",
"md5sum": "31cb6d527bd3bfb5e73c2e9dfbc75033",
"name": "GPT4All Falcon",
"filename": "gpt4all-falcon-q4_0.gguf",
"filesize": "4210419040",
"requires": "2.5.0",
"ramrequired": "8",
"parameters": "7 billion",
"quant": "q4_0",
"type": "Falcon",
"systemPrompt": " ",
"description": "<strong>Best overall smaller model</strong><br><ul><li>Fast responses</li><li>Instruction based</li><li>Trained by TII<li>Finetuned by Nomic AI<li>Licensed for commercial use</ul>",
"url": "https://gpt4all.io/models/gguf/gpt4all-falcon-q4_0.gguf",
"promptTemplate": "### Instruction:\n%1\n### Response:\n"
},
{
"order": "c",
"md5sum": "3d12810391d04d1153b692626c0c6e16",
"name": "Hermes",
"filename": "nous-hermes-llama2-13b.Q4_0.gguf",
"filesize": "7366062080",
"requires": "2.5.0",
"ramrequired": "16",
"parameters": "13 billion",
"quant": "q4_0",
"type": "LLaMA",
"systemPrompt": " ",
"description": "<strong>Extremely good model</strong><br><ul><li>Instruction based<li>Gives long responses<li>Curated with 300,000 uncensored instructions<li>Trained by Nous Research<li>Cannot be used commercially</ul>",
"url": "https://gpt4all.io/models/gguf/nous-hermes-llama2-13b.Q4_0.gguf",
"promptTemplate": "### Instruction:\n%1\n### Response:\n"
},
{
"order": "f",
"md5sum": "40388eb2f8d16bb5d08c96fdfaac6b2c",
"name": "Snoozy",
"filename": "gpt4all-13b-snoozy-q4_0.gguf",
"filesize": "7365834624",
"requires": "2.5.0",
"ramrequired": "16",
"parameters": "13 billion",
"quant": "q4_0",
"type": "LLaMA",
"systemPrompt": " ",
"description": "<strong>Very good overall model</strong><br><ul><li>Instruction based<li>Based on the same dataset as Groovy<li>Slower than Groovy, with higher quality responses<li>Trained by Nomic AI<li>Cannot be used commercially</ul>",
"url": "https://gpt4all.io/models/gguf/gpt4all-13b-snoozy-q4_0.gguf"
},
{
"order": "g",
"md5sum": "f5bc6a52f72efd9128efb2eeed802c86",
"name": "MPT Chat",
"filename": "mpt-7b-chat-q4_0.gguf",
"filesize": "3911522272",
"requires": "2.5.0",
"ramrequired": "8",
"parameters": "7 billion",
"quant": "q4_0",
"type": "MPT",
"description": "<strong>Best overall smaller model</strong><br><ul><li>Fast responses<li>Chat based<li>Trained by Mosaic ML<li>Cannot be used commercially</ul>",
"url": "https://gpt4all.io/models/gguf/mpt-7b-chat-q4_0.gguf",
"promptTemplate": "<|im_start|>user\n%1<|im_end|><|im_start|>assistant\n",
"systemPrompt": "<|im_start|>system\n- You are a helpful assistant chatbot trained by MosaicML.\n- You answer questions.\n- You are excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.\n- You are more than just an information source, you are also able to write poetry, short stories, and make jokes.<|im_end|>"
},
{
"order": "i",
"md5sum": "aae346fe095e60139ca39b3fda4ac7ae",
"name": "Mini Orca (Small)",
"filename": "orca-mini-3b.q4_0.gguf",
"filesize": "1928648352",
"requires": "2.5.0",
"ramrequired": "4",
"parameters": "3 billion",
"quant": "q4_0",
"type": "OpenLLaMa",
"description": "<strong>Small version of new model with novel dataset</strong><br><ul><li>Instruction based<li>Explain tuned datasets<li>Orca Research Paper dataset construction approaches<li>Licensed for commercial use</ul>",
"url": "https://gpt4all.io/models/gguf/orca-mini-3b.q4_0.gguf",
"promptTemplate": "### User:\n%1\n### Response:\n",
"systemPrompt": "### System:\nYou are an AI assistant that follows instruction extremely well. Help as much as you can.\n\n"
},
{
"order": "s",
"md5sum": "51c627fac9062e208f9b386f105cbd48",
"disableGUI": "true",
"name": "Replit",
"filename": "replit-code-v1-3b-q4_0.gguf",
"filesize": "1532949760",
"requires": "2.5.0",
"ramrequired": "4",
"parameters": "3 billion",
"quant": "f16",
"type": "Replit",
"systemPrompt": " ",
"promptTemplate": "%1",
"description": "<strong>Trained on subset of the Stack</strong><br><ul><li>Code completion based<li>Licensed for commercial use</ul>",
"url": "https://gpt4all.io/models/gguf/replit-code-v1-3b-q4_0.gguf"
}
]
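
Each entry's `requires` field gates visibility to clients new enough for GGUF. Here is a sketch of how a consumer might filter the list against its own version; the comparison helper is an illustrative assumption, not the app's actual logic:

```python
import json

APP_VERSION = (2, 5, 0)  # matches the CMake bump in this commit

def version_tuple(s: str) -> tuple:
    # "2.5.0" -> (2, 5, 0)
    return tuple(int(part) for part in s.split("."))

with open("models2.json") as f:
    models = json.load(f)

# Keep only entries this client version can run.
usable = [m for m in models if version_tuple(m.get("requires", "0")) <= APP_VERSION]
for m in usable:
    print(m["name"], "->", m["url"])
```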

View File

@@ -834,12 +834,14 @@ void ModelList::updateModelsFromDirectory()
         processDirectory(localPath);
 }
 
+#define MODELS_VERSION 2
+
 void ModelList::updateModelsFromJson()
 {
 #if defined(USE_LOCAL_MODELSJSON)
-    QUrl jsonUrl("file://" + QDir::homePath() + "/dev/large_language_models/gpt4all/gpt4all-chat/metadata/models.json");
+    QUrl jsonUrl("file://" + QDir::homePath() + QString("/dev/large_language_models/gpt4all/gpt4all-chat/metadata/models%1.json").arg(MODELS_VERSION));
 #else
-    QUrl jsonUrl("http://gpt4all.io/models/models.json");
+    QUrl jsonUrl(QString("http://gpt4all.io/models/models%1.json").arg(MODELS_VERSION));
 #endif
     QNetworkRequest request(jsonUrl);
     QSslConfiguration conf = request.sslConfiguration();
@@ -881,9 +883,9 @@ void ModelList::updateModelsFromJsonAsync()
     emit asyncModelRequestOngoingChanged();
 
 #if defined(USE_LOCAL_MODELSJSON)
-    QUrl jsonUrl("file://" + QDir::homePath() + "/dev/large_language_models/gpt4all/gpt4all-chat/metadata/models.json");
+    QUrl jsonUrl("file://" + QDir::homePath() + QString("/dev/large_language_models/gpt4all/gpt4all-chat/metadata/models%1.json").arg(MODELS_VERSION));
 #else
-    QUrl jsonUrl("http://gpt4all.io/models/models.json");
+    QUrl jsonUrl(QString("http://gpt4all.io/models/models%1.json").arg(MODELS_VERSION));
 #endif
     QNetworkRequest request(jsonUrl);
     QSslConfiguration conf = request.sslConfiguration();
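
The macro keeps every fetch site on one list version: bumping `MODELS_VERSION` retargets both the local-file and remote URLs at once, while pre-GGUF clients continue to read the old unversioned models.json. The same pattern, sketched in Python (the suffix rule for version 1 is an assumption for illustration):

```python
import requests

MODELS_VERSION = 2  # mirrors the C++ macro introduced above

def models_url(version: int = MODELS_VERSION) -> str:
    # Version 1 kept the historical unversioned name, so older clients are unaffected.
    suffix = "" if version < 2 else str(version)
    return f"https://gpt4all.io/models/models{suffix}.json"

entries = requests.get(models_url(), timeout=30).json()
print(f"{len(entries)} GGUF-era models available")
```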

View File

@@ -47,7 +47,7 @@ MyDialog {
         Layout.fillHeight: true
         horizontalAlignment: Qt.AlignHCenter
         verticalAlignment: Qt.AlignVCenter
-        text: qsTr("Network error: could not retrieve http://gpt4all.io/models/models.json")
+        text: qsTr("Network error: could not retrieve http://gpt4all.io/models/models2.json")
         font.pixelSize: theme.fontSizeLarge
         color: theme.mutedTextColor
     }