Switch to new models2.json for new gguf release and bump our version to 2.5.0.
Adam Treat 2023-10-05 09:56:40 -04:00
parent 088afada49
commit ea66669cef
13 changed files with 167 additions and 22 deletions

View File

@@ -26,7 +26,7 @@ router = APIRouter(prefix="/engines", tags=["Search Endpoints"])
 async def list_engines():
     '''
     List all available GPT4All models from
-    https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models.json
+    https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models2.json
     '''
     raise NotImplementedError()
     return ListEnginesResponse(data=[])

View File

@@ -7,7 +7,7 @@ It is optimized to run 7-13B parameter LLMs on the CPU's of any computer running
 ## Running LLMs on CPU
 The GPT4All Chat UI supports models from all newer versions of `GGML`, `llama.cpp` including the `LLaMA`, `MPT`, `replit`, `GPT-J` and `falcon` architectures
-GPT4All maintains an official list of recommended models located in [models.json](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models.json). You can pull request new models to it and if accepted they will show up in the official download dialog.
+GPT4All maintains an official list of recommended models located in [models2.json](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models2.json). You can pull request new models to it and if accepted they will show up in the official download dialog.
 #### Sideloading any GGML model
 If a model is compatible with the gpt4all-backend, you can sideload it into GPT4All Chat by:

View File

@@ -61,12 +61,12 @@ or `allowDownload=true` (default), a model is automatically downloaded into `.ca
 unless it already exists.
 In case of connection issues or errors during the download, you might want to manually verify the model file's MD5
-checksum by comparing it with the one listed in [models.json].
+checksum by comparing it with the one listed in [models2.json].
 As an alternative to the basic downloader built into the bindings, you can choose to download from the
 <https://gpt4all.io/> website instead. Scroll down to 'Model Explorer' and pick your preferred model.
-[models.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models.json
+[models2.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models2.json
 #### I need the chat GUI and bindings to behave the same
@@ -93,7 +93,7 @@ The chat GUI and bindings are based on the same backend. You can make them behav
 - Next you'll have to compare the templates, adjusting them as necessary, based on how you're using the bindings.
 - Specifically, in Python:
   - With simple `generate()` calls, the input has to be surrounded with system and prompt templates.
-  - When using a chat session, it depends on whether the bindings are allowed to download [models.json]. If yes,
+  - When using a chat session, it depends on whether the bindings are allowed to download [models2.json]. If yes,
     and in the chat GUI the default templates are used, it'll be handled automatically. If no, use
     `chat_session()` template parameters to customize them.
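
The checksum advice above is easy to act on. Here is a minimal sketch (not part of this commit) that computes a downloaded model's MD5 and compares it against the matching entry in models2.json; the local cache path and filename are assumptions for illustration:

```python
import hashlib
from pathlib import Path

import requests

MODELS_URL = "https://gpt4all.io/models/models2.json"
# Assumed default bindings cache location and an example filename.
model_path = Path.home() / ".cache" / "gpt4all" / "mistral-7b-openorca.Q4_0.gguf"

# Hash the file in 1 MiB chunks so multi-GB .gguf files don't exhaust memory.
md5 = hashlib.md5()
with open(model_path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        md5.update(chunk)

# Look up the expected checksum for this filename in models2.json.
entries = requests.get(MODELS_URL, timeout=30).json()
expected = next(e["md5sum"] for e in entries if e["filename"] == model_path.name)

print("OK" if md5.hexdigest() == expected else "checksum mismatch")
```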

View File

@@ -8,7 +8,7 @@ import modal
 def download_model():
     import gpt4all
-    #you can use any model from https://gpt4all.io/models/models.json
+    #you can use any model from https://gpt4all.io/models/models2.json
     return gpt4all.GPT4All("ggml-gpt4all-j-v1.3-groovy.bin")
 image=modal.Image.debian_slim().pip_install("gpt4all").run_function(download_model)
@@ -31,4 +31,4 @@ def main():
     model = GPT4All()
     for i in range(10):
         model.generate.call()
-```
+```
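
For comparison, a minimal non-Modal sketch of the same download-and-generate flow with the Python binding; the model filename is an assumed example taken from the new models2.json:

```python
import gpt4all

# First use downloads the model (any entry from https://gpt4all.io/models/models2.json).
model = gpt4all.GPT4All("mistral-7b-openorca.Q4_0.gguf")
print(model.generate("Name three primary colors.", max_tokens=64))
```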

View File

@@ -77,10 +77,10 @@ When using GPT4All models in the `chat_session` context:
 - Consecutive chat exchanges are taken into account and not discarded until the session ends; as long as the model has capacity.
 - Internal K/V caches are preserved from previous conversation history, speeding up inference.
 - The model is given a system and prompt template which make it chatty. Depending on `allow_download=True` (default),
-  it will obtain the latest version of [models.json] from the repository, which contains specifically tailored templates
+  it will obtain the latest version of [models2.json] from the repository, which contains specifically tailored templates
   for models. Conversely, if it is not allowed to download, it falls back to default templates instead.
-[models.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models.json
+[models2.json]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models2.json
 ### Streaming Generations
@@ -379,7 +379,7 @@ logging infrastructure offers [many more customization options][py-logging-cookb
 ### Without Online Connectivity
 To prevent GPT4All from accessing online resources, instantiate it with `allow_download=False`. This will disable both
-downloading missing models and [models.json], which contains information about them. As a result, predefined templates
+downloading missing models and [models2.json], which contains information about them. As a result, predefined templates
 are used instead of model-specific system and prompt templates:
 === "GPT4All Default Templates Example"

View File

@@ -38,7 +38,7 @@ The GPT4All software ecosystem is compatible with the following Transformer arch
 - `MPT` (including `Replit`)
 - `GPT-J`
-You can find an exhaustive list of supported models on the [website](https://gpt4all.io) or in the [models directory](https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models.json)
+You can find an exhaustive list of supported models on the [website](https://gpt4all.io) or in the [models directory](https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models2.json)
 GPT4All models are artifacts produced through a process known as neural network quantization.

View File

@@ -108,12 +108,12 @@ class GPT4All:
     @staticmethod
     def list_models() -> List[ConfigType]:
         """
-        Fetch model list from https://gpt4all.io/models/models.json.
+        Fetch model list from https://gpt4all.io/models/models2.json.
         Returns:
             Model list in JSON format.
         """
-        return requests.get("https://gpt4all.io/models/models.json").json()
+        return requests.get("https://gpt4all.io/models/models2.json").json()
     @staticmethod
     def retrieve_model(
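
A quick usage sketch for the updated method; the printed fields follow the entry schema of the models2.json added in this commit:

```python
from gpt4all import GPT4All

# Fetches https://gpt4all.io/models/models2.json and lists each model.
for entry in GPT4All.list_models():
    print(entry["name"], entry["filename"], entry.get("ramrequired", "?") + " GB RAM")
```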

View File

@@ -21,7 +21,7 @@ const DEFAULT_MODEL_CONFIG = {
     promptTemplate: "### Human: \n%1\n### Assistant:\n",
 }
-const DEFAULT_MODEL_LIST_URL = "https://gpt4all.io/models/models.json";
+const DEFAULT_MODEL_LIST_URL = "https://gpt4all.io/models/models2.json";
 const DEFAULT_PROMPT_CONTEXT = {
     temp: 0.7,

View File

@@ -236,7 +236,7 @@ async function retrieveModel(modelName, options = {}) {
         file: retrieveOptions.modelConfigFile,
         url:
             retrieveOptions.allowDownload &&
-            "https://gpt4all.io/models/models.json",
+            "https://gpt4all.io/models/models2.json",
     });
     const loadedModelConfig = availableModels.find(

View File

@@ -17,8 +17,8 @@ if(APPLE)
 endif()
 set(APP_VERSION_MAJOR 2)
-set(APP_VERSION_MINOR 4)
-set(APP_VERSION_PATCH 20)
+set(APP_VERSION_MINOR 5)
+set(APP_VERSION_PATCH 0)
 set(APP_VERSION "${APP_VERSION_MAJOR}.${APP_VERSION_MINOR}.${APP_VERSION_PATCH}")
 # Include the binary directory for the generated header file

View File

@@ -0,0 +1,143 @@
[
{
"order": "a",
"md5sum": "5aff90007499bce5c64b1c0760c0b186",
"name": "Wizard v1.2",
"filename": "wizardlm-13b-v1.2.Q4_0.gguf",
"filesize": "7365834624",
"requires": "2.5.0",
"ramrequired": "16",
"parameters": "13 billion",
"quant": "q4_0",
"type": "LLaMA",
"systemPrompt": " ",
"description": "<strong>Best overall model</strong><br><ul><li>Instruction based<li>Gives very long responses<li>Finetuned with only 1k of high-quality data<li>Trained by Microsoft and Peking University<li>Cannot be used commercially</ul",
"url": "https://gpt4all.io/models/gguf/wizardlm-13b-v1.2.Q4_0.gguf"
},
{
"order": "a",
"md5sum": "97463be739b50525df56d33b26b00852",
"name": "Mistral Instruct",
"filename": "mistral-7b-instruct-v0.1.Q4_0.gguf",
"filesize": "4108916384",
"requires": "2.5.0",
"ramrequired": "8",
"parameters": "7 billion",
"quant": "q4_0",
"type": "LLaMA",
"systemPrompt": " ",
"description": " ",
"url": "https://gpt4all.io/models/gguf/mistral-7b-instruct-v0.1.Q4_0.gguf"
},
{
"order": "a",
"md5sum": "48de9538c774188eb25a7e9ee024bbd3",
"name": "Mistral OpenOrca",
"filename": "mistral-7b-openorca.Q4_0.gguf",
"filesize": "4108927744",
"requires": "2.5.0",
"ramrequired": "8",
"parameters": "7 billion",
"quant": "q4_0",
"type": "LLaMA",
"systemPrompt": " ",
"description": " ",
"url": "https://gpt4all.io/models/gguf/mistral-7b-openorca.Q4_0.gguf"
},
{
"order": "b",
"md5sum": "31cb6d527bd3bfb5e73c2e9dfbc75033",
"name": "GPT4All Falcon",
"filename": "gpt4all-falcon-q4_0.gguf",
"filesize": "4210419040",
"requires": "2.5.0",
"ramrequired": "8",
"parameters": "7 billion",
"quant": "q4_0",
"type": "Falcon",
"systemPrompt": " ",
"description": "<strong>Best overall smaller model</strong><br><ul><li>Fast responses</li><li>Instruction based</li><li>Trained by TII<li>Finetuned by Nomic AI<li>Licensed for commercial use</ul>",
"url": "https://gpt4all.io/models/gguf/gpt4all-falcon-q4_0.gguf",
"promptTemplate": "### Instruction:\n%1\n### Response:\n"
},
{
"order": "c",
"md5sum": "3d12810391d04d1153b692626c0c6e16",
"name": "Hermes",
"filename": "nous-hermes-llama2-13b.Q4_0.gguf",
"filesize": "7366062080",
"requires": "2.5.0",
"ramrequired": "16",
"parameters": "13 billion",
"quant": "q4_0",
"type": "LLaMA",
"systemPrompt": " ",
"description": "<strong>Extremely good model</strong><br><ul><li>Instruction based<li>Gives long responses<li>Curated with 300,000 uncensored instructions<li>Trained by Nous Research<li>Cannot be used commercially</ul>",
"url": "https://gpt4all.io/models/gguf/nous-hermes-llama2-13b.Q4_0.gguf",
"promptTemplate": "### Instruction:\n%1\n### Response:\n"
},
{
"order": "f",
"md5sum": "40388eb2f8d16bb5d08c96fdfaac6b2c",
"name": "Snoozy",
"filename": "gpt4all-13b-snoozy-q4_0.gguf",
"filesize": "7365834624",
"requires": "2.5.0",
"ramrequired": "16",
"parameters": "13 billion",
"quant": "q4_0",
"type": "LLaMA",
"systemPrompt": " ",
"description": "<strong>Very good overall model</strong><br><ul><li>Instruction based<li>Based on the same dataset as Groovy<li>Slower than Groovy, with higher quality responses<li>Trained by Nomic AI<li>Cannot be used commercially</ul>",
"url": "https://gpt4all.io/models/gguf/gpt4all-13b-snoozy-q4_0.gguf"
},
{
"order": "g",
"md5sum": "f5bc6a52f72efd9128efb2eeed802c86",
"name": "MPT Chat",
"filename": "mpt-7b-chat-q4_0.gguf",
"filesize": "3911522272",
"requires": "2.5.0",
"ramrequired": "8",
"parameters": "7 billion",
"quant": "q4_0",
"type": "MPT",
"description": "<strong>Best overall smaller model</strong><br><ul><li>Fast responses<li>Chat based<li>Trained by Mosaic ML<li>Cannot be used commercially</ul>",
"url": "https://gpt4all.io/models/gguf/mpt-7b-chat-q4_0.gguf",
"promptTemplate": "<|im_start|>user\n%1<|im_end|><|im_start|>assistant\n",
"systemPrompt": "<|im_start|>system\n- You are a helpful assistant chatbot trained by MosaicML.\n- You answer questions.\n- You are excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.\n- You are more than just an information source, you are also able to write poetry, short stories, and make jokes.<|im_end|>"
},
{
"order": "i",
"md5sum": "aae346fe095e60139ca39b3fda4ac7ae",
"name": "Mini Orca (Small)",
"filename": "orca-mini-3b.q4_0.gguf",
"filesize": "1928648352",
"requires": "2.5.0",
"ramrequired": "4",
"parameters": "3 billion",
"quant": "q4_0",
"type": "OpenLLaMa",
"description": "<strong>Small version of new model with novel dataset</strong><br><ul><li>Instruction based<li>Explain tuned datasets<li>Orca Research Paper dataset construction approaches<li>Licensed for commercial use</ul>",
"url": "https://gpt4all.io/models/gguf/orca-mini-3b.q4_0.gguf",
"promptTemplate": "### User:\n%1\n### Response:\n",
"systemPrompt": "### System:\nYou are an AI assistant that follows instruction extremely well. Help as much as you can.\n\n"
},
{
"order": "s",
"md5sum": "51c627fac9062e208f9b386f105cbd48",
"disableGUI": "true",
"name": "Replit",
"filename": "replit-code-v1-3b-q4_0.gguf",
"filesize": "1532949760",
"requires": "2.5.0",
"ramrequired": "4",
"parameters": "3 billion",
"quant": "f16",
"type": "Replit",
"systemPrompt": " ",
"promptTemplate": "%1",
"description": "<strong>Trained on subset of the Stack</strong><br><ul><li>Code completion based<li>Licensed for commercial use</ul>",
"url": "https://gpt4all.io/models/gguf/replit-code-v1-3b-q4_0.gguf"
}
]
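
Each entry's `requires` field gates visibility to clients new enough for GGUF. Here is a sketch of how a consumer might filter the list against its own version; the comparison helper is an illustrative assumption, not the app's actual logic:

```python
import json

APP_VERSION = (2, 5, 0)  # matches the CMake bump in this commit

def version_tuple(s: str) -> tuple:
    # "2.5.0" -> (2, 5, 0)
    return tuple(int(part) for part in s.split("."))

with open("models2.json") as f:
    models = json.load(f)

# Keep only entries this client version can run.
usable = [m for m in models if version_tuple(m.get("requires", "0")) <= APP_VERSION]
for m in usable:
    print(m["name"], "->", m["url"])
```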

View File

@@ -834,12 +834,14 @@ void ModelList::updateModelsFromDirectory()
         processDirectory(localPath);
 }
 
+#define MODELS_VERSION 2
+
 void ModelList::updateModelsFromJson()
 {
 #if defined(USE_LOCAL_MODELSJSON)
-    QUrl jsonUrl("file://" + QDir::homePath() + "/dev/large_language_models/gpt4all/gpt4all-chat/metadata/models.json");
+    QUrl jsonUrl("file://" + QDir::homePath() + QString("/dev/large_language_models/gpt4all/gpt4all-chat/metadata/models%1.json").arg(MODELS_VERSION));
 #else
-    QUrl jsonUrl("http://gpt4all.io/models/models.json");
+    QUrl jsonUrl(QString("http://gpt4all.io/models/models%1.json").arg(MODELS_VERSION));
 #endif
     QNetworkRequest request(jsonUrl);
     QSslConfiguration conf = request.sslConfiguration();
@@ -881,9 +883,9 @@ void ModelList::updateModelsFromJsonAsync()
     emit asyncModelRequestOngoingChanged();
 
 #if defined(USE_LOCAL_MODELSJSON)
-    QUrl jsonUrl("file://" + QDir::homePath() + "/dev/large_language_models/gpt4all/gpt4all-chat/metadata/models.json");
+    QUrl jsonUrl("file://" + QDir::homePath() + QString("/dev/large_language_models/gpt4all/gpt4all-chat/metadata/models%1.json").arg(MODELS_VERSION));
 #else
-    QUrl jsonUrl("http://gpt4all.io/models/models.json");
+    QUrl jsonUrl(QString("http://gpt4all.io/models/models%1.json").arg(MODELS_VERSION));
 #endif
     QNetworkRequest request(jsonUrl);
     QSslConfiguration conf = request.sslConfiguration();
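
The macro keeps every fetch site on one list version: bumping `MODELS_VERSION` retargets both the local-file and remote URLs at once, while pre-GGUF clients continue to read the old unversioned models.json. The same pattern, sketched in Python (the suffix rule for version 1 is an assumption for illustration):

```python
import requests

MODELS_VERSION = 2  # mirrors the C++ macro introduced above

def models_url(version: int = MODELS_VERSION) -> str:
    # Version 1 kept the historical unversioned name, so older clients are unaffected.
    suffix = "" if version < 2 else str(version)
    return f"https://gpt4all.io/models/models{suffix}.json"

entries = requests.get(models_url(), timeout=30).json()
print(f"{len(entries)} GGUF-era models available")
```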

View File

@@ -47,7 +47,7 @@ MyDialog {
         Layout.fillHeight: true
         horizontalAlignment: Qt.AlignHCenter
         verticalAlignment: Qt.AlignVCenter
-        text: qsTr("Network error: could not retrieve http://gpt4all.io/models/models.json")
+        text: qsTr("Network error: could not retrieve http://gpt4all.io/models/models2.json")
         font.pixelSize: theme.fontSizeLarge
         color: theme.mutedTextColor
     }