# GPT4All
Welcome to the GPT4All technical documentation.

GPT4All is an open-source software ecosystem that allows anyone to train and deploy **powerful** and **customized** large language models on **everyday hardware**.
Nomic AI oversees contributions to the open-source ecosystem, ensuring quality, security and maintainability.

GPT4All software is optimized to run inference of 7-13 billion parameter large language models on the CPUs of laptops, desktops and servers.

=== "GPT4All Example"
    ``` py
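    from gpt4all import GPT4All

    # Load the model by file name; by default it is downloaded on first use if not found locally.
    model = GPT4All("orca-mini-3b.ggmlv3.q4_0.bin")
    # max_tokens caps the length of the generated completion.
    output = model.generate("The capital of France is ", max_tokens=3)
    print(output)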
    ```
=== "Output"
    ```
    1. Paris
    ```
See [Python Bindings](gpt4all_python.md) to use GPT4All.

### Navigating the Documentation
To ensure compatibility across operating systems and programming languages, the [GPT4All software ecosystem](https://github.com/nomic-ai/gpt4all)
is organized as a monorepo with the following structure:

- **gpt4all-backend**: The GPT4All backend maintains and exposes a universal, performance-optimized C API for running inference with multi-billion parameter transformer decoders. This C API can then be bound to higher-level programming languages such as C++, Python and Go.
- **gpt4all-bindings**: GPT4All bindings wrap the C API for a variety of high-level programming languages. Each directory contains the bindings for one language. The [CLI](gpt4all_cli.md) is included here as well.
- **gpt4all-api**: The GPT4All API (under initial development) exposes REST API endpoints for gathering completions and embeddings from large language models.
- **gpt4all-chat**: GPT4All Chat is an OS-native chat application that runs on macOS, Windows and Ubuntu. It is the easiest way to run local, privacy-aware chat assistants on everyday hardware. You can download it from the [GPT4All Website](https://gpt4all.io) and read its source code in the monorepo.

Explore detailed documentation for the backend, bindings and chat client in the sidebar.
## Models
The GPT4All software ecosystem is compatible with the following Transformer architectures:

- `Falcon`
- `LLaMA` (including `OpenLLaMA`)
- `MPT` (including `Replit`)
- `GPTJ`

You can find an exhaustive list of supported models on the [website](https://gpt4all.io) or in the [models directory](https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models.json).


GPT4All models are artifacts produced through a process known as neural network quantization.
A multi-billion parameter transformer decoder usually takes 30+ GB of VRAM to execute a forward pass.
Most people do not have such a powerful computer or access to GPU hardware. Quantization stores the weights of a trained LLM at lower numerical precision,
so GPT4All models can run on your laptop using only 4-8 GB of RAM, which makes widespread use practical.
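
As a rough illustration of why quantization helps, the memory needed just to hold the weights is approximately the parameter count times the bits per weight, divided by 8. The sketch below applies that arithmetic to the 7 and 13 billion parameter sizes mentioned above. The helper `weight_memory_gb` is a hypothetical name for this example, not part of GPT4All, and the flat 16-bit and 4-bit widths are simplifying assumptions: real quantized formats store extra scaling data, and inference needs additional memory beyond the weights.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10**9 bytes).

    Hypothetical helper for illustration; it ignores activations, the
    attention cache, and per-block scaling data used by real formats.
    """
    return n_params * bits_per_weight / 8 / 1e9


for n_params in (7e9, 13e9):
    fp16 = weight_memory_gb(n_params, 16)  # unquantized half precision
    q4 = weight_memory_gb(n_params, 4)     # idealized 4-bit quantization
    print(f"{n_params / 1e9:.0f}B parameters: {fp16:.1f} GB at 16-bit, {q4:.1f} GB at 4-bit")
```

Under these assumptions, a 7 billion parameter model drops from about 14 GB to about 3.5 GB, and a 13 billion parameter model from about 26 GB to about 6.5 GB, which is in line with the 4-8 GB range quoted above.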

Any model trained with one of these architectures can be quantized and run locally with all GPT4All bindings and in the
chat client. You can add support for new variants by contributing to the gpt4all-backend.

## Frequently Asked Questions
Find answers to frequently asked questions by searching the [GitHub issues](https://github.com/nomic-ai/gpt4all/issues) or in the [documentation FAQ](gpt4all_faq.md).

## Getting the most out of your local LLM

**Inference Speed**
of a local LLM depends on two factors: model size and the number of tokens given as input.
Prompting a local LLM with large chunks of context is not advised, because inference speed degrades heavily as the input grows.
You will likely want to run GPT4All models on a GPU if you need context windows larger than 750 tokens. Native GPU support for GPT4All models is planned.
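
One simple way to keep prompts short on CPU is to trim the context before sending it to the model. The following is a minimal sketch, not part of the GPT4All API: `truncate_context` is a hypothetical helper, it counts whitespace-separated words as a rough stand-in for tokens (real tokenizers usually produce more tokens than words), and the default budget of 500 words is an illustrative assumption chosen to stay below the 750-token guideline above.

```python
def truncate_context(context: str, max_words: int = 500) -> str:
    """Keep only the last `max_words` whitespace-separated words.

    Hypothetical helper: words are only a rough proxy for tokens, and
    the default budget is an illustrative assumption.
    """
    words = context.split()
    if len(words) <= max_words:
        # Short enough already; return the text unchanged.
        return context
    # Keep the tail, on the assumption that recent context matters most.
    return " ".join(words[-max_words:])


long_context = "word " * 2000
prompt = truncate_context(long_context) + "\nSummarize the text above."
print(len(prompt.split()))
```

Keeping the end of the context rather than the beginning is a design choice that suits chat-style prompts; for other uses, a summary or a retrieval step may preserve more of the relevant information.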

**Inference Performance**
Which model is best? The answer depends on your use case. The ability of an LLM to faithfully follow instructions is conditioned
on the quantity and diversity of the pre-training data it was trained on and the diversity, quality and factuality of the data the LLM
was fine-tuned on. A goal of GPT4All is to bring the most powerful local assistant model to your desktop, and Nomic AI is actively
working to improve the performance and quality of GPT4All models.