GPT4All is optimized to run LLMs in the 3-13B parameter range on consumer-grade hardware.
LLMs are downloaded to your device so you can run them locally and privately. With our backend anyone can interact with LLMs efficiently and securely on their own hardware.
GPT4All connects you with LLMs from HuggingFace with a [`llama.cpp`](https://github.com/ggerganov/llama.cpp) backend so that they will run efficiently on your hardware. Many of these models can be identified by the file type `.gguf`.
![Explore models](../assets/search_mistral.png)
## Example Models
Many LLMs are available at various sizes, quantizations, and licenses.
- LLMs with more parameters tend to be better at coherently responding to instructions
- LLMs with a smaller quantization (e.g. 4bit instead of 16bit) are much faster and less memory intensive, and tend to have slightly worse performance
- Licenses vary in their terms for personal and commercial use
You can add your API key for remote model providers.
**Note**: this does not download a model file to your computer to use securely. Instead, this way of interacting with models has your prompts leave your computer to the API provider and returns the response to your computer.