From 412e7a6a961622ccdf84ec28a396d92356b9b66a Mon Sep 17 00:00:00 2001
From: jllllll <3887729+jllllll@users.noreply.github.com>
Date: Wed, 31 May 2023 09:07:56 -0500
Subject: [PATCH] Update README.md to include missing flags (#2449)

---
 README.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/README.md b/README.md
index e94c019a..22134c83 100644
--- a/README.md
+++ b/README.md
@@ -266,6 +266,13 @@ Optionally, you can use the following command-line flags:
 | `--warmup_autotune` | (triton) Enable warmup autotune. |
 | `--fused_mlp` | (triton) Enable fused mlp. |
 
+#### AutoGPTQ
+
+| Flag | Description |
+|------------------|-------------|
+| `--autogptq` | Use AutoGPTQ for loading quantized models instead of the internal GPTQ loader. |
+| `--triton` | Use triton. |
+
 #### FlexGen
 
 | Flag | Description |
@@ -308,6 +315,8 @@ Optionally, you can use the following command-line flags:
 |---------------------------------------|-------------|
 | `--api` | Enable the API extension. |
 | `--public-api` | Create a public URL for the API using Cloudfare. |
+| `--api-blocking-port BLOCKING_PORT` | The listening port for the blocking API. |
+| `--api-streaming-port STREAMING_PORT` | The listening port for the streaming API. |
 
 #### Multimodal