commit 38fade9806
Author: Xuechen Li
Date: 2023-03-15 11:49:54 -07:00


@@ -166,6 +166,7 @@ torchrun --nproc_per_node=4 --master_port=<your_random_port> train.py \
```
Note that the given training script is meant to be simple and easy to use, and is not particularly optimized.
To run on more GPUs, you may prefer to turn down `gradient_accumulation_steps` to keep a global batch size of 128. The global batch size has not been tested for optimality.
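A minimal sketch of that adjustment, assuming `train.py` follows the standard HuggingFace `TrainingArguments` flag names and that the 4-GPU example above uses `--per_device_train_batch_size 4` with `--gradient_accumulation_steps 8` (so 4 × 8 × 4 = 128); the exact values in your setup may differ:

```
# Sketch: moving from 4 to 8 GPUs while keeping the global batch size at 128.
# Global batch size = per-device batch size x gradient_accumulation_steps x number of GPUs,
# so doubling the GPU count means halving gradient_accumulation_steps: 4 x 4 x 8 = 128.
torchrun --nproc_per_node=8 --master_port=<your_random_port> train.py \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4
    # remaining flags stay the same as in the 4-GPU example above
```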
### Authors
All grad students below contributed equally, and the order was determined by random draw.