Commit Graph

48 Commits

Author SHA1 Message Date
Lily
630d1146c8
Update export_hf_checkpoint.py (#302)
* Update export_hf_checkpoint.py

* Update finetune.py

New tokenizer base model for the current dev branch of transformers

* Update generate.py

* Update export_state_dict_checkpoint.py

* Update export_hf_checkpoint.py
2023-04-09 14:07:59 -07:00
Angainor Development
8d58d37b65
Templated prompter (#184)
* Templated prompter

* fix dup import

* Set Verbose False by default

I forgot to disable after testing.

* Fix imports order

* Use Black Formatting

* lint

* Re-introduce lost line

* Cleanup

* template default

* isort

---------

Co-authored-by: Eric Wang <eric.james.wang@gmail.com>
2023-03-29 19:36:04 -04:00
Angainor Development
fcbc45e4c0
Print only on Rank 0 (#187)
* Print only on Rank 0

When training on multiple GPUs, the settings are printed once per GPU.
This change prints them only from rank 0.

See https://github.com/tloen/alpaca-lora/issues/182#issuecomment-1485550636
for a sample output.

Could apply to a few other prints further down as well.

* Typo

* Added failsafe

So this works whether or not LOCAL_RANK is defined.
2023-03-29 19:25:17 -04:00
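
The rank-0 guard described in #187 can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper names `is_rank_zero` and `print_rank_zero` are hypothetical, and the only behavior taken from the commit message is that the check has to work whether or not LOCAL_RANK is defined.

```python
import os


def is_rank_zero(env=None):
    """Return True when this process should do the printing.

    Hypothetical helper: an unset LOCAL_RANK is treated as rank 0, so a
    plain single-process run still prints.
    """
    env = os.environ if env is None else env
    return int(env.get("LOCAL_RANK", 0)) == 0


def print_rank_zero(*args, env=None, **kwargs):
    """Forward to print() only on rank 0; other ranks stay silent."""
    if is_rank_zero(env):
        print(*args, **kwargs)


if __name__ == "__main__":
    # With several processes, each one sees its own LOCAL_RANK value,
    # so only one of them emits this line.
    print_rank_zero("Training with the following settings: ...")
```

Passing `env` explicitly is only there to make the helper easy to check without touching the real process environment.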
Angainor Development
c59d5672b0
Add jsonl support (#212)
Handled by default with the same "json" type; the library auto-detects the precise format.
2023-03-29 12:22:19 -04:00
Gene Ruebsamen
28eb8cac3c
Default dataset to cleaned alpaca dataset from HF (#202) 2023-03-28 16:52:47 -04:00
кѳѳsнī
55b664f46f
Enabling model parallelism (training 30b on 2x 3090s and beyond) (#131)
* override broken data parallelism with model parallelism

* formatting

* formatting, again

---------

Co-authored-by: Eric Wang <eric.james.wang@gmail.com>
2023-03-28 11:48:47 -04:00
Eric Wang
3b79ea4029 256 -> 512 -> 256 2023-03-28 08:34:36 -07:00
Eric Wang
804d22ad43 remove asserts 2023-03-28 08:33:47 -07:00
Angainor Development
69b9d9ea8b
Fix a warning (#186)
Avoids the warning:
"Overriding torch_dtype=None with `torch_dtype=torch.float16` due to requirements of `bitsandbytes` to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning."
2023-03-27 15:13:35 -04:00
Eric J. Wang
dbd04f3560
Fix linters (#185)
* install isort

* isort .

* whoops

* fix black
2023-03-27 14:34:23 -04:00
NanoCode012
69b31e0fed
Feat: Add wandb (#168)
* Add wandb

* Fix KeyError

* Add WANDB_WATCH and WANDB_LOG_MODEL

* run_name -> wandb_run_name

* ,

* fix TrainingArgs

---------

Co-authored-by: Eric J. Wang <eric.james.wang@gmail.com>
2023-03-27 13:51:36 -04:00
claysauruswrecks
1310547f9f
Add HF dataset loading, add linters, pyproject.toml (#175)
* add HF dataset loading, add linters, pyproject.toml

- applied markdownlint
- add black, black[jupyter], isort
- fix noqa codes
- add .github workflow linting
- update README.md

* restore default settings

* resume_from_checkpoint

Co-authored-by: AngainorDev <54739135+AngainorDev@users.noreply.github.com>

* Print warning on checkpoint not found

* add HF dataset loading, add linters, pyproject.toml

- applied markdownlint
- add black, black[jupyter], isort
- fix noqa codes
- add .github workflow linting
- update README.md

* Default to local copy and update it

* Typo

* Remove duplicate code block

---------

Co-authored-by: Eric Wang <eric.james.wang@gmail.com>
Co-authored-by: AngainorDev <54739135+AngainorDev@users.noreply.github.com>
2023-03-27 13:31:44 -04:00
Angainor Development
9d6b822019
Avoid a deprecation warning (#181)
Removes the warning:
`FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead`
2023-03-27 12:06:44 -04:00
Eric Wang
683810b4a1 Print warning on checkpoint not found 2023-03-26 17:25:15 -07:00
Eric Wang
da6b427a08 resume_from_checkpoint
Co-authored-by: AngainorDev <54739135+AngainorDev@users.noreply.github.com>
2023-03-26 17:17:54 -07:00
Eric Wang
b948f892ba restore default settings 2023-03-26 17:10:05 -07:00
Eric J. Wang
5fa807d106
Use CLI arguments (#159)
* CLI args for finetune

* Update README

* CLI args for generate.py

* reqs.txt

* reorder hyperparams

* lora_target_modules

* cleanup
2023-03-24 17:18:42 -04:00
Eric J. Wang
af30df1999
Unified tokenizer update PR (#146)
* Improve tokenization

The PR changes a few things related to tokenization:

- Sets the padding to the left, which is required if you want to run batched inference with decoder models.

- Pads to the maximum length in each batch, rounded up to a multiple of 8 (tensor cores prefer multiples of 8), instead of CUTOFF_LEN. This should make training faster, as fewer tokens are fed into the model when they are not required (~10% faster in my experiments). To implement this change correctly, I need to manually append the eos token (if the input is not truncated), so I have removed "add_eos" from the Tokenizer load function.

- Returns the labels in the tokenize function since some users seem to prefer it this way. This requires using the DataCollatorForSeq2Seq for padding the labels as well as input ids. Behavior of both DataCollators is the same if mlm=False. I can revert to DataCollatorForLanguageModeling if preferred.

* Experimental dynamic batching

* mask out user prompt, again

* Add options

* Remove BASE_MODEL again

* Small optimization

* Final tweaks

---------

Co-authored-by: Iker García-Ferrero <i.garciaferrerosanpelayo@gmail.com>
Co-authored-by: Sebastiaan <751205+SharkWipf@users.noreply.github.com>
2023-03-24 15:46:55 -04:00
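
The tokenization changes listed in #146 can be illustrated with plain Python lists. This is a sketch under stated assumptions rather than the code from the PR: the function names `append_eos` and `pad_left_to_multiple`, the token ids, and the pad id are all hypothetical, and real training code would rely on a tokenizer and a data collator instead.

```python
def append_eos(token_ids, eos_id, cutoff_len):
    """Truncate to cutoff_len and append EOS only if there is room.

    A truncated sequence already fills cutoff_len, so it gets no EOS,
    matching the "if the input is not truncated" condition above.
    """
    ids = list(token_ids[:cutoff_len])
    if len(ids) < cutoff_len and (not ids or ids[-1] != eos_id):
        ids.append(eos_id)
    return ids


def pad_left_to_multiple(batch, pad_id, multiple=8):
    """Left-pad every sequence to the batch maximum, rounded up to `multiple`.

    Returns the padded ids and a matching attention mask (0 = padding).
    """
    longest = max(len(ids) for ids in batch)
    target = -(-longest // multiple) * multiple  # ceiling to a multiple
    padded = [[pad_id] * (target - len(ids)) + list(ids) for ids in batch]
    mask = [[0] * (target - len(ids)) + [1] * len(ids) for ids in batch]
    return padded, mask


if __name__ == "__main__":
    eos_id, pad_id, cutoff_len = 2, 0, 256  # illustrative values only
    batch = [append_eos(seq, eos_id, cutoff_len) for seq in ([5, 6, 7], [9] * 9)]
    padded, mask = pad_left_to_multiple(batch, pad_id)
    print(len(padded[0]), len(padded[1]))
```

Padding to the batch maximum instead of a fixed cutoff is what reduces the number of tokens fed to the model. In Hugging Face transformers, `DataCollatorForSeq2Seq` accepts a `pad_to_multiple_of` argument and pads the labels along with the input ids, which is the role the commit message assigns to it.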
Eric Wang
72aabcb5a4 Remove LLaMA download code, as a precaution 2023-03-23 13:54:39 -07:00
Eric Wang
b12c3b90f8 Unwind input masking to avoid confusion 2023-03-22 13:52:27 -07:00
Eric Wang
7fb06c6c22 Revert "Mask out prompt tokens for real"
This reverts commit 4a712d4d8e.
2023-03-21 12:42:06 -07:00
Eric Wang
2204a71505 set EPOCHS back to 3 2023-03-21 11:52:28 -07:00
Eric Wang
4a712d4d8e Mask out prompt tokens for real 2023-03-21 11:24:38 -07:00
Eric Wang
fac53721a2 masking bugfix 2023-03-20 21:37:39 -07:00
Kohaku-Blueleaf
b5a1a0bca7
Add support for valid set size 0 (#83)
* Add support for valid set size 0

* Make param about valid to default when 0
2023-03-19 22:02:14 -07:00
Kohaku-Blueleaf
0af44f0262
Add option for output dir (#84) 2023-03-19 22:01:24 -07:00
Kohaku-Blueleaf
450206caaf
Fix torch.compile call on windows (#81)
* Windows does not support compile

* Fix code style
2023-03-19 20:16:02 -07:00
Eric Wang
cfad895aa1 mask prompt in loss 2023-03-19 15:53:21 -07:00
Kakigōri Maker
9dab7ba438
add multi-gpu support (ddp) (#54)
* add multi-gpu support (ddp)

* Update finetune.py
2023-03-17 22:27:33 -07:00
Eric Wang
f7044049ab dataset cleaning, visualizations 2023-03-17 15:04:25 -07:00
Eric Wang
35029da078 Validation set 2023-03-16 15:05:17 -07:00
Eric Wang
5f6614e6fc Catch outdated installs 2023-03-16 12:11:47 -07:00
andreas.echavez
1862976b33 Update alpaca-lora to use transformers main branch 2023-03-16 12:11:29 -07:00
Eric Wang
2fa1c66388 repair tokenization logic, again 2023-03-15 23:58:44 -07:00
Eric Wang
024dde7dab Revert "fix <eos> tokenization"
This reverts commit 6b69ea8665.
2023-03-15 22:52:54 -07:00
Eric Wang
6b69ea8665 fix <eos> tokenization 2023-03-15 18:21:06 -07:00
Eric Wang
a2607faff0 fix finetuning code :( 2023-03-14 21:45:12 -07:00
Eric Wang
d714a73e8c Update README.md with new checkpoint details 2023-03-14 21:33:12 -07:00
Eric Wang
ec98533876 Update README.md; clean up hyperparameters 2023-03-14 16:30:38 -07:00
Eric Wang
46ddd2ca85 Ready to go 2023-03-14 15:10:33 -07:00
Eric Wang
648af26073 update hyperparams 2023-03-14 08:51:30 -07:00
Eric Wang
5cd474bcc0 lr=2e-5 2023-03-14 08:47:49 -07:00
Jan Malte Lichtenberg
a3b80fdbd5 Fix bug in generate prompt using 'instruction' instead of 'input' 2023-03-14 15:14:37 +01:00
Eric Wang
41e0ff6c78 tokenizer changes 2023-03-13 21:53:19 -07:00
Eric Wang
df2a5dc4be cleanup notebooks 2023-03-13 17:33:27 -07:00
Eric Wang
357ec81a17 decapoda 2023-03-13 17:23:29 -07:00
Eric Wang
63121244c8 Licenses and whatnot 2023-03-13 15:00:05 -07:00
Eric Wang
26f64780ad initial commit 2023-03-13 14:34:26 -07:00