alpaca-lora

mirror of https://github.com/tloen/alpaca-lora.git synced 2024-10-01 01:05:56 -04:00

Author	SHA1	Message	Date
кѳѳsнī	55b664f46f	Enabling model parallelism (training 30b on 2x 3090s and beyond) (#131) * override broken data parallelism with model parallelism * formatting * formatting, again --------- Co-authored-by: Eric Wang <eric.james.wang@gmail.com>	2023-03-28 11:48:47 -04:00
Eric Wang	3b79ea4029	256 -> 512 -> 256	2023-03-28 08:34:36 -07:00
Eric Wang	804d22ad43	remove asserts	2023-03-28 08:33:47 -07:00
Angainor Development	69b9d9ea8b	Fix a warning (#186) Avoids the "Overriding torch_dtype=None with `torch_dtype=torch.float16` due to requirements of `bitsandbytes` to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning." warning	2023-03-27 15:13:35 -04:00
Eric J. Wang	dbd04f3560	Fix linters (#185) * install isort * isort . * whoops * fix black	2023-03-27 14:34:23 -04:00
NanoCode012	69b31e0fed	Feat: Add wandb (#168) * Add wandb * Fix KeyError * Add WANDB_WATCH and WANDB_LOG_MODEL * run_name -> wandb_run_name * , * fix TrainingArgs --------- Co-authored-by: Eric J. Wang <eric.james.wang@gmail.com>	2023-03-27 13:51:36 -04:00
Eric J. Wang	95b30a256c	Fix lint.yml	2023-03-27 13:48:44 -04:00
claysauruswrecks	1310547f9f	Add HF dataset loading, add linters, pyproject.toml (#175) * add HF dataset loading, add linters, pyproject.toml - applied markdownlint - add black, black[jupyter], isort - fix noqa codes - add .github workflow linting - update README.md * restore default settings * resume_from_checkpoint Co-authored-by: AngainorDev <54739135+AngainorDev@users.noreply.github.com> * Print warning on checkpoint not found * add HF dataset loading, add linters, pyproject.toml - applied markdownlint - add black, black[jupyter], isort - fix noqa codes - add .github workflow linting - update README.md * Default to local copy and update it * Typo * Remove duplicate code block --------- Co-authored-by: Eric Wang <eric.james.wang@gmail.com> Co-authored-by: AngainorDev <54739135+AngainorDev@users.noreply.github.com>	2023-03-27 13:31:44 -04:00
Xie Zejian	b00629d773	Add Chinese 13b lora link (#178)	2023-03-27 12:09:41 -04:00
Angainor Development	9d6b822019	Avoid a deprecation warning (#181) Removes the warning: `FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead`	2023-03-27 12:06:44 -04:00
Eric Wang	683810b4a1	Print warning on checkpoint not found	2023-03-26 17:25:15 -07:00
Eric Wang	da6b427a08	resume_from_checkpoint Co-authored-by: AngainorDev <54739135+AngainorDev@users.noreply.github.com>	2023-03-26 17:17:54 -07:00
Eric Wang	b948f892ba	restore default settings	2023-03-26 17:10:05 -07:00
Eric J. Wang	d358124af6	Add dotslash to example data_path	2023-03-24 17:19:57 -04:00
Eric J. Wang	5fa807d106	Use CLI arguments (#159) * CLI args for finetune * Update README * CLI args for generate.py * reqs.txt * reorder hyperparams * lora_target_modules * cleanup	2023-03-24 17:18:42 -04:00
Andrea Santilli	e2f07029aa	Add Italian 7b model to readme (#156)	2023-03-24 15:47:45 -04:00
Eric J. Wang	af30df1999	Unified tokenizer update PR (#146) * Improve tokenization The PR changes a few things related to tokenization: - Sets the padding to the left, which is required if you want to run batched inference with decoder models. - Pads to the maximum length in each batch ensuring multiple of 8 length (tensor cores like multiple of 8) instead of CUTOFF_LEN. This should make the training faster as less tokens are fed into the model when it is not required (~10% faster in my experiments). To correctly implement this change I need to manually append the eos token (if the input is not truncated) so I have deleted "add_eos" token from the Tokenizer load function. - Returns the labels in the tokenize function since some users seem to prefer it this way. This requires using the DataCollatorForSeq2Seq for padding the labels as well as input ids. Behavior of both DataCollators is the same if mlm=False. I can revert to DataCollatorForLanguageModeling if preferred. * Experimental dynamic batching * mask out user prompt, again * Add options * Remove BASE_MODEL again * Small optimization * Final tweaks --------- Co-authored-by: Iker García-Ferrero <i.garciaferrerosanpelayo@gmail.com> Co-authored-by: Sebastiaan <751205+SharkWipf@users.noreply.github.com>	2023-03-24 15:46:55 -04:00
Martin Thissen	d3760cd84a	Added fine-tuned7b model for German language (#134) Co-authored-by: Martin Thissen <> Co-authored-by: Eric J. Wang <eric.james.wang@gmail.com>	2023-03-23 16:23:51 -07:00
Thaweewat	6853b8802e	Add Thai weight URL on READ.ME (#132) Fine Tuned Alpaca LoRa with Thai Q&A datasets (Standford Alpcaca translated, WikiQA, Pantip) https://huggingface.co/Thaweewat/thai-buffala-lora-7b-v0-1 Co-authored-by: Eric J. Wang <eric.james.wang@gmail.com>	2023-03-23 16:22:08 -07:00
Eric Wang	fcdb143f1f	Amend README	2023-03-23 13:59:31 -07:00
Eric Wang	72aabcb5a4	Remove LLaMA download code, as a precaution	2023-03-23 13:54:39 -07:00
Eric Wang	8955a9c5a1	bos, eos in generate.py	2023-03-23 13:44:45 -07:00
Eric J. Wang	1384a4d24c	Update README.md for multi-GPU training	2023-03-22 22:05:36 -07:00
bofeng huang	c7eabb86e2	Add french version "vigogne" (#127)	2023-03-22 15:59:14 -07:00
Eric J. Wang	a74793c571	Rearrange resources on README, add 13B-30B models	2023-03-22 14:17:31 -07:00
Eric Wang	b12c3b90f8	Unwind input masking to avoid confusion	2023-03-22 13:52:27 -07:00
Eric Wang	e04897baae	fix fp16 inference	2023-03-21 14:31:30 -07:00
Eric J. Wang	052da42cbb	Replace Colab with HF in README	2023-03-21 13:59:44 -07:00
Eric Wang	7fb06c6c22	Revert "Mask out prompt tokens for real" This reverts commit `4a712d4d8e`.	2023-03-21 12:42:06 -07:00
Eric Wang	2204a71505	set EPOCHS back to 3	2023-03-21 11:52:28 -07:00
Eric Wang	4a712d4d8e	Mask out prompt tokens for real	2023-03-21 11:24:38 -07:00
Eric Wang	fac53721a2	masking bugfix	2023-03-20 21:37:39 -07:00
Eric J. Wang	3cdbfe5b0c	Update README.md	2023-03-20 14:32:55 -07:00
Eric J. Wang	c08c34eabb	mention chatbot project in README.md	2023-03-20 14:26:56 -07:00
Eric J. Wang	f0082d8e8b	Link to resources more prominently	2023-03-20 11:30:42 -07:00
Eric J. Wang	d38802e843	Point volunteers to Open Assistant	2023-03-20 10:52:39 -07:00
Kohaku-Blueleaf	b5a1a0bca7	Add support for valid set size 0 (#83) * Add support for valid set size 0 * Make param about valid to default when 0	2023-03-19 22:02:14 -07:00
Kohaku-Blueleaf	0af44f0262	Add option for output dir (#84)	2023-03-19 22:01:24 -07:00
Kohaku-Blueleaf	450206caaf	Fix torch.compile call on windows (#81) * Windows not support compile * Fix code style	2023-03-19 20:16:02 -07:00
Karun	81eb72f707	cleans up alphabetical prompts (#76)	2023-03-19 15:55:02 -07:00
Eric Wang	997f6cd81f	slider for tokens generated	2023-03-19 15:53:21 -07:00
Eric Wang	cfad895aa1	mask prompt in loss	2023-03-19 15:53:21 -07:00
Eric J. Wang	d66908c0ca	Remove messy test code	2023-03-19 11:22:02 -07:00
Yaqub Mahmoud	0e752ea5f3	Update requirements.txt (#67) Added appdirs package to requirements.txt	2023-03-19 11:15:07 -07:00
Eric Wang	c83e30ab78	generate.py tweaks	2023-03-18 23:00:18 -07:00
Eric Wang	80fd9833db	don't share publicly	2023-03-18 16:43:53 -07:00
Eric Wang	6ced8d9907	fix HF export script	2023-03-18 16:42:58 -07:00
Eric J. Wang	8dc0f614c6	Update README.md	2023-03-18 13:24:42 -07:00
Eric J. Wang	d9c19ff34e	Update README.md	2023-03-17 22:27:58 -07:00
Kakigōri Maker	9dab7ba438	add multi-gpu support (ddp) (#54) * add multi-gpu support (ddp) * Update finetune.py	2023-03-17 22:27:33 -07:00

113 Commits
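
Note on commit af30df1999: its message describes appending the eos token only when an input is not truncated, padding on the left, and padding each batch to its longest sequence rounded up to a multiple of 8 rather than to CUTOFF_LEN. The sketch below illustrates that idea in plain Python. It is not the repository's code: the helper names (`append_eos`, `round_up`, `pad_batch_left`), the token ids, and the use of -100 as the label padding value (the default `label_pad_token_id` of `DataCollatorForSeq2Seq`) are assumptions made for illustration.

```python
from typing import Dict, List


def append_eos(ids: List[int], eos_id: int, cutoff_len: int) -> List[int]:
    """Truncate to cutoff_len; append eos only if nothing was cut off (hypothetical helper)."""
    truncated = len(ids) > cutoff_len
    ids = ids[:cutoff_len]
    if not truncated and len(ids) < cutoff_len:
        ids = ids + [eos_id]
    return ids


def round_up(n: int, multiple: int) -> int:
    """Smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple


def pad_batch_left(
    batch: List[List[int]], pad_id: int, multiple: int = 8
) -> Dict[str, List[List[int]]]:
    """Left-pad every sequence to the batch maximum, rounded up to `multiple`."""
    target = round_up(max(len(seq) for seq in batch), multiple)
    out: Dict[str, List[List[int]]] = {"input_ids": [], "attention_mask": [], "labels": []}
    for seq in batch:
        pad = target - len(seq)
        out["input_ids"].append([pad_id] * pad + seq)
        out["attention_mask"].append([0] * pad + [1] * len(seq))
        # Assumption: -100 marks label positions that the loss should ignore.
        out["labels"].append([-100] * pad + seq)
    return out


# Illustrative token ids only; eos_id=2 and pad_id=0 are assumptions.
examples = [[1, 5, 6], [1, 5, 6, 7, 8, 9, 10, 11, 12]]
batch = pad_batch_left([append_eos(e, eos_id=2, cutoff_len=256) for e in examples], pad_id=0)
```

With these inputs the longest sequence has 10 tokens after the eos is appended, so both rows are padded to 16 rather than to the cutoff of 256, which is the saving the commit message refers to.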