alpaca-lora

mirror of https://github.com/tloen/alpaca-lora.git synced 2024-10-01 01:05:56 -04:00

Author	SHA1	Message	Date
Eric Wang	683810b4a1	Print warning on checkpoint not found	2023-03-26 17:25:15 -07:00
Eric Wang	da6b427a08	resume_from_checkpoint Co-authored-by: AngainorDev <54739135+AngainorDev@users.noreply.github.com>	2023-03-26 17:17:54 -07:00
Eric Wang	b948f892ba	restore default settings	2023-03-26 17:10:05 -07:00
Eric J. Wang	5fa807d106	Use CLI arguments (#159) * CLI args for finetune * Update README * CLI args for generate.py * reqs.txt * reorder hyperparams * lora_target_modules * cleanup	2023-03-24 17:18:42 -04:00
Eric J. Wang	af30df1999	Unified tokenizer update PR (#146) * Improve tokenization The PR changes a few things related to tokenization: - Sets the padding to the left, which is required if you want to run batched inference with decoder models. - Pads to the maximum length in each batch ensuring multiple of 8 length (tensor cores like multiple of 8) instead of CUTOFF_LEN. This should make the training faster as less tokens are fed into the model when it is not required (~10% faster in my experiments). To correctly implement this change I need to manually append the eos token (if the input is not truncated) so I have deleted "add_eos" token from the Tokenizer load function. - Returns the labels in the tokenize function since some users seem to prefer it this way. This requires using the DataCollatorForSeq2Seq for padding the labels as well as input ids. Behavior of both DataCollators is the same if mlm=False. I can revert to DataCollatorForLanguageModeling if preferred. * Experimental dynamic batching * mask out user prompt, again * Add options * Remove BASE_MODEL again * Small optimization * Final tweaks --------- Co-authored-by: Iker García-Ferrero <i.garciaferrerosanpelayo@gmail.com> Co-authored-by: Sebastiaan <751205+SharkWipf@users.noreply.github.com>	2023-03-24 15:46:55 -04:00
Eric Wang	72aabcb5a4	Remove LLaMA download code, as a precaution	2023-03-23 13:54:39 -07:00
Eric Wang	b12c3b90f8	Unwind input masking to avoid confusion	2023-03-22 13:52:27 -07:00
Eric Wang	7fb06c6c22	Revert "Mask out prompt tokens for real" This reverts commit `4a712d4d8e`.	2023-03-21 12:42:06 -07:00
Eric Wang	2204a71505	set EPOCHS back to 3	2023-03-21 11:52:28 -07:00
Eric Wang	4a712d4d8e	Mask out prompt tokens for real	2023-03-21 11:24:38 -07:00
Eric Wang	fac53721a2	masking bugfix	2023-03-20 21:37:39 -07:00
Kohaku-Blueleaf	b5a1a0bca7	Add support for valid set size 0 (#83) * Add support for valid set size 0 * Make param about valid to default when 0	2023-03-19 22:02:14 -07:00
Kohaku-Blueleaf	0af44f0262	Add option for output dir (#84)	2023-03-19 22:01:24 -07:00
Kohaku-Blueleaf	450206caaf	Fix torch.compile call on windows (#81) * Windows not support compile * Fix code style	2023-03-19 20:16:02 -07:00
Eric Wang	cfad895aa1	mask prompt in loss	2023-03-19 15:53:21 -07:00
Kakigōri Maker	9dab7ba438	add multi-gpu support (ddp) (#54) * add multi-gpu support (ddp) * Update finetune.py	2023-03-17 22:27:33 -07:00
Eric Wang	f7044049ab	dataset cleaning, visualizations	2023-03-17 15:04:25 -07:00
Eric Wang	35029da078	Validation set	2023-03-16 15:05:17 -07:00
Eric Wang	5f6614e6fc	Catch outdated installs	2023-03-16 12:11:47 -07:00
andreas.echavez	1862976b33	Update alpaca-lora to use transformers main branch	2023-03-16 12:11:29 -07:00
Eric Wang	2fa1c66388	repair tokenization logic, again	2023-03-15 23:58:44 -07:00
Eric Wang	024dde7dab	Revert "fix <eos> tokenization" This reverts commit `6b69ea8665`.	2023-03-15 22:52:54 -07:00
Eric Wang	6b69ea8665	fix <eos> tokenization	2023-03-15 18:21:06 -07:00
Eric Wang	a2607faff0	fix finetuning code :(	2023-03-14 21:45:12 -07:00
Eric Wang	d714a73e8c	Update README.md with new checkpoint details	2023-03-14 21:33:12 -07:00
Eric Wang	ec98533876	Update README.md; clean up hyperparameters	2023-03-14 16:30:38 -07:00
Eric Wang	46ddd2ca85	Ready to go	2023-03-14 15:10:33 -07:00
Eric Wang	648af26073	update hyperparams	2023-03-14 08:51:30 -07:00
Eric Wang	5cd474bcc0	lr=2e-5	2023-03-14 08:47:49 -07:00
Jan Malte Lichtenberg	a3b80fdbd5	Fix bug in generate prompt using 'instruction' instead of 'input'	2023-03-14 15:14:37 +01:00
Eric Wang	41e0ff6c78	tokenizer changes	2023-03-13 21:53:19 -07:00
Eric Wang	df2a5dc4be	cleanup notebooks	2023-03-13 17:33:27 -07:00
Eric Wang	357ec81a17	decapoda	2023-03-13 17:23:29 -07:00
Eric Wang	63121244c8	Licenses and whatnot	2023-03-13 15:00:05 -07:00
Eric Wang	26f64780ad	initial commit	2023-03-13 14:34:26 -07:00

35 Commits
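
The message of commit af30df1999 describes three tokenization changes: padding on the left, padding each batch to its own maximum length rounded up to a multiple of 8 (instead of a fixed CUTOFF_LEN), and appending the eos token manually when the input was not truncated. The following is a minimal plain-Python sketch of that idea, not the repository's code. The token ids, the `IGNORE` label value, and the function names `add_eos`, `round_up`, and `left_pad_batch` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical ids for illustration; a real tokenizer defines its own.
EOS_ID = 2
PAD_ID = 0
IGNORE = -100  # label value conventionally skipped by PyTorch cross-entropy


def add_eos(ids: List[int], cutoff_len: int) -> List[int]:
    """Truncate to cutoff_len; append EOS only when there is still room,
    i.e. when the input was not truncated."""
    if len(ids) < cutoff_len:
        return ids + [EOS_ID]
    return ids[:cutoff_len]


def round_up(n: int, multiple: int = 8) -> int:
    """Smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple


def left_pad_batch(batch: List[List[int]], multiple: int = 8) -> Dict[str, List[List[int]]]:
    """Left-pad every sequence to the batch maximum, rounded up to `multiple`."""
    target = round_up(max(len(ids) for ids in batch), multiple)
    out: Dict[str, List[List[int]]] = {"input_ids": [], "attention_mask": [], "labels": []}
    for ids in batch:
        pad = target - len(ids)
        out["input_ids"].append([PAD_ID] * pad + ids)
        out["attention_mask"].append([0] * pad + [1] * len(ids))
        # Padding positions are excluded from the loss via IGNORE.
        out["labels"].append([IGNORE] * pad + ids)
    return out


if __name__ == "__main__":
    cutoff_len = 16
    batch = [add_eos([5, 6, 7], cutoff_len), add_eos(list(range(10, 21)), cutoff_len)]
    padded = left_pad_batch(batch)
    print([len(row) for row in padded["input_ids"]])
```

With real models this job would normally be left to the tokenizer and a data collator; the sketch only makes the length arithmetic visible.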