alpaca-lora

mirror of https://github.com/tloen/alpaca-lora.git synced 2024-10-01 01:05:56 -04:00

Author	SHA1	Message	Date
Eric Wang	683810b4a1	Print warning on checkpoint not found	2023-03-26 17:25:15 -07:00
Eric Wang	da6b427a08	resume_from_checkpoint Co-authored-by: AngainorDev <54739135+AngainorDev@users.noreply.github.com>	2023-03-26 17:17:54 -07:00
Eric Wang	b948f892ba	restore default settings	2023-03-26 17:10:05 -07:00
Eric J. Wang	d358124af6	Add dotslash to example data_path	2023-03-24 17:19:57 -04:00
Eric J. Wang	5fa807d106	Use CLI arguments (#159) * CLI args for finetune * Update README * CLI args for generate.py * reqs.txt * reorder hyperparams * lora_target_modules * cleanup	2023-03-24 17:18:42 -04:00
Andrea Santilli	e2f07029aa	Add Italian 7b model to readme (#156)	2023-03-24 15:47:45 -04:00
Eric J. Wang	af30df1999	Unified tokenizer update PR (#146) * Improve tokenization The PR changes a few things related to tokenization: - Sets the padding to the left, which is required if you want to run batched inference with decoder models. - Pads to the maximum length in each batch ensuring multiple of 8 length (tensor cores like multiple of 8) instead of CUTOFF_LEN. This should make the training faster as less tokens are fed into the model when it is not required (~10% faster in my experiments). To correctly implement this change I need to manually append the eos token (if the input is not truncated) so I have deleted "add_eos" token from the Tokenizer load function. - Returns the labels in the tokenize function since some users seem to prefer it this way. This requires using the DataCollatorForSeq2Seq for padding the labels as well as input ids. Behavior of both DataCollators is the same if mlm=False. I can revert to DataCollatorForLanguageModeling if preferred. * Experimental dynamic batching * mask out user prompt, again * Add options * Remove BASE_MODEL again * Small optimization * Final tweaks --------- Co-authored-by: Iker García-Ferrero <i.garciaferrerosanpelayo@gmail.com> Co-authored-by: Sebastiaan <751205+SharkWipf@users.noreply.github.com>	2023-03-24 15:46:55 -04:00
Martin Thissen	d3760cd84a	Added fine-tuned7b model for German language (#134) Co-authored-by: Martin Thissen <> Co-authored-by: Eric J. Wang <eric.james.wang@gmail.com>	2023-03-23 16:23:51 -07:00
Thaweewat	6853b8802e	Add Thai weight URL on READ.ME (#132) Fine Tuned Alpaca LoRa with Thai Q&A datasets (Standford Alpcaca translated, WikiQA, Pantip) https://huggingface.co/Thaweewat/thai-buffala-lora-7b-v0-1 Co-authored-by: Eric J. Wang <eric.james.wang@gmail.com>	2023-03-23 16:22:08 -07:00
Eric Wang	fcdb143f1f	Amend README	2023-03-23 13:59:31 -07:00
Eric Wang	72aabcb5a4	Remove LLaMA download code, as a precaution	2023-03-23 13:54:39 -07:00
Eric Wang	8955a9c5a1	bos, eos in generate.py	2023-03-23 13:44:45 -07:00
Eric J. Wang	1384a4d24c	Update README.md for multi-GPU training	2023-03-22 22:05:36 -07:00
bofeng huang	c7eabb86e2	Add french version "vigogne" (#127)	2023-03-22 15:59:14 -07:00
Eric J. Wang	a74793c571	Rearrange resources on README, add 13B-30B models	2023-03-22 14:17:31 -07:00
Eric Wang	b12c3b90f8	Unwind input masking to avoid confusion	2023-03-22 13:52:27 -07:00
Eric Wang	e04897baae	fix fp16 inference	2023-03-21 14:31:30 -07:00
Eric J. Wang	052da42cbb	Replace Colab with HF in README	2023-03-21 13:59:44 -07:00
Eric Wang	7fb06c6c22	Revert "Mask out prompt tokens for real" This reverts commit `4a712d4d8e`.	2023-03-21 12:42:06 -07:00
Eric Wang	2204a71505	set EPOCHS back to 3	2023-03-21 11:52:28 -07:00
Eric Wang	4a712d4d8e	Mask out prompt tokens for real	2023-03-21 11:24:38 -07:00
Eric Wang	fac53721a2	masking bugfix	2023-03-20 21:37:39 -07:00
Eric J. Wang	3cdbfe5b0c	Update README.md	2023-03-20 14:32:55 -07:00
Eric J. Wang	c08c34eabb	mention chatbot project in README.md	2023-03-20 14:26:56 -07:00
Eric J. Wang	f0082d8e8b	Link to resources more prominently	2023-03-20 11:30:42 -07:00
Eric J. Wang	d38802e843	Point volunteers to Open Assistant	2023-03-20 10:52:39 -07:00
Kohaku-Blueleaf	b5a1a0bca7	Add support for valid set size 0 (#83) * Add support for valid set size 0 * Make param about valid to default when 0	2023-03-19 22:02:14 -07:00
Kohaku-Blueleaf	0af44f0262	Add option for output dir (#84)	2023-03-19 22:01:24 -07:00
Kohaku-Blueleaf	450206caaf	Fix torch.compile call on windows (#81) * Windows not support compile * Fix code style	2023-03-19 20:16:02 -07:00
Karun	81eb72f707	cleans up alphabetical prompts (#76)	2023-03-19 15:55:02 -07:00
Eric Wang	997f6cd81f	slider for tokens generated	2023-03-19 15:53:21 -07:00
Eric Wang	cfad895aa1	mask prompt in loss	2023-03-19 15:53:21 -07:00
Eric J. Wang	d66908c0ca	Remove messy test code	2023-03-19 11:22:02 -07:00
Yaqub Mahmoud	0e752ea5f3	Update requirements.txt (#67) Added appdirs package to requirements.txt	2023-03-19 11:15:07 -07:00
Eric Wang	c83e30ab78	generate.py tweaks	2023-03-18 23:00:18 -07:00
Eric Wang	80fd9833db	don't share publicly	2023-03-18 16:43:53 -07:00
Eric Wang	6ced8d9907	fix HF export script	2023-03-18 16:42:58 -07:00
Eric J. Wang	8dc0f614c6	Update README.md	2023-03-18 13:24:42 -07:00
Eric J. Wang	d9c19ff34e	Update README.md	2023-03-17 22:27:58 -07:00
Kakigōri Maker	9dab7ba438	add multi-gpu support (ddp) (#54) * add multi-gpu support (ddp) * Update finetune.py	2023-03-17 22:27:33 -07:00
Eric Wang	a0295813b0	normalize cleaned data row with missing output	2023-03-17 20:52:14 -07:00
Eric Wang	3b160d745b	HF export script	2023-03-17 17:56:10 -07:00
Eric Wang	8aecde83cd	construciton	2023-03-17 15:11:35 -07:00
Eric Wang	cb046d647e	min beams = 1	2023-03-17 15:07:08 -07:00
Eric Wang	f7044049ab	dataset cleaning, visualizations	2023-03-17 15:04:25 -07:00
Peter Marelas	db4af6a7ff	Enable inference on CPU and Mac GPU using pytorch support for MPS (#48)	2023-03-17 13:53:21 -07:00
Eric J. Wang	9bff21cc68	huggingface -> Hugging Face Update README.md	2023-03-17 11:08:01 -07:00
Ikko Eltociear Ashimine	65299df970	Update README.md huggingface -> Hugging Face	2023-03-17 16:34:56 +09:00
Eric J. Wang	daf13eea40	Add notes about dataset and model updates	2023-03-16 21:17:55 -07:00
Eric J. Wang	d60701b895	Merge pull request #35 from T-Atlas/patch-1 Update generate.py	2023-03-16 19:39:43 -07:00

103 Commits
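
The tokenization change described in commit af30df1999 (left padding, padding each batch to its longest sequence rounded up to a multiple of 8, and appending the EOS token only when the input was not truncated) can be illustrated with a small dependency-free sketch. This is not the repository's code: the helper names `add_eos` and `pad_batch` and the token ids `PAD_ID` and `EOS_ID` are assumptions made for illustration, and the label padding value -100 follows the usual PyTorch/Hugging Face convention for positions ignored by the loss.

```python
from typing import Dict, List

PAD_ID = 0           # assumed pad token id (hypothetical)
EOS_ID = 2           # assumed EOS token id (hypothetical)
IGNORE_INDEX = -100  # conventional label value skipped by the loss


def add_eos(ids: List[int], cutoff_len: int) -> List[int]:
    """Truncate to cutoff_len; append EOS only if there is still room."""
    ids = ids[:cutoff_len]
    if len(ids) < cutoff_len and (not ids or ids[-1] != EOS_ID):
        ids = ids + [EOS_ID]
    return ids


def pad_batch(seqs: List[List[int]], multiple: int = 8) -> Dict[str, List[List[int]]]:
    """Left-pad every sequence to the batch maximum, rounded up to `multiple`."""
    longest = max(len(s) for s in seqs)
    target = -(-longest // multiple) * multiple  # ceiling to a multiple of 8
    batch: Dict[str, List[List[int]]] = {
        "input_ids": [], "attention_mask": [], "labels": []
    }
    for s in seqs:
        pad = target - len(s)
        batch["input_ids"].append([PAD_ID] * pad + s)
        batch["attention_mask"].append([0] * pad + [1] * len(s))
        # Padded label positions are masked so they do not contribute to the loss.
        batch["labels"].append([IGNORE_INDEX] * pad + s)
    return batch


if __name__ == "__main__":
    examples = [add_eos([1, 5, 6], 10), add_eos(list(range(3, 20)), 10)]
    out = pad_batch(examples)
    print(len(out["input_ids"][0]), out["attention_mask"][0])
```

Padding to the per-batch maximum rather than a fixed cutoff length is what the commit message credits for the faster training, since fewer padding tokens are fed to the model.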