99 Commits

Author SHA1 Message Date
Eric J. Wang
5fa807d106
Use CLI arguments (#159)
* CLI args for finetune

* Update README

* CLI args for generate.py

* reqs.txt

* reorder hyperparams

* lora_target_modules

* cleanup
2023-03-24 17:18:42 -04:00
Andrea Santilli
e2f07029aa
Add Italian 7b model to readme (#156) 2023-03-24 15:47:45 -04:00
Eric J. Wang
af30df1999
Unified tokenizer update PR (#146)
* Improve tokenization

The PR changes a few things related to tokenization:

- Sets the padding to the left, which is required if you want to run batched inference with decoder models.

- Pads each batch to the length of its longest sequence, rounded up to a multiple of 8 (tensor cores like multiples of 8), instead of padding to CUTOFF_LEN. This should make training faster, as fewer tokens are fed into the model when they are not required (~10% faster in my experiments). To implement this change correctly I need to append the eos token manually (if the input is not truncated), so I have deleted the "add_eos" token from the Tokenizer load function.

- Returns the labels in the tokenize function since some users seem to prefer it this way. This requires using the DataCollatorForSeq2Seq for padding the labels as well as input ids. Behavior of both DataCollators is the same if mlm=False. I can revert to DataCollatorForLanguageModeling if preferred.
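
The three changes above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from the PR: the token ids, the `tokenize` and `collate_left` helpers, and the `multiple` parameter are all hypothetical names, and a real run would rely on the tokenizer and `DataCollatorForSeq2Seq` rather than hand-written padding.

```python
from typing import Dict, List

PAD_ID = 0           # hypothetical pad token id (assumption)
EOS_ID = 2           # hypothetical eos token id (assumption)
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def tokenize(ids: List[int], cutoff_len: int) -> Dict[str, List[int]]:
    """Truncate to cutoff_len; append EOS only when the result is shorter than cutoff_len."""
    ids = ids[:cutoff_len]
    if len(ids) < cutoff_len:
        ids = ids + [EOS_ID]
    # Labels are returned alongside the inputs, as the third bullet describes.
    return {"input_ids": ids, "labels": list(ids)}


def collate_left(batch: List[Dict[str, List[int]]], multiple: int = 8) -> Dict[str, List[List[int]]]:
    """Left-pad every example to the batch maximum, rounded up to `multiple`."""
    longest = max(len(ex["input_ids"]) for ex in batch)
    target = -(-longest // multiple) * multiple  # ceiling to the next multiple
    out: Dict[str, List[List[int]]] = {"input_ids": [], "attention_mask": [], "labels": []}
    for ex in batch:
        n = len(ex["input_ids"])
        pad = target - n
        out["input_ids"].append([PAD_ID] * pad + ex["input_ids"])
        out["attention_mask"].append([0] * pad + [1] * n)
        # Padded label positions use IGNORE_INDEX so they do not contribute to the loss.
        out["labels"].append([IGNORE_INDEX] * pad + ex["labels"])
    return out


# Usage: two prompts of 3 and 10 tokens become 4 and 11 tokens after EOS,
# so the batch is padded to 16 rather than to the full cutoff length.
examples = [tokenize([5, 6, 7], cutoff_len=256), tokenize(list(range(10, 20)), cutoff_len=256)]
padded = collate_left(examples)
print(len(padded["input_ids"][0]))
```

Padding on the left keeps the last real token of every row in the final position, which is what batched generation with a decoder model continues from.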

* Experimental dynamic batching

* mask out user prompt, again

* Add options

* Remove BASE_MODEL again

* Small optimization

* Final tweaks

---------

Co-authored-by: Iker García-Ferrero <i.garciaferrerosanpelayo@gmail.com>
Co-authored-by: Sebastiaan <751205+SharkWipf@users.noreply.github.com>
2023-03-24 15:46:55 -04:00
Martin Thissen
d3760cd84a
Added fine-tuned 7b model for German language (#134)
Co-authored-by: Martin Thissen <>
Co-authored-by: Eric J. Wang <eric.james.wang@gmail.com>
2023-03-23 16:23:51 -07:00
Thaweewat
6853b8802e
Add Thai weight URL to README (#132)
Fine-tuned Alpaca LoRA with Thai Q&A datasets (Stanford Alpaca translated, WikiQA, Pantip)
https://huggingface.co/Thaweewat/thai-buffala-lora-7b-v0-1

Co-authored-by: Eric J. Wang <eric.james.wang@gmail.com>
2023-03-23 16:22:08 -07:00
Eric Wang
fcdb143f1f Amend README 2023-03-23 13:59:31 -07:00
Eric Wang
72aabcb5a4 Remove LLaMA download code, as a precaution 2023-03-23 13:54:39 -07:00
Eric Wang
8955a9c5a1 bos, eos in generate.py 2023-03-23 13:44:45 -07:00
Eric J. Wang
1384a4d24c
Update README.md for multi-GPU training 2023-03-22 22:05:36 -07:00
bofeng huang
c7eabb86e2
Add French version "vigogne" (#127) 2023-03-22 15:59:14 -07:00
Eric J. Wang
a74793c571
Rearrange resources on README, add 13B-30B models 2023-03-22 14:17:31 -07:00
Eric Wang
b12c3b90f8 Unwind input masking to avoid confusion 2023-03-22 13:52:27 -07:00
Eric Wang
e04897baae fix fp16 inference 2023-03-21 14:31:30 -07:00
Eric J. Wang
052da42cbb
Replace Colab with HF in README 2023-03-21 13:59:44 -07:00
Eric Wang
7fb06c6c22 Revert "Mask out prompt tokens for real"
This reverts commit 4a712d4d8e.
2023-03-21 12:42:06 -07:00
Eric Wang
2204a71505 set EPOCHS back to 3 2023-03-21 11:52:28 -07:00
Eric Wang
4a712d4d8e Mask out prompt tokens for real 2023-03-21 11:24:38 -07:00
Eric Wang
fac53721a2 masking bugfix 2023-03-20 21:37:39 -07:00
Eric J. Wang
3cdbfe5b0c
Update README.md 2023-03-20 14:32:55 -07:00
Eric J. Wang
c08c34eabb
mention chatbot project in README.md 2023-03-20 14:26:56 -07:00
Eric J. Wang
f0082d8e8b
Link to resources more prominently 2023-03-20 11:30:42 -07:00
Eric J. Wang
d38802e843
Point volunteers to Open Assistant 2023-03-20 10:52:39 -07:00
Kohaku-Blueleaf
b5a1a0bca7
Add support for valid set size 0 (#83)
* Add support for valid set size 0

* Make validation-related params fall back to their defaults when the size is 0
2023-03-19 22:02:14 -07:00
Kohaku-Blueleaf
0af44f0262
Add option for output dir (#84) 2023-03-19 22:01:24 -07:00
Kohaku-Blueleaf
450206caaf
Fix torch.compile call on windows (#81)
* Windows does not support compile

* Fix code style
2023-03-19 20:16:02 -07:00
Karun
81eb72f707
cleans up alphabetical prompts (#76) 2023-03-19 15:55:02 -07:00
Eric Wang
997f6cd81f slider for tokens generated 2023-03-19 15:53:21 -07:00
Eric Wang
cfad895aa1 mask prompt in loss 2023-03-19 15:53:21 -07:00
Eric J. Wang
d66908c0ca
Remove messy test code 2023-03-19 11:22:02 -07:00
Yaqub Mahmoud
0e752ea5f3
Update requirements.txt (#67)
Added appdirs package to requirements.txt
2023-03-19 11:15:07 -07:00
Eric Wang
c83e30ab78 generate.py tweaks 2023-03-18 23:00:18 -07:00
Eric Wang
80fd9833db don't share publicly 2023-03-18 16:43:53 -07:00
Eric Wang
6ced8d9907 fix HF export script 2023-03-18 16:42:58 -07:00
Eric J. Wang
8dc0f614c6
Update README.md 2023-03-18 13:24:42 -07:00
Eric J. Wang
d9c19ff34e
Update README.md 2023-03-17 22:27:58 -07:00
Kakigōri Maker
9dab7ba438
add multi-gpu support (ddp) (#54)
* add multi-gpu support (ddp)

* Update finetune.py
2023-03-17 22:27:33 -07:00
Eric Wang
a0295813b0 normalize cleaned data row with missing output 2023-03-17 20:52:14 -07:00
Eric Wang
3b160d745b HF export script 2023-03-17 17:56:10 -07:00
Eric Wang
8aecde83cd construction 2023-03-17 15:11:35 -07:00
Eric Wang
cb046d647e min beams = 1 2023-03-17 15:07:08 -07:00
Eric Wang
f7044049ab dataset cleaning, visualizations 2023-03-17 15:04:25 -07:00
Peter Marelas
db4af6a7ff
Enable inference on CPU and Mac GPU using pytorch support for MPS (#48) 2023-03-17 13:53:21 -07:00
Eric J. Wang
9bff21cc68
huggingface -> Hugging Face
Update README.md
2023-03-17 11:08:01 -07:00
Ikko Eltociear Ashimine
65299df970
Update README.md
huggingface -> Hugging Face
2023-03-17 16:34:56 +09:00
Eric J. Wang
daf13eea40
Add notes about dataset and model updates 2023-03-16 21:17:55 -07:00
Eric J. Wang
d60701b895
Merge pull request #35 from T-Atlas/patch-1
Update generate.py
2023-03-16 19:39:43 -07:00
Lian Junhong
3a47bd18e8
Update generate.py
Added a text box for entering input content, to match the input function.
2023-03-17 10:30:27 +08:00
Eric Wang
c39da83e2b add Gradio interface to generate.py 2023-03-16 16:04:06 -07:00
Eric Wang
35029da078 Validation set 2023-03-16 15:05:17 -07:00
Eric Wang
5f6614e6fc Catch outdated installs 2023-03-16 12:11:47 -07:00