Commit Graph

138 Commits

Author SHA1 Message Date
Chris
4606e12494
Add link to Polish Alpaca LoRa 7B (#289) 2023-04-07 10:22:38 -07:00
Eric Wang
fb9d9832e7 Add LLaMA-GPT4 dataset 2023-04-06 20:39:28 -07:00
YXHXianYu
a368c486be
Fix readme formatting error (#274) 2023-04-05 10:37:25 -04:00
Eric J. Wang
34e505f516
Add Angainor's 13B weights 2023-04-05 10:25:49 -04:00
Pokai Chang
e2ed209d3b
Support streaming output on generate (#263) 2023-04-04 11:05:20 -04:00
marcinmosiolek
8e51ebf3f4
Adding reference to the Polish version (#262) 2023-04-03 13:36:33 -04:00
Yurii Paniv
69bb90c382
Add link to Ukrainian Alpaca 7B (#247) 2023-04-03 13:36:22 -04:00
Chansung Park
a3027fea37
Add 65b checkpoint (#257) 2023-04-03 13:35:57 -04:00
Ilya Gusev
9de612e582
Adding Russian Alpaca 7B (#238) 2023-03-31 14:49:23 -04:00
Eric J. Wang
ab35a0f402
Add information about official LoRA 2023-03-30 14:54:19 -04:00
Eric Wang
46f587738c Fix server_name 2023-03-30 08:57:40 -07:00
Chris Alexiuk
4367a43fcb Added Dockerfile and docker-compose.yml (#207)
* Added Dockerfile for inference

* Added instructions for Dockerfile

* Update README.md

* Update README.md

* Update README.md

* Pass env through Dockerfile

* Added docker compose setup and instructions

* Added more environment options

* Set a safer default mount point

* add docker-compose changes

* add to gitignore, update to new generate.py

* add docker ignore, simplify docker compose file

* add back missing requirements

* Adjustments to compose and generate.py, added Docker to README.md

* Linting adjust to Black

* Adjusting import linting

* Update README.md

* Update README.md

* Removed comment by original Dockerfile creator.

Comment not necessary.

* cleanup README

Co-authored-by: Francesco Saverio Zuppichini <zuppif@usi.ch>

---------

Co-authored-by: Francesco Saverio Zuppichini <zuppif@usi.ch>
Co-authored-by: Chris Alexiuk <c.s.alexiuk@gmail.com>
Co-authored-by: ElRoberto538 <>
Co-authored-by: Sam Sipe <samsipe@gmail.com>
Co-authored-by: Eric J. Wang <eric.james.wang@gmail.com>
2023-03-30 08:57:40 -07:00
Eric J. Wang
216e785d9c
Add sentencepiece back to requirements.txt 2023-03-29 20:07:03 -04:00
Angainor Development
8d58d37b65
Templated prompter (#184)
* Templated prompter

* fix dup import

* Set Verbose False by default

I forgot to disable after testing.

* Fix imports order

* Use Black Formatting

* lint

* Re-introduce lost line

* Cleanup

* template default

* isort

---------

Co-authored-by: Eric Wang <eric.james.wang@gmail.com>
2023-03-29 19:36:04 -04:00
Angainor Development
fcbc45e4c0
Print only on Rank 0 (#187)
* Print only on Rank 0

When training on multiple GPUs, the settings are printed once per GPU.
This change prints them only from rank 0.

See https://github.com/tloen/alpaca-lora/issues/182#issuecomment-1485550636
for a sample output.

Could apply to a few other prints further down as well.

* Typo

* Added failsafe

So this works whether or not LOCAL_RANK is defined.
2023-03-29 19:25:17 -04:00
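The "Print only on Rank 0" change above can be illustrated with a minimal sketch. It assumes the rank is read from the `LOCAL_RANK` environment variable (named in the commit message) and treats a missing variable as rank 0, which is one way to get the described failsafe. The helper names `is_rank_zero` and `print_rank_zero` are hypothetical and are not taken from the repository.

```python
import os


def is_rank_zero(env=None):
    """Return True when this process should print (hypothetical helper)."""
    env = os.environ if env is None else env
    # Failsafe: when LOCAL_RANK is not defined (single-process run),
    # fall back to 0 so that printing still happens.
    return int(env.get("LOCAL_RANK", 0)) == 0


def print_rank_zero(*args, env=None, **kwargs):
    """Print only from rank 0 so settings appear once, not once per GPU."""
    if is_rank_zero(env):
        print(*args, **kwargs)


if __name__ == "__main__":
    print_rank_zero("Training settings: batch_size=128")
```

Passing the environment as an argument is only there to make the sketch easy to check without launching several processes.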
Eric Wang
a48d947298 Put the Chinese LoRAs together 2023-03-29 09:29:55 -07:00
Ziqing Yang
63de355963
Add chinese-alpaca-lora-7b (#208) 2023-03-29 12:29:09 -04:00
Junbum Lee
dc4f049322
Add Korean based Alpaca LoRA Huggingface (30B,65B) (#210) 2023-03-29 12:23:10 -04:00
Angainor Development
c59d5672b0
Add jsonl support (#212)
Handled by default with the same "json" type; the library auto-detects the precise format.
2023-03-29 12:22:19 -04:00
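A rough sketch of the jsonl handling described in the commit above. It assumes "the lib" is Hugging Face `datasets` and that both `.json` and `.jsonl` files are routed to its `"json"` builder. The helper `loader_args` is hypothetical; it only builds the arguments and does not import `datasets` itself.

```python
def loader_args(data_path: str):
    """Return (builder_name, kwargs) for a dataset path (hypothetical helper).

    Local .json and .jsonl files both map to the "json" builder, which is
    assumed to detect the precise format on its own. Anything else is
    treated as a dataset name and passed through unchanged.
    """
    if data_path.endswith((".json", ".jsonl")):
        return "json", {"data_files": data_path}
    return data_path, {}


# Intended use, assuming the `datasets` package is installed:
#   from datasets import load_dataset
#   name, kwargs = loader_args("./my_data.jsonl")
#   data = load_dataset(name, **kwargs)
```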
Eric Wang
6545d432e4 add Nomic LoRA 2023-03-29 09:21:36 -07:00
Gene Ruebsamen
28eb8cac3c
Default dataset to cleaned alpaca dataset from HF (#202) 2023-03-28 16:52:47 -04:00
Claudio Aracena
17c5f8a31f
Add spanish alpaca lora 13b link (#201)
* Update README.md

add spanish alpaca lora

* Update README.md
2023-03-28 16:49:38 -04:00
Eric J. Wang
345c8fbb7b
Remove tagline from README 2023-03-28 13:01:39 -04:00
Jiaxin Shan
4a3c7e2231
Add option to share Gradio demo publicly (#189)
* Add option to share Gradio demo publicly

* gradio_share -> share_gradio

---------

Co-authored-by: Eric Wang <eric.james.wang@gmail.com>
2023-03-28 12:43:29 -04:00
Eric J. Wang
f3876137f7
Clarify that dataset is still ODC-By 2023-03-28 12:22:00 -04:00
кѳѳsнī
55b664f46f
Enabling model parallelism (training 30b on 2x 3090s and beyond) (#131)
* override broken data parallelism with model parallelism

* formatting

* formatting, again

---------

Co-authored-by: Eric Wang <eric.james.wang@gmail.com>
2023-03-28 11:48:47 -04:00
Eric Wang
3b79ea4029 256 -> 512 -> 256 2023-03-28 08:34:36 -07:00
Eric Wang
804d22ad43 remove asserts 2023-03-28 08:33:47 -07:00
Angainor Development
69b9d9ea8b
Fix a warning (#186)
Avoids the warning "Overriding torch_dtype=None with `torch_dtype=torch.float16` due to requirements of `bitsandbytes` to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning."
2023-03-27 15:13:35 -04:00
Eric J. Wang
dbd04f3560
Fix linters (#185)
* install isort

* isort .

* whoops

* fix black
2023-03-27 14:34:23 -04:00
NanoCode012
69b31e0fed
Feat: Add wandb (#168)
* Add wandb

* Fix KeyError

* Add WANDB_WATCH and WANDB_LOG_MODEL

* run_name -> wandb_run_name

* ,

* fix TrainingArgs

---------

Co-authored-by: Eric J. Wang <eric.james.wang@gmail.com>
2023-03-27 13:51:36 -04:00
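The wandb commit above names `WANDB_WATCH`, `WANDB_LOG_MODEL` and `wandb_run_name`. A minimal sketch of how such options might be wired follows. It assumes the options arrive as strings, that empty strings mean "leave the environment alone", and that `WANDB_PROJECT` decides whether logging is on. The function `configure_wandb` and the returned `report_to` / `run_name` keys are illustrative assumptions, not the repository's code.

```python
import os


def configure_wandb(wandb_project="", wandb_run_name="", wandb_watch="",
                    wandb_log_model="", env=None):
    """Export wandb settings to the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    # Only overwrite variables the caller actually supplied.
    if wandb_project:
        env["WANDB_PROJECT"] = wandb_project
    if wandb_watch:
        env["WANDB_WATCH"] = wandb_watch
    if wandb_log_model:
        env["WANDB_LOG_MODEL"] = wandb_log_model
    # Logging counts as enabled when a project is set, whether it came
    # from the argument or from a pre-existing environment variable.
    use_wandb = bool(env.get("WANDB_PROJECT"))
    return {
        "report_to": "wandb" if use_wandb else None,
        "run_name": wandb_run_name if use_wandb else None,
    }
```

The returned dictionary is meant to be merged into whatever trainer arguments the training script builds.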
Eric J. Wang
95b30a256c
Fix lint.yml 2023-03-27 13:48:44 -04:00
claysauruswrecks
1310547f9f
Add HF dataset loading, add linters, pyproject.toml (#175)
* add HF dataset loading, add linters, pyproject.toml

- applied markdownlint
- add black, black[jupyter], isort
- fix noqa codes
- add .github workflow linting
- update README.md

* restore default settings

* resume_from_checkpoint

Co-authored-by: AngainorDev <54739135+AngainorDev@users.noreply.github.com>

* Print warning on checkpoint not found

* Default to local copy and update it

* Typo

* Remove duplicate code block

---------

Co-authored-by: Eric Wang <eric.james.wang@gmail.com>
Co-authored-by: AngainorDev <54739135+AngainorDev@users.noreply.github.com>
2023-03-27 13:31:44 -04:00
Xie Zejian
b00629d773
Add Chinese 13b lora link (#178) 2023-03-27 12:09:41 -04:00
Angainor Development
9d6b822019
Avoid a deprecation warning (#181)
Removes the warning:
`FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead`
2023-03-27 12:06:44 -04:00
Eric Wang
683810b4a1 Print warning on checkpoint not found 2023-03-26 17:25:15 -07:00
Eric Wang
da6b427a08 resume_from_checkpoint
Co-authored-by: AngainorDev <54739135+AngainorDev@users.noreply.github.com>
2023-03-26 17:17:54 -07:00
Eric Wang
b948f892ba restore default settings 2023-03-26 17:10:05 -07:00
Eric J. Wang
d358124af6
Add dotslash to example data_path 2023-03-24 17:19:57 -04:00
Eric J. Wang
5fa807d106
Use CLI arguments (#159)
* CLI args for finetune

* Update README

* CLI args for generate.py

* reqs.txt

* reorder hyperparams

* lora_target_modules

* cleanup
2023-03-24 17:18:42 -04:00
Andrea Santilli
e2f07029aa
Add Italian 7b model to readme (#156) 2023-03-24 15:47:45 -04:00
Eric J. Wang
af30df1999
Unified tokenizer update PR (#146)
* Improve tokenization

The PR changes a few things related to tokenization:

- Sets the padding to the left, which is required if you want to run batched inference with decoder models.

- Pads to the maximum length in each batch, rounded up to a multiple of 8 (tensor cores favor multiples of 8), instead of CUTOFF_LEN. This should make training faster, as fewer tokens are fed into the model when they are not required (~10% faster in my experiments). To implement this change correctly I need to manually append the eos token (if the input is not truncated), so I have deleted the "add_eos" token from the Tokenizer load function.

- Returns the labels in the tokenize function since some users seem to prefer it this way. This requires using the DataCollatorForSeq2Seq for padding the labels as well as input ids. Behavior of both DataCollators is the same if mlm=False. I can revert to DataCollatorForLanguageModeling if preferred.

* Experimental dynamic batching

* mask out user prompt, again

* Add options

* Remove BASE_MODEL again

* Small optimization

* Final tweaks

---------

Co-authored-by: Iker García-Ferrero <i.garciaferrerosanpelayo@gmail.com>
Co-authored-by: Sebastiaan <751205+SharkWipf@users.noreply.github.com>
2023-03-24 15:46:55 -04:00
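The tokenization changes described in the commit above (left padding, padding each batch to a multiple of 8, appending the eos token by hand when the input is not truncated) can be sketched in plain Python. This is an illustration under stated assumptions, not the repository's code: the function names are hypothetical, token ids are plain integer lists, and `-100` is assumed as the label padding value that the loss ignores.

```python
def tokenize_with_eos(token_ids, eos_id, cutoff_len):
    """Truncate to cutoff_len and append EOS only if nothing was cut off."""
    ids = list(token_ids[:cutoff_len])
    # With room left after truncation, the sequence is complete,
    # so mark its end explicitly.
    if len(ids) < cutoff_len and (not ids or ids[-1] != eos_id):
        ids.append(eos_id)
    return ids


def pad_batch_left(batch, pad_id, multiple=8, label_pad_id=-100):
    """Left-pad every sequence to the batch maximum, rounded up to `multiple`."""
    longest = max(len(ids) for ids in batch)
    target = -(-longest // multiple) * multiple  # ceiling to a multiple of 8
    input_ids, attention_mask, labels = [], [], []
    for ids in batch:
        n_pad = target - len(ids)
        # Padding goes on the left so the real tokens end each row,
        # which batched generation with decoder-only models needs.
        input_ids.append([pad_id] * n_pad + list(ids))
        attention_mask.append([0] * n_pad + [1] * len(ids))
        # Labels mirror the inputs; padded positions are ignored by the loss.
        labels.append([label_pad_id] * n_pad + list(ids))
    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "labels": labels,
    }
```

Padding to the batch maximum instead of a fixed cutoff keeps short batches short, which is where the speedup mentioned in the commit message would come from.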
Martin Thissen
d3760cd84a
Added fine-tuned 7b model for German language (#134)
Co-authored-by: Martin Thissen <>
Co-authored-by: Eric J. Wang <eric.james.wang@gmail.com>
2023-03-23 16:23:51 -07:00
Thaweewat
6853b8802e
Add Thai weight URL to README (#132)
Fine-tuned Alpaca LoRA with Thai Q&A datasets (Stanford Alpaca translated, WikiQA, Pantip)
https://huggingface.co/Thaweewat/thai-buffala-lora-7b-v0-1

Co-authored-by: Eric J. Wang <eric.james.wang@gmail.com>
2023-03-23 16:22:08 -07:00
Eric Wang
fcdb143f1f Amend README 2023-03-23 13:59:31 -07:00
Eric Wang
72aabcb5a4 Remove LLaMA download code, as a precaution 2023-03-23 13:54:39 -07:00
Eric Wang
8955a9c5a1 bos, eos in generate.py 2023-03-23 13:44:45 -07:00
Eric J. Wang
1384a4d24c
Update README.md for multi-GPU training 2023-03-22 22:05:36 -07:00
bofeng huang
c7eabb86e2
Add French version "vigogne" (#127) 2023-03-22 15:59:14 -07:00
Eric J. Wang
a74793c571
Rearrange resources on README, add 13B-30B models 2023-03-22 14:17:31 -07:00