59b5f7a4b7  oobabooga  2023-03-08 12:13:40 -03:00  Improve usage of stopping_criteria
add9330e5e  oobabooga  2023-03-08 11:26:29 -03:00  Bug fixes
33fb6aed74  oobabooga  2023-03-08 03:08:16 -03:00  Minor bug fix
ad2970374a  oobabooga  2023-03-08 03:00:06 -03:00  Readability improvements
72d539dbff  oobabooga  2023-03-08 02:54:47 -03:00  Better separate the FlexGen case
ab50f80542  oobabooga  2023-03-08 02:46:35 -03:00  New text streaming method (much faster)
8e89bc596b  oobabooga  2023-03-07 23:15:46 -03:00  Fix encode() for RWKV
19a34941ed  oobabooga  2023-03-07 18:17:56 -03:00  Add proper streaming to RWKV
8660227e1b  oobabooga  2023-03-07 17:24:28 -03:00  Add top_k to RWKV
20bd645f6a  oobabooga  2023-03-06 15:58:18 -03:00  Fix bug in multigpu setups (attempt 3)
09a7c36e1b  oobabooga  2023-03-06 15:36:35 -03:00  Minor improvement while running custom models
24c4c20391  oobabooga  2023-03-06 15:23:29 -03:00  Fix bug in multigpu setups (attempt #2)
d88b7836c6  oobabooga  2023-03-06 14:58:30 -03:00  Fix bug in multigpu setups
e91f4bc25a  oobabooga  2023-03-06 08:45:49 -03:00  Add RWKV tokenizer
a54b91af77  oobabooga  2023-03-05 10:21:15 -03:00  Improve readability
8e706df20e  oobabooga  2023-03-05 10:12:43 -03:00  Fix a memory leak when text streaming is on
c33715ad5b  oobabooga  2023-03-05 01:20:31 -03:00  Move towards HF LLaMA implementation
c93f1fa99b  oobabooga  2023-03-04 03:10:21 -03:00  Count the tokens more conservatively
05e703b4a4  oobabooga  2023-03-03 21:24:32 -03:00  Print the performance information more reliably
a345a2acd2  oobabooga  2023-03-03 15:16:55 -03:00  Add a tokenizer placeholder
5b354817f6  oobabooga  2023-03-03 15:04:41 -03:00  Make chat minimally work with LLaMA
ea5c5eb3da  oobabooga  2023-03-03 14:39:14 -03:00  Add LLaMA support
7bbe32f618  oobabooga  2023-03-02 00:48:46 -03:00  Don't return a value in an iterator function
ff9f649c0c  oobabooga  2023-03-02 00:36:20 -03:00  Remove some unused imports
955cf431e8  oobabooga  2023-03-01 19:11:26 -03:00  Minor consistency fix
831ac7ed3f  oobabooga  2023-03-01 16:45:48 -03:00  Add top_p
7c4d5ca8cc  oobabooga  2023-03-01 16:40:25 -03:00  Improve the text generation call a bit
0f6708c471  oobabooga  2023-03-01 12:18:17 -03:00  Sort the imports
e735806c51  oobabooga  2023-03-01 12:16:11 -03:00  Add a generate() function for RWKV
f871971de1  oobabooga  2023-02-28 00:25:30 -03:00  Trying to get the chat to work
ebd698905c  oobabooga  2023-02-28 00:04:04 -03:00  Add streaming to RWKV
70e522732c  oobabooga  2023-02-27 23:50:16 -03:00  Move RWKV loader into a separate file
ebc64a408c  oobabooga  2023-02-27 23:03:35 -03:00  RWKV support prototype
6e843a11d6  oobabooga  2023-02-26 00:36:04 -03:00  Fix FlexGen in chat mode
fa58fd5559  oobabooga  2023-02-25 15:50:29 -03:00  Proper way to free the cuda cache
700311ce40  oobabooga  2023-02-25 14:39:13 -03:00  Empty the cuda cache at model.generate()
78ad55641b  oobabooga  2023-02-24 17:19:42 -03:00  Remove duplicate max_new_tokens parameter
65326b545a  oobabooga  2023-02-24 16:46:50 -03:00  Move all gradio elements to shared (so that extensions can use them)
9ae063e42b  oobabooga  2023-02-23 20:22:47 -03:00  Fix softprompts when deepspeed is active (#112)
7224343a70  oobabooga  2023-02-23 14:41:42 -03:00  Improve the imports
1dacd34165  oobabooga  2023-02-23 13:28:30 -03:00  Further refactor
ce7feb3641  oobabooga  2023-02-23 13:03:52 -03:00  Further refactor