142 Commits

Author SHA1 Message Date
Pete
f4005164f4
Fix llama.cpp truncation (#3400)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-08-03 20:01:15 -03:00
oobabooga
e931844fe2
Add auto_max_new_tokens parameter (#3419) 2023-08-02 14:52:20 -03:00
oobabooga
75c2dd38cf Remove flexgen support 2023-07-25 15:15:29 -07:00
appe233
89e0d15cf5
Use 'torch.backends.mps.is_available' to check if mps is supported (#3164) 2023-07-17 21:27:18 -03:00
Morgan Schweers
6d1e911577
Add support for logits processors in extensions (#3029) 2023-07-13 17:22:41 -03:00
oobabooga
4b1804a438
Implement sessions + add basic multi-user support (#2991) 2023-07-04 00:03:30 -03:00
oobabooga
3443219cbc
Add repetition penalty range parameter to transformers (#2916) 2023-06-29 13:40:13 -03:00
oobabooga
365b672531 Minor change to prevent future bugs 2023-06-25 01:38:54 -03:00
快乐的我531
e356f69b36
Make stop_everything work with non-streamed generation (#2848) 2023-06-24 11:19:16 -03:00
oobabooga
3e80f2aceb Apply the output extensions only once
Relevant for google translate, silero
2023-06-24 10:59:07 -03:00
oobabooga
8bb3bb39b3
Implement stopping string search in string space (#2847) 2023-06-24 09:43:00 -03:00
LarryVRH
580c1ee748
Implement a demo HF wrapper for exllama to utilize existing HF transformers decoding. (#2777) 2023-06-21 15:31:42 -03:00
oobabooga
7f06d551a3 Fix streaming callback 2023-06-16 21:44:56 -03:00
oobabooga
9f40032d32
Add ExLlama support (#2444) 2023-06-16 20:35:38 -03:00
oobabooga
7ef6a50e84
Reorganize model loading UI completely (#2720) 2023-06-16 19:00:37 -03:00
brandonj60
b04e18d10c
Add Mirostat v2 sampling to transformer models (#2571) 2023-06-09 21:26:31 -03:00
oobabooga
00b94847da Remove softprompt support 2023-06-06 07:42:23 -03:00
oobabooga
9f215523e2 Remove some unused imports 2023-06-06 07:05:46 -03:00
oobabooga
b6c407f51d Don't stream at more than 24 fps
This is a performance optimization
2023-05-31 23:41:42 -03:00
Luis Lopez
9e7204bef4
Add tail-free and top-a sampling (#2357) 2023-05-29 21:40:01 -03:00
oobabooga
9ee1e37121 Fix return message when no model is loaded 2023-05-28 22:46:32 -03:00
oobabooga
37d4ad012b Add a button for rendering markdown for any model 2023-05-25 11:59:27 -03:00
flurb18
d37a28730d
Beginning of multi-user support (#2262)
Adds a lock to generate_reply
2023-05-24 09:38:20 -03:00
oobabooga
c0fd7f3257
Add mirostat parameters for llama.cpp (#2287) 2023-05-22 19:37:24 -03:00
oobabooga
e116d31180 Prevent unwanted log messages from modules 2023-05-21 22:42:34 -03:00
oobabooga
8ac3636966
Add epsilon_cutoff/eta_cutoff parameters (#2258) 2023-05-21 15:11:57 -03:00
Konstantin Gukov
1b52bddfcc
Mitigate UnboundLocalError (#2136) 2023-05-19 14:46:18 -03:00
oobabooga
71693161eb Better handle spaces in LlamaTokenizer 2023-05-11 17:55:50 -03:00
oobabooga
7221d1389a Fix a bug 2023-05-11 17:11:10 -03:00
oobabooga
0d36c18f5d Always return only the new tokens in generation functions 2023-05-11 17:07:20 -03:00
oobabooga
638c6a65a2
Refactor chat functions (#2003) 2023-05-11 15:37:04 -03:00
Wojtab
e9e75a9ec7
Generalize multimodality (llava/minigpt4 7b and 13b now supported) (#1741) 2023-05-09 20:18:02 -03:00
IJumpAround
020fe7b50b
Remove mutable defaults from function signature. (#1663) 2023-05-08 22:55:41 -03:00
oobabooga
8aafb1f796
Refactor text_generation.py, add support for custom generation functions (#1817) 2023-05-05 18:53:03 -03:00
oobabooga
f673f4a4ca Change --verbose behavior 2023-05-04 15:56:06 -03:00
oobabooga
95d04d6a8d Better warning messages 2023-05-03 21:43:17 -03:00
Wojtab
80c2f25131
LLaVA: small fixes (#1664)
* change multimodal projector to the correct one

* remove reference to custom stopping strings from readme

* fix stopping strings if tokenizer extension adds/removes tokens

* add API example

* LLaVA 7B just dropped, add to readme that there is no support for it currently
2023-05-02 23:12:22 -03:00
Carl Kenner
2f1a2846d1
Verbose should always print special tokens in input (#1707) 2023-05-02 01:24:56 -03:00
oobabooga
15940e762e Fix missing initial space for LlamaTokenizer 2023-04-25 22:47:23 -03:00
Vincent Brouwers
92cdb4f22b
Seq2Seq support (including FLAN-T5) (#1535)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-25 22:39:04 -03:00
oobabooga
1a0c12c6f2
Refactor text-generation.py a bit 2023-04-24 19:24:12 -03:00
Wojtab
12212cf6be
LLaVA support (#1487) 2023-04-23 20:32:22 -03:00
oobabooga
fcb594b90e Don't require llama.cpp models to be placed in subfolders 2023-04-22 14:56:48 -03:00
oobabooga
27f3a78834 Better detect when no model is loaded 2023-04-16 17:35:54 -03:00
oobabooga
b937c9d8c2
Add skip_special_tokens checkbox for Dolly model (#1218) 2023-04-16 14:24:49 -03:00
kernyan
ac19d5101f
revert incorrect eos_token_id change from #814 (#1261)
- fixes #1054
2023-04-16 01:47:01 -03:00
oobabooga
a2127239de Fix a bug 2023-04-16 01:41:37 -03:00
oobabooga
9d3c6d2dc3 Fix a bug 2023-04-16 01:40:47 -03:00
Mikel Bober-Irizar
16a3a5b039
Merge pull request from GHSA-hv5m-3rp9-xcpf
* Remove eval of API input

* Remove unnecessary eval/exec for security

* Use ast.literal_eval

* Use ast.literal_eval

---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-16 01:36:50 -03:00
oobabooga
8e31f2bad4
Automatically set wbits/groupsize/instruct based on model name (#1167) 2023-04-14 11:07:28 -03:00