* Don't stop generating at end of context
* Use llama_kv_cache ops to shift context
* Fix and improve reverse prompt detection
* Replace prompt recalc callback with a flag to disallow context shift
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Signed-off-by: Cédric Sazos <cedric.sazos@tutanota.com>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>