Commit Graph

24 Commits

Author SHA1 Message Date
josc146 c9513822c9 fix the issue where state cache could be modified leading to inconsistent hit results 2024-03-01 13:35:16 +08:00
josc146 7e2380e4ed fix body.state 2023-12-28 23:53:58 +08:00
josc146 7f3cfd54b0 improve state cache performance 2023-12-28 22:15:31 +08:00
josc146 e083f2c629 webgpu(python) state cache 2023-12-28 20:43:57 +08:00
josc146 0ddd2e9fea add WebGPU Python Mode (https://github.com/cryscan/web-rwkv-py) 2023-12-14 18:37:07 +08:00
josc146 b14fbc29b7 rwkv.cpp(ggml) support 2023-12-12 20:29:55 +08:00
josc146 d9e25ad69f better state cache 2023-12-08 15:28:33 +08:00
josc146 c8470e77fd fix state_cache of deploy mode 2023-11-17 21:32:11 +08:00
josc146 7235e1067b add deployment mode. If /switch-model with deploy: true, will disable /switch-model, /exit and other dangerous APIs (state cache APIs, part of midi APIs) 2023-11-08 23:29:42 +08:00
josc146 ff7306349a improve memory usage of state cache 2023-10-28 23:04:49 +08:00
josc146 02d5d641d1 chore 2023-08-24 22:48:54 +08:00
josc146 da68926e9c chore (AddStateBody class) 2023-08-13 21:27:29 +08:00
josc146 d0fd480bd6 chore 2023-07-26 22:24:26 +08:00
josc146 f56748a941 improve python backend startup speed 2023-07-25 16:14:29 +08:00
josc146 994fc7c828 fix cross-device state cache exception 2023-07-11 11:20:12 +08:00
josc146 377f71b16b type 2023-06-19 22:32:02 +08:00
josc146 721653a812 fix the state cache crash caused by bad prompts 2023-06-15 22:37:00 +08:00
josc146 5896593951 max_trie_len 2023-06-12 15:22:17 +08:00
josc146 8431b5d24f log Generation Prompt 2023-06-12 13:41:51 +08:00
josc146 5990567a79 avoid misoperations of state_cache 2023-06-12 12:32:50 +08:00
josc146 fa0fcc2c89 add support for python3.8 3.9 2023-06-12 12:09:23 +08:00
josc146 cea1d8b4d1 add logs for state cache and switch-model 2023-06-09 20:46:19 +08:00
josc146 b41a2e7039 move state cache to memory (todo: state cache db) 2023-06-02 21:33:57 +08:00
josc146 3e11128c9d feat: use model state cache to achieve 5x - 50x faster preparation time for generation 2023-05-28 23:52:38 +08:00