Compare commits

38 Commits

Author | SHA1 | Message | Date
josc146 | 8ca920a114 | release v1.6.6 | 2023-12-25 21:02:26 +08:00
josc146 | 5f3d449a66 | improve Models page | 2023-12-25 20:37:40 +08:00
josc146 | 13735e7dfb | chore | 2023-12-25 20:35:00 +08:00
josc146 | a38d5c3a25 | enable web-rwkv-py turbo | 2023-12-25 20:34:35 +08:00
josc146 | 5bae637c67 | update Related Repositories | 2023-12-25 20:32:54 +08:00
josc146 | 12e488ba80 | improve strategy | 2023-12-25 19:30:57 +08:00
josc146 | ad30c63c69 | update Writer preset params | 2023-12-25 19:30:14 +08:00
josc146 | a116eff7df | webgpu max_buffer_size | 2023-12-25 18:08:13 +08:00
josc146 | 01bc355dde | allow manifest customTokenizer | 2023-12-25 16:57:32 +08:00
josc146 | 8e05f3c360 | chore | 2023-12-25 16:56:46 +08:00
josc146 | fde988dd4e | update manifest.json | 2023-12-25 16:08:20 +08:00
josc146 | 91401ad14f | * text=auto eol=lf | 2023-12-24 22:51:23 +08:00
josc146 | 280194647c | improve refreshRemoteModels | 2023-12-22 14:44:27 +08:00
josc146 | 2e0a542f33 | improve train_log.txt creation | 2023-12-22 13:00:13 +08:00
josc146 | b988694da7 | better CopyEmbed | 2023-12-22 12:47:26 +08:00
josc146 | 512c4d0f73 | improve role-playing effect | 2023-12-22 10:51:09 +08:00
josc146 | 5525fb1470 | chore | 2023-12-22 10:49:28 +08:00
josc146 | 4db735e026 | update readme | 2023-12-21 13:46:51 +08:00
josc146 | c8c79c39d1 | Create dependabot.yml | 2023-12-21 12:56:21 +08:00
josc146 | bcfb76d8ca | update readme | 2023-12-19 14:59:02 +08:00
josc146 | 2d9aaf8fc9 | update readme | 2023-12-18 19:55:25 +08:00
josc146 | 8a3905c09a | reduce precompiled web_rwkv_py size | 2023-12-15 16:26:01 +08:00
github-actions[bot] | 54cd8a46fa | release v1.6.5 | 2023-12-14 14:09:13 +00:00
josc146 | 1b83bf261a | release v1.6.5 | 2023-12-14 22:07:17 +08:00
josc146 | 2a7d22dab1 | Composition Option: Only Auto Play Generated Content | 2023-12-14 22:06:39 +08:00
josc146 | f7494b0cfb | update midi_filter_config.json | 2023-12-14 21:18:48 +08:00
github-actions[bot] | 9ca91d59ec | release v1.6.4 | 2023-12-14 12:40:56 +00:00
josc146 | 11feaa6e68 | release v1.6.4 | 2023-12-14 20:40:24 +08:00
josc146 | 18d4b2304e | WebGPU (Python) strategy | 2023-12-14 20:39:42 +08:00
github-actions[bot] | 2f45e9c33a | release v1.6.3 | 2023-12-14 10:43:36 +00:00
josc146 | f7df10cb66 | release v1.6.3 | 2023-12-14 18:42:58 +08:00
josc146 | 46e9a2f5b2 | add precompiled web_rwkv_py | 2023-12-14 18:42:00 +08:00
josc146 | 69b8d2e0a1 | fix refreshBuiltInModels | 2023-12-14 18:37:37 +08:00
josc146 | 0ddd2e9fea | add WebGPU Python Mode (https://github.com/cryscan/web-rwkv-py) | 2023-12-14 18:37:07 +08:00
josc146 | 01c95f5bc4 | chore | 2023-12-14 14:13:12 +08:00
josc146 | e0bf44d82f | bump MIDI-LLM-tokenizer (fix note off) | 2023-12-14 13:33:27 +08:00
josc146 | f328e84ea7 | update Readme_Install.txt | 2023-12-13 15:23:34 +08:00
github-actions[bot] | c81f5015a1 | release v1.6.2 | 2023-12-12 15:51:23 +00:00
47 changed files with 66247 additions and 128 deletions

2
.gitattributes vendored

@@ -1,3 +1,5 @@
* text=auto eol=lf
backend-python/rwkv_pip/** linguist-vendored
backend-python/wkv_cuda_utils/** linguist-vendored
backend-python/get-pip.py linguist-vendored

9
.github/dependabot.yml vendored Normal file

@@ -0,0 +1,9 @@
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
    commit-message:
      prefix: "chore"
      include: "scope"


@@ -98,6 +98,7 @@ jobs:
rm ./backend-python/get-pip.py
rm ./backend-python/rwkv_pip/cpp/librwkv.dylib
rm ./backend-python/rwkv_pip/cpp/rwkv.dll
rm ./backend-python/rwkv_pip/webgpu/web_rwkv_py.cp310-win_amd64.pyd
make
mv build/bin/RWKV-Runner build/bin/RWKV-Runner_linux_x64
@@ -124,6 +125,7 @@ jobs:
rm ./backend-python/get-pip.py
rm ./backend-python/rwkv_pip/cpp/rwkv.dll
rm ./backend-python/rwkv_pip/cpp/librwkv.so
rm ./backend-python/rwkv_pip/webgpu/web_rwkv_py.cp310-win_amd64.pyd
make
cp build/darwin/Readme_Install.txt build/bin/Readme_Install.txt
cp build/bin/RWKV-Runner.app/Contents/MacOS/RWKV-Runner build/bin/RWKV-Runner_darwin_universal


@@ -1,13 +1,17 @@
## Changes
-- rwkv.cpp python38 compatibility
-- improve rwkv.cpp operation prompts
-- add load failed traceback
-- fix windows cmd waiting
+- improve refreshRemoteModels
+- reduce precompiled web_rwkv_py size
+- webgpu(Python) max_buffer_size (12B support) and turbo
+- improve role-playing effect
+- update manifest.json (a lot of new models)
+- bump webgpu(ai00_server) mode to v0.3.8
+- improve details
## Install
- Windows: https://github.com/josStorer/RWKV-Runner/blob/master/build/windows/Readme_Install.txt
- MacOS: https://github.com/josStorer/RWKV-Runner/blob/master/build/darwin/Readme_Install.txt
- Linux: https://github.com/josStorer/RWKV-Runner/blob/master/build/linux/Readme_Install.txt
-- Server-Deploy-Examples: https://github.com/josStorer/RWKV-Runner/tree/master/deploy-examples
+- Simple Deploy Example: https://github.com/josStorer/RWKV-Runner/blob/master/README.md#simple-deploy-example
+- Server Deploy Examples: https://github.com/josStorer/RWKV-Runner/tree/master/deploy-examples


@@ -47,13 +47,28 @@ English | [简体中文](README_ZH.md) | [日本語](README_JA.md)
</div>
-#### Tip: You can deploy [backend-python](./backend-python/) on a server and use this program as a client only. Fill in your server address in the Settings `API URL`.
+## Tips
-#### Default configs has enabled custom CUDA kernel acceleration, which is much faster and consumes much less VRAM. If you encounter possible compatibility issues (output garbled), go to the Configs page and turn off `Use Custom CUDA kernel to Accelerate`, or try to upgrade your gpu driver.
+- You can deploy [backend-python](./backend-python/) on a server and use this program as a client only. Fill in
+your server address in the Settings `API URL`.
-#### If Windows Defender claims this is a virus, you can try downloading [v1.3.7_win.zip](https://github.com/josStorer/RWKV-Runner/releases/download/v1.3.7/RWKV-Runner_win.zip) and letting it update automatically to the latest version, or add it to the trusted list (`Windows Security` -> `Virus & threat protection` -> `Manage settings` -> `Exclusions` -> `Add or remove exclusions` -> `Add an exclusion` -> `Folder` -> `RWKV-Runner`).
+- If you are deploying and providing public services, please limit the request size through API gateway to prevent
+excessive resource usage caused by submitting overly long prompts. Additionally, please restrict the upper limit of
+requests' max_tokens based on your actual
+situation: https://github.com/josStorer/RWKV-Runner/blob/master/backend-python/utils/rwkv.py#L567, the default is set
+as le=102400, which may result in significant resource consumption for individual responses in extreme cases.
-#### For different tasks, adjusting API parameters can achieve better results. For example, for translation tasks, you can try setting Temperature to 1 and Top_P to 0.3.
+- Default configs has enabled custom CUDA kernel acceleration, which is much faster and consumes much less VRAM. If you
+encounter possible compatibility issues (output garbled), go to the Configs page and turn
+off `Use Custom CUDA kernel to Accelerate`, or try to upgrade your gpu driver.
+- If Windows Defender claims this is a virus, you can try
+downloading [v1.3.7_win.zip](https://github.com/josStorer/RWKV-Runner/releases/download/v1.3.7/RWKV-Runner_win.zip)
+and letting it update automatically to the latest version, or add it to the trusted
+list (`Windows Security` -> `Virus & threat protection` -> `Manage settings` -> `Exclusions` -> `Add or remove exclusions` -> `Add an exclusion` -> `Folder` -> `RWKV-Runner`).
+- For different tasks, adjusting API parameters can achieve better results. For example, for translation tasks, you can
+try setting Temperature to 1 and Top_P to 0.3.
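The public-deployment tip in this hunk (limit request size, cap `max_tokens`) could be enforced in a thin layer in front of the backend. The following is a minimal sketch only: `guard_request`, `MAX_PROMPT_CHARS` and `MAX_TOKENS_CAP` are hypothetical names and values chosen for illustration, not part of RWKV-Runner.

```python
# Hypothetical limits; choose values that match your hardware and traffic.
MAX_PROMPT_CHARS = 8_000
MAX_TOKENS_CAP = 2_048


def guard_request(body: dict) -> dict:
    """Reject oversized prompts and clamp max_tokens before forwarding a request."""
    prompt = body.get("prompt", "")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(
            f"prompt too long: {len(prompt)} > {MAX_PROMPT_CHARS} characters"
        )
    guarded = dict(body)  # shallow copy so the caller's dict is left untouched
    requested = int(guarded.get("max_tokens", MAX_TOKENS_CAP))
    # Clamp into [1, MAX_TOKENS_CAP] instead of trusting the client value.
    guarded["max_tokens"] = max(1, min(requested, MAX_TOKENS_CAP))
    return guarded
```

A request that asks for the backend's default upper bound of 102400 tokens would be forwarded with `max_tokens` reduced to the cap, while an overly long prompt is rejected before it reaches the model.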
## Features
@@ -168,6 +183,10 @@ Tip: You can download https://github.com/josStorer/sgm_plus and unzip it to the
to use it as an offline sound source. Please note that if you are compiling the program from source code, do not place
it in the source code directory.
If you don't have a MIDI keyboard, you can use virtual MIDI input software like `Virtual Midi Controller 3 LE`, along
with [loopMIDI](https://www.tobias-erichsen.de/wp-content/uploads/2020/01/loopMIDISetup_1_0_16_27.zip), to use a regular
computer keyboard as MIDI input.
### USB MIDI Connection
- USB MIDI devices are plug-and-play, and you can select your input device in the Composition page
@@ -206,12 +225,16 @@ it in the source code directory.
## Related Repositories:
- RWKV-5-World: https://huggingface.co/BlinkDL/rwkv-5-world/tree/main
- RWKV-4-World: https://huggingface.co/BlinkDL/rwkv-4-world/tree/main
- RWKV-4-Raven: https://huggingface.co/BlinkDL/rwkv-4-raven/tree/main
- ChatRWKV: https://github.com/BlinkDL/ChatRWKV
- RWKV-LM: https://github.com/BlinkDL/RWKV-LM
- RWKV-LM-LoRA: https://github.com/Blealtan/RWKV-LM-LoRA
- MIDI-LLM-tokenizer: https://github.com/briansemrau/MIDI-LLM-tokenizer
- ai00_rwkv_server: https://github.com/cgisky1980/ai00_rwkv_server
- rwkv.cpp: https://github.com/saharNooby/rwkv.cpp
- web-rwkv-py: https://github.com/cryscan/web-rwkv-py
## Preview


@@ -47,13 +47,26 @@
</div>
#### ヒント:サーバーに[backend-python](./backend-python/)をデプロイし、このプログラムをクライアントとして使用することができます。設定された`API URL`にサーバーアドレスを入力してください。
## ヒント
#### デフォルトの設定はカスタム CUDA カーネルアクセラレーションを有効にしています。互換性の問題 (文字化けを出力する) が発生する可能性がある場合は、コンフィグページに移動し、`Use Custom CUDA kernel to Accelerate` をオフにしてください、あるいは、GPUドライバーをアップグレードしてみてください。
- サーバーに [backend-python](./backend-python/)
をデプロイし、このプログラムをクライアントとして使用することができます。設定された`API URL`にサーバーアドレスを入力してください。
#### Windows Defender がこれをウイルスだと主張する場合は、[v1.3.7_win.zip](https://github.com/josStorer/RWKV-Runner/releases/download/v1.3.7/RWKV-Runner_win.zip) をダウンロードして最新版に自動更新させるか、信頼済みリストに追加してみてください (`Windows Security` -> `Virus & threat protection` -> `Manage settings` -> `Exclusions` -> `Add or remove exclusions` -> `Add an exclusion` -> `Folder` -> `RWKV-Runner`)。
- もし、あなたがデプロイし、外部に公開するサービスを提供している場合、APIゲートウェイを使用してリクエストのサイズを制限し、
長すぎるプロンプトの提出がリソースを占有しないようにしてください。さらに、実際の状況に応じて、リクエストの max_tokens
の上限を制限してくださいhttps://github.com/josStorer/RWKV-Runner/blob/master/backend-python/utils/rwkv.py#L567
、デフォルトは le=102400 ですが、極端な場合には単一の応答が大量のリソースを消費する可能性があります。
#### 異なるタスクについては、API パラメータを調整することで、より良い結果を得ることができます。例えば、翻訳タスクの場合、Temperature を 1 に、Top_P を 0.3 に設定してみてください。
- デフォルトの設定はカスタム CUDA カーネルアクセラレーションを有効にしています。互換性の問題 (文字化けを出力する)
が発生する可能性がある場合は、コンフィグページに移動し、`Use Custom CUDA kernel to Accelerate`
をオフにしてください、あるいは、GPUドライバーをアップグレードしてみてください。
- Windows Defender
がこれをウイルスだと主張する場合は、[v1.3.7_win.zip](https://github.com/josStorer/RWKV-Runner/releases/download/v1.3.7/RWKV-Runner_win.zip)
をダウンロードして最新版に自動更新させるか、信頼済みリストに追加してみてください (`Windows Security` -> `Virus & threat protection` -> `Manage settings` -> `Exclusions` -> `Add or remove exclusions` -> `Add an exclusion` -> `Folder` -> `RWKV-Runner`)。
- 異なるタスクについては、API パラメータを調整することで、より良い結果を得ることができます。例えば、翻訳タスクの場合、Temperature
を 1 に、Top_P を 0.3 に設定してみてください。
## 特徴
@@ -167,6 +180,10 @@ Tip: You can download https://github.com/josStorer/sgm_plus and unzip it to the
to use it as an offline sound source. Please note that if you are compiling the program from source code, do not place
it in the source code directory.
MIDIキーボードをお持ちでない場合、`Virtual Midi Controller 3 LE`
などの仮想MIDI入力ソフトウェアを使用することができます。[loopMIDI](https://www.tobias-erichsen.de/wp-content/uploads/2020/01/loopMIDISetup_1_0_16_27.zip)
を組み合わせて、通常のコンピュータキーボードをMIDI入力として使用できます。
### USB MIDI Connection
- USB MIDI devices are plug-and-play, and you can select your input device in the Composition page
@@ -205,12 +222,16 @@ it in the source code directory.
## 関連リポジトリ:
- RWKV-5-World: https://huggingface.co/BlinkDL/rwkv-5-world/tree/main
- RWKV-4-World: https://huggingface.co/BlinkDL/rwkv-4-world/tree/main
- RWKV-4-Raven: https://huggingface.co/BlinkDL/rwkv-4-raven/tree/main
- ChatRWKV: https://github.com/BlinkDL/ChatRWKV
- RWKV-LM: https://github.com/BlinkDL/RWKV-LM
- RWKV-LM-LoRA: https://github.com/Blealtan/RWKV-LM-LoRA
- MIDI-LLM-tokenizer: https://github.com/briansemrau/MIDI-LLM-tokenizer
- ai00_rwkv_server: https://github.com/cgisky1980/ai00_rwkv_server
- rwkv.cpp: https://github.com/saharNooby/rwkv.cpp
- web-rwkv-py: https://github.com/cryscan/web-rwkv-py
## Preview


@@ -46,13 +46,22 @@ API兼容的接口这意味着一切ChatGPT客户端都是RWKV客户端。
</div>
#### 小贴士:你可以在服务器部署[backend-python](./backend-python/),然后将此程序仅用作客户端,在设置的`API URL`中填入你的服务器地址
## 小贴士
#### 预设配置已经开启自定义CUDA算子加速速度更快且显存消耗更少。如果你遇到可能的兼容性(输出乱码)问题,前往配置页面,关闭`使用自定义CUDA算子加速`,或更新你的显卡驱动
- 你可以在服务器部署[backend-python](./backend-python/),然后将此程序仅用作客户端,在设置的`API URL`中填入你的服务器地址
#### 如果Windows Defender说这是一个病毒你可以尝试下载[v1.3.7_win.zip](https://github.com/josStorer/RWKV-Runner/releases/download/v1.3.7/RWKV-Runner_win.zip),然后让其自动更新到最新版,或添加信任 (`Windows Security` -> `Virus & threat protection` -> `Manage settings` -> `Exclusions` -> `Add or remove exclusions` -> `Add an exclusion` -> `Folder` -> `RWKV-Runner`)
- 如果你正在部署并对外提供公开服务请通过API网关限制请求大小避免过长的prompt提交占用资源。此外请根据你的实际情况限制请求的
max_tokens 上限: https://github.com/josStorer/RWKV-Runner/blob/master/backend-python/utils/rwkv.py#L567,
默认le=102400, 这可能导致极端情况下单个响应消耗大量资源
#### 对于不同的任务调整API参数会获得更好的效果例如对于翻译任务你可以尝试设置Temperature为1Top_P为0.3
- 预设配置已经开启自定义CUDA算子加速速度更快且显存消耗更少。如果你遇到可能的兼容性(输出乱码)
问题,前往配置页面,关闭`使用自定义CUDA算子加速`,或更新你的显卡驱动
- 如果 Windows Defender
说这是一个病毒,你可以尝试下载[v1.3.7_win.zip](https://github.com/josStorer/RWKV-Runner/releases/download/v1.3.7/RWKV-Runner_win.zip)
然后让其自动更新到最新版,或添加信任 (`Windows Security` -> `Virus & threat protection` -> `Manage settings` -> `Exclusions` -> `Add or remove exclusions` -> `Add an exclusion` -> `Folder` -> `RWKV-Runner`)
- 对于不同的任务调整API参数会获得更好的效果例如对于翻译任务你可以尝试设置Temperature为1Top_P为0.3
## 功能
@@ -161,6 +170,9 @@ for i in np.argsort(embeddings_cos_sim)[::-1]:
小贴士: 你可以下载 https://github.com/josStorer/sgm_plus, 并解压到程序的`assets/sound-font`目录, 以使用离线音源. 注意,
如果你正在从源码编译程序, 请不要将其放置在源码目录中
如果你没有MIDI键盘, 你可以使用像 `Virtual Midi Controller 3 LE` 这样的虚拟MIDI输入软件,
配合[loopMIDI](https://www.tobias-erichsen.de/wp-content/uploads/2020/01/loopMIDISetup_1_0_16_27.zip), 使用普通电脑键盘作为MIDI输入
### USB MIDI 连接
- USB MIDI设备是即插即用的, 你能够在作曲页面选择你的输入设备
@@ -192,12 +204,16 @@ for i in np.argsort(embeddings_cos_sim)[::-1]:
## 相关仓库:
- RWKV-5-World: https://huggingface.co/BlinkDL/rwkv-5-world/tree/main
- RWKV-4-World: https://huggingface.co/BlinkDL/rwkv-4-world/tree/main
- RWKV-4-Raven: https://huggingface.co/BlinkDL/rwkv-4-raven/tree/main
- ChatRWKV: https://github.com/BlinkDL/ChatRWKV
- RWKV-LM: https://github.com/BlinkDL/RWKV-LM
- RWKV-LM-LoRA: https://github.com/Blealtan/RWKV-LM-LoRA
- MIDI-LLM-tokenizer: https://github.com/briansemrau/MIDI-LLM-tokenizer
- ai00_rwkv_server: https://github.com/cgisky1980/ai00_rwkv_server
- rwkv.cpp: https://github.com/saharNooby/rwkv.cpp
- web-rwkv-py: https://github.com/cryscan/web-rwkv-py
## Preview


@@ -50,9 +50,12 @@ func (a *App) OnStartup(ctx context.Context) {
os.Mkdir(a.exDir+"models", os.ModePerm)
os.Mkdir(a.exDir+"lora-models", os.ModePerm)
os.Mkdir(a.exDir+"finetune/json2binidx_tool/data", os.ModePerm)
-f, err := os.Create(a.exDir + "lora-models/train_log.txt")
-if err == nil {
-f.Close()
+trainLogPath := a.exDir + "lora-models/train_log.txt"
+if !a.FileExists(trainLogPath) {
+f, err := os.Create(trainLogPath)
+if err == nil {
+f.Close()
+}
}
a.downloadLoop()


@@ -10,7 +10,7 @@ import (
"strings"
)
-func (a *App) StartServer(python string, port int, host string, webui bool, rwkvBeta bool, rwkvcpp bool) (string, error) {
+func (a *App) StartServer(python string, port int, host string, webui bool, rwkvBeta bool, rwkvcpp bool, webgpu bool) (string, error) {
var err error
if python == "" {
python, err = GetPython()
@@ -28,6 +28,9 @@ func (a *App) StartServer(python string, port int, host string, webui bool, rwkv
if rwkvcpp {
args = append(args, "--rwkv.cpp")
}
+if webgpu {
+args = append(args, "--webgpu")
+}
args = append(args, "--port", strconv.Itoa(port), "--host", host)
return Cmd(args...)
}
@@ -55,6 +58,17 @@ func (a *App) ConvertSafetensors(modelPath string, outPath string) (string, erro
return Cmd(args...)
}
+func (a *App) ConvertSafetensorsWithPython(python string, modelPath string, outPath string) (string, error) {
+var err error
+if python == "" {
+python, err = GetPython()
+}
+if err != nil {
+return "", err
+}
+return Cmd(python, "./backend-python/convert_safetensors.py", "--input", modelPath, "--output", outPath)
+}
func (a *App) ConvertGGML(python string, modelPath string, outPath string, Q51 bool) (string, error) {
var err error
if python == "" {


@@ -3,6 +3,7 @@ package backend_golang
import (
"archive/zip"
"bufio"
+"crypto/sha256"
"embed"
"errors"
"fmt"
@@ -112,9 +113,19 @@ func CopyEmbed(efs embed.FS) error {
return err
}
-err = os.WriteFile(path, content, 0644)
-if err != nil {
-return err
+executeWrite := true
+existedContent, err := os.ReadFile(path)
+if err == nil {
+if fmt.Sprintf("%x", sha256.Sum256(existedContent)) == fmt.Sprintf("%x", sha256.Sum256(content)) {
+executeWrite = false
+}
+}
+if executeWrite {
+err = os.WriteFile(path, content, 0644)
+if err != nil {
+return err
+}
}
return nil
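The `CopyEmbed` change above skips rewriting a file when its SHA-256 digest already matches the embedded content. As an illustration only, the same write-if-changed idea can be sketched in Python; `write_if_changed` is a hypothetical helper, not a function from this repository.

```python
import hashlib
from pathlib import Path


def write_if_changed(path: Path, content: bytes) -> bool:
    """Write content to path unless the file already holds identical bytes.

    Returns True when a write actually happened.
    """
    try:
        existing = path.read_bytes()
    except OSError:
        existing = None  # missing or unreadable file: fall through and write
    if existing is not None:
        # Compare digests, mirroring the sha256 comparison in the Go hunk.
        if hashlib.sha256(existing).digest() == hashlib.sha256(content).digest():
            return False
    path.write_bytes(content)
    return True
```

Skipping the write leaves the file's modification time untouched, which is one plausible motivation for the change; the commit message ("better CopyEmbed") does not state the reason.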


@@ -30,6 +30,33 @@ def convert_file(pt_filename: str, sf_filename: str, rename={}, transpose_names=
if "state_dict" in loaded:
loaded = loaded["state_dict"]
kk = list(loaded.keys())
version = 4
for x in kk:
if "ln_x" in x:
version = max(5, version)
if "gate.weight" in x:
version = max(5.1, version)
if int(version) == 5 and "att.time_decay" in x:
if len(loaded[x].shape) > 1:
if loaded[x].shape[1] > 1:
version = max(5.2, version)
if "time_maa" in x:
version = max(6, version)
if version == 5.1 and "midi" in pt_filename.lower():
import numpy as np
np.set_printoptions(precision=4, suppress=True, linewidth=200)
kk = list(loaded.keys())
_, n_emb = loaded["emb.weight"].shape
for k in kk:
if "time_decay" in k or "time_faaaa" in k:
# print(k, mm[k].shape)
loaded[k] = (
loaded[k].unsqueeze(1).repeat(1, n_emb // loaded[k].shape[0])
)
loaded = {k: v.clone().half() for k, v in loaded.items()}
# for k, v in loaded.items():
# print(f'{k}\t{v.shape}\t{v.dtype}')
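The version detection added in this hunk infers the RWKV architecture version from parameter names. Because indentation is lost in this listing, the nesting below is an assumption, and the sketch works on a plain mapping of parameter name to shape instead of a loaded checkpoint; `detect_rwkv_version` is a hypothetical name.

```python
from typing import Dict, Tuple


def detect_rwkv_version(shapes: Dict[str, Tuple[int, ...]]) -> float:
    """Guess the RWKV version from parameter names (and one shape check).

    Assumes the checks are siblings inside a single loop over the keys,
    which is how the flattened hunk reads.
    """
    version = 4
    for name, shape in shapes.items():
        if "ln_x" in name:
            version = max(5, version)
        if "gate.weight" in name:
            version = max(5.1, version)
        # A 2-D time_decay with more than one column marks the 5.2 layout.
        if int(version) == 5 and "att.time_decay" in name:
            if len(shape) > 1 and shape[1] > 1:
                version = max(5.2, version)
        if "time_maa" in name:
            version = max(6, version)
    return version
```

Note that the 5.2 check only fires once an earlier key has already raised the version to 5.x, so the result depends on iteration order, as it does in the original loop.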


@@ -37,6 +37,11 @@ def get_args(args: Union[Sequence[str], None] = None):
action="store_true",
help="whether to use rwkv.cpp (default: False)",
)
group.add_argument(
"--webgpu",
action="store_true",
help="whether to use webgpu (default: False)",
)
args = parser.parse_args(args)
return args


@@ -8,7 +8,6 @@ import base64
from fastapi import APIRouter, Request, status, HTTPException
from sse_starlette.sse import EventSourceResponse
from pydantic import BaseModel, Field
import numpy as np
import tiktoken
from utils.rwkv import *
from utils.log import quick_log
@@ -335,6 +334,8 @@ The following is a coherent verbose detailed conversation between a girl named {
body.stop.append(f"\n\n{bot_code}")
elif body.stop is None:
body.stop = default_stop
if not body.presystem:
body.stop.append("\n\n")
if body.stream:
return EventSourceResponse(
@@ -396,6 +397,8 @@ class EmbeddingsBody(BaseModel):
def embedding_base64(embedding: List[float]) -> str:
import numpy as np
return base64.b64encode(np.array(embedding).astype(np.float32)).decode("utf-8")


@@ -37,10 +37,14 @@ def text_to_midi(body: TextToMidiBody):
async def midi_to_text(file_data: UploadFile):
vocab_config = "backend-python/utils/midi_vocab_config.json"
cfg = VocabConfig.from_json(vocab_config)
+filter_config = "backend-python/utils/midi_filter_config.json"
+filter_cfg = FilterConfig.from_json(filter_config)
mid = mido.MidiFile(file=file_data.file)
-text = convert_midi_to_str(cfg, mid)
+output_list = convert_midi_to_str(cfg, filter_cfg, mid)
+if len(output_list) == 0:
+raise HTTPException(status.HTTP_400_BAD_REQUEST, "bad midi file")
-return {"text": text}
+return {"text": output_list[0]}
class TxtToMidiBody(BaseModel):


@@ -87,18 +87,34 @@ def add_state(body: AddStateBody):
raise HTTPException(status.HTTP_400_BAD_REQUEST, "trie not loaded")
import torch
+import numpy as np
try:
+devices: List[torch.device] = []
+state: Union[Any, None] = None
+if body.state is not None:
+if type(body.state) == list or type(body.state) == np.ndarray:
+devices = [
+(
+tensor.device
+if hasattr(tensor, "device")
+else torch.device("cpu")
+)
+for tensor in body.state
+]
+state = (
+[tensor.cpu() for tensor in body.state]
+if hasattr(body.state[0], "device")
+else copy.deepcopy(body.state)
+)
+else:
+pass # WebGPU
id: int = trie.insert(body.prompt)
-devices: List[torch.device] = [
-(tensor.device if hasattr(tensor, "device") else torch.device("cpu"))
-for tensor in body.state
-]
dtrie[id] = {
"tokens": copy.deepcopy(body.tokens),
-"state": [tensor.cpu() for tensor in body.state]
-if hasattr(body.state[0], "device")
-else copy.deepcopy(body.state),
+"state": state,
"logits": copy.deepcopy(body.logits),
"devices": devices,
}
@@ -174,6 +190,7 @@ def longest_prefix_state(body: LongestPrefixStateBody, request: Request):
raise HTTPException(status.HTTP_400_BAD_REQUEST, "trie not loaded")
import torch
import numpy as np
id = -1
try:
@@ -185,14 +202,16 @@ def longest_prefix_state(body: LongestPrefixStateBody, request: Request):
v = dtrie[id]
devices: List[torch.device] = v["devices"]
prompt: str = trie[id]
+state: Union[Any, None] = v["state"]
+if state is not None and type(state) == list and hasattr(state[0], "device"):
+state = [tensor.to(devices[i]) for i, tensor in enumerate(state)]
quick_log(request, body, "Hit:\n" + prompt)
return {
"prompt": prompt,
"tokens": v["tokens"],
-"state": [tensor.to(devices[i]) for i, tensor in enumerate(v["state"])]
-if hasattr(v["state"][0], "device")
-else v["state"],
+"state": state,
"logits": v["logits"],
}
else:

File diff suppressed because it is too large


@@ -84,6 +84,8 @@ class PIPELINE:
return e / e.sum(axis=axis, keepdims=True)
def sample_logits(self, logits, temperature=1.0, top_p=0.85, top_k=0):
if type(logits) == list:
logits = np.array(logits)
np_logits = type(logits) == np.ndarray
if np_logits:
probs = self.np_softmax(logits, axis=-1)

26
backend-python/rwkv_pip/webgpu/model.py vendored Normal file

@@ -0,0 +1,26 @@
from typing import Any, List, Union
try:
import web_rwkv_py as wrp
except ModuleNotFoundError:
try:
from . import web_rwkv_py as wrp
except ImportError:
raise ModuleNotFoundError(
"web_rwkv_py not found, install it from https://github.com/cryscan/web-rwkv-py"
)
class RWKV:
def __init__(self, model_path: str, strategy: str = None):
self.model = wrp.v5.Model(
model_path,
turbo=True,
quant=32 if "i8" in strategy else None,
quant_nf4=26 if "i4" in strategy else None,
)
self.w = {} # fake weight
self.w["emb.weight"] = [0] * wrp.peek_info(model_path).num_vocab
def forward(self, tokens: List[int], state: Union[Any, None] = None):
return wrp.v5.run_one(self.model, tokens, state)
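In the new `model.py` above, `strategy` defaults to `None`, yet the constructor evaluates `"i8" in strategy`, which raises `TypeError` when no strategy is passed. A None-safe way to derive the same two arguments might look like the sketch below; `quant_kwargs` is a hypothetical helper, and the values 32 and 26 are copied from the diff without any claim about what they mean to `web_rwkv_py`.

```python
from typing import Dict, Optional


def quant_kwargs(strategy: Optional[str]) -> Dict[str, Optional[int]]:
    """Map a strategy string to the quant/quant_nf4 values used in the diff.

    A missing strategy is treated as "no quantization" instead of raising.
    """
    s = strategy or ""  # guard against strategy=None
    return {
        "quant": 32 if "i8" in s else None,
        "quant_nf4": 26 if "i4" in s else None,
    }
```

The result could then be unpacked into the model constructor call, for example `Model(model_path, turbo=True, **quant_kwargs(strategy))`, assuming the constructor accepts exactly the keyword arguments shown in the diff.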

Binary file not shown.


@@ -52,6 +52,8 @@ class VocabConfig:
bin_name_to_program_name: Dict[str, str]
# Mapping from program number to instrument name.
instrument_names: Dict[str, str]
# Manual override for velocity bins. Each element is the max velocity value for that bin by index.
velocity_bins_override: Optional[List[int]] = None
def __post_init__(self):
self.validate()
@@ -116,6 +118,12 @@ class VocabConfig:
raise ValueError("velocity_bins must be at least 2")
if len(self.bin_instrument_names) > 16:
raise ValueError("bin_instruments must have at most 16 values")
if self.velocity_bins_override:
print("VocabConfig is using velocity_bins_override. Ignoring velocity_exp.")
if len(self.velocity_bins_override) != self.velocity_bins:
raise ValueError(
"velocity_bins_override must have same length as velocity_bins"
)
if (
self.ch10_instrument_bin_name
and self.ch10_instrument_bin_name not in self.bin_instrument_names
@@ -156,6 +164,11 @@ class VocabUtils:
def velocity_to_bin(self, velocity: float) -> int:
velocity = max(0, min(velocity, self.cfg.velocity_events - 1))
if self.cfg.velocity_bins_override:
for i, v in enumerate(self.cfg.velocity_bins_override):
if velocity <= v:
return i
return 0
binsize = self.cfg.velocity_events / (self.cfg.velocity_bins - 1)
if self.cfg.velocity_exp == 1.0:
return ceil(velocity / binsize)
@@ -176,6 +189,8 @@ class VocabUtils:
)
def bin_to_velocity(self, bin: int) -> int:
if self.cfg.velocity_bins_override:
return self.cfg.velocity_bins_override[bin]
binsize = self.cfg.velocity_events / (self.cfg.velocity_bins - 1)
if self.cfg.velocity_exp == 1.0:
return max(0, ceil(bin * binsize - 1))
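The `velocity_bins_override` lookups added in the two hunks above can be tried in isolation. This is a minimal sketch assuming a hypothetical override list and `velocity_events = 128`; neither value comes from the diff.

```python
from typing import List


def velocity_to_bin(
    velocity: float, bins_override: List[int], velocity_events: int = 128
) -> int:
    """Return the index of the first bin whose max velocity covers the input."""
    velocity = max(0, min(velocity, velocity_events - 1))  # clamp as in the diff
    for i, upper in enumerate(bins_override):
        if velocity <= upper:
            return i
    return 0  # same fallback as the diff when no bin is large enough


def bin_to_velocity(bin_index: int, bins_override: List[int]) -> int:
    """Decode a bin back to its configured max velocity."""
    return bins_override[bin_index]


# Hypothetical override: each element is the max velocity of that bin.
EXAMPLE_BINS = [0, 31, 63, 95, 127]
```

With this list a velocity of 40 falls into bin 2 and decodes back to 63, so the round trip is lossy by design. If the last element were below the clamped maximum, louder notes would fall through to bin 0, which is worth keeping in mind when writing an override.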
@@ -358,13 +373,32 @@ class AugmentConfig:
)
@dataclass
class FilterConfig:
# Whether to filter out MIDI files with duplicate MD5 hashes.
deduplicate_md5: bool
# Minimum time delay between notes in a file before splitting into multiple documents.
piece_split_delay: float
# Minimum length of a piece in milliseconds.
min_piece_length: float
@classmethod
def from_json(cls, path: str):
with open(path, "r") as f:
config = json.load(f)
return cls(**config)
def mix_volume(velocity: int, volume: int, expression: int) -> float:
return velocity * (volume / 127.0) * (expression / 127.0)
def convert_midi_to_str(
-cfg: VocabConfig, mid: mido.MidiFile, augment: AugmentValues = None
-) -> str:
+cfg: VocabConfig,
+filter_cfg: FilterConfig,
+mid: mido.MidiFile,
+augment: AugmentValues = None,
+) -> List[str]:
utils = VocabUtils(cfg)
if augment is None:
augment = AugmentValues.default()
@@ -390,7 +424,9 @@ def convert_midi_to_str(
} # {channel: {(note, program) -> True}}
started_flag = False
output_list = []
output = ["<start>"]
output_length_ms = 0.0
token_data_buffer: List[
Tuple[int, int, int, float]
] = [] # need to sort notes between wait tokens
@@ -432,16 +468,33 @@ def convert_midi_to_str(
token_data_buffer = []
def consume_note_program_data(prog: int, chan: int, note: int, vel: float):
-nonlocal output, started_flag, delta_time_ms, cfg, utils, token_data_buffer
+nonlocal output, output_length_ms, started_flag, delta_time_ms, cfg, utils, token_data_buffer
is_token_valid = (
utils.prog_data_to_token_data(prog, chan, note, vel) is not None
)
if not is_token_valid:
return
if delta_time_ms > filter_cfg.piece_split_delay * 1000.0:
# check if any notes are still held
silent = True
for channel in channel_notes.keys():
if len(channel_notes[channel]) > 0:
silent = False
break
if silent:
flush_token_data_buffer()
output.append("<end>")
if output_length_ms > filter_cfg.min_piece_length * 1000.0:
output_list.append(" ".join(output))
output = ["<start>"]
output_length_ms = 0.0
started_flag = False
if started_flag:
wait_tokens = utils.data_to_wait_tokens(delta_time_ms)
if len(wait_tokens) > 0:
flush_token_data_buffer()
output_length_ms += delta_time_ms
output += wait_tokens
delta_time_ms = 0.0
token_data_buffer.append((prog, chan, note, vel * augment.velocity_mod_factor))
@@ -510,7 +563,9 @@ def convert_midi_to_str(
flush_token_data_buffer()
output.append("<end>")
-return " ".join(output)
+if output_length_ms > filter_cfg.min_piece_length * 1000.0:
+output_list.append(" ".join(output))
+return output_list
def generate_program_change_messages(cfg: VocabConfig):
@@ -633,10 +688,10 @@ def token_to_midi_message(
if utils.cfg.decode_fix_repeated_notes:
if (channel, note) in state.active_notes:
del state.active_notes[(channel, note)]
yield mido.Message(
"note_off", note=note, time=ticks, channel=channel
), state
ticks = 0
yield mido.Message(
"note_off", note=note, time=ticks, channel=channel
), state
ticks = 0
state.active_notes[(channel, note)] = state.total_time
yield mido.Message(
"note_on", note=note, velocity=velocity, time=ticks, channel=channel


@@ -0,0 +1,5 @@
{
"deduplicate_md5": true,
"piece_split_delay": 10000,
"min_piece_length": 0
}
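To see how these values are consumed, the sketch below loads the same JSON through a copy of the `FilterConfig` dataclass from the diff and reproduces the `* 1000.0` scaling that `convert_midi_to_str` applies before comparing against `delta_time_ms`. The temporary file and the `load_example` helper are illustrative assumptions.

```python
import json
import os
import tempfile
from dataclasses import dataclass


@dataclass
class FilterConfig:
    deduplicate_md5: bool
    piece_split_delay: float
    min_piece_length: float

    @classmethod
    def from_json(cls, path: str) -> "FilterConfig":
        with open(path, "r") as f:
            return cls(**json.load(f))


# Same content as the new midi_filter_config.json shown above.
CONFIG_TEXT = (
    '{"deduplicate_md5": true, "piece_split_delay": 10000, "min_piece_length": 0}'
)


def load_example() -> FilterConfig:
    """Write the config to a temporary file and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "midi_filter_config.json")
        with open(path, "w") as f:
            f.write(CONFIG_TEXT)
        return FilterConfig.from_json(path)


cfg = load_example()
# The diff compares delta_time_ms against piece_split_delay * 1000.0.
split_threshold_ms = cfg.piece_split_delay * 1000.0
```

Under that scaling, a `piece_split_delay` of 10000 gives a threshold of 10,000,000 ms, so splitting would rarely trigger in practice, and a `min_piece_length` of 0 only drops pieces whose accumulated length is exactly zero. Whether that is the intent is not stated in the diff.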


@@ -8,7 +8,6 @@ from typing import Dict, Iterable, List, Tuple, Union, Type
from utils.log import quick_log
from fastapi import HTTPException
from pydantic import BaseModel, Field
import numpy as np
from routes import state_cache
import global_var
@@ -68,6 +67,8 @@ class AbstractRWKV(ABC):
pass
def get_embedding(self, input: str, fast_mode: bool) -> Tuple[List[float], int]:
import numpy as np
if fast_mode:
embedding, token_len = self.__fast_embedding(
self.fix_tokens(self.pipeline.encode(input)), None
@@ -222,6 +223,8 @@ class AbstractRWKV(ABC):
def generate(
self, prompt: str, stop: Union[str, List[str], None] = None
) -> Iterable[Tuple[str, str, int, int]]:
import numpy as np
quick_log(None, None, "Generation Prompt:\n" + prompt)
cache = None
delta_prompt = prompt
@@ -231,7 +234,7 @@ class AbstractRWKV(ABC):
)
except HTTPException:
pass
-if cache is None or cache["prompt"] == "":
+if cache is None or cache["prompt"] == "" or cache["state"] is None:
self.model_state = None
self.model_tokens = []
else:
@@ -511,6 +514,7 @@ def get_tokenizer(tokenizer_len: int):
def RWKV(model: str, strategy: str, tokenizer: Union[str, None]) -> AbstractRWKV:
rwkv_beta = global_var.get(global_var.Args).rwkv_beta
rwkv_cpp = getattr(global_var.get(global_var.Args), "rwkv.cpp")
webgpu = global_var.get(global_var.Args).webgpu
if "midi" in model.lower() or "abc" in model.lower():
os.environ["RWKV_RESCALE_LAYER"] = "999"
@@ -526,6 +530,11 @@ def RWKV(model: str, strategy: str, tokenizer: Union[str, None]) -> AbstractRWKV
from rwkv_pip.cpp.model import (
RWKV as Model,
)
elif webgpu:
print("Using webgpu")
from rwkv_pip.webgpu.model import (
RWKV as Model,
)
else:
from rwkv_pip.model import (
RWKV as Model,


@@ -1,3 +1,8 @@
Client Download URL:
客户端下载地址:
クライアントのダウンロードURL:
https://github.com/josStorer/RWKV-Runner/releases/latest/download/RWKV-Runner_macos_universal.zip
For Mac and Linux users, please manually install Python 3.10 (usually the latest systems come with it built-in). You can specify the Python interpreter to use in Settings. (which python3)
对于Mac和Linux用户请手动安装 Python3.10 (通常最新的系统已经内置了). 你可以在设置中指定使用的Python解释器. (which python3)
MacおよびLinuxのユーザーの方は、Python3.10を手動でインストールしてください(通常、最新のシステムには既に組み込まれています)。 設定メニューで使用するPythonインタプリタを指定することができます。 (which python3)


@@ -1,3 +1,8 @@
Client Download URL:
客户端下载地址:
クライアントのダウンロードURL:
https://github.com/josStorer/RWKV-Runner/releases/latest/download/RWKV-Runner_linux_x64
For Mac and Linux users, please manually install Python 3.10 (usually the latest systems come with it built-in). You can specify the Python interpreter to use in Settings.
对于Mac和Linux用户请手动安装 Python3.10 (通常最新的系统已经内置了). 你可以在设置中指定使用的Python解释器.
MacおよびLinuxのユーザーの方は、Python3.10を手動でインストールしてください(通常、最新のシステムには既に組み込まれています)。 設定メニューで使用するPythonインタプリタを指定することができます。


@@ -1,3 +1,8 @@
Client Download URL:
客户端下载地址:
クライアントのダウンロードURL:
https://github.com/josStorer/RWKV-Runner/releases/latest/download/RWKV-Runner_windows_x64.exe
Please execute this program in an empty directory. All related dependencies will be placed in this directory.
请将本程序放在一个空目录内执行, 所有相关依赖均会放置于此目录.
このプログラムを空のディレクトリで実行してください。関連するすべての依存関係は、このディレクトリに配置されます。


@@ -19,14 +19,15 @@ document.querySelectorAll('.grid.h-10.grid-cols-12.place-content-center.gap-x-3.
if (!data.name.endsWith('.bin') && !data.name.endsWith('.pth'))
return
-data.desc = {en: '', zh: ''}
+data.desc = { en: '', zh: '', ja: '' }
const rawText = await (await fetch(e.children[1].href.replace('/resolve/', '/raw/'))).text()
data.size = parseInt(extractValue(rawText, 'size'))
data.SHA256 = extractValue(rawText, 'oid sha256:')
data.lastUpdated = e.children[3].children[0].getAttribute('datetime')
-data.url = e.children[1].href.replace('/resolve/', '/blob/')
-data.downloadUrl = e.children[1].href
+data.url = e.children[1].href.replace('/resolve/', '/blob/').replace('?download=true', '')
+data.downloadUrl = e.children[1].href.replace('?download=true', '')
data.tags = []
modelsJson.push(data)
})


@@ -18,6 +18,7 @@
"file-saver": "^2.0.5",
"html-midi-player": "^1.5.0",
"i18next": "^22.4.15",
"lodash-es": "^4.17.21",
"mobx": "^6.9.0",
"mobx-react-lite": "^3.4.3",
"pdfjs-dist": "^4.0.189",
@@ -40,6 +41,7 @@
},
"devDependencies": {
"@types/file-saver": "^2.0.7",
"@types/lodash-es": "^4.17.12",
"@types/react": "^18.2.6",
"@types/react-beautiful-dnd": "^13.1.4",
"@types/react-dom": "^18.2.4",
@@ -2533,6 +2535,21 @@
"hoist-non-react-statics": "^3.3.0"
}
},
"node_modules/@types/lodash": {
"version": "4.14.202",
"resolved": "https://registry.npmjs.org/@types/lodash/-/lodash-4.14.202.tgz",
"integrity": "sha512-OvlIYQK9tNneDlS0VN54LLd5uiPCBOp7gS5Z0f1mjoJYBrtStzgmJBxONW3U6OZqdtNzZPmn9BS/7WI7BFFcFQ==",
"dev": true
},
"node_modules/@types/lodash-es": {
"version": "4.17.12",
"resolved": "https://registry.npmjs.org/@types/lodash-es/-/lodash-es-4.17.12.tgz",
"integrity": "sha512-0NgftHUcV4v34VhXm8QBSftKVXtbkBG3ViCjs6+eJ5a6y6Mi/jiFGPc1sC7QK+9BFhWrURE3EOggmWaSxL9OzQ==",
"dev": true,
"dependencies": {
"@types/lodash": "*"
}
},
"node_modules/@types/long": {
"version": "4.0.2",
"resolved": "https://registry.npmjs.org/@types/long/-/long-4.0.2.tgz",
@@ -4210,6 +4227,11 @@
"integrity": "sha512-7ylylesZQ/PV29jhEDl3Ufjo6ZX7gCqJr5F7PKrqc93v7fzSymt1BpwEU8nAUXs8qzzvqhbjhK5QZg6Mt/HkBg==",
"dev": true
},
"node_modules/lodash-es": {
"version": "4.17.21",
"resolved": "https://registry.npmjs.org/lodash-es/-/lodash-es-4.17.21.tgz",
"integrity": "sha512-mKnC+QJ9pWVzv+C4/U3rRsHapFfHvQFoFB92e52xeyGMcX6/OlIl78je1u8vePzYZSkkogMPJ2yjxxsb89cxyw=="
},
"node_modules/long": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/long/-/long-4.0.0.tgz",

View File

@@ -19,6 +19,7 @@
"file-saver": "^2.0.5",
"html-midi-player": "^1.5.0",
"i18next": "^22.4.15",
"lodash-es": "^4.17.21",
"mobx": "^6.9.0",
"mobx-react-lite": "^3.4.3",
"pdfjs-dist": "^4.0.189",
@@ -41,6 +42,7 @@
},
"devDependencies": {
"@types/file-saver": "^2.0.7",
"@types/lodash-es": "^4.17.12",
"@types/react": "^18.2.6",
"@types/react-beautiful-dnd": "^13.1.4",
"@types/react-dom": "^18.2.4",

View File

@@ -162,7 +162,7 @@
"Memory is not enough, try to increase the virtual memory or use a smaller model.": "メモリが不足しています。仮想メモリを増やすか、もしくは小さなモデルを使ってみてください",
"Bad PyTorch version, please reinstall PyTorch with cuda.": "不適切なPyTorchのバージョンです。cudaと共にPyTorchを再インストールしてください。",
"The model file is corrupted, please download again.": "モデルファイルが破損しています。再度ダウンロードしてください。",
"Found no NVIDIA driver, please install the latest driver.": "NVIDIAのドライバが見つかりません。最新版のドライバをインストールしてください。",
"Found no NVIDIA driver, please install the latest driver. If you are not using an Nvidia GPU, please switch the 'Strategy' to WebGPU or CPU in the Configs page.": "NVIDIAのドライバが見つかりません。最新版のドライバをインストールしてください。NvidiaのGPUを使用していない場合は、設定ページで\"Strategy\"をWebGPUまたはCPUに切り替えてください。",
"VRAM is not enough, please reduce stored layers or use a lower precision in Configs page.": "VRAMが足りません。設定ページで保存されているレイヤーを減らすか、精度を下げてください。",
"Failed to enable custom CUDA kernel, ninja is required to load C++ extensions. You may be using the CPU version of PyTorch, please reinstall PyTorch with CUDA. Or if you are using a custom Python interpreter, you must compile the CUDA kernel by yourself or disable Custom CUDA kernel acceleration.": "カスタムCUDAカーネルの有効化に失敗しました。C++拡張を読み込むためにはNinjaが必要です。あなたは恐らくCPU版のPyTorchを使用しており、CUDA版のPyTorchを再インストールする必要があります。または、あなたがカスタムPythonインタプリタを使用している場合は、CUDAカーネルを自分でコンパイルするか、カスタムCUDAカーネルのアクセラレーションを無効にする必要があります。",
"Presets": "プリセット",
@@ -312,6 +312,8 @@
"JP": "日本語",
"Music": "音楽",
"Other": "その他",
"Role Play": "ロールプレイ",
"Recommended": "おすすめ",
"Import MIDI": "MIDIをインポート",
"Current Instrument": "現在の楽器",
"Please convert model to GGML format first": "モデルをGGML形式に変換してください",
@@ -320,5 +322,6 @@
"Play With External Player": "外部プレーヤーで再生",
"Core API URL": "コアAPI URL",
"Override core API URL(/chat/completions and /completions). If you don't know what this is, leave it blank.": "コアAPI URLを上書きします(/chat/completions と /completions)。何であるかわからない場合は空白のままにしてください。",
"Please change Strategy to CPU (rwkv.cpp) to use ggml format": "StrategyをCPU (rwkv.cpp)に変更して、ggml形式を使用してください"
"Please change Strategy to CPU (rwkv.cpp) to use ggml format": "StrategyをCPU (rwkv.cpp)に変更して、ggml形式を使用してください",
"Only Auto Play Generated Content": "生成されたコンテンツのみ自動再生"
}

View File

@@ -162,7 +162,7 @@
"Memory is not enough, try to increase the virtual memory or use a smaller model.": "内存不足,尝试增加虚拟内存,或使用一个更小规模的模型",
"Bad PyTorch version, please reinstall PyTorch with cuda.": "错误的PyTorch版本请重新安装CUDA版本的PyTorch",
"The model file is corrupted, please download again.": "模型文件损坏,请重新下载",
"Found no NVIDIA driver, please install the latest driver.": "没有找到NVIDIA驱动请安装最新驱动",
"Found no NVIDIA driver, please install the latest driver. If you are not using an Nvidia GPU, please switch the 'Strategy' to WebGPU or CPU in the Configs page.": "没有找到NVIDIA驱动请安装最新驱动。如果你没有使用Nvidia显卡请在配置页面将“Strategy”改为WebGPU或CPU",
"VRAM is not enough, please reduce stored layers or use a lower precision in Configs page.": "显存不足,请在配置页面减少载入显存层数,或使用更低的精度",
"Failed to enable custom CUDA kernel, ninja is required to load C++ extensions. You may be using the CPU version of PyTorch, please reinstall PyTorch with CUDA. Or if you are using a custom Python interpreter, you must compile the CUDA kernel by yourself or disable Custom CUDA kernel acceleration.": "自定义CUDA算子开启失败需要安装Ninja来读取C++扩展。你可能正在使用CPU版本的PyTorch请重新安装CUDA版本的PyTorch。如果你正在使用自定义Python解释器你必须自己编译CUDA算子或禁用自定义CUDA算子加速",
"Presets": "预设",
@@ -312,6 +312,8 @@
"JP": "日文",
"Music": "音乐",
"Other": "其他",
"Role Play": "角色扮演",
"Recommended": "推荐",
"Import MIDI": "导入MIDI",
"Current Instrument": "当前乐器",
"Please convert model to GGML format first": "请先将模型转换为GGML格式",
@@ -320,5 +322,6 @@
"Play With External Player": "使用外部播放器播放",
"Core API URL": "核心 API URL",
"Override core API URL(/chat/completions and /completions). If you don't know what this is, leave it blank.": "覆盖核心的 API URL (/chat/completions 和 /completions)。如果你不知道这是什么,请留空",
"Please change Strategy to CPU (rwkv.cpp) to use ggml format": "请将Strategy改为CPU (rwkv.cpp)以使用ggml格式"
"Please change Strategy to CPU (rwkv.cpp) to use ggml format": "请将Strategy改为CPU (rwkv.cpp)以使用ggml格式",
"Only Auto Play Generated Content": "仅自动播放新生成的内容"
}

View File

@@ -48,6 +48,7 @@ export const RunButton: FC<{ onClickRun?: MouseEventHandler, iconMode?: boolean
const modelConfig = commonStore.getCurrentModelConfig();
const webgpu = modelConfig.modelParameters.device === 'WebGPU';
const webgpuPython = modelConfig.modelParameters.device === 'WebGPU (Python)';
const cpp = modelConfig.modelParameters.device === 'CPU (rwkv.cpp)';
let modelName = '';
let modelPath = '';
@@ -77,7 +78,7 @@ export const RunButton: FC<{ onClickRun?: MouseEventHandler, iconMode?: boolean
});
};
if (webgpu) {
if (webgpu || webgpuPython) {
if (!['.st', '.safetensors'].some(ext => modelPath.endsWith(ext))) {
const stModelPath = modelPath.replace(/\.pth$/, '.st');
if (await FileExists(stModelPath)) {
@@ -92,7 +93,7 @@ export const RunButton: FC<{ onClickRun?: MouseEventHandler, iconMode?: boolean
return;
} else {
toastWithButton(t('Please convert model to safe tensors format first'), t('Convert'), () => {
convertToSt(modelConfig);
convertToSt(modelConfig, navigate);
});
commonStore.setStatus({ status: ModelStatus.Offline });
return;
@@ -100,7 +101,7 @@ export const RunButton: FC<{ onClickRun?: MouseEventHandler, iconMode?: boolean
}
}
if (!webgpu) {
if (!webgpu && !webgpuPython) {
if (['.st', '.safetensors'].some(ext => modelPath.endsWith(ext))) {
toast(t('Please change Strategy to WebGPU to use safetensors format'), { type: 'error' });
commonStore.setStatus({ status: ModelStatus.Offline });
@@ -176,7 +177,7 @@ export const RunButton: FC<{ onClickRun?: MouseEventHandler, iconMode?: boolean
const isUsingCudaBeta = modelConfig.modelParameters.device === 'CUDA-Beta';
startServer(commonStore.settings.customPythonPath, port, commonStore.settings.host !== '127.0.0.1' ? '0.0.0.0' : '127.0.0.1',
!!modelConfig.enableWebUI, isUsingCudaBeta, cpp
!!modelConfig.enableWebUI, isUsingCudaBeta, cpp, webgpuPython
).catch((e) => {
const errMsg = e.message || e;
if (errMsg.includes('path contains space'))
@@ -216,7 +217,7 @@ export const RunButton: FC<{ onClickRun?: MouseEventHandler, iconMode?: boolean
const strategy = getStrategy(modelConfig);
let customCudaFile = '';
if ((modelConfig.modelParameters.device.includes('CUDA') || modelConfig.modelParameters.device === 'Custom')
if ((modelConfig.modelParameters.device.startsWith('CUDA') || modelConfig.modelParameters.device === 'Custom')
&& modelConfig.modelParameters.useCustomCuda
&& !strategy.split('->').some(s => ['cuda', 'fp32'].every(v => s.includes(v)))) {
if (commonStore.platform === 'windows') {
@@ -264,7 +265,7 @@ export const RunButton: FC<{ onClickRun?: MouseEventHandler, iconMode?: boolean
navigate({ pathname: '/' + buttonName.toLowerCase() });
};
if ((modelConfig.modelParameters.device === 'CUDA' || modelConfig.modelParameters.device === 'CUDA-Beta') &&
if (modelConfig.modelParameters.device.startsWith('CUDA') &&
modelConfig.modelParameters.storedLayers < modelConfig.modelParameters.maxStoredLayers &&
commonStore.monitorData && commonStore.monitorData.totalVram !== 0 &&
(commonStore.monitorData.usedVram / commonStore.monitorData.totalVram) < 0.9)
@@ -279,7 +280,7 @@ export const RunButton: FC<{ onClickRun?: MouseEventHandler, iconMode?: boolean
'not enough memory': 'Memory is not enough, try to increase the virtual memory or use a smaller model.',
'not compiled with CUDA': 'Bad PyTorch version, please reinstall PyTorch with cuda.',
'invalid header or archive is corrupted': 'The model file is corrupted, please download again.',
'no NVIDIA driver': 'Found no NVIDIA driver, please install the latest driver.',
'no NVIDIA driver': 'Found no NVIDIA driver, please install the latest driver. If you are not using an Nvidia GPU, please switch the \'Strategy\' to WebGPU or CPU in the Configs page.',
'CUDA out of memory': 'VRAM is not enough, please reduce stored layers or use a lower precision in Configs page.',
'Ninja is required to load C++ extensions': 'Failed to enable custom CUDA kernel, ninja is required to load C++ extensions. You may be using the CPU version of PyTorch, please reinstall PyTorch with CUDA. Or if you are using a custom Python interpreter, you must compile the CUDA kernel by yourself or disable Custom CUDA kernel acceleration.'
};
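
The format gate that the RunButton hunks extend to `WebGPU (Python)` can be summarised as a pure function. This is a simplified sketch, not the component's code: `checkModelFormat` and its return values are hypothetical names, and the real handler additionally looks for an already converted `.st` file and handles the rwkv.cpp case.

```typescript
type Device = 'CPU' | 'CPU (rwkv.cpp)' | 'CUDA' | 'CUDA-Beta' | 'WebGPU' | 'WebGPU (Python)' | 'MPS' | 'Custom';
// Hypothetical result type for this sketch only.
type FormatCheck = 'ok' | 'needs-safetensors' | 'needs-webgpu-strategy';

const safetensorsExts = ['.st', '.safetensors'];

function checkModelFormat(device: Device, modelPath: string): FormatCheck {
  // Both WebGPU strategies are treated the same way after this change.
  const isWebGpu = device === 'WebGPU' || device === 'WebGPU (Python)';
  const isSafetensors = safetensorsExts.some(ext => modelPath.endsWith(ext));
  if (isWebGpu && !isSafetensors) return 'needs-safetensors';
  if (!isWebGpu && isSafetensors) return 'needs-webgpu-strategy';
  return 'ok';
}

console.log(checkModelFormat('WebGPU (Python)', 'model.pth'));
console.log(checkModelFormat('CUDA', 'model.st'));
console.log(checkModelFormat('WebGPU', 'model.safetensors'));
```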

View File

@@ -152,10 +152,14 @@ const CompositionPanel: FC = observer(() => {
if (autoPlay) {
if (commonStore.compositionParams.externalPlay)
externalPlayListener();
else
else {
if (commonStore.compositionParams.playOnlyGeneratedContent && playerRef.current) {
playerRef.current.currentTime = Math.max(commonStore.compositionParams.generationStartTime - 1, 0);
}
setTimeout(() => {
playerRef.current?.start();
});
}
}
});
});
@@ -314,6 +318,14 @@ const CompositionPanel: FC = observer(() => {
autoPlay: data.checked as boolean
});
}} />
<Checkbox className="select-none"
size="large" label={t('Only Auto Play Generated Content')} checked={params.playOnlyGeneratedContent}
onChange={async (_, data) => {
setParams({
autoPlay: data.checked as boolean || commonStore.compositionParams.autoPlay,
playOnlyGeneratedContent: data.checked as boolean
});
}} />
<Labeled flex breakline label={t('MIDI Input')}
desc={t('Select the MIDI input device to be used.')}
content={
@@ -359,6 +371,9 @@ const CompositionPanel: FC = observer(() => {
contentText={t('Are you sure you want to reset this page? It cannot be undone.')}
onConfirm={() => {
commonStore.setCompositionSubmittedPrompt(defaultCompositionPrompt);
setParams({
generationStartTime: 0
});
setPrompt(defaultCompositionPrompt);
}} />
<Button className="grow" appearance="primary" onClick={() => {
@@ -368,6 +383,9 @@ const CompositionPanel: FC = observer(() => {
generateNs(params.autoPlay);
} else {
commonStore.setCompositionGenerating(true);
setParams({
generationStartTime: playerRef.current ? playerRef.current.duration : 0
});
onSubmit(params.prompt);
}
}}>{!commonStore.compositionGenerating ? t('Generate') : t('Stop')}</Button>
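
The seek logic added for "Only Auto Play Generated Content" comes down to a small calculation. A sketch under the assumption that times are in seconds, as with the player's `currentTime` and `duration`; the helper `autoPlaySeekTarget` is a hypothetical name used only here.

```typescript
// Hypothetical helper: returns the position to seek to before auto play,
// or undefined when the player position should be left untouched.
function autoPlaySeekTarget(
  playOnlyGeneratedContent: boolean,
  generationStartTime: number
): number | undefined {
  if (!playOnlyGeneratedContent) return undefined;
  // Start one second before the generated part, but never before 0.
  return Math.max(generationStartTime - 1, 0);
}

console.log(autoPlaySeekTarget(true, 12.5));  // 11.5
console.log(autoPlaySeekTarget(true, 0.4));   // 0
console.log(autoPlaySeekTarget(false, 12.5)); // undefined
```

In the diff, `generationStartTime` is set to the player's duration when generation is submitted and reset to 0 when the page is reset, so the target falls just before the newly generated notes.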

View File

@@ -228,9 +228,18 @@ const Configs: FC = observer(() => {
<Select style={{ minWidth: 0 }} className="grow"
value={selectedConfig.modelParameters.modelName}
onChange={(e, data) => {
setSelectedConfigModelParams({
modelName: data.value
});
const modelSource = commonStore.modelSourceList.find(item => item.name === data.value);
if (modelSource?.customTokenizer)
setSelectedConfigModelParams({
modelName: data.value,
useCustomTokenizer: true,
customTokenizer: modelSource?.customTokenizer
});
else // prevent customTokenizer from being overwritten
setSelectedConfigModelParams({
modelName: data.value,
useCustomTokenizer: false
});
}}>
{!commonStore.modelSourceList.find(item => item.name === selectedConfig.modelParameters.modelName)?.isComplete
&& <option key={-1}
@@ -246,7 +255,7 @@ const Configs: FC = observer(() => {
</div>
} />
{
selectedConfig.modelParameters.device !== 'WebGPU' ?
!selectedConfig.modelParameters.device.startsWith('WebGPU') ?
(selectedConfig.modelParameters.device !== 'CPU (rwkv.cpp)' ?
<ToolTipButton text={t('Convert')}
desc={t('Convert model with these configs. Using a converted model will greatly improve the loading speed, but model parameters of the converted model cannot be modified.')}
@@ -256,7 +265,7 @@ const Configs: FC = observer(() => {
onClick={() => convertToGGML(selectedConfig, navigate)} />)
: <ToolTipButton text={t('Convert To Safe Tensors Format')}
desc=""
onClick={() => convertToSt(selectedConfig)} />
onClick={() => convertToSt(selectedConfig, navigate)} />
}
<Labeled label={t('Strategy')} content={
<Dropdown style={{ minWidth: 0 }} className="grow" value={t(selectedConfig.modelParameters.device)!}
@@ -274,6 +283,7 @@ const Configs: FC = observer(() => {
<Option value="CUDA">CUDA</Option>
<Option value="CUDA-Beta">{t('CUDA (Beta, Faster)')!}</Option>
<Option value="WebGPU">WebGPU</Option>
<Option value="WebGPU (Python)">WebGPU (Python)</Option>
<Option value="Custom">{t('Custom')!}</Option>
</Dropdown>
} />
@@ -281,7 +291,8 @@ const Configs: FC = observer(() => {
selectedConfig.modelParameters.device !== 'Custom' && <Labeled label={t('Precision')}
desc={t('int8 uses less VRAM, but has slightly lower quality. fp16 has higher quality.')}
content={
<Dropdown style={{ minWidth: 0 }} className="grow"
<Dropdown
style={{ minWidth: 0 }} className="grow"
value={selectedConfig.modelParameters.precision}
selectedOptions={[selectedConfig.modelParameters.precision]}
onOptionSelect={(_, data) => {
@@ -294,20 +305,20 @@ const Configs: FC = observer(() => {
{selectedConfig.modelParameters.device !== 'CPU' && selectedConfig.modelParameters.device !== 'MPS' &&
<Option>fp16</Option>}
{selectedConfig.modelParameters.device !== 'CPU (rwkv.cpp)' && <Option>int8</Option>}
{selectedConfig.modelParameters.device === 'WebGPU' && <Option>nf4</Option>}
{selectedConfig.modelParameters.device !== 'CPU (rwkv.cpp)' && selectedConfig.modelParameters.device !== 'WebGPU' &&
{selectedConfig.modelParameters.device.startsWith('WebGPU') && <Option>nf4</Option>}
{selectedConfig.modelParameters.device !== 'CPU (rwkv.cpp)' && !selectedConfig.modelParameters.device.startsWith('WebGPU') &&
<Option>fp32</Option>}
{selectedConfig.modelParameters.device === 'CPU (rwkv.cpp)' && <Option>Q5_1</Option>}
</Dropdown>
} />
}
{
selectedConfig.modelParameters.device.includes('CUDA') &&
selectedConfig.modelParameters.device.startsWith('CUDA') &&
<Labeled label={t('Current Strategy')}
content={<Text> {getStrategy(selectedConfig)} </Text>} />
}
{
selectedConfig.modelParameters.device.includes('CUDA') &&
selectedConfig.modelParameters.device.startsWith('CUDA') &&
<Labeled label={t('Stored Layers')}
desc={t('Number of the neural network layers loaded into VRAM, the more you load, the faster the speed, but it consumes more VRAM. (If your VRAM is not enough, it will fail to load)')}
content={
@@ -320,7 +331,7 @@ const Configs: FC = observer(() => {
}} />
} />
}
{selectedConfig.modelParameters.device.includes('CUDA') && <div />}
{selectedConfig.modelParameters.device.startsWith('CUDA') && <div />}
{
displayStrategyImg &&
<img style={{ width: '80vh', height: 'auto', zIndex: 100 }}
@@ -345,7 +356,7 @@ const Configs: FC = observer(() => {
}
{selectedConfig.modelParameters.device === 'Custom' && <div />}
{
(selectedConfig.modelParameters.device.includes('CUDA') || selectedConfig.modelParameters.device === 'Custom') &&
(selectedConfig.modelParameters.device.startsWith('CUDA') || selectedConfig.modelParameters.device === 'Custom') &&
<Labeled label={t('Use Custom CUDA kernel to Accelerate')}
desc={t('Enabling this option can greatly improve inference speed and save some VRAM, but there may be compatibility issues (output garbled). If it fails to start, please turn off this option, or try to upgrade your gpu driver.')}
content={
@@ -394,6 +405,7 @@ const Configs: FC = observer(() => {
</div>
}
/>
{mq && <div style={{ minHeight: '30px' }} />}
</div>
<div className="flex flex-row-reverse sm:fixed bottom-2 right-2">
<div className="flex gap-2">
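
The model selection handler in the first Configs hunk decides which parameters to update based on the manifest entry. A minimal sketch of that decision, assuming only the `name` and `customTokenizer` fields of `ModelSourceItem`; the function `selectionUpdate` and the `ParamsUpdate` type are hypothetical.

```typescript
type ModelSource = { name: string; customTokenizer?: string };
type ParamsUpdate =
  | { modelName: string; useCustomTokenizer: true; customTokenizer: string }
  | { modelName: string; useCustomTokenizer: false };

function selectionUpdate(modelName: string, sources: ModelSource[]): ParamsUpdate {
  const tokenizer = sources.find(item => item.name === modelName)?.customTokenizer;
  if (tokenizer)
    return { modelName, useCustomTokenizer: true, customTokenizer: tokenizer };
  // No customTokenizer key here, so a previously stored value is not overwritten.
  return { modelName, useCustomTokenizer: false };
}

const sources: ModelSource[] = [
  { name: 'midi-model.pth', customTokenizer: 'tokenizer-midi.json' }, // made-up entry
  { name: 'world-model.pth' },
];
console.log(selectionUpdate('midi-model.pth', sources));
console.log(selectionUpdate('world-model.pth', sources));
```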

View File

@@ -153,23 +153,32 @@ const columns: TableColumnDefinition<ModelSourceItem>[] = [
})
];
const getTags = () => {
return Array.from(new Set(
['Recommended',
...commonStore.modelSourceList.map(item => item.tags || []).flat()
.filter(i => !i.includes('Other') && !i.includes('Local'))
, 'Other', 'Local']));
};
const getCurrentModelList = () => {
if (commonStore.activeModelListTags.length === 0)
return commonStore.modelSourceList;
else
return commonStore.modelSourceList.filter(item => commonStore.activeModelListTags.some(tag => item.tags?.includes(tag)));
};
const Models: FC = observer(() => {
const { t } = useTranslation();
const [tags, setTags] = useState<Array<string>>([]);
const [modelSourceList, setModelSourceList] = useState<ModelSourceItem[]>(commonStore.modelSourceList);
const [tags, setTags] = useState<Array<string>>(getTags());
const [modelSourceList, setModelSourceList] = useState<ModelSourceItem[]>(getCurrentModelList());
useEffect(() => {
setTags(Array.from(new Set(
[...commonStore.modelSourceList.map(item => item.tags || []).flat()
.filter(i => !i.includes('Other') && !i.includes('Local'))
, 'Other', 'Local'])));
setTags(getTags());
}, [commonStore.modelSourceList]);
useEffect(() => {
if (commonStore.activeModelListTags.length === 0)
setModelSourceList(commonStore.modelSourceList);
else
setModelSourceList(commonStore.modelSourceList.filter(item => commonStore.activeModelListTags.some(tag => item.tags?.includes(tag))));
setModelSourceList(getCurrentModelList());
}, [commonStore.modelSourceList, commonStore.activeModelListTags]);
return (
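
The tag ordering produced by `getTags` relies on `Set` keeping insertion order. A standalone sketch, assuming the tags arrive as one array per model; `orderTags` is a hypothetical name, and `concat` stands in for the `.flat()` call used in the diff.

```typescript
function orderTags(modelTags: string[][]): string[] {
  // Flatten the per-model tag arrays into a single list.
  const all = ([] as string[]).concat(...modelTags);
  return Array.from(new Set([
    'Recommended',                                   // always first
    ...all.filter(t => !t.includes('Other') && !t.includes('Local')),
    'Other', 'Local',                                // always last
  ]));
}

const tags = orderTags([['Main', 'RWKV-5'], ['Other'], ['Recommended', 'RWKV-5']]);
console.log(tags); // [ 'Recommended', 'Main', 'RWKV-5', 'Other', 'Local' ]
```

Because duplicates are dropped on insertion, a model tagged `Recommended` does not move that tag out of the first position.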

View File

@@ -8,10 +8,10 @@ export const defaultPresets: CompletionPreset[] = [{
prompt: 'The following is an epic science fiction masterpiece that is immortalized, with delicate descriptions and grand depictions of interstellar civilization wars.\nChapter 1.\n',
params: {
maxResponseToken: 500,
temperature: 1.2,
topP: 0.5,
presencePenalty: 0.4,
frequencyPenalty: 0.4,
temperature: 1,
topP: 0.3,
presencePenalty: 0,
frequencyPenalty: 1,
stop: '\\n\\nUser',
injectStart: '',
injectEnd: ''

View File

@@ -96,7 +96,9 @@ class CommonStore {
useLocalSoundFont: false,
externalPlay: false,
midi: null,
ns: null
ns: null,
generationStartTime: 0,
playOnlyGeneratedContent: true
};
compositionGenerating: boolean = false;
compositionSubmittedPrompt: string = defaultCompositionPrompt;

View File

@@ -11,7 +11,9 @@ export type CompositionParams = {
useLocalSoundFont: boolean,
externalPlay: boolean,
midi: ArrayBuffer | null,
ns: NoteSequence | null
ns: NoteSequence | null,
generationStartTime: number,
playOnlyGeneratedContent: boolean,
}
export type Track = {
id: string;

View File

@@ -6,7 +6,7 @@ export type ApiParameters = {
presencePenalty: number;
frequencyPenalty: number;
}
export type Device = 'CPU' | 'CPU (rwkv.cpp)' | 'CUDA' | 'CUDA-Beta' | 'WebGPU' | 'MPS' | 'Custom';
export type Device = 'CPU' | 'CPU (rwkv.cpp)' | 'CUDA' | 'CUDA-Beta' | 'WebGPU' | 'WebGPU (Python)' | 'MPS' | 'Custom';
export type Precision = 'fp16' | 'int8' | 'fp32' | 'nf4' | 'Q5_1';
export type ModelParameters = {
// different models can not have the same name

View File

@@ -1,15 +1,17 @@
export type ModelSourceItem = {
name: string;
size: number;
lastUpdated: string;
desc?: { [lang: string]: string | undefined; };
size: number;
SHA256?: string;
lastUpdated: string;
url?: string;
downloadUrl?: string;
tags?: string[];
customTokenizer?: string;
hide?: boolean;
lastUpdatedMs?: number;
isComplete?: boolean;
isLocal?: boolean;
localSize?: number;
lastUpdatedMs?: number;
tags?: string[];
hide?: boolean;
};

View File

@@ -5,6 +5,7 @@ import {
ConvertGGML,
ConvertModel,
ConvertSafetensors,
ConvertSafetensorsWithPython,
FileExists,
GetPyError
} from '../../wailsjs/go/backend_golang/App';
@@ -51,12 +52,22 @@ export const convertModel = async (selectedConfig: ModelConfig, navigate: Naviga
};
export const convertToSt = async (selectedConfig: ModelConfig) => {
export const convertToSt = async (selectedConfig: ModelConfig, navigate: NavigateFunction) => {
const webgpuPython = selectedConfig.modelParameters.device === 'WebGPU (Python)';
if (webgpuPython) {
const ok = await checkDependencies(navigate);
if (!ok)
return;
}
const modelPath = `${commonStore.settings.customModelsPath}/${selectedConfig.modelParameters.modelName}`;
if (await FileExists(modelPath)) {
toast(t('Start Converting'), { autoClose: 2000, type: 'info' });
const newModelPath = modelPath.replace(/\.pth$/, '.st');
ConvertSafetensors(modelPath, newModelPath).then(async () => {
const convert = webgpuPython ?
(input: string, output: string) => ConvertSafetensorsWithPython(commonStore.settings.customPythonPath, input, output)
: ConvertSafetensors;
convert(modelPath, newModelPath).then(async () => {
if (!await FileExists(newModelPath)) {
if (commonStore.platform === 'windows' || commonStore.platform === 'linux')
toast(t('Convert Failed') + ' - ' + await GetPyError(), { type: 'error' });

View File

@@ -27,6 +27,7 @@ import logo from '../assets/images/logo.png';
import { Preset } from '../types/presets';
import { botName, Conversation, MessageType, userName } from '../types/chat';
import { v4 as uuid } from 'uuid';
import { findLastIndex } from 'lodash-es';
export type Cache = {
version: string
@@ -51,11 +52,11 @@ export async function refreshBuiltInModels(readCache: boolean = false) {
await ReadJson('cache.json').then((cacheData: Cache) => {
if (cacheData.models)
cache.models = cacheData.models;
else cache.models = manifest.models;
else cache.models = manifest.models.slice();
}).catch(() => {
cache.models = manifest.models;
cache.models = manifest.models.slice();
});
else cache.models = manifest.models;
else cache.models = manifest.models.slice();
commonStore.setModelSourceList(cache.models);
await saveCache().catch(() => {
@@ -90,7 +91,7 @@ export async function refreshLocalModels(cache: {
for (let i = 0; i < cache.models.length; i++) {
if (!cache.models[i].lastUpdatedMs)
cache.models[i].lastUpdatedMs = Date.parse(cache.models[i].lastUpdated);
if (!cache.models[i].tags)
if (!cache.models[i].tags || !Array.isArray(cache.models[i].tags) || cache.models[i].tags?.length === 0)
cache.models[i].tags = ['Other'];
for (let j = i + 1; j < cache.models.length; j++) {
@@ -145,7 +146,7 @@ function initLastUnfinishedModelDownloads() {
export async function refreshRemoteModels(cache: {
models: ModelSourceItem[]
}) {
}, filter: boolean = true, initUnfinishedModels: boolean = false) {
const manifestUrls = commonStore.modelSourceManifestList.split(/[,;\n]/);
const requests = manifestUrls.filter(url => url.endsWith('.json')).map(
url => fetch(url, { cache: 'no-cache' }).then(r => r.json()));
@@ -162,18 +163,16 @@ export async function refreshRemoteModels(cache: {
});
cache.models = cache.models.filter((model, index, self) => {
return modelSuffix.some((ext => model.name.endsWith(ext)))
&& index === self.findIndex(
m => m.name === model.name || (m.SHA256 && m.SHA256 === model.SHA256 && m.size === model.size));
});
commonStore.setModelSourceList(cache.models);
await saveCache().catch(() => {
&& index === findLastIndex(self,
m => m.name === model.name || (!!m.SHA256 && m.SHA256 === model.SHA256 && m.size === model.size));
});
await refreshLocalModels(cache, filter, initUnfinishedModels);
}
export const refreshModels = async (readCache: boolean = false, initUnfinishedModels: boolean = false) => {
const cache = await refreshBuiltInModels(readCache);
await refreshLocalModels(cache, false, initUnfinishedModels);
await refreshRemoteModels(cache);
await refreshRemoteModels(cache, false, initUnfinishedModels);
};
export const getStrategy = (modelConfig: ModelConfig | undefined = undefined) => {
@@ -192,6 +191,7 @@ export const getStrategy = (modelConfig: ModelConfig | undefined = undefined) =>
strategy += params.precision === 'int8' ? 'fp32i8' : 'fp32';
break;
case 'WebGPU':
case 'WebGPU (Python)':
strategy += params.precision === 'nf4' ? 'fp16i4' : params.precision === 'int8' ? 'fp16i8' : 'fp16';
break;
case 'CUDA':
@@ -202,6 +202,8 @@ export const getStrategy = (modelConfig: ModelConfig | undefined = undefined) =>
strategy += params.precision === 'int8' ? 'fp16i8' : params.precision === 'fp32' ? 'fp32' : 'fp16';
if (params.storedLayers < params.maxStoredLayers)
strategy += ` *${params.storedLayers}+`;
else
strategy += ` -> cuda fp16 *1`;
break;
case 'MPS':
if (avoidOverflow)
@@ -307,7 +309,7 @@ export function getServerRoot(defaultLocalPort: number, isCore: boolean = false)
const coreCustomApiUrl = commonStore.settings.coreApiUrl.trim().replace(/\/$/, '');
if (isCore && coreCustomApiUrl)
return coreCustomApiUrl;
const defaultRoot = `http://127.0.0.1:${defaultLocalPort}`;
if (commonStore.status.status !== ModelStatus.Offline)
return defaultRoot;
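
Switching the dedupe filter from `self.findIndex` to `findLastIndex` changes which duplicate survives: the last matching entry instead of the first. A self-contained sketch of that behaviour; the local `findLastIndex` is a stand-in for the lodash-es function so the example has no dependencies, and the `source` field and `dedupeKeepLast` name are hypothetical.

```typescript
type Item = { name: string; SHA256?: string; size: number; source: string };

// Local stand-in for lodash-es findLastIndex.
function findLastIndex<T>(arr: T[], pred: (v: T) => boolean): number {
  for (let i = arr.length - 1; i >= 0; i--)
    if (pred(arr[i])) return i;
  return -1;
}

function dedupeKeepLast(models: Item[]): Item[] {
  return models.filter((model, index, self) =>
    index === findLastIndex(self, m =>
      m.name === model.name ||
      (!!m.SHA256 && m.SHA256 === model.SHA256 && m.size === model.size)));
}

const merged = dedupeKeepLast([
  { name: 'a.pth', SHA256: 'x', size: 1, source: 'cache' },
  { name: 'b.pth', size: 2, source: 'cache' },
  { name: 'a.pth', SHA256: 'x', size: 1, source: 'remote' },
]);
console.log(merged.map(m => `${m.name}:${m.source}`)); // [ 'b.pth:cache', 'a.pth:remote' ]
```

If remote manifest entries are appended after cached ones, as the surrounding code suggests, keeping the last match lets the freshly fetched entry replace the cached one.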

View File

@@ -12,7 +12,7 @@ const vendor = [
'mobx', 'mobx-react-lite',
'i18next', 'react-i18next',
'usehooks-ts', 'react-toastify',
'classnames'
'classnames', 'lodash-es'
];
const embedded = [

frontend/wailsjs/go/backend_golang/App.d.ts generated vendored Executable file → Normal file
View File

@@ -16,6 +16,8 @@ export function ConvertModel(arg1:string,arg2:string,arg3:string,arg4:string):Pr
export function ConvertSafetensors(arg1:string,arg2:string):Promise<string>;
export function ConvertSafetensorsWithPython(arg1:string,arg2:string,arg3:string):Promise<string>;
export function CopyFile(arg1:string,arg2:string):Promise<void>;
export function DeleteFile(arg1:string):Promise<void>;
@@ -64,7 +66,7 @@ export function SaveJson(arg1:string,arg2:any):Promise<void>;
export function StartFile(arg1:string):Promise<void>;
export function StartServer(arg1:string,arg2:number,arg3:string,arg4:boolean,arg5:boolean,arg6:boolean):Promise<string>;
export function StartServer(arg1:string,arg2:number,arg3:string,arg4:boolean,arg5:boolean,arg6:boolean,arg7:boolean):Promise<string>;
export function StartWebGPUServer(arg1:number,arg2:string):Promise<string>;

frontend/wailsjs/go/backend_golang/App.js generated Executable file → Normal file
View File

@@ -30,6 +30,10 @@ export function ConvertSafetensors(arg1, arg2) {
return window['go']['backend_golang']['App']['ConvertSafetensors'](arg1, arg2);
}
export function ConvertSafetensorsWithPython(arg1, arg2, arg3) {
return window['go']['backend_golang']['App']['ConvertSafetensorsWithPython'](arg1, arg2, arg3);
}
export function CopyFile(arg1, arg2) {
return window['go']['backend_golang']['App']['CopyFile'](arg1, arg2);
}
@@ -126,8 +130,8 @@ export function StartFile(arg1) {
return window['go']['backend_golang']['App']['StartFile'](arg1);
}
export function StartServer(arg1, arg2, arg3, arg4, arg5, arg6) {
return window['go']['backend_golang']['App']['StartServer'](arg1, arg2, arg3, arg4, arg5, arg6);
export function StartServer(arg1, arg2, arg3, arg4, arg5, arg6, arg7) {
return window['go']['backend_golang']['App']['StartServer'](arg1, arg2, arg3, arg4, arg5, arg6, arg7);
}
export function StartWebGPUServer(arg1, arg2) {

frontend/wailsjs/go/models.ts generated Executable file → Normal file
View File

@@ -109,7 +109,7 @@ func main() {
err = wails.Run(&options.App{
Title: "RWKV-Runner",
Width: 1024,
Height: 680,
Height: 700,
MinWidth: 375,
MinHeight: 640,
EnableDefaultContextMenu: true,

View File

@@ -1,12 +1,12 @@
{
"version": "1.6.1",
"version": "1.6.5",
"introduction": {
"en": "RWKV is an open-source, commercially usable large language model with high flexibility and great potential for development.\n### About This Tool\nThis tool aims to lower the barrier of entry for using large language models, making it accessible to everyone. It provides fully automated dependency and model management. You simply need to click and run, following the instructions, to deploy a local large language model. The tool itself is very compact and only requires a single executable file for one-click deployment.\nAdditionally, this tool offers an interface that is fully compatible with the OpenAI API. This means you can use any ChatGPT client as a client for RWKV, enabling capability expansion beyond just chat functionality.\n### Preset Configuration Rules at the Bottom\nThis tool comes with a series of preset configurations to reduce complexity. The naming rules for each configuration represent the following in order: device - required VRAM/memory - model size - model language.\nFor example, \"GPU-8G-3B-EN\" indicates that this configuration is for a graphics card with 8GB of VRAM, a model size of 3 billion parameters, and it uses an English language model.\nLarger model sizes have higher performance and VRAM requirements. Among configurations with the same model size, those with higher VRAM usage will have faster runtime.\nFor example, if you have 12GB of VRAM but running the \"GPU-12G-7B-EN\" configuration is slow, you can downgrade to \"GPU-8G-3B-EN\" for a significant speed improvement.\n### About RWKV\nRWKV is an RNN with Transformer-level LLM performance, which can also be directly trained like a GPT transformer (parallelizable). And it's 100% attention-free. You only need the hidden state at position t to compute the state at position t+1. You can use the \"GPT\" mode to quickly compute the hidden state for the \"RNN\" mode.<br/>So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, \"infinite\" ctx_len, and free sentence embedding (using the final hidden state).",
"zh": "RWKV是一个开源且允许商用的大语言模型灵活性很高且极具发展潜力。\n### 关于本工具\n本工具旨在降低大语言模型的使用门槛做到人人可用本工具提供了全自动化的依赖和模型管理你只需要直接点击运行跟随引导即可完成本地大语言模型的部署工具本身体积极小只需要一个exe即可完成一键部署。\n此外本工具提供了与OpenAI API完全兼容的接口这意味着你可以把任意ChatGPT客户端用作RWKV的客户端实现能力拓展而不局限于聊天。\n### 底部的预设配置规则\n本工具内置了一系列预设配置以降低使用难度每个配置名的规则依次代表着设备-所需显存/内存-模型规模-模型语言。\n例如GPU-8G-3B-CN表示该配置用于显卡需要8G显存模型规模为30亿参数使用的是中文模型。\n模型规模越大性能要求越高显存要求也越高而同样模型规模的配置中显存占用越高的运行速度越快。\n例如当你有12G显存但运行GPU-12G-7B-CN配置速度比较慢可降级成GPU-8G-3B-CN将会大幅提速。\n### 关于RWKV\nRWKV是具有Transformer级别LLM性能的RNN也可以像GPT Transformer一样直接进行训练可并行化。而且它是100% attention-free的。你只需在位置t处获得隐藏状态即可计算位置t + 1处的状态。你可以使用“GPT”模式快速计算用于“RNN”模式的隐藏状态。\n因此它将RNN和Transformer的优点结合起来 - 高性能、快速推理、节省显存、快速训练、“无限”上下文长度以及免费的语句嵌入(使用最终隐藏状态)。"
},
"about": {
"en": "<div align=\"center\">\n\nProject Source Code:\nhttps://github.com/josStorer/RWKV-Runner\nAuthor: [@josStorer](https://github.com/josStorer)\nFAQs: https://github.com/josStorer/RWKV-Runner/wiki/FAQs\n\nRelated Repositories:\nRWKV-4-World: https://huggingface.co/BlinkDL/rwkv-4-world/tree/main\nRWKV-4-Raven: https://huggingface.co/BlinkDL/rwkv-4-raven/tree/main\nChatRWKV: https://github.com/BlinkDL/ChatRWKV\nRWKV-LM: https://github.com/BlinkDL/RWKV-LM\nRWKV-LM-LoRA: https://github.com/Blealtan/RWKV-LM-LoRA\nMIDI-LLM-tokenizer: https://github.com/briansemrau/MIDI-LLM-tokenizer\n\n</div>",
"zh": "<div align=\"center\">\n\n本项目源码:\nhttps://github.com/josStorer/RWKV-Runner\n作者: [@josStorer](https://github.com/josStorer)\n演示与常见问题说明视频: https://www.bilibili.com/video/BV1hM4y1v76R\n疑难解答: https://www.bilibili.com/read/cv23921171\n\n相关仓库:\nRWKV-4-World: https://huggingface.co/BlinkDL/rwkv-4-world/tree/main\nRWKV-4-Raven: https://huggingface.co/BlinkDL/rwkv-4-raven/tree/main\nChatRWKV: https://github.com/BlinkDL/ChatRWKV\nRWKV-LM: https://github.com/BlinkDL/RWKV-LM\nRWKV-LM-LoRA: https://github.com/Blealtan/RWKV-LM-LoRA\nMIDI-LLM-tokenizer: https://github.com/briansemrau/MIDI-LLM-tokenizer\n\n</div>"
"en": "<div align=\"center\">\n\nProject Source Code and Introduction:\nhttps://github.com/josStorer/RWKV-Runner\nAuthor: [@josStorer](https://github.com/josStorer)\n\nRelated Repositories:\nRWKV-5-World: https://huggingface.co/BlinkDL/rwkv-5-world/tree/main\nRWKV-4-World: https://huggingface.co/BlinkDL/rwkv-4-world/tree/main\nRWKV-4-Raven: https://huggingface.co/BlinkDL/rwkv-4-raven/tree/main\nChatRWKV: https://github.com/BlinkDL/ChatRWKV\nRWKV-LM: https://github.com/BlinkDL/RWKV-LM\nRWKV-LM-LoRA: https://github.com/Blealtan/RWKV-LM-LoRA\nMIDI-LLM-tokenizer: https://github.com/briansemrau/MIDI-LLM-tokenizer\nai00_rwkv_server: https://github.com/cgisky1980/ai00_rwkv_server\nrwkv.cpp: https://github.com/saharNooby/rwkv.cpp\nweb-rwkv-py: https://github.com/cryscan/web-rwkv-py\n\n</div>",
"zh": "<div align=\"center\">\n\n本项目源码及介绍页:\nhttps://github.com/josStorer/RWKV-Runner\n作者: [@josStorer](https://github.com/josStorer)\n演示与常见问题说明视频: https://www.bilibili.com/video/BV1hM4y1v76R\n\n相关仓库:\nRWKV-5-World: https://huggingface.co/BlinkDL/rwkv-5-world/tree/main\nRWKV-4-World: https://huggingface.co/BlinkDL/rwkv-4-world/tree/main\nRWKV-4-Raven: https://huggingface.co/BlinkDL/rwkv-4-raven/tree/main\nChatRWKV: https://github.com/BlinkDL/ChatRWKV\nRWKV-LM: https://github.com/BlinkDL/RWKV-LM\nRWKV-LM-LoRA: https://github.com/Blealtan/RWKV-LM-LoRA\nMIDI-LLM-tokenizer: https://github.com/briansemrau/MIDI-LLM-tokenizer\nai00_rwkv_server: https://github.com/cgisky1980/ai00_rwkv_server\nrwkv.cpp: https://github.com/saharNooby/rwkv.cpp\nweb-rwkv-py: https://github.com/cryscan/web-rwkv-py\n\n</div>"
},
"programFiles": [
{
@@ -25,8 +25,8 @@
"size": 385598386,
"SHA256": "c844a3ee05bcb9065848cb05b10c48a3f381f5ac1953aad89e156ecdf31d7703",
"lastUpdated": "2023-08-03T15:18:46",
"url": "https://huggingface.co/BlinkDL/rwkv-5-world/blob/main/RWKV-5-World-0.1B-v1-20230803-ctx4096.pth?download=true",
"downloadUrl": "https://huggingface.co/BlinkDL/rwkv-5-world/resolve/main/RWKV-5-World-0.1B-v1-20230803-ctx4096.pth?download=true",
"url": "https://huggingface.co/BlinkDL/rwkv-5-world/blob/main/RWKV-5-World-0.1B-v1-20230803-ctx4096.pth",
"downloadUrl": "https://huggingface.co/BlinkDL/rwkv-5-world/resolve/main/RWKV-5-World-0.1B-v1-20230803-ctx4096.pth",
"tags": [
"Main",
"RWKV-5",
@@ -43,8 +43,8 @@
"size": 923523954,
"SHA256": "5a288c54c7f30b0e2d4af23991133fad2af2d5e59ec7ad850ffe78054a5e4f92",
"lastUpdated": "2023-11-14T01:23:49",
"url": "https://huggingface.co/BlinkDL/rwkv-5-world/blob/main/RWKV-5-World-0.4B-v2-20231113-ctx4096.pth?download=true",
"downloadUrl": "https://huggingface.co/BlinkDL/rwkv-5-world/resolve/main/RWKV-5-World-0.4B-v2-20231113-ctx4096.pth?download=true",
"url": "https://huggingface.co/BlinkDL/rwkv-5-world/blob/main/RWKV-5-World-0.4B-v2-20231113-ctx4096.pth",
"downloadUrl": "https://huggingface.co/BlinkDL/rwkv-5-world/resolve/main/RWKV-5-World-0.4B-v2-20231113-ctx4096.pth",
"tags": [
"Main",
"RWKV-5",
@@ -69,6 +69,45 @@
"Global"
]
},
{
"name": "RWKV-5-1B5-one-state-slim.pth",
"desc": {
"en": "RWKV-5 Global Languages 1.5B v2 Ctx16k Role Play",
"zh": "RWKV-5 全球语言 1.5B v2 16k上下文 角色扮演",
"ja": "RWKV-5 グローバル言語 1.5B v2 16kコンテキスト ロールプレイ"
},
"size": 3155589871,
"SHA256": "43e7b922d7ad49eafa17f8909c2813c91394925bc7f24caf0e19a91aa3281273",
"lastUpdated": "2023-11-02T04:03:27",
"url": "https://huggingface.co/xiaol/RWKV-v5-world-v2-1.5B-one-state-slim-16k/blob/main/RWKV-5-1B5-one-state-slim.pth",
"downloadUrl": "https://huggingface.co/xiaol/RWKV-v5-world-v2-1.5B-one-state-slim-16k/resolve/main/RWKV-5-1B5-one-state-slim.pth",
"tags": [
"Finetuned",
"RWKV-5",
"Global",
"Role Play"
],
"customTokenizer": "backend-python/rwkv_pip/rwkv_vocab_v20230424_special_token.txt"
},
{
"name": "RWKV-5-1B5-one-state-slim-novel-tuned.pth",
"desc": {
"en": "RWKV-5 Global Languages 1.5B v2 Ctx16k Novel",
"zh": "RWKV-5 全球语言 1.5B v2 16k上下文 小说",
"ja": "RWKV-5 グローバル言語 1.5B v2 16kコンテキスト 小説"
},
"size": 3155589871,
"SHA256": "4f0aaecdce676e5236018ebd63e3d37c2f300fbac04001ee3a9c00d2f4244d0f",
"lastUpdated": "2023-11-03T02:45:52",
"url": "https://huggingface.co/xiaol/RWKV-v5-world-v2-1.5B-one-state-slim-16k-novel-tuned/blob/main/RWKV-5-1B5-one-state-slim-novel-tuned.pth",
"downloadUrl": "https://huggingface.co/xiaol/RWKV-v5-world-v2-1.5B-one-state-slim-16k-novel-tuned/resolve/main/RWKV-5-1B5-one-state-slim-novel-tuned.pth",
"tags": [
"Finetuned",
"RWKV-5",
"Global"
],
"customTokenizer": "backend-python/rwkv_pip/rwkv_vocab_v20230424_special_token.txt"
},
{
"name": "RWKV-5-World-3B-v2-20231113-ctx4096.pth",
"desc": {
@@ -79,12 +118,13 @@
"size": 6126106674,
"SHA256": "a4bd430343c6fd138b85bbc68bb20262d3a2f053ea57dc4b41078269af68ff9c",
"lastUpdated": "2023-11-14T01:23:49",
"url": "https://huggingface.co/BlinkDL/rwkv-5-world/blob/main/RWKV-5-World-3B-v2-20231113-ctx4096.pth?download=true",
"downloadUrl": "https://huggingface.co/BlinkDL/rwkv-5-world/resolve/main/RWKV-5-World-3B-v2-20231113-ctx4096.pth?download=true",
"url": "https://huggingface.co/BlinkDL/rwkv-5-world/blob/main/RWKV-5-World-3B-v2-20231113-ctx4096.pth",
"downloadUrl": "https://huggingface.co/BlinkDL/rwkv-5-world/resolve/main/RWKV-5-World-3B-v2-20231113-ctx4096.pth",
"tags": [
"Main",
"RWKV-5",
"Global"
"Global",
"Recommended"
]
},
{
@@ -97,14 +137,94 @@
"size": 6126106467,
"SHA256": "efa5178d1c824b94ef17c6c9a456674e5581a8be832becbda9aba4dc533f88c2",
"lastUpdated": "2023-11-19T04:21:04",
"url": "https://huggingface.co/BlinkDL/rwkv-5-world/blob/main/RWKV-5-World-3B-v2-20231118-ctx16k.pth?download=true",
"downloadUrl": "https://huggingface.co/BlinkDL/rwkv-5-world/resolve/main/RWKV-5-World-3B-v2-20231118-ctx16k.pth?download=true",
"url": "https://huggingface.co/BlinkDL/rwkv-5-world/blob/main/RWKV-5-World-3B-v2-20231118-ctx16k.pth",
"downloadUrl": "https://huggingface.co/BlinkDL/rwkv-5-world/resolve/main/RWKV-5-World-3B-v2-20231118-ctx16k.pth",
"tags": [
"Main",
"RWKV-5",
"Global"
"Global",
"Recommended"
]
},
{
"name": "rwkv-v5-7B-0.4-long-ctx-16k.pth",
"desc": {
"en": "RWKV-5 Global Languages 7B v2 40% Ctx300k Document Reader",
"zh": "RWKV-5 全球语言 7B v2 40% 300k上下文 文档阅读",
"ja": "RWKV-5 グローバル言語 7B v2 40% 300kコンテキスト ドキュメントリーダー"
},
"size": 15036198115,
"SHA256": "5888471a45caab903c1bd9c35af1c639ac8d03be6ee6eb39fa9fd3194fa6d437",
"lastUpdated": "2023-11-10T17:12:04",
"url": "https://huggingface.co/xiaol/RWKV-5-world-v2-7B-0.4-300k/blob/main/rwkv-v5-7B-0.4-long-ctx-16k.pth",
"downloadUrl": "https://huggingface.co/xiaol/RWKV-5-world-v2-7B-0.4-300k/resolve/main/rwkv-v5-7B-0.4-long-ctx-16k.pth",
"tags": [
"Finetuned",
"RWKV-5",
"Global"
],
"customTokenizer": "backend-python/rwkv_pip/rwkv_vocab_v20230424_special_token.txt"
},
{
"name": "rwkv-v5.2-7B-horror-16k.pth",
"desc": {
"en": "RWKV-5 Global Languages 7B v2 40% Ctx16k Horror",
"zh": "RWKV-5 全球语言 7B v2 40% 16k上下文 恐怖",
"ja": "RWKV-5 グローバル言語 7B v2 40% 16kコンテキスト ホラー"
},
"size": 15036198115,
"SHA256": "3b36ce99bef06627dcb5d860972e2c1515327afe7db415b8c82dd5c3b926b52f",
"lastUpdated": "2023-11-13T15:21:25",
"url": "https://huggingface.co/xiaol/RWKV-v5.2-7B-horror-16k/blob/main/rwkv-v5.2-7B-horror-16k.pth",
"downloadUrl": "https://huggingface.co/xiaol/RWKV-v5.2-7B-horror-16k/resolve/main/rwkv-v5.2-7B-horror-16k.pth",
"tags": [
"Finetuned",
"RWKV-5",
"Global"
],
"customTokenizer": "backend-python/rwkv_pip/rwkv_vocab_v20230424_special_token.txt"
},
{
"name": "rwkv_v5.2_7B_role_play_16k.pth",
"desc": {
"en": "RWKV-5 Global Languages 7B v2 Ctx16k Claude Like",
"zh": "RWKV-5 全球语言 7B v2 16k上下文 Claude功能",
"ja": "RWKV-5 グローバル言語 7B v2 16kコンテキスト Claude機能"
},
"size": 15036198115,
"SHA256": "6fe8a7bf06b9f5e5b740cd87e24bff91325518ad19bf92bf5c75799b3c24b150",
"lastUpdated": "2023-11-14T04:18:16",
"url": "https://huggingface.co/xiaol/RWKV-v5.2-7B-Role-play-16k/blob/main/rwkv_v5.2_7B_role_play_16k.pth",
"downloadUrl": "https://huggingface.co/xiaol/RWKV-v5.2-7B-Role-play-16k/resolve/main/rwkv_v5.2_7B_role_play_16k.pth",
"tags": [
"Finetuned",
"RWKV-5",
"Global",
"Role Play",
"Recommended"
],
"customTokenizer": "backend-python/rwkv_pip/rwkv_vocab_v20230424_special_token.txt"
},
{
"name": "RWKV-5-12B-one-state-chat-16k.pth",
"desc": {
"en": "RWKV-5 Global Languages 12B Ctx16k",
"zh": "RWKV-5 全球语言 12B 16k上下文",
"ja": "RWKV-5 グローバル言語 12B 16kコンテキスト"
},
"size": 23157296483,
"SHA256": "330be74738d3936f4c9bd6caf838db11c96f52ff360d0f4fa5401d9bafc898ab",
"lastUpdated": "2023-12-16T16:34:30",
"url": "https://huggingface.co/xiaol/RWKV-v5-12B-one-state-chat-16k/blob/main/RWKV-5-12B-one-state-chat-16k.pth",
"downloadUrl": "https://huggingface.co/xiaol/RWKV-v5-12B-one-state-chat-16k/resolve/main/RWKV-5-12B-one-state-chat-16k.pth",
"tags": [
"Finetuned",
"RWKV-5",
"Global",
"Recommended"
],
"customTokenizer": "backend-python/rwkv_pip/rwkv_vocab_v20230424_special_token.txt"
},
{
"name": "RWKV-4-World-CHNtuned-0.1B-v1-20230617-ctx4096.pth",
"desc": {
@@ -257,6 +377,25 @@
"Global"
]
},
{
"name": "RWKV-for-mobile-4-world-1.5B-20230906-ctx16k.pth",
"desc": {
"en": "Global Languages 1.5B v1 Ctx16k Claude Like",
"zh": "全球语言 1.5B v1 16k上下文 Claude功能",
"ja": "グローバル言語 1.5B v1 16kコンテキスト Claude機能"
},
"size": 3155280301,
"SHA256": "20547a6deca32add57c45d2f6cff52c6b59cd3b92676ee369b964affba35619d",
"lastUpdated": "2023-09-07T01:35:46",
"url": "https://huggingface.co/xiaol/RWKV-claude-for-mobile-v4-world-1.5B-16k/blob/main/RWKV-for-mobile-4-world-1.5B-20230906-ctx16k.pth",
"downloadUrl": "https://huggingface.co/xiaol/RWKV-claude-for-mobile-v4-world-1.5B-16k/resolve/main/RWKV-for-mobile-4-world-1.5B-20230906-ctx16k.pth",
"tags": [
"Finetuned",
"RWKV-4",
"Global"
],
"customTokenizer": "backend-python/rwkv_pip/rwkv_vocab_v20230424_special_token.txt"
},
{
"name": "RWKV-4-World-3B-v1-OnlyForTest_35%_trained-20230529-ctx4096.pth",
"desc": {
@@ -513,6 +652,83 @@
"Global"
]
},
{
"name": "RWKV-7B-world-one.pth",
"desc": {
"en": "Global Languages 7B v1 Ctx65k Novel",
"zh": "全球语言 7B v1 65k上下文 小说",
"ja": "グローバル言語 7B v1 65kコンテキスト 小説"
},
"size": 15035391533,
"SHA256": "7ce95a4b460c3385c75c29b6ebe3cd7db438b1107e85d7d3e42dff85cfaa0b78",
"lastUpdated": "2023-10-09T05:23:38",
"url": "https://huggingface.co/xiaol/RWKV-v4-world-7B-one-state-65k/blob/main/RWKV-7B-world-one.pth",
"downloadUrl": "https://huggingface.co/xiaol/RWKV-v4-world-7B-one-state-65k/resolve/main/RWKV-7B-world-one.pth",
"tags": [
"Finetuned",
"RWKV-4",
"Global"
],
"customTokenizer": "backend-python/rwkv_pip/rwkv_vocab_v20230424_special_token.txt"
},
{
"name": "rwkv-world-one-novel-cot-ultrachat-novel-instructions.pth",
"desc": {
"en": "Global Languages 7B v1 Ctx65k Novel Instruction",
"zh": "全球语言 7B v1 65k上下文 小说指令",
"ja": "グローバル言語 7B v1 65kコンテキスト 小説指示"
},
"size": 15035391533,
"SHA256": "fc2d4643828bb9dfe0733c3b2eb54ba2d996ed3eb6afa051b558da2eb2c1e309",
"lastUpdated": "2023-10-22T09:50:39",
"url": "https://huggingface.co/xiaol/RWKV-4-world-one-state-ultrachat-COT-65k/blob/main/rwkv-world-one-novel-cot-ultrachat-novel-instructions.pth",
"downloadUrl": "https://huggingface.co/xiaol/RWKV-4-world-one-state-ultrachat-COT-65k/resolve/main/rwkv-world-one-novel-cot-ultrachat-novel-instructions.pth",
"tags": [
"Finetuned",
"RWKV-4",
"Global"
],
"customTokenizer": "backend-python/rwkv_pip/rwkv_vocab_v20230424_special_token.txt"
},
{
"name": "RWKV-world-novel-one-state-ultrachat-cot-tuned-Role-play-65k.pth",
"desc": {
"en": "Global Languages 7B v1 Ctx65k Role Play",
"zh": "全球语言 7B v1 65k上下文 角色扮演",
"ja": "グローバル言語 7B v1 65kコンテキスト ロールプレイ"
},
"size": 15035391533,
"SHA256": "2f55b4710dcd360e83b4df9a6358661284d9a6c6108f62c5a30b86df181ed67a",
"lastUpdated": "2023-10-22T05:54:27",
"url": "https://huggingface.co/xiaol/RWKV-4-world-one-state-ultrachat-COT-65k/blob/main/RWKV-world-novel-one-state-ultrachat-cot-tuned-Role-play-65k.pth",
"downloadUrl": "https://huggingface.co/xiaol/RWKV-4-world-one-state-ultrachat-COT-65k/resolve/main/RWKV-world-novel-one-state-ultrachat-cot-tuned-Role-play-65k.pth",
"tags": [
"Finetuned",
"RWKV-4",
"Global",
"Role Play"
],
"customTokenizer": "backend-python/rwkv_pip/rwkv_vocab_v20230424_special_token.txt"
},
{
"name": "RWKV-4-7B-world-one-novel-tuned-65k.pth",
"desc": {
"en": "Global Languages 7B v1 Ctx65k Chinese Novel Instruction",
"zh": "全球语言 7B v1 65k上下文 中文小说指令",
"ja": "グローバル言語 7B v1 65kコンテキスト 中国語小説指示"
},
"size": 15035391533,
"SHA256": "e8ff256d74ca404621dcbf87c43c37e25ea745fed30c404fbf45cc5acc7ba2b5",
"lastUpdated": "2023-10-15T00:57:53",
"url": "https://huggingface.co/xiaol/RWKV-4-world-one-state-novel-tuned-65k/blob/main/RWKV-4-7B-world-one-novel-tuned-65k.pth",
"downloadUrl": "https://huggingface.co/xiaol/RWKV-4-world-one-state-novel-tuned-65k/resolve/main/RWKV-4-7B-world-one-novel-tuned-65k.pth",
"tags": [
"Finetuned",
"RWKV-4",
"CN"
],
"customTokenizer": "backend-python/rwkv_pip/rwkv_vocab_v20230424_special_token.txt"
},
{
"name": "RWKV-4-World-CHNtuned-7B-v1-20230709-ctx4096.pth",
"desc": {