Compare commits

76 Commits

Author SHA1 Message Date
josc146
c0aa6aaba9 release v1.4.7 2023-09-18 23:03:54 +08:00
josc146
d7abe5f0d1 add pre-compiled beta cuda kernel (rwkv-beta==0.8.5, 40%+ faster for fp16) (thanks to #180, pre-compiled kernel of RTX 40 Series will be included later) 2023-09-18 23:02:49 +08:00
josc146
5e5e1e9651 custom tokenizer .txt support 2023-09-18 17:20:55 +08:00
github-actions[bot]
f8388a0527 release v1.4.6 2023-09-16 05:06:08 +00:00
josc146
f8b764ef8f release v1.4.6 2023-09-16 13:05:34 +08:00
josc146
fcfaa5944e frontend feature adaptation for api params (user_name, assistant_name, presystem) 2023-09-16 13:02:06 +08:00
josc146
f89e89c1c9 chore 2023-09-16 12:23:16 +08:00
josc146
a25965530c custom tokenizer (#77) 2023-09-16 00:34:11 +08:00
josc146
971124d0d7 upgrade to wails@v2.6.0 (EnableDefaultContextMenu: true) 2023-09-16 00:29:45 +08:00
josc146
d7dcc90008 chore 2023-09-15 16:31:14 +08:00
josc146
df969fcfc6 upgrade cuda-beta 2023-09-15 16:30:11 +08:00
josc146
c4042bbfd8 improve ui desc 2023-09-15 16:26:32 +08:00
josc146
4112200b4c revert(2d5456): refresh local models when download complete (for macOS) 2023-09-15 16:25:04 +08:00
Ikko Eltociear Ashimine
3f9a54e36f Update README_JA.md (add translation.) 2023-09-13 16:11:43 +08:00
github-actions[bot]
3ed4456135 release v1.4.5 2023-08-27 15:57:18 +00:00
josc146
e0df9ae47b release v1.4.5 2023-08-27 23:56:37 +08:00
josc146
87b2c3ed7d fix build 2023-08-27 23:56:30 +08:00
josc146
50ff7ef6bc always use requirements.txt 2023-08-27 23:52:52 +08:00
josc146
c7a580ca8a update manifest 2023-08-27 23:16:56 +08:00
josc146
eaae7624a7 add HardwareMonitor (Windows Only) 2023-08-27 22:53:18 +08:00
josc146
fcd59de6fb correct Preset UI description 2023-08-27 21:37:32 +08:00
josc146
1bbe127209 fix webgpu_server file permissions of linux and macos 2023-08-27 21:22:26 +08:00
josc146
b868adc058 chore 2023-08-27 21:21:34 +08:00
josc146
a24b78e8c3 python-backend: extra ChatCompletionBody params (raw, presystem); add default_stop when stop is null 2023-08-27 21:21:11 +08:00
josc146
c8025f1cff allow message content to be empty 2023-08-27 21:02:54 +08:00
josc146
fe0860dbf0 fix lora finetune max_epochs (#170) 2023-08-24 22:49:57 +08:00
josc146
02d5d641d1 chore 2023-08-24 22:48:54 +08:00
github-actions[bot]
a057bb6c5b release v1.4.4 2023-08-16 15:33:53 +00:00
josc146
c9e4ae7fa1 release v1.4.4 2023-08-16 23:33:22 +08:00
josc146
79a97b2bc4 webgpu release support 2023-08-16 23:31:04 +08:00
josc146
ef53951a16 webgpu support 2023-08-16 23:07:58 +08:00
josc146
74f1a1c033 chore 2023-08-16 21:11:58 +08:00
josc146
ce986cfc6d chore 2023-08-16 12:50:22 +08:00
josc146
61cea2a784 add misc API (/models and /dashboard/billing/credit_grants) 2023-08-14 23:37:55 +08:00
josc146
8a13bd3c1e add rwkv-cuda-beta support (faster) 2023-08-14 22:07:15 +08:00
josc146
da68926e9c chore (AddStateBody class) 2023-08-13 21:27:29 +08:00
josc146
e0b7453883 allow multiple systems 2023-08-04 22:27:55 +08:00
josc146
91e2828a95 allow completions input to be null 2023-08-04 22:22:59 +08:00
github-actions[bot]
bcf6409536 release v1.4.3 2023-07-31 14:51:01 +00:00
josc146
d7d4f87620 release v1.4.3 2023-07-31 22:50:29 +08:00
josc146
b3e35a4cdd allow custom user_name and assistant_name (/chat/completions API) 2023-07-31 22:48:54 +08:00
josc146
8764c37b03 RWKVType 2023-07-31 22:46:13 +08:00
josc146
d12a173f39 global penalty 2023-07-31 22:02:28 +08:00
josc146
64fa939c19 japanese UI chore 2023-07-29 21:44:33 +08:00
josc146
9c8e7b2f08 japanese UI 2023-07-29 21:19:45 +08:00
josc146
abfd668523 update defaultConfigs 2023-07-29 19:41:54 +08:00
github-actions[bot]
ebacf383f5 release v1.4.2 2023-07-29 11:34:18 +00:00
josc146
eb25dc6bcb release v1.4.2 2023-07-29 19:33:52 +08:00
josc146
aecacde819 remove response field of completions api 2023-07-29 19:20:43 +08:00
josc146
3ef22239eb improve default ChatCompletion stop 2023-07-29 19:19:38 +08:00
josc146
719090cc8c improve python backend startup speed 2023-07-29 19:18:01 +08:00
josc146
dbb8374d89 update defaultConfigs 2023-07-29 19:16:44 +08:00
github-actions[bot]
4d875a8c00 release v1.4.1 2023-07-28 14:16:37 +00:00
josc146
30b6d66a2d release v1.4.1 2023-07-28 22:14:53 +08:00
josc146
9d89b6f4db fix params 2023-07-28 22:13:19 +08:00
josc146
d2928e54f7 fix failed to build cyac 2023-07-28 21:40:17 +08:00
josc146
49ba5c97f7 update readme 2023-07-28 13:13:14 +08:00
github-actions[bot]
4054fac359 release v1.4.0 2023-07-28 05:06:42 +00:00
josc146
dfae1d9645 release v1.4.0 2023-07-28 13:05:55 +08:00
josc146
0f16a0dd1b remove LoraFinetunePrecision fp32 2023-07-28 12:53:41 +08:00
josc146
cb05a8a2ae update manifest 2023-07-28 12:50:39 +08:00
josc146
a51385173c add CPU-120M-Music config 2023-07-28 12:45:31 +08:00
josc146
4e18222a35 improve RunButton prompt 2023-07-28 12:45:13 +08:00
josc146
daabcf58a0 add Composition Page (RWKV-Music) 2023-07-28 12:30:05 +08:00
josc146
d0fd480bd6 chore 2023-07-26 22:24:26 +08:00
josc146
1df345b5eb improve embeddings API results 2023-07-25 20:30:43 +08:00
josc146
77868c798b chore 2023-07-25 16:37:06 +08:00
josc146
f56748a941 improve python backend startup speed 2023-07-25 16:14:29 +08:00
josc146
29c5b1d804 add midi api 2023-07-25 16:11:17 +08:00
josc146
34095a6c36 support for stop array 2023-07-25 16:10:22 +08:00
josc146
05b9b42b56 add support for MIDI RWKV 2023-07-25 16:09:31 +08:00
josc146
211ae342af improve sse fetch 2023-07-25 15:59:37 +08:00
josc146
5ae683e915 update presets 2023-07-25 15:53:25 +08:00
josc146
dc59fb39c7 update readme 2023-07-18 14:21:09 +08:00
josc146
49960774ee update readme 2023-07-18 14:16:50 +08:00
github-actions[bot]
b718452618 release v1.3.9 2023-07-17 05:05:17 +00:00
79 changed files with 95413 additions and 546 deletions

2
.gitattributes vendored

@@ -2,6 +2,8 @@ backend-python/rwkv_pip/** linguist-vendored
backend-python/wkv_cuda_utils/** linguist-vendored
backend-python/get-pip.py linguist-vendored
backend-python/convert_model.py linguist-vendored
backend-python/convert_safetensors.py linguist-vendored
backend-python/utils/midi.py linguist-vendored
build/** linguist-vendored
finetune/lora/** linguist-vendored
finetune/json2binidx_tool/** linguist-vendored

View File

@@ -11,7 +11,7 @@ env:
jobs:
create-draft:
-runs-on: ubuntu-latest
+runs-on: ubuntu-22.04
steps:
- run: echo "VERSION=${GITHUB_REF_NAME#v}" >> $GITHUB_ENV
- uses: actions/checkout@v3
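
The `VERSION=${GITHUB_REF_NAME#v}` step above relies on shell parameter expansion, which removes one leading `v` from the tag name. A minimal Python sketch of the same rule follows; the function name `version_from_tag` is hypothetical and not part of the workflow.

```python
def version_from_tag(ref_name: str) -> str:
    """Mimic ${GITHUB_REF_NAME#v}: drop a single leading 'v' if there is one."""
    return ref_name[1:] if ref_name.startswith("v") else ref_name


# Example tag names; only the first form appears in the commit list.
version = version_from_tag("v1.4.7")
unchanged = version_from_tag("1.4.7")
print(version, unchanged)
```

Only the first `v` is stripped, so a tag without the prefix passes through unchanged.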
@@ -35,7 +35,7 @@ jobs:
gh release create ${{github.ref_name}} -d -F CURRENT_CHANGE.md -t ${{github.ref_name}}
windows:
-runs-on: windows-latest
+runs-on: windows-2022
needs: create-draft
steps:
- uses: actions/checkout@v3
@@ -48,19 +48,32 @@ jobs:
id: cp310
with:
python-version: '3.10'
- uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
target: wasm32-unknown-unknown
- uses: crazy-max/ghaction-chocolatey@v2
with:
args: install upx
- run: |
Start-BitsTransfer https://github.com/josStorer/LibreHardwareMonitor.Console/releases/download/v0.1.0/LibreHardwareMonitor.Console.zip ./LibreHardwareMonitor.Console.zip
Expand-Archive ./LibreHardwareMonitor.Console.zip -DestinationPath ./components/LibreHardwareMonitor.Console
Start-BitsTransfer https://www.python.org/ftp/python/3.10.11/python-3.10.11-embed-amd64.zip ./python-3.10.11-embed-amd64.zip
Expand-Archive ./python-3.10.11-embed-amd64.zip -DestinationPath ./py310
$content=Get-Content "./py310/python310._pth"; $content | ForEach-Object {if ($_.ReadCount -eq 3) {"Lib\\site-packages"} else {$_}} | Set-Content ./py310/python310._pth
./py310/python ./backend-python/get-pip.py
-./py310/python -m pip install Cython
+./py310/python -m pip install Cython==0.29.36
Copy-Item -Path "${{ steps.cp310.outputs.python-path }}/../include" -Destination "py310/include" -Recurse
Copy-Item -Path "${{ steps.cp310.outputs.python-path }}/../libs" -Destination "py310/libs" -Recurse
-./py310/python -m pip install cyac
+./py310/python -m pip install cyac==1.7
git clone https://github.com/josStorer/ai00_rwkv_server --depth=1
cd ai00_rwkv_server
cargo build --release
mv ./target/release/ai00_server.exe ../backend-rust/webgpu_server.exe
cd ..
go install github.com/wailsapp/wails/v2/cmd/wails@latest
(Get-Content -Path ./backend-golang/app.go) -replace "//go:custom_build windows ", "" | Set-Content -Path ./backend-golang/app.go
make
Rename-Item -Path "build/bin/RWKV-Runner.exe" -NewName "RWKV-Runner_windows_x64.exe"
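
The PowerShell one-liner in this hunk rewrites `python310._pth` by replacing the line whose `ReadCount` is 3 (a 1-based line number) and passing every other line through. A rough Python sketch of that rewrite is below. The helper name `replace_line` and the sample file contents are assumptions for illustration; the real `._pth` file may differ.

```python
def replace_line(lines, line_number, replacement):
    """Return a copy of `lines` with the 1-based `line_number` replaced."""
    return [
        replacement if index == line_number else line
        for index, line in enumerate(lines, start=1)
    ]


# Hypothetical stand-in for the contents of python310._pth.
sample = [
    "python310.zip",
    ".",
    "",
    "# Uncomment to run site.main() automatically",
    "#import site",
]

# PowerShell does not treat the backslash as an escape character, so the
# workflow's "Lib\\site-packages" keeps both backslashes; a raw string mirrors that.
patched = replace_line(sample, 3, r"Lib\\site-packages")
print("\n".join(patched))
```

Replacing by position rather than by content means the step depends on the layout of the embedded distribution's `._pth` file staying the same.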
@@ -76,12 +89,26 @@ jobs:
- uses: actions/setup-go@v4
with:
go-version: '1.20.5'
- uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
target: wasm32-unknown-unknown
- run: |
sudo apt-get update
sudo apt-get install upx
sudo apt-get install build-essential libgtk-3-dev libwebkit2gtk-4.0-dev
git clone https://github.com/josStorer/ai00_rwkv_server --depth=1
cd ai00_rwkv_server
sudo apt-get install libudev-dev
sudo apt-get install libasound2-dev
rustup target add x86_64-unknown-linux-gnu
cargo build --release --target x86_64-unknown-linux-gnu
mv ./target/x86_64-unknown-linux-gnu/release/ai00_server ../backend-rust/webgpu_server
cd ..
go install github.com/wailsapp/wails/v2/cmd/wails@latest
rm -rf ./backend-python/wkv_cuda_utils
rm ./backend-python/rwkv_pip/beta/wkv_cuda.pyd
rm ./backend-python/get-pip.py
sed -i '1,2d' ./backend-golang/wsl_not_windows.go
rm ./backend-golang/wsl.go
@@ -101,9 +128,20 @@ jobs:
- uses: actions/setup-go@v4
with:
go-version: '1.20.5'
- uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
target: wasm32-unknown-unknown
- run: |
git clone https://github.com/josStorer/ai00_rwkv_server --depth=1
cd ai00_rwkv_server
cargo build --release
mv ./target/release/ai00_server ../backend-rust/webgpu_server
cd ..
go install github.com/wailsapp/wails/v2/cmd/wails@latest
rm -rf ./backend-python/wkv_cuda_utils
rm ./backend-python/rwkv_pip/beta/wkv_cuda.pyd
rm ./backend-python/get-pip.py
sed -i '' '1,2d' ./backend-golang/wsl_not_windows.go
rm ./backend-golang/wsl.go
@@ -116,7 +154,7 @@ jobs:
- run: gh release upload ${{github.ref_name}} build/bin/RWKV-Runner_macos_universal.zip build/bin/RWKV-Runner_darwin_universal
publish-release:
-runs-on: ubuntu-latest
+runs-on: ubuntu-22.04
needs: [ windows, linux, macos ]
steps:
- uses: actions/checkout@v3

4
.gitignore vendored

@@ -5,6 +5,8 @@ __pycache__
.idea
.vs
*.pth
*.st
*.safetensors
*.bin
/config.json
/cache.json
@@ -23,3 +25,5 @@ __pycache__
*.log
train_log.txt
finetune/json2binidx_tool/data
/wsl.state
/components

View File

@@ -1,8 +1,8 @@
## Changes
-- fix always show `Convert Failed` when converting model
-- fix input with array type (#96, #107)
-- change chinese translation of `completion`
+- custom tokenizer .txt support
+- add pre-compiled beta cuda kernel (rwkv-beta==0.8.5, 40%+ faster for fp16) (thanks to #180, pre-compiled kernel of RTX
+  40 Series will be included later)
## Install

View File

@@ -49,7 +49,7 @@ English | [简体中文](README_ZH.md) | [日本語](README_JA.md)
#### Default configs has enabled custom CUDA kernel acceleration, which is much faster and consumes much less VRAM. If you encounter possible compatibility issues, go to the Configs page and turn off `Use Custom CUDA kernel to Accelerate`.
-#### If Windows Defender claims this is a virus, you can try downloading [v1.3.7_win.zip](https://github.com/josStorer/RWKV-Runner/releases/download/v1.3.7/RWKV-Runner_win.zip) and letting it update automatically to the latest version, or add it to the trusted list.
+#### If Windows Defender claims this is a virus, you can try downloading [v1.3.7_win.zip](https://github.com/josStorer/RWKV-Runner/releases/download/v1.3.7/RWKV-Runner_win.zip) and letting it update automatically to the latest version, or add it to the trusted list (`Windows Security` -> `Virus & threat protection` -> `Manage settings` -> `Exclusions` -> `Add or remove exclusions` -> `Add an exclusion` -> `Folder` -> `RWKV-Runner`).
#### For different tasks, adjusting API parameters can achieve better results. For example, for translation tasks, you can try setting Temperature to 1 and Top_P to 0.3.
@@ -91,6 +91,9 @@ body.json:
## Embeddings API Example
Note: v1.4.0 has improved the quality of embeddings API. The generated results are not compatible
with previous versions. If you are using embeddings API to generate knowledge bases or similar, please regenerate.
If you are using langchain, just use `OpenAIEmbeddings(openai_api_base="http://127.0.0.1:8000", openai_api_key="sk-")`
```python
@@ -135,6 +138,7 @@ for i in np.argsort(embeddings_cos_sim)[::-1]:
- ChatRWKV: https://github.com/BlinkDL/ChatRWKV
- RWKV-LM: https://github.com/BlinkDL/RWKV-LM
- RWKV-LM-LoRA: https://github.com/Blealtan/RWKV-LM-LoRA
- MIDI-LLM-tokenizer: https://github.com/briansemrau/MIDI-LLM-tokenizer
## Preview
@@ -150,6 +154,10 @@ for i in np.argsort(embeddings_cos_sim)[::-1]:
![image](https://github.com/josStorer/RWKV-Runner/assets/13366013/bf49de8e-3b89-4543-b1ef-7cd4b19a1836)
### Composition
![image](https://github.com/josStorer/RWKV-Runner/assets/13366013/e8ad908d-3fd2-4e92-bcdb-96815cb836ee)
### Configuration
![image](https://github.com/josStorer/RWKV-Runner/assets/13366013/48befdc6-e03c-4851-9bee-22f77ee2640e)
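
The embeddings note in this README diff says results generated before v1.4.0 are not compatible with later ones, so stored vectors should be regenerated before they are compared with new query vectors. The README example ranks results with `np.argsort(embeddings_cos_sim)[::-1]`. The sketch below shows that ranking step in plain Python without NumPy. The function names and the toy vectors are illustrative assumptions, not output from the API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return candidate indices ordered from most to least similar, plus the scores."""
    scores = [cosine_similarity(query, candidate) for candidate in candidates]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Toy 2-dimensional vectors standing in for real embeddings.
query = [1.0, 0.0]
candidates = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
order, scores = rank_by_similarity(query, candidates)
print(order)
```

Because the ranking depends only on the relative geometry of the vectors, mixing vectors produced by two different embedding implementations would give scores with no clear meaning, which is consistent with the advice to regenerate.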

View File

@@ -49,7 +49,7 @@
#### デフォルトの設定はカスタム CUDA カーネルアクセラレーションを有効にしています。互換性の問題が発生する可能性がある場合は、コンフィグページに移動し、`Use Custom CUDA kernel to Accelerate` をオフにしてください。
-#### Windows Defender がこれをウイルスだと主張する場合は、[v1.3.7_win.zip](https://github.com/josStorer/RWKV-Runner/releases/download/v1.3.7/RWKV-Runner_win.zip) をダウンロードして最新版に自動更新させるか、信頼済みリストに追加してみてください。
+#### Windows Defender がこれをウイルスだと主張する場合は、[v1.3.7_win.zip](https://github.com/josStorer/RWKV-Runner/releases/download/v1.3.7/RWKV-Runner_win.zip) をダウンロードして最新版に自動更新させるか、信頼済みリストに追加してみてください (`Windows Security` -> `Virus & threat protection` -> `Manage settings` -> `Exclusions` -> `Add or remove exclusions` -> `Add an exclusion` -> `Folder` -> `RWKV-Runner`)
#### 異なるタスクについては、API パラメータを調整することで、より良い結果を得ることができます。例えば、翻訳タスクの場合、Temperature を 1 に、Top_P を 0.3 に設定してみてください。
@@ -91,7 +91,11 @@ body.json:
## 埋め込み API の例
-LangChain を使用している場合は、`OpenAIEmbeddings(openai_api_base="http://127.0.0.1:8000", openai_api_key="sk-")`を使用してください
+注意: v1.4.0 では、埋め込み API の品質が向上しました。生成される結果は、以前のバージョンとは互換性がありません。
+もし、embeddings API を使って知識ベースなどを生成している場合は、再生成してください。
+LangChain を使用している場合は、`OpenAIEmbeddings(openai_api_base="http://127.0.0.1:8000", openai_api_key="sk-")`
+を使用してください
```python
import numpy as np
@@ -135,6 +139,7 @@ for i in np.argsort(embeddings_cos_sim)[::-1]:
- ChatRWKV: https://github.com/BlinkDL/ChatRWKV
- RWKV-LM: https://github.com/BlinkDL/RWKV-LM
- RWKV-LM-LoRA: https://github.com/Blealtan/RWKV-LM-LoRA
- MIDI-LLM-tokenizer: https://github.com/briansemrau/MIDI-LLM-tokenizer
## プレビュー
@@ -150,6 +155,10 @@ for i in np.argsort(embeddings_cos_sim)[::-1]:
![image](https://github.com/josStorer/RWKV-Runner/assets/13366013/bf49de8e-3b89-4543-b1ef-7cd4b19a1836)
### 作曲
![image](https://github.com/josStorer/RWKV-Runner/assets/13366013/e8ad908d-3fd2-4e92-bcdb-96815cb836ee)
### コンフィグ
![image](https://github.com/josStorer/RWKV-Runner/assets/13366013/48befdc6-e03c-4851-9bee-22f77ee2640e)

View File

@@ -48,7 +48,7 @@ API兼容的接口这意味着一切ChatGPT客户端都是RWKV客户端。
#### 预设配置已经开启自定义CUDA算子加速速度更快且显存消耗更少。如果你遇到可能的兼容性问题前往配置页面关闭`使用自定义CUDA算子加速`
-#### 如果Windows Defender说这是一个病毒你可以尝试下载[v1.3.7_win.zip](https://github.com/josStorer/RWKV-Runner/releases/download/v1.3.7/RWKV-Runner_win.zip),然后让其自动更新到最新版,或添加信任
+#### 如果Windows Defender说这是一个病毒你可以尝试下载[v1.3.7_win.zip](https://github.com/josStorer/RWKV-Runner/releases/download/v1.3.7/RWKV-Runner_win.zip),然后让其自动更新到最新版,或添加信任 (`Windows Security` -> `Virus & threat protection` -> `Manage settings` -> `Exclusions` -> `Add or remove exclusions` -> `Add an exclusion` -> `Folder` -> `RWKV-Runner`)
#### 对于不同的任务调整API参数会获得更好的效果例如对于翻译任务你可以尝试设置Temperature为1Top_P为0.3
@@ -89,6 +89,8 @@ body.json:
## Embeddings API 示例
注意: 1.4.0 版本对embeddings API质量进行了改善生成结果与之前的版本不兼容如果你正在使用此API生成知识库等请重新生成
如果你在用langchain, 直接使用 `OpenAIEmbeddings(openai_api_base="http://127.0.0.1:8000", openai_api_key="sk-")`
```python
@@ -133,6 +135,7 @@ for i in np.argsort(embeddings_cos_sim)[::-1]:
- ChatRWKV: https://github.com/BlinkDL/ChatRWKV
- RWKV-LM: https://github.com/BlinkDL/RWKV-LM
- RWKV-LM-LoRA: https://github.com/Blealtan/RWKV-LM-LoRA
- MIDI-LLM-tokenizer: https://github.com/briansemrau/MIDI-LLM-tokenizer
## Preview
@@ -148,6 +151,10 @@ for i in np.argsort(embeddings_cos_sim)[::-1]:
![image](https://github.com/josStorer/RWKV-Runner/assets/13366013/69f9ba7a-2fe8-4a5e-94cb-aa655aa409e2)
### 作曲
![image](https://github.com/josStorer/RWKV-Runner/assets/13366013/95b34893-80c2-4706-87f9-bc141032ed4b)
### 配置
![image](https://github.com/josStorer/RWKV-Runner/assets/13366013/59460f69-b172-4c7a-86cb-573262543076)

Binary file not shown.

View File

@@ -0,0 +1,116 @@
# https://github.com/magenta/magenta-js/issues/164
import json
import os
import urllib.request


def get_pitches_array(min_pitch, max_pitch):
    return list(range(min_pitch, max_pitch + 1))


base_url = 'https://storage.googleapis.com/magentadata/js/soundfonts'
soundfont_path = 'sgm_plus'
soundfont_json_url = f"{base_url}/{soundfont_path}/soundfont.json"

# Download soundfont.json
soundfont_json = ""
if not os.path.exists('soundfont.json'):
    try:
        with urllib.request.urlopen(soundfont_json_url) as response:
            soundfont_json = response.read()
        # Save soundfont.json
        with open('soundfont.json', 'wb') as file:
            file.write(soundfont_json)
    except:
        print("Failed to download soundfont.json")
else:
    # If file exists, get it from the file system
    with open('soundfont.json', 'rb') as file:
        soundfont_json = file.read()

# Parse soundfont.json
soundfont_data = json.loads(soundfont_json)

if soundfont_data is not None:
    # Iterate over each instrument
    for instrument_id, instrument_name in soundfont_data['instruments'].items():
        if not os.path.isdir(instrument_name):
            # Create instrument directory if it doesn't exist
            os.makedirs(instrument_name)

        instrument_json = ""
        instrument_path = f"{soundfont_path}/{instrument_name}"
        if not os.path.exists(f"{instrument_name}/instrument.json"):
            # Download instrument.json
            instrument_json_url = f"{base_url}/{instrument_path}/instrument.json"
            try:
                with urllib.request.urlopen(instrument_json_url) as response:
                    instrument_json = response.read()
                # Save instrument.json
                with open(f"{instrument_name}/instrument.json", 'wb') as file:
                    file.write(instrument_json)
            except:
                print(f"Failed to download {instrument_name}/instrument.json")
        else:
            # If file exists, get it from the file system
            with open(f"{instrument_name}/instrument.json", 'rb') as file:
                instrument_json = file.read()

        # Parse instrument.json
        instrument_data = json.loads(instrument_json)

        if instrument_data is not None:
            # Iterate over each pitch and velocity
            for velocity in instrument_data['velocities']:
                pitches = get_pitches_array(instrument_data['minPitch'], instrument_data['maxPitch'])
                for pitch in pitches:
                    # Create the file name
                    file_name = f'p{pitch}_v{velocity}.mp3'
                    # Check if the file already exists
                    if os.path.exists(f"{instrument_name}/{file_name}"):
                        pass
                        #print(f"Skipping {instrument_name}/{file_name} - File already exists")
                    else:
                        # Download pitch/velocity file
                        file_url = f"{base_url}/{instrument_path}/{file_name}"
                        try:
                            with urllib.request.urlopen(file_url) as response:
                                file_contents = response.read()
                            # Save pitch/velocity file
                            with open(f"{instrument_name}/{file_name}", 'wb') as file:
                                file.write(file_contents)
                            print(f"Downloaded {instrument_name}/{file_name}")
                        except:
                            print(f"Failed to download {instrument_name}/{file_name}")
        else:
            print(f"Failed to parse instrument.json for {instrument_name}")
else:
    print('Failed to parse soundfont.json')
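
The download script above derives one sample file per pitch and velocity, named `p{pitch}_v{velocity}.mp3`, under `{base_url}/{soundfont_path}/{instrument_name}/`. The sketch below isolates that naming logic so it can be checked without any network access. The function name `sample_urls` and the example metadata values are assumptions for illustration; real `instrument.json` files may contain different ranges.

```python
BASE_URL = "https://storage.googleapis.com/magentadata/js/soundfonts"


def sample_urls(soundfont_path, instrument_name, instrument_data):
    """List the sample URLs implied by an instrument's pitch range and velocities."""
    urls = []
    for velocity in instrument_data["velocities"]:
        # Pitch range is inclusive on both ends, as in get_pitches_array.
        for pitch in range(instrument_data["minPitch"], instrument_data["maxPitch"] + 1):
            file_name = f"p{pitch}_v{velocity}.mp3"
            urls.append(f"{BASE_URL}/{soundfont_path}/{instrument_name}/{file_name}")
    return urls


# Hypothetical metadata; the keys match those read by the script above.
example = {"minPitch": 60, "maxPitch": 62, "velocities": [100]}
urls = sample_urls("sgm_plus", "acoustic_grand_piano", example)
for url in urls:
    print(url)
```

Separating URL construction from downloading makes it easy to see how many requests a full instrument would need: the number of velocities times the number of pitches in the range.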

View File

@@ -0,0 +1,134 @@
{
"name": "sgm_plus",
"instruments": {
"0": "acoustic_grand_piano",
"1": "bright_acoustic_piano",
"2": "electric_grand_piano",
"3": "honkytonk_piano",
"4": "electric_piano_1",
"5": "electric_piano_2",
"6": "harpsichord",
"7": "clavichord",
"8": "celesta",
"9": "glockenspiel",
"10": "music_box",
"11": "vibraphone",
"12": "marimba",
"13": "xylophone",
"14": "tubular_bells",
"15": "dulcimer",
"16": "drawbar_organ",
"17": "percussive_organ",
"18": "rock_organ",
"19": "church_organ",
"20": "reed_organ",
"21": "accordion",
"22": "harmonica",
"23": "tango_accordion",
"24": "acoustic_guitar_nylon",
"25": "acoustic_guitar_steel",
"26": "electric_guitar_jazz",
"27": "electric_guitar_clean",
"28": "electric_guitar_muted",
"29": "overdriven_guitar",
"30": "distortion_guitar",
"31": "guitar_harmonics",
"32": "acoustic_bass",
"33": "electric_bass_finger",
"34": "electric_bass_pick",
"35": "fretless_bass",
"36": "slap_bass_1",
"37": "slap_bass_2",
"38": "synth_bass_1",
"39": "synth_bass_2",
"40": "violin",
"41": "viola",
"42": "cello",
"43": "contrabass",
"44": "tremolo_strings",
"45": "pizzicato_strings",
"46": "orchestral_harp",
"47": "timpani",
"48": "string_ensemble_1",
"49": "string_ensemble_2",
"50": "synthstrings_1",
"51": "synthstrings_2",
"52": "choir_aahs",
"53": "voice_oohs",
"54": "synth_voice",
"55": "orchestra_hit",
"56": "trumpet",
"57": "trombone",
"58": "tuba",
"59": "muted_trumpet",
"60": "french_horn",
"61": "brass_section",
"62": "synthbrass_1",
"63": "synthbrass_2",
"64": "soprano_sax",
"65": "alto_sax",
"66": "tenor_sax",
"67": "baritone_sax",
"68": "oboe",
"69": "english_horn",
"70": "bassoon",
"71": "clarinet",
"72": "piccolo",
"73": "flute",
"74": "recorder",
"75": "pan_flute",
"76": "blown_bottle",
"77": "shakuhachi",
"78": "whistle",
"79": "ocarina",
"80": "lead_1_square",
"81": "lead_2_sawtooth",
"82": "lead_3_calliope",
"83": "lead_4_chiff",
"84": "lead_5_charang",
"85": "lead_6_voice",
"86": "lead_7_fifths",
"87": "lead_8_bass_lead",
"88": "pad_1_new_age",
"89": "pad_2_warm",
"90": "pad_3_polysynth",
"91": "pad_4_choir",
"92": "pad_5_bowed",
"93": "pad_6_metallic",
"94": "pad_7_halo",
"95": "pad_8_sweep",
"96": "fx_1_rain",
"97": "fx_2_soundtrack",
"98": "fx_3_crystal",
"99": "fx_4_atmosphere",
"100": "fx_5_brightness",
"101": "fx_6_goblins",
"102": "fx_7_echoes",
"103": "fx_8_scifi",
"104": "sitar",
"105": "banjo",
"106": "shamisen",
"107": "koto",
"108": "kalimba",
"109": "bag_pipe",
"110": "fiddle",
"111": "shanai",
"112": "tinkle_bell",
"113": "agogo",
"114": "steel_drums",
"115": "woodblock",
"116": "taiko_drum",
"117": "melodic_tom",
"118": "synth_drum",
"119": "reverse_cymbal",
"120": "guitar_fret_noise",
"121": "breath_noise",
"122": "seashore",
"123": "bird_tweet",
"124": "telephone_ring",
"125": "helicopter",
"126": "applause",
"127": "gunshot",
"drums": "percussion"
}
}
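
Many of the instrument directory names in this JSON look like normalized General MIDI patch names. The builder script in this change set derives its keys by lowercasing the name, dropping every character other than `a-z`, `0-9` and space, and replacing each space with an underscore. The Python sketch below reproduces that rule. It is an illustration only: some names in the JSON (for example `clavichord` or `synth_voice`) do not come from the MIDI.js names listed in that script, so the rule should not be read as the origin of every entry.

```python
import re


def instrument_key(patch_name):
    """Lowercase, strip characters outside [a-z0-9 ], then turn spaces into underscores."""
    lowered = patch_name.lower()
    stripped = re.sub(r"[^a-z0-9 ]", "", lowered)
    return stripped.replace(" ", "_")


keys = [
    instrument_key("Honky-tonk Piano"),
    instrument_key("Acoustic Guitar (nylon)"),
    instrument_key("FX 8 (sci-fi)"),
]
print(keys)
```

Each space is replaced individually, so a name that leaves two adjacent spaces after stripping would produce a doubled underscore.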

469
assets/soundfont_builder.rb Normal file

@@ -0,0 +1,469 @@
#!/usr/bin/env ruby
#
# JavaScript Soundfont Builder for MIDI.js
# Author: 0xFE <mohit@muthanna.com>
# edited by Valentijn Nieman <valentijnnieman@gmail.com>
#
# Requires:
#
# FluidSynth
# Lame
# Ruby Gems: midilib parallel
#
# $ brew install fluidsynth lame (on OSX)
# $ gem install midilib parallel
#
# You'll need to download a GM soundbank to generate audio.
#
# Usage:
#
# 1) Install the above dependencies.
# 2) Edit BUILD_DIR, SOUNDFONT, and INSTRUMENTS as required.
# 3) Run without any argument.
require 'base64'
require 'digest/sha1'
require 'etc'
require 'fileutils'
require 'midilib'
require 'parallel'
require 'zlib'
require 'json'
include FileUtils
BUILD_DIR = "./sound-font" # Output path
SOUNDFONT = "./default_sound_font.sf2" # Soundfont file path
# This script will generate MIDI.js-compatible instrument JS files for
# all instruments in the below array. Add or remove as necessary.
INSTRUMENTS = [
0,
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
20,
21,
22,
23,
24,
25,
26,
27,
28,
29,
30,
31,
32,
33,
34,
35,
36,
37,
38,
39,
40,
41,
42,
43,
44,
45,
46,
47,
48,
49,
50,
51,
52,
53,
54,
55,
56,
57,
58,
59,
60,
61,
62,
63,
64,
65,
66,
67,
68,
69,
70,
71,
72,
73,
74,
75,
76,
77,
78,
79,
80,
81,
82,
83,
84,
85,
86,
87,
88,
89,
90,
91,
92,
93,
94,
95,
96,
97,
98,
99,
100,
101,
102,
103,
104,
105,
106,
107,
108,
109,
110,
111,
112,
113,
114,
115,
116,
117,
118,
119,
120,
121,
122,
123,
124,
125,
126,
127
]
# It was found that midilib uses names that are incompatible with MIDI.js
# For example, midilib uses "SynthBrass 1" -> https://github.com/jimm/midilib/blob/6c8e481ae72cd9f00a38eb3700ddfca6b549f153/lib/midilib/consts.rb#L280
# and the MIDI association uses "SynthBrass 1" -> https://www.midi.org/specifications-old/item/gm-level-1-sound-set
# but the MIDI.js calls this "Synth Brass 1" -> https://github.com/mudcube/MIDI.js/blob/a8a84257afa70721ae462448048a87301fc1554a/js/midi/gm.js#L44
# there are others like "Bag pipe" vs "Bagpipe", etc.
# here, we use the MIDI.js definitions because that is how most users will interact with the generated soundfonts.
MIDIJS_PATCH_NAMES = [
"Acoustic Grand Piano",
"Bright Acoustic Piano",
"Electric Grand Piano",
"Honky-tonk Piano",
"Electric Piano 1",
"Electric Piano 2",
"Harpsichord",
"Clavinet",
"Celesta",
"Glockenspiel",
"Music Box",
"Vibraphone",
"Marimba",
"Xylophone",
"Tubular Bells",
"Dulcimer",
"Drawbar Organ",
"Percussive Organ",
"Rock Organ",
"Church Organ",
"Reed Organ",
"Accordion",
"Harmonica",
"Tango Accordion",
"Acoustic Guitar (nylon)",
"Acoustic Guitar (steel)",
"Electric Guitar (jazz)",
"Electric Guitar (clean)",
"Electric Guitar (muted)",
"Overdriven Guitar",
"Distortion Guitar",
"Guitar Harmonics",
"Acoustic Bass",
"Electric Bass (finger)",
"Electric Bass (pick)",
"Fretless Bass",
"Slap Bass 1",
"Slap Bass 2",
"Synth Bass 1",
"Synth Bass 2",
"Violin",
"Viola",
"Cello",
"Contrabass",
"Tremolo Strings",
"Pizzicato Strings",
"Orchestral Harp",
"Timpani",
"String Ensemble 1",
"String Ensemble 2",
"Synth Strings 1",
"Synth Strings 2",
"Choir Aahs",
"Voice Oohs",
"Synth Choir",
"Orchestra Hit",
"Trumpet",
"Trombone",
"Tuba",
"Muted Trumpet",
"French Horn",
"Brass Section",
"Synth Brass 1",
"Synth Brass 2",
"Soprano Sax",
"Alto Sax",
"Tenor Sax",
"Baritone Sax",
"Oboe",
"English Horn",
"Bassoon",
"Clarinet",
"Piccolo",
"Flute",
"Recorder",
"Pan Flute",
"Blown Bottle",
"Shakuhachi",
"Whistle",
"Ocarina",
"Lead 1 (square)",
"Lead 2 (sawtooth)",
"Lead 3 (calliope)",
"Lead 4 (chiff)",
"Lead 5 (charang)",
"Lead 6 (voice)",
"Lead 7 (fifths)",
"Lead 8 (bass + lead)",
"Pad 1 (new age)",
"Pad 2 (warm)",
"Pad 3 (polysynth)",
"Pad 4 (choir)",
"Pad 5 (bowed)",
"Pad 6 (metallic)",
"Pad 7 (halo)",
"Pad 8 (sweep)",
"FX 1 (rain)",
"FX 2 (soundtrack)",
"FX 3 (crystal)",
"FX 4 (atmosphere)",
"FX 5 (brightness)",
"FX 6 (goblins)",
"FX 7 (echoes)",
"FX 8 (sci-fi)",
"Sitar",
"Banjo",
"Shamisen",
"Koto",
"Kalimba",
"Bagpipe",
"Fiddle",
"Shanai",
"Tinkle Bell",
"Agogo",
"Steel Drums",
"Woodblock",
"Taiko Drum",
"Melodic Tom",
"Synth Drum",
"Reverse Cymbal",
"Guitar Fret Noise",
"Breath Noise",
"Seashore",
"Bird Tweet",
"Telephone Ring",
"Helicopter",
"Applause",
"Gunshot"
]
# The encoders and tools are expected in your PATH. You can supply alternate
# paths by changing the constants below.
LAME = "lame" # `which lame`.chomp
FLUIDSYNTH = "fluidsynth" # `which fluidsynth`.chomp
puts "Building the following instruments using font: " + SOUNDFONT
# Display instrument names.
INSTRUMENTS.each do |i|
puts " #{i}: " + MIDIJS_PATCH_NAMES[i]
end
puts
puts "Using MP3 encoder: " + LAME
puts "Using FluidSynth encoder: " + FLUIDSYNTH
puts
puts "Sending output to: " + BUILD_DIR
puts
raise "Can't find soundfont: #{SOUNDFONT}" unless File.exist? SOUNDFONT
raise "Can't find 'lame' command" if LAME.empty?
raise "Can't find 'fluidsynth' command" if FLUIDSYNTH.empty?
raise "Output directory does not exist: #{BUILD_DIR}" unless File.exist?(BUILD_DIR)
puts "Hit return to begin."
$stdin.readline
NOTES = {
"C" => 0,
"Db" => 1,
"D" => 2,
"Eb" => 3,
"E" => 4,
"F" => 5,
"Gb" => 6,
"G" => 7,
"Ab" => 8,
"A" => 9,
"Bb" => 10,
"B" => 11
}
MIDI_C0 = 12
VELOCITY = 100
DURATION = Integer(3000)
TEMP_FILE = "#{BUILD_DIR}/%s%stemp.midi"
FLUIDSYNTH_RAW = "%s.wav"
def deflate(string, level)
z = Zlib::Deflate.new(level)
dst = z.deflate(string, Zlib::FINISH)
z.close
dst
end
def note_to_int(note, octave)
value = NOTES[note]
increment = MIDI_C0 * octave
return value + increment
end
def int_to_note(value)
raise "Bad Value" if value < MIDI_C0
reverse_notes = NOTES.invert
value -= MIDI_C0
octave = value / 12
note = value % 12
return { key: reverse_notes[note],
octave: octave }
end
# Run a quick table validation
MIDI_C0.upto(100) do |x|
note = int_to_note x
#raise "Broken table" unless note_to_int(note[:key], note[:octave]) == x
end
def generate_midi(program, note_value, file)
include MIDI
seq = Sequence.new()
track = Track.new(seq)
seq.tracks << track
track.events << ProgramChange.new(0, Integer(program))
track.events << NoteOn.new(0, note_value, VELOCITY, 0) # channel, note, velocity, delta
track.events << NoteOff.new(0, note_value, VELOCITY, DURATION)
File.open(file, 'wb') { | file | seq.write(file) }
end
def run_command(cmd)
puts "Running: " + cmd
`#{cmd}`
end
def midi_to_audio(source, target)
run_command "#{FLUIDSYNTH} -C no -R no -g 0.5 -F #{target} #{SOUNDFONT} #{source}"
run_command "#{LAME} -v -b 8 -B 64 #{target}"
rm target
end
def open_js_file(instrument_key, type)
js_file = File.open("#{BUILD_DIR}/#{instrument_key}-#{type}.js", "w")
js_file.write(
"""
if (typeof(MIDI) === 'undefined') var MIDI = {};
if (typeof(MIDI.Soundfont) === 'undefined') MIDI.Soundfont = {};
MIDI.Soundfont.#{instrument_key} = {
""")
return js_file
end
def close_js_file(file)
file.write("\n}\n")
file.close
end
def base64js(note, file, type)
output = '"' + note + '": '
output += '"' + "data:audio/#{type};base64,"
output += Base64.strict_encode64(File.read(file)) + '"'
return output
end
def generate_audio(program)
instrument = MIDIJS_PATCH_NAMES[program]
instrument_key = instrument.downcase.gsub(/[^a-z0-9 ]/, "").gsub(/[ ]/, "_")
puts "Generating audio for: " + instrument + "(#{instrument_key})"
mkdir_p "#{BUILD_DIR}/#{instrument_key}"
note_to_int("A", 0).upto(note_to_int("C", 8)) do |note_value|
output_name = "p#{note_value}_v#{VELOCITY}"
output_path_prefix = BUILD_DIR + "/#{instrument_key}" + output_name
puts "Generating: #{output_name}"
temp_file_specific = TEMP_FILE % [output_name, instrument_key]
generate_midi(program, note_value, temp_file_specific)
midi_to_audio(temp_file_specific, output_path_prefix + ".wav")
mv output_path_prefix + ".mp3", "#{BUILD_DIR}/#{instrument_key}/#{output_name}.mp3"
rm temp_file_specific
end
tempHash = {
"name" => instrument_key,
"minPitch" => 0,
"maxPitch" => 127,
"durationSeconds" => 3.0,
"releaseSeconds" => 1.0,
"percussive": false,
"velocities": [100]
}
File.open("#{BUILD_DIR}/#{instrument_key}/instrument.json", "w") do |f|
f.write(tempHash.to_json)
end
end
Parallel.each(INSTRUMENTS, :in_processes=>Etc.nprocessors){|i| generate_audio(i)}
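
The two note helpers in the Ruby script use different octave offsets, which can be checked with a direct Python transcription. `note_to_int` computes `NOTES[note] + 12 * octave`, while `int_to_note` subtracts `MIDI_C0` (12) before dividing. The sketch below mirrors both functions; it is a reading aid, not part of the change set.

```python
NOTES = {
    "C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
    "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11,
}
MIDI_C0 = 12


def note_to_int(note, octave):
    """Transcription of the Ruby helper: pitch class plus 12 per octave."""
    return NOTES[note] + MIDI_C0 * octave


def int_to_note(value):
    """Transcription of the Ruby helper: shift down by MIDI_C0, then split."""
    if value < MIDI_C0:
        raise ValueError("Bad Value")
    reverse_notes = {number: name for name, number in NOTES.items()}
    value -= MIDI_C0
    return {"key": reverse_notes[value % 12], "octave": value // 12}


low = note_to_int("A", 0)    # first pitch rendered by generate_audio
high = note_to_int("C", 8)   # last pitch rendered by generate_audio
round_trip = int_to_note(note_to_int("C", 4))
print(low, high, round_trip)
```

With these definitions the rendering loop covers pitches 9 through 96, and converting a note to an integer and back lowers the octave by one. That mismatch may be why the round-trip check in the table validation is commented out, though the script itself does not say so.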

View File

@@ -1,6 +1,7 @@
package backend_golang
import (
"bufio"
"context"
"errors"
"net/http"
@@ -8,6 +9,7 @@ import (
"os/exec"
"path/filepath"
"runtime"
"syscall"
"github.com/fsnotify/fsnotify"
"github.com/minio/selfupdate"
@@ -41,6 +43,7 @@ func (a *App) OnStartup(ctx context.Context) {
a.cmdPrefix = "cd " + a.exDir + " && "
}
os.Chmod("./backend-rust/webgpu_server", 0777)
os.Mkdir(a.exDir+"models", os.ModePerm)
os.Mkdir(a.exDir+"lora-models", os.ModePerm)
os.Mkdir(a.exDir+"finetune/json2binidx_tool/data", os.ModePerm)
@@ -50,7 +53,18 @@ func (a *App) OnStartup(ctx context.Context) {
}
a.downloadLoop()
a.watchFs()
a.monitorHardware()
}
func (a *App) OnBeforeClose(ctx context.Context) bool {
if monitor != nil {
monitor.Process.Kill()
}
return false
}
func (a *App) watchFs() {
watcher, err := fsnotify.NewWatcher()
if err == nil {
watcher.Add("./lora-models")
@@ -62,7 +76,7 @@ func (a *App) OnStartup(ctx context.Context) {
if !ok {
return
}
-wruntime.EventsEmit(ctx, "fsnotify", event.Name)
+wruntime.EventsEmit(a.ctx, "fsnotify", event.Name)
case _, ok := <-watcher.Errors:
if !ok {
return
@@ -73,6 +87,37 @@ func (a *App) OnStartup(ctx context.Context) {
}
}
var monitor *exec.Cmd

func (a *App) monitorHardware() {
	if runtime.GOOS != "windows" {
		return
	}
	monitor = exec.Command("./components/LibreHardwareMonitor.Console/LibreHardwareMonitor.Console.exe")
	stdout, err := monitor.StdoutPipe()
	if err != nil {
		monitor = nil
		return
	}
	go func() {
		reader := bufio.NewReader(stdout)
		for {
			line, _, err := reader.ReadLine()
			if err != nil {
				wruntime.EventsEmit(a.ctx, "monitorerr", err.Error())
				break
			}
			wruntime.EventsEmit(a.ctx, "monitor", string(line))
		}
	}()
	monitor.SysProcAttr = &syscall.SysProcAttr{}
	//go:custom_build windows monitor.SysProcAttr.HideWindow = true
	monitor.Start()
}
func (a *App) UpdateApp(url string) (broken bool, err error) {
resp, err := http.Get(url)
if err != nil {


@@ -122,6 +122,10 @@ func (a *App) CopyFile(src string, dst string) error {
}
func (a *App) OpenSaveFileDialog(filterPattern string, defaultFileName string, savedContent string) (string, error) {
return a.OpenSaveFileDialogBytes(filterPattern, defaultFileName, []byte(savedContent))
}
func (a *App) OpenSaveFileDialogBytes(filterPattern string, defaultFileName string, savedContent []byte) (string, error) {
path, err := wruntime.SaveFileDialog(a.ctx, wruntime.SaveDialogOptions{
DefaultFilename: defaultFileName,
Filters: []wruntime.FileFilter{{
@@ -135,7 +139,7 @@ func (a *App) OpenSaveFileDialog(filterPattern string, defaultFileName string, s
if path == "" {
return "", nil
}
if err := os.WriteFile(path, []byte(savedContent), 0644); err != nil {
if err := os.WriteFile(path, savedContent, 0644); err != nil {
return "", err
}
return path, nil


@@ -10,7 +10,7 @@ import (
"strings"
)
func (a *App) StartServer(python string, port int, host string) (string, error) {
func (a *App) StartServer(python string, port int, host string, rwkvBeta bool) (string, error) {
var err error
if python == "" {
python, err = GetPython()
@@ -18,7 +18,19 @@ func (a *App) StartServer(python string, port int, host string) (string, error)
if err != nil {
return "", err
}
return Cmd(python, "./backend-python/main.py", strconv.Itoa(port), host)
args := []string{python, "./backend-python/main.py"}
if rwkvBeta {
args = append(args, "--rwkv-beta")
}
args = append(args, "--port", strconv.Itoa(port), "--host", host)
return Cmd(args...)
}
func (a *App) StartWebGPUServer(port int, host string) (string, error) {
args := []string{"./backend-rust/webgpu_server"}
args = append(args, "-a", "0", "-t", "backend-rust/assets/rwkv_vocab_v20230424.json",
"--port", strconv.Itoa(port), "--ip", host)
return Cmd(args...)
}
func (a *App) ConvertModel(python string, modelPath string, strategy string, outPath string) (string, error) {
@@ -32,6 +44,17 @@ func (a *App) ConvertModel(python string, modelPath string, strategy string, out
return Cmd(python, "./backend-python/convert_model.py", "--in", modelPath, "--out", outPath, "--strategy", strategy)
}
func (a *App) ConvertSafetensors(python string, modelPath string, outPath string) (string, error) {
var err error
if python == "" {
python, err = GetPython()
}
if err != nil {
return "", err
}
return Cmd(python, "./backend-python/convert_safetensors.py", "--input", modelPath, "--output", outPath)
}
func (a *App) ConvertData(python string, input string, outputPrefix string, vocab string) (string, error) {
var err error
if python == "" {
@@ -132,7 +155,6 @@ func (a *App) InstallPyDep(python string, cnMirror bool) (string, error) {
"exit"
if !cnMirror {
installScript = strings.Replace(installScript, " -i https://pypi.tuna.tsinghua.edu.cn/simple", "", -1)
installScript = strings.Replace(installScript, "requirements.txt", "requirements_versions.txt", -1)
}
err = os.WriteFile("./install-py-dep.bat", []byte(installScript), 0644)
if err != nil {

backend-python/convert_safetensors.py vendored Normal file

@@ -0,0 +1,53 @@
import json
import os
import sys
import copy
import torch
from safetensors.torch import load_file, save_file
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--input", type=str, help="Path to input pth model")
parser.add_argument(
"--output",
type=str,
default="./converted.st",
help="Path to output safetensors model",
)
args = parser.parse_args()
def convert_file(
pt_filename: str,
sf_filename: str,
):
loaded = torch.load(pt_filename, map_location="cpu")
if "state_dict" in loaded:
loaded = loaded["state_dict"]
loaded = {k: v.clone().half() for k, v in loaded.items()}
for k, v in loaded.items():
print(f"{k}\t{v.shape}\t{v.dtype}")
# For tensors to be contiguous
loaded = {k: v.contiguous() for k, v in loaded.items()}
dirname = os.path.dirname(sf_filename)
os.makedirs(dirname, exist_ok=True)
save_file(loaded, sf_filename, metadata={"format": "pt"})
reloaded = load_file(sf_filename)
for k in loaded:
pt_tensor = loaded[k]
sf_tensor = reloaded[k]
if not torch.equal(pt_tensor, sf_tensor):
raise RuntimeError(f"The output tensors do not match for key {k}")
if __name__ == "__main__":
try:
convert_file(args.input, args.output)
print(f"Saved to {args.output}")
except Exception as e:
with open("error.txt", "w") as f:
f.write(str(e))


@@ -1,3 +1,6 @@
import safetensors
import midi2audio
import mido
import lm_dataformat
import ftfy
import tqdm


@@ -1,5 +1,6 @@
from enum import Enum, auto
Args = "args"
Model = "model"
Model_Status = "model_status"
Model_Config = "model_config"


@@ -1,5 +1,11 @@
import time
start_time = time.time()
import os
import sys
import argparse
from typing import Sequence
sys.path.append(os.path.dirname(os.path.realpath(__file__)))
@@ -12,7 +18,7 @@ from utils.rwkv import *
from utils.torch import *
from utils.ngrok import *
from utils.log import log_middleware
from routes import completion, config, state_cache
from routes import completion, config, state_cache, midi, misc
import global_var
app = FastAPI(dependencies=[Depends(log_middleware)])
@@ -27,12 +33,19 @@ app.add_middleware(
app.include_router(completion.router)
app.include_router(config.router)
app.include_router(midi.router)
app.include_router(misc.router)
app.include_router(state_cache.router)
@app.on_event("startup")
def init():
global_var.init()
cmd_params = os.environ["RWKV_RUNNER_PARAMS"]
global_var.set(
global_var.Args, get_args(cmd_params.split(" ") if cmd_params else None)
)
state_cache.init()
set_torch()
@@ -41,12 +54,12 @@ def init():
ngrok_connect()
@app.get("/")
@app.get("/", tags=["Root"])
def read_root():
return {"Hello": "World!"}
@app.post("/exit")
@app.post("/exit", tags=["Root"])
def exit():
parent_pid = os.getpid()
parent = psutil.Process(parent_pid)
@@ -55,20 +68,34 @@ def exit():
parent.kill()
def debug():
model = RWKV(
model="../models/RWKV-4-Raven-7B-v11-Eng49%-Chn49%-Jpn1%-Other1%-20230430-ctx8192.pth",
strategy="cuda fp16",
tokens_path="20B_tokenizer.json",
def get_args(args: Union[Sequence[str], None] = None):
parser = argparse.ArgumentParser()
group = parser.add_argument_group(title="server arguments")
group.add_argument(
"--port",
type=int,
default=8000,
help="port to run the server on (default: 8000)",
)
d = model.pipeline.decode([])
print(d)
group.add_argument(
"--host",
type=str,
default="127.0.0.1",
help="host to run the server on (default: 127.0.0.1)",
)
group = parser.add_argument_group(title="mode arguments")
group.add_argument(
"--rwkv-beta",
action="store_true",
help="whether to use rwkv-beta (default: False)",
)
args = parser.parse_args(args)
return args
if __name__ == "__main__":
uvicorn.run(
"main:app",
port=8000 if len(sys.argv) < 2 else int(sys.argv[1]),
host="127.0.0.1" if len(sys.argv) < 3 else sys.argv[2],
)
# debug()
args = get_args()
os.environ["RWKV_RUNNER_PARAMS"] = " ".join(sys.argv[1:])
print("--- %s seconds ---" % (time.time() - start_time))
uvicorn.run("main:app", port=args.port, host=args.host, workers=1)
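
Note on the argument handling above: `main.py` now parses `--port`, `--host` and `--rwkv-beta` with argparse and passes the raw argument string to the startup hook through the `RWKV_RUNNER_PARAMS` environment variable. The round trip can be sketched as below. This is a minimal sketch, not the project's code: the helper name `params_from_env` is hypothetical, and unlike the diff it reads the variable with `os.environ.get` and falls back to an empty argument list (the diff indexes `os.environ` directly and passes `None` for an empty string, which makes argparse read `sys.argv`).

```python
import argparse
import os
from typing import Optional, Sequence


def get_args(args: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the three server flags shown in the diff."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--port", type=int, default=8000)
    parser.add_argument("--host", type=str, default="127.0.0.1")
    parser.add_argument("--rwkv-beta", action="store_true")
    return parser.parse_args(args)


def params_from_env(env_key: str = "RWKV_RUNNER_PARAMS") -> argparse.Namespace:
    """Hypothetical helper: rebuild the namespace from the environment variable.

    Assumption: a missing or empty variable means "use the defaults".
    """
    raw = os.environ.get(env_key, "")
    # str.split() with no separator drops empty fields from repeated spaces.
    return get_args(raw.split())


if __name__ == "__main__":
    os.environ["RWKV_RUNNER_PARAMS"] = "--port 9000 --rwkv-beta"
    print(params_from_env())
```

Splitting with `str.split()` rather than `split(" ")` avoids empty arguments when the string contains repeated spaces. Neither form handles argument values that themselves contain spaces.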

Binary file not shown.


@@ -2,11 +2,12 @@ import asyncio
import json
from threading import Lock
from typing import List, Union
from enum import Enum
import base64
from fastapi import APIRouter, Request, status, HTTPException
from sse_starlette.sse import EventSourceResponse
from pydantic import BaseModel
from pydantic import BaseModel, Field
import numpy as np
import tiktoken
from utils.rwkv import *
@@ -16,24 +17,52 @@ import global_var
router = APIRouter()
class Role(Enum):
User = "user"
Assistant = "assistant"
System = "system"
class Message(BaseModel):
role: str
content: str
role: Role
content: str = Field(min_length=0)
raw: bool = Field(False, description="Whether to treat content as raw text")
default_stop = [
"\n\nUser",
"\n\nQuestion",
"\n\nQ",
"\n\nHuman",
"\n\nBob",
]
class ChatCompletionBody(ModelConfigBody):
messages: List[Message]
model: str = "rwkv"
messages: Union[List[Message], None]
model: Union[str, None] = "rwkv"
stream: bool = False
stop: str = None
stop: Union[str, List[str], None] = default_stop
user_name: Union[str, None] = Field(None, description="Internal user name")
assistant_name: Union[str, None] = Field(
None, description="Internal assistant name"
)
presystem: bool = Field(
True, description="Whether to insert default system prompt at the beginning"
)
class Config:
schema_extra = {
"example": {
"messages": [{"role": "user", "content": "hello"}],
"messages": [
{"role": Role.User.value, "content": "hello", "raw": False}
],
"model": "rwkv",
"stream": False,
"stop": None,
"user_name": None,
"assistant_name": None,
"presystem": True,
"max_tokens": 1000,
"temperature": 1.2,
"top_p": 0.5,
@@ -44,10 +73,10 @@ class ChatCompletionBody(ModelConfigBody):
class CompletionBody(ModelConfigBody):
prompt: Union[str, List[str]]
model: str = "rwkv"
prompt: Union[str, List[str], None]
model: Union[str, None] = "rwkv"
stream: bool = False
stop: str = None
stop: Union[str, List[str], None] = None
class Config:
schema_extra = {
@@ -72,12 +101,12 @@ requests_num = 0
async def eval_rwkv(
model: RWKV,
model: AbstractRWKV,
request: Request,
body: ModelConfigBody,
prompt: str,
stream: bool,
stop: str,
stop: Union[str, List[str], None],
chat_mode: bool,
):
global requests_num
@@ -121,7 +150,7 @@ async def eval_rwkv(
"object": "chat.completion.chunk"
if chat_mode
else "text_completion",
"response": response,
# "response": response,
"model": model.name,
"choices": [
{
@@ -159,7 +188,7 @@ async def eval_rwkv(
"object": "chat.completion.chunk"
if chat_mode
else "text_completion",
"response": response,
# "response": response,
"model": model.name,
"choices": [
{
@@ -180,7 +209,7 @@ async def eval_rwkv(
else:
yield {
"object": "chat.completion" if chat_mode else "text_completion",
"response": response,
# "response": response,
"model": model.name,
"usage": {
"prompt_tokens": prompt_tokens,
@@ -190,7 +219,7 @@ async def eval_rwkv(
"choices": [
{
"message": {
"role": "assistant",
"role": Role.Assistant.value,
"content": response,
},
"index": 0,
@@ -206,103 +235,115 @@ async def eval_rwkv(
}
@router.post("/v1/chat/completions")
@router.post("/chat/completions")
@router.post("/v1/chat/completions", tags=["Completions"])
@router.post("/chat/completions", tags=["Completions"])
async def chat_completions(body: ChatCompletionBody, request: Request):
model: RWKV = global_var.get(global_var.Model)
model: TextRWKV = global_var.get(global_var.Model)
if model is None:
raise HTTPException(status.HTTP_400_BAD_REQUEST, "model not loaded")
question = body.messages[-1]
if question.role == "user":
question = question.content
elif question.role == "system":
question = body.messages[-2]
if question.role == "user":
question = question.content
else:
raise HTTPException(status.HTTP_400_BAD_REQUEST, "no question found")
else:
raise HTTPException(status.HTTP_400_BAD_REQUEST, "no question found")
if body.messages is None or body.messages == []:
raise HTTPException(status.HTTP_400_BAD_REQUEST, "messages not found")
interface = model.interface
user = model.user
bot = model.bot
user = model.user if body.user_name is None else body.user_name
bot = model.bot if body.assistant_name is None else body.assistant_name
completion_text = (
f"""
is_raven = model.rwkv_type == RWKVType.Raven
completion_text: str = ""
basic_system: Union[str, None] = None
if body.presystem:
if body.messages[0].role == Role.System:
basic_system = body.messages[0].content
if basic_system is None:
completion_text = (
f"""
The following is a coherent verbose detailed conversation between a girl named {bot} and her friend {user}. \
{bot} is very intelligent, creative and friendly. \
{bot} is unlikely to disagree with {user}, and {bot} doesn't like to ask {user} questions. \
{bot} likes to tell {user} a lot about herself and her opinions. \
{bot} usually gives {user} kind, helpful and informative advices.\n
"""
if user == "Bob"
else f"{user}{interface} hi\n\n{bot}{interface} Hi. "
+ "I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.\n\n"
)
for message in body.messages:
if message.role == "system":
if is_raven
else (
f"{user}{interface} hi\n\n{bot}{interface} Hi. "
+ "I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.\n\n"
)
)
else:
if not body.messages[0].raw:
basic_system = (
basic_system.replace("\r\n", "\n")
.replace("\r", "\n")
.replace("\n\n", "\n")
.replace("\n", " ")
.strip()
)
completion_text = (
f"The following is a coherent verbose detailed conversation between a girl named {bot} and her friend {user}. "
if user == "Bob"
else f"{user}{interface} hi\n\n{bot}{interface} Hi. "
+ message.content.replace("\\n", "\n")
.replace("\r\n", "\n")
.replace("\n\n", "\n")
.replace("\n", " ")
.strip()
.replace("You are", f"{bot} is" if user == "Bob" else "I am")
.replace("you are", f"{bot} is" if user == "Bob" else "I am")
.replace("You're", f"{bot} is" if user == "Bob" else "I'm")
.replace("you're", f"{bot} is" if user == "Bob" else "I'm")
.replace("You", f"{bot}" if user == "Bob" else "I")
.replace("you", f"{bot}" if user == "Bob" else "I")
.replace("Your", f"{bot}'s" if user == "Bob" else "My")
.replace("your", f"{bot}'s" if user == "Bob" else "my")
.replace("", f"{bot}" if user == "Bob" else "")
(
f"The following is a coherent verbose detailed conversation between a girl named {bot} and her friend {user}. "
if is_raven
else f"{user}{interface} hi\n\n{bot}{interface} Hi. "
)
+ basic_system.replace("You are", f"{bot} is" if is_raven else "I am")
.replace("you are", f"{bot} is" if is_raven else "I am")
.replace("You're", f"{bot} is" if is_raven else "I'm")
.replace("you're", f"{bot} is" if is_raven else "I'm")
.replace("You", f"{bot}" if is_raven else "I")
.replace("you", f"{bot}" if is_raven else "I")
.replace("Your", f"{bot}'s" if is_raven else "My")
.replace("your", f"{bot}'s" if is_raven else "my")
.replace("", f"{bot}" if is_raven else "")
+ "\n\n"
)
break
for message in body.messages:
if message.role == "user":
completion_text += (
f"{user}{interface} "
+ message.content.replace("\\n", "\n")
.replace("\r\n", "\n")
for message in body.messages[(0 if basic_system is None else 1) :]:
append_message: str = ""
if message.role == Role.User:
append_message = f"{user}{interface} " + message.content
elif message.role == Role.Assistant:
append_message = f"{bot}{interface} " + message.content
elif message.role == Role.System:
append_message = message.content
if not message.raw:
append_message = (
append_message.replace("\r\n", "\n")
.replace("\r", "\n")
.replace("\n\n", "\n")
.strip()
+ "\n\n"
)
elif message.role == "assistant":
completion_text += (
f"{bot}{interface} "
+ message.content.replace("\\n", "\n")
.replace("\r\n", "\n")
.replace("\n\n", "\n")
.strip()
+ "\n\n"
)
completion_text += append_message + "\n\n"
completion_text += f"{bot}{interface}"
stop = f"\n\n{user}" if body.stop is None else body.stop
if type(body.stop) == str:
body.stop = [body.stop, f"\n\n{user}", f"\n\n{bot}"]
elif type(body.stop) == list:
body.stop.append(f"\n\n{user}")
body.stop.append(f"\n\n{bot}")
elif body.stop is None:
body.stop = default_stop
if body.stream:
return EventSourceResponse(
eval_rwkv(model, request, body, completion_text, body.stream, stop, True)
eval_rwkv(
model, request, body, completion_text, body.stream, body.stop, True
)
)
else:
try:
return await eval_rwkv(
model, request, body, completion_text, body.stream, stop, True
model, request, body, completion_text, body.stream, body.stop, True
).__anext__()
except StopAsyncIteration:
return None
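
Note on the reworked `chat_completions` above: each message is prefixed by role, has its line endings normalized unless `raw` is set, and is followed by a blank line; `stop` may now be a string, a list, or `None`. The sketch below restates those two steps as standalone functions. It is a minimal sketch under assumptions: the function names `format_message` and `normalize_stop` are hypothetical, the `":"` interface value stands in for `model.interface`, and, unlike the diff, `normalize_stop` copies its input and skips duplicates instead of appending to `body.stop` in place.

```python
from typing import List, Optional, Union

# Same values as default_stop in the diff.
DEFAULT_STOP = ["\n\nUser", "\n\nQuestion", "\n\nQ", "\n\nHuman", "\n\nBob"]


def format_message(role: str, content: str, user: str, bot: str,
                   interface: str = ":", raw: bool = False) -> str:
    """Build the prompt fragment for one message (roles as plain strings)."""
    if role == "user":
        text = f"{user}{interface} " + content
    elif role == "assistant":
        text = f"{bot}{interface} " + content
    else:  # system messages carry no speaker prefix
        text = content
    if not raw:
        text = (
            text.replace("\r\n", "\n")
            .replace("\r", "\n")
            .replace("\n\n", "\n")
            .strip()
        )
    return text + "\n\n"


def normalize_stop(stop: Union[str, List[str], None],
                   user: str, bot: str) -> List[str]:
    """Return a fresh stop list that also ends generation at the next turn."""
    if stop is None:
        return list(DEFAULT_STOP)
    result = [stop] if isinstance(stop, str) else list(stop)
    for extra in (f"\n\n{user}", f"\n\n{bot}"):
        if extra not in result:
            result.append(extra)
    return result


if __name__ == "__main__":
    print(repr(format_message("user", "hi\r\n\r\nthere ", "User", "Assistant")))
    print(normalize_stop("###", "User", "Assistant"))
```

Returning a new list keeps a shared default such as `DEFAULT_STOP` from growing across requests, which is the reason for copying here.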
@router.post("/v1/completions")
@router.post("/completions")
@router.post("/v1/completions", tags=["Completions"])
@router.post("/completions", tags=["Completions"])
async def completions(body: CompletionBody, request: Request):
model: RWKV = global_var.get(global_var.Model)
model: AbstractRWKV = global_var.get(global_var.Model)
if model is None:
raise HTTPException(status.HTTP_400_BAD_REQUEST, "model not loaded")
@@ -326,8 +367,8 @@ async def completions(body: CompletionBody, request: Request):
class EmbeddingsBody(BaseModel):
input: Union[str, List[str], List[List[int]]]
model: str = "rwkv"
input: Union[str, List[str], List[List[int]], None]
model: Union[str, None] = "rwkv"
encoding_format: str = None
fast_mode: bool = False
@@ -346,12 +387,12 @@ def embedding_base64(embedding: List[float]) -> str:
return base64.b64encode(np.array(embedding).astype(np.float32)).decode("utf-8")
@router.post("/v1/embeddings")
@router.post("/embeddings")
@router.post("/v1/engines/text-embedding-ada-002/embeddings")
@router.post("/engines/text-embedding-ada-002/embeddings")
@router.post("/v1/embeddings", tags=["Embeddings"])
@router.post("/embeddings", tags=["Embeddings"])
@router.post("/v1/engines/text-embedding-ada-002/embeddings", tags=["Embeddings"])
@router.post("/engines/text-embedding-ada-002/embeddings", tags=["Embeddings"])
async def embeddings(body: EmbeddingsBody, request: Request):
model: RWKV = global_var.get(global_var.Model)
model: AbstractRWKV = global_var.get(global_var.Model)
if model is None:
raise HTTPException(status.HTTP_400_BAD_REQUEST, "model not loaded")


@@ -6,20 +6,22 @@ from pydantic import BaseModel
from utils.rwkv import *
from utils.torch import *
import global_var
import GPUtil
router = APIRouter()
def get_tokens_path(model_path: str):
model_path = model_path.lower()
default_tokens_path = (
f"{pathlib.Path(__file__).parent.parent.resolve()}/rwkv_pip/20B_tokenizer.json"
)
tokenizer_dir = f"{pathlib.Path(__file__).parent.parent.resolve()}/rwkv_pip/"
default_tokens_path = tokenizer_dir + "20B_tokenizer.json"
if "raven" in model_path:
return default_tokens_path
elif "world" in model_path:
return "rwkv_vocab_v20230424"
elif "midi" in model_path:
return tokenizer_dir + "tokenizer-midi.json"
else:
return default_tokens_path
@@ -27,6 +29,7 @@ def get_tokens_path(model_path: str):
class SwitchModelBody(BaseModel):
model: str
strategy: str
tokenizer: Union[str, None] = None
customCuda: bool = False
class Config:
@@ -34,12 +37,13 @@ class SwitchModelBody(BaseModel):
"example": {
"model": "models/RWKV-4-World-3B-v1-20230619-ctx4096.pth",
"strategy": "cuda fp16",
"tokenizer": None,
"customCuda": False,
}
}
@router.post("/switch-model")
@router.post("/switch-model", tags=["Configs"])
def switch_model(body: SwitchModelBody, response: Response, request: Request):
if global_var.get(global_var.Model_Status) is global_var.ModelStatus.Loading:
response.status_code = Status.HTTP_304_NOT_MODIFIED
@@ -63,13 +67,24 @@ def switch_model(body: SwitchModelBody, response: Response, request: Request):
os.environ["RWKV_CUDA_ON"] = "1" if body.customCuda else "0"
global_var.set(global_var.Model_Status, global_var.ModelStatus.Loading)
tokenizer = (
get_tokens_path(body.model)
if body.tokenizer is None or body.tokenizer == ""
else body.tokenizer
)
try:
global_var.set(
global_var.Model,
RWKV(
TextRWKV(
model=body.model,
strategy=body.strategy,
tokens_path=get_tokens_path(body.model),
tokens_path=tokenizer,
)
if "midi" not in body.model.lower()
else MusicRWKV(
model=body.model,
strategy=body.strategy,
tokens_path=tokenizer,
),
)
except Exception as e:
@@ -89,7 +104,7 @@ def switch_model(body: SwitchModelBody, response: Response, request: Request):
return "success"
@router.post("/update-config")
@router.post("/update-config", tags=["Configs"])
def update_config(body: ModelConfigBody):
"""
Does not update the model config immediately; the new config is applied when a completion is called, to avoid modifications during generation
@@ -101,8 +116,10 @@ def update_config(body: ModelConfigBody):
return "success"
@router.get("/status")
@router.get("/status", tags=["Configs"])
def status():
import GPUtil
gpus = GPUtil.getGPUs()
if len(gpus) == 0:
device_name = "CPU"


@@ -0,0 +1,131 @@
import io
from fastapi import APIRouter, HTTPException, status
from starlette.responses import StreamingResponse
from pydantic import BaseModel
from utils.midi import *
from midi2audio import FluidSynth
router = APIRouter()
class TextToMidiBody(BaseModel):
text: str
class Config:
schema_extra = {
"example": {
"text": "p:24:a p:2a:a p:31:a p:39:a p:3b:a p:45:a b:26:a g:3e:a g:3e:a g:42:a g:42:a g:45:a g:45:a pi:3e:a pi:42:a pi:45:a t14 p:24:0 p:2a:0 p:31:0 p:39:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 p:45:0 b:26:0 g:3e:0 g:3e:0 g:42:0 g:42:0 g:45:0 g:45:0 pi:3e:0 pi:42:0 pi:45:0 t2 p:2e:a p:3b:a p:45:a b:26:a g:3e:a g:3e:a g:42:a g:42:a g:45:a g:45:a pi:3e:a pi:42:a pi:45:a t14 p:2e:0 p:3b:0 p:45:0 g:3e:0 g:3e:0 g:42:0 g:42:0 g:45:0 g:45:0 pi:3e:0 pi:42:0 pi:45:0 t2 p:2e:a p:3b:a p:45:a g:3e:a g:3e:a g:42:a g:42:a g:45:a g:45:a pi:3e:a pi:42:a pi:45:a t14 p:2e:0 p:3b:0 p:45:0 b:26:0 g:3e:0 g:3e:0 g:42:0 g:42:0 g:45:0 g:45:0 pi:3e:0 pi:42:0 pi:45:0 t2 p:26:a p:2a:a p:3b:a p:45:a t14 p:26:0 p:2a:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a b:26:a g:3e:a g:3e:a g:42:a g:42:a g:45:a g:45:a pi:3e:a pi:42:a pi:45:a t14 p:2a:0 p:3b:0 p:45:0 b:26:0 t2 p:24:a p:2a:a p:3b:a p:45:a b:2d:a t14 p:24:0 p:2a:0 p:3b:0 p:45:0 b:2d:0 g:3e:0 g:3e:0 g:42:0 g:42:0 g:45:0 g:45:0 pi:3e:0 pi:42:0 pi:45:0 t2 p:24:a p:2a:a p:3b:a p:45:a b:21:a g:39:a g:39:a g:3d:a g:3d:a g:40:a g:40:a pi:39:a pi:3d:a pi:40:a t14 p:24:0 p:2a:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 p:45:0 b:21:0 g:39:0 g:39:0 g:3d:0 g:3d:0 g:40:0 g:40:0 pi:39:0 pi:3d:0 pi:40:0 t2 p:24:a p:2e:a p:3b:a p:45:a b:21:a g:39:a g:39:a g:3d:a g:3d:a g:40:a g:40:a pi:39:a pi:3d:a pi:40:a t14 p:24:0 p:2e:0 p:3b:0 p:45:0 b:21:0 g:39:0 g:39:0 g:3d:0 g:3d:0 g:40:0 g:40:0 pi:39:0 pi:3d:0 pi:40:0 t2 p:24:a p:2a:a p:3b:a p:45:a b:21:a g:39:a g:39:a g:3d:a g:3d:a g:40:a g:40:a pi:39:a pi:3d:a pi:40:a t14 p:24:0 p:2a:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 p:45:0 b:21:0 g:39:0 g:39:0 g:3d:0 g:3d:0 g:40:0 g:40:0 pi:39:0 pi:3d:0 pi:40:0 t2 p:26:a p:2a:a p:3b:a p:45:a b:21:a g:39:a g:39:a g:3d:a g:3d:a g:40:a g:40:a pi:39:a pi:3d:a pi:40:a t14 p:26:0 p:2a:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 p:45:0 b:21:0 g:39:0 g:39:0 g:3d:0 g:3d:0 g:40:0 g:40:0 pi:39:0 pi:3d:0 pi:40:0 t2 p:26:a 
p:2e:a p:31:a p:39:a p:3b:a p:45:a b:21:a g:39:a g:39:a g:3d:a g:3d:a g:40:a g:40:a pi:39:a pi:3d:a pi:40:a t14 p:26:0 p:2e:0 p:31:0 p:39:0 p:3b:0 p:45:0 b:21:0 t2 p:26:a p:2e:a p:31:a p:39:a p:3b:a p:45:a b:21:a t14 p:26:0 p:2e:0 p:31:0 p:39:0 p:3b:0 p:45:0 b:21:0 g:39:0 g:39:0 g:3d:0 g:3d:0 g:40:0 g:40:0 pi:39:0 pi:3d:0 pi:40:0 t2 p:24:a p:2a:a p:31:a p:39:a p:3b:a p:45:a b:1f:a g:3b:a g:3b:a g:3e:a g:3e:a g:43:a g:43:a pi:3b:a pi:3e:a pi:43:a t14 p:24:0 p:2a:0 p:31:0 p:39:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 p:45:0 b:1f:0 g:3b:0 g:3b:0 g:3e:0 g:3e:0 g:43:0 g:43:0 pi:3b:0 pi:3e:0 pi:43:0 t2 p:2e:a p:3b:a p:45:a b:1f:a g:3b:a g:3b:a g:3e:a g:3e:a g:43:a g:43:a pi:3b:a pi:3e:a pi:43:a t14 p:2e:0 p:3b:0 p:45:0 g:3b:0 g:3b:0 g:3e:0 g:3e:0 g:43:0 g:43:0 pi:3b:0 pi:3e:0 pi:43:0 t2 p:2e:a p:3b:a p:45:a g:3b:a g:3b:a g:3e:a g:3e:a g:43:a g:43:a pi:3b:a pi:3e:a pi:43:a t14 p:2e:0 p:3b:0 p:45:0 b:1f:0 g:3b:0 g:3b:0 g:3e:0 g:3e:0 g:43:0 g:43:0 pi:3b:0 pi:3e:0 pi:43:0 t2 p:26:a p:2a:a p:3b:a p:45:a t14 p:26:0 p:2a:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a b:1f:a g:3b:a g:3b:a g:3e:a g:3e:a g:43:a g:43:a pi:3b:a pi:3e:a pi:43:a t14 p:2a:0 p:3b:0 p:45:0 b:1f:0 t2 p:24:a p:2a:a p:3b:a p:45:a b:1f:a t14 p:24:0 p:2a:0 p:3b:0 p:45:0 b:1f:0 g:3b:0 g:3b:0 g:3e:0 g:3e:0 g:43:0 g:43:0 pi:3b:0 pi:3e:0 pi:43:0 t2 p:24:a p:2e:a p:3b:a p:45:a b:26:a g:39:a g:39:a g:3e:a g:3e:a g:42:a g:42:a pi:39:a pi:3e:a pi:42:a t14 p:24:0 p:2e:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0",
}
}
@router.post("/text-to-midi", tags=["MIDI"])
def text_to_midi(body: TextToMidiBody):
vocab_config = "backend-python/utils/midi_vocab_config.json"
cfg = VocabConfig.from_json(vocab_config)
mid = convert_str_to_midi(cfg, body.text.strip())
mid_data = io.BytesIO()
mid.save(None, mid_data)
mid_data.seek(0)
return StreamingResponse(mid_data, media_type="audio/midi")
class TxtToMidiBody(BaseModel):
txt_path: str
midi_path: str
class Config:
schema_extra = {
"example": {
"txt_path": "midi/sample.txt",
"midi_path": "midi/sample.mid",
}
}
@router.post("/txt-to-midi", tags=["MIDI"])
def txt_to_midi(body: TxtToMidiBody):
if not body.midi_path.startswith("midi/"):
raise HTTPException(status.HTTP_400_BAD_REQUEST, "bad output path")
vocab_config = "backend-python/utils/midi_vocab_config.json"
cfg = VocabConfig.from_json(vocab_config)
with open(body.txt_path, "r") as f:
text = f.read()
text = text.strip()
mid = convert_str_to_midi(cfg, text)
mid.save(body.midi_path)
return "success"
class MidiToWavBody(BaseModel):
midi_path: str
wav_path: str
sound_font_path: str = "assets/default_sound_font.sf2"
class Config:
schema_extra = {
"example": {
"midi_path": "midi/sample.mid",
"wav_path": "midi/sample.wav",
"sound_font_path": "assets/default_sound_font.sf2",
}
}
@router.post("/midi-to-wav", tags=["MIDI"])
def midi_to_wav(body: MidiToWavBody):
"""
Install FluidSynth first; see https://github.com/FluidSynth/fluidsynth/wiki/Download#distributions
"""
if not body.wav_path.startswith("midi/"):
raise HTTPException(status.HTTP_400_BAD_REQUEST, "bad output path")
fs = FluidSynth(body.sound_font_path)
fs.midi_to_audio(body.midi_path, body.wav_path)
return "success"
class TextToWavBody(BaseModel):
text: str
wav_name: str
sound_font_path: str = "assets/default_sound_font.sf2"
class Config:
schema_extra = {
"example": {
"text": "p:24:a p:2a:a p:31:a p:39:a p:3b:a p:45:a b:26:a g:3e:a g:3e:a g:42:a g:42:a g:45:a g:45:a pi:3e:a pi:42:a pi:45:a t14 p:24:0 p:2a:0 p:31:0 p:39:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 p:45:0 b:26:0 g:3e:0 g:3e:0 g:42:0 g:42:0 g:45:0 g:45:0 pi:3e:0 pi:42:0 pi:45:0 t2 p:2e:a p:3b:a p:45:a b:26:a g:3e:a g:3e:a g:42:a g:42:a g:45:a g:45:a pi:3e:a pi:42:a pi:45:a t14 p:2e:0 p:3b:0 p:45:0 g:3e:0 g:3e:0 g:42:0 g:42:0 g:45:0 g:45:0 pi:3e:0 pi:42:0 pi:45:0 t2 p:2e:a p:3b:a p:45:a g:3e:a g:3e:a g:42:a g:42:a g:45:a g:45:a pi:3e:a pi:42:a pi:45:a t14 p:2e:0 p:3b:0 p:45:0 b:26:0 g:3e:0 g:3e:0 g:42:0 g:42:0 g:45:0 g:45:0 pi:3e:0 pi:42:0 pi:45:0 t2 p:26:a p:2a:a p:3b:a p:45:a t14 p:26:0 p:2a:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a b:26:a g:3e:a g:3e:a g:42:a g:42:a g:45:a g:45:a pi:3e:a pi:42:a pi:45:a t14 p:2a:0 p:3b:0 p:45:0 b:26:0 t2 p:24:a p:2a:a p:3b:a p:45:a b:2d:a t14 p:24:0 p:2a:0 p:3b:0 p:45:0 b:2d:0 g:3e:0 g:3e:0 g:42:0 g:42:0 g:45:0 g:45:0 pi:3e:0 pi:42:0 pi:45:0 t2 p:24:a p:2a:a p:3b:a p:45:a b:21:a g:39:a g:39:a g:3d:a g:3d:a g:40:a g:40:a pi:39:a pi:3d:a pi:40:a t14 p:24:0 p:2a:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 p:45:0 b:21:0 g:39:0 g:39:0 g:3d:0 g:3d:0 g:40:0 g:40:0 pi:39:0 pi:3d:0 pi:40:0 t2 p:24:a p:2e:a p:3b:a p:45:a b:21:a g:39:a g:39:a g:3d:a g:3d:a g:40:a g:40:a pi:39:a pi:3d:a pi:40:a t14 p:24:0 p:2e:0 p:3b:0 p:45:0 b:21:0 g:39:0 g:39:0 g:3d:0 g:3d:0 g:40:0 g:40:0 pi:39:0 pi:3d:0 pi:40:0 t2 p:24:a p:2a:a p:3b:a p:45:a b:21:a g:39:a g:39:a g:3d:a g:3d:a g:40:a g:40:a pi:39:a pi:3d:a pi:40:a t14 p:24:0 p:2a:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 p:45:0 b:21:0 g:39:0 g:39:0 g:3d:0 g:3d:0 g:40:0 g:40:0 pi:39:0 pi:3d:0 pi:40:0 t2 p:26:a p:2a:a p:3b:a p:45:a b:21:a g:39:a g:39:a g:3d:a g:3d:a g:40:a g:40:a pi:39:a pi:3d:a pi:40:a t14 p:26:0 p:2a:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 p:45:0 b:21:0 g:39:0 g:39:0 g:3d:0 g:3d:0 g:40:0 g:40:0 pi:39:0 pi:3d:0 pi:40:0 t2 p:26:a 
p:2e:a p:31:a p:39:a p:3b:a p:45:a b:21:a g:39:a g:39:a g:3d:a g:3d:a g:40:a g:40:a pi:39:a pi:3d:a pi:40:a t14 p:26:0 p:2e:0 p:31:0 p:39:0 p:3b:0 p:45:0 b:21:0 t2 p:26:a p:2e:a p:31:a p:39:a p:3b:a p:45:a b:21:a t14 p:26:0 p:2e:0 p:31:0 p:39:0 p:3b:0 p:45:0 b:21:0 g:39:0 g:39:0 g:3d:0 g:3d:0 g:40:0 g:40:0 pi:39:0 pi:3d:0 pi:40:0 t2 p:24:a p:2a:a p:31:a p:39:a p:3b:a p:45:a b:1f:a g:3b:a g:3b:a g:3e:a g:3e:a g:43:a g:43:a pi:3b:a pi:3e:a pi:43:a t14 p:24:0 p:2a:0 p:31:0 p:39:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 p:45:0 b:1f:0 g:3b:0 g:3b:0 g:3e:0 g:3e:0 g:43:0 g:43:0 pi:3b:0 pi:3e:0 pi:43:0 t2 p:2e:a p:3b:a p:45:a b:1f:a g:3b:a g:3b:a g:3e:a g:3e:a g:43:a g:43:a pi:3b:a pi:3e:a pi:43:a t14 p:2e:0 p:3b:0 p:45:0 g:3b:0 g:3b:0 g:3e:0 g:3e:0 g:43:0 g:43:0 pi:3b:0 pi:3e:0 pi:43:0 t2 p:2e:a p:3b:a p:45:a g:3b:a g:3b:a g:3e:a g:3e:a g:43:a g:43:a pi:3b:a pi:3e:a pi:43:a t14 p:2e:0 p:3b:0 p:45:0 b:1f:0 g:3b:0 g:3b:0 g:3e:0 g:3e:0 g:43:0 g:43:0 pi:3b:0 pi:3e:0 pi:43:0 t2 p:26:a p:2a:a p:3b:a p:45:a t14 p:26:0 p:2a:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a b:1f:a g:3b:a g:3b:a g:3e:a g:3e:a g:43:a g:43:a pi:3b:a pi:3e:a pi:43:a t14 p:2a:0 p:3b:0 p:45:0 b:1f:0 t2 p:24:a p:2a:a p:3b:a p:45:a b:1f:a t14 p:24:0 p:2a:0 p:3b:0 p:45:0 b:1f:0 g:3b:0 g:3b:0 g:3e:0 g:3e:0 g:43:0 g:43:0 pi:3b:0 pi:3e:0 pi:43:0 t2 p:24:a p:2e:a p:3b:a p:45:a b:26:a g:39:a g:39:a g:3e:a g:3e:a g:42:a g:42:a pi:39:a pi:3e:a pi:42:a t14 p:24:0 p:2e:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0",
"wav_name": "sample",
"sound_font_path": "assets/default_sound_font.sf2",
}
}
@router.post("/text-to-wav", tags=["MIDI"])
def text_to_wav(body: TextToWavBody):
"""
Install FluidSynth first; see https://github.com/FluidSynth/fluidsynth/wiki/Download#distributions
"""
text = body.text.strip()
if not text.startswith("<start>"):
text = "<start> " + text
if not text.endswith("<end>"):
text = text + " <end>"
txt_path = f"midi/{body.wav_name}.txt"
midi_path = f"midi/{body.wav_name}.mid"
wav_path = f"midi/{body.wav_name}.wav"
with open(txt_path, "w") as f:
f.write(text)
txt_to_midi(TxtToMidiBody(txt_path=txt_path, midi_path=midi_path))
midi_to_wav(
MidiToWavBody(
midi_path=midi_path, wav_path=wav_path, sound_font_path=body.sound_font_path
)
)
return "success"
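
Note on the output-path checks above: `txt_to_midi` and `midi_to_wav` accept a path when it starts with `midi/`. A prefix test of that kind also accepts paths that contain `..` segments. One possible stricter check, sketched below, resolves both paths and compares directories. This is a sketch under assumptions, not the project's code: the helper name `is_within` is hypothetical, and it assumes the `midi` directory is relative to the server's working directory.

```python
from pathlib import Path


def is_within(base_dir: str, candidate: str) -> bool:
    """Return True if candidate resolves to a location below base_dir.

    Both paths are resolved first, so ".." segments and absolute paths
    cannot escape base_dir. The files do not need to exist.
    """
    base = Path(base_dir).resolve()
    target = Path(candidate).resolve()
    return base in target.parents


if __name__ == "__main__":
    for path in ("midi/sample.mid", "midi/../secret.txt", "/etc/passwd"):
        print(path, is_within("midi", path))
```

The same check could be applied to values such as `wav_name` before they are interpolated into a path.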


@@ -0,0 +1,131 @@
from fastapi import APIRouter, HTTPException, status
from utils.rwkv import AbstractRWKV
import global_var
router = APIRouter()
@router.get("/dashboard/billing/credit_grants", tags=["MISC"])
def credit_grants():
return {
"object": "credit_summary",
"total_granted": 10000,
"total_used": 0,
"total_available": 10000,
"grants": {
"object": "list",
"data": [
{
"object": "credit_grant",
"grant_amount": 10000,
"used_amount": 0,
"effective_at": 1672531200,
"expires_at": 33229440000,
}
],
},
}
fake_models = [
{
"id": "gpt-3.5-turbo",
"object": "model",
"created": 1677610602,
"owned_by": "openai",
"permission": [
{
"id": "modelperm-zy5TOjnE2zVaicIcKO9bQDgX",
"object": "model_permission",
"created": 1690864883,
"allow_create_engine": False,
"allow_sampling": True,
"allow_logprobs": True,
"allow_search_indices": False,
"allow_view": True,
"allow_fine_tuning": False,
"organization": "*",
"group": None,
"is_blocking": False,
}
],
"root": "gpt-3.5-turbo",
"parent": None,
},
{
"id": "text-davinci-003",
"object": "model",
"created": 1669599635,
"owned_by": "openai-internal",
"permission": [
{
"id": "modelperm-a6niqBmW2JaGmo0fDO7FEt1n",
"object": "model_permission",
"created": 1690930172,
"allow_create_engine": False,
"allow_sampling": True,
"allow_logprobs": True,
"allow_search_indices": False,
"allow_view": True,
"allow_fine_tuning": False,
"organization": "*",
"group": None,
"is_blocking": False,
}
],
"root": "text-davinci-003",
"parent": None,
},
]
@router.get("/v1/models", tags=["MISC"])
@router.get("/models", tags=["MISC"])
def models():
model: AbstractRWKV = global_var.get(global_var.Model)
model_name = model.name if model else "rwkv"
return {
"object": "list",
"data": [
{
"id": model_name,
"object": "model",
"owned_by": "rwkv",
"root": model_name,
"parent": None,
},
*fake_models,
],
}
@router.get("/v1/models/{model_id}", tags=["MISC"])
@router.get("/models/{model_id}", tags=["MISC"])
def model(model_id: str):
for fake_model in fake_models:
if fake_model["id"] == model_id:
return fake_model
if "rwkv" in model_id.lower():
model: AbstractRWKV = global_var.get(global_var.Model)
model_name = model.name if model else "rwkv"
return {
"id": model_name,
"object": "model",
"owned_by": "rwkv",
"root": model_name,
"parent": None,
}
raise HTTPException(
status.HTTP_404_NOT_FOUND,
{
"error": {
"message": f"The model '{model_id}' does not exist",
"type": "invalid_request_error",
"param": "model",
"code": "model_not_found",
}
},
)
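The lookup order in the `/v1/models/{model_id}` handler above can be sketched without FastAPI. This is only an illustration under stated assumptions: `resolve_model`, `FAKE_MODEL_IDS` and the `LookupError` are hypothetical stand-ins for the handler, the `fake_models` list and the HTTP 404 response, and the returned dicts are trimmed.

```python
from typing import Optional

# Hypothetical stand-in for the ids listed in `fake_models`.
FAKE_MODEL_IDS = ("gpt-3.5-turbo", "text-davinci-003")


def resolve_model(model_id: str, loaded_name: Optional[str] = None) -> dict:
    """Mirror the handler's order: fake ids first, then any id containing 'rwkv'."""
    if model_id in FAKE_MODEL_IDS:
        return {"id": model_id, "object": "model"}
    if "rwkv" in model_id.lower():
        # Fall back to the generic name when no model is loaded.
        name = loaded_name or "rwkv"
        return {
            "id": name,
            "object": "model",
            "owned_by": "rwkv",
            "root": name,
            "parent": None,
        }
    # The real handler raises HTTPException(404) with code "model_not_found".
    raise LookupError(f"The model '{model_id}' does not exist")
```

Note that any id containing "rwkv" resolves to the currently loaded model's name, so the returned `id` may differ from the requested one.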


@@ -1,11 +1,9 @@
from typing import Any, Dict, List
from typing import Any, Dict, List, Union
from utils.log import quick_log
from fastapi import APIRouter, HTTPException, Request, Response, status
from pydantic import BaseModel
import gc
import copy
import sys
import torch
router = APIRouter()
@@ -34,7 +32,7 @@ def init():
print("cyac not found")
@router.post("/disable-state-cache")
@router.post("/disable-state-cache", tags=["State Cache"])
def disable_state_cache():
global trie, dtrie
@@ -45,7 +43,7 @@ def disable_state_cache():
return "success"
@router.post("/enable-state-cache")
@router.post("/enable-state-cache", tags=["State Cache"])
def enable_state_cache():
global trie, dtrie
try:
@@ -62,17 +60,19 @@ def enable_state_cache():
class AddStateBody(BaseModel):
prompt: str
tokens: List[str]
tokens: List[Union[str, int]]
state: Any
logits: Any
@router.post("/add-state")
@router.post("/add-state", tags=["State Cache"])
def add_state(body: AddStateBody):
global trie, dtrie, loop_del_trie_id
if trie is None:
raise HTTPException(status.HTTP_400_BAD_REQUEST, "trie not loaded")
import torch
try:
id: int = trie.insert(body.prompt)
device: torch.device = body.state[0].device
@@ -96,7 +96,7 @@ def add_state(body: AddStateBody):
quick_log(
None,
None,
f"New Trie Id: {id}\nTrie Len: {len(trie)}\nTrie Buff Size: {trie.buff_size()}\nDtrie Buff Size Of Id: {_get_a_dtrie_buff_size(dtrie[id])}",
f"New Trie Id: {id}\nTrie Len: {len(trie)}\nTrie Buff Size: {trie.buff_size()}\nDtrie Buff Size Of Id: {__get_a_dtrie_buff_size(dtrie[id])}",
)
return "success"
except Exception as e:
@@ -105,7 +105,7 @@ def add_state(body: AddStateBody):
)
@router.post("/reset-state")
@router.post("/reset-state", tags=["State Cache"])
def reset_state():
global trie, dtrie
if trie is None:
@@ -124,7 +124,7 @@ class LongestPrefixStateBody(BaseModel):
prompt: str
def _get_a_dtrie_buff_size(dtrie_v):
def __get_a_dtrie_buff_size(dtrie_v):
# print(sys.getsizeof(dtrie_v["tokens"][0])) # str
# print(sys.getsizeof(dtrie_v["tokens"][0]) * len(dtrie_v["tokens"]))
# print(dtrie_v["state"][0][0].element_size())
@@ -141,12 +141,14 @@ def _get_a_dtrie_buff_size(dtrie_v):
return 54 * len(dtrie_v["tokens"]) + 491520 + 262144 + 28 # TODO
@router.post("/longest-prefix-state")
@router.post("/longest-prefix-state", tags=["State Cache"])
def longest_prefix_state(body: LongestPrefixStateBody, request: Request):
global trie
if trie is None:
raise HTTPException(status.HTTP_400_BAD_REQUEST, "trie not loaded")
import torch
id = -1
try:
for id, len in trie.prefix(body.prompt):
@@ -178,7 +180,7 @@ def longest_prefix_state(body: LongestPrefixStateBody, request: Request):
}
@router.post("/save-state")
@router.post("/save-state", tags=["State Cache"])
def save_state():
global trie
if trie is None:


@@ -0,0 +1,124 @@
#include "ATen/ATen.h"
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <torch/extension.h>
#include "element_wise.h"
#include "util.h"
// Equivalent Python code:
// ww = t_first + k
// p = torch.maximum(pp, ww)
// e1 = torch.exp(pp - p)
// e2 = torch.exp(ww - p)
// wkv = ((e1 * aa + e2 * v) / (e1 * bb + e2)).to(dtype=x.dtype)
// ww = t_decay + pp
// p = torch.maximum(ww, k)
// e1 = torch.exp(ww - p)
// e2 = torch.exp(k - p)
// t1 = e1 * aa + e2 * v
// t2 = e1 * bb + e2
// r = r * wkv
// return t1, t2, p, r
struct WkvForwardOne {
const float *t_first;
const float *k;
const float *pp;
const float *aa;
const float *bb;
const float *t_decay;
const float *v;
/* out */ float *t1;
/* out */ float *t2;
/* out */ float *p;
/* in & out */ half *r;
__device__ void operator()(int i) const {
float ww = t_first[i] + k[i];
float pp_ = pp[i];
float p_ = (pp_ > ww) ? pp_ : ww;
float e1 = expf(pp_ - p_);
float e2 = expf(ww - p_);
float aa_ = aa[i];
float bb_ = bb[i];
float v_ = v[i];
r[i] = __hmul(r[i], __float2half(((e1 * aa_ + e2 * v_) / (e1 * bb_ + e2))));
ww = t_decay[i] + pp_;
float k_ = k[i];
p_ = (ww > k_) ? ww : k_;
e1 = expf(ww - p_);
e2 = expf(k_ - p_);
t1[i] = e1 * aa_ + e2 * v_;
t2[i] = e1 * bb_ + e2;
p[i] = p_;
}
};
/*
Equivalent Python code:
kx = xx * k_mix + sx * (1 - k_mix)
vx = xx * v_mix + sx * (1 - v_mix)
rx = xx * r_mix + sx * (1 - r_mix)
*/
struct Mix {
const half *xx;
const half *sx;
const half *k_mix;
const half *v_mix;
const half *r_mix;
/* out */ half *kx;
/* out */ half *vx;
/* out */ half *rx;
__device__ void operator()(int i) const {
half xx_ = xx[i];
half sx_ = sx[i];
half k_mix_ = k_mix[i];
half v_mix_ = v_mix[i];
half r_mix_ = r_mix[i];
kx[i] = __hadd(__hmul(xx_, k_mix_),
__hmul(sx_, __hsub(__float2half(1), k_mix_)));
vx[i] = __hadd(__hmul(xx_, v_mix_),
__hmul(sx_, __hsub(__float2half(1), v_mix_)));
rx[i] = __hadd(__hmul(xx_, r_mix_),
__hmul(sx_, __hsub(__float2half(1), r_mix_)));
}
};
using torch::Tensor;
void gemm_fp16_cublas_tensor(Tensor a, Tensor b, Tensor c);
Tensor att_one(Tensor x, Tensor ln_w, Tensor ln_b, Tensor sx, Tensor k_mix,
Tensor v_mix, Tensor r_mix, Tensor kw,
/* imm */ Tensor kx, Tensor vw, /* imm */ Tensor vx, Tensor rw,
/* imm */ Tensor rx, Tensor ow, Tensor t_first,
/* imm */ Tensor k, Tensor pp, Tensor ww, Tensor aa, Tensor bb,
Tensor t_decay, /* imm */ Tensor v, /* in & out */ Tensor r,
/* out */ Tensor x_plus_out, /* out */ Tensor t1,
/* out */ Tensor t2, /* out */ Tensor p) {
Tensor xx = at::layer_norm(x, {x.size(-1)}, ln_w, ln_b);
element_wise(Mix{data_ptr<half>(xx), data_ptr<half>(sx),
data_ptr<half>(k_mix), data_ptr<half>(v_mix),
data_ptr<half>(r_mix), data_ptr<half>(kx),
data_ptr<half>(vx), data_ptr<half>(rx)},
x.numel());
gemm_fp16_cublas_tensor(kx, kw, k);
gemm_fp16_cublas_tensor(vx, vw, v);
gemm_fp16_cublas_tensor(rx, rw, r);
at::sigmoid_(r);
element_wise(WkvForwardOne{data_ptr<float>(t_first), data_ptr<float>(k),
data_ptr<float>(pp), data_ptr<float>(aa),
data_ptr<float>(bb), data_ptr<float>(t_decay),
data_ptr<float>(v), data_ptr<float>(t1),
data_ptr<float>(t2), data_ptr<float>(p),
data_ptr<half>(r)},
x.numel());
gemm_fp16_cublas_tensor(r, ow, x_plus_out);
x_plus_out += x;
return xx;
}
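The "Equivalent Python code" comment for `WkvForwardOne` can be restated for a single channel in plain Python. This is a sketch for reading the kernel, not the project's code: `wkv_step` is a hypothetical name, and it returns the raw `wkv` value rather than multiplying it into `r` as the kernel does.

```python
import math


def wkv_step(t_first, t_decay, k, v, aa, bb, pp):
    """One channel of the WKV update, kept stable by subtracting the running max exponent."""
    # Output for the current token.
    ww = t_first + k
    p = max(pp, ww)
    e1 = math.exp(pp - p)
    e2 = math.exp(ww - p)
    wkv = (e1 * aa + e2 * v) / (e1 * bb + e2)
    # State carried to the next token.
    ww = t_decay + pp
    p = max(ww, k)
    e1 = math.exp(ww - p)
    e2 = math.exp(k - p)
    return wkv, e1 * aa + e2 * v, e1 * bb + e2, p
```

Because every exponent is taken relative to the larger of its two arguments, the arguments to `exp` are never positive, so a large `k` does not overflow.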


@@ -0,0 +1,109 @@
#include "ATen/ATen.h"
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <torch/extension.h>
#include "element_wise.h"
#include "util.h"
// Equivalent Python code:
// s1 = t_first * a + s
// s2 = a + t_decay * s
struct Fused1 {
const float *t_first;
const float *t_decay;
const float *a;
const float *s;
const int32_t inner_size;
/* out */ float *s1;
/* out */ float *s2;
__device__ void operator()(int i) const {
const int j = i / inner_size;
s1[i] = t_first[j] * a[i] + s[i];
s2[i] = a[i] + t_decay[j] * s[i];
}
};
/*
Equivalent Python code:
kx = xx * k_mix + sx * (1 - k_mix)
vx = xx * v_mix + sx * (1 - v_mix)
rx = xx * r_mix + sx * (1 - r_mix)
*/
struct Mix {
const half *xx;
const half *sx;
const half *k_mix;
const half *v_mix;
const half *r_mix;
/* out */ half *kx;
/* out */ half *vx;
/* out */ half *rx;
__device__ void operator()(int i) const {
half xx_ = xx[i];
half sx_ = sx[i];
half k_mix_ = k_mix[i];
half v_mix_ = v_mix[i];
half r_mix_ = r_mix[i];
kx[i] = __hadd(__hmul(xx_, k_mix_),
__hmul(sx_, __hsub(__float2half(1), k_mix_)));
vx[i] = __hadd(__hmul(xx_, v_mix_),
__hmul(sx_, __hsub(__float2half(1), v_mix_)));
rx[i] = __hadd(__hmul(xx_, r_mix_),
__hmul(sx_, __hsub(__float2half(1), r_mix_)));
}
};
using torch::Tensor;
void gemm_fp16_cublas_tensor(Tensor a, Tensor b, Tensor c);
Tensor att_one_v5(Tensor x, Tensor sx, Tensor s, Tensor ln_w, Tensor ln_b,
Tensor lx_w, Tensor lx_b, Tensor k_mix, Tensor v_mix,
Tensor r_mix, Tensor kw,
/* imm */ Tensor kx, Tensor vw, /* imm */ Tensor vx,
Tensor rw,
/* imm */ Tensor rx, Tensor ow, Tensor t_first,
/* imm */ Tensor k, Tensor t_decay, /* imm */ Tensor v,
/* imm */ Tensor r, /* imm */ Tensor s1,
/* out */ Tensor x_plus_out, /* out */ Tensor s2) {
Tensor xx = at::layer_norm(x, {x.size(-1)}, ln_w, ln_b);
element_wise(Mix{data_ptr<half>(xx), data_ptr<half>(sx),
data_ptr<half>(k_mix), data_ptr<half>(v_mix),
data_ptr<half>(r_mix), data_ptr<half>(kx),
data_ptr<half>(vx), data_ptr<half>(rx)},
x.numel());
int H = t_decay.size(0);
int S = x.size(-1) / H;
gemm_fp16_cublas_tensor(rx, rw, r);
r = at::reshape(r, {H, 1, S});
gemm_fp16_cublas_tensor(kx, kw, k);
k = at::reshape(k, {H, S, 1});
gemm_fp16_cublas_tensor(vx, vw, v);
v = at::reshape(v, {H, 1, S});
{
Tensor a = at::matmul(k, v);
// s1 = t_first * a + s
// s2 = a + t_decay * s
element_wise(Fused1{data_ptr<float>(t_first), data_ptr<float>(t_decay),
data_ptr<float>(a), data_ptr<float>(s),
static_cast<int32_t>(a.size(1) * a.size(2)),
data_ptr<float>(s1), data_ptr<float>(s2)},
a.numel());
}
Tensor out = at::matmul(r, s1);
out = at::flatten(out);
out = at::squeeze(at::group_norm(at::unsqueeze(out, 0), H, lx_w, lx_b), 0);
out = at::_cast_Half(out);
gemm_fp16_cublas_tensor(out, ow, x_plus_out);
x_plus_out += x;
return xx;
}
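The indexing in `Fused1` may be easier to follow on flat Python lists. The sketch below assumes one `t_first` and one `t_decay` value per head, which is what the kernel's `i / inner_size` indexing implies; `fused1` is a hypothetical name.

```python
def fused1(t_first, t_decay, a, s, inner_size):
    """s1 = t_first * a + s and s2 = a + t_decay * s over flat buffers.

    `a` and `s` are flattened per-head matrices; element i belongs to
    head i // inner_size, where inner_size is the size of one head's matrix.
    """
    s1, s2 = [], []
    for i, (a_i, s_i) in enumerate(zip(a, s)):
        j = i // inner_size  # head index, as in the kernel
        s1.append(t_first[j] * a_i + s_i)
        s2.append(a_i + t_decay[j] * s_i)
    return s1, s2
```

In `att_one_v5`, `inner_size` is `a.size(1) * a.size(2)`, so every element of a head's matrix shares that head's coefficients.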


@@ -0,0 +1,178 @@
#include "ATen/ATen.h"
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <torch/extension.h>
#include "util.h"
#include "element_wise.h"
using torch::Tensor;
void gemm_fp16_cublas(const void *a, const void *b, void *c, int m,
int n, int k, bool output_fp32);
// based on `kernel_wkv_forward`, fusing more operations
__global__ void kernel_wkv_forward_new(
const int B, const int T, const int C, const float *__restrict__ const _w,
const float *__restrict__ const _u, const float *__restrict__ const _k,
const float *__restrict__ const _v, const half *__restrict__ const r,
half *__restrict__ const _y, float *__restrict__ const _aa,
float *__restrict__ const _bb, float *__restrict__ const _pp) {
const int idx = blockIdx.x * blockDim.x + threadIdx.x;
const int _b = idx / C;
const int _c = idx % C;
const int _offset = _b * T * C + _c;
const int _state_offset = _b * C + _c;
float u = _u[_c];
float w = _w[_c];
const float *__restrict__ const k = _k + _offset;
const float *__restrict__ const v = _v + _offset;
half *__restrict__ const y = _y + _offset;
float aa = _aa[_state_offset];
float bb = _bb[_state_offset];
float pp = _pp[_state_offset];
for (int i = 0; i < T; i++) {
const int ii = i * C;
const float kk = k[ii];
const float vv = v[ii];
float ww = u + kk;
float p = max(pp, ww);
float e1 = exp(pp - p);
float e2 = exp(ww - p);
y[ii] = __float2half((e1 * aa + e2 * vv) / (e1 * bb + e2));
ww = w + pp;
p = max(ww, kk);
e1 = exp(ww - p);
e2 = exp(kk - p);
aa = e1 * aa + e2 * vv;
bb = e1 * bb + e2;
pp = p;
}
_aa[_state_offset] = aa;
_bb[_state_offset] = bb;
_pp[_state_offset] = pp;
}
void cuda_wkv_forward_new(int B, int T, int C, float *w, float *u, float *k,
float *v, half *r, half *y, float *aa, float *bb,
float *pp) {
dim3 threadsPerBlock(min(C, 32));
assert(B * C % threadsPerBlock.x == 0);
dim3 numBlocks(B * C / threadsPerBlock.x);
kernel_wkv_forward_new<<<numBlocks, threadsPerBlock>>>(B, T, C, w, u, k, v, r,
y, aa, bb, pp);
}
__global__ void _att_mix(const half *xx, const half *sx, const half *k_mix,
const half *v_mix, const half *r_mix,
const int outer_size, const int inner_size, half *kx,
half *vx, half *rx) {
for (int idx2 = blockIdx.x * blockDim.x + threadIdx.x; idx2 < inner_size;
idx2 += blockDim.x * gridDim.x) {
half k_mix_ = k_mix[idx2];
half v_mix_ = v_mix[idx2];
half r_mix_ = r_mix[idx2];
for (int row = 0; row < outer_size; ++row) {
int idx1 = row * inner_size + idx2;
half xx_ = xx[idx1];
half sx_ = sx[idx1];
kx[idx1] = __hadd(__hmul(xx_, k_mix_),
__hmul(sx_, __hsub(__float2half(1), k_mix_)));
vx[idx1] = __hadd(__hmul(xx_, v_mix_),
__hmul(sx_, __hsub(__float2half(1), v_mix_)));
rx[idx1] = __hadd(__hmul(xx_, r_mix_),
__hmul(sx_, __hsub(__float2half(1), r_mix_)));
}
}
}
void att_mix(const half *xx, const half *sx, const half *k_mix,
const half *v_mix, const half *r_mix, const int outer_size,
const int inner_size, half *kx, half *vx, half *rx) {
// 256 is good enough on most GPUs
const int32_t BLOCK_SIZE = 256;
assert(inner_size % BLOCK_SIZE == 0);
_att_mix<<<inner_size / BLOCK_SIZE, BLOCK_SIZE>>>(
xx, sx, k_mix, v_mix, r_mix, outer_size, inner_size, kx, vx, rx);
}
struct InplaceSigmoid {
__device__ __forceinline__ void operator()(int i) const {
ptr[i] = __float2half(1.0 / (1.0 + exp(-__half2float(ptr[i]))));
}
half *ptr;
};
struct InplaceMul {
__device__ __forceinline__ void operator()(int i) const {
y[i] = __hmul(x[i], y[i]);
}
half *y;
half *x;
};
/*
Equivalent Python code:
xx = F.layer_norm(x, (x.shape[-1],), weight=ln_w, bias=ln_b)
sx = torch.cat((sx.unsqueeze(0), xx[:-1,:]))
kx = xx * k_mix + sx * (1 - k_mix)
vx = xx * v_mix + sx * (1 - v_mix)
rx = xx * r_mix + sx * (1 - r_mix)
r = torch.sigmoid(gemm(rx, rw))
k = gemm(kx, kw, output_dtype=torch.float32)
v = gemm(vx, vw, output_dtype=torch.float32)
T = x.shape[0]
for t in range(T):
kk = k[t]
vv = v[t]
ww = t_first + kk
p = torch.maximum(pp, ww)
e1 = torch.exp(pp - p)
e2 = torch.exp(ww - p)
sx[t] = ((e1 * aa + e2 * vv) / (e1 * bb + e2)).to(dtype=x.dtype)
ww = t_decay + pp
p = torch.maximum(ww, kk)
e1 = torch.exp(ww - p)
e2 = torch.exp(kk - p)
aa = e1 * aa + e2 * vv
bb = e1 * bb + e2
pp = p
out = gemm(r * sx, ow)
return x + out, xx[-1,:], aa, bb, pp
*/
Tensor att_seq(Tensor x, Tensor sx, Tensor ln_w, Tensor ln_b, Tensor k_mix,
Tensor v_mix, Tensor r_mix, Tensor kw, Tensor vw, Tensor rw,
Tensor ow, Tensor t_first, Tensor pp, Tensor aa, Tensor bb,
Tensor t_decay, /* imm */ Tensor buf, /* out */ Tensor x_plus_out) {
Tensor xx = at::layer_norm(x, {x.size(-1)}, ln_w, ln_b);
sx = at::cat({sx.unsqueeze(0), xx.slice(0, 0, -1)}, 0);
char* buf_ptr = (char*)buf.data_ptr();
half* kx = (half*)buf_ptr;
half* vx = kx + x.numel();
half* rx = vx + x.numel();
half* wkv_y = rx + x.numel();
att_mix(data_ptr<half>(xx), data_ptr<half>(sx), data_ptr<half>(k_mix),
data_ptr<half>(v_mix), data_ptr<half>(r_mix), xx.size(0), xx.size(1),
kx, vx, rx);
float* k = reinterpret_cast<float*>(wkv_y + x.numel());
float* v = k + x.size(0) * kw.size(1);
half* r = reinterpret_cast<half*>(v + x.size(0) * vw.size(1));
gemm_fp16_cublas(kx, kw.data_ptr(), k, x.size(0), kw.size(1), kw.size(0), true);
gemm_fp16_cublas(vx, vw.data_ptr(), v, x.size(0), vw.size(1), vw.size(0), true);
gemm_fp16_cublas(rx, rw.data_ptr(), r, x.size(0), rw.size(1), rw.size(0), false);
element_wise(InplaceSigmoid{r}, x.size(0) * rw.size(1));
cuda_wkv_forward_new(1, x.size(0), x.size(1), data_ptr<float>(t_decay),
data_ptr<float>(t_first), k, v, r,
wkv_y, data_ptr<float>(aa),
data_ptr<float>(bb), data_ptr<float>(pp));
element_wise(InplaceMul{wkv_y, r}, x.numel());
gemm_fp16_cublas(wkv_y, ow.data_ptr(), x_plus_out.data_ptr(), x.size(0), ow.size(1), ow.size(0), false);
x_plus_out += x;
return xx;
}
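`att_seq` carves all of its intermediates out of the single `buf` tensor by pointer arithmetic. The sketch below works out the byte offsets that arithmetic implies. It is an assumption-laden reading aid: `att_seq_buffer_layout` is a hypothetical helper, `k_out`, `v_out` and `r_out` stand for `kw.size(1)`, `vw.size(1)` and `rw.size(1)`, and the code that actually allocates `buf` is not shown in this diff.

```python
HALF_BYTES = 2
FLOAT_BYTES = 4


def att_seq_buffer_layout(T, C, k_out, v_out, r_out):
    """Byte offsets of each region carved out of `buf`, in the order att_seq uses them."""
    regions = [
        ("kx", T * C, HALF_BYTES),
        ("vx", T * C, HALF_BYTES),
        ("rx", T * C, HALF_BYTES),
        ("wkv_y", T * C, HALF_BYTES),
        ("k", T * k_out, FLOAT_BYTES),  # fp32 GEMM output
        ("v", T * v_out, FLOAT_BYTES),  # fp32 GEMM output
        ("r", T * r_out, HALF_BYTES),
    ]
    offsets, cursor = {}, 0
    for name, count, width in regions:
        offsets[name] = cursor
        cursor += count * width
    offsets["total_bytes"] = cursor
    return offsets
```

For example, `att_seq_buffer_layout(2, 4, 4, 4, 4)["total_bytes"]` gives the minimum buffer size for two tokens of width four under these assumptions.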


@@ -0,0 +1,21 @@
#include <cassert>
#include <cstddef>
#include <cstdint>
template <typename Func> __global__ void _element_wise(Func func, int n) {
for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
i += blockDim.x * gridDim.x) {
func(i);
}
}
// NOTE: a packed data type (e.g. float4) is overkill for current sizes
// (4096 in 7B model and 768 in 0.1B model),
// and is not faster than the plain float version.
template <typename Func>
void element_wise(Func func, int n) {
// 256 is good enough on most GPUs
const int32_t BLOCK_SIZE = 256;
assert(n % BLOCK_SIZE == 0);
_element_wise<<<n / BLOCK_SIZE, BLOCK_SIZE>>>(func, n);
}
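The grid-stride loop in `_element_wise` can be simulated on the host to see which indices each thread handles. This is a plain-Python illustration; `grid_stride_indices` is a hypothetical name and nothing here runs on a GPU.

```python
def grid_stride_indices(n, grid_dim, block_dim):
    """Return, per (block, thread), the indices a grid-stride loop would visit."""
    visits = {}
    for block in range(grid_dim):
        for thread in range(block_dim):
            start = block * block_dim + thread
            # Same loop as the kernel: i = start; i < n; i += blockDim.x * gridDim.x
            visits[(block, thread)] = list(range(start, n, block_dim * grid_dim))
    return visits
```

With the launch configuration used by `element_wise` (`n / BLOCK_SIZE` blocks of `BLOCK_SIZE` threads, with `n` a multiple of `BLOCK_SIZE`), each thread visits exactly one index, so the loop only matters if the grid is ever made smaller than `n`.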

backend-python/rwkv_pip/beta/cuda/ffn.cu (vendored, 165 lines added)

@@ -0,0 +1,165 @@
#include "ATen/ATen.h"
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <torch/extension.h>
#include "element_wise.h"
#include "util.h"
using torch::Tensor;
void gemm_fp16_cublas(const void *a, const void *b, void *c, int ori_m,
int ori_n, int ori_k, bool output_fp32);
__global__ void _ffn_seq_mix(const half *xx, const half *sx, const half *k_mix,
const half *r_mix, const int outer_size,
const int inner_size, half *kx, half *rx) {
for (int idx2 = blockIdx.x * blockDim.x + threadIdx.x; idx2 < inner_size;
idx2 += blockDim.x * gridDim.x) {
half k_mix_ = k_mix[idx2];
half r_mix_ = r_mix[idx2];
for (int row = 0; row < outer_size; ++row) {
int idx1 = row * inner_size + idx2;
half xx_ = xx[idx1];
half sx_ = sx[idx1];
kx[idx1] = __hadd(__hmul(xx_, k_mix_),
__hmul(sx_, __hsub(__float2half(1), k_mix_)));
rx[idx1] = __hadd(__hmul(xx_, r_mix_),
__hmul(sx_, __hsub(__float2half(1), r_mix_)));
}
}
}
void ffn_seq_mix(const half *xx, const half *sx, const half *k_mix,
const half *r_mix, const int outer_size, const int inner_size,
half *kx, half *rx) {
// 256 is good enough on most GPUs
const int32_t BLOCK_SIZE = 256;
assert(inner_size % BLOCK_SIZE == 0);
_ffn_seq_mix<<<inner_size / BLOCK_SIZE, BLOCK_SIZE>>>(
xx, sx, k_mix, r_mix, outer_size, inner_size, kx, rx);
}
struct InplaceSigmoid {
__device__ __forceinline__ void operator()(int i) const {
ptr[i] = __float2half(1.0 / (1.0 + exp(-__half2float(ptr[i]))));
}
half *ptr;
};
struct InplaceReLUAndSquare {
__device__ __forceinline__ void operator()(int i) const {
// __hmax is not defined in old cuda
if (__hgt(ptr[i], __float2half(0))) {
ptr[i] = __hmul(ptr[i], ptr[i]);
} else {
ptr[i] = __float2half(0);
}
}
half *ptr;
};
struct InplaceFma {
__device__ __forceinline__ void operator()(int i) const {
a[i] = __hfma(a[i], b[i], c[i]);
}
half *a;
const half *b;
const half *c;
};
/*
Equivalent Python code:
xx = F.layer_norm(x, (x.shape[-1],), weight=ln_w, bias=ln_b)
sx = torch.cat((sx.unsqueeze(0), xx[:-1,:]))
kx = xx * k_mix + sx * (1 - k_mix)
rx = xx * r_mix + sx * (1 - r_mix)
r = torch.sigmoid(gemm(rx, rw))
vx = torch.square(torch.relu(gemm(kx, kw)))
out = r * gemm(vx, vw)
return x + out, xx[-1,:]
*/
Tensor ffn_seq(Tensor x, Tensor sx, Tensor ln_w, Tensor ln_b, Tensor k_mix,
Tensor r_mix, Tensor kw, Tensor vw, Tensor rw,
/* imm */ Tensor buf,
/* out */ Tensor x_plus_out) {
Tensor xx = at::layer_norm(x, {x.size(-1)}, ln_w, ln_b);
sx = at::cat({sx.unsqueeze(0), xx.slice(0, 0, -1)}, 0);
char *buf_ptr = (char *)buf.data_ptr();
half *kx = (half *)buf_ptr;
half *rx = kx + x.numel();
half *vx = rx + x.numel();
half *r = vx + x.size(0) * kw.size(1);
ffn_seq_mix(data_ptr<half>(xx), data_ptr<half>(sx), data_ptr<half>(k_mix),
data_ptr<half>(r_mix), xx.size(0), xx.size(1), kx, rx);
gemm_fp16_cublas(rx, rw.data_ptr(), r, x.size(0), rw.size(1), x.size(1),
false);
element_wise(InplaceSigmoid{r}, x.size(0) * rw.size(1));
gemm_fp16_cublas(kx, kw.data_ptr(), vx, x.size(0), kw.size(1), x.size(1),
false);
element_wise(InplaceReLUAndSquare{vx}, x.size(0) * kw.size(1));
gemm_fp16_cublas(vx, vw.data_ptr(), x_plus_out.data_ptr(), x.size(0),
vw.size(1), vw.size(0), false);
element_wise(InplaceFma{data_ptr<half>(x_plus_out), r, data_ptr<half>(x)},
x_plus_out.numel());
return xx;
}
struct FfnOneMix {
__device__ __forceinline__ void operator()(int idx) {
half k_mix_ = k_mix[idx];
half r_mix_ = r_mix[idx];
half xx_ = xx[idx];
half sx_ = sx[idx];
kx[idx] = __hadd(__hmul(xx_, k_mix_),
__hmul(sx_, __hsub(__float2half(1), k_mix_)));
rx[idx] = __hadd(__hmul(xx_, r_mix_),
__hmul(sx_, __hsub(__float2half(1), r_mix_)));
}
half *k_mix;
half *r_mix;
half *xx;
half *sx;
half *kx;
half *rx;
};
/*
Equivalent Python code:
xx = F.layer_norm(x, (x.shape[-1],), weight=ln_w, bias=ln_b)
kx = xx * k_mix + sx * (1 - k_mix)
rx = xx * r_mix + sx * (1 - r_mix)
r = torch.sigmoid(gemm(rx, rw))
vx = torch.square(torch.relu(gemm(kx, kw)))
out = r * gemm(vx, vw)
return x + out, xx
*/
Tensor ffn_one(Tensor x, Tensor sx, Tensor ln_w, Tensor ln_b, Tensor k_mix,
Tensor r_mix, Tensor kw, Tensor vw, Tensor rw,
/* imm */ Tensor buf,
/* out */ Tensor x_plus_out) {
Tensor xx = at::layer_norm(x, {x.size(-1)}, ln_w, ln_b);
char *buf_ptr = (char *)buf.data_ptr();
half *kx = (half *)buf_ptr;
half *rx = kx + x.numel();
half *vx = rx + x.numel();
half *r = vx + x.size(0) * kw.size(1);
element_wise(FfnOneMix{data_ptr<half>(k_mix), data_ptr<half>(r_mix),
data_ptr<half>(xx), data_ptr<half>(sx), kx, rx},
x.numel());
// vector * matrix, so m = 1
gemm_fp16_cublas(rx, rw.data_ptr(), r, 1, rw.size(1), rw.size(0), false);
element_wise(InplaceSigmoid{r}, rw.size(1));
gemm_fp16_cublas(kx, kw.data_ptr(), vx, 1, kw.size(1), kw.size(0), false);
element_wise(InplaceReLUAndSquare{vx}, kw.size(1));
gemm_fp16_cublas(vx, vw.data_ptr(), x_plus_out.data_ptr(), 1, vw.size(1),
vw.size(0), false);
element_wise(InplaceFma{data_ptr<half>(x_plus_out), r, data_ptr<half>(x)},
x_plus_out.numel());
return xx;
}
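The "Equivalent Python code" comment for `ffn_one` can be written out on plain lists to check the order of operations. This is a sketch, not the project's code: `matvec` and `ffn_one_reference` are hypothetical names, `xx` is assumed to be the already layer-normed input, and everything runs in double precision rather than fp16.

```python
import math


def matvec(x, w):
    """Row vector times a row-major matrix: out[n] = sum_k x[k] * w[k][n]."""
    return [sum(x[k] * w[k][n] for k in range(len(x))) for n in range(len(w[0]))]


def ffn_one_reference(x, xx, sx, k_mix, r_mix, kw, vw, rw):
    """Channel mixing for one token, following the steps of ffn_one."""
    kx = [a * m + s * (1 - m) for a, s, m in zip(xx, sx, k_mix)]
    rx = [a * m + s * (1 - m) for a, s, m in zip(xx, sx, r_mix)]
    # InplaceSigmoid on the receptance projection.
    r = [1.0 / (1.0 + math.exp(-z)) for z in matvec(rx, rw)]
    # InplaceReLUAndSquare on the key projection.
    vx = [z * z if z > 0 else 0.0 for z in matvec(kx, kw)]
    out = matvec(vx, vw)
    # InplaceFma: out * r + x, i.e. gating plus the residual connection.
    return [o * g + res for o, g, res in zip(out, r, x)]
```

The final step shows why the kernel uses a fused multiply-add: the gate and the residual are applied in one pass over `x_plus_out`.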


@@ -0,0 +1,128 @@
#include <cublas_v2.h>
#include <cuda.h>
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <torch/extension.h>
#define CUBLAS_CHECK(condition) \
for (cublasStatus_t _cublas_check_status = (condition); \
_cublas_check_status != CUBLAS_STATUS_SUCCESS;) \
throw std::runtime_error("cuBLAS error " + \
std::to_string(_cublas_check_status) + " at " + \
std::to_string(__LINE__));
#define CUDA_CHECK(condition) \
for (cudaError_t _cuda_check_status = (condition); \
_cuda_check_status != cudaSuccess;) \
throw std::runtime_error( \
"CUDA error " + std::string(cudaGetErrorString(_cuda_check_status)) + \
" at " + std::to_string(__LINE__));
cublasHandle_t get_cublas_handle() {
static cublasHandle_t cublas_handle = []() {
cublasHandle_t handle = nullptr;
CUBLAS_CHECK(cublasCreate(&handle));
#if CUDA_VERSION < 11000
CUBLAS_CHECK(cublasSetMathMode(handle, CUBLAS_TENSOR_OP_MATH));
#else
CUBLAS_CHECK(cublasSetMathMode(handle, CUBLAS_DEFAULT_MATH));
#endif // CUDA_VERSION < 11000
return handle;
}();
return cublas_handle;
}
/*
NOTE: blas gemm is column-major by default, but we need row-major output.
The data of row-major, transposed matrix is exactly the same as the
column-major, non-transposed matrix, and C = A * B ---> C^T = B^T * A^T
*/
void gemm_fp16_cublas(const void *a, const void *b, void *c, int ori_m,
int ori_n, int ori_k, bool output_fp32) {
const auto cuda_data_type = CUDA_R_16F;
const auto cuda_c_data_type = output_fp32 ? CUDA_R_32F : CUDA_R_16F;
const auto compute_type = CUDA_R_32F;
const float sp_alpha = 1.f;
// use CUBLAS_OP_N. see the notes above
const cublasOperation_t cublas_trans_a = CUBLAS_OP_N;
const cublasOperation_t cublas_trans_b = CUBLAS_OP_N;
// m = (B^T).size(0) = B.size(1) = n;
const int cublas_m = ori_n;
const int cublas_k = ori_k;
// compatible with RWKV "one" mode, where a 1-D tensor is multiplied by a 2-D tensor
// const int n = a.dense_dim() == 1 ? 1 : a.size(0);
const int cublas_n = ori_m;
const int cublas_lda = cublas_m;
const int cublas_ldb = cublas_k;
const int cublas_ldc = cublas_m;
cublasHandle_t cublas_handle = get_cublas_handle();
#if CUDA_VERSION >= 11000
cublasGemmAlgo_t algo = CUBLAS_GEMM_DEFAULT;
#else
cublasGemmAlgo_t algo = CUBLAS_GEMM_DFALT_TENSOR_OP;
#endif
const float sp_beta = 0.f;
CUBLAS_CHECK(cublasGemmEx(
cublas_handle, cublas_trans_a, cublas_trans_b, cublas_m, cublas_n,
cublas_k, &sp_alpha, b, cuda_data_type, cublas_lda,
a, cuda_data_type, cublas_ldb, &sp_beta, c,
cuda_c_data_type, cublas_ldc, compute_type, algo));
}
/*
NOTE: blas gemm is column-major by default, but we need row-major output.
The data of row-major, transposed matrix is exactly the same as the
column-major, non-transposed matrix, and C = A * B ---> C^T = B^T * A^T
*/
void gemm_fp16_cublas_tensor(torch::Tensor a, torch::Tensor b, torch::Tensor c) {
if (a.sizes().size() == 1) {
assert(b.sizes().size() == 2);
a = at::unsqueeze(a, 0);
}
const auto cuda_data_type = CUDA_R_16F;
const auto cuda_c_data_type =
c.dtype() == torch::kFloat32 ? CUDA_R_32F : CUDA_R_16F;
const auto compute_type = CUDA_R_32F;
const float sp_alpha = 1.f;
// swap a and b, and use CUBLAS_OP_N. see the notes above
std::swap(a, b);
const cublasOperation_t cublas_trans_a = CUBLAS_OP_N;
const cublasOperation_t cublas_trans_b = CUBLAS_OP_N;
// m = (B^T).size(0) = B.size(1), and = A.size(1) after swap,
// negative axis is used because of the existence of batch matmul.
const int m = a.size(-1);
const int k = a.size(-2);
const int n = b.size(-2);
const int cublas_lda = m;
const int cublas_ldb = k;
const int cublas_ldc = m;
cublasHandle_t cublas_handle = get_cublas_handle();
#if CUDA_VERSION >= 11000
cublasGemmAlgo_t algo = CUBLAS_GEMM_DEFAULT;
#else
cublasGemmAlgo_t algo = CUBLAS_GEMM_DFALT_TENSOR_OP;
#endif
const float sp_beta = 0.f;
if (a.sizes().size() == 2 && b.sizes().size() == 2) {
CUBLAS_CHECK(cublasGemmEx(
cublas_handle, cublas_trans_a, cublas_trans_b, m, n, k, &sp_alpha,
a.data_ptr(), cuda_data_type, cublas_lda, b.data_ptr(), cuda_data_type,
cublas_ldb, &sp_beta, c.data_ptr(), cuda_c_data_type, cublas_ldc,
compute_type, algo));
} else {
// batch matmul
assert(a.sizes().size() == 3 && b.sizes().size() == 3);
const long long int cublas_stride_a = m * k;
const long long int cublas_stride_b = k * n;
const long long int cublas_stride_c = m * n;
CUBLAS_CHECK(cublasGemmStridedBatchedEx(
cublas_handle, cublas_trans_a, cublas_trans_b, m,
n, k, &sp_alpha, a.data_ptr(), cuda_data_type, cublas_lda,
cublas_stride_a, b.data_ptr(), cuda_data_type, cublas_ldb, cublas_stride_b,
&sp_beta, c.data_ptr(), cuda_c_data_type, cublas_ldc, cublas_stride_c,
a.size(0), compute_type, algo));
}
}


@@ -0,0 +1,246 @@
#include <stdio.h>
#include <assert.h>
#include "ATen/ATen.h"
#include <cuda_fp16.h>
#define MIN_VALUE (-1e38)
typedef at::Half fp16;
__half *cast(fp16 *ptr) {
return reinterpret_cast<__half *>(ptr);
}
template <typename F>
__global__ void kernel_wkv_forward(const int B, const int T, const int C,
const float *__restrict__ const _w, const float *__restrict__ const _u, const F *__restrict__ const _k, const F *__restrict__ const _v,
F *__restrict__ const _y, float *__restrict__ const _aa, float *__restrict__ const _bb, float *__restrict__ const _pp) {
const int idx = blockIdx.x * blockDim.x + threadIdx.x;
const int _b = idx / C;
const int _c = idx % C;
const int _offset = _b * T * C + _c;
const int _state_offset = _b * C + _c;
float u = _u[_c];
float w = _w[_c];
const F *__restrict__ const k = _k + _offset;
const F *__restrict__ const v = _v + _offset;
F *__restrict__ const y = _y + _offset;
float aa = _aa[_state_offset];
float bb = _bb[_state_offset];
float pp = _pp[_state_offset];
for (int i = 0; i < T; i++) {
const int ii = i * C;
const float kk = float(k[ii]);
const float vv = float(v[ii]);
float ww = u + kk;
float p = max(pp, ww);
float e1 = exp(pp - p);
float e2 = exp(ww - p);
y[ii] = F((e1 * aa + e2 * vv) / (e1 * bb + e2));
ww = w + pp;
p = max(ww, kk);
e1 = exp(ww - p);
e2 = exp(kk - p);
aa = e1 * aa + e2 * vv;
bb = e1 * bb + e2;
pp = p;
}
_aa[_state_offset] = aa;
_bb[_state_offset] = bb;
_pp[_state_offset] = pp;
}
template <typename F>
void cuda_wkv_forward(int B, int T, int C, float *w, float *u, F *k, F *v, F *y, float *aa, float *bb, float *pp) {
dim3 threadsPerBlock( min(C, 32) );
assert(B * C % threadsPerBlock.x == 0);
dim3 numBlocks(B * C / threadsPerBlock.x);
kernel_wkv_forward<<<numBlocks, threadsPerBlock>>>(B, T, C, w, u, k, v, y, aa, bb, pp);
}
template void cuda_wkv_forward<fp16>(
int B, int T, int C,
float *w, float *u, fp16 *k, fp16 *v, fp16 *y,
float *aa, float *bb, float *pp);
template void cuda_wkv_forward<float>(
int B, int T, int C,
float *w, float *u, float *k, float *v, float *y,
float *aa, float *bb, float *pp);
__global__ void kernel_mm_seq_fp32i8(
const int B, const int N, const int M,
const float *__restrict__ const x, const int x_stride,
const uint8_t *__restrict__ const w, const int w_stride,
const float *__restrict__ const mx,
const float *__restrict__ const rx,
const float *__restrict__ const my,
const float *__restrict__ const ry,
float *__restrict__ const y, const int y_stride) {
const int i = blockIdx.x * blockDim.x + threadIdx.x;
const int k = blockIdx.y * blockDim.y + threadIdx.y;
if (i < B && k < M) {
float y_local = 0;
for (int j = 0; j < N; ++j) {
y_local += x[i * x_stride + j] * (
(float(w[j * w_stride + k]) + 0.5f)
* rx[k] * ry[j] + mx[k] + my[j]
);
}
y[i * y_stride + k] = y_local;
}
}
template <typename F>
void cuda_mm8_seq(int B, int N, int M,
F *x, int x_stride,
uint8_t *w, int w_stride,
F *mx, F *rx,
F *my, F *ry,
F *y, int y_stride);
template <>
void cuda_mm8_seq<float>(int B, int N, int M,
float *x, int x_stride,
uint8_t *w, int w_stride,
float *mx, float *rx,
float *my, float *ry,
float *y, int y_stride) {
dim3 blockSize(1, 128);
dim3 gridSize((B + blockSize.x - 1) / blockSize.x, (M + blockSize.y - 1) / blockSize.y);
kernel_mm_seq_fp32i8<<<gridSize, blockSize>>>(
B, N, M, x, x_stride, w, w_stride,
mx, rx, my, ry, y, y_stride);
}
__global__ void kernel_mm_seq_fp16i8(
const int B, const int N, const int M,
const __half *__restrict__ const x, const int x_stride,
const uint8_t *__restrict__ const w, const int w_stride,
const __half *__restrict__ const mx,
const __half *__restrict__ const rx,
const __half *__restrict__ const my,
const __half *__restrict__ const ry,
__half *__restrict__ const y, const int y_stride) {
const int i = blockIdx.x * blockDim.x + threadIdx.x;
const int k = blockIdx.y * blockDim.y + threadIdx.y;
if (i < B && k < M) {
float y_local = 0;
for (int j = 0; j < N; ++j) {
y_local += __half2float(x[i * x_stride + j]) * (
(float(w[j * w_stride + k]) + 0.5f)
* __half2float(rx[k]) * __half2float(ry[j])
+ __half2float(mx[k]) + __half2float(my[j])
);
}
y[i * y_stride + k] = __float2half(y_local);
}
}
template <>
void cuda_mm8_seq<fp16>(int B, int N, int M,
fp16 *x, int x_stride,
uint8_t *w, int w_stride,
fp16 *mx, fp16 *rx,
fp16 *my, fp16 *ry,
fp16 *y, int y_stride) {
dim3 blockSize(1, 128);
dim3 gridSize((B + blockSize.x - 1) / blockSize.x, (M + blockSize.y - 1) / blockSize.y);
kernel_mm_seq_fp16i8<<<gridSize, blockSize>>>(
B, N, M, cast(x), x_stride, w, w_stride,
cast(mx), cast(rx), cast(my), cast(ry), cast(y), y_stride);
}
#define MM8_ONE_JSPLIT 24
#define MM8_ONE_TILE 1024
__global__ void kernel_mm_one_fp32i8(
const int N, const int M,
const float *__restrict__ const x,
const uint8_t *__restrict__ const w, const int w_stride,
const float *__restrict__ const mx,
const float *__restrict__ const rx,
const float *__restrict__ const my,
const float *__restrict__ const ry,
float *__restrict__ const y) {
const int k = blockIdx.y * blockDim.y + threadIdx.y;
const int j0 = min(N, blockIdx.x * ((N + MM8_ONE_JSPLIT - 1) / MM8_ONE_JSPLIT));
const int j1 = min(N, (blockIdx.x + 1) * ((N + MM8_ONE_JSPLIT - 1) / MM8_ONE_JSPLIT));
if (k < M) {
float y_local = 0;
for (int j = j0; j < j1; ++j) {
y_local += x[j] * (
(float(w[j * w_stride + k]) + 0.5f)
* rx[k] * ry[j] + mx[k] + my[j]
);
}
atomicAdd(&y[k], y_local);
}
}
template <typename F>
void cuda_mm8_one(int N, int M,
F *x,
uint8_t *w, int w_stride,
F *mx, F *rx,
F *my, F *ry,
float *y);
template <>
void cuda_mm8_one<float>(int N, int M,
float *x,
uint8_t *w, int w_stride,
float *mx, float *rx,
float *my, float *ry,
float *y) {
dim3 blockSize(1, MM8_ONE_TILE);
dim3 gridSize(MM8_ONE_JSPLIT, (M + blockSize.y - 1) / blockSize.y);
kernel_mm_one_fp32i8<<<gridSize, blockSize>>>(
N, M, x, w, w_stride,
mx, rx, my, ry, y);
}
__global__ void kernel_mm_one_fp16i8(
const int N, const int M,
const __half *__restrict__ const x,
const uint8_t *__restrict__ const w, const int w_stride,
const __half *__restrict__ const mx,
const __half *__restrict__ const rx,
const __half *__restrict__ const my,
const __half *__restrict__ const ry,
float *__restrict__ const y) {
const int k = blockIdx.y * blockDim.y + threadIdx.y;
const int j0 = min(N, blockIdx.x * ((N + MM8_ONE_JSPLIT - 1) / MM8_ONE_JSPLIT));
const int j1 = min(N, (blockIdx.x + 1) * ((N + MM8_ONE_JSPLIT - 1) / MM8_ONE_JSPLIT));
if (k < M) {
float y_local = 0;
for (int j = j0; j < j1; ++j) {
y_local += __half2float(x[j]) * (
(float(w[j * w_stride + k]) + 0.5f)
* __half2float(rx[k]) * __half2float(ry[j])
+ __half2float(mx[k]) + __half2float(my[j])
);
}
atomicAdd(&y[k], y_local);
}
}
template <>
void cuda_mm8_one<fp16>(int N, int M,
fp16 *x,
uint8_t *w, int w_stride,
fp16 *mx, fp16 *rx,
fp16 *my, fp16 *ry,
float *y) {
dim3 blockSize(1, MM8_ONE_TILE);
dim3 gridSize(MM8_ONE_JSPLIT, (M + blockSize.y - 1) / blockSize.y);
kernel_mm_one_fp16i8<<<gridSize, blockSize>>>(
N, M, cast(x), w, w_stride,
cast(mx), cast(rx), cast(my), cast(ry), y);
}
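As a reading aid for the two `kernel_mm_one_*` kernels above, here is a plain-Python reference of the same arithmetic, offered as a sketch and not as the project's code. Each weight byte is dequantized as `(w + 0.5) * rx[k] * ry[j] + mx[k] + my[j]`, and the reduction over `j` is cut into `MM8_ONE_JSPLIT` slices whose partial sums are accumulated into `y[k]`. The kernel accumulates with `atomicAdd`, which suggests the caller is expected to zero `y` beforehand. The function names and the list-of-lists layout are illustrative assumptions.

```python
MM8_ONE_JSPLIT = 24  # same constant as in the kernel source above


def dequant(w_byte, rx_k, ry_j, mx_k, my_j):
    """Dequantize one uint8 weight the way the kernel does."""
    return (float(w_byte) + 0.5) * rx_k * ry_j + mx_k + my_j


def mm8_one_reference(x, w, mx, rx, my, ry):
    """Direct y[k] = sum_j x[j] * dequant(w[j][k]); w has N rows and M columns."""
    n, m = len(w), len(w[0])
    return [
        sum(x[j] * dequant(w[j][k], rx[k], ry[j], mx[k], my[j]) for j in range(n))
        for k in range(m)
    ]


def mm8_one_split(x, w, mx, rx, my, ry, jsplit=MM8_ONE_JSPLIT):
    """Same result, reduced slice by slice like the x dimension of the CUDA grid."""
    n, m = len(w), len(w[0])
    chunk = (n + jsplit - 1) // jsplit  # ceil(N / JSPLIT), as in j0 and j1
    y = [0.0] * m  # must start at zero because partial sums are added in
    for block in range(jsplit):
        j0 = min(n, block * chunk)
        j1 = min(n, (block + 1) * chunk)
        for k in range(m):
            y_local = 0.0
            for j in range(j0, j1):
                y_local += x[j] * dequant(w[j][k], rx[k], ry[j], mx[k], my[j])
            y[k] += y_local  # stands in for atomicAdd(&y[k], y_local)
    return y
```

Splitting the reduction this way gives the GPU 24 independent slices per output column; the price is that floating-point summation order, and therefore the last bits of the result, can differ between runs.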


@@ -0,0 +1,7 @@
#include "ATen/ATen.h"
#include <cuda_fp16.h>
template <typename T> T *data_ptr(torch::Tensor x) { return x.data_ptr<T>(); }
template <> inline half *data_ptr(torch::Tensor x) {
return reinterpret_cast<half *>(x.data_ptr<at::Half>());
}


@@ -0,0 +1,181 @@
#include <torch/extension.h>
#include "ATen/ATen.h"
#include <iostream>
#include <c10/cuda/CUDAGuard.h>
typedef at::Half fp16;
template <typename F>
void cuda_wkv_forward(int B, int T, int C,
float *w, float *u, F *k, F *v, F *y,
float *aa, float *bb, float *pp);
template <typename F>
void cuda_mm8_seq(int B, int N, int M,
F *x, int x_stride,
uint8_t *w, int w_stride,
F *mx, F *rx,
F *my, F *ry,
F *y, int y_stride);
template <typename F>
void cuda_mm8_one(int N, int M,
F *x,
uint8_t *w, int w_stride,
F *mx, F *rx,
F *my, F *ry,
float *y);
void wkv_forward(int64_t B, int64_t T, int64_t C,
torch::Tensor &w, torch::Tensor &u,
torch::Tensor &k, torch::Tensor &v, torch::Tensor &y,
torch::Tensor &aa, torch::Tensor &bb, torch::Tensor &pp) {
const at::cuda::OptionalCUDAGuard device_guard(device_of(w));
switch (k.scalar_type()) {
case c10::ScalarType::Half:
cuda_wkv_forward(B, T, C,
w.data_ptr<float>(), u.data_ptr<float>(),
k.data_ptr<fp16>(), v.data_ptr<fp16>(), y.data_ptr<fp16>(),
aa.data_ptr<float>(), bb.data_ptr<float>(), pp.data_ptr<float>());
break;
case c10::ScalarType::Float:
cuda_wkv_forward(B, T, C,
w.data_ptr<float>(), u.data_ptr<float>(),
k.data_ptr<float>(), v.data_ptr<float>(), y.data_ptr<float>(),
aa.data_ptr<float>(), bb.data_ptr<float>(), pp.data_ptr<float>());
break;
default:
assert(false && "Only FP16 and FP32 are currently supported");
}
}
void mm8_seq(int64_t B, int64_t N, int64_t M,
torch::Tensor &x, torch::Tensor &w,
torch::Tensor &mx, torch::Tensor &rx,
torch::Tensor &my, torch::Tensor &ry,
torch::Tensor &y) {
assert(x.stride(1) == 1);
assert(w.stride(1) == 1);
assert(mx.stride(0) == 1 && rx.stride(0) == 1);
assert(my.stride(0) == 1 && ry.stride(0) == 1);
assert(y.stride(1) == 1);
const at::cuda::OptionalCUDAGuard device_guard(device_of(w));
switch (x.scalar_type()) {
case c10::ScalarType::Half:
cuda_mm8_seq(
B, N, M,
x.data_ptr<fp16>(), x.stride(0),
w.data_ptr<uint8_t>(), w.stride(0),
mx.data_ptr<fp16>(), rx.data_ptr<fp16>(),
my.data_ptr<fp16>(), ry.data_ptr<fp16>(),
y.data_ptr<fp16>(), y.stride(0));
break;
case c10::ScalarType::Float:
cuda_mm8_seq(
B, N, M,
x.data_ptr<float>(), x.stride(0),
w.data_ptr<uint8_t>(), w.stride(0),
mx.data_ptr<float>(), rx.data_ptr<float>(),
my.data_ptr<float>(), ry.data_ptr<float>(),
y.data_ptr<float>(), y.stride(0));
break;
default:
assert(false && "Only FP16 and FP32 are currently supported");
}
}
void mm8_one(int64_t N, int64_t M,
torch::Tensor &x, torch::Tensor &w,
torch::Tensor &mx, torch::Tensor &rx,
torch::Tensor &my, torch::Tensor &ry,
torch::Tensor &y) {
assert(x.stride(0) == 1);
assert(w.stride(1) == 1);
assert(mx.stride(0) == 1 && rx.stride(0) == 1);
assert(my.stride(0) == 1 && ry.stride(0) == 1);
assert(y.stride(0) == 1);
const at::cuda::OptionalCUDAGuard device_guard(device_of(w));
switch (x.scalar_type()) {
case c10::ScalarType::Half:
cuda_mm8_one(
N, M,
x.data_ptr<fp16>(),
w.data_ptr<uint8_t>(), w.stride(0),
mx.data_ptr<fp16>(), rx.data_ptr<fp16>(),
my.data_ptr<fp16>(), ry.data_ptr<fp16>(),
y.data_ptr<float>());
break;
case c10::ScalarType::Float:
cuda_mm8_one(
N, M,
x.data_ptr<float>(),
w.data_ptr<uint8_t>(), w.stride(0),
mx.data_ptr<float>(), rx.data_ptr<float>(),
my.data_ptr<float>(), ry.data_ptr<float>(),
y.data_ptr<float>());
break;
default:
assert(false && "Only FP16 and FP32 are currently supported");
}
}
using torch::Tensor;
#ifndef DISABLE_CUBLAS_GEMM
void gemm_fp16_cublas_tensor(Tensor a, Tensor b, Tensor c);
#endif
Tensor att_one(Tensor x, Tensor ln_w, Tensor ln_b, Tensor sx, Tensor k_mix,
Tensor v_mix, Tensor r_mix, Tensor kw,
/* imm */ Tensor kx, Tensor vw, /* imm */ Tensor vx, Tensor rw,
/* imm */ Tensor rx, Tensor ow, Tensor t_first,
/* imm */ Tensor k, Tensor pp, Tensor ww, Tensor aa, Tensor bb,
Tensor t_decay, /* imm */ Tensor v, /* in & out */ Tensor r,
/* out */ Tensor x_plus_out, /* out */ Tensor t1,
/* out */ Tensor t2, /* out */ Tensor p);
Tensor att_seq(Tensor x, Tensor sx, Tensor ln_w, Tensor ln_b, Tensor k_mix,
Tensor v_mix, Tensor r_mix, Tensor kw, Tensor vw, Tensor rw,
Tensor ow, Tensor t_first, Tensor pp, Tensor aa, Tensor bb,
Tensor t_decay, /* imm */ Tensor buf, /* out */ Tensor x_plus_out);
Tensor att_one_v5(Tensor x, Tensor sx, Tensor s, Tensor ln_w, Tensor ln_b,
Tensor lx_w, Tensor lx_b, Tensor k_mix, Tensor v_mix,
Tensor r_mix, Tensor kw,
/* imm */ Tensor kx, Tensor vw, /* imm */ Tensor vx,
Tensor rw,
/* imm */ Tensor rx, Tensor ow, Tensor t_first,
/* imm */ Tensor k, Tensor t_decay, /* imm */ Tensor v,
/* imm */ Tensor r, /* imm */ Tensor s1,
/* out */ Tensor x_plus_out, /* out */ Tensor s2);
Tensor ffn_seq(Tensor x, Tensor sx, Tensor ln_w, Tensor ln_b, Tensor k_mix,
Tensor r_mix, Tensor kw, Tensor vw, Tensor rw,
/* imm */ Tensor buf,
/* out */ Tensor x_plus_out);
Tensor ffn_one(Tensor x, Tensor sx, Tensor ln_w, Tensor ln_b, Tensor k_mix,
Tensor r_mix, Tensor kw, Tensor vw, Tensor rw,
/* imm */ Tensor buf,
/* out */ Tensor x_plus_out);
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
m.def("wkv_forward", &wkv_forward, "wkv forward");
m.def("mm8_seq", &mm8_seq, "mm8 seq");
m.def("mm8_one", &mm8_one, "mm8 one");
m.def("gemm_fp16_cublas", &gemm_fp16_cublas_tensor, "gemv fp16 cublas");
m.def("att_one", &att_one, "att one");
m.def("att_one_v5", &att_one_v5, "att one v5");
m.def("att_seq", &att_seq, "att seq");
m.def("ffn_seq", &ffn_seq, "ffn seq");
m.def("ffn_one", &ffn_one, "ffn one");
}
TORCH_LIBRARY(rwkv, m) {
m.def("wkv_forward", wkv_forward);
m.def("mm8_seq", mm8_seq);
m.def("mm8_one", mm8_one);
m.def("gemm_fp16_cublas", gemm_fp16_cublas_tensor);
m.def("att_one", att_one);
m.def("att_one_v5", &att_one_v5);
m.def("att_seq", att_seq);
m.def("ffn_seq", ffn_seq);
m.def("ffn_one", ffn_one);
}

1821
backend-python/rwkv_pip/beta/model.py vendored Normal file

File diff suppressed because it is too large

Binary file not shown.

20144
backend-python/rwkv_pip/tokenizer-midi.json vendored Normal file

File diff suppressed because it is too large


@@ -33,7 +33,7 @@ class PIPELINE_ARGS:
 class PIPELINE:
-    def __init__(self, model, WORD_NAME):
+    def __init__(self, model, WORD_NAME: str):
         self.model = model
         if WORD_NAME == "cl100k_base":
             import tiktoken
@@ -47,9 +47,15 @@ class PIPELINE:
                 os.path.dirname(os.path.abspath(__file__)) + "/rwkv_vocab_v20230424.txt"
             )
         else:
-            from tokenizers import Tokenizer
+            if WORD_NAME.endswith(".txt"):
+                sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
+                from rwkv_tokenizer import TRIE_TOKENIZER
-            self.tokenizer = Tokenizer.from_file(WORD_NAME)
+                self.tokenizer = TRIE_TOKENIZER(WORD_NAME)
+            else:
+                from tokenizers import Tokenizer
+                self.tokenizer = Tokenizer.from_file(WORD_NAME)
     def refine_context(self, context):
         context = context.strip().split("\n")
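
The second hunk adds a branch to the tokenizer selection: a `WORD_NAME` ending in `.txt` is treated as a vocabulary file for `TRIE_TOKENIZER`, and anything else still goes to `Tokenizer.from_file`. The dispatch order can be sketched in isolation as below. `tokenizer_backend` and the returned labels are hypothetical names used only for illustration, and the branches that fall between the two hunks are deliberately left out because the diff does not show them.

```python
def tokenizer_backend(word_name: str) -> str:
    """Return which tokenizer family a WORD_NAME would select (illustrative labels)."""
    if word_name == "cl100k_base":
        return "tiktoken"
    # Branches not visible in the diff are omitted here.
    if word_name.endswith(".txt"):
        return "trie"  # custom vocabulary file -> TRIE_TOKENIZER(word_name)
    return "tokenizers"  # anything else -> Tokenizer.from_file(word_name)
```

Because the check is a plain suffix match, a JSON tokenizer file must not be given a `.txt` extension, and a trie vocabulary must be.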


@@ -2,6 +2,8 @@ import json
 import logging
 from typing import Any
 from fastapi import Request
+from pydantic import BaseModel
+from enum import Enum
 logger = logging.getLogger()
@@ -14,12 +16,21 @@ fh.setFormatter(formatter)
 logger.addHandler(fh)
+class ClsEncoder(json.JSONEncoder):
+    def default(self, obj):
+        if isinstance(obj, BaseModel):
+            return obj.dict()
+        if isinstance(obj, Enum):
+            return obj.value
+        return super().default(obj)
 def quick_log(request: Request, body: Any, response: str):
     try:
         logger.info(
             f"Client: {request.client if request else ''}\nUrl: {request.url if request else ''}\n"
             + (
-                f"Body: {json.dumps(body.__dict__, default=vars, ensure_ascii=False)}\n"
+                f"Body: {json.dumps(body.__dict__, ensure_ascii=False, cls=ClsEncoder)}\n"
                 if body
                 else ""
             )
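
The logging change replaces `default=vars` with a `json.JSONEncoder` subclass that knows how to serialize model objects and enum members. A minimal standalone sketch of the same pattern follows. `FakeModel` is a stand-in for a pydantic `BaseModel` (only its `.dict()` method is imitated), and `Role` and `SketchEncoder` are hypothetical names. The sketch avoids importing pydantic so that it runs with the standard library alone.

```python
import json
from enum import Enum


class Role(Enum):  # hypothetical enum, for illustration only
    USER = "user"
    ASSISTANT = "assistant"


class FakeModel:  # stand-in for a pydantic BaseModel exposing .dict()
    def __init__(self, **fields):
        self._fields = fields

    def dict(self):
        return dict(self._fields)


class SketchEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, FakeModel):
            return obj.dict()  # nested values are encoded recursively afterwards
        if isinstance(obj, Enum):
            return obj.value
        return super().default(obj)  # raises TypeError for unknown types


body = {"messages": [FakeModel(role=Role.USER, content="你好")], "stream": False}
encoded = json.dumps(body, ensure_ascii=False, cls=SketchEncoder)
```

`default` is only consulted for objects the encoder cannot handle natively, so plain dicts, lists, strings and numbers take the fast path unchanged.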

685
backend-python/utils/midi.py vendored Normal file

@@ -0,0 +1,685 @@
# https://github.com/briansemrau/MIDI-LLM-tokenizer
# MIT License
# Copyright (c) 2023 Brian Semrau
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
import json
import random
from dataclasses import dataclass
from functools import lru_cache
from math import ceil, floor, log
from typing import Dict, Iterator, List, Optional, Tuple
import mido
@dataclass
class VocabConfig:
# Number of note events. Should be 128.
note_events: int
# Number of wait events. Configurable, must evenly divide max_wait_time.
wait_events: int
# Max wait time in milliseconds to be represented by a single token.
max_wait_time: int
# Number of velocity events. Should be 128 (MIDI velocity is a 7-bit value, 0-127).
velocity_events: int
# Number of bins to quantize velocity into. Should evenly divide velocity_events.
velocity_bins: int
# Exponential scaling factor for velocity bin sizes. 1.0 = linear scaling.
velocity_exp: float
# Whether to sort tokens by instrument, note. This should improve data reducibility.
do_token_sorting: bool
# Whether tokens should be represented as combined instrument/note/velocity tokens, or separate tokens for each.
unrolled_tokens: bool
# If non-zero, notes held for this many seconds will be automatically released during str->midi decoding.
decode_end_held_note_delay: float
# If true, repeated notes will be automatically released before playing again during str->midi decoding.
decode_fix_repeated_notes: bool
# List of instrument names to use for binning. Must have at most 16 values.
bin_instrument_names: List[str]
# Indicates which bin name represents percussion instruments on MIDI channel 10.
ch10_instrument_bin_name: str
# Mapping from instrument name to bin name.
program_name_to_bin_name: Dict[str, str]
# Mapping from bin name to program name.
bin_name_to_program_name: Dict[str, str]
# Mapping from program number to instrument name.
instrument_names: Dict[str, str]
def __post_init__(self):
self.validate()
self._instrument_names_str_to_int = {
name: int(i) for i, name in self.instrument_names.items()
}
self._instrument_names_int_to_str = {
int(i): name for i, name in self.instrument_names.items()
}
self._bin_str_to_int = {
name: int(i) for i, name in enumerate(self.bin_instrument_names)
}
self._bin_int_to_instrument_int = [
self._instrument_names_str_to_int[self.bin_name_to_program_name[name]]
if name != self.ch10_instrument_bin_name
else 0
for name in self.bin_instrument_names
]
self._instrument_int_to_bin_int = [
self._bin_str_to_int[self.program_name_to_bin_name[instr]]
if self.program_name_to_bin_name[instr] != ""
else -1
for instr in self.program_name_to_bin_name.keys()
]
self._ch10_bin_int = (
self._bin_str_to_int[self.ch10_instrument_bin_name]
if self.ch10_instrument_bin_name
else -1
)
self.short_instr_bin_names = []
for instr in self.bin_instrument_names:
i = min(1, len(instr))
while instr[:i] in self.short_instr_bin_names:
i += 1
self.short_instr_bin_names.append(instr[:i])
self._short_instrument_names_str_to_int = {
name: int(i) for i, name in enumerate(self.short_instr_bin_names)
}
range_excluding_ch10 = [
(i if i < 9 else i + 1) for i in range(len(self.bin_instrument_names))
]
bins_excluding_ch10 = [
n for n in self.bin_instrument_names if n != self.ch10_instrument_bin_name
]
self.bin_channel_map = {
bin: channel
for channel, bin in zip(range_excluding_ch10, bins_excluding_ch10)
}
if self.ch10_instrument_bin_name:
self.bin_channel_map[self.ch10_instrument_bin_name] = 9
def validate(self):
if self.max_wait_time % self.wait_events != 0:
raise ValueError("max_wait_time must be exactly divisible by wait_events")
if self.velocity_bins < 2:
raise ValueError("velocity_bins must be at least 2")
if len(self.bin_instrument_names) > 16:
raise ValueError("bin_instrument_names must have at most 16 values")
if (
self.ch10_instrument_bin_name
and self.ch10_instrument_bin_name not in self.bin_instrument_names
):
raise ValueError("ch10_instrument_bin_name must be in bin_instrument_names")
if self.velocity_exp <= 0:
raise ValueError("velocity_exp must be greater than 0")
@classmethod
def from_json(cls, path: str):
with open(path, "r") as f:
config = json.load(f)
return cls(**config)
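
The `velocity_events`, `velocity_bins` and `velocity_exp` fields above control how a 0-127 velocity is quantized. The sketch below restates the two conversion formulas from `VocabUtils` as free functions so that the effect of `velocity_exp` can be tried out directly. The default arguments are taken from the JSON config in this diff (128 events, 12 bins, exponent 0.5); the free-function form itself is an assumption made for illustration.

```python
from math import ceil, log


def velocity_to_bin(velocity, events=128, bins=12, exp=0.5):
    """Quantize a velocity into one of `bins` bins (bin 0 means note-off)."""
    velocity = max(0, min(velocity, events - 1))
    binsize = events / (bins - 1)
    if exp == 1.0:
        return ceil(velocity / binsize)  # linear spacing
    # Warp the velocity along an exponential curve before binning.
    scaled = events * ((exp ** (velocity / events) - 1.0) / (exp - 1.0))
    return ceil(scaled / binsize)


def bin_to_velocity(bin_index, events=128, bins=12, exp=0.5):
    """Map a bin back to a representative velocity (inverse of the warp)."""
    binsize = events / (bins - 1)
    if exp == 1.0:
        return max(0, ceil(bin_index * binsize - 1))
    inner = ((exp - 1) * binsize * bin_index) / events + 1
    return max(0, ceil(events * log(inner, exp) - 1))
```

With an exponent below 1.0 the warp stretches the low end of the range, so quiet notes are spread over more bins than they would be with linear spacing.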
class VocabUtils:
def __init__(self, cfg: VocabConfig) -> None:
self.cfg = cfg
@lru_cache(maxsize=128)
def format_wait_token(self, wait: int) -> str:
return f"t{wait}"
@lru_cache(maxsize=128)
def format_note_token(
self, instrument_bin: int, note: int, velocity_bin: int
) -> str:
return f"{self.cfg.short_instr_bin_names[instrument_bin]}:{note:x}:{velocity_bin:x}"
def format_unrolled_note(self, note: int) -> str:
return f"n{note:x}"
def format_unrolled_velocity(self, velocity_bin: int) -> str:
return f"v{velocity_bin:x}"
def format_unrolled_instrument_bin(self, instrument_bin: int) -> str:
return f"i{self.cfg.short_instr_bin_names[instrument_bin]}"
def velocity_to_bin(self, velocity: float) -> int:
velocity = max(0, min(velocity, self.cfg.velocity_events - 1))
binsize = self.cfg.velocity_events / (self.cfg.velocity_bins - 1)
if self.cfg.velocity_exp == 1.0:
return ceil(velocity / binsize)
else:
return ceil(
(
self.cfg.velocity_events
* (
(
self.cfg.velocity_exp
** (velocity / self.cfg.velocity_events)
- 1.0
)
/ (self.cfg.velocity_exp - 1.0)
)
)
/ binsize
)
def bin_to_velocity(self, bin: int) -> int:
binsize = self.cfg.velocity_events / (self.cfg.velocity_bins - 1)
if self.cfg.velocity_exp == 1.0:
return max(0, ceil(bin * binsize - 1))
else:
return max(
0,
ceil(
self.cfg.velocity_events
* log(
((self.cfg.velocity_exp - 1) * binsize * bin)
/ self.cfg.velocity_events
+ 1,
self.cfg.velocity_exp,
)
- 1
),
)
def delta_to_wait_ids(self, delta_ms: float) -> Iterator[int]:
def roundi(f: float):
return ceil(f - 0.5)
max_wait_ms = self.cfg.max_wait_time
div = max_wait_ms / self.cfg.wait_events
# if delta_ms // max_wait_ms > 512: # arbitrary limit to avoid excessive time_shifts
# raise ValueError("delta_time is too large")
if delta_ms > max_wait_ms * 10:
delta_ms = max_wait_ms * 10 # truncate time
for _ in range(floor(delta_ms / max_wait_ms)):
yield roundi(max_wait_ms / div)
leftover_time_shift = roundi((delta_ms % max_wait_ms) / div)
if leftover_time_shift > 0:
yield leftover_time_shift
def prog_data_to_token_data(
self, program: int, channel: int, note: int, velocity: float
) -> Optional[Tuple[int, int, int]]:
if channel == 9:
if self.cfg._ch10_bin_int == -1:
return None
return self.cfg._ch10_bin_int, note, self.velocity_to_bin(velocity)
instrument_bin = self.cfg._instrument_int_to_bin_int[program]
if instrument_bin != -1:
return instrument_bin, note, self.velocity_to_bin(velocity)
return None
def prog_data_list_to_token_data_list(
self, data: List[Tuple[int, int, int, float]]
) -> Iterator[Tuple[int, int, int]]:
for d in data:
token_data = self.prog_data_to_token_data(*d)
if token_data is not None:
yield token_data
def sort_token_data(
self, data: List[Tuple[int, int, int]]
) -> List[Tuple[int, int, int]]:
# ensure order is preserved for tokens with the same instrument, note
data = [(i, n, v, x) for x, (i, n, v) in enumerate(data)]
data.sort(key=lambda x: (x[0] != self.cfg._ch10_bin_int, x[0], x[1], x[3]))
return [(i, n, v) for i, n, v, _ in data]
def data_to_wait_tokens(self, delta_ms: float) -> List[str]:
if delta_ms == 0.0:
return []
return [self.format_wait_token(i) for i in self.delta_to_wait_ids(delta_ms)]
def wait_token_to_delta(self, token: str) -> float:
return self.cfg.max_wait_time / self.cfg.wait_events * int(token[1:])
def note_token_to_data(self, token: str) -> Tuple[int, int, int]:
instr_str, note_str, velocity_str = token.strip().split(":")
instr_bin = self.cfg._short_instrument_names_str_to_int[instr_str]
note = int(note_str, base=16)
velocity = self.bin_to_velocity(int(velocity_str, base=16))
return instr_bin, note, velocity
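
To make the wait-token logic in `delta_to_wait_ids` easier to follow, here is a standalone sketch of the same steps: cap the delay at ten times `max_wait_time`, emit one full-length token per whole `max_wait_time`, then round the remainder to the nearest step. The defaults (1000 ms, 125 steps, so 8 ms per step) come from the JSON config in this diff. Returning a list where the original yields from a generator is a simplification for illustration.

```python
from math import ceil, floor


def delta_to_wait_ids(delta_ms, max_wait_ms=1000, wait_events=125):
    def roundi(f):
        return ceil(f - 0.5)  # rounds halves down, e.g. 62.5 -> 62

    div = max_wait_ms / wait_events  # milliseconds per wait step
    delta_ms = min(delta_ms, max_wait_ms * 10)  # truncate very long pauses
    ids = [roundi(max_wait_ms / div)] * floor(delta_ms / max_wait_ms)
    leftover = roundi((delta_ms % max_wait_ms) / div)
    if leftover > 0:
        ids.append(leftover)
    return ids


def wait_tokens(delta_ms):
    return [f"t{i}" for i in delta_to_wait_ids(delta_ms)]
```

For example, a 2500 ms gap becomes `t125 t125 t62`, while a gap of 4 ms or less rounds to nothing and produces no token.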
@dataclass
class AugmentValues:
instrument_bin_remap: Dict[int, int]
velocity_mod_factor: float
transpose_semitones: int
time_stretch_factor: float
@classmethod
def default(cls) -> "AugmentValues":
return cls(
instrument_bin_remap={},
velocity_mod_factor=1.0,
transpose_semitones=0,
time_stretch_factor=1.0,
)
@dataclass
class AugmentConfig:
# The number of times to augment each MIDI file. The dataset size will be multiplied by this number.
augment_data_factor: int
# A list of instrument names to randomly swap with each other.
instrument_mixups: List[List[str]]
# A list of percentages to change the note velocity by. 0.0 = no change. 0 is included by default.
velocity_mod_pct: List[float]
# A list of semitones to transpose by. 0 is included by default.
transpose_semitones: List[int]
# A list of percentages to stretch the tempo by. 0.0 = no stretch. 0 is included by default.
time_stretch_pct: List[float]
# Random seed to use for reproducibility.
seed: int
cfg: VocabConfig
def __post_init__(self):
self.validate()
if len(self.velocity_mod_pct) == 0:
self.velocity_mod_pct = [0.0]
if len(self.transpose_semitones) == 0:
self.transpose_semitones = [0]
if len(self.time_stretch_pct) == 0:
self.time_stretch_pct = [0.0]
self._instrument_mixups_int = [
[self.cfg._bin_str_to_int[i] for i in l if i in self.cfg._bin_str_to_int]
for l in self.instrument_mixups
]
self._instrument_mixups_int = [
l for l in self._instrument_mixups_int if len(l) > 0
] # remove empty lists
self._instrument_pool_assignments = {}
self._mixup_pools = []
for pool_i, mixup_list in enumerate(self._instrument_mixups_int):
pool = set()
for i in mixup_list:
pool.add(i)
self._instrument_pool_assignments[i] = pool_i
self._mixup_pools.append(pool)
def validate(self):
if self.augment_data_factor < 1:
raise ValueError("augment_data_factor must be at least 1")
used_instruments = set()
for mixup_list in self.instrument_mixups:
for n in mixup_list:
if n in used_instruments:
raise ValueError(f"Duplicate instrument name: {n}")
used_instruments.add(n)
@classmethod
def from_json(cls, path: str, cfg: VocabConfig):
with open(path, "r") as f:
config = json.load(f)
config["cfg"] = cfg
if "seed" not in config:
config["seed"] = random.randint(0, 2**32 - 1)
return cls(**config)
def get_augment_values(self, filename: str) -> Iterator[AugmentValues]:
# first yield default values
yield AugmentValues.default()
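# NOTE: hash() of a str is salted per process unless PYTHONHASHSEED is set,
# so this seed is only stable within a single run.
rng = random.Random(self.seed + hash(filename))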
for _ in range(int(self.augment_data_factor - 1)):
# randomize order for each pool
randomized_pools = [list(pool) for pool in self._mixup_pools]
for pool in randomized_pools:
rng.shuffle(pool)
# distribute reassignments
instrument_bin_remap = {}
for i, pool in enumerate(randomized_pools):
for j, instrument in enumerate(pool):
instrument_bin_remap[instrument] = randomized_pools[i - 1][j]
yield AugmentValues(
instrument_bin_remap=instrument_bin_remap,
velocity_mod_factor=1.0 + rng.choice(self.velocity_mod_pct),
transpose_semitones=rng.choice(self.transpose_semitones),
time_stretch_factor=1.0 + rng.choice(self.time_stretch_pct),
)
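
`get_augment_values` seeds its generator with `self.seed + hash(filename)`. In CPython, `hash()` of a `str` is salted per process unless `PYTHONHASHSEED` is fixed, so the same file can receive different augmentations in different runs. If run-to-run reproducibility matters, one option is to derive the per-file offset from a stable checksum such as `zlib.crc32`. This is a suggestion, not something the vendored file does, and `stable_file_rng` is a hypothetical helper.

```python
import random
import zlib


def stable_file_rng(seed: int, filename: str) -> random.Random:
    """Build a per-file RNG whose stream is identical across processes."""
    offset = zlib.crc32(filename.encode("utf-8"))  # deterministic 32-bit checksum
    return random.Random(seed + offset)


# The same (seed, filename) pair always yields the same shuffle.
pool_a = [0, 1, 2, 3]
pool_b = [0, 1, 2, 3]
stable_file_rng(42, "song.mid").shuffle(pool_a)
stable_file_rng(42, "song.mid").shuffle(pool_b)
```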
def mix_volume(velocity: int, volume: int, expression: int) -> float:
return velocity * (volume / 127.0) * (expression / 127.0)
def convert_midi_to_str(
cfg: VocabConfig, mid: mido.MidiFile, augment: Optional[AugmentValues] = None
) -> str:
utils = VocabUtils(cfg)
if augment is None:
augment = AugmentValues.default()
# filter out unknown meta messages before merge (https://github.com/mido/mido/pull/286)
for i in range(len(mid.tracks)):
mid.tracks[i] = [msg for msg in mid.tracks[i] if msg.type != "unknown_meta"]
if len(mid.tracks) > 1:
mid.tracks = [mido.merge_tracks(mid.tracks)]
delta_time_ms = 0.0
tempo = 500000
channel_program = {i: 0 for i in range(16)}
channel_volume = {i: 127 for i in range(16)}
channel_expression = {
i: 127 for i in range(16)
} # unlikely to be useful. expression usually modifies an already played note.
channel_notes = {i: {} for i in range(16)}
channel_pedal_on = {i: False for i in range(16)}
channel_pedal_events = {
i: {} for i in range(16)
} # {channel: {(note, program) -> True}}
started_flag = False
output = ["<start>"]
token_data_buffer: List[
Tuple[int, int, int, float]
] = [] # need to sort notes between wait tokens
def flush_token_data_buffer():
nonlocal token_data_buffer, output, cfg, utils, augment
token_data = [
x for x in utils.prog_data_list_to_token_data_list(token_data_buffer)
]
if augment.instrument_bin_remap or augment.transpose_semitones:
# TODO put transpose in a real function
raw_transpose = (
lambda bin, n: n + augment.transpose_semitones
if bin != cfg._ch10_bin_int
else n
)
octave_shift_if_oob = (
lambda n: n + 12 if n < 0 else n - 12 if n >= cfg.note_events else n
)
# TODO handle ranges beyond 12
# octave_shift_if_oob = lambda n: 0 if n < 0 else (n - cfg.note_events) % 12 + cfg.note_events if n >= cfg.note_events else n
transpose = lambda bin, n: octave_shift_if_oob(raw_transpose(bin, n))
token_data = [
(augment.instrument_bin_remap.get(i, i), transpose(i, n), v)
for i, n, v in token_data
]
if cfg.do_token_sorting:
token_data = utils.sort_token_data(token_data)
if cfg.unrolled_tokens:
for t in token_data:
output += [
utils.format_unrolled_instrument_bin(t[0]),
utils.format_unrolled_note(t[1]),
utils.format_unrolled_velocity(t[2]),
]
else:
output += [utils.format_note_token(*t) for t in token_data]
token_data_buffer = []
def consume_note_program_data(prog: int, chan: int, note: int, vel: float):
nonlocal output, started_flag, delta_time_ms, cfg, utils, token_data_buffer
is_token_valid = (
utils.prog_data_to_token_data(prog, chan, note, vel) is not None
)
if not is_token_valid:
return
if started_flag:
wait_tokens = utils.data_to_wait_tokens(delta_time_ms)
if len(wait_tokens) > 0:
flush_token_data_buffer()
output += wait_tokens
delta_time_ms = 0.0
token_data_buffer.append((prog, chan, note, vel * augment.velocity_mod_factor))
started_flag = True
for msg in mid.tracks[0]:
time_ms = mido.tick2second(msg.time, mid.ticks_per_beat, tempo) * 1000.0
delta_time_ms += time_ms
t = msg.type
if msg.is_meta:
if t == "set_tempo":
tempo = msg.tempo * augment.time_stretch_factor
continue
def handle_note_off(ch, prog, n):
if channel_pedal_on[ch]:
channel_pedal_events[ch][(n, prog)] = True
else:
consume_note_program_data(prog, ch, n, 0)
if n in channel_notes[ch]:
del channel_notes[ch][n]
if t == "program_change":
channel_program[msg.channel] = msg.program
elif t == "note_on":
if msg.velocity == 0:
handle_note_off(msg.channel, channel_program[msg.channel], msg.note)
else:
if (msg.note, channel_program[msg.channel]) in channel_pedal_events[
msg.channel
]:
del channel_pedal_events[msg.channel][
(msg.note, channel_program[msg.channel])
]
consume_note_program_data(
channel_program[msg.channel],
msg.channel,
msg.note,
mix_volume(
msg.velocity,
channel_volume[msg.channel],
channel_expression[msg.channel],
),
)
channel_notes[msg.channel][msg.note] = True
elif t == "note_off":
handle_note_off(msg.channel, channel_program[msg.channel], msg.note)
elif t == "control_change":
if msg.control == 7 or msg.control == 39: # volume
channel_volume[msg.channel] = msg.value
elif msg.control == 11: # expression
channel_expression[msg.channel] = msg.value
elif msg.control == 64: # sustain pedal
channel_pedal_on[msg.channel] = msg.value >= 64
if not channel_pedal_on[msg.channel]:
for note, program in channel_pedal_events[msg.channel]:
handle_note_off(msg.channel, program, note)
channel_pedal_events[msg.channel] = {}
elif msg.control == 123: # all notes off
for channel in channel_notes.keys():
for note in list(channel_notes[channel]).copy():
handle_note_off(channel, channel_program[channel], note)
else:
pass
flush_token_data_buffer()
output.append("<end>")
return " ".join(output)
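
The transpose step inside `flush_token_data_buffer` is built from nested lambdas and carries a TODO about moving it into a real function. A sketch of what that function could look like is below. It keeps the behavior shown in the source: percussion notes are left alone, and a result outside `0..note_events-1` is folded back by a single octave. The name `transpose_note` and the boolean `is_percussion` parameter are assumptions made for illustration.

```python
NOTE_EVENTS = 128  # matches "note_events" in the JSON config


def transpose_note(note, semitones, is_percussion=False, note_events=NOTE_EVENTS):
    """Shift a note by `semitones`, folding by one octave if it leaves the range."""
    if is_percussion:
        return note  # channel-10 notes select drum sounds, not pitches
    shifted = note + semitones
    if shifted < 0:
        return shifted + 12
    if shifted >= note_events:
        return shifted - 12
    return shifted
```

As the second TODO in the source notes, a single fold is not enough for shifts larger than an octave: `transpose_note(0, -14)` still returns `-2`.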
def generate_program_change_messages(cfg: VocabConfig):
for bin_name, channel in cfg.bin_channel_map.items():
if channel == 9:
continue
program = cfg._instrument_names_str_to_int[
cfg.bin_name_to_program_name[bin_name]
]
yield mido.Message("program_change", program=program, time=0, channel=channel)
yield mido.Message("program_change", program=0, time=0, channel=9)
@dataclass
class DecodeState:
total_time: float # milliseconds
delta_accum: float # milliseconds
current_bin: int
current_note: int
active_notes: Dict[Tuple[int, int], float] # { (channel, note): time started, ... }
def token_to_midi_message(
utils: VocabUtils, token: str, state: Optional[DecodeState], end_token_pause: float = 3.0
) -> Iterator[Tuple[Optional[mido.Message], DecodeState]]:
if state is None:
state = DecodeState(
total_time=0.0,
delta_accum=0.0,
current_bin=utils.cfg._short_instrument_names_str_to_int[
utils.cfg.short_instr_bin_names[0]
],
current_note=0,
active_notes={},
)
token = token.strip()
if not token:
yield None, state
return
if token == "<end>":
d = end_token_pause * 1000.0
state.delta_accum += d
state.total_time += d
if utils.cfg.decode_end_held_note_delay != 0.0:
# end held notes
for (channel, note), start_time in list(state.active_notes.items()).copy():
ticks = int(mido.second2tick(state.delta_accum / 1000.0, 480, 500000))
state.delta_accum = 0.0
del state.active_notes[(channel, note)]
yield mido.Message(
"note_off", note=note, time=ticks, channel=channel
), state
yield None, state
return
if token.startswith("<"):
yield None, state
return
if utils.cfg.unrolled_tokens:
if token[0] == "t":
d = utils.wait_token_to_delta(token)
state.delta_accum += d
state.total_time += d
elif token[0] == "n":
state.current_note = int(token[1:], base=16)
elif token[0] == "i":
state.current_bin = utils.cfg._short_instrument_names_str_to_int[token[1:]]
elif token[0] == "v":
current_velocity = utils.bin_to_velocity(int(token[1:], base=16))
channel = utils.cfg.bin_channel_map[
utils.cfg.bin_instrument_names[state.current_bin]
]
ticks = int(mido.second2tick(state.delta_accum / 1000.0, 480, 500000))
state.delta_accum = 0.0
if current_velocity > 0:
yield mido.Message(
"note_on",
note=state.current_note,
velocity=current_velocity,
time=ticks,
channel=channel,
), state
else:
yield mido.Message(
"note_off",
note=state.current_note,
velocity=0,
time=ticks,
channel=channel,
), state
else:
if token[0] == "t" and token[1].isdigit(): # wait token
d = utils.wait_token_to_delta(token)
state.delta_accum += d
state.total_time += d
if utils.cfg.decode_end_held_note_delay != 0.0:
# remove notes that have been held for too long
for (channel, note), start_time in list(
state.active_notes.items()
).copy():
if (
state.total_time - start_time
> utils.cfg.decode_end_held_note_delay * 1000.0
):
ticks = int(
mido.second2tick(state.delta_accum / 1000.0, 480, 500000)
)
state.delta_accum = 0.0
del state.active_notes[(channel, note)]
yield mido.Message(
"note_off", note=note, time=ticks, channel=channel
), state
return
else: # note token
bin, note, velocity = utils.note_token_to_data(token)
channel = utils.cfg.bin_channel_map[utils.cfg.bin_instrument_names[bin]]
ticks = int(mido.second2tick(state.delta_accum / 1000.0, 480, 500000))
state.delta_accum = 0.0
if velocity > 0:
if utils.cfg.decode_fix_repeated_notes:
if (channel, note) in state.active_notes:
del state.active_notes[(channel, note)]
yield mido.Message(
"note_off", note=note, time=ticks, channel=channel
), state
ticks = 0
state.active_notes[(channel, note)] = state.total_time
yield mido.Message(
"note_on", note=note, velocity=velocity, time=ticks, channel=channel
), state
return
else:
if (channel, note) in state.active_notes:
del state.active_notes[(channel, note)]
yield mido.Message(
"note_off", note=note, time=ticks, channel=channel
), state
return
yield None, state
def str_to_midi_messages(utils: VocabUtils, data: str) -> Iterator[mido.Message]:
state = None
for token in data.split(" "):
for msg, new_state in token_to_midi_message(utils, token, state):
state = new_state
if msg is not None:
yield msg
def convert_str_to_midi(
cfg: VocabConfig, data: str, meta_text: str = "Generated by MIDI-LLM-tokenizer"
) -> mido.MidiFile:
utils = VocabUtils(cfg)
mid = mido.MidiFile()
track = mido.MidiTrack()
mid.tracks.append(track)
tempo = 500000
if meta_text:
track.append(mido.MetaMessage("text", text=meta_text, time=0))
track.append(mido.MetaMessage("set_tempo", tempo=tempo, time=0))
for msg in generate_program_change_messages(cfg):
track.append(msg)
# data = data.replace("<start>", "").replace("<end>", "").replace("<pad>", "").strip()
for msg in str_to_midi_messages(utils, data):
track.append(msg)
track.append(mido.MetaMessage("end_of_track", time=0))
return mid
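
During decoding, every accumulated delay is converted to MIDI ticks with a fixed resolution of 480 ticks per beat and a tempo of 500000 microseconds per beat, then truncated with `int(...)`. The sketch below reproduces that unit conversion without depending on `mido`, assuming `mido.second2tick` performs the usual seconds-to-ticks scaling. `ms_to_ticks` and `wait_token_to_ms` are hypothetical helper names, and the defaults come from the code and config in this diff.

```python
def wait_token_to_ms(token, max_wait_ms=1000, wait_events=125):
    """Convert a wait token such as 't125' to milliseconds."""
    return max_wait_ms / wait_events * int(token[1:])


def ms_to_ticks(delta_ms, ticks_per_beat=480, tempo_us_per_beat=500000):
    """Convert milliseconds to whole MIDI ticks, truncating like int() does."""
    # ms -> microseconds, then microseconds -> beats -> ticks
    return int(delta_ms * 1000.0 * ticks_per_beat / tempo_us_per_beat)
```

At these settings one second is 960 ticks, and a single 8 ms wait step is 7.68 ticks, which truncates to 7. Since `delta_accum` is reset to zero after each message, the fractional part is dropped and not carried forward, so long pieces built from many short waits may run slightly fast.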


@@ -0,0 +1,303 @@
{
"note_events": 128,
"wait_events": 125,
"max_wait_time": 1000,
"velocity_events": 128,
"velocity_bins": 12,
"velocity_exp": 0.5,
"do_token_sorting": true,
"unrolled_tokens": false,
"decode_end_held_note_delay": 5.0,
"decode_fix_repeated_notes": true,
"bin_instrument_names": [
"percussion",
"drum",
"tuba",
"marimba",
"bass",
"guitar",
"violin",
"trumpet",
"piano",
"sax",
"flute",
"lead",
"pad"
],
"ch10_instrument_bin_name": "percussion",
"program_name_to_bin_name": {
"Acoustic Grand Piano": "piano",
"Bright Acoustic Piano": "piano",
"Electric Grand Piano": "piano",
"Honky-tonk Piano": "piano",
"Electric Piano 1 (Rhodes Piano)": "piano",
"Electric Piano 2 (Chorused Piano)": "piano",
"Harpsichord": "piano",
"Clavinet": "piano",
"Celesta": "marimba",
"Glockenspiel": "marimba",
"Music Box": "marimba",
"Vibraphone": "marimba",
"Marimba": "marimba",
"Xylophone": "marimba",
"Tubular Bells": "marimba",
"Dulcimer (Santur)": "marimba",
"Drawbar Organ (Hammond)": "marimba",
"Percussive Organ": "piano",
"Rock Organ": "piano",
"Church Organ": "piano",
"Reed Organ": "piano",
"Accordion (French)": "piano",
"Harmonica": "piano",
"Tango Accordion (Band neon)": "piano",
"Acoustic Guitar (nylon)": "guitar",
"Acoustic Guitar (steel)": "guitar",
"Electric Guitar (jazz)": "guitar",
"Electric Guitar (clean)": "guitar",
"Electric Guitar (muted)": "guitar",
"Overdriven Guitar": "guitar",
"Distortion Guitar": "guitar",
"Guitar harmonics": "guitar",
"Acoustic Bass": "bass",
"Electric Bass (fingered)": "bass",
"Electric Bass (picked)": "bass",
"Fretless Bass": "bass",
"Slap Bass 1": "bass",
"Slap Bass 2": "bass",
"Synth Bass 1": "bass",
"Synth Bass 2": "bass",
"Violin": "violin",
"Viola": "violin",
"Cello": "bass",
"Contrabass": "bass",
"Tremolo Strings": "violin",
"Pizzicato Strings": "violin",
"Orchestral Harp": "violin",
"Timpani": "drum",
"String Ensemble 1 (strings)": "violin",
"String Ensemble 2 (slow strings)": "violin",
"SynthStrings 1": "violin",
"SynthStrings 2": "violin",
"Choir Aahs": "violin",
"Voice Oohs": "violin",
"Synth Voice": "violin",
"Orchestra Hit": "",
"Trumpet": "trumpet",
"Trombone": "tuba",
"Tuba": "tuba",
"Muted Trumpet": "trumpet",
"French Horn": "trumpet",
"Brass Section": "trumpet",
"SynthBrass 1": "trumpet",
"SynthBrass 2": "trumpet",
"Soprano Sax": "sax",
"Alto Sax": "sax",
"Tenor Sax": "sax",
"Baritone Sax": "sax",
"Oboe": "sax",
"English Horn": "trumpet",
"Bassoon": "sax",
"Clarinet": "sax",
"Piccolo": "flute",
"Flute": "flute",
"Recorder": "flute",
"Pan Flute": "flute",
"Blown Bottle": "flute",
"Shakuhachi": "flute",
"Whistle": "flute",
"Ocarina": "flute",
"Lead 1 (square wave)": "lead",
"Lead 2 (sawtooth wave)": "lead",
"Lead 3 (calliope)": "lead",
"Lead 4 (chiffer)": "lead",
"Lead 5 (charang)": "lead",
"Lead 6 (voice solo)": "violin",
"Lead 7 (fifths)": "lead",
"Lead 8 (bass + lead)": "lead",
"Pad 1 (new age Fantasia)": "pad",
"Pad 2 (warm)": "pad",
"Pad 3 (polysynth)": "pad",
"Pad 4 (choir space voice)": "violin",
"Pad 5 (bowed glass)": "pad",
"Pad 6 (metallic pro)": "pad",
"Pad 7 (halo)": "pad",
"Pad 8 (sweep)": "pad",
"FX 1 (rain)": "",
"FX 2 (soundtrack)": "",
"FX 3 (crystal)": "",
"FX 4 (atmosphere)": "",
"FX 5 (brightness)": "",
"FX 6 (goblins)": "",
"FX 7 (echoes, drops)": "",
"FX 8 (sci-fi, star theme)": "",
"Sitar": "guitar",
"Banjo": "guitar",
"Shamisen": "guitar",
"Koto": "guitar",
"Kalimba": "guitar",
"Bag pipe": "sax",
"Fiddle": "violin",
"Shanai": "sax",
"Tinkle Bell": "marimba",
"Agogo": "marimba",
"Steel Drums": "marimba",
"Woodblock": "marimba",
"Taiko Drum": "drum",
"Melodic Tom": "drum",
"Synth Drum": "drum",
"Reverse Cymbal": "",
"Guitar Fret Noise": "",
"Breath Noise": "",
"Seashore": "",
"Bird Tweet": "",
"Telephone Ring": "",
"Helicopter": "",
"Applause": "",
"Gunshot": ""
},
"bin_name_to_program_name": {
"piano": "Acoustic Grand Piano",
"marimba": "Marimba",
"drum": "Synth Drum",
"guitar": "Acoustic Guitar (steel)",
"bass": "Acoustic Bass",
"violin": "Violin",
"percussion": "",
"trumpet": "Trumpet",
"tuba": "Tuba",
"sax": "Tenor Sax",
"flute": "Flute",
"lead": "Lead 1 (square wave)",
"pad": "Pad 1 (new age Fantasia)"
},
"instrument_names": {
"0": "Acoustic Grand Piano",
"1": "Bright Acoustic Piano",
"2": "Electric Grand Piano",
"3": "Honky-tonk Piano",
"4": "Electric Piano 1 (Rhodes Piano)",
"5": "Electric Piano 2 (Chorused Piano)",
"6": "Harpsichord",
"7": "Clavinet",
"8": "Celesta",
"9": "Glockenspiel",
"10": "Music Box",
"11": "Vibraphone",
"12": "Marimba",
"13": "Xylophone",
"14": "Tubular Bells",
"15": "Dulcimer (Santur)",
"16": "Drawbar Organ (Hammond)",
"17": "Percussive Organ",
"18": "Rock Organ",
"19": "Church Organ",
"20": "Reed Organ",
"21": "Accordion (French)",
"22": "Harmonica",
"23": "Tango Accordion (Band neon)",
"24": "Acoustic Guitar (nylon)",
"25": "Acoustic Guitar (steel)",
"26": "Electric Guitar (jazz)",
"27": "Electric Guitar (clean)",
"28": "Electric Guitar (muted)",
"29": "Overdriven Guitar",
"30": "Distortion Guitar",
"31": "Guitar harmonics",
"32": "Acoustic Bass",
"33": "Electric Bass (fingered)",
"34": "Electric Bass (picked)",
"35": "Fretless Bass",
"36": "Slap Bass 1",
"37": "Slap Bass 2",
"38": "Synth Bass 1",
"39": "Synth Bass 2",
"40": "Violin",
"41": "Viola",
"42": "Cello",
"43": "Contrabass",
"44": "Tremolo Strings",
"45": "Pizzicato Strings",
"46": "Orchestral Harp",
"47": "Timpani",
"48": "String Ensemble 1 (strings)",
"49": "String Ensemble 2 (slow strings)",
"50": "SynthStrings 1",
"51": "SynthStrings 2",
"52": "Choir Aahs",
"53": "Voice Oohs",
"54": "Synth Voice",
"55": "Orchestra Hit",
"56": "Trumpet",
"57": "Trombone",
"58": "Tuba",
"59": "Muted Trumpet",
"60": "French Horn",
"61": "Brass Section",
"62": "SynthBrass 1",
"63": "SynthBrass 2",
"64": "Soprano Sax",
"65": "Alto Sax",
"66": "Tenor Sax",
"67": "Baritone Sax",
"68": "Oboe",
"69": "English Horn",
"70": "Bassoon",
"71": "Clarinet",
"72": "Piccolo",
"73": "Flute",
"74": "Recorder",
"75": "Pan Flute",
"76": "Blown Bottle",
"77": "Shakuhachi",
"78": "Whistle",
"79": "Ocarina",
"80": "Lead 1 (square wave)",
"81": "Lead 2 (sawtooth wave)",
"82": "Lead 3 (calliope)",
"83": "Lead 4 (chiffer)",
"84": "Lead 5 (charang)",
"85": "Lead 6 (voice solo)",
"86": "Lead 7 (fifths)",
"87": "Lead 8 (bass + lead)",
"88": "Pad 1 (new age Fantasia)",
"89": "Pad 2 (warm)",
"90": "Pad 3 (polysynth)",
"91": "Pad 4 (choir space voice)",
"92": "Pad 5 (bowed glass)",
"93": "Pad 6 (metallic pro)",
"94": "Pad 7 (halo)",
"95": "Pad 8 (sweep)",
"96": "FX 1 (rain)",
"97": "FX 2 (soundtrack)",
"98": "FX 3 (crystal)",
"99": "FX 4 (atmosphere)",
"100": "FX 5 (brightness)",
"101": "FX 6 (goblins)",
"102": "FX 7 (echoes, drops)",
"103": "FX 8 (sci-fi, star theme)",
"104": "Sitar",
"105": "Banjo",
"106": "Shamisen",
"107": "Koto",
"108": "Kalimba",
"109": "Bag pipe",
"110": "Fiddle",
"111": "Shanai",
"112": "Tinkle Bell",
"113": "Agogo",
"114": "Steel Drums",
"115": "Woodblock",
"116": "Taiko Drum",
"117": "Melodic Tom",
"118": "Synth Drum",
"119": "Reverse Cymbal",
"120": "Guitar Fret Noise",
"121": "Breath Noise",
"122": "Seashore",
"123": "Bird Tweet",
"124": "Telephone Ring",
"125": "Helicopter",
"126": "Applause",
"127": "Gunshot"
}
}
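
The three tables above can be chained to resolve a MIDI program number to an instrument bin. The following is a minimal Python sketch, not code from this repository. The `program_to_bin` helper is hypothetical, the inline `config` is a three-entry excerpt of the full tables, and two behaviours are assumptions: that zero-based channel 9 (MIDI channel 10) always maps to `ch10_instrument_bin_name`, and that an empty bin name means the program has no bin.

```python
import json
from typing import Optional

# Hypothetical three-entry excerpt; the real config lists all 128 programs.
config = json.loads("""
{
  "ch10_instrument_bin_name": "percussion",
  "program_name_to_bin_name": {
    "Acoustic Grand Piano": "piano",
    "Violin": "violin",
    "Gunshot": ""
  },
  "instrument_names": {
    "0": "Acoustic Grand Piano",
    "40": "Violin",
    "127": "Gunshot"
  }
}
""")


def program_to_bin(cfg: dict, program: int, channel: int = 0) -> Optional[str]:
    """Return the bin name for a MIDI program number, or None if unmapped."""
    # Assumption: channel index 9 (MIDI channel 10) is reserved for percussion.
    if channel == 9:
        return cfg["ch10_instrument_bin_name"]
    # instrument_names is keyed by the program number as a string.
    name = cfg["instrument_names"].get(str(program))
    if name is None:
        return None
    # Assumption: an empty string in the table means "no bin for this program".
    return cfg["program_name_to_bin_name"].get(name) or None
```

With this excerpt, `program_to_bin(config, 40)` resolves through "Violin" to the `violin` bin, while program 127 ("Gunshot") yields `None` because its bin name is empty.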

@@ -1,14 +1,16 @@
from abc import ABC, abstractmethod
from enum import Enum, auto
import os
import pathlib
import copy
from typing import Dict, List, Tuple
import re
from typing import Dict, Iterable, List, Tuple, Union
from utils.log import quick_log
from fastapi import HTTPException
from pydantic import BaseModel, Field
import torch
import numpy as np
from rwkv_pip.utils import PIPELINE
from routes import state_cache
import global_var
END_OF_TEXT = 0
@@ -18,9 +20,26 @@ END_OF_LINE_DOUBLE = 535
os.environ["TORCH_EXTENSIONS_DIR"] = f"{pathlib.Path(__file__).parent.parent.resolve()}"
class RWKV:
def __init__(self, model: str, strategy: str, tokens_path: str) -> None:
from rwkv.model import RWKV as Model # dynamic import to make RWKV_CUDA_ON work
class RWKVType(Enum):
Raven = auto()
World = auto()
Music = auto()
class AbstractRWKV(ABC):
def __init__(self, model: str, strategy: str, tokens_path: str):
rwkv_beta = global_var.get(global_var.Args).rwkv_beta
# dynamic import to make RWKV_CUDA_ON work
if rwkv_beta:
from rwkv_pip.beta.model import (
RWKV as Model,
)
else:
from rwkv.model import (
RWKV as Model,
)
from rwkv_pip.utils import PIPELINE
filename, _ = os.path.splitext(os.path.basename(model))
self.name = filename
@@ -28,102 +47,54 @@ class RWKV:
self.pipeline = PIPELINE(self.model, tokens_path)
self.model_state = None
self.model_tokens = []
self.CHUNK_LEN = 256
self.rwkv_type: RWKVType = None
self.max_tokens_per_generation = 500
self.temperature = 1
self.top_p = 0.5
self.penalty_alpha_presence = 0.4
self.penalty_alpha_frequency = 0.4
self.top_p = 0.3
self.top_k = 0
self.penalty_alpha_presence = 0
self.penalty_alpha_frequency = 1
self.interface = ":"
if "world" in self.name.lower():
self.user = "Question"
self.bot = "Answer"
self.END_OF_LINE = 11
else:
self.user = "Bob"
self.bot = "Alice"
self.END_OF_LINE = 187
@abstractmethod
def adjust_occurrence(self, occurrence: Dict, token: int):
pass
self.AVOID_REPEAT_TOKENS = []
AVOID_REPEAT = ""
for i in AVOID_REPEAT:
dd = self.pipeline.encode(i)
assert len(dd) == 1
self.AVOID_REPEAT_TOKENS += dd
self.preload()
def preload(self):
interface = self.interface
user = self.user
bot = self.bot
preset_system = (
f"""
The following is a coherent verbose detailed conversation between a girl named {bot} and her friend {user}. \
{bot} is very intelligent, creative and friendly. \
{bot} is unlikely to disagree with {user}, and {bot} doesn't like to ask {user} questions. \
{bot} likes to tell {user} a lot about herself and her opinions. \
{bot} usually gives {user} kind, helpful and informative advices.\n
"""
if self.user == "Bob"
else f"{user}{interface} hi\n\n{bot}{interface} Hi. "
+ "I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.\n\n"
)
logits, _ = self.run_rnn(self.fix_tokens(self.pipeline.encode(preset_system)))
try:
state_cache.add_state(
state_cache.AddStateBody(
prompt=preset_system,
tokens=self.model_tokens,
state=self.model_state,
logits=logits,
)
)
except HTTPException:
pass
@abstractmethod
def adjust_forward_logits(self, logits: List[float], occurrence: Dict, i: int):
pass
# Model only saw '\n\n' as [187, 187] before, but the tokenizer outputs [535] for it at the end
def fix_tokens(self, tokens):
if "world" in self.name.lower():
return tokens
if len(tokens) > 0 and tokens[-1] == END_OF_LINE_DOUBLE:
tokens = tokens[:-1] + [self.END_OF_LINE, self.END_OF_LINE]
return tokens
@abstractmethod
def fix_tokens(self, tokens) -> List[int]:
pass
def run_rnn(self, _tokens: List[str], newline_adj: int = 0):
tokens = [int(x) for x in _tokens]
token_len = len(tokens)
self.model_tokens += tokens
@abstractmethod
def run_rnn(
self, _tokens: List[str], newline_adj: int = 0
) -> Tuple[List[float], int]:
pass
while len(tokens) > 0:
out, self.model_state = self.model.forward(
tokens[: self.CHUNK_LEN], self.model_state
)
tokens = tokens[self.CHUNK_LEN :]
out[self.END_OF_LINE] += newline_adj # adjust \n probability
if self.model_tokens[-1] in self.AVOID_REPEAT_TOKENS:
out[self.model_tokens[-1]] = -999999999
return out, token_len
@abstractmethod
def delta_postprocess(self, delta: str) -> str:
pass
def get_embedding(self, input: str, fast_mode: bool) -> Tuple[List[float], int]:
if fast_mode:
embedding, token_len = self.fast_embedding(
embedding, token_len = self.__fast_embedding(
self.fix_tokens(self.pipeline.encode(input)), None
)
else:
self.model_state = None
self.model_tokens = []
_, token_len = self.run_rnn(self.fix_tokens(self.pipeline.encode(input)))
embedding = self.model_state[-5].tolist()
embedding = self.model_state[-11].tolist()
embedding = (embedding / np.linalg.norm(embedding)).tolist()
return embedding, token_len
def fast_embedding(self, tokens: List[str], state):
def __fast_embedding(self, tokens: List[str], state):
import torch
tokens = [int(x) for x in tokens]
token_len = len(tokens)
self = self.model
@@ -260,7 +231,9 @@ The following is a coherent verbose detailed conversation between a girl named {
return state[0].tolist(), token_len
def generate(self, prompt: str, stop: str = None):
def generate(
self, prompt: str, stop: Union[str, List[str], None] = None
) -> Iterable[Tuple[str, str, int, int]]:
quick_log(None, None, "Generation Prompt:\n" + prompt)
cache = None
delta_prompt = prompt
@@ -304,46 +277,60 @@ The following is a coherent verbose detailed conversation between a girl named {
completion_token_len = 0
response = ""
for i in range(self.max_tokens_per_generation):
for n in occurrence:
logits[n] -= (
self.penalty_alpha_presence
+ occurrence[n] * self.penalty_alpha_frequency
)
self.adjust_forward_logits(logits, occurrence, i)
token = self.pipeline.sample_logits(
logits, temperature=self.temperature, top_p=self.top_p
logits, temperature=self.temperature, top_p=self.top_p, top_k=self.top_k
)
if token == END_OF_TEXT:
yield response, "", prompt_token_len, completion_token_len
break
for xxx in occurrence:
occurrence[xxx] *= 0.996
if token not in occurrence:
occurrence[token] = 1
else:
occurrence[token] += 1
self.adjust_occurrence(occurrence, token)
logits, _ = self.run_rnn([token])
completion_token_len = completion_token_len + 1
delta: str = self.pipeline.decode(self.model_tokens[out_last:])
delta: str = self.delta_postprocess(
self.pipeline.decode(self.model_tokens[out_last:])
)
if "\ufffd" not in delta: # avoid utf-8 display issues
response += delta
if stop is not None:
if stop in response:
try:
state_cache.add_state(
state_cache.AddStateBody(
prompt=prompt + response,
tokens=self.model_tokens,
state=self.model_state,
logits=logits,
if isinstance(stop, str):
if stop in response:
try:
state_cache.add_state(
state_cache.AddStateBody(
prompt=prompt + response,
tokens=self.model_tokens,
state=self.model_state,
logits=logits,
)
)
)
except HTTPException:
pass
response = response.split(stop)[0]
yield response, "", prompt_token_len, completion_token_len
break
except HTTPException:
pass
response = response.split(stop)[0]
yield response, "", prompt_token_len, completion_token_len
break
elif isinstance(stop, list):
# escape each stop string so regex metacharacters are matched literally
stop_exist_regex = "|".join(re.escape(s) for s in stop)
matched = re.search(stop_exist_regex, response)
if matched:
try:
state_cache.add_state(
state_cache.AddStateBody(
prompt=prompt + response,
tokens=self.model_tokens,
state=self.model_state,
logits=logits,
)
)
except HTTPException:
pass
response = response.split(matched.group())[0]
yield response, "", prompt_token_len, completion_token_len
break
out_last = begin + i + 1
if i == self.max_tokens_per_generation - 1:
try:
@@ -360,6 +347,169 @@ The following is a coherent verbose detailed conversation between a girl named {
yield response, delta, prompt_token_len, completion_token_len
class TextRWKV(AbstractRWKV):
def __init__(self, model: str, strategy: str, tokens_path: str) -> None:
super().__init__(model, strategy, tokens_path)
self.CHUNK_LEN = 256
self.max_tokens_per_generation = 500
self.temperature = 1
self.top_p = 0.3
self.top_k = 0
self.penalty_alpha_presence = 0
self.penalty_alpha_frequency = 1
self.interface = ":"
if "world" in self.name.lower():
self.rwkv_type = RWKVType.World
self.user = "Question"
self.bot = "Answer"
self.END_OF_LINE = 11
else:
self.rwkv_type = RWKVType.Raven
self.user = "Bob"
self.bot = "Alice"
self.END_OF_LINE = 187
self.AVOID_REPEAT_TOKENS = []
AVOID_REPEAT = ""
for i in AVOID_REPEAT:
dd = self.pipeline.encode(i)
assert len(dd) == 1
self.AVOID_REPEAT_TOKENS += dd
self.__preload()
def adjust_occurrence(self, occurrence: Dict, token: int):
for xxx in occurrence:
occurrence[xxx] *= 0.996
if token not in occurrence:
occurrence[token] = 1
else:
occurrence[token] += 1
def adjust_forward_logits(self, logits: List[float], occurrence: Dict, i: int):
for n in occurrence:
logits[n] -= (
self.penalty_alpha_presence
+ occurrence[n] * self.penalty_alpha_frequency
)
if i == 0:
for token in self.model_tokens:
token = int(token)
for xxx in occurrence:
occurrence[xxx] *= 0.996
if token not in occurrence:
occurrence[token] = 1
else:
occurrence[token] += 1
# Model only saw '\n\n' as [187, 187] before, but the tokenizer outputs [535] for it at the end
def fix_tokens(self, tokens) -> List[int]:
if self.rwkv_type == RWKVType.World:
return tokens
if len(tokens) > 0 and tokens[-1] == END_OF_LINE_DOUBLE:
tokens = tokens[:-1] + [self.END_OF_LINE, self.END_OF_LINE]
return tokens
def run_rnn(
self, _tokens: List[str], newline_adj: int = 0
) -> Tuple[List[float], int]:
tokens = [int(x) for x in _tokens]
token_len = len(tokens)
self.model_tokens += tokens
while len(tokens) > 0:
out, self.model_state = self.model.forward(
tokens[: self.CHUNK_LEN], self.model_state
)
tokens = tokens[self.CHUNK_LEN :]
out[self.END_OF_LINE] += newline_adj # adjust \n probability
if self.model_tokens[-1] in self.AVOID_REPEAT_TOKENS:
out[self.model_tokens[-1]] = -999999999
return out, token_len
def delta_postprocess(self, delta: str) -> str:
return delta
def __preload(self):
interface = self.interface
user = self.user
bot = self.bot
preset_system = (
f"""
The following is a coherent verbose detailed conversation between a girl named {bot} and her friend {user}. \
{bot} is very intelligent, creative and friendly. \
{bot} is unlikely to disagree with {user}, and {bot} doesn't like to ask {user} questions. \
{bot} likes to tell {user} a lot about herself and her opinions. \
{bot} usually gives {user} kind, helpful and informative advices.\n
"""
if self.rwkv_type == RWKVType.Raven
else (
f"{user}{interface} hi\n\n{bot}{interface} Hi. "
+ "I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.\n\n"
)
)
logits, _ = self.run_rnn(self.fix_tokens(self.pipeline.encode(preset_system)))
try:
state_cache.add_state(
state_cache.AddStateBody(
prompt=preset_system,
tokens=self.model_tokens,
state=self.model_state,
logits=logits,
)
)
except HTTPException:
pass
class MusicRWKV(AbstractRWKV):
def __init__(self, model: str, strategy: str, tokens_path: str):
super().__init__(model, strategy, tokens_path)
self.max_tokens_per_generation = 500
self.temperature = 1
self.top_p = 0.8
self.top_k = 8
self.rwkv_type = RWKVType.Music
def adjust_occurrence(self, occurrence: Dict, token: int):
for n in occurrence:
occurrence[n] *= 0.997 #### decay repetition penalty
if token >= 128 or token == 127:
occurrence[token] = 1 + (occurrence[token] if token in occurrence else 0)
else:
occurrence[token] = 0.3 + (occurrence[token] if token in occurrence else 0)
def adjust_forward_logits(self, logits: List[float], occurrence: Dict, i: int):
for n in occurrence:
logits[n] -= 0 + occurrence[n] * 0.5
logits[0] += (i - 2000) / 500 # try not to be too short or too long
logits[127] -= 1 # avoid "t125"
def fix_tokens(self, tokens) -> List[int]:
return tokens
def run_rnn(
self, _tokens: List[str], newline_adj: int = 0
) -> Tuple[List[float], int]:
tokens = [int(x) for x in _tokens]
token_len = len(tokens)
self.model_tokens += tokens
out, self.model_state = self.model.forward(tokens, self.model_state)
return out, token_len
def delta_postprocess(self, delta: str) -> str:
return " " + delta
class ModelConfigBody(BaseModel):
max_tokens: int = Field(default=None, gt=0, le=102400)
temperature: float = Field(default=None, ge=0, le=2)
@@ -379,7 +529,7 @@ class ModelConfigBody(BaseModel):
}
def set_rwkv_config(model: RWKV, body: ModelConfigBody):
def set_rwkv_config(model: AbstractRWKV, body: ModelConfigBody):
if body.max_tokens is not None:
model.max_tokens_per_generation = body.max_tokens
if body.temperature is not None:
@@ -395,7 +545,7 @@ def set_rwkv_config(model: RWKV, body: ModelConfigBody):
model.penalty_alpha_frequency = body.frequency_penalty
def get_rwkv_config(model: RWKV) -> ModelConfigBody:
def get_rwkv_config(model: AbstractRWKV) -> ModelConfigBody:
return ModelConfigBody(
max_tokens=model.max_tokens_per_generation,
temperature=model.temperature,
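
A minimal sketch of the stop-sequence handling in `generate`, assuming stop sequences are literal strings rather than regular expressions. The `truncate_at_stop` helper is hypothetical and not part of the diff; it escapes each entry with `re.escape` so that characters such as `+` or `(` are matched literally, and it returns the text before the earliest match.

```python
import re
from typing import List, Optional, Tuple, Union


def truncate_at_stop(
    response: str, stop: Union[str, List[str], None]
) -> Tuple[str, bool]:
    """Cut `response` at the earliest stop sequence.

    Returns (text, hit), where `hit` is True if a stop sequence was found.
    """
    if stop is None:
        return response, False
    # Normalise a single string to a one-element list and drop empty entries,
    # since an empty pattern would match at position 0.
    stops = [stop] if isinstance(stop, str) else list(stop)
    stops = [s for s in stops if s]
    if not stops:
        return response, False
    # Escape every entry so it is treated as literal text, not as a regex.
    pattern = "|".join(re.escape(s) for s in stops)
    matched: Optional[re.Match] = re.search(pattern, response)
    if matched is None:
        return response, False
    return response[: matched.start()], True
```

For example, `truncate_at_stop("Hello\n\nUser: hi", ["\n\nUser", "\n\nBob"])` keeps only `"Hello"` and reports that a stop sequence was hit.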

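The repetition penalty implemented by `TextRWKV.adjust_occurrence` and `TextRWKV.adjust_forward_logits` can be sketched in isolation as follows. This is an illustrative standalone version: the function names `record_token` and `apply_penalty` are hypothetical, and the defaults (decay 0.996, presence 0, frequency 1) are the values that appear in the diff above.

```python
from typing import Dict, List


def record_token(
    occurrence: Dict[int, float], token: int, decay: float = 0.996
) -> None:
    """Decay all previous counts, then count the newly sampled token."""
    # Older tokens matter less with every step.
    for tok in occurrence:
        occurrence[tok] *= decay
    occurrence[token] = occurrence.get(token, 0.0) + 1.0


def apply_penalty(
    logits: List[float],
    occurrence: Dict[int, float],
    presence: float = 0.0,
    frequency: float = 1.0,
) -> None:
    """Lower the logit of every token seen so far, in place."""
    for tok, count in occurrence.items():
        # Flat presence penalty plus a penalty proportional to the decayed count.
        logits[tok] -= presence + count * frequency


# Usage: token 2 is sampled twice, so its logit is pushed down before the
# next sampling step while unseen tokens are left untouched.
occurrence: Dict[int, float] = {}
logits = [0.0, 0.0, 0.0, 0.0]
record_token(occurrence, 2)
record_token(occurrence, 2)
apply_penalty(logits, occurrence)
```

Keeping the bookkeeping (`record_token`) separate from the logit adjustment (`apply_penalty`) mirrors the two abstract hooks in `AbstractRWKV`, which is what lets `MusicRWKV` swap in its own weights and decay.
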
File diff suppressed because it is too large

@@ -1,6 +1,6 @@
For Mac and Linux users, please manually install Python 3.10 (usually the latest systems come with it built-in). You can specify the Python interpreter to use in Settings.
对于Mac和Linux用户请手动安装 Python3.10 (通常最新的系统已经内置了). 你可以在设置中指定使用的Python解释器.
MacおよびLinuxのユーザーの方は、Python3.10を手動でインストールしてください(通常、最新のシステムには既に組み込まれています)。 設定メニューで使用するPythonインタプリタを指定することができます。
For Mac and Linux users, please manually install Python 3.10 (usually the latest systems come with it built-in). You can specify the Python interpreter to use in Settings. (which python3)
对于Mac和Linux用户请手动安装 Python3.10 (通常最新的系统已经内置了). 你可以在设置中指定使用的Python解释器. (which python3)
MacおよびLinuxのユーザーの方は、Python3.10を手動でインストールしてください(通常、最新のシステムには既に組み込まれています)。 設定メニューで使用するPythonインタプリタを指定することができます。 (which python3)
Please execute this program in an empty directory. All related dependencies will be placed in this directory.
请将本程序放在一个空目录内执行, 所有相关依赖均会放置于此目录.

@@ -1,3 +1,5 @@
echo $@
if [[ ${cnMirror} == 1 ]]; then
export PIP_INDEX_URL="https://pypi.tuna.tsinghua.edu.cn/simple"
if grep -q "mirrors.aliyun.com" /etc/apt/sources.list; then

@@ -184,7 +184,7 @@ if __name__ == "__main__":
args.num_sanity_val_steps = 0
args.check_val_every_n_epoch = int(1e20)
args.log_every_n_steps = int(1e20)
args.max_epochs = -1 # continue forever
args.max_epochs = args.epoch_count # stop after epoch_count epochs
args.betas = (args.beta1, args.beta2)
args.real_bsz = int(args.num_nodes) * int(args.devices) * args.micro_bsz
os.environ["RWKV_T_MAX"] = str(args.ctx_len)
@@ -373,7 +373,7 @@ if __name__ == "__main__":
for param in module.parameters():
param.requires_grad = True
elif enable_time_finetune and any(
n.startswith("time") for n, _ in module.named_parameters()
n.startswith("time") for n, _ in module.named_parameters()
):
for pname, param in module.named_parameters():
if pname.startswith("time"):
@@ -381,7 +381,7 @@ if __name__ == "__main__":
param.requires_grad = True
if (
len(args.load_model) == 0 or args.my_pile_stage == 1
len(args.load_model) == 0 or args.my_pile_stage == 1
): # shall we build the initial weights?
init_weight_name = f"{args.proj_dir}/rwkv-init.pth"
generate_init_weight(model, init_weight_name) # save initial weights
@@ -423,8 +423,8 @@ if __name__ == "__main__":
)
if (
args.lr_init > 1e-4
or trainer.world_size * args.micro_bsz * trainer.accumulate_grad_batches < 8
args.lr_init > 1e-4
or trainer.world_size * args.micro_bsz * trainer.accumulate_grad_batches < 8
):
if "I_KNOW_WHAT_IM_DOING" in os.environ:
if trainer.global_rank == 0:
@@ -459,10 +459,10 @@ if __name__ == "__main__":
if "deepspeed" in args.strategy:
trainer.strategy.config["zero_optimization"]["allgather_bucket_size"] = (
args.ds_bucket_mb * 1000 * 1000
args.ds_bucket_mb * 1000 * 1000
)
trainer.strategy.config["zero_optimization"]["reduce_bucket_size"] = (
args.ds_bucket_mb * 1000 * 1000
args.ds_bucket_mb * 1000 * 1000
)
# must set shuffle=False, persistent_workers=False (because worker is in another thread)

File diff suppressed because it is too large

@@ -11,11 +11,13 @@
"dependencies": {
"@fluentui/react-components": "^9.20.0",
"@fluentui/react-icons": "^2.0.201",
"@magenta/music": "^1.23.1",
"@microsoft/fetch-event-source": "^2.0.1",
"@primer/octicons-react": "^19.1.0",
"chart.js": "^4.3.0",
"classnames": "^2.3.2",
"github-markdown-css": "^5.2.0",
"html-midi-player": "^1.5.0",
"i18next": "^22.4.15",
"mobx": "^6.9.0",
"mobx-react-lite": "^3.4.3",

@@ -0,0 +1,257 @@
{
"Home": "ホーム",
"Train": "トレーニング",
"About": "このアプリについて",
"Settings": "設定",
"Go to chat page": "チャットページに移動する",
"Manage your configs": "あなたの設定を管理する",
"Manage models": "モデルの管理",
"Run": "実行",
"Offline": "オフライン",
"Starting": "起動中",
"Loading": "モデルを読み込み中",
"Working": "動作中",
"Stop": "停止",
"Enable High Precision For Last Layer": "最後の層で高精度を有効にする",
"Stored Layers": "メモリ層読み込み",
"Precision": "精度",
"Device": "デバイス",
"Convert model with these configs. Using a converted model will greatly improve the loading speed, but model parameters of the converted model cannot be modified.": "これらの設定でモデルを変換します。変換されたモデルを使用すると、読み込み速度が大幅に向上しますが、変換したモデルのパラメータを変更することはできません。",
"Manage Models": "モデルの管理",
"Model": "モデル",
"Model Parameters": "モデルのパラメータ",
"Frequency Penalty": "頻度のペナルティ",
"Presence Penalty": "存在のペナルティ",
"Top_P": "Top_P",
"Temperature": "温度",
"Max Response Token": "最大レスポンストークン",
"API Port": "API ポート",
"Hover your mouse over the text to view a detailed description. Settings marked with * will take effect immediately after being saved.": "マウスをテキストに一定時間置いて詳細な説明を表示します。 * が付いている設定は保存後すぐに有効化されます。",
"Default API Parameters": "デフォルトのAPIパラメータ",
"Provide JSON file URLs for the models manifest. Separate URLs with semicolons. The \"models\" field in JSON files will be parsed into the following table.": "モデルマニフェストのためのJSONファイルURLを提供します。URLはセミコロンで分割します。JSONファイルの\"models\"フィールドは次の表に解析されます。",
"Config Name": "構成名",
"Refresh": "リフレッシュ",
"Save Config": "構成を保存",
"Model Source Manifest List": "モデルソースマニフェストリスト",
"Models": "モデル",
"Delete Config": "設定を削除",
"Help": "ヘルプ",
"Version": "バージョン",
"New Config": "新たな設定",
"Open Url": "URLを開く",
"Download": "ダウンロード",
"Open Folder": "フォルダを開く",
"Configs": "設定",
"Automatic Updates Check": "自動更新チェック",
"Updates Check Error": "更新チェックエラー",
"Introduction": "序文",
"Dark Mode": "ダークモード",
"Language": "言語",
"In Development": "開発中",
"Chat": "チャット",
"Convert": "変換",
"Actions": "操作",
"Last updated": "最後に更新",
"Desc": "説明",
"Size": "サイズ",
"File": "ファイル",
"Config Saved": "設定が保存されました",
"Downloading": "ダウンロード中",
"Loading Model": "モデルを読み込んでいます",
"Startup Completed": "起動完了",
"Failed to switch model": "モデルの切り替えに失敗しました",
"Start Converting": "変換を開始",
"Convert Success": "変換成功",
"Convert Failed": "変換失敗",
"Model Not Found": "モデルが見つかりません",
"Model Status": "モデルの状態",
"Clear": "クリア",
"Send": "送信",
"Type your message here": "ここにメッセージを入力してください",
"Copy": "コピー",
"Read Aloud": "読み上げ",
"Hello! I'm RWKV, an open-source and commercially usable large language model.": "こんにちは!私はRWKV、オープンソースで商用利用可能な大規模言語モデルです。",
"This tool's API is compatible with OpenAI API. It can be used with any ChatGPT tool you like. Go to the settings of some ChatGPT tool, replace the 'https://api.openai.com' part in the API address with '": "このツールのAPIはOpenAI APIと互換性があります。 お好きなChatGPTツールで使用することができます。いくつかのChatGPTツールの設定に移動し、APIアドレスの 'https://api.openai.com' 部分を '",
"New Version Available": "新しいバージョンが存在します",
"Update": "更新",
"Please click the button in the top right corner to start the model": "右上隅のボタンをクリックしてモデルを起動してください",
"Update Error": "更新エラー",
"Open the following URL with your browser to view the API documentation": "以下のURLをブラウザで開いてAPIドキュメンテーションを確認してください",
"By default, the maximum number of tokens that can be answered in a single response, it can be changed by the user by specifying API parameters.": "デフォルトでは、一度に回答できるトークンの最大数は、APIパラメータを指定することでユーザーが変更できます。",
"Sampling temperature, it's like giving alcohol to a model, the higher the stronger the randomness and creativity, while the lower, the more focused and deterministic it will be.": "サンプリング温度は、モデルにアルコールを与えるようなもので、高いほどランダム性と創造性が強く、低いほど焦点を絞り、決定論的になります。",
"Just like feeding sedatives to the model. Consider the results of the top n% probability mass, 0.1 considers the top 10%, with higher quality but more conservative, 1 considers all results, with lower quality but more diverse.": "モデルに鎮静剤を与えるようなものです。上位n%の確率質量の結果を考慮します。0.1は上位10%を考慮し、質は高いが保守的になり、1は全ての結果を考慮し、質は低いが多様になります。",
"Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.": "正の値を指定すると、新しいトークンがこれまでのテキストに出現したかどうかに基づいてペナルティが与えられ、モデルが新しいトピックについて話す可能性が高まります。",
"Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.": "正の値を指定すると、新しいトークンがこれまでのテキストに出現した頻度に基づいてペナルティが与えられ、モデルが同じ行をそのまま繰り返す可能性が低くなります。",
"int8 uses less VRAM, but has slightly lower quality. fp16 has higher quality, and fp32 has the best quality.": "int8はVRAMの使用量が少ないですが、質が若干低いです。fp16は高品質、fp32は最高品質です。",
"Number of the neural network layers loaded into VRAM, the more you load, the faster the speed, but it consumes more VRAM. (If your VRAM is not enough, it will fail to load)": "VRAMにロードされるニューラルネットワークの層の数。ロードする量が多いほど速度は速くなりますが、VRAMを多く消費します。(VRAMが不足している場合、ロードに失敗します)",
"Whether to use CPU to calculate the last output layer of the neural network with FP32 precision to obtain better quality.": "より良い品質を得るために、ニューラルネットワークの最終出力層をCPUを使用してFP32精度で計算するかどうか。",
"Downloads": "ダウンロード",
"Pause": "一時停止",
"Continue": "続行",
"Resume": "再開",
"Check": "確認",
"Model file not found": "モデルファイルが見つかりません",
"Can not find download url": "ダウンロードURLが見つかりません",
"Python target not found, would you like to download it?": "Pythonターゲットが見つかりません。ダウンロードしますか?",
"Python dependencies are incomplete, would you like to install them?": "Pythonの依存関係が不完全です。インストールしますか?",
"Install": "インストール",
"This is the latest version": "これは最新バージョンです",
"Use Tsinghua Pip Mirrors": "清華大学Pipミラーサーバーを使用",
"Model Config Exception": "モデル設定例外",
"Use Gitee Updates Source": "Gitee更新ソースを使用",
"Use Custom CUDA kernel to Accelerate": "カスタムCUDAカーネルを使用して加速",
"Enabling this option can greatly improve inference speed and save some VRAM, but there may be compatibility issues. If it fails to start, please turn off this option.": "このオプションを有効にすると、推論速度が大幅に向上し、一部のVRAMを節約できますが、互換性の問題が生じる可能性があります。起動に失敗した場合は、このオプションをオフにしてください。",
"Supported custom cuda file not found": "対応しているカスタムCUDAファイルが見つかりません",
"Failed to copy custom cuda file": "カスタムCUDAファイルのコピーに失敗しました",
"Downloading update, please wait. If it is not completed, please manually download the program from GitHub and replace the original program.": "更新をダウンロード中です、お待ちください。完了しない場合は、GitHubから手動でプログラムをダウンロードし、元のプログラムを置き換えてください。",
"Completion": "補完",
"Parameters": "パラメータ",
"Stop Sequences": "停止シーケンス",
"When this content appears in the response result, the generation will end.": "この内容が応答結果に表示されると、生成が終了します。",
"Reset": "リセット",
"Generate": "生成",
"Writer": "ライター",
"Translator": "翻訳者",
"Catgirl": "ネコガール",
"Code Generation": "コード生成",
"Werewolf": "人狼",
"Instruction": "指示",
"Blank": "空白",
"The following is an epic science fiction masterpiece that is immortalized, with delicate descriptions and grand depictions of interstellar civilization wars.\nChapter 1.\n": "以下は、壮大な描写と共に、不滅のエピックサイエンスフィクションの傑作で、星間文明戦争が繊細に描かれています。\n第1章\n",
"The following is a conversation between a cat girl and her owner. The cat girl is a humanized creature that behaves like a cat but is humanoid. At the end of each sentence in the dialogue, she will add \"Meow~\". In the following content, User represents the owner and Assistant represents the cat girl.\n\nUser: Hello.\n\nAssistant: I'm here, meow~.\n\nUser: Can you tell jokes?": "以下は、猫少女とその飼い主との会話です。猫少女は、猫のように振る舞いながらもヒトの姿をした生物です。会話の各文の終わりには必ず「にゃ〜」とつけています。以下の文章では、Userが飼い主、Assistantが猫少女を表しています。\n\nUser: こんにちは。\n\nAssistant: ここにいますよ、にゃ〜。\n\nUser: 笑い話を話せますか?",
"When response finished, inject this content.": "応答終了時に、この内容を注入します。",
"Inject start text": "開始テキストを注入",
"Inject end text": "終了テキストを注入",
"Before the response starts, inject this content.": "応答が始まる前に、この内容を注入します。",
"There is currently a game of Werewolf with six players, including a Seer (who can check identities at night), two Werewolves (who can choose someone to kill at night), a Bodyguard (who can choose someone to protect at night), two Villagers (with no special abilities), and a game host. User will play as Player 1, Assistant will play as Players 2-6 and the game host, and they will begin playing together. Every night, the host will ask User for his action and simulate the actions of the other players. During the day, the host will oversee the voting process and ask User for his vote. \n\nAssistant: Next, I will act as the game host and assign everyone their roles, including randomly assigning yours. Then, I will simulate the actions of Players 2-6 and let you know what happens each day. Based on your assigned role, you can tell me your actions and I will let you know the corresponding results each day.\n\nUser: Okay, I understand. Let's begin. Please assign me a role. Am I the Seer, Werewolf, Villager, or Bodyguard?\n\nAssistant: You are the Seer. Now that night has fallen, please choose a player to check his identity.\n\nUser: Tonight, I want to check Player 2 and find out his role.": "現在、6人のプレイヤーが参加する人狼ゲームが行われています。その中には、夜に任意のプレイヤーの正体を確認できる占い師、夜に誰かを殺すことができる人狼2名、夜に誰かを守ることができるボディガード、特殊な能力を持っていない村人2名、そしてゲームのホストがいます。Userはプレイヤー1として、Assistantはプレーヤー2から6まで及びゲームのホストとして参加し、一緒にゲームを始めます。ホストは毎晩、Userに彼の行動を問い、他のプレーヤーの行動をシミュレートします。昼には、ホストが投票プロセスを監督し、Userに彼の投票を求めます。\n\nAssistant: 次に、私はゲームのホストとして参加者全員に役割を割り当てることになります。それには、あなたの役割もランダムに割り当てます。その後、私はプレーヤー2から6の行動をシミュレートし、毎日何が起こったかを報告します。あなたに割り当てられた役割に基づいて、あなたの行動を教えてください。私は毎日、それに対する結果を報告します。\n\nUser: 了解しました。では、始めましょう。私の役割を割り当ててください。占い師、人狼、村人、ボディーガードのいずれなのでしょうか?\n\nAssistant: あなたの役割は占い師です。今夜が来たので、誰の正体を確認するか選んでください。\n\nUser: 今夜、プレイヤー2の役割を確認したい。",
"Writer, Translator, Role-playing": "ライター、翻訳者、ロールプレイング",
"Chinese Kongfu": "中国武術",
"Allow external access to the API (service must be restarted)": "APIへの外部アクセスを許可する (サービスを再起動する必要があります)",
"Custom": "カスタム",
"CUDA (Beta, Faster)": "CUDA (ベータ、高速)",
"Reset All Configs": "すべての設定をリセット",
"Cancel": "キャンセル",
"Confirm": "確認",
"Are you sure you want to reset all configs? This will obtain the latest preset configs, but will override your custom configs and cannot be undone.": "本当にすべての設定をリセットしますか?これにより最新のプリセット設定が取得されますが、カスタム設定は上書きされ、元に戻すことはできません。",
"Advanced": "高度な",
"Custom Python Path": "カスタムPythonパス",
"Custom Models Path": "カスタムモデルパス",
"Microsoft Visual C++ Redistributable is not installed, would you like to download it?": "Microsoft Visual C++ 再頒布可能パッケージがインストールされていません。ダウンロードしますか?",
"File Path Cannot Contain Space": "ファイルのパスにスペースを含めることはできません",
"Current Strategy": "現在の戦略",
"MacOS is not yet supported for performing this operation, please do it manually.": "MacOSはまだこの操作を実行するサポートがありませんので、手動で行ってください。",
"Linux is not yet supported for performing this operation, please do it manually.": "Linuxはまだこの操作を実行するサポートがありませんので、手動で行ってください。",
"On Linux system, you must manually install python dependencies.": "Linuxシステムでは、pythonの依存関係を手動でインストールする必要があります。",
"Update completed, please restart the program.": "更新が完了しました。プログラムを再起動してください。",
"Are you sure you want to reset this page? It cannot be undone.": "本当にこのページをリセットしてもよろしいですか?元に戻すことはできません。",
"Model file download is not complete": "モデルファイルのダウンロードが完了していません",
"Error": "エラー",
"Are you sure you want to clear the conversation? It cannot be undone.": "会話をクリアしてもよろしいですか?元に戻すことはできません。",
"Save": "保存",
"Conversation Saved": "会話が保存されました",
"Open": "開く",
"DPI Scaling": "DPIスケーリング",
"Restart the app to apply DPI Scaling.": "DPIスケーリングを適用するためにアプリを再起動してください。",
"Restart": "再起動",
"API Chat Model Name": "APIチャットモデル名",
"API Completion Model Name": "API補完モデル名",
"Localhost": "ローカルホスト",
"Retry": "リトライ",
"Delete": "削除",
"Edit": "編集",
"Memory is not enough, try to increase the virtual memory or use a smaller model.": "メモリが不足しています。仮想メモリを増やすか、もしくは小さなモデルを使ってみてください",
"Bad PyTorch version, please reinstall PyTorch with cuda.": "不適切なPyTorchのバージョンです。cudaと共にPyTorchを再インストールしてください。",
"The model file is corrupted, please download again.": "モデルファイルが破損しています。再度ダウンロードしてください。",
"Found no NVIDIA driver, please install the latest driver.": "NVIDIAのドライバが見つかりません。最新版のドライバをインストールしてください。",
"VRAM is not enough, please reduce stored layers or use a lower precision in Configs page.": "VRAMが足りません。設定ページで保存されているレイヤーを減らすか、精度を下げてください。",
"Failed to enable custom CUDA kernel, ninja is required to load C++ extensions. You may be using the CPU version of PyTorch, please reinstall PyTorch with CUDA. Or if you are using a custom Python interpreter, you must compile the CUDA kernel by yourself or disable Custom CUDA kernel acceleration.": "カスタムCUDAカーネルの有効化に失敗しました。C++拡張を読み込むためにはNinjaが必要です。あなたは恐らくCPU版のPyTorchを使用しており、CUDA版のPyTorchを再インストールする必要があります。または、あなたがカスタムPythonインタプリタを使用している場合は、CUDAカーネルを自分でコンパイルするか、カスタムCUDAカーネルのアクセラレーションを無効にする必要があります。",
"Presets": "プリセット",
"Online": "オンライン",
"english": "英語",
"chinese": "中国語",
"default": "デフォルト",
"japanese": "日本語",
"New Preset": "新規プリセット",
"Import": "インポート",
"Name": "名前",
"Imported successfully": "インポート成功",
"Failed to import. Please copy a preset to the clipboard.": "インポートに失敗しました。プリセットをクリップボードにコピーしてください。",
"Clipboard is empty.": "クリップボードが空です。",
"Successfully copied to clipboard.": "クリップボードにコピーしました。",
"Edit Character Settings": "キャラクター設定を編集",
"Go Back": "戻る",
"Description": "説明",
"Avatar Url": "アバターURL",
"Welcome Message": "ウェルカムメッセージ",
"Display Preset Messages": "プリセットメッセージの表示",
"Tag": "タグ",
"Activate": "アクティブ化",
"New": "新規",
"user": "ユーザー",
"assistant": "アシスタント",
"system": "システム",
"Regenerate": "再生成",
"LoRA Finetune": "LoRAの微調整",
"Command Stopped": "コマンドが停止しました",
"Please convert data first.": "先にデータを変換してください。",
"Ubuntu is not installed, do you want to install it?": "Ubuntuがインストールされていません、インストールしますか",
"Install Ubuntu": "Ubuntuをインストール",
"Please install Ubuntu using Microsoft Store, after installation click the Open button in Microsoft Store and then click the Train button": "UbuntuをMicrosoftストアからインストールすることができます。インストールが完了したら、MicrosoftストアのOpenボタンを押し、Trainボタンを押してください",
"WSL is not enabled, do you want to enable it?": "WSLが有効になっていません、有効化しますか",
"Enable WSL": "WSLを有効化",
"After installation, please restart your computer to enable WSL": "インストールが完了したら、WSLを有効化するためにコンピュータを再起動してください",
"Data Process": "データ処理",
"Data Path": "データパス",
"Vocab Path": "語彙パス",
"Train Parameters": "トレーニングパラメータ",
"Base Model": "基本モデル",
"LoRA Model": "LoRAモデル",
"Merge Model": "モデルの統合",
"Devices": "デバイス",
"Gradient Checkpoint": "勾配チェックポイント",
"Context Length": "コンテキストの長さ",
"Epoch Steps": "エポックステップ数",
"Epoch Count": "エポックの数",
"Epoch Begin": "エポックの起点",
"Epoch Save": "エポックの保存",
"Learning Rate Init": "初期学習率",
"Learning Rate Final": "最終学習率",
"Micro Batch Size": "マイクロバッチサイズ",
"Accumulate Gradient Batches": "勾配バッチの累計",
"Warmup Steps": "ウォームアップステップ",
"Pre-FFN": "FFNの前処理",
"None": "なし",
"Merge model successfully": "モデルのマージが成功しました",
"Convert Data successfully": "データ変換に成功しました",
"Please select a LoRA model": "LoRAモデルを選択してください",
"You are using sample data for training. For formal training, please make sure to create your own jsonl file.": "トレーニングにはサンプルデータを使用しています。正式なトレーニングのためには、自身でjsonlファイルを作成してください。",
"WSL is not running, please retry. If it keeps happening, it means you may be using an outdated version of WSL, run \"wsl --update\" to update.": "WSLが実行されていません、もう一度試してください。これが続く場合、古いバージョンのWSLを使用している可能性があります。\"wsl --update\"を実行して更新してください。",
"Memory is not enough, try to increase the virtual memory (Swap of WSL) or use a smaller base model.": "メモリが不足しています、仮想メモリ (WSL Swap) を増やすか小さなベースモデルを使用してみてください。",
"VRAM is not enough": "ビデオRAMが不足しています",
"Training data is not enough, reduce context length or add more data for training": "トレーニングデータが不足しています、コンテキストの長さを減らすか、トレーニング用のデータをさらに追加してください",
"You are using WSL 1 for training, please upgrade to WSL 2. e.g. Run \"wsl --set-version Ubuntu-22.04 2\"": "トレーニングにWSL 1を使用しています、WSL 2にアップグレードしてください。例:\"wsl --set-version Ubuntu-22.04 2\"を実行する",
"Matched CUDA is not installed": "対応するCUDAがインストールされていません",
"Failed to convert data": "データの変換に失敗しました",
"Failed to merge model": "モデルのマージに失敗しました",
"The data path should be a directory or a file in jsonl format (more formats will be supported in the future).\n\nWhen you provide a directory path, all the txt files within that directory will be automatically converted into training data. This is commonly used for large-scale training in writing, code generation, or knowledge bases.\n\nThe jsonl format file can be referenced at https://github.com/Abel2076/json2binidx_tool/blob/main/sample.jsonl.\nYou can also write it similar to OpenAI's playground format, as shown in https://platform.openai.com/playground/p/default-chat.\nEven for multi-turn conversations, they must be written in a single line using `\\n` to indicate line breaks. If they are different dialogues or topics, they should be written in separate lines.": "データのパスはディレクトリまたはjsonl形式のファイルでなければなりません (将来的にはより多くの形式がサポートされる予定です)。\n\nディレクトリパスを提供した場合、そのディレクトリ内のすべてのtxtファイルが自動的にトレーニングデータに変換されます。これは大規模なライティング、コード生成、または知識ベースのトレーニングで一般的に使用されます。\n\njsonl形式のファイルは、https://github.com/Abel2076/json2binidx_tool/blob/main/sample.jsonl を参照してください。\nhttps://platform.openai.com/playground/p/default-chat のように、OpenAIのプレイグラウンド形式に似た形式で書くこともできます。\n複数ターンの対話であっても、一行で書く必要があり、行の区切りを示すために`\\n`を使用します。それらが異なる対話やトピックであれば、それらは別々の行に書かれるべきです。",
"Size mismatch for blocks. You are attempting to continue training from the LoRA model, but it does not match the base model. Please set LoRA model to None.": "ブロックのサイズが一致しません。LoRAモデルからトレーニングを続けようとしていますが、それはベースモデルと一致しません。LoRAモデルをNoneに設定してください。",
"Instruction: Write a story using the following information\n\nInput: A man named Alex chops a tree down\n\nResponse:": "Instruction: Write a story using the following information\n\nInput: アレックスという男が木を切り倒す\n\nResponse:",
"Composition": "作曲",
"Use Local Sound Font": "ローカルサウンドフォントを使用する",
"Auto Play At The End": "最後に自動再生",
"No File to save": "保存するファイルがありません",
"File Saved": "ファイルが保存されました",
"Failed to load local sound font, please check if the files exist - assets/sound-font": "ローカルサウンドフォントの読み込みに失敗しました、ファイルが存在するか確認してください - assets/sound-font",
"Please convert model to safe tensors format first": "モデルを安全なテンソル形式に変換してください",
"Convert To Safe Tensors Format": "安全なテンソル形式に変換",
"Please change Strategy to WebGPU to use safetensors format": "StrategyをWebGPUに変更して、安全なテンソル形式を使用してください",
"Preview Only": "プレビューのみ",
"RAM": "RAM",
"VRAM": "VRAM",
"GPU Usage": "GPU使用率",
"Use Custom Tokenizer": "カスタムトークナイザーを使用する",
"Tokenizer Path (e.g. backend-python/rwkv_pip/20B_tokenizer.json)": "トークナイザーパス (例: backend-python/rwkv_pip/20B_tokenizer.json)",
"User Name": "ユーザー名",
"Assistant Name": "アシスタント名",
"Insert default system prompt at the beginning": "最初にデフォルトのシステムプロンプトを挿入"
}

@@ -1,9 +1,10 @@
import zhHans from './zh-hans/main.json';
+import ja from './ja/main.json';
export const resources = {
zh: {
translation: zhHans
-}
+},
// de: {
// translation: de,
// },
@@ -19,9 +20,9 @@ export const resources = {
// it: {
// translation: it,
// },
-// ja: {
-// translation: ja,
-// },
+ja: {
+translation: ja
+}
// ko: {
// translation: ko,
// },

@@ -113,7 +113,7 @@
"Writer": "写作",
"Translator": "翻译",
"Catgirl": "猫娘",
-"Explain Code": "代码解释",
+"Code Generation": "代码生成",
"Werewolf": "狼人杀",
"Instruction": "指令",
"Blank": "空白",
@@ -128,6 +128,7 @@
"Chinese Kongfu": "情境冒险",
"Allow external access to the API (service must be restarted)": "允许外部访问API (必须重启服务)",
"Custom": "自定义",
+"CUDA (Beta, Faster)": "CUDA (Beta, 更快)",
"Reset All Configs": "重置所有配置",
"Cancel": "取消",
"Confirm": "确认",
@@ -177,7 +178,7 @@
"Failed to import. Please copy a preset to the clipboard.": "导入失败。请复制一个预设到剪贴板",
"Clipboard is empty.": "剪贴板没有内容",
"Successfully copied to clipboard.": "成功复制到剪贴板",
-"Edit Messages": "编辑对话",
+"Edit Character Settings": "编辑人设",
"Go Back": "返回",
"Description": "描述",
"Avatar Url": "头像图片地址",
@@ -225,7 +226,7 @@
"Please select a LoRA model": "请选择一个LoRA模型",
"You are using sample data for training. For formal training, please make sure to create your own jsonl file.": "你正在使用示例数据训练,对于正式训练场合,请务必创建你自己的jsonl训练数据",
"WSL is not running, please retry. If it keeps happening, it means you may be using an outdated version of WSL, run \"wsl --update\" to update.": "WSL没有运行,请重试。如果一直出现此错误,意味着你可能正在使用旧版本的WSL,请在cmd执行\"wsl --update\"以更新",
-"Memory is not enough, try to increase the virtual memory or use a smaller base model.": "内存不足,尝试增加虚拟内存,或使用一个更小规模的基底模型",
+"Memory is not enough, try to increase the virtual memory (Swap of WSL) or use a smaller base model.": "内存不足,尝试增加虚拟内存(WSL Swap),或使用一个更小规模的基底模型",
"VRAM is not enough": "显存不足",
"Training data is not enough, reduce context length or add more data for training": "训练数据不足,请减小上下文长度或增加训练数据",
"You are using WSL 1 for training, please upgrade to WSL 2. e.g. Run \"wsl --set-version Ubuntu-22.04 2\"": "你正在使用WSL 1进行训练请升级到WSL 2。例如运行\"wsl --set-version Ubuntu-22.04 2\"",
@@ -234,5 +235,23 @@
"Failed to merge model": "合并模型失败",
"The data path should be a directory or a file in jsonl format (more formats will be supported in the future).\n\nWhen you provide a directory path, all the txt files within that directory will be automatically converted into training data. This is commonly used for large-scale training in writing, code generation, or knowledge bases.\n\nThe jsonl format file can be referenced at https://github.com/Abel2076/json2binidx_tool/blob/main/sample.jsonl.\nYou can also write it similar to OpenAI's playground format, as shown in https://platform.openai.com/playground/p/default-chat.\nEven for multi-turn conversations, they must be written in a single line using `\\n` to indicate line breaks. If they are different dialogues or topics, they should be written in separate lines.": "数据路径必须是一个文件夹或者jsonl格式文件 (未来会支持更多格式)\n\n当你填写的路径是一个文件夹时该文件夹内的所有txt文件会被自动转换为训练数据通常这用于大批量训练写作代码生成或知识库\n\njsonl文件的格式参考 https://github.com/Abel2076/json2binidx_tool/blob/main/sample.jsonl\n你也可以仿照openai的playground编写参考 https://platform.openai.com/playground/p/default-chat\n即使是多轮对话也必须写在一行用`\\n`表示换行,如果是不同对话或主题,则另起一行",
"Size mismatch for blocks. You are attempting to continue training from the LoRA model, but it does not match the base model. Please set LoRA model to None.": "尺寸不匹配块。你正在尝试从LoRA模型继续训练但该LoRA模型与基底模型不匹配请将LoRA模型设为空",
-"Instruction: Write a story using the following information\n\nInput: A man named Alex chops a tree down\n\nResponse:": "Instruction: Write a story using the following information\n\nInput: 艾利克斯砍倒了一棵树\n\nResponse:"
+"Instruction: Write a story using the following information\n\nInput: A man named Alex chops a tree down\n\nResponse:": "Instruction: Write a story using the following information\n\nInput: 艾利克斯砍倒了一棵树\n\nResponse:",
+"Composition": "作曲",
+"Use Local Sound Font": "使用本地音色资源",
+"Auto Play At The End": "结束时自动播放",
+"No File to save": "无文件可保存",
+"File Saved": "文件已保存",
+"Failed to load local sound font, please check if the files exist - assets/sound-font": "加载本地音色资源失败,请检查文件是否存在 - assets/sound-font",
+"Please convert model to safe tensors format first": "请先将模型转换为Safetensors格式",
+"Convert To Safe Tensors Format": "转换为Safetensors格式",
+"Please change Strategy to WebGPU to use safetensors format": "请将Strategy改为WebGPU以使用safetensors格式",
+"Preview Only": "仅预览",
+"RAM": "内存",
+"VRAM": "显存",
+"GPU Usage": "GPU占用",
+"Use Custom Tokenizer": "使用自定义Tokenizer",
+"Tokenizer Path (e.g. backend-python/rwkv_pip/20B_tokenizer.json)": "Tokenizer路径 (例如: backend-python/rwkv_pip/20B_tokenizer.json)",
+"User Name": "用户名称",
+"Assistant Name": "AI名称",
+"Insert default system prompt at the beginning": "在开头自动插入默认系统提示"
}

@@ -4,14 +4,14 @@ import { useTranslation } from 'react-i18next';
import { ArrowReset20Regular } from '@fluentui/react-icons';
import commonStore from '../stores/commonStore';
-import { defaultModelConfigs, defaultModelConfigsMac } from '../pages/defaultModelConfigs';
+import { defaultModelConfigs, defaultModelConfigsMac } from '../pages/defaultConfigs';
export const ResetConfigsButton: FC<{ afterConfirm?: () => void }> = ({ afterConfirm }) => {
const { t } = useTranslation();
return <DialogButton icon={<ArrowReset20Regular />} tooltip={t('Reset All Configs')} title={t('Reset All Configs')}
contentText={t('Are you sure you want to reset all configs? This will obtain the latest preset configs, but will override your custom configs and cannot be undone.')}
onConfirm={() => {
-commonStore.setModelConfigs(commonStore.platform != 'darwin' ? defaultModelConfigs : defaultModelConfigsMac, false);
+commonStore.setModelConfigs(commonStore.platform !== 'darwin' ? defaultModelConfigs : defaultModelConfigsMac, false);
commonStore.setCurrentConfigIndex(0, true);
afterConfirm?.();
}} />;

@@ -1,6 +1,12 @@
import React, { FC, MouseEventHandler, ReactElement } from 'react';
import commonStore, { ModelStatus } from '../stores/commonStore';
-import { AddToDownloadList, CopyFile, FileExists, StartServer } from '../../wailsjs/go/backend_golang/App';
+import {
+AddToDownloadList,
+CopyFile,
+FileExists,
+StartServer,
+StartWebGPUServer
+} from '../../wailsjs/go/backend_golang/App';
import { Button } from '@fluentui/react-components';
import { observer } from 'mobx-react-lite';
import { exit, getStatus, readRoot, switchModel, updateConfig } from '../apis';
@@ -39,6 +45,7 @@ export const RunButton: FC<{ onClickRun?: MouseEventHandler, iconMode?: boolean
commonStore.setStatus({ status: ModelStatus.Starting });
const modelConfig = commonStore.getCurrentModelConfig();
+const webgpu = modelConfig.modelParameters.device === 'WebGPU';
let modelName = '';
let modelPath = '';
if (modelConfig && modelConfig.modelParameters) {
@@ -50,9 +57,32 @@ export const RunButton: FC<{ onClickRun?: MouseEventHandler, iconMode?: boolean
return;
}
-const ok = await checkDependencies(navigate);
-if (!ok)
-return;
+if (webgpu) {
+if (!['.st', '.safetensors'].some(ext => modelPath.endsWith(ext))) {
+const stModelPath = modelPath.replace(/\.pth$/, '.st');
+if (await FileExists(stModelPath)) {
+modelPath = stModelPath;
+} else {
+toast(t('Please convert model to safe tensors format first'), { type: 'error' });
+commonStore.setStatus({ status: ModelStatus.Offline });
+return;
+}
+}
+}
+if (!webgpu) {
+if (['.st', '.safetensors'].some(ext => modelPath.endsWith(ext))) {
+toast(t('Please change Strategy to WebGPU to use safetensors format'), { type: 'error' });
+commonStore.setStatus({ status: ModelStatus.Offline });
+return;
+}
+}
+if (!webgpu) {
+const ok = await checkDependencies(navigate);
+if (!ok)
+return;
+}
const currentModelSource = commonStore.modelSourceList.find(item => item.name === modelName);
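
The format checks added in this hunk can be read as one decision: WebGPU needs a `.st`/`.safetensors` file (falling back to a sibling `.st` next to a `.pth`), and every other strategy rejects safetensors files. The sketch below restates that decision as a pure function so it can be exercised without the UI. The names `resolveModelPath` and `Resolution` are hypothetical, and `fileExists` is a synchronous stand-in for the async `FileExists` binding used in the diff.

```typescript
// Sketch only: `resolveModelPath` and `Resolution` are hypothetical names, not from the diff.
type Resolution =
  | { ok: true; modelPath: string }
  | { ok: false; messageKey: string };

const SAFETENSORS_EXT = /\.(st|safetensors)$/;

function resolveModelPath(
  modelPath: string,
  webgpu: boolean,
  fileExists: (path: string) => boolean // stand-in for the async FileExists binding
): Resolution {
  const isSafetensors = SAFETENSORS_EXT.test(modelPath);
  if (webgpu && !isSafetensors) {
    // Look for an already converted file next to the .pth checkpoint.
    const stModelPath = modelPath.replace(/\.pth$/, '.st');
    if (fileExists(stModelPath)) return { ok: true, modelPath: stModelPath };
    return { ok: false, messageKey: 'Please convert model to safe tensors format first' };
  }
  if (!webgpu && isSafetensors) {
    return { ok: false, messageKey: 'Please change Strategy to WebGPU to use safetensors format' };
  }
  return { ok: true, modelPath };
}
```

The `messageKey` values are the translation keys the hunk passes to `t(...)`; the caller would still be responsible for the toast and for resetting the status to `ModelStatus.Offline`.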
@@ -85,7 +115,14 @@ export const RunButton: FC<{ onClickRun?: MouseEventHandler, iconMode?: boolean
await exit(1000).catch(() => {
});
-StartServer(commonStore.settings.customPythonPath, port, commonStore.settings.host !== '127.0.0.1' ? '0.0.0.0' : '127.0.0.1').catch((e) => {
+const startServer = webgpu ?
+(_: string, port: number, host: string) => StartWebGPUServer(port, host)
+: StartServer;
+startServer(commonStore.settings.customPythonPath, port, commonStore.settings.host !== '127.0.0.1' ? '0.0.0.0' : '127.0.0.1',
+modelConfig.modelParameters.device === 'CUDA-Beta'
+).catch((e) => {
const errMsg = e.message || e;
if (errMsg.includes('path contains space'))
toast(`${t('Error')} - ${t('File Path Cannot Contain Space')}`, { type: 'error' });
@@ -102,23 +139,27 @@ export const RunButton: FC<{ onClickRun?: MouseEventHandler, iconMode?: boolean
if (r.ok && !loading) {
loading = true;
clearInterval(intervalId);
-await getStatus().then(status => {
-if (status)
-commonStore.setStatus(status);
-});
+if (!webgpu) {
+await getStatus().then(status => {
+if (status)
+commonStore.setStatus(status);
+});
+}
commonStore.setStatus({ status: ModelStatus.Loading });
toast(t('Loading Model'), { type: 'info' });
-updateConfig({
-max_tokens: modelConfig.apiParameters.maxResponseToken,
-temperature: modelConfig.apiParameters.temperature,
-top_p: modelConfig.apiParameters.topP,
-presence_penalty: modelConfig.apiParameters.presencePenalty,
-frequency_penalty: modelConfig.apiParameters.frequencyPenalty
-});
+if (!webgpu) {
+updateConfig({
+max_tokens: modelConfig.apiParameters.maxResponseToken,
+temperature: modelConfig.apiParameters.temperature,
+top_p: modelConfig.apiParameters.topP,
+presence_penalty: modelConfig.apiParameters.presencePenalty,
+frequency_penalty: modelConfig.apiParameters.frequencyPenalty
+});
+}
const strategy = getStrategy(modelConfig);
let customCudaFile = '';
-if ((modelConfig.modelParameters.device === 'CUDA' || modelConfig.modelParameters.device === 'Custom')
+if ((modelConfig.modelParameters.device.includes('CUDA') || modelConfig.modelParameters.device === 'Custom')
&& modelConfig.modelParameters.useCustomCuda && !strategy.includes('fp32')) {
if (commonStore.platform === 'windows') {
customCudaFile = getSupportedCustomCudaFile();
@@ -145,13 +186,21 @@ export const RunButton: FC<{ onClickRun?: MouseEventHandler, iconMode?: boolean
switchModel({
model: modelPath,
strategy: strategy,
+tokenizer: modelConfig.modelParameters.useCustomTokenizer ? modelConfig.modelParameters.customTokenizer : undefined,
customCuda: customCudaFile !== ''
}).then(async (r) => {
if (r.ok) {
commonStore.setStatus({ status: ModelStatus.Working });
-toastWithButton(t('Startup Completed'), t('Chat'), () => {
-navigate({ pathname: '/chat' });
-}, { type: 'success', autoClose: 3000 });
+let buttonNameMap = {
+'novel': 'Completion',
+'midi': 'Composition'
+};
+let buttonName = 'Chat';
+buttonName = Object.entries(buttonNameMap).find(([key, value]) => modelName.toLowerCase().includes(key))?.[1] || buttonName;
+const buttonFn = () => {
+navigate({ pathname: '/' + buttonName.toLowerCase() });
+};
+toastWithButton(t('Startup Completed'), t(buttonName), buttonFn, { type: 'success', autoClose: 3000 });
} else if (r.status === 304) {
toast(t('Loading Model'), { type: 'info' });
} else {
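
The new toast logic picks a destination page from the model file name: names containing `novel` go to Completion, names containing `midi` go to Composition, and everything else falls back to Chat. A minimal standalone sketch of that lookup follows. The helper name `pickTargetPage` and the model file names in the usage note are hypothetical; only the map entries and the `'/' + buttonName.toLowerCase()` route rule come from the hunk.

```typescript
// Hypothetical helper: the name `pickTargetPage` is not part of the diff.
const buttonNameMap: { [key: string]: string } = {
  novel: 'Completion',
  midi: 'Composition'
};

function pickTargetPage(modelName: string): { buttonName: string; pathname: string } {
  const lower = modelName.toLowerCase();
  // First map key that occurs in the model name, or undefined when none does.
  const key = Object.keys(buttonNameMap).filter(k => lower.indexOf(k) !== -1)[0];
  const buttonName = key !== undefined ? buttonNameMap[key] : 'Chat';
  return { buttonName, pathname: '/' + buttonName.toLowerCase() };
}
```

For example, a file named `Example-Novel-7B.pth` would resolve to `/completion`, while a name matching neither key keeps the previous behaviour of navigating to `/chat`.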

@@ -6,6 +6,7 @@ import App from './App';
import { HashRouter } from 'react-router-dom';
import { startup } from './startup';
import './_locales/i18n-react';
+import 'html-midi-player';
import { WindowShow } from '../wailsjs/runtime';
startup().then(() => {

@@ -184,7 +184,9 @@ const ChatPanel: FC = observer(() => {
const bodyRef = useRef<HTMLDivElement>(null);
const inputRef = useRef<HTMLTextAreaElement>(null);
const mq = useMediaQuery('(min-width: 640px)');
-const port = commonStore.getCurrentModelConfig().apiParameters.apiPort;
+const currentConfig = commonStore.getCurrentModelConfig();
+const apiParams = currentConfig.apiParameters;
+const port = apiParams.apiPort;
let lastMessageId: string;
let generating: boolean = false;
@@ -308,12 +310,17 @@ const ChatPanel: FC = observer(() => {
body: JSON.stringify({
messages,
stream: true,
-model: commonStore.settings.apiChatModelName // 'gpt-3.5-turbo'
+model: commonStore.settings.apiChatModelName, // 'gpt-3.5-turbo'
+temperature: apiParams.temperature,
+top_p: apiParams.topP,
+user_name: commonStore.activePreset?.userName,
+assistant_name: commonStore.activePreset?.assistantName,
+presystem: commonStore.activePreset?.presystem
}),
signal: chatSseController?.signal,
onmessage(e) {
scrollToBottom();
-if (e.data === '[DONE]') {
+if (e.data.trim() === '[DONE]') {
commonStore.conversation[answerId!].done = true;
commonStore.conversation[answerId!].content = commonStore.conversation[answerId!].content.trim();
commonStore.setConversation(commonStore.conversation);
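
The request body in this hunk reads `user_name`, `assistant_name` and `presystem` through optional chaining, so they are `undefined` when no preset is active. Because `JSON.stringify` omits properties whose value is `undefined`, those keys are then simply absent from the payload. The sketch below illustrates that behaviour; `buildChatBody` and the `Preset` type are hypothetical, and treating `presystem` as a boolean is an assumption based on the UI label.

```typescript
// Hypothetical shapes: `Preset` and `buildChatBody` are illustrative, not from the diff.
type Preset = { userName?: string; assistantName?: string; presystem?: boolean };
type Message = { role: string; content: string };

function buildChatBody(
  messages: Message[],
  model: string,
  apiParams: { temperature: number; topP: number },
  preset?: Preset
): string {
  return JSON.stringify({
    messages,
    stream: true,
    model,
    temperature: apiParams.temperature,
    top_p: apiParams.topP,
    // undefined values are dropped by JSON.stringify, so these keys vanish without a preset
    user_name: preset?.userName,
    assistant_name: preset?.assistantName,
    presystem: preset?.presystem
  });
}
```

A backend that applies defaults for missing keys therefore keeps working for conversations without an active preset.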

@@ -13,6 +13,7 @@ import { DialogButton } from '../components/DialogButton';
import { PresetsButton } from './PresetsManager/PresetsButton';
import { ToolTipButton } from '../components/ToolTipButton';
import { ArrowSync20Regular } from '@fluentui/react-icons';
+import { defaultPresets } from './defaultConfigs';
export type CompletionParams = Omit<ApiParameters, 'apiPort'> & {
stop: string,
@@ -26,113 +27,6 @@ export type CompletionPreset = {
params: CompletionParams
}
export const defaultPresets: CompletionPreset[] = [{
name: 'Writer',
prompt: 'The following is an epic science fiction masterpiece that is immortalized, with delicate descriptions and grand depictions of interstellar civilization wars.\nChapter 1.\n',
params: {
maxResponseToken: 500,
temperature: 1.2,
topP: 0.5,
presencePenalty: 0.4,
frequencyPenalty: 0.4,
stop: '\\n\\nUser',
injectStart: '',
injectEnd: ''
}
}, {
name: 'Translator',
prompt: 'Translate this into Chinese.\n\nEnglish: What rooms do you have available?',
params: {
maxResponseToken: 500,
temperature: 1,
topP: 0.3,
presencePenalty: 0,
frequencyPenalty: 1,
stop: '\\nEnglish',
injectStart: '\\nChinese: ',
injectEnd: '\\nEnglish: '
}
}, {
name: 'Catgirl',
prompt: 'The following is a conversation between a cat girl and her owner. The cat girl is a humanized creature that behaves like a cat but is humanoid. At the end of each sentence in the dialogue, she will add \"Meow~\". In the following content, User represents the owner and Assistant represents the cat girl.\n\nUser: Hello.\n\nAssistant: I\'m here, meow~.\n\nUser: Can you tell jokes?',
params: {
maxResponseToken: 500,
temperature: 1.2,
topP: 0.5,
presencePenalty: 0.4,
frequencyPenalty: 0.4,
stop: '\\n\\nUser',
injectStart: '\\n\\nAssistant: ',
injectEnd: '\\n\\nUser: '
}
}, {
name: 'Chinese Kongfu',
prompt: 'User: 请你扮演一个文本冒险游戏,我是游戏主角。这是一个玄幻修真世界,有四大门派。我输入我的行动,请你显示行动结果,并具体描述环境。我的第一个行动是“醒来”,请开始故事。',
params: {
maxResponseToken: 500,
temperature: 1.1,
topP: 0.7,
presencePenalty: 0.3,
frequencyPenalty: 0.3,
stop: '\\n\\nUser',
injectStart: '\\n\\nAssistant: ',
injectEnd: '\\n\\nUser: '
}
}, {
// }, {
// name: 'Explain Code',
// prompt: 'export async function startup() {\n FileExists(\'cache.json\').then((exists) => {\n if (exists)\n downloadProgramFiles();\n else {\n deleteDynamicProgramFiles().then(downloadProgramFiles);\n }\n });\n EventsOn(\'downloadList\', (data) => {\n if (data)\n commonStore.setDownloadList(data);\n });\n\n initCache().then(initRemoteText);\n\n await initConfig();\n\n if (commonStore.settings.autoUpdatesCheck) // depends on config settings\n checkUpdate();\n\n getStatus(1000).then(status => { // depends on config api port\n if (status)\n commonStore.setStatus(status);\n });\n}\n\n\"\"\"\nHere\'s what the above code is doing, explained in a concise way:\n',
// params: {
// maxResponseToken: 500,
// temperature: 0.8,
// topP: 0.7,
// presencePenalty: 0.4,
// frequencyPenalty: 0.4,
// stop: '\\n\\n',
// injectStart: '',
// injectEnd: ''
// }
// }, {
name: 'Werewolf',
prompt: 'There is currently a game of Werewolf with six players, including a Seer (who can check identities at night), two Werewolves (who can choose someone to kill at night), a Bodyguard (who can choose someone to protect at night), two Villagers (with no special abilities), and a game host. User will play as Player 1, Assistant will play as Players 2-6 and the game host, and they will begin playing together. Every night, the host will ask User for his action and simulate the actions of the other players. During the day, the host will oversee the voting process and ask User for his vote. \n\nAssistant: Next, I will act as the game host and assign everyone their roles, including randomly assigning yours. Then, I will simulate the actions of Players 2-6 and let you know what happens each day. Based on your assigned role, you can tell me your actions and I will let you know the corresponding results each day.\n\nUser: Okay, I understand. Let\'s begin. Please assign me a role. Am I the Seer, Werewolf, Villager, or Bodyguard?\n\nAssistant: You are the Seer. Now that night has fallen, please choose a player to check his identity.\n\nUser: Tonight, I want to check Player 2 and find out his role.',
params: {
maxResponseToken: 500,
temperature: 1.2,
topP: 0.4,
presencePenalty: 0.5,
frequencyPenalty: 0.5,
stop: '\\n\\nUser',
injectStart: '\\n\\nAssistant: ',
injectEnd: '\\n\\nUser: '
}
}, {
name: 'Instruction',
prompt: 'Instruction: Write a story using the following information\n\nInput: A man named Alex chops a tree down\n\nResponse:',
params: {
maxResponseToken: 500,
temperature: 1,
topP: 0.3,
presencePenalty: 0,
frequencyPenalty: 1,
stop: '',
injectStart: '',
injectEnd: ''
}
}, {
name: 'Blank',
prompt: '',
params: {
maxResponseToken: 500,
temperature: 1,
topP: 0.3,
presencePenalty: 0,
frequencyPenalty: 1,
stop: '',
injectStart: '',
injectEnd: ''
}
}];
let completionSseController: AbortController | null = null;
const CompletionPanel: FC = observer(() => {
@@ -220,7 +114,7 @@ const CompletionPanel: FC = observer(() => {
signal: completionSseController?.signal,
onmessage(e) {
scrollToBottom();
-if (e.data === '[DONE]') {
+if (e.data.trim() === '[DONE]') {
commonStore.setCompletionGenerating(false);
return;
}
@@ -232,8 +126,8 @@ const CompletionPanel: FC = observer(() => {
return;
}
if (data.choices && Array.isArray(data.choices) && data.choices.length > 0) {
-answer += data.choices[0].text;
-setPrompt(prompt + answer.trim() + params.injectEnd.replaceAll('\\n', '\n'));
+answer += data.choices[0]?.text || data.choices[0]?.delta?.content || '';
+setPrompt(prompt + answer.replace(/\s+$/, '') + params.injectEnd.replaceAll('\\n', '\n'));
}
},
async onopen(response) {
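
The two stream-handling changes in this file (tolerating whitespace around `[DONE]`, and accepting either a completion-style `text` field or a chat-style `delta.content` field) can be isolated into a small parser. The following is a sketch under that reading; `extractDelta` and `StreamChunk` are hypothetical names, and the chunk shapes are limited to the fields the hunk actually reads.

```typescript
// Hypothetical helper mirroring the onmessage logic; not part of the diff.
type StreamChunk = { choices?: Array<{ text?: string; delta?: { content?: string } }> };

function extractDelta(raw: string): { done: boolean; text: string } {
  // Some servers pad the terminator with whitespace, hence trim().
  if (raw.trim() === '[DONE]') return { done: true, text: '' };
  let data: StreamChunk | null;
  try {
    data = JSON.parse(raw);
  } catch (error) {
    // Malformed chunks are skipped rather than aborting the stream.
    return { done: false, text: '' };
  }
  if (data && Array.isArray(data.choices) && data.choices.length > 0) {
    const first = data.choices[0];
    return { done: false, text: first?.text || first?.delta?.content || '' };
  }
  return { done: false, text: '' };
}
```

Accumulating `text` across chunks and stripping only trailing whitespace with `/\s+$/` (instead of `trim()`) preserves any leading whitespace the model generates, which matters when the answer is appended directly to an existing prompt.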

@@ -0,0 +1,345 @@
import React, { FC, useEffect, useRef } from 'react';
import { observer } from 'mobx-react-lite';
import { WorkHeader } from '../components/WorkHeader';
import { Button, Checkbox, Textarea } from '@fluentui/react-components';
import { Labeled } from '../components/Labeled';
import { ValuedSlider } from '../components/ValuedSlider';
import { useTranslation } from 'react-i18next';
import commonStore, { ModelStatus } from '../stores/commonStore';
import { fetchEventSource } from '@microsoft/fetch-event-source';
import { toast } from 'react-toastify';
import { DialogButton } from '../components/DialogButton';
import { ToolTipButton } from '../components/ToolTipButton';
import { ArrowSync20Regular, Save28Regular } from '@fluentui/react-icons';
import { PlayerElement, VisualizerElement } from 'html-midi-player';
import * as mm from '@magenta/music/esm/core.js';
import { NoteSequence } from '@magenta/music/esm/protobuf.js';
import { defaultCompositionPrompt } from './defaultConfigs';
import { FileExists, OpenFileFolder, OpenSaveFileDialogBytes } from '../../wailsjs/go/backend_golang/App';
import { toastWithButton } from '../utils';
export type CompositionParams = {
prompt: string,
maxResponseToken: number,
temperature: number,
topP: number,
autoPlay: boolean,
useLocalSoundFont: boolean,
midi: ArrayBuffer | null,
ns: NoteSequence | null
}
let compositionSseController: AbortController | null = null;
const CompositionPanel: FC = observer(() => {
const { t } = useTranslation();
const inputRef = useRef<HTMLTextAreaElement>(null);
const port = commonStore.getCurrentModelConfig().apiParameters.apiPort;
const visualizerRef = useRef<VisualizerElement>(null);
const playerRef = useRef<PlayerElement>(null);
const scrollToBottom = () => {
if (inputRef.current)
inputRef.current.scrollTop = inputRef.current.scrollHeight;
};
const params = commonStore.compositionParams;
const setParams = (newParams: Partial<CompositionParams>) => {
commonStore.setCompositionParams({
...commonStore.compositionParams,
...newParams
});
};
const setPrompt = (prompt: string) => {
setParams({
prompt
});
if (!commonStore.compositionGenerating)
generateNs(false);
};
const updateNs = (ns: NoteSequence | null) => {
if (playerRef.current) {
playerRef.current.noteSequence = ns;
playerRef.current.reload();
}
if (visualizerRef.current) {
visualizerRef.current.noteSequence = ns;
visualizerRef.current.reload();
}
};
const setSoundFont = async () => {
let soundUrl: string;
if (commonStore.compositionParams.useLocalSoundFont)
soundUrl = 'assets/sound-font';
else
soundUrl = !commonStore.settings.giteeUpdatesSource ?
`https://raw.githubusercontent.com/josStorer/sgm_plus/master` :
`https://gitee.com/josc146/sgm_plus/raw/master`;
const fallbackUrl = 'https://cdn.jsdelivr.net/gh/josstorer/sgm_plus';
await fetch(soundUrl + '/soundfont.json').then(r => {
if (!r.ok)
soundUrl = fallbackUrl;
}).catch(() => soundUrl = fallbackUrl);
if (playerRef.current) {
playerRef.current.soundFont = soundUrl;
}
};
useEffect(() => {
if (inputRef.current)
inputRef.current.style.height = '100%';
scrollToBottom();
if (playerRef.current && visualizerRef.current) {
playerRef.current.addVisualizer(visualizerRef.current);
playerRef.current.addEventListener('start', () => {
visualizerRef.current?.reload();
});
setSoundFont().then(() => {
updateNs(params.ns);
});
const button = playerRef.current.shadowRoot?.querySelector('.controls .play') as HTMLElement | null;
if (button)
button.style.background = '#f2f5f6';
}
}, []);
const generateNs = (autoPlay: boolean) => {
fetch(commonStore.settings.apiUrl ?
commonStore.settings.apiUrl + '/text-to-midi' :
`http://127.0.0.1:${port}/text-to-midi`, {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
'text': commonStore.compositionParams.prompt.replaceAll(/<pad>|<start>|<end>/g, '').replaceAll('  ', ' ').trim()
})
}).then(r => {
r.arrayBuffer().then(midi => {
const ns = mm.midiToSequenceProto(midi);
setParams({
midi,
ns
});
updateNs(ns);
if (autoPlay) {
playerRef.current?.start();
}
});
});
};
const onSubmit = (prompt: string) => {
commonStore.setCompositionSubmittedPrompt(prompt);
if (commonStore.status.status === ModelStatus.Offline && !commonStore.settings.apiUrl) {
toast(t('Please click the button in the top right corner to start the model'), { type: 'warning' });
commonStore.setCompositionGenerating(false);
return;
}
let answer = '';
compositionSseController = new AbortController();
fetchEventSource( // https://api.openai.com/v1/completions || http://127.0.0.1:${port}/completions
commonStore.settings.apiUrl ?
commonStore.settings.apiUrl + '/v1/completions' :
`http://127.0.0.1:${port}/completions`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
Authorization: `Bearer ${commonStore.settings.apiKey}`
},
body: JSON.stringify({
prompt,
stream: true,
model: commonStore.settings.apiCompletionModelName, // 'text-davinci-003'
max_tokens: params.maxResponseToken,
temperature: params.temperature,
top_p: params.topP
}),
signal: compositionSseController?.signal,
onmessage(e) {
scrollToBottom();
if (e.data.trim() === '[DONE]') {
commonStore.setCompositionGenerating(false);
generateNs(commonStore.compositionParams.autoPlay);
return;
}
let data;
try {
data = JSON.parse(e.data);
} catch (error) {
console.debug('json error', error);
return;
}
if (data.choices && Array.isArray(data.choices) && data.choices.length > 0) {
answer += data.choices[0]?.text || data.choices[0]?.delta?.content || '';
setPrompt(prompt + answer.replace(/\s+$/, ''));
}
},
async onopen(response) {
if (response.status !== 200) {
toast(response.statusText + '\n' + (await response.text()), {
type: 'error'
});
}
},
onclose() {
console.log('Connection closed');
},
onerror(err) {
err = err.message || err;
if (err && !err.includes('ReadableStreamDefaultReader'))
toast(err, {
type: 'error'
});
commonStore.setCompositionGenerating(false);
throw err;
}
});
};
return (
<div className="flex flex-col gap-2 overflow-hidden grow">
<div className="flex flex-col sm:flex-row gap-2 overflow-hidden grow">
<Textarea
ref={inputRef}
className="grow"
value={params.prompt}
onChange={(e) => {
commonStore.setCompositionSubmittedPrompt(e.target.value);
setPrompt(e.target.value);
}}
/>
<div className="flex flex-col gap-1 max-h-48 sm:max-w-sm sm:max-h-full overflow-x-hidden overflow-y-auto p-1">
<Labeled flex breakline label={t('Max Response Token')}
desc={t('By default, the maximum number of tokens that can be answered in a single response, it can be changed by the user by specifying API parameters.')}
content={
<ValuedSlider value={params.maxResponseToken} min={100} max={4100}
step={100}
input
onChange={(e, data) => {
setParams({
maxResponseToken: data.value
});
}} />
} />
<Labeled flex breakline label={t('Temperature')}
desc={t('Sampling temperature, it\'s like giving alcohol to a model, the higher the stronger the randomness and creativity, while the lower, the more focused and deterministic it will be.')}
content={
<ValuedSlider value={params.temperature} min={0} max={2} step={0.1}
input
onChange={(e, data) => {
setParams({
temperature: data.value
});
}} />
} />
<Labeled flex breakline label={t('Top_P')}
desc={t('Just like feeding sedatives to the model. Consider the results of the top n% probability mass, 0.1 considers the top 10%, with higher quality but more conservative, 1 considers all results, with lower quality but more diverse.')}
content={
<ValuedSlider value={params.topP} min={0} max={1} step={0.1} input
onChange={(e, data) => {
setParams({
topP: data.value
});
}} />
} />
<div className="grow" />
<Checkbox className="select-none"
size="large" label={t('Use Local Sound Font')} checked={params.useLocalSoundFont}
onChange={async (_, data) => {
if (data.checked) {
if (!await FileExists('assets/sound-font/accordion/instrument.json')) {
toast(t('Failed to load local sound font, please check if the files exist - assets/sound-font'),
{ type: 'warning' });
return;
}
}
setParams({
useLocalSoundFont: data.checked as boolean
});
setSoundFont();
}} />
<Checkbox className="select-none"
size="large" label={t('Auto Play At The End')} checked={params.autoPlay} onChange={(_, data) => {
setParams({
autoPlay: data.checked as boolean
});
}} />
<div className="flex justify-between gap-2">
<ToolTipButton desc={t('Regenerate')} icon={<ArrowSync20Regular />} onClick={() => {
compositionSseController?.abort();
commonStore.setCompositionGenerating(true);
setPrompt(commonStore.compositionSubmittedPrompt);
onSubmit(commonStore.compositionSubmittedPrompt);
}} />
<DialogButton className="grow" text={t('Reset')} title={t('Reset')}
contentText={t('Are you sure you want to reset this page? It cannot be undone.')}
onConfirm={() => {
commonStore.setCompositionSubmittedPrompt(defaultCompositionPrompt);
setPrompt(defaultCompositionPrompt);
}} />
<Button className="grow" appearance="primary" onClick={() => {
if (commonStore.compositionGenerating) {
compositionSseController?.abort();
commonStore.setCompositionGenerating(false);
generateNs(params.autoPlay);
} else {
commonStore.setCompositionGenerating(true);
onSubmit(params.prompt);
}
}}>{!commonStore.compositionGenerating ? t('Generate') : t('Stop')}</Button>
</div>
</div>
</div>
<div className="flex flex-col">
<div className="ml-auto mr-auto">
<midi-visualizer
ref={visualizerRef}
type="waterfall"
/>
</div>
<div className="flex">
<midi-player
ref={playerRef}
style={{ width: '100%' }}
/>
<Button icon={<Save28Regular />}
onClick={() => {
if (params.midi) {
OpenSaveFileDialogBytes('*.mid', 'music.mid', Array.from(new Uint8Array(params.midi))).then((path) => {
if (path)
toastWithButton(t('File Saved'), t('Open'), () => {
OpenFileFolder(path, false);
});
}).catch((e: any) => {
toast(t('Error') + ' - ' + (e.message || e), { type: 'error', autoClose: 2500 });
});
} else {
toast(t('No File to save'), { type: 'warning', autoClose: 1500 });
}
}}
>
{t('Save')}
</Button>
</div>
</div>
</div>
);
});
export const Composition: FC = observer(() => {
return (
<div className="flex flex-col gap-1 p-2 h-full overflow-hidden">
<WorkHeader />
<CompositionPanel />
</div>
);
});
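
Review note: the `onmessage` handler above mixes stream parsing with UI updates. As a minimal sketch (the helper names `parseCompletionChunk` and `accumulate` are hypothetical and not part of this commit), the parsing step can be isolated into a pure function that handles the `[DONE]` sentinel, malformed JSON, and both the `text` and `delta.content` payload shapes:

```typescript
// Hypothetical helpers for illustration only; they mirror the logic of the
// onmessage handler in the diff but are not code from the repository.
type ChunkResult = { done: boolean; text: string };

function parseCompletionChunk(raw: string): ChunkResult {
  // The stream signals completion with a literal "[DONE]" payload.
  if (raw.trim() === '[DONE]') return { done: true, text: '' };
  let data: any;
  try {
    data = JSON.parse(raw);
  } catch {
    // Malformed events are skipped, as in the handler's console.debug branch.
    return { done: false, text: '' };
  }
  if (data && Array.isArray(data.choices) && data.choices.length > 0) {
    const choice = data.choices[0];
    // Completion-style chunks carry `text`; chat-style chunks carry `delta.content`.
    return { done: false, text: choice?.text || choice?.delta?.content || '' };
  }
  return { done: false, text: '' };
}

// Accumulates chunk text onto the prompt and strips trailing whitespace,
// matching setPrompt(prompt + answer.replace(/\s+$/, '')) in the diff.
function accumulate(prompt: string, events: string[]): string {
  let answer = '';
  for (const e of events) {
    const r = parseCompletionChunk(e);
    if (r.done) break;
    answer += r.text;
  }
  return prompt + answer.replace(/\s+$/, '');
}
```

Keeping the parser free of store and toast calls would let it be unit-tested without a running model server.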

@@ -1,6 +1,19 @@
import { Dropdown, Input, Label, Option, Select, Switch, Text } from '@fluentui/react-components';
import {
Accordion,
AccordionHeader,
AccordionItem,
AccordionPanel,
Checkbox,
Dropdown,
Input,
Label,
Option,
Select,
Switch,
Text
} from '@fluentui/react-components';
import { AddCircle20Regular, DataUsageSettings20Regular, Delete20Regular, Save20Regular } from '@fluentui/react-icons';
import React, { FC } from 'react';
import React, { FC, useEffect, useRef } from 'react';
import { Section } from '../components/Section';
import { Labeled } from '../components/Labeled';
import { ToolTipButton } from '../components/ToolTipButton';
@@ -13,13 +26,14 @@ import { Page } from '../components/Page';
import { useNavigate } from 'react-router';
import { RunButton } from '../components/RunButton';
import { updateConfig } from '../apis';
import { ConvertModel, FileExists, GetPyError } from '../../wailsjs/go/backend_golang/App';
import { getStrategy } from '../utils';
import { ConvertModel, ConvertSafetensors, FileExists, GetPyError } from '../../wailsjs/go/backend_golang/App';
import { checkDependencies, getStrategy } from '../utils';
import { useTranslation } from 'react-i18next';
import { WindowShow } from '../../wailsjs/runtime/runtime';
import strategyImg from '../assets/images/strategy.jpg';
import strategyZhImg from '../assets/images/strategy_zh.jpg';
import { ResetConfigsButton } from '../components/ResetConfigsButton';
import { useMediaQuery } from 'usehooks-ts';
export type ApiParameters = {
apiPort: number
@@ -30,7 +44,7 @@ export type ApiParameters = {
frequencyPenalty: number;
}
export type Device = 'CPU' | 'CUDA' | 'MPS' | 'Custom';
export type Device = 'CPU' | 'CUDA' | 'CUDA-Beta' | 'WebGPU' | 'MPS' | 'Custom';
export type Precision = 'fp16' | 'int8' | 'fp32';
export type ModelParameters = {
@@ -42,6 +56,8 @@ export type ModelParameters = {
maxStoredLayers: number;
useCustomCuda?: boolean;
customStrategy?: string;
useCustomTokenizer?: boolean;
customTokenizer?: string;
}
export type ModelConfig = {
@@ -56,9 +72,16 @@ export const Configs: FC = observer(() => {
const [selectedIndex, setSelectedIndex] = React.useState(commonStore.currentModelConfigIndex);
const [selectedConfig, setSelectedConfig] = React.useState(commonStore.modelConfigs[selectedIndex]);
const [displayStrategyImg, setDisplayStrategyImg] = React.useState(false);
const advancedHeaderRef = useRef<HTMLDivElement>(null);
const mq = useMediaQuery('(min-width: 640px)');
const navigate = useNavigate();
const port = selectedConfig.apiParameters.apiPort;
useEffect(() => {
if (advancedHeaderRef.current)
(advancedHeaderRef.current.firstElementChild as HTMLElement).style.padding = '0';
}, []);
const updateSelectedIndex = (newIndex: number) => {
setSelectedIndex(newIndex);
setSelectedConfig(commonStore.modelConfigs[newIndex]);
@@ -128,7 +151,8 @@ export const Configs: FC = observer(() => {
setSelectedIndex(0);
setSelectedConfig(commonStore.modelConfigs[0]);
}} />
<ToolTipButton desc={t('Save Config')} icon={<Save20Regular />} onClick={onClickSave} />
<ToolTipButton desc={mq ? '' : t('Save Config')} icon={<Save20Regular />} text={mq ? t('Save Config') : null}
onClick={onClickSave} />
</div>
<div className="flex items-center gap-4">
<Label>{t('Config Name')}</Label>
@@ -237,40 +261,84 @@ export const Configs: FC = observer(() => {
}} />
</div>
} />
<ToolTipButton text={t('Convert')}
desc={t('Convert model with these configs. Using a converted model will greatly improve the loading speed, but model parameters of the converted model cannot be modified.')}
onClick={async () => {
if (commonStore.platform == 'darwin') {
toast(t('MacOS is not yet supported for performing this operation, please do it manually.'), { type: 'info' });
return;
} else if (commonStore.platform == 'linux') {
toast(t('Linux is not yet supported for performing this operation, please do it manually.'), { type: 'info' });
return;
}
const modelPath = `${commonStore.settings.customModelsPath}/${selectedConfig.modelParameters.modelName}`;
if (await FileExists(modelPath)) {
const strategy = getStrategy(selectedConfig);
const newModelPath = modelPath + '-' + strategy.replace(/[:> *+]/g, '-');
toast(t('Start Converting'), { autoClose: 1000, type: 'info' });
ConvertModel(commonStore.settings.customPythonPath, modelPath, strategy, newModelPath).then(async () => {
if (!await FileExists(newModelPath + '.pth')) {
toast(t('Convert Failed') + ' - ' + await GetPyError(), { type: 'error' });
} else {
toast(`${t('Convert Success')} - ${newModelPath}`, { type: 'success' });
{
selectedConfig.modelParameters.device !== 'WebGPU' ?
<ToolTipButton text={t('Convert')}
desc={t('Convert model with these configs. Using a converted model will greatly improve the loading speed, but model parameters of the converted model cannot be modified.')}
onClick={async () => {
if (commonStore.platform === 'darwin') {
toast(t('MacOS is not yet supported for performing this operation, please do it manually.') + ' (backend-python/convert_model.py)', { type: 'info' });
return;
} else if (commonStore.platform === 'linux') {
toast(t('Linux is not yet supported for performing this operation, please do it manually.') + ' (backend-python/convert_model.py)', { type: 'info' });
return;
}
}).catch(e => {
const errMsg = e.message || e;
if (errMsg.includes('path contains space'))
toast(`${t('Convert Failed')} - ${t('File Path Cannot Contain Space')}`, { type: 'error' });
else
toast(`${t('Convert Failed')} - ${e.message || e}`, { type: 'error' });
});
setTimeout(WindowShow, 1000);
} else {
toast(`${t('Model Not Found')} - ${modelPath}`, { type: 'error' });
}
}} />
const ok = await checkDependencies(navigate);
if (!ok)
return;
const modelPath = `${commonStore.settings.customModelsPath}/${selectedConfig.modelParameters.modelName}`;
if (await FileExists(modelPath)) {
const strategy = getStrategy(selectedConfig);
const newModelPath = modelPath + '-' + strategy.replace(/[:> *+]/g, '-');
toast(t('Start Converting'), { autoClose: 1000, type: 'info' });
ConvertModel(commonStore.settings.customPythonPath, modelPath, strategy, newModelPath).then(async () => {
if (!await FileExists(newModelPath + '.pth')) {
toast(t('Convert Failed') + ' - ' + await GetPyError(), { type: 'error' });
} else {
toast(`${t('Convert Success')} - ${newModelPath}`, { type: 'success' });
}
}).catch(e => {
const errMsg = e.message || e;
if (errMsg.includes('path contains space'))
toast(`${t('Convert Failed')} - ${t('File Path Cannot Contain Space')}`, { type: 'error' });
else
toast(`${t('Convert Failed')} - ${e.message || e}`, { type: 'error' });
});
setTimeout(WindowShow, 1000);
} else {
toast(`${t('Model Not Found')} - ${modelPath}`, { type: 'error' });
}
}} /> :
<ToolTipButton text={t('Convert To Safe Tensors Format')}
desc=""
onClick={async () => {
if (commonStore.platform === 'darwin') {
toast(t('MacOS is not yet supported for performing this operation, please do it manually.') + ' (backend-python/convert_safetensors.py)', { type: 'info' });
return;
} else if (commonStore.platform === 'linux') {
toast(t('Linux is not yet supported for performing this operation, please do it manually.') + ' (backend-python/convert_safetensors.py)', { type: 'info' });
return;
}
const ok = await checkDependencies(navigate);
if (!ok)
return;
const modelPath = `${commonStore.settings.customModelsPath}/${selectedConfig.modelParameters.modelName}`;
if (await FileExists(modelPath)) {
toast(t('Start Converting'), { autoClose: 1000, type: 'info' });
const newModelPath = modelPath.replace(/\.pth$/, '.st');
ConvertSafetensors(commonStore.settings.customPythonPath, modelPath, newModelPath).then(async () => {
if (!await FileExists(newModelPath)) {
toast(t('Convert Failed') + ' - ' + await GetPyError(), { type: 'error' });
} else {
toast(`${t('Convert Success')} - ${newModelPath}`, { type: 'success' });
}
}).catch(e => {
const errMsg = e.message || e;
if (errMsg.includes('path contains space'))
toast(`${t('Convert Failed')} - ${t('File Path Cannot Contain Space')}`, { type: 'error' });
else
toast(`${t('Convert Failed')} - ${e.message || e}`, { type: 'error' });
});
setTimeout(WindowShow, 1000);
} else {
toast(`${t('Model Not Found')} - ${modelPath}`, { type: 'error' });
}
}} />
}
<Labeled label={t('Strategy')} content={
<Dropdown style={{ minWidth: 0 }} className="grow" value={t(selectedConfig.modelParameters.device)!}
selectedOptions={[selectedConfig.modelParameters.device]}
@@ -284,11 +352,13 @@ export const Configs: FC = observer(() => {
<Option value="CPU">CPU</Option>
{commonStore.platform === 'darwin' && <Option value="MPS">MPS</Option>}
<Option value="CUDA">CUDA</Option>
<Option value="CUDA-Beta">{t('CUDA (Beta, Faster)')!}</Option>
<Option value="WebGPU">WebGPU</Option>
<Option value="Custom">{t('Custom')!}</Option>
</Dropdown>
} />
{
selectedConfig.modelParameters.device != 'Custom' && <Labeled label={t('Precision')}
selectedConfig.modelParameters.device !== 'Custom' && <Labeled label={t('Precision')}
desc={t('int8 uses less VRAM, but has slightly lower quality. fp16 has higher quality, and fp32 has the best quality.')}
content={
<Dropdown style={{ minWidth: 0 }} className="grow"
@@ -303,17 +373,17 @@ export const Configs: FC = observer(() => {
}}>
<Option>fp16</Option>
<Option>int8</Option>
<Option>fp32</Option>
{selectedConfig.modelParameters.device !== 'WebGPU' && <Option>fp32</Option>}
</Dropdown>
} />
}
{
selectedConfig.modelParameters.device == 'CUDA' &&
selectedConfig.modelParameters.device.includes('CUDA') &&
<Labeled label={t('Current Strategy')}
content={<Text> {getStrategy(selectedConfig)} </Text>} />
}
{
selectedConfig.modelParameters.device == 'CUDA' &&
selectedConfig.modelParameters.device.includes('CUDA') &&
<Labeled label={t('Stored Layers')}
desc={t('Number of the neural network layers loaded into VRAM, the more you load, the faster the speed, but it consumes more VRAM. (If your VRAM is not enough, it will fail to load)')}
content={
@@ -326,9 +396,7 @@ export const Configs: FC = observer(() => {
}} />
} />
}
{
selectedConfig.modelParameters.device == 'CUDA' && <div />
}
{selectedConfig.modelParameters.device.includes('CUDA') && <div />}
{
displayStrategyImg &&
<img style={{ width: '80vh', height: 'auto', zIndex: 100 }}
@@ -336,13 +404,13 @@ export const Configs: FC = observer(() => {
src={commonStore.settings.language === 'zh' ? strategyZhImg : strategyImg} />
}
{
selectedConfig.modelParameters.device == 'Custom' &&
selectedConfig.modelParameters.device === 'Custom' &&
<Labeled label="Strategy"
onMouseEnter={() => setDisplayStrategyImg(true)}
onMouseLeave={() => setDisplayStrategyImg(false)}
content={
<Input className="grow"
placeholder={commonStore.platform != 'darwin' ? 'cuda:0 fp16 *20 -> cuda:1 fp16' : 'mps fp32'}
placeholder={commonStore.platform !== 'darwin' ? 'cuda:0 fp16 *20 -> cuda:1 fp16' : 'mps fp32'}
value={selectedConfig.modelParameters.customStrategy}
onChange={(e, data) => {
setSelectedConfigModelParams({
@@ -351,9 +419,9 @@ export const Configs: FC = observer(() => {
}} />
} />
}
{selectedConfig.modelParameters.device == 'Custom' && <div />}
{selectedConfig.modelParameters.device === 'Custom' && <div />}
{
selectedConfig.modelParameters.device != 'CPU' && selectedConfig.modelParameters.device != 'MPS' &&
(selectedConfig.modelParameters.device.includes('CUDA') || selectedConfig.modelParameters.device === 'Custom') &&
<Labeled label={t('Use Custom CUDA kernel to Accelerate')}
desc={t('Enabling this option can greatly improve inference speed and save some VRAM, but there may be compatibility issues. If it fails to start, please turn off this option.')}
content={
@@ -365,6 +433,40 @@ export const Configs: FC = observer(() => {
}} />
} />
}
{selectedConfig.modelParameters.device !== 'WebGPU' &&
<Accordion className="sm:col-span-2" collapsible
openItems={!commonStore.modelParamsCollapsed && 'advanced'}
onToggle={(e, data) => {
if (data.value === 'advanced')
commonStore.setModelParamsCollapsed(!commonStore.modelParamsCollapsed);
}}>
<AccordionItem value="advanced">
<AccordionHeader ref={advancedHeaderRef} size="small">{t('Advanced')}</AccordionHeader>
<AccordionPanel>
<div className="flex flex-col">
<div className="flex grow">
<Checkbox className="select-none"
size="large" label={t('Use Custom Tokenizer')}
checked={selectedConfig.modelParameters.useCustomTokenizer}
onChange={(_, data) => {
setSelectedConfigModelParams({
useCustomTokenizer: data.checked as boolean
});
}} />
<Input className="grow"
placeholder={t('Tokenizer Path (e.g. backend-python/rwkv_pip/20B_tokenizer.json)')!}
value={selectedConfig.modelParameters.customTokenizer}
onChange={(e, data) => {
setSelectedConfigModelParams({
customTokenizer: data.value
});
}} />
</div>
</div>
</AccordionPanel>
</AccordionItem>
</Accordion>
}
</div>
}
/>
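
Review note: both convert buttons derive the output path from the input path with a regular expression. The sketch below restates those two derivations as standalone functions (the function names are hypothetical) so the resulting file names are easier to see:

```typescript
// Hypothetical helpers that restate the path logic used by the two buttons.

// ConvertModel target: strategy characters ':', '>', ' ', '*', '+' become '-'.
// The diff then checks for `${result}.pth` to decide whether conversion succeeded.
function convertedModelPath(modelPath: string, strategy: string): string {
  return modelPath + '-' + strategy.replace(/[:> *+]/g, '-');
}

// ConvertSafetensors target: a trailing ".pth" is swapped for ".st".
function safetensorsPath(modelPath: string): string {
  return modelPath.replace(/\.pth$/, '.st');
}
```

One consequence worth noting: if the model name does not end in `.pth`, `safetensorsPath` returns its input unchanged, so the `FileExists(newModelPath)` success check in the diff would pass even if nothing was written.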

@@ -1,10 +1,10 @@
import React, { FC } from 'react';
import React, { FC, useEffect } from 'react';
import { useTranslation } from 'react-i18next';
import { Page } from '../components/Page';
import { observer } from 'mobx-react-lite';
import commonStore from '../stores/commonStore';
import { Divider, Field, ProgressBar } from '@fluentui/react-components';
import { bytesToGb, bytesToKb, bytesToMb } from '../utils';
import { bytesToGb, bytesToKb, bytesToMb, refreshLocalModels } from '../utils';
import { ToolTipButton } from '../components/ToolTipButton';
import { Folder20Regular, Pause20Regular, Play20Regular } from '@fluentui/react-icons';
import { AddToDownloadList, OpenFileFolder, PauseDownload } from '../../wailsjs/go/backend_golang/App';
@@ -23,6 +23,12 @@ export type DownloadStatus = {
export const Downloads: FC = observer(() => {
const { t } = useTranslation();
const finishedModelsLen = commonStore.downloadList.filter((status) => status.done && status.name.endsWith('.pth')).length;
useEffect(() => {
if (finishedModelsLen > 0)
refreshLocalModels({ models: commonStore.modelSourceList }, false);
console.log('finishedModelsLen:', finishedModelsLen);
}, [finishedModelsLen]);
let displayList = commonStore.downloadList.slice();
const downloadListNames = displayList.map(s => s.name);
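
Review note: the new effect is keyed on `finishedModelsLen`, so the local model list is refreshed only when the number of completed `.pth` downloads changes. A minimal sketch of that count as a pure function (the `DownloadEntry` type and function name are assumptions made for illustration; the real `DownloadStatus` type has more fields):

```typescript
// Hypothetical reduced shape of a download entry; only the fields the count needs.
type DownloadEntry = { name: string; done: boolean };

// Counts completed downloads whose file name ends with ".pth",
// the same predicate the component uses for finishedModelsLen.
function countFinishedModels(list: DownloadEntry[]): number {
  return list.filter((s) => s.done && s.name.endsWith('.pth')).length;
}
```

Because the predicate only matches `.pth`, finished downloads in other formats would not trigger a refresh under this logic.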

@@ -27,7 +27,7 @@ export type ModelSourceItem = {
name: string;
size: number;
lastUpdated: string;
desc?: { [lang: string]: string; };
desc?: { [lang: string]: string | undefined; };
SHA256?: string;
url?: string;
downloadUrl?: string;
@@ -63,10 +63,10 @@ const columns: TableColumnDefinition<ModelSourceItem>[] = [
const lang: string = commonStore.settings.language;
if (a.desc && b.desc) {
if (lang in a.desc && lang in b.desc)
return b.desc[lang].localeCompare(a.desc[lang]);
else if ('en' in a.desc && 'en' in b.desc)
return b.desc['en'].localeCompare(a.desc['en']);
if (lang in a.desc && lang in b.desc && a.desc[lang] && b.desc[lang])
return b.desc[lang]!.localeCompare(a.desc[lang]!);
else if ('en' in a.desc && 'en' in b.desc && a.desc['en'] && b.desc['en'])
return b.desc['en']!.localeCompare(a.desc['en']!);
}
return 0;
},
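
Review note: the comparator change guards against description entries that are present but `undefined`. A condensed sketch of the same fallback order (current language first, then `en`, otherwise equal) is shown below; `compareDesc` is a hypothetical name, and the loop is a restructuring rather than the code from the commit:

```typescript
type Desc = { [lang: string]: string | undefined };

// Compares two optional description maps in descending order, preferring the
// UI language and falling back to English; returns 0 when no pair is usable.
function compareDesc(a: Desc | undefined, b: Desc | undefined, lang: string): number {
  if (a && b) {
    for (const key of [lang, 'en']) {
      const x = a[key];
      const y = b[key];
      // Both sides must be non-empty strings before localeCompare is called.
      if (x && y) return y.localeCompare(x);
    }
  }
  return 0;
}
```

Checking truthiness of both values makes the `in` checks and non-null assertions from the diff unnecessary in this form.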

@@ -56,6 +56,9 @@ export type Preset = {
stop: string,
injectStart: string,
injectEnd: string,
presystem?: boolean,
userName?: string,
assistantName?: string
}
export const defaultPreset: Preset = {
@@ -250,14 +253,41 @@ export const ChatPresetEditor: FC<{
}} />
<Button onClick={() => {
setEditingMessages(!editingMessages);
}}>{!editingMessages ? t('Edit Messages') : t('Go Back')}</Button>
}}>{!editingMessages ? t('Edit Character Settings') : t('Go Back')}</Button>
</div>
} />
{
editingMessages ?
<MessagesEditor /> :
<div className="flex flex-col gap-1">
<Labeled flex spaceBetween label={t('Insert default system prompt at the beginning')}
content={
<Switch checked={editingPreset.presystem === undefined ? true : editingPreset.presystem}
onChange={(e, data) => {
setEditingPreset({
presystem: data.checked
});
}} />
} />
<Labeled flex breakline label={t('User Name')}
content={
<Input placeholder="User" value={editingPreset.userName} onChange={(e, data) => {
setEditingPreset({
userName: data.value
});
}} />
} />
<Labeled flex breakline label={t('Assistant Name')}
content={
<Input placeholder="Assistant" value={editingPreset.assistantName} onChange={(e, data) => {
setEditingPreset({
assistantName: data.value
});
}} />
} />
<MessagesEditor />
</div> :
<div className="flex flex-col gap-1 p-2 overflow-x-hidden overflow-y-auto">
<Labeled flex breakline label={t('Description')}
<Labeled flex breakline label={`${t('Description')} (${t('Preview Only')})`}
content={
<Input value={editingPreset.desc} onChange={(e, data) => {
setEditingPreset({

@@ -19,7 +19,8 @@ import { RestartApp } from '../../wailsjs/go/backend_golang/App';
export const Languages = {
dev: 'English', // i18n default
zh: '简体中文'
zh: '简体中文',
ja: '日本語'
};
export type Language = keyof typeof Languages;
@@ -125,7 +126,7 @@ export const Settings: FC = observer(() => {
} />
}
{
commonStore.settings.language === 'zh' && commonStore.platform != 'linux' &&
commonStore.settings.language === 'zh' && commonStore.platform !== 'linux' &&
<Labeled label={t('Use Tsinghua Pip Mirrors')} flex spaceBetween content={
<Switch checked={commonStore.settings.cnMirror}
onChange={(e, data) => {

@@ -91,7 +91,7 @@ export type DataProcessParameters = {
vocabPath: string;
}
export type LoraFinetunePrecision = 'bf16' | 'fp16' | 'fp32' | 'tf32';
export type LoraFinetunePrecision = 'bf16' | 'fp16' | 'tf32';
export type LoraFinetuneParameters = {
baseModel: string;
@@ -154,7 +154,7 @@ const showError = (e: any) => {
};
const errorsMap = Object.entries({
'python3 ./finetune/lora/train.py': 'Memory is not enough, try to increase the virtual memory or use a smaller base model.',
'python3 ./finetune/lora/train.py': 'Memory is not enough, try to increase the virtual memory (Swap of WSL) or use a smaller base model.',
'cuda out of memory': 'VRAM is not enough',
'valueerror: high <= 0': 'Training data is not enough, reduce context length or add more data for training',
'+= \'+ptx\'': 'You are using WSL 1 for training, please upgrade to WSL 2. e.g. Run "wsl --set-version Ubuntu-22.04 2"',
@@ -219,7 +219,7 @@ const Terminal: FC = observer(() => {
WslStart().then(() => {
addWslMessage('WSL> ' + input);
setInput('');
WslCommand(input).catch(showError);
WslCommand(input).then(WindowShow).catch(showError);
}).catch(showError);
}
};
@@ -544,7 +544,6 @@ const LoraFinetune: FC = observer(() => {
>
<Option>bf16</Option>
<Option>fp16</Option>
<Option>fp32</Option>
<Option>tf32</Option>
</Dropdown>
: <div />

@@ -1,4 +1,113 @@
import { ModelConfig } from './Configs';
import { CompletionPreset } from './Completion';
export const defaultCompositionPrompt = '<pad>';
export const defaultPresets: CompletionPreset[] = [{
name: 'Writer',
prompt: 'The following is an epic science fiction masterpiece that is immortalized, with delicate descriptions and grand depictions of interstellar civilization wars.\nChapter 1.\n',
params: {
maxResponseToken: 500,
temperature: 1.2,
topP: 0.5,
presencePenalty: 0.4,
frequencyPenalty: 0.4,
stop: '\\n\\nUser',
injectStart: '',
injectEnd: ''
}
}, {
name: 'Translator',
prompt: 'Translate this into Chinese.\n\nEnglish: What rooms do you have available?',
params: {
maxResponseToken: 500,
temperature: 1,
topP: 0.3,
presencePenalty: 0,
frequencyPenalty: 1,
stop: '\\n\\n',
injectStart: '\\nChinese: ',
injectEnd: '\\n\\nEnglish: '
}
}, {
name: 'Catgirl',
prompt: 'The following is a conversation between a cat girl and her owner. The cat girl is a humanized creature that behaves like a cat but is humanoid. At the end of each sentence in the dialogue, she will add \"Meow~\". In the following content, User represents the owner and Assistant represents the cat girl.\n\nUser: Hello.\n\nAssistant: I\'m here, meow~.\n\nUser: Can you tell jokes?',
params: {
maxResponseToken: 500,
temperature: 1.2,
topP: 0.5,
presencePenalty: 0.4,
frequencyPenalty: 0.4,
stop: '\\n\\nUser',
injectStart: '\\n\\nAssistant: ',
injectEnd: '\\n\\nUser: '
}
}, {
name: 'Chinese Kongfu',
prompt: 'User: 请你扮演一个文本冒险游戏,我是游戏主角。这是一个玄幻修真世界,有四大门派。我输入我的行动,请你显示行动结果,并具体描述环境。我的第一个行动是“醒来”,请开始故事。',
params: {
maxResponseToken: 500,
temperature: 1.1,
topP: 0.7,
presencePenalty: 0.3,
frequencyPenalty: 0.3,
stop: '\\n\\nUser',
injectStart: '\\n\\nAssistant: ',
injectEnd: '\\n\\nUser: '
}
}, {
name: 'Code Generation',
prompt: 'def sum(',
params: {
maxResponseToken: 500,
temperature: 1,
topP: 0.3,
presencePenalty: 0,
frequencyPenalty: 1,
stop: '\\n\\n',
injectStart: '',
injectEnd: ''
}
}, {
name: 'Werewolf',
prompt: 'There is currently a game of Werewolf with six players, including a Seer (who can check identities at night), two Werewolves (who can choose someone to kill at night), a Bodyguard (who can choose someone to protect at night), two Villagers (with no special abilities), and a game host. User will play as Player 1, Assistant will play as Players 2-6 and the game host, and they will begin playing together. Every night, the host will ask User for his action and simulate the actions of the other players. During the day, the host will oversee the voting process and ask User for his vote. \n\nAssistant: Next, I will act as the game host and assign everyone their roles, including randomly assigning yours. Then, I will simulate the actions of Players 2-6 and let you know what happens each day. Based on your assigned role, you can tell me your actions and I will let you know the corresponding results each day.\n\nUser: Okay, I understand. Let\'s begin. Please assign me a role. Am I the Seer, Werewolf, Villager, or Bodyguard?\n\nAssistant: You are the Seer. Now that night has fallen, please choose a player to check his identity.\n\nUser: Tonight, I want to check Player 2 and find out his role.',
params: {
maxResponseToken: 500,
temperature: 1.2,
topP: 0.4,
presencePenalty: 0.5,
frequencyPenalty: 0.5,
stop: '\\n\\nUser',
injectStart: '\\n\\nAssistant: ',
injectEnd: '\\n\\nUser: '
}
}, {
name: 'Instruction',
prompt: 'Instruction: Write a story using the following information\n\nInput: A man named Alex chops a tree down\n\nResponse:',
params: {
maxResponseToken: 500,
temperature: 1,
topP: 0.3,
presencePenalty: 0,
frequencyPenalty: 1,
stop: '',
injectStart: '',
injectEnd: ''
}
}, {
name: 'Blank',
prompt: '',
params: {
maxResponseToken: 500,
temperature: 1,
topP: 0.3,
presencePenalty: 0,
frequencyPenalty: 1,
stop: '',
injectStart: '',
injectEnd: ''
}
}];
export const defaultModelConfigsMac: ModelConfig[] = [
{
@@ -153,6 +262,42 @@ export const defaultModelConfigsMac: ModelConfig[] = [
customStrategy: 'mps fp32'
}
},
{
name: 'CPU-120M-Music',
apiParameters: {
apiPort: 8000,
maxResponseToken: 4100,
temperature: 1.0,
topP: 0.8,
presencePenalty: 0,
frequencyPenalty: 1
},
modelParameters: {
modelName: 'RWKV-4-MIDI-120M-v1-20230714-ctx4096.pth',
device: 'CPU',
precision: 'fp32',
storedLayers: 41,
maxStoredLayers: 41
}
},
{
name: 'CPU-560M-Music',
apiParameters: {
apiPort: 8000,
maxResponseToken: 4100,
temperature: 1.0,
topP: 0.8,
presencePenalty: 0,
frequencyPenalty: 1
},
modelParameters: {
modelName: 'RWKV-4-MIDI-560M-v1-20230717-ctx4096.pth',
device: 'CPU',
precision: 'fp32',
storedLayers: 41,
maxStoredLayers: 41
}
},
{
name: 'CPU-6G-1B5-World',
apiParameters: {
@@ -332,7 +477,7 @@ export const defaultModelConfigs: ModelConfig[] = [
modelParameters: {
modelName: 'RWKV-4-World-0.1B-v1-20230520-ctx4096.pth',
device: 'CUDA',
precision: 'fp32',
precision: 'fp32', // using fp16 will disable state cache (->)
storedLayers: 41,
maxStoredLayers: 41
}
@@ -980,6 +1125,42 @@ export const defaultModelConfigs: ModelConfig[] = [
useCustomCuda: true
}
},
{
name: 'CPU-120M-Music',
apiParameters: {
apiPort: 8000,
maxResponseToken: 4100,
temperature: 1.0,
topP: 0.8,
presencePenalty: 0,
frequencyPenalty: 1
},
modelParameters: {
modelName: 'RWKV-4-MIDI-120M-v1-20230714-ctx4096.pth',
device: 'CPU',
precision: 'fp32',
storedLayers: 41,
maxStoredLayers: 41
}
},
{
name: 'CPU-560M-Music',
apiParameters: {
apiPort: 8000,
maxResponseToken: 4100,
temperature: 1.0,
topP: 0.8,
presencePenalty: 0,
frequencyPenalty: 1
},
modelParameters: {
modelName: 'RWKV-4-MIDI-560M-v1-20230717-ctx4096.pth',
device: 'CPU',
precision: 'fp32',
storedLayers: 41,
maxStoredLayers: 41
}
},
{
name: 'CPU-6G-1B5-World',
apiParameters: {

@@ -8,6 +8,7 @@ import {
DocumentSettings20Regular,
Home20Regular,
Info20Regular,
MusicNote220Regular,
Settings20Regular,
Storage20Regular
} from '@fluentui/react-icons';
@@ -19,6 +20,7 @@ import { Settings } from './Settings';
import { About } from './About';
import { Downloads } from './Downloads';
import { Completion } from './Completion';
import { Composition } from './Composition';
type NavigationItem = {
label: string;
@@ -50,6 +52,13 @@ export const pages: NavigationItem[] = [
element: <Completion />,
top: true
},
{
label: 'Composition',
path: '/composition',
icon: <MusicNote220Regular />,
element: <Composition />,
top: true
},
{
label: 'Configs',
path: '/configs',

@@ -2,11 +2,12 @@ import commonStore, { Platform } from './stores/commonStore';
import { GetPlatform, ListDirFiles, ReadJson } from '../wailsjs/go/backend_golang/App';
import { Cache, checkUpdate, downloadProgramFiles, LocalConfig, refreshLocalModels, refreshModels } from './utils';
import { getStatus } from './apis';
import { EventsOn } from '../wailsjs/runtime';
import { EventsOn, WindowSetTitle } from '../wailsjs/runtime';
import manifest from '../../manifest.json';
import { defaultModelConfigs, defaultModelConfigsMac } from './pages/defaultModelConfigs';
import { defaultModelConfigs, defaultModelConfigsMac } from './pages/defaultConfigs';
import { Preset } from './pages/PresetsManager/PresetsButton';
import { wslHandler } from './pages/Train';
import { t } from 'i18next';
export async function startup() {
downloadProgramFiles();
@@ -23,6 +24,8 @@ export async function startup() {
initPresets();
initHardwareMonitor();
await GetPlatform().then(p => commonStore.setPlatform(p as Platform));
await initConfig();
@@ -70,7 +73,7 @@ async function initConfig() {
configData.currentModelConfigIndex >= 0 && configData.currentModelConfigIndex < configData.modelConfigs.length)
commonStore.setCurrentConfigIndex(configData.currentModelConfigIndex, false);
}).catch(() => {
commonStore.setModelConfigs(commonStore.platform != 'darwin' ? defaultModelConfigs : defaultModelConfigsMac, true);
commonStore.setModelConfigs(commonStore.platform !== 'darwin' ? defaultModelConfigs : defaultModelConfigsMac, true);
});
}
@@ -117,3 +120,20 @@ async function initLocalModelsNotify() {
refreshLocalModels({ models: commonStore.modelSourceList }, false); //TODO fix bug that only add models
});
}
type monitorData = {
usedMemory: number;
totalMemory: number;
gpuUsage: number;
gpuPower: number;
usedVram: number;
totalVram: number;
}
async function initHardwareMonitor() {
EventsOn('monitor', (data: string) => {
const results: monitorData = JSON.parse(data);
if (results)
WindowSetTitle(`RWKV-Runner (${t('RAM')}: ${results.usedMemory.toFixed(1)}/${results.totalMemory.toFixed(1)} GB, ${t('VRAM')}: ${(results.usedVram / 1024).toFixed(1)}/${(results.totalVram / 1024).toFixed(1)} GB, ${t('GPU Usage')}: ${results.gpuUsage}%)`);
});
}
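
Review note: the window title is assembled inline inside the event callback. A minimal sketch of the same formatting as a pure function follows. It drops the `t(...)` translation calls for brevity and assumes, based on the division by 1024 in the diff, that VRAM values arrive in MB while RAM values arrive in GB; `formatMonitorTitle` is a hypothetical name:

```typescript
type MonitorData = {
  usedMemory: number;  // assumed GB
  totalMemory: number; // assumed GB
  gpuUsage: number;    // percent
  gpuPower: number;    // not shown in the title
  usedVram: number;    // assumed MB
  totalVram: number;   // assumed MB
};

// Builds the title string with one decimal place for memory figures.
function formatMonitorTitle(d: MonitorData): string {
  const mbToGb = (mb: number) => (mb / 1024).toFixed(1);
  return `RWKV-Runner (RAM: ${d.usedMemory.toFixed(1)}/${d.totalMemory.toFixed(1)} GB, ` +
    `VRAM: ${mbToGb(d.usedVram)}/${mbToGb(d.totalVram)} GB, GPU Usage: ${d.gpuUsage}%)`;
}
```

Note that `JSON.parse` in the original callback will throw on a malformed payload before the `if (results)` check runs, so a `try`/`catch` around the parse may be worth considering.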

@@ -11,11 +11,12 @@ import { IntroductionContent } from '../pages/Home';
import { AboutContent } from '../pages/About';
import i18n from 'i18next';
import { CompletionPreset } from '../pages/Completion';
import { defaultModelConfigs, defaultModelConfigsMac } from '../pages/defaultModelConfigs';
import { defaultCompositionPrompt, defaultModelConfigs, defaultModelConfigsMac } from '../pages/defaultConfigs';
import commonStore from './commonStore';
import { Preset } from '../pages/PresetsManager/PresetsButton';
import { DataProcessParameters, LoraFinetuneParameters } from '../pages/Train';
import { ChartData } from 'chart.js';
import { CompositionParams } from '../pages/Composition';
export enum ModelStatus {
Offline,
@@ -57,9 +58,23 @@ class CommonStore {
completionPreset: CompletionPreset | null = null;
completionGenerating: boolean = false;
completionSubmittedPrompt: string = '';
// composition
compositionParams: CompositionParams = {
prompt: defaultCompositionPrompt,
maxResponseToken: 200,
temperature: 1,
topP: 0.8,
autoPlay: true,
useLocalSoundFont: false,
midi: null,
ns: null
};
compositionGenerating: boolean = false;
compositionSubmittedPrompt: string = defaultCompositionPrompt;
// configs
currentModelConfigIndex: number = 0;
modelConfigs: ModelConfig[] = [];
modelParamsCollapsed: boolean = true;
// models
modelSourceManifestList: string = 'https://cdn.jsdelivr.net/gh/josstorer/RWKV-Runner@master/manifest.json;';
modelSourceList: ModelSourceItem[] = [];
@@ -153,7 +168,7 @@ class CommonStore {
createModelConfig = (config: ModelConfig = defaultModelConfigs[0], saveConfig: boolean = true) => {
if (config.name === defaultModelConfigs[0].name) {
// deep copy
config = JSON.parse(JSON.stringify(commonStore.platform != 'darwin' ? defaultModelConfigs[0] : defaultModelConfigsMac[0]));
config = JSON.parse(JSON.stringify(commonStore.platform !== 'darwin' ? defaultModelConfigs[0] : defaultModelConfigsMac[0]));
config.name = new Date().toLocaleString();
}
this.modelConfigs.push(config);
@@ -245,6 +260,10 @@ class CommonStore {
this.advancedCollapsed = value;
}
setModelParamsCollapsed(value: boolean) {
this.modelParamsCollapsed = value;
}
setLastUnfinishedModelDownloads(value: DownloadStatus[]) {
this.lastUnfinishedModelDownloads = value;
}
@@ -267,6 +286,18 @@ class CommonStore {
this.completionSubmittedPrompt = value;
}
setCompositionParams(value: CompositionParams) {
this.compositionParams = value;
}
setCompositionGenerating(value: boolean) {
this.compositionGenerating = value;
}
setCompositionSubmittedPrompt(value: string) {
this.compositionSubmittedPrompt = value;
}
setWslStdout(value: string) {
this.wslStdout = value;
}


@@ -28,6 +28,7 @@ body {
/* Works on Chrome, Edge, and Safari */
*::-webkit-scrollbar {
width: 9px;
height: 9px;
}
*::-webkit-scrollbar-thumb {
@@ -92,3 +93,22 @@ body {
}
}
}
midi-player {
&::part(control-panel) {
background: none;
}
}
midi-visualizer {
$instrument-colors: #007bff, #20c997, #dc3545, #6610f2, #ffc107, #e83e8c, #17a2b8, #fd7e14, #28a745;
svg {
@for $i from 0 to 200 {
$color: nth($instrument-colors, ($i % length($instrument-colors)) + 1);
rect.note[data-instrument="#{$i}"] {
fill: $color;
}
}
}
}
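Review note: the Sass loop above cycles through nine colors, and the `+ 1` is needed because Sass `nth` is 1-indexed. `@for ... from 0 to 200` excludes the upper bound, so rules are generated for instruments 0 through 199. The same mapping can be sketched in TypeScript as follows; `colorForInstrument` is a hypothetical name used only for illustration.

```typescript
// Same palette as $instrument-colors in the stylesheet above.
const instrumentColors: string[] = [
  '#007bff', '#20c997', '#dc3545', '#6610f2', '#ffc107',
  '#e83e8c', '#17a2b8', '#fd7e14', '#28a745',
];

// Hypothetical helper: mirrors nth($instrument-colors, ($i % length) + 1).
// Arrays are 0-indexed in TypeScript, so the "+ 1" from the Sass version is dropped.
function colorForInstrument(instrument: number): string {
  return instrumentColors[instrument % instrumentColors.length];
}

// Mirrors the @for loop: one selector/color pair per instrument index 0..199.
function buildNoteRules(count: number = 200): string[] {
  const rules: string[] = [];
  for (let i = 0; i < count; i++) {
    rules.push(`rect.note[data-instrument="${i}"] { fill: ${colorForInstrument(i)}; }`);
  }
  return rules;
}
```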


@@ -0,0 +1,9 @@
declare module JSX {
import { PlayerElement } from 'html-midi-player';
import { VisualizerElement } from 'html-midi-player';
interface IntrinsicElements {
'midi-player': PlayerElement;
'midi-visualizer': VisualizerElement;
}
}


@@ -57,6 +57,8 @@ export async function refreshBuiltInModels(readCache: boolean = false) {
return cache;
}
const modelSuffix = ['.pth', '.st', '.safetensors'];
export async function refreshLocalModels(cache: {
models: ModelSourceItem[]
}, filter: boolean = true, initUnfinishedModels: boolean = false) {
@@ -65,7 +67,7 @@ export async function refreshLocalModels(cache: {
await ListDirFiles(commonStore.settings.customModelsPath).then((data) => {
cache.models.push(...data.flatMap(d => {
if (!d.isDir && d.name.endsWith('.pth'))
if (!d.isDir && modelSuffix.some((ext => d.name.endsWith(ext))))
return [{
name: d.name,
size: d.size,
@@ -146,7 +148,7 @@ export async function refreshRemoteModels(cache: { models: ModelSourceItem[] })
.catch(() => {
});
cache.models = cache.models.filter((model, index, self) => {
return model.name.endsWith('.pth')
return modelSuffix.some((ext => model.name.endsWith(ext)))
&& index === self.findIndex(
m => m.name === model.name || (m.SHA256 && m.SHA256 === model.SHA256 && m.size === model.size));
});
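Review note: the filter above now does two things at once. It keeps only entries whose name ends in one of the `modelSuffix` extensions, and it drops later duplicates that share a name or share both SHA256 and size with an earlier entry. A self-contained sketch of that behavior follows; the trimmed `ModelSourceItem` shape and the `filterAndDedupeModels` name are assumptions made for illustration.

```typescript
// Assumed minimal shape; the real ModelSourceItem has more fields.
type ModelSourceItem = {
  name: string;
  size: number;
  SHA256?: string;
};

const modelSuffix: string[] = ['.pth', '.st', '.safetensors'];

// Hypothetical wrapper around the filter expression shown in the diff.
function filterAndDedupeModels(models: ModelSourceItem[]): ModelSourceItem[] {
  return models.filter((model, index, self) => {
    const hasKnownSuffix = modelSuffix.some((ext) => model.name.endsWith(ext));
    // The first entry that matches by name, or by hash plus size, wins.
    const firstMatch = self.findIndex(
      (m) =>
        m.name === model.name ||
        (!!m.SHA256 && m.SHA256 === model.SHA256 && m.size === model.size)
    );
    return hasKnownSuffix && index === firstMatch;
  });
}
```

Note that a supported file is also dropped when an earlier entry with an unsupported extension has the same hash and size, because the duplicate check runs against the unfiltered list.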
@@ -176,7 +178,11 @@ export const getStrategy = (modelConfig: ModelConfig | undefined = undefined) =>
strategy += 'cpu ';
strategy += params.precision === 'int8' ? 'fp32i8' : 'fp32';
break;
case 'WebGPU':
strategy += params.precision === 'int8' ? 'fp16i8' : 'fp16';
break;
case 'CUDA':
case 'CUDA-Beta':
if (avoidOverflow)
strategy = 'cuda fp32 *1 -> ';
strategy += 'cuda ';
@@ -239,7 +245,7 @@ export function downloadProgramFiles() {
manifest.programFiles.forEach(({ url, path }) => {
if (path)
ReadFileInfo(path).then(info => {
if (info.size == 0 && url)
if (info.size === 0 && url)
AddToDownloadList(path, url.replace('@master', '@v' + manifest.version));
}).catch(() => {
if (url)
@@ -372,7 +378,7 @@ export const checkDependencies = async (navigate: NavigateFunction) => {
});
} else {
toast(depErrorMsg, { type: 'info', position: 'bottom-left' });
if (commonStore.platform != 'linux')
if (commonStore.platform !== 'linux')
toastWithButton(t('Python dependencies are incomplete, would you like to install them?'), t('Install'), () => {
InstallPyDep(commonStore.settings.customPythonPath, commonStore.settings.cnMirror).catch((e) => {
const errMsg = e.message || e;


@@ -10,6 +10,8 @@ export function ConvertData(arg1:string,arg2:string,arg3:string,arg4:string):Pro
export function ConvertModel(arg1:string,arg2:string,arg3:string,arg4:string):Promise<string>;
export function ConvertSafetensors(arg1:string,arg2:string,arg3:string):Promise<string>;
export function CopyFile(arg1:string,arg2:string):Promise<void>;
export function DeleteFile(arg1:string):Promise<void>;
@@ -34,6 +36,8 @@ export function OpenFileFolder(arg1:string,arg2:boolean):Promise<void>;
export function OpenSaveFileDialog(arg1:string,arg2:string,arg3:string):Promise<string>;
export function OpenSaveFileDialogBytes(arg1:string,arg2:string,arg3:Array<number>):Promise<string>;
export function PauseDownload(arg1:string):Promise<void>;
export function ReadFileInfo(arg1:string):Promise<backend_golang.FileInfo>;
@@ -44,7 +48,9 @@ export function RestartApp():Promise<void>;
export function SaveJson(arg1:string,arg2:any):Promise<void>;
export function StartServer(arg1:string,arg2:number,arg3:string):Promise<string>;
export function StartServer(arg1:string,arg2:number,arg3:string,arg4:boolean):Promise<string>;
export function StartWebGPUServer(arg1:number,arg2:string):Promise<string>;
export function UpdateApp(arg1:string):Promise<boolean>;


@@ -18,6 +18,10 @@ export function ConvertModel(arg1, arg2, arg3, arg4) {
return window['go']['backend_golang']['App']['ConvertModel'](arg1, arg2, arg3, arg4);
}
export function ConvertSafetensors(arg1, arg2, arg3) {
return window['go']['backend_golang']['App']['ConvertSafetensors'](arg1, arg2, arg3);
}
export function CopyFile(arg1, arg2) {
return window['go']['backend_golang']['App']['CopyFile'](arg1, arg2);
}
@@ -66,6 +70,10 @@ export function OpenSaveFileDialog(arg1, arg2, arg3) {
return window['go']['backend_golang']['App']['OpenSaveFileDialog'](arg1, arg2, arg3);
}
export function OpenSaveFileDialogBytes(arg1, arg2, arg3) {
return window['go']['backend_golang']['App']['OpenSaveFileDialogBytes'](arg1, arg2, arg3);
}
export function PauseDownload(arg1) {
return window['go']['backend_golang']['App']['PauseDownload'](arg1);
}
@@ -86,8 +94,12 @@ export function SaveJson(arg1, arg2) {
return window['go']['backend_golang']['App']['SaveJson'](arg1, arg2);
}
export function StartServer(arg1, arg2, arg3) {
return window['go']['backend_golang']['App']['StartServer'](arg1, arg2, arg3);
export function StartServer(arg1, arg2, arg3, arg4) {
return window['go']['backend_golang']['App']['StartServer'](arg1, arg2, arg3, arg4);
}
export function StartWebGPUServer(arg1, arg2) {
return window['go']['backend_golang']['App']['StartWebGPUServer'](arg1, arg2);
}
export function UpdateApp(arg1) {

11
go.mod

@@ -4,15 +4,16 @@ go 1.20
require (
github.com/cavaliergopher/grab/v3 v3.0.1
github.com/fsnotify/fsnotify v1.6.0
github.com/minio/selfupdate v0.6.0
github.com/nyaosorg/go-windows-su v0.2.1
github.com/ubuntu/gowsl v0.0.0-20230615094051-94945650cc1e
github.com/wailsapp/wails/v2 v2.5.1
github.com/wailsapp/wails/v2 v2.6.0
)
require (
aead.dev/minisign v0.2.0 // indirect
github.com/bep/debounce v1.2.1 // indirect
github.com/fsnotify/fsnotify v1.6.0
github.com/go-ole/go-ole v1.2.6 // indirect
github.com/google/uuid v1.3.0 // indirect
github.com/jchv/go-winloader v0.0.0-20210711035445-715c2860da7e // indirect
@@ -22,8 +23,7 @@ require (
github.com/leaanthony/gosod v1.0.3 // indirect
github.com/leaanthony/slicer v1.6.0 // indirect
github.com/mattn/go-colorable v0.1.13 // indirect
github.com/mattn/go-isatty v0.0.18 // indirect
github.com/nyaosorg/go-windows-su v0.2.1
github.com/mattn/go-isatty v0.0.19 // indirect
github.com/pkg/browser v0.0.0-20210911075715-681adbf594b8 // indirect
github.com/pkg/errors v0.9.1 // indirect
github.com/rivo/uniseg v0.4.4 // indirect
@@ -33,9 +33,10 @@ require (
github.com/ubuntu/decorate v0.0.0-20230125165522-2d5b0a9bb117 // indirect
github.com/valyala/bytebufferpool v1.0.0 // indirect
github.com/valyala/fasttemplate v1.2.2 // indirect
github.com/wailsapp/go-webview2 v1.0.1 // indirect
github.com/wailsapp/mimetype v1.4.1 // indirect
golang.org/x/crypto v0.9.0 // indirect
golang.org/x/exp v0.0.0-20230515195305-f3d0a9c9a5cc // indirect
golang.org/x/exp v0.0.0-20230522175609-2e198f4a06a1 // indirect
golang.org/x/net v0.10.0 // indirect
golang.org/x/sys v0.9.0 // indirect
golang.org/x/text v0.9.0 // indirect

14
go.sum

@@ -36,8 +36,8 @@ github.com/mattn/go-colorable v0.1.13 h1:fFA4WZxdEF4tXPZVKMLwD8oUnCTTo08duU7wxec
github.com/mattn/go-colorable v0.1.13/go.mod h1:7S9/ev0klgBDR4GtXTXX8a3vIGJpMovkB8vQcUbaXHg=
github.com/mattn/go-isatty v0.0.14/go.mod h1:7GGIvUiUoEMVVmxf/4nioHXj79iQHKdU27kJ6hsGG94=
github.com/mattn/go-isatty v0.0.16/go.mod h1:kYGgaQfpe5nmfYZH+SKPsOc2e4SrIfOl2e/yFXSvRLM=
github.com/mattn/go-isatty v0.0.18 h1:DOKFKCQ7FNG2L1rbrmstDN4QVRdS89Nkh85u68Uwp98=
github.com/mattn/go-isatty v0.0.18/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=
github.com/mattn/go-isatty v0.0.19 h1:JITubQf0MOLdlGRuRq+jtsDlekdYPia9ZFsB8h/APPA=
github.com/mattn/go-isatty v0.0.19/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=
github.com/minio/selfupdate v0.6.0 h1:i76PgT0K5xO9+hjzKcacQtO7+MjJ4JKA8Ak8XQ9DDwU=
github.com/minio/selfupdate v0.6.0/go.mod h1:bO02GTIPCMQFTEvE5h4DjYB58bCoZ35XLeBf0buTDdM=
github.com/nyaosorg/go-windows-su v0.2.1 h1:5V0XavLyjOqPUp7psxxCvBISaneU4XmFPSMlejSl5sc=
@@ -69,17 +69,19 @@ github.com/valyala/bytebufferpool v1.0.0/go.mod h1:6bBcMArwyJ5K/AmCkWv1jt77kVWyC
github.com/valyala/fasttemplate v1.2.1/go.mod h1:KHLXt3tVN2HBp8eijSv/kGJopbvo7S+qRAEEKiv+SiQ=
github.com/valyala/fasttemplate v1.2.2 h1:lxLXG0uE3Qnshl9QyaK6XJxMXlQZELvChBOCmQD0Loo=
github.com/valyala/fasttemplate v1.2.2/go.mod h1:KHLXt3tVN2HBp8eijSv/kGJopbvo7S+qRAEEKiv+SiQ=
github.com/wailsapp/go-webview2 v1.0.1 h1:dEJIeEApW/MhO2tTMISZBFZPuW7kwrFA1NtgFB1z1II=
github.com/wailsapp/go-webview2 v1.0.1/go.mod h1:Uk2BePfCRzttBBjFrBmqKGJd41P6QIHeV9kTgIeOZNo=
github.com/wailsapp/mimetype v1.4.1 h1:pQN9ycO7uo4vsUUuPeHEYoUkLVkaRntMnHJxVwYhwHs=
github.com/wailsapp/mimetype v1.4.1/go.mod h1:9aV5k31bBOv5z6u+QP8TltzvNGJPmNJD4XlAL3U+j3o=
github.com/wailsapp/wails/v2 v2.5.1 h1:mfG+2kWqQXYOwdgI43HEILjOZDXbk5woPYI3jP2b+js=
github.com/wailsapp/wails/v2 v2.5.1/go.mod h1:jbOZbcr/zm79PxXxAjP8UoVlDd9wLW3uDs+isIthDfs=
github.com/wailsapp/wails/v2 v2.6.0 h1:EyH0zR/EO6dDiqNy8qU5spaXDfkluiq77xrkabPYD4c=
github.com/wailsapp/wails/v2 v2.6.0/go.mod h1:WBG9KKWuw0FKfoepBrr/vRlyTmHaMibWesK3yz6nNiM=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20210220033148-5ea612d1eb83/go.mod h1:jdWPYTVW3xRLrWPugEBEK3UY2ZEsg3UU495nc5E+M+I=
golang.org/x/crypto v0.0.0-20211209193657-4570a0811e8b/go.mod h1:IxCIyHEi3zRg3s0A5j5BB6A9Jmi73HwBIUl50j+osU4=
golang.org/x/crypto v0.9.0 h1:LF6fAI+IutBocDJ2OT0Q1g8plpYljMZ4+lty+dsqw3g=
golang.org/x/crypto v0.9.0/go.mod h1:yrmDGqONDYtNj3tH8X9dzUun2m2lzPa9ngI6/RUPGR0=
golang.org/x/exp v0.0.0-20230515195305-f3d0a9c9a5cc h1:mCRnTeVUjcrhlRmO0VK8a6k6Rrf6TF9htwo2pJVSjIU=
golang.org/x/exp v0.0.0-20230515195305-f3d0a9c9a5cc/go.mod h1:V1LtkGg67GoY2N1AnLN78QLrzxkLyJw7RJb1gzOOz9w=
golang.org/x/exp v0.0.0-20230522175609-2e198f4a06a1 h1:k/i9J1pBpvlfR+9QsetwPyERsqu1GIbi967PQMq3Ivc=
golang.org/x/exp v0.0.0-20230522175609-2e198f4a06a1/go.mod h1:V1LtkGg67GoY2N1AnLN78QLrzxkLyJw7RJb1gzOOz9w=
golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20210505024714-0287a6fb4125/go.mod h1:9nx3DQGgdP8bBQD5qxJ1jj9UTztislL4KSBs9R2vV5Y=
golang.org/x/net v0.0.0-20211112202133-69e39bad7dc2/go.mod h1:9nx3DQGgdP8bBQD5qxJ1jj9UTztislL4KSBs9R2vV5Y=

47
main.go

@@ -2,6 +2,9 @@ package main
import (
"embed"
"fmt"
"net/http"
"os"
"runtime/debug"
"strings"
@@ -13,6 +16,27 @@ import (
"github.com/wailsapp/wails/v2/pkg/options/windows"
)
type FileLoader struct {
http.Handler
}
func NewFileLoader() *FileLoader {
return &FileLoader{}
}
func (h *FileLoader) ServeHTTP(res http.ResponseWriter, req *http.Request) {
var err error
requestedFilename := strings.TrimPrefix(req.URL.Path, "/")
println("Requesting file:", requestedFilename)
fileData, err := os.ReadFile(requestedFilename)
if err != nil {
res.WriteHeader(http.StatusBadRequest)
res.Write([]byte(fmt.Sprintf("Could not load file %s", requestedFilename)))
// Stop here so the (nil) file data is not written after the error message.
return
}
res.Write(fileData)
}
//go:embed all:frontend/dist
var assets embed.FS
@@ -25,15 +49,31 @@ var cyacInfo embed.FS
//go:embed backend-python
var py embed.FS
//go:embed backend-rust
var webgpu embed.FS
//go:embed finetune
var finetune embed.FS
//go:embed midi
var midi embed.FS
//go:embed assets/sound-font
var midiAssets embed.FS
//go:embed components
var components embed.FS
func main() {
if buildInfo, ok := debug.ReadBuildInfo(); !ok || strings.Contains(buildInfo.String(), "-ldflags") {
backend.CopyEmbed(cyac)
backend.CopyEmbed(cyacInfo)
backend.CopyEmbed(py)
backend.CopyEmbed(webgpu)
backend.CopyEmbed(finetune)
backend.CopyEmbed(midi)
backend.CopyEmbed(midiAssets)
backend.CopyEmbed(components)
}
// Create an instance of the app structure
@@ -58,14 +98,17 @@ func main() {
Height: 680,
MinWidth: 375,
MinHeight: 640,
EnableDefaultContextMenu: true,
Windows: &windows.Options{
ZoomFactor: zoomFactor,
IsZoomControlEnabled: true,
},
AssetServer: &assetserver.Options{
Assets: assets,
Assets: assets,
Handler: NewFileLoader(),
},
OnStartup: app.OnStartup,
OnStartup: app.OnStartup,
OnBeforeClose: app.OnBeforeClose,
Bind: []any{
app,
},


@@ -1,12 +1,12 @@
{
"version": "1.3.8",
"version": "1.4.6",
"introduction": {
"en": "RWKV is an open-source, commercially usable large language model with high flexibility and great potential for development.\n### About This Tool\nThis tool aims to lower the barrier of entry for using large language models, making it accessible to everyone. It provides fully automated dependency and model management. You simply need to click and run, following the instructions, to deploy a local large language model. The tool itself is very compact and only requires a single executable file for one-click deployment.\nAdditionally, this tool offers an interface that is fully compatible with the OpenAI API. This means you can use any ChatGPT client as a client for RWKV, enabling capability expansion beyond just chat functionality.\n### Preset Configuration Rules at the Bottom\nThis tool comes with a series of preset configurations to reduce complexity. The naming rules for each configuration represent the following in order: device - required VRAM/memory - model size - model language.\nFor example, \"GPU-8G-3B-EN\" indicates that this configuration is for a graphics card with 8GB of VRAM, a model size of 3 billion parameters, and it uses an English language model.\nLarger model sizes have higher performance and VRAM requirements. Among configurations with the same model size, those with higher VRAM usage will have faster runtime.\nFor example, if you have 12GB of VRAM but running the \"GPU-12G-7B-EN\" configuration is slow, you can downgrade to \"GPU-8G-3B-EN\" for a significant speed improvement.\n### About RWKV\nRWKV is an RNN with Transformer-level LLM performance, which can also be directly trained like a GPT transformer (parallelizable). And it's 100% attention-free. You only need the hidden state at position t to compute the state at position t+1. You can use the \"GPT\" mode to quickly compute the hidden state for the \"RNN\" mode.<br/>So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, \"infinite\" ctx_len, and free sentence embedding (using the final hidden state).",
"zh": "RWKV是一个开源且允许商用的大语言模型灵活性很高且极具发展潜力。\n### 关于本工具\n本工具旨在降低大语言模型的使用门槛做到人人可用本工具提供了全自动化的依赖和模型管理你只需要直接点击运行跟随引导即可完成本地大语言模型的部署工具本身体积极小只需要一个exe即可完成一键部署。\n此外本工具提供了与OpenAI API完全兼容的接口这意味着你可以把任意ChatGPT客户端用作RWKV的客户端实现能力拓展而不局限于聊天。\n### 底部的预设配置规则\n本工具内置了一系列预设配置以降低使用难度每个配置名的规则依次代表着设备-所需显存/内存-模型规模-模型语言。\n例如GPU-8G-3B-CN表示该配置用于显卡需要8G显存模型规模为30亿参数使用的是中文模型。\n模型规模越大性能要求越高显存要求也越高而同样模型规模的配置中显存占用越高的运行速度越快。\n例如当你有12G显存但运行GPU-12G-7B-CN配置速度比较慢可降级成GPU-8G-3B-CN将会大幅提速。\n### 关于RWKV\nRWKV是具有Transformer级别LLM性能的RNN也可以像GPT Transformer一样直接进行训练可并行化。而且它是100% attention-free的。你只需在位置t处获得隐藏状态即可计算位置t + 1处的状态。你可以使用“GPT”模式快速计算用于“RNN”模式的隐藏状态。\n因此它将RNN和Transformer的优点结合起来 - 高性能、快速推理、节省显存、快速训练、“无限”上下文长度以及免费的语句嵌入(使用最终隐藏状态)。"
},
"about": {
"en": "<div align=\"center\">\n\nProject Source Code:\nhttps://github.com/josStorer/RWKV-Runner\nAuthor: [@josStorer](https://github.com/josStorer)\nFAQs: https://github.com/josStorer/RWKV-Runner/wiki/FAQs\n\nRelated Repositories:\nRWKV-4-World: https://huggingface.co/BlinkDL/rwkv-4-world/tree/main\nRWKV-4-Raven: https://huggingface.co/BlinkDL/rwkv-4-raven/tree/main\nChatRWKV: https://github.com/BlinkDL/ChatRWKV\nRWKV-LM: https://github.com/BlinkDL/RWKV-LM\nRWKV-LM-LoRA: https://github.com/Blealtan/RWKV-LM-LoRA\n\n</div>",
"zh": "<div align=\"center\">\n\n本项目源码:\nhttps://github.com/josStorer/RWKV-Runner\n作者: [@josStorer](https://github.com/josStorer)\n演示与常见问题说明视频: https://www.bilibili.com/video/BV1hM4y1v76R\n疑难解答: https://www.bilibili.com/read/cv23921171\n\n相关仓库:\nRWKV-4-World: https://huggingface.co/BlinkDL/rwkv-4-world/tree/main\nRWKV-4-Raven: https://huggingface.co/BlinkDL/rwkv-4-raven/tree/main\nChatRWKV: https://github.com/BlinkDL/ChatRWKV\nRWKV-LM: https://github.com/BlinkDL/RWKV-LM\nRWKV-LM-LoRA: https://github.com/Blealtan/RWKV-LM-LoRA\n\n</div>"
"en": "<div align=\"center\">\n\nProject Source Code:\nhttps://github.com/josStorer/RWKV-Runner\nAuthor: [@josStorer](https://github.com/josStorer)\nFAQs: https://github.com/josStorer/RWKV-Runner/wiki/FAQs\n\nRelated Repositories:\nRWKV-4-World: https://huggingface.co/BlinkDL/rwkv-4-world/tree/main\nRWKV-4-Raven: https://huggingface.co/BlinkDL/rwkv-4-raven/tree/main\nChatRWKV: https://github.com/BlinkDL/ChatRWKV\nRWKV-LM: https://github.com/BlinkDL/RWKV-LM\nRWKV-LM-LoRA: https://github.com/Blealtan/RWKV-LM-LoRA\nMIDI-LLM-tokenizer: https://github.com/briansemrau/MIDI-LLM-tokenizer\n\n</div>",
"zh": "<div align=\"center\">\n\n本项目源码:\nhttps://github.com/josStorer/RWKV-Runner\n作者: [@josStorer](https://github.com/josStorer)\n演示与常见问题说明视频: https://www.bilibili.com/video/BV1hM4y1v76R\n疑难解答: https://www.bilibili.com/read/cv23921171\n\n相关仓库:\nRWKV-4-World: https://huggingface.co/BlinkDL/rwkv-4-world/tree/main\nRWKV-4-Raven: https://huggingface.co/BlinkDL/rwkv-4-raven/tree/main\nChatRWKV: https://github.com/BlinkDL/ChatRWKV\nRWKV-LM: https://github.com/BlinkDL/RWKV-LM\nRWKV-LM-LoRA: https://github.com/Blealtan/RWKV-LM-LoRA\nMIDI-LLM-tokenizer: https://github.com/briansemrau/MIDI-LLM-tokenizer\n\n</div>"
},
"programFiles": [
{
@@ -19,7 +19,8 @@
"name": "RWKV-4-World-CHNtuned-0.1B-v1-20230617-ctx4096.pth",
"desc": {
"en": "Global Languages 0.1B v1 Enhanced Chinese",
"zh": "全球语言 0.1B v1 中文增强"
"zh": "全球语言 0.1B v1 中文增强",
"ja": "グローバル言語 0.1B v1 中国語強化"
},
"size": 385594610,
"SHA256": "a3888f9958d378ee6d4976ae1c02edb698f4382e426086febafb4a69417b9080",
@@ -31,7 +32,8 @@
"name": "RWKV-4-World-0.1B-v1-20230520-ctx4096.pth",
"desc": {
"en": "Global Languages 0.1B v1",
"zh": "全球语言 0.1B v1"
"zh": "全球语言 0.1B v1",
"ja": "グローバル言語 0.1B v1"
},
"size": 385594610,
"SHA256": "a10ef99df2a8f8a6801edf4fc92a9c49bedd63dcb900d3e5667a2136b3d671e7",
@@ -43,7 +45,8 @@
"name": "RWKV-4-World-CHNtuned-0.4B-v1-20230618-ctx4096.pth",
"desc": {
"en": "Global Languages 0.4B v1 Enhanced Chinese",
"zh": "全球语言 0.4B v1 中文增强"
"zh": "全球语言 0.4B v1 中文增强",
"ja": "グローバル言語 0.4B v1 中国語強化"
},
"size": 923362866,
"SHA256": "dbd5302cbee596bbc900f97eb10b2af3001a7f2c7e4d8643bf8683b2cdbdd324",
@@ -55,7 +58,8 @@
"name": "RWKV-4-World-0.4B-v1-20230529-ctx4096.pth",
"desc": {
"en": "Global Languages 0.4B v1",
"zh": "全球语言 0.4B v1"
"zh": "全球语言 0.4B v1",
"ja": "グローバル言語 0.4B v1"
},
"size": 923362866,
"SHA256": "4b4a2733cf5e5dc97dd62106f391d99895d16b11c5ccd10c89f28c52067a4919",
@@ -67,7 +71,8 @@
"name": "RWKV-4-World-CHNtuned-1.5B-v1-20230620-ctx4096.pth",
"desc": {
"en": "Global Languages 1.5B v1 Enhanced Chinese",
"zh": "全球语言 1.5B v1 中文增强"
"zh": "全球语言 1.5B v1 中文增强",
"ja": "グローバル言語 1.5B v1 中国語強化"
},
"size": 3155281586,
"SHA256": "9f31f2ed5fe52dcf2d50208eb2efd764b9674dba2adb1baeff61997b4390a26b",
@@ -118,7 +123,8 @@
"name": "RWKV-4-World-1.5B-v1-fixed-20230612-ctx4096.pth",
"desc": {
"en": "Global Languages 1.5B v1 fixed",
"zh": "全球语言 1.5B v1 修复"
"zh": "全球语言 1.5B v1 修复",
"ja": "グローバル言語 1.5B v1"
},
"size": 3155281586,
"SHA256": "71f0c3229f9227cbcb8ae5fee6461197129a57e26366c4d23a49058417b046c9",
@@ -182,7 +188,8 @@
"name": "RWKV-4-World-3B-v1-20230619-ctx4096.pth",
"desc": {
"en": "Global Languages 3B v1",
"zh": "全球语言 3B v1"
"zh": "全球语言 3B v1",
"ja": "グローバル言語 3B v1"
},
"size": 6125597618,
"SHA256": "1b227af317fa25b6939ab3c7cd321226ca48b8fe4bbbd2df3db669f1482c54ba",
@@ -194,7 +201,8 @@
"name": "RWKV-4-World-CHNtuned-3B-v1-20230625-ctx4096.pth",
"desc": {
"en": "Global Languages 3B v1 Enhanced Chinese",
"zh": "全球语言 3B v1 中文增强"
"zh": "全球语言 3B v1 中文增强",
"ja": "グローバル言語 3B v1 中国語強化"
},
"size": 6125597618,
"SHA256": "7d3b5a4d0e9780a3e3d9ae7c2defbe8564d240bc9a238db4ba70cfb66dc33888",
@@ -284,7 +292,8 @@
"name": "RWKV-4-World-7B-v1-20230626-ctx4096.pth",
"desc": {
"en": "Global Languages 7B v1",
"zh": "全球语言 7B v1"
"zh": "全球语言 7B v1",
"ja": "グローバル言語 7B v1"
},
"size": 15035393586,
"SHA256": "db7b011247a0fe4389e1d76e3d6a904185f85d509c8a44ad18bf401094efc293",
@@ -292,11 +301,64 @@
"url": "https://huggingface.co/BlinkDL/rwkv-4-world/blob/main/RWKV-4-World-7B-v1-20230626-ctx4096.pth",
"downloadUrl": "https://huggingface.co/BlinkDL/rwkv-4-world/resolve/main/RWKV-4-World-7B-v1-20230626-ctx4096.pth"
},
{
"name": "RWKV-claude-4-World-7B-20230805-ctx65k.pth",
"desc": {
"en": "Global Languages 7B v1 Ctx65k Claude Like",
"zh": "全球语言 7B v1 65k上下文 Claude功能",
"ja": "グローバル言語 7B v1 65kコンテキスト Claude機能"
},
"size": 15035391533,
"SHA256": "8cd25f8a1ab58965993cc47b3b2f99585836eed008a2e44526c258189ea751a6",
"lastUpdated": "2023-08-05T08:52:20",
"url": "https://huggingface.co/xiaol/RWKV-claude-4-World-7B-65k/blob/main/RWKV-claude-4-World-7B-20230805-ctx65k.pth",
"downloadUrl": "https://huggingface.co/xiaol/RWKV-claude-4-World-7B-65k/resolve/main/RWKV-claude-4-World-7B-20230805-ctx65k.pth"
},
{
"name": "RWKV-toolformer-translation-japanese-chinese-english-7B-World-20230815-ctx128k.pth",
"desc": {
"en": "Global Languages 7B v1 Ctx128k Toolformer",
"zh": "全球语言 7B v1 128k上下文 Toolformer",
"ja": "グローバル言語 7B v1 128kコンテキスト Toolformer"
},
"size": 15035391533,
"SHA256": "648a3b21055bdab77021ce278da80fbada8dcaae0b3d41d1eca9aa194c1fd25f",
"lastUpdated": "2023-08-15T07:18:23",
"url": "https://huggingface.co/xiaol/RWKV-toolformer-translation-japanese-chinese-english-7B-World-128k/blob/main/RWKV-toolformer-translation-japanese-chinese-english-7B-World-20230815-ctx128k.pth",
"downloadUrl": "https://huggingface.co/xiaol/RWKV-toolformer-translation-japanese-chinese-english-7B-World-128k/resolve/main/RWKV-toolformer-translation-japanese-chinese-english-7B-World-20230815-ctx128k.pth"
},
{
"name": "RWKV-code-4-World-7B-20230820-ctx32k.pth",
"desc": {
"en": "Global Languages 7B v1 Ctx32k Code Ability",
"zh": "全球语言 7B v1 32k上下文 代码能力",
"ja": "グローバル言語 7B v1 32kコンテキスト コード能力"
},
"size": 15035391533,
"SHA256": "19666620437ae3a5fb06e16a52729d67e449fca155fab3d5861ffe9ecf247404",
"lastUpdated": "2023-08-20T05:00:17",
"url": "https://huggingface.co/xiaol/RWKV-Code-7B-world-32k/blob/main/RWKV-code-4-World-7B-20230820-ctx32k.pth",
"downloadUrl": "https://huggingface.co/xiaol/RWKV-Code-7B-world-32k/resolve/main/RWKV-code-4-World-7B-20230820-ctx32k.pth"
},
{
"name": "wizard-rwkv-4-world-ctx32k.pth",
"desc": {
"en": "Global Languages 7B v1 Ctx32k Wikipedia",
"zh": "全球语言 7B v1 32k上下文 维基百科",
"ja": "グローバル言語 7B v1 32kコンテキスト ウィキペディア"
},
"size": 15035391538,
"SHA256": "c5d991f315a1676d4bed93dd91f803b1376096e7a4af5bf72b339d055f53bac7",
"lastUpdated": "2023-07-29T03:21:47",
"url": "https://huggingface.co/xiaol/wizard-rwkv-world-7B-ctx32k/blob/main/wizard-rwkv-4-world-ctx32k.pth",
"downloadUrl": "https://huggingface.co/xiaol/wizard-rwkv-world-7B-ctx32k/resolve/main/wizard-rwkv-4-world-ctx32k.pth"
},
{
"name": "RWKV-4-World-CHNtuned-7B-v1-20230709-ctx4096.pth",
"desc": {
"en": "Global Languages 7B v1 Enhanced Chinese",
"zh": "全球语言 7B v1 中文增强"
"zh": "全球语言 7B v1 中文增强",
"ja": "グローバル言語 7B v1 中国語強化"
},
"size": 15035393458,
"SHA256": "52d33e8352a40158d21425fee4f68df1515d6324056f788d2c78a366ef578ffa",
@@ -304,6 +366,84 @@
"url": "https://huggingface.co/BlinkDL/rwkv-4-world/blob/main/RWKV-4-World-CHNtuned-7B-v1-20230709-ctx4096.pth",
"downloadUrl": "https://huggingface.co/BlinkDL/rwkv-4-world/resolve/main/RWKV-4-World-CHNtuned-7B-v1-20230709-ctx4096.pth"
},
{
"name": "Readflow-RWKV-4-World-CHNtuned-7B-v1-20230709-ctx32k.pth",
"desc": {
"en": "Global Languages 7B v1 Enhanced Chinese Ctx32k Summary Ability",
"zh": "全球语言 7B v1 中文增强 32k上下文 总结能力",
"ja": "グローバル言語 7B v1 中国語強化 32kコンテキスト まとめる能力"
},
"size": 15035391543,
"SHA256": "1bd1de8cdbd56b67e1374588fe5d202884049c71278ffcb12f5c4efbdb422ee1",
"lastUpdated": "2023-07-20T06:11:29",
"url": "https://huggingface.co/xiaol/readflow-rwkv-4-world-ctx32k/blob/main/Readflow-RWKV-4-World-CHNtuned-7B-v1-20230709-ctx32k.pth",
"downloadUrl": "https://huggingface.co/xiaol/readflow-rwkv-4-world-ctx32k/resolve/main/Readflow-RWKV-4-World-CHNtuned-7B-v1-20230709-ctx32k.pth"
},
{
"name": "novel-RWKV-4-World-CHNtuned-7B-v1-20230709-ctx32k.pth",
"desc": {
"en": "Global Languages 7B v1 Enhanced Chinese Ctx32k Novel Outline Ability",
"zh": "全球语言 7B v1 中文增强 32k上下文 小说大纲扩写",
"ja": "グローバル言語 7B v1 中国語強化 32kコンテキスト 小説のあらすじを書く"
},
"size": 15035391538,
"SHA256": "0fe2415ce61af52a8c38c071b475c01b4c9f8a4f2b4aaed6181f0334f3faf7f4",
"lastUpdated": "2023-07-28T13:30:59",
"url": "https://huggingface.co/xiaol/ruotangwx-rwkv-7b-novel-32k/blob/main/novel-RWKV-4-World-CHNtuned-7B-v1-20230709-ctx32k.pth",
"downloadUrl": "https://huggingface.co/xiaol/ruotangwx-rwkv-7b-novel-32k/resolve/main/novel-RWKV-4-World-CHNtuned-7B-v1-20230709-ctx32k.pth"
},
{
"name": "chatgal-RWKV-4-World-CHNtuned-7B-v1-20230709-ctx32k-1000.pth",
"desc": {
"en": "Global Languages 7B v1 Enhanced Chinese Ctx32k GalGame 1000",
"zh": "全球语言 7B v1 中文增强 32k上下文 GalGame 1000",
"ja": "グローバル言語 7B v1 中国語強化 32kコンテキスト GalGame 1000"
},
"size": 15035391543,
"SHA256": "aaed29cfd1bddee47c48f564aa800eb001f62fd03290d772647d5678e40d66e8",
"lastUpdated": "2023-07-21T08:59:18",
"url": "https://huggingface.co/xiaol/chatgal-rwkv-7b-world-32k/blob/main/chatgal-RWKV-4-World-CHNtuned-7B-v1-20230709-ctx32k-1000.pth",
"downloadUrl": "https://huggingface.co/xiaol/chatgal-rwkv-7b-world-32k/resolve/main/chatgal-RWKV-4-World-CHNtuned-7B-v1-20230709-ctx32k-1000.pth"
},
{
"name": "chatgal-RWKV-4-World-CHNtuned-7B-v1-20230709-ctx32k-500.pth",
"desc": {
"en": "Global Languages 7B v1 Enhanced Chinese Ctx32k GalGame 500",
"zh": "全球语言 7B v1 中文增强 32k上下文 GalGame 500",
"ja": "グローバル言語 7B v1 中国語強化 32kコンテキスト GalGame 500"
},
"size": 15035391538,
"SHA256": "b5d347d5dedb4f398ec31489ab87b75b1dee772ae7d0a34c26635cf5d95c8794",
"lastUpdated": "2023-07-21T07:31:05",
"url": "https://huggingface.co/xiaol/chatgal-rwkv-7b-world-32k/blob/main/chatgal-RWKV-4-World-CHNtuned-7B-v1-20230709-ctx32k-500.pth",
"downloadUrl": "https://huggingface.co/xiaol/chatgal-rwkv-7b-world-32k/resolve/main/chatgal-RWKV-4-World-CHNtuned-7B-v1-20230709-ctx32k-500.pth"
},
{
"name": "RWKV-4-World-JPNtuned-7B-v1-20230718-ctx4096.pth",
"desc": {
"en": "Global Languages 7B v1 Enhanced Japanese",
"zh": "全球语言 7B v1 日文增强",
"ja": "グローバル言語 7B v1 日本語強化"
},
"size": 15035393458,
"SHA256": "3e4c7664ce893ac1f6bb59cd76664fb5c872cb076bb82dbd534db0555b6e9fa5",
"lastUpdated": "2023-07-18T20:01:12",
"url": "https://huggingface.co/BlinkDL/rwkv-4-world/blob/main/RWKV-4-World-JPNtuned-7B-v1-20230718-ctx4096.pth",
"downloadUrl": "https://huggingface.co/BlinkDL/rwkv-4-world/resolve/main/RWKV-4-World-JPNtuned-7B-v1-20230718-ctx4096.pth"
},
{
"name": "RWKV-novel-4-World-7B-20230810-ctx128k.pth",
"desc": {
"en": "Global Languages Writer 7B v1 Ctx128k",
"zh": "全球语言写作 7B v1 128k上下文",
"ja": "グローバル言語ライター 7B v1 128kコンテキスト"
},
"size": 15035391533,
"SHA256": "5e429c49e4cab2f29a93f87a80635422c8710d70e5b1d962c078e47d957389c8",
"lastUpdated": "2023-08-10T06:30:32",
"url": "https://huggingface.co/xiaol/rwkv-7B-world-novel-128k/blob/main/RWKV-novel-4-World-7B-20230810-ctx128k.pth",
"downloadUrl": "https://huggingface.co/xiaol/rwkv-7B-world-novel-128k/resolve/main/RWKV-novel-4-World-7B-20230810-ctx128k.pth"
},
{
"name": "RWKV-4-Novel-7B-v1-ChnEng-ChnPro-20230410-ctx4096.pth",
"desc": {
@@ -526,6 +666,32 @@
"lastUpdated": "2023-05-23T11:22:41",
"url": "https://huggingface.co/BlinkDL/rwkv-4-raven/blob/main/RWKV-4-Raven-14B-v12-Eng98%25-Other2%25-20230523-ctx8192.pth",
"downloadUrl": "https://huggingface.co/BlinkDL/rwkv-4-raven/resolve/main/RWKV-4-Raven-14B-v12-Eng98%25-Other2%25-20230523-ctx8192.pth"
},
{
"name": "RWKV-4-MIDI-120M-v1-20230714-ctx4096.pth",
"desc": {
"en": "Music 120M v1",
"zh": "作曲 120M v1",
"ja": "作曲 120M v1"
},
"size": 239224753,
"SHA256": "161d27dcf50d0958d230601ba1e0f8e7dd9c236105e92d2b833496412ace430c",
"lastUpdated": "2023-07-15T08:03:36",
"url": "https://huggingface.co/BlinkDL/rwkv-4-music/blob/main/RWKV-4-MIDI-120M-v1-20230714-ctx4096.pth",
"downloadUrl": "https://huggingface.co/BlinkDL/rwkv-4-music/resolve/main/RWKV-4-MIDI-120M-v1-20230714-ctx4096.pth"
},
{
"name": "RWKV-4-MIDI-560M-v1-20230717-ctx4096.pth",
"desc": {
"en": "Music 560M v1",
"zh": "作曲 560M v1",
"ja": "作曲 560M v1"
},
"size": 1130577457,
"SHA256": "62b21841b24af38ef176e9e9d895d9fff730cea8aa0623f53a1784d74ce828d6",
"lastUpdated": "2023-07-17T15:02:08",
"url": "https://huggingface.co/BlinkDL/rwkv-4-music/blob/main/RWKV-4-MIDI-560M-v1-20230717-ctx4096.pth",
"downloadUrl": "https://huggingface.co/BlinkDL/rwkv-4-music/resolve/main/RWKV-4-MIDI-560M-v1-20230717-ctx4096.pth"
}
]
}

1
midi/sample.txt Normal file

@@ -0,0 +1 @@
<start> p:24:a p:2a:a p:31:a p:39:a p:3b:a p:45:a b:26:a g:3e:a g:3e:a g:42:a g:42:a g:45:a g:45:a pi:3e:a pi:42:a pi:45:a t14 p:24:0 p:2a:0 p:31:0 p:39:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 p:45:0 b:26:0 g:3e:0 g:3e:0 g:42:0 g:42:0 g:45:0 g:45:0 pi:3e:0 pi:42:0 pi:45:0 t2 p:2e:a p:3b:a p:45:a b:26:a g:3e:a g:3e:a g:42:a g:42:a g:45:a g:45:a pi:3e:a pi:42:a pi:45:a t14 p:2e:0 p:3b:0 p:45:0 g:3e:0 g:3e:0 g:42:0 g:42:0 g:45:0 g:45:0 pi:3e:0 pi:42:0 pi:45:0 t2 p:2e:a p:3b:a p:45:a g:3e:a g:3e:a g:42:a g:42:a g:45:a g:45:a pi:3e:a pi:42:a pi:45:a t14 p:2e:0 p:3b:0 p:45:0 b:26:0 g:3e:0 g:3e:0 g:42:0 g:42:0 g:45:0 g:45:0 pi:3e:0 pi:42:0 pi:45:0 t2 p:26:a p:2a:a p:3b:a p:45:a t14 p:26:0 p:2a:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a b:26:a g:3e:a g:3e:a g:42:a g:42:a g:45:a g:45:a pi:3e:a pi:42:a pi:45:a t14 p:2a:0 p:3b:0 p:45:0 b:26:0 t2 p:24:a p:2a:a p:3b:a p:45:a b:2d:a t14 p:24:0 p:2a:0 p:3b:0 p:45:0 b:2d:0 g:3e:0 g:3e:0 g:42:0 g:42:0 g:45:0 g:45:0 pi:3e:0 pi:42:0 pi:45:0 t2 p:24:a p:2a:a p:3b:a p:45:a b:21:a g:39:a g:39:a g:3d:a g:3d:a g:40:a g:40:a pi:39:a pi:3d:a pi:40:a t14 p:24:0 p:2a:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 p:45:0 b:21:0 g:39:0 g:39:0 g:3d:0 g:3d:0 g:40:0 g:40:0 pi:39:0 pi:3d:0 pi:40:0 t2 p:24:a p:2e:a p:3b:a p:45:a b:21:a g:39:a g:39:a g:3d:a g:3d:a g:40:a g:40:a pi:39:a pi:3d:a pi:40:a t14 p:24:0 p:2e:0 p:3b:0 p:45:0 b:21:0 g:39:0 g:39:0 g:3d:0 g:3d:0 g:40:0 g:40:0 pi:39:0 pi:3d:0 pi:40:0 t2 p:24:a p:2a:a p:3b:a p:45:a b:21:a g:39:a g:39:a g:3d:a g:3d:a g:40:a g:40:a pi:39:a pi:3d:a pi:40:a t14 p:24:0 p:2a:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 p:45:0 b:21:0 g:39:0 g:39:0 g:3d:0 g:3d:0 g:40:0 g:40:0 pi:39:0 pi:3d:0 pi:40:0 t2 p:26:a p:2a:a p:3b:a p:45:a b:21:a g:39:a g:39:a g:3d:a g:3d:a g:40:a g:40:a pi:39:a pi:3d:a pi:40:a t14 p:26:0 p:2a:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 p:45:0 b:21:0 g:39:0 g:39:0 g:3d:0 g:3d:0 g:40:0 g:40:0 pi:39:0 pi:3d:0 pi:40:0 t2 p:26:a 
p:2e:a p:31:a p:39:a p:3b:a p:45:a b:21:a g:39:a g:39:a g:3d:a g:3d:a g:40:a g:40:a pi:39:a pi:3d:a pi:40:a t14 p:26:0 p:2e:0 p:31:0 p:39:0 p:3b:0 p:45:0 b:21:0 t2 p:26:a p:2e:a p:31:a p:39:a p:3b:a p:45:a b:21:a t14 p:26:0 p:2e:0 p:31:0 p:39:0 p:3b:0 p:45:0 b:21:0 g:39:0 g:39:0 g:3d:0 g:3d:0 g:40:0 g:40:0 pi:39:0 pi:3d:0 pi:40:0 t2 p:24:a p:2a:a p:31:a p:39:a p:3b:a p:45:a b:1f:a g:3b:a g:3b:a g:3e:a g:3e:a g:43:a g:43:a pi:3b:a pi:3e:a pi:43:a t14 p:24:0 p:2a:0 p:31:0 p:39:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 p:45:0 b:1f:0 g:3b:0 g:3b:0 g:3e:0 g:3e:0 g:43:0 g:43:0 pi:3b:0 pi:3e:0 pi:43:0 t2 p:2e:a p:3b:a p:45:a b:1f:a g:3b:a g:3b:a g:3e:a g:3e:a g:43:a g:43:a pi:3b:a pi:3e:a pi:43:a t14 p:2e:0 p:3b:0 p:45:0 g:3b:0 g:3b:0 g:3e:0 g:3e:0 g:43:0 g:43:0 pi:3b:0 pi:3e:0 pi:43:0 t2 p:2e:a p:3b:a p:45:a g:3b:a g:3b:a g:3e:a g:3e:a g:43:a g:43:a pi:3b:a pi:3e:a pi:43:a t14 p:2e:0 p:3b:0 p:45:0 b:1f:0 g:3b:0 g:3b:0 g:3e:0 g:3e:0 g:43:0 g:43:0 pi:3b:0 pi:3e:0 pi:43:0 t2 p:26:a p:2a:a p:3b:a p:45:a t14 p:26:0 p:2a:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a b:1f:a g:3b:a g:3b:a g:3e:a g:3e:a g:43:a g:43:a pi:3b:a pi:3e:a pi:43:a t14 p:2a:0 p:3b:0 p:45:0 b:1f:0 t2 p:24:a p:2a:a p:3b:a p:45:a b:1f:a t14 p:24:0 p:2a:0 p:3b:0 p:45:0 b:1f:0 g:3b:0 g:3b:0 g:3e:0 g:3e:0 g:43:0 g:43:0 pi:3b:0 pi:3e:0 pi:43:0 t2 p:24:a p:2e:a p:3b:a p:45:a b:26:a g:39:a g:39:a g:3e:a g:3e:a g:42:a g:42:a pi:39:a pi:3e:a pi:42:a t14 p:24:0 p:2e:0 p:3b:0 p:45:0 t2 p:2a:a p:3b:a p:45:a t14 p:2a:0 p:3b:0 <end>
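Review note: the sample above is a whitespace-separated token stream. Going only by what is visible in it, there are three token shapes: the `<start>` and `<end>` markers, `t` followed by digits, and three colon-separated fields such as `p:24:a`. The sketch below classifies them under stated assumptions: the three fields are read as instrument prefix, hexadecimal pitch, and hexadecimal velocity bin (with 0 presumably meaning note-off), and the `t` number is read as a decimal wait duration. These readings are inferred from the sample and are not confirmed by this diff; `parseMidiTokens` is a hypothetical name.

```typescript
type MidiToken =
  | { kind: 'marker'; value: 'start' | 'end' }
  | { kind: 'wait'; ticks: number }
  | { kind: 'note'; instrument: string; pitch: number; velocity: number };

// Hypothetical parser for the text format shown in midi/sample.txt.
function parseMidiTokens(text: string): MidiToken[] {
  const tokens: MidiToken[] = [];
  for (const tok of text.trim().split(/\s+/)) {
    if (tok === '') continue; // empty input yields a single empty string

    if (tok === '<start>' || tok === '<end>') {
      tokens.push({ kind: 'marker', value: tok === '<start>' ? 'start' : 'end' });
      continue;
    }

    // Assumption: "t14" means "wait 14 time units", written in decimal.
    const wait = /^t(\d+)$/.exec(tok);
    if (wait) {
      tokens.push({ kind: 'wait', ticks: parseInt(wait[1], 10) });
      continue;
    }

    // Assumption: "p:24:a" is instrument "p", pitch 0x24, velocity bin 0xa.
    const note = /^([a-z]+):([0-9a-f]+):([0-9a-f]+)$/.exec(tok);
    if (note) {
      tokens.push({
        kind: 'note',
        instrument: note[1],
        pitch: parseInt(note[2], 16),
        velocity: parseInt(note[3], 16),
      });
      continue;
    }

    throw new Error(`Unrecognized token: ${tok}`);
  }
  return tokens;
}
```

Throwing on unknown tokens is a deliberate choice for a sketch like this, since silently skipping them would hide a wrong guess about the format.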


@@ -2,5 +2,9 @@
- ^backend-python/wkv_cuda_utils/
- ^backend-python/get-pip\.py
- ^backend-python/convert_model\.py
- ^backend-python/convert_safetensors\.py
- ^backend-python/utils/midi\.py
- ^build/
- ^finetune/lora/
- ^finetune/json2binidx_tool/
- ^frontend/wailsjs/