allow setting quantizedLayers of WebGPU mode; chore

2024-03-01 14:23:05 +08:00
parent c9513822c9
commit 887ba06bd6
8 changed files with 46 additions and 10 deletions
--- a/frontend/src/_locales/zh-hans/main.json
+++ b/frontend/src/_locales/zh-hans/main.json
@@ -341,5 +341,7 @@
  "Load Conversation": "读取对话",
  "The latest X messages will be sent to the server. If you are using the RWKV-Runner server, please use the default value because RWKV-Runner has built-in state cache management which only calculates increments. Sending all messages will have lower cost. If you are using ChatGPT, adjust this value according to your needs to reduce ChatGPT expenses.": "最近的X条消息会发送至服务器. 如果你正在使用RWKV-Runner服务器, 请使用默认值, 因为RWKV-Runner内置了state缓存管理, 只计算增量, 发送所有消息将具有更低的成本. 如果你正在使用ChatGPT, 则根据你的需要调整此值, 这可以降低ChatGPT的费用",
  "History Message Number": "历史消息数量",
-  "Send All Message": "发送所有消息"
+  "Send All Message": "发送所有消息",
+  "Quantized Layers": "量化层数",
+  "Number of the neural network layers quantized with current precision, the more you quantize, the lower the VRAM usage, but the quality correspondingly decreases.": "神经网络以当前精度量化的层数, 量化越多, 占用显存越低, 但质量相应下降"
 }