
Use quantized model instead of fp16 for faster response and lower memory footprint

make-qwen-llm-configurable
haixuantao committed 5 months ago
commit 763eb44833
1 changed file with 1 addition and 0 deletions:
  examples/openai-realtime/whisper-template-metal.yml  +1  -0

examples/openai-realtime/whisper-template-metal.yml

@@ -43,6 +43,7 @@ nodes:
       - text
     env:
       MODEL_NAME_OR_PATH: Qwen/Qwen2.5-0.5B-Instruct-GGUF
+      MODEL_FILE_PATTERN: "*[qQ]6_[kK].[gG][gG][uU][fF]"
 
   - id: tts
     build: pip install -e ../../node-hub/dora-kokoro-tts
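
For reference, MODEL_FILE_PATTERN is a shell-style glob; the bracketed character classes make the match case-insensitive, so it selects only the Q6_K quantized GGUF file and skips the fp16 weights the dataflow used before. Below is a minimal sketch of how such a pattern filters filenames, assuming the usual naming in the Qwen/Qwen2.5-0.5B-Instruct-GGUF repository; the filenames and the use of Python's fnmatch are illustrative, not the node's actual loading code.

# Illustrative only: check which GGUF files the MODEL_FILE_PATTERN glob selects.
from fnmatch import fnmatchcase

PATTERN = "*[qQ]6_[kK].[gG][gG][uU][fF]"

# Hypothetical listing following the repository's usual naming scheme.
candidates = [
    "qwen2.5-0.5b-instruct-fp16.gguf",    # full-precision file: no match
    "qwen2.5-0.5b-instruct-q4_k_m.gguf",  # other quantization: no match
    "qwen2.5-0.5b-instruct-q6_k.gguf",    # Q6_K quantization: match
    "Qwen2.5-0.5B-Instruct-Q6_K.GGUF",    # mixed case still matches via the [...] classes
]

# fnmatchcase keeps matching case-sensitive regardless of the OS, so only the
# bracketed classes in the pattern itself provide case-insensitivity.
for name in candidates:
    print(f"{name}: {'match' if fnmatchcase(name, PATTERN) else 'no match'}")

Only the two q6_k entries match, which is what lets the node pick the smaller quantized weights instead of the fp16 file.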

