
llama_cpp_python.yaml

nodes:
  - id: benchmark_script
    path: ../mllm/benchmark_script.py
    inputs:
      text: llm/text
    outputs:
      - text
    env:
      TEXT: "Please only generate the following output: This is a test"
      TEXT_TRUTH: "This is a test"

  - id: llm
    build: pip install -e ../../node-hub/dora-llama-cpp-python
    path: dora-llama-cpp-python
    inputs:
      text:
        source: benchmark_script/text
        queue-size: 10
    outputs:
      - text
    env:
      MODEL_NAME_OR_PATH: "Qwen/Qwen2.5-0.5B-Instruct-GGUF"
      MODEL_FILE_PATTERN: "*fp16.gguf"
      SYSTEM_PROMPT: "You're a very succinct AI assistant with short answers."
      MAX_TOKENS: "512"
      N_GPU_LAYERS: "35" # Enable GPU acceleration
      N_THREADS: "16" # CPU threads
      CONTEXT_SIZE: "4096" # Maximum context window
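The MODEL_FILE_PATTERN variable is a glob-style pattern used to pick one GGUF file out of the repository named by MODEL_NAME_OR_PATH. As a minimal sketch (the function name `pick_model_file` is illustrative, not from the node's source), such a pattern can be resolved with Python's standard `fnmatch` module:

```python
import fnmatch


def pick_model_file(filenames, pattern):
    """Return the first filename matching a glob-style pattern, or None.

    Mirrors how a pattern like "*fp16.gguf" would select one quantization
    out of several GGUF files in a model repository.
    """
    matches = fnmatch.filter(filenames, pattern)
    return matches[0] if matches else None


files = [
    "qwen2.5-0.5b-instruct-q4_0.gguf",
    "qwen2.5-0.5b-instruct-fp16.gguf",
]
print(pick_model_file(files, "*fp16.gguf"))  # → qwen2.5-0.5b-instruct-fp16.gguf
```

Patterns are matched case-insensitively or not depending on the platform's `fnmatch` behavior; an exact filename also works as a pattern if only one file should ever match.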