
llama_cpp_python.yaml 771 B

nodes:
  - id: benchmark_script
    path: benchmark_script.py
    inputs:
      text: llm/text
    outputs:
      - data
    env:
      DATA: "Please only generate the following output: This is a test"

  - id: llm
    build: pip install -e ../../node-hub/dora-llama-cpp-python
    path: dora-llama-cpp-python
    inputs:
      text:
        source: benchmark_script/data
        queue-size: 10
    outputs:
      - text
    env:
      MODEL_NAME_OR_PATH: "Qwen/Qwen2.5-0.5B-Instruct-GGUF"
      MODEL_FILE_PATTERN: "*fp16.gguf"
      SYSTEM_PROMPT: "You're a very succinct AI assistant with short answers."
      MAX_TOKENS: "512"
      N_GPU_LAYERS: "35" # Enable GPU acceleration
      N_THREADS: "16" # CPU threads
      CONTEXT_SIZE: "4096" # Maximum context window