A Dora node that provides access to LLaMA models using llama-cpp-python for efficient CPU/GPU inference.
To install:

```shell
uv venv -p 3.11 --seed
uv pip install -e .
```
The node can be configured in your dataflow YAML file:
```yaml
# Using a HuggingFace model
- id: dora-llama-cpp-python
  build: pip install -e path/to/dora-llama-cpp-python
  path: dora-llama-cpp-python
  inputs:
    text: source_node/text # Input text to generate a response for
  outputs:
    - text # Generated response text
  env:
    MODEL_NAME_OR_PATH: "TheBloke/Llama-2-7B-Chat-GGUF"
    MODEL_FILE_PATTERN: "*Q4_K_M.gguf"
    SYSTEM_PROMPT: "You're a very succinct AI assistant with short answers."
    ACTIVATION_WORDS: "what how who where you"
    MAX_TOKENS: "512"
    N_GPU_LAYERS: "35"   # Enable GPU acceleration
    N_THREADS: "4"       # CPU threads
    CONTEXT_SIZE: "4096" # Maximum context window
```
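The `ACTIVATION_WORDS` setting gates when the model responds: the node only generates a reply if the incoming text contains one of the listed words. A minimal sketch of that check (illustrative; the node's exact parsing is an assumption):

```python
import os

# Read the space-separated activation words, mirroring the env config above
# (falling back to the documented default when the variable is unset).
activation_words = os.environ.get(
    "ACTIVATION_WORDS", "what how who where you"
).split()

def should_respond(text: str) -> bool:
    """Return True if any activation word appears in the input text."""
    return any(word in activation_words for word in text.lower().split())
```

With the default words, `should_respond("What is Dora?")` fires while `should_respond("tell me a joke")` does not.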
The following environment variables configure the node:

- `MODEL_NAME_OR_PATH`: Path to a local model file or HuggingFace repo id (default: `"TheBloke/Llama-2-7B-Chat-GGUF"`)
- `MODEL_FILE_PATTERN`: Pattern to match the model file when downloading from HuggingFace (default: `"*Q4_K_M.gguf"`)
- `SYSTEM_PROMPT`: Customizes the AI assistant's personality/behavior
- `ACTIVATION_WORDS`: Space-separated list of words that trigger a model response
- `MAX_TOKENS`: Maximum number of tokens to generate (default: 512)
- `N_GPU_LAYERS`: Number of layers to offload to the GPU (default: 0; set to 35 for GPU acceleration)
- `N_THREADS`: Number of CPU threads to use (default: 4)
- `CONTEXT_SIZE`: Maximum context window size (default: 4096)

This example shows how to create a conversational AI pipeline that captures microphone audio, detects speech, transcribes it to text, generates a response with the LLaMA model, and speaks the result:
```yaml
nodes:
  - id: dora-microphone
    build: pip install dora-microphone
    path: dora-microphone
    inputs:
      tick: dora/timer/millis/2000
    outputs:
      - audio

  - id: dora-vad
    build: pip install dora-vad
    path: dora-vad
    inputs:
      audio: dora-microphone/audio
    outputs:
      - audio
      - timestamp_start

  - id: dora-whisper
    build: pip install dora-distil-whisper
    path: dora-distil-whisper
    inputs:
      input: dora-vad/audio
    outputs:
      - text

  - id: dora-llama-cpp-python
    build: pip install -e .
    path: dora-llama-cpp-python
    inputs:
      text: dora-whisper/text
    outputs:
      - text
    env:
      MODEL_NAME_OR_PATH: "TheBloke/Llama-2-7B-Chat-GGUF"
      MODEL_FILE_PATTERN: "*Q4_K_M.gguf"
      SYSTEM_PROMPT: "You're a helpful assistant."
      ACTIVATION_WORDS: "hey help what how"
      MAX_TOKENS: "512"
      N_GPU_LAYERS: "35"
      N_THREADS: "4"
      CONTEXT_SIZE: "4096"

  - id: dora-tts
    build: pip install dora-kokoro-tts
    path: dora-kokoro-tts
    inputs:
      text: dora-llama-cpp-python/text
    outputs:
      - audio
```
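Under the hood, a node like this typically combines `SYSTEM_PROMPT` with the transcribed text into a chat-style message list before calling the model (llama-cpp-python's `create_chat_completion` accepts messages in this shape; the node's exact internals are an assumption here):

```python
def build_messages(system_prompt: str, user_text: str) -> list:
    # Chat-format message list: the system persona first, then the user turn.
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]

messages = build_messages(
    "You're a helpful assistant.",  # SYSTEM_PROMPT from the env block above
    "hey, what time is it?",        # text arriving from dora-whisper
)
```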
Build and run the dataflow:

```shell
dora build example.yml
dora run example.yml
```
Lint with ruff:

```shell
uv pip install ruff
uv run ruff check . --fix  # Lint and auto-fix
uv run ruff check .        # Lint only
```
Run the tests with pytest:

```shell
uv pip install pytest
uv run pytest .
```
dora-llama-cpp-python is released under the MIT License.
DORA (Dataflow-Oriented Robotic Architecture) is middleware designed to streamline and simplify the creation of AI-based robotic applications. It offers low-latency, composable, and distributed dataflows.