Architecture Overview
!!! note "Work in progress"
    Architecture documentation is being migrated from internal developer docs. For now, see dev_docs/ in the repository for architecture decision logs and solution patterns.
Components at a Glance
| Component | Technology | Purpose |
|---|---|---|
| API layer | FastAPI | REST endpoints, SSE streaming, webhook ingestion |
| LLM integration | LiteLLM, Claude SDK, OpenAI SDK | Multi-provider chat completions, function calling, and structured outputs |
| Workflow engine | Temporal | Durable workflow execution with composable nodes (choice, LLM routing, sub-workflows, parallel, iteration) |
| Database | PostgreSQL + SQLAlchemy | Definitions, sessions, messages, access control |
| Streaming | Redis PubSub | Real-time token delivery from LLM to client |
| Storage | S3-compatible (R2, MinIO) | Document and file storage with tenant isolation |
| Ingestion | Docling, Chonkie, LiteLLM, pgvector | Document parsing, chunking, embedding, and hybrid search |
Workflow Engine
The workflow engine uses Temporal for durable execution. Each workflow is a directed graph of nodes connected by edges, which the executor traverses breadth-first (BFS).
Execution Model
API Request → DynamicWorkflowExecutor (Temporal workflow)
→ BFS over graph nodes
→ Each node dispatches to an activity based on its type
→ Edge routing determines next nodes (conditional, parallel, or all)

Key classes:
| Class | File | Purpose |
|---|---|---|
| DynamicWorkflowExecutor | temporal/workflows.py | Main BFS executor: processes nodes, evaluates choices, routes edges |
| SubWorkflowWrapper | temporal/workflows.py | Executes child workflows with input/output mapping and cycle detection |
| Activities | temporal/activities.py | One per node type: transform, LLM router, function call, parallel |
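
The BFS execution model described above can be illustrated with a small, self-contained sketch. It is not the code in temporal/workflows.py: the `Graph` shape, the `bfs_execute` function, and the `run_node` callback are hypothetical names chosen for illustration, and the callback merely stands in for dispatching a node to its Temporal activity.

```python
from collections import deque
from typing import Callable

# Hypothetical in-memory graph shape; the real schema is not shown in this document.
Graph = dict[str, list[str]]  # node id -> ids of nodes reached by outgoing edges


def bfs_execute(graph: Graph, start: str, run_node: Callable[[str], None]) -> list[str]:
    """Visit nodes breadth-first from `start`, running each reachable node once."""
    order: list[str] = []
    visited = {start}
    queue = deque([start])
    while queue:
        node_id = queue.popleft()
        run_node(node_id)  # stand-in for dispatching the node to an activity
        order.append(node_id)
        for next_id in graph.get(node_id, []):
            if next_id not in visited:  # a node reached by two branches runs once
                visited.add(next_id)
                queue.append(next_id)
    return order


graph = {"start": ["a", "b"], "a": ["end"], "b": ["end"], "end": []}
executed = bfs_execute(graph, "start", run_node=lambda node_id: None)
```

In this sketch every outgoing edge is followed; the real executor narrows the set of edges per node type, as described under Edge Routing.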
Node Types
Node types are data-driven — defined in the database via seed_node_types.py and centralized in core/workflow/node_types.py. Each type has an execution_handler that maps to a Temporal activity.
| Node Type | Handler | Description |
|---|---|---|
| START / END | passthrough | Graph entry/exit points |
| TRANSFORM | activity:execute_transform_node_activity | Data transformation via templates |
| FUNCTION | activity:execute_function_node_activity | Call a platform or HTTP function |
| LLM | activity:execute_llm_node_activity | LLM completion |
| CHOICE | workflow:evaluate_choice | Rule-based conditional routing |
| CHOICE_LLM | activity:execute_llm_router_activity | AI classification routing |
| SUB_WORKFLOW | workflow:execute_child | Child workflow composition |
| PARALLEL | workflow:execute_parallel | Concurrent branch execution |
| WAIT | temporal:wait_for_event | Pause until external signal |
| APPROVAL_GATE | temporal:approval_gate | Human approval before continuing |
| FOR_EACH | workflow_code:for_each | Iterate over arrays with concurrent batching |
| FILTER | activity:execute_filter_node_activity | Filter array items by conditions |
| REDUCE | activity:execute_reduce_node_activity | Aggregate array to single value |
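
The handler strings in the table appear to follow a `kind:name` convention, with `passthrough` as the only value without a name. As a rough sketch of how such a string could be resolved, assuming that convention holds, the snippet below splits a handler into its two parts and fails fast on unknown node types. `Handler`, `parse_handler`, `resolve`, and the `NODE_HANDLERS` dictionary are hypothetical; in the real system the mapping lives in the database and core/workflow/node_types.py.

```python
from typing import NamedTuple


class Handler(NamedTuple):
    kind: str  # e.g. "activity", "workflow", "temporal", or "passthrough"
    name: str  # e.g. "execute_llm_node_activity"; empty for "passthrough"


def parse_handler(execution_handler: str) -> Handler:
    """Split 'kind:name' into its parts; a bare kind yields an empty name."""
    kind, _, name = execution_handler.partition(":")
    return Handler(kind, name)


# A small subset of the table above, hard-coded for illustration only.
NODE_HANDLERS = {
    "START": "passthrough",
    "LLM": "activity:execute_llm_node_activity",
    "CHOICE": "workflow:evaluate_choice",
    "WAIT": "temporal:wait_for_event",
}


def resolve(node_type: str) -> Handler:
    """Look up a node type, raising instead of silently skipping unknown types."""
    try:
        return parse_handler(NODE_HANDLERS[node_type])
    except KeyError:
        raise ValueError(f"Unknown node type: {node_type}") from None
```

Keeping the handler as data rather than code is what lets a new node type be added by seeding a row instead of editing the executor.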
Edge Routing
After each node executes, the executor determines which edges to follow:
- Normal nodes: Follow all outgoing edges
- Choice / Choice LLM: Follow only the edge whose source_handle matches the chosen branch (branch-0, branch-1, or default)
- Parallel: Follow edges with source_handle: parallel-* concurrently
- For-each: Follow the source_handle: foreach-body edge for each item in the array
Workflows without source_handle on edges fall back to legacy behavior (follow all edges).
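
The routing rules above can be sketched as a single selection function. This is a simplified illustration, not the executor's actual logic: the `Edge` dataclass and `select_edges` are hypothetical, and the legacy fallback is assumed to apply whenever none of a node's outgoing edges carries a source_handle.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    source_handle: Optional[str] = None


def select_edges(
    node_type: str, outgoing: list[Edge], chosen_branch: Optional[str] = None
) -> list[Edge]:
    """Return the outgoing edges to follow after a node of `node_type` runs."""
    # Assumed legacy fallback: no handles at all means follow every edge.
    if all(edge.source_handle is None for edge in outgoing):
        return list(outgoing)
    if node_type in ("CHOICE", "CHOICE_LLM"):
        return [e for e in outgoing if e.source_handle == chosen_branch]
    if node_type == "PARALLEL":
        return [e for e in outgoing if (e.source_handle or "").startswith("parallel-")]
    if node_type == "FOR_EACH":
        return [e for e in outgoing if e.source_handle == "foreach-body"]
    return list(outgoing)  # normal nodes follow all outgoing edges


edges = [
    Edge("choice", "x", "branch-0"),
    Edge("choice", "y", "branch-1"),
    Edge("choice", "z", "default"),
]
followed = select_edges("CHOICE", edges, chosen_branch="branch-1")
```

The function only selects edges; running the parallel branches concurrently or repeating the for-each body per item is left to the caller.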
Safety Features
- Cycle detection: Sub-workflows track a _nesting_path to prevent recursive loops (max depth: 3)
- ReDoS protection: Regex patterns in conditions are length-capped
- Parallel branch limit: Maximum 20 concurrent branches per parallel node
- Template injection guard: Condition values cannot reference {{nodes.*}}
- Fail-fast on missing types: Unknown node types raise immediately instead of silently skipping
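
Several of these checks are simple enough to sketch as guard functions. The snippet below is an illustration under stated assumptions, not the project's implementation: the function names are hypothetical, the depth limit is read as "the nesting path may hold at most 3 workflow ids", and `MAX_REGEX_LENGTH` is an invented value because this document does not state the actual cap.

```python
import re

MAX_NESTING_DEPTH = 3       # from the list above
MAX_PARALLEL_BRANCHES = 20  # from the list above
MAX_REGEX_LENGTH = 200      # hypothetical value; the real cap is not documented here

# Matches a template reference such as "{{nodes.x}}" or "{{ nodes.x }}".
_NODES_REFERENCE = re.compile(r"\{\{\s*nodes\.")


def check_nesting(nesting_path: list[str], child_id: str) -> list[str]:
    """Return the extended path, rejecting cycles and overly deep nesting."""
    if child_id in nesting_path:
        raise ValueError(f"Cycle detected: {child_id} is already in {nesting_path}")
    new_path = nesting_path + [child_id]
    if len(new_path) > MAX_NESTING_DEPTH:
        raise ValueError(f"Nesting depth exceeds {MAX_NESTING_DEPTH}")
    return new_path


def check_parallel(branch_count: int) -> None:
    if branch_count > MAX_PARALLEL_BRANCHES:
        raise ValueError(f"At most {MAX_PARALLEL_BRANCHES} parallel branches allowed")


def check_condition_value(value: str) -> None:
    if _NODES_REFERENCE.search(value):
        raise ValueError("Condition values cannot reference {{nodes.*}}")


def check_regex(pattern: str) -> None:
    if len(pattern) > MAX_REGEX_LENGTH:
        raise ValueError(f"Regex pattern longer than {MAX_REGEX_LENGTH} characters")
```

A length cap limits how large a pattern can be but does not by itself rule out catastrophic backtracking in a short pattern, so it is best read as one layer of defense.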