Advanced Metrics
A deep dive into monitoring Orchestrix workflows with production-grade metrics tools such as Prometheus.
Key Metrics to Track
For any production workflow, you should track at least these four Golden Signals:
- Latency: Total flow duration and individual step durations.
- Traffic: Number of flow executions per second/minute.
- Errors: Failure rates for flows and steps.
- Saturation: How close the runtime is to its limits. If you run many parallel steps, monitor event loop lag or thread pool utilization.
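To make the first three signals concrete, here is a minimal sketch that derives them from a batch of finished runs. The `FlowRun` shape and the `summarize` helper are hypothetical and exist only for illustration; they are not part of Orchestrix. Saturation is left out because it is a property of the runtime, not of individual run records.

```ts
// Hypothetical record shape for illustration; Orchestrix's real result object may differ.
interface FlowRun {
  durationMs: number;
  failed: boolean;
}

interface SignalSummary {
  p99LatencySeconds: number; // Latency
  runsPerSecond: number;     // Traffic
  errorRate: number;         // Errors, as a fraction between 0 and 1
}

// Nearest-rank percentile over an unsorted list of values.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Summarize all runs that finished inside a window of `windowSeconds`.
function summarize(runs: FlowRun[], windowSeconds: number): SignalSummary {
  const failures = runs.filter((r) => r.failed).length;
  return {
    p99LatencySeconds: percentile(runs.map((r) => r.durationMs), 99) / 1000,
    runsPerSecond: runs.length / windowSeconds,
    errorRate: runs.length === 0 ? 0 : failures / runs.length,
  };
}
```

In practice a metrics backend does this aggregation for you; the sketch only shows what each signal measures.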
Implementation with Prometheus
```ts
import { Counter, Histogram } from 'prom-client';
// `create` is the Orchestrix flow factory; import it from your Orchestrix package.

// Flow duration in seconds, labeled by flow name and final status.
const flowLatency = new Histogram({
  name: 'orchestrix_flow_duration_seconds',
  help: 'Duration of flows in seconds',
  labelNames: ['flow_name', 'status']
});

// Number of step retries, labeled by flow and step.
const stepRetries = new Counter({
  name: 'orchestrix_step_retries_total',
  help: 'Total number of step retries',
  labelNames: ['flow_name', 'step_name']
});

const flow = create("my-flow", {
  hooks: {
    onFlowComplete: ({ result }) => {
      // Prometheus convention is seconds, so convert from milliseconds.
      flowLatency.observe({
        flow_name: result.name,
        status: result.status
      }, result.durationMs / 1000);
    },
    onStepRetry: (event, attempt) => {
      stepRetries.inc({
        flow_name: event.flowName,
        step_name: event.stepName
      });
    }
  }
});
```
Dashboard Ideas
Workflow Health
- Success Rate %:
sum(rate(orchestrix_flow_duration_seconds_count{status="completed"}[5m])) / sum(rate(orchestrix_flow_duration_seconds_count[5m]))
- P99 Latency:
histogram_quantile(0.99, sum by (le) (rate(orchestrix_flow_duration_seconds_bucket[5m])))
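The P99 value from a histogram is an estimate whose accuracy depends on the bucket boundaries. The sketch below is a simplified model of how a quantile can be interpolated from cumulative buckets, in the spirit of `histogram_quantile`. It is not Prometheus's actual implementation, and the `Bucket` type and `estimateQuantile` function are illustrative names.

```ts
interface Bucket {
  le: number;    // upper bound; Infinity stands for the +Inf bucket
  count: number; // cumulative count of observations <= le
}

// Estimate the q-quantile (0 <= q <= 1) by linear interpolation inside
// the bucket that contains the target rank.
function estimateQuantile(q: number, buckets: Bucket[]): number {
  if (buckets.length < 2) return NaN;
  const sorted = [...buckets].sort((a, b) => a.le - b.le);
  const total = sorted[sorted.length - 1].count;
  if (total === 0) return NaN;

  const rank = q * total;
  const i = sorted.findIndex((b) => b.count >= rank);

  // A rank in the +Inf bucket cannot be interpolated, so fall back to
  // the highest finite upper bound.
  if (sorted[i].le === Infinity) return sorted[i - 1].le;

  // The lowest bucket is assumed to start at 0.
  const lowerBound = i === 0 ? 0 : sorted[i - 1].le;
  const countBelow = i === 0 ? 0 : sorted[i - 1].count;
  const inBucket = sorted[i].count - countBelow;
  if (inBucket === 0) return lowerBound;

  const fraction = (rank - countBelow) / inBucket;
  return lowerBound + (sorted[i].le - lowerBound) * fraction;
}
```

The fallback branch is the practical takeaway: if most flows run longer than the largest finite bucket, the reported P99 flattens at that bound. The default prom-client buckets top out at 10 seconds, so long-running flows likely need a custom `buckets` array on the histogram.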
Step Reliability
- Top 5 Retrying Steps: Identify flaky dependencies.
topk(5, sum by (flow_name, step_name) (rate(orchestrix_step_retries_total[5m])))
- Compensation Rate: How often do your flows fail and require undoing?
Business Metrics
Don't forget that workflows often represent business processes. You can use the FlowContext in your hooks to extract business-level metrics:
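```ts
onFlowSuccess: (result) => {
  // revenueCounter is assumed to be a prom-client Counter defined elsewhere.
  const amount = result.context.get<number>('totalAmount');
  // Skip flows that never set an amount, so the counter is not incremented by a default value.
  if (typeof amount === 'number') {
    revenueCounter.inc(amount);
  }
}
```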
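Because counters may only go up, it can help to validate a business value before recording it. The following is a minimal sketch under stated assumptions: `CounterLike` and `ContextLike` are stand-in interfaces so the example runs without dependencies, and `recordRevenue` is a hypothetical helper. The real `FlowContext` and prom-client `Counter` types may differ in detail.

```ts
// Stand-in for a metrics counter (assumption: exposes inc(value)).
interface CounterLike {
  inc(value: number): void;
}

// Stand-in for the flow context (assumption: get returns undefined for missing keys).
interface ContextLike {
  get<T>(key: string): T | undefined;
}

// Record the amount stored under `key`, if it is a valid non-negative number.
// Returns true when the counter was incremented.
function recordRevenue(
  counter: CounterLike,
  context: ContextLike,
  key: string = 'totalAmount'
): boolean {
  const amount = context.get<number>(key);
  if (typeof amount !== 'number' || !Number.isFinite(amount) || amount < 0) {
    return false; // missing, NaN, infinite, or negative: do not record
  }
  counter.inc(amount);
  return true;
}
```

A hook could then call `recordRevenue(revenueCounter, result.context)` and, if useful, log or count the cases where it returns `false`.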