tg-show-flow-state
Displays the processor states for a specific flow and its associated flow class.
Synopsis
tg-show-flow-state [options]
Description
The tg-show-flow-state command shows the current state of processors within a specific TrustGraph flow instance and its corresponding flow class. It queries the metrics system to determine which processing components are running and displays their status with visual indicators.
Use this command to monitor flow health and to see which processors are down when debugging processing issues.
Options
Optional Arguments
-f, --flow-id ID
: Flow instance ID to examine (default: default)
-u, --api-url URL
: TrustGraph API URL (default: $TRUSTGRAPH_URL or http://localhost:8088/)
-m, --metrics-url URL
: Metrics API URL (default: http://localhost:8088/api/metrics)
Examples
Check Default Flow State
tg-show-flow-state
Check Specific Flow
tg-show-flow-state -f "production-flow"
Use Custom Metrics URL
tg-show-flow-state \
-f "research-flow" \
-m "http://metrics-server:8088/api/metrics"
Check Flow in Different Environment
tg-show-flow-state \
-f "staging-flow" \
-u "http://staging:8088/" \
-m "http://staging:8088/api/metrics"
Output Format
The command displays processor states for both the flow instance and its flow class:
Flow production-flow
- pdf-processor 💚
- text-extractor 💚
- embeddings-generator 💚
- knowledge-builder ❌
- document-indexer 💚
Class document-processing-v2
- base-pdf-processor 💚
- base-text-extractor 💚
- base-embeddings-generator 💚
- base-knowledge-builder 💚
- base-document-indexer 💚
Status Indicators
- 💚 (Green Heart): Processor is running and healthy
- ❌ (Red X): Processor is not running or unhealthy
Information Displayed
- Flow Section: Shows the state of processors in the specific flow instance
- Class Section: Shows the state of processors in the flow class template
- Processor Names: Individual processing components within the flow
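Because the output contains both a flow section and a class section, a script that counts indicators over the whole output combines the two. The following is a minimal sketch for counting within one section only. It assumes the layout of the sample output above (header lines starting with `Flow ` or `Class `, followed by `- name indicator` lines); `count_section_state` is a hypothetical helper, not part of TrustGraph.

```shell
# Hypothetical helper: count lines containing an indicator within one section.
# Usage: count_section_state SECTION INDICATOR < state_output
# Assumption: section headers start with "Flow " or "Class ", as in the
# sample output; the real output format may differ.
count_section_state() {
    local section="$1" indicator="$2"
    awk -v section="$section" -v indicator="$indicator" '
        /^(Flow|Class) / { current = $1; next }          # header switches section
        current == section && index($0, indicator) { n++ }
        END { print n + 0 }                               # print 0 when nothing matched
    '
}

# Usage (commented out so the block has no side effects):
# tg-show-flow-state -f "production-flow" | count_section_state Flow "❌"
```

Counting per section keeps a failure in the class template from being reported as a failure of the flow instance, and the reverse.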
Use Cases
Flow Health Monitoring
# Monitor flow health continuously
monitor_flow_health() {
local flow_id="$1"
local interval="${2:-30}" # Default 30 seconds
echo "Monitoring flow health: $flow_id"
echo "Refresh interval: ${interval}s"
echo "Press Ctrl+C to stop"
while true; do
clear
echo "Flow Health Monitor - $(date)"
echo "=============================="
tg-show-flow-state -f "$flow_id"
sleep "$interval"
done
}
# Monitor production flow
monitor_flow_health "production-flow" 15
Debugging Processing Issues
# Comprehensive flow debugging
debug_flow_issues() {
local flow_id="$1"
echo "Debugging flow: $flow_id"
echo "======================="
# Check flow state
echo "1. Processor States:"
tg-show-flow-state -f "$flow_id"
# Check flow configuration
echo -e "\n2. Flow Configuration:"
tg-show-flows | grep "$flow_id"
# Check active processing
echo -e "\n3. Active Processing:"
tg-show-flows | grep -i processing
# Check system resources
echo -e "\n4. System Resources:"
free -h
df -h
echo -e "\nDebugging complete for: $flow_id"
}
# Debug specific flow
debug_flow_issues "problematic-flow"
Multi-Flow Status Dashboard
# Create status dashboard for multiple flows
create_flow_dashboard() {
local flows=("$@")
echo "TrustGraph Flow Dashboard - $(date)"
echo "==================================="
for flow in "${flows[@]}"; do
echo -e "\n=== Flow: $flow ==="
tg-show-flow-state -f "$flow" 2>/dev/null || echo "Flow not found or inaccessible"
done
echo -e "\n=== Summary ==="
echo "Total flows monitored: ${#flows[@]}"
echo "Dashboard generated: $(date)"
}
# Monitor multiple flows
flows=("production-flow" "research-flow" "development-flow")
create_flow_dashboard "${flows[@]}"
Automated Health Checks
# Automated health check with alerts
health_check_with_alerts() {
local flow_id="$1"
local alert_email="$2"
echo "Performing health check for: $flow_id"
# Capture flow state
flow_state=$(tg-show-flow-state -f "$flow_id" 2>&1)
if [ $? -ne 0 ]; then
echo "ERROR: Failed to get flow state"
# Send alert email if configured
if [ -n "$alert_email" ]; then
echo "Flow $flow_id is not responding" | mail -s "TrustGraph Alert" "$alert_email"
fi
return 1
fi
# Check for failed processors
failed_count=$(echo "$flow_state" | grep -c "❌")
if [ "$failed_count" -gt 0 ]; then
echo "WARNING: $failed_count processors are not running"
echo "$flow_state"
# Send alert if configured
if [ -n "$alert_email" ]; then
echo -e "Flow $flow_id has $failed_count failed processors:\n\n$flow_state" | \
mail -s "TrustGraph Health Alert" "$alert_email"
fi
return 1
else
echo "✓ All processors are running normally"
return 0
fi
}
# Run health check
health_check_with_alerts "production-flow" "admin@company.com"
Advanced Usage
Flow State Comparison
# Compare flow states between environments
compare_flow_states() {
local flow_id="$1"
local env1_url="$2"
local env2_url="$3"
echo "Comparing flow state: $flow_id"
echo "Environment 1: $env1_url"
echo "Environment 2: $env2_url"
echo "================================"
# Get states from both environments
echo "Environment 1 State:"
tg-show-flow-state -f "$flow_id" -u "$env1_url" -m "$env1_url/api/metrics"
echo -e "\nEnvironment 2 State:"
tg-show-flow-state -f "$flow_id" -u "$env2_url" -m "$env2_url/api/metrics"
echo -e "\nComparison complete"
}
# Compare production vs staging
compare_flow_states "main-flow" "http://prod:8088" "http://staging:8088"
Historical State Tracking
# Track flow state over time
track_flow_state_history() {
local flow_id="$1"
local log_file="flow_state_history.log"
local interval="${2:-60}" # Default 1 minute
echo "Starting flow state tracking: $flow_id"
echo "Log file: $log_file"
echo "Interval: ${interval}s"
while true; do
timestamp=$(date '+%Y-%m-%d %H:%M:%S')
# Get current state
state_output=$(tg-show-flow-state -f "$flow_id" 2>&1)
if [ $? -eq 0 ]; then
# Count healthy and failed processors
healthy_count=$(echo "$state_output" | grep -c "💚")
failed_count=$(echo "$state_output" | grep -c "❌")
# Log summary
echo "$timestamp,$flow_id,$healthy_count,$failed_count" >> "$log_file"
# If there are failures, log details
if [ "$failed_count" -gt 0 ]; then
echo "$timestamp - FAILURES DETECTED in $flow_id:" >> "${log_file}.detailed"
echo "$state_output" >> "${log_file}.detailed"
echo "---" >> "${log_file}.detailed"
fi
else
echo "$timestamp,$flow_id,ERROR,ERROR" >> "$log_file"
fi
sleep "$interval"
done
}
# Start tracking (run in background)
track_flow_state_history "production-flow" 30 &
State-Based Actions
# Perform actions based on flow state
state_based_actions() {
local flow_id="$1"
echo "Checking flow state for automated actions: $flow_id"
# Get current state
state_output=$(tg-show-flow-state -f "$flow_id")
if [ $? -ne 0 ]; then
echo "ERROR: Cannot get flow state"
return 1
fi
# Check specific processors
if echo "$state_output" | grep -q -e "- pdf-processor .*❌"; then
echo "PDF processor is down - attempting restart..."
# Restart specific processor (this would need additional commands)
# restart_processor "$flow_id" "pdf-processor"
fi
if echo "$state_output" | grep -q -e "- embeddings-generator .*❌"; then
echo "Embeddings generator is down - checking dependencies..."
# Check GPU availability, memory, etc.
nvidia-smi 2>/dev/null || echo "GPU not available"
fi
# Count total failures
failed_count=$(echo "$state_output" | grep -c "❌")
if [ "$failed_count" -gt 3 ]; then
echo "CRITICAL: More than 3 processors failed - considering flow restart"
# This would trigger more serious recovery actions
fi
}
Performance Correlation
# Correlate flow state with performance metrics
correlate_state_performance() {
local flow_id="$1"
local metrics_url="$2"
echo "Correlating flow state with performance for: $flow_id"
# Get flow state
state_output=$(tg-show-flow-state -f "$flow_id" -m "$metrics_url")
healthy_count=$(echo "$state_output" | grep -c "💚")
failed_count=$(echo "$state_output" | grep -c "❌")
echo "Processors - Healthy: $healthy_count, Failed: $failed_count"
# Get performance metrics (this would need additional API calls)
# throughput=$(get_flow_throughput "$flow_id" "$metrics_url")
# latency=$(get_flow_latency "$flow_id" "$metrics_url")
# echo "Performance - Throughput: ${throughput}/min, Latency: ${latency}ms"
# Calculate health ratio
total_processors=$((healthy_count + failed_count))
if [ "$total_processors" -gt 0 ]; then
health_ratio=$(echo "scale=2; $healthy_count * 100 / $total_processors" | bc)
echo "Health ratio: ${health_ratio}%"
fi
}
Integration with Monitoring Systems
Prometheus Integration
# Export flow state metrics to Prometheus format
export_prometheus_metrics() {
local flow_id="$1"
local metrics_file="flow_state_metrics.prom"
# Get flow state
state_output=$(tg-show-flow-state -f "$flow_id")
# Count states
healthy_count=$(echo "$state_output" | grep -c "💚")
failed_count=$(echo "$state_output" | grep -c "❌")
# Generate Prometheus metrics
cat > "$metrics_file" << EOF
# HELP trustgraph_flow_processors_healthy Number of healthy processors in flow
# TYPE trustgraph_flow_processors_healthy gauge
trustgraph_flow_processors_healthy{flow_id="$flow_id"} $healthy_count
# HELP trustgraph_flow_processors_failed Number of failed processors in flow
# TYPE trustgraph_flow_processors_failed gauge
trustgraph_flow_processors_failed{flow_id="$flow_id"} $failed_count
# HELP trustgraph_flow_health_ratio Ratio of healthy processors
# TYPE trustgraph_flow_health_ratio gauge
EOF
total=$((healthy_count + failed_count))
if [ "$total" -gt 0 ]; then
ratio=$(echo "scale=4; $healthy_count / $total" | bc)
echo "trustgraph_flow_health_ratio{flow_id=\"$flow_id\"} $ratio" >> "$metrics_file"
fi
echo "Prometheus metrics exported to: $metrics_file"
}
Grafana Dashboard Data
# Generate data for Grafana dashboard
generate_grafana_data() {
local flows=("$@")
local output_file="grafana_flow_data.json"
echo "Generating Grafana dashboard data..."
echo "{" > "$output_file"
echo " \"flows\": [" >> "$output_file"
for i in "${!flows[@]}"; do
flow="${flows[$i]}"
# Get flow state
state_output=$(tg-show-flow-state -f "$flow" 2>/dev/null)
if [ $? -eq 0 ]; then
healthy=$(echo "$state_output" | grep -c "💚")
failed=$(echo "$state_output" | grep -c "❌")
else
healthy=0
failed=0
fi
echo " {" >> "$output_file"
echo " \"flow_id\": \"$flow\"," >> "$output_file"
echo " \"healthy_processors\": $healthy," >> "$output_file"
echo " \"failed_processors\": $failed," >> "$output_file"
echo " \"timestamp\": \"$(date -Iseconds)\"" >> "$output_file"
if [ $i -lt $((${#flows[@]} - 1)) ]; then
echo " }," >> "$output_file"
else
echo " }" >> "$output_file"
fi
done
echo " ]" >> "$output_file"
echo "}" >> "$output_file"
echo "Grafana data generated: $output_file"
}
Error Handling
Flow Not Found
Exception: Flow 'nonexistent-flow' not found
Solution: Verify the flow ID exists with tg-show-flows.
Metrics API Unavailable
Exception: Connection refused to metrics API
Solution: Check metrics URL and ensure metrics service is running.
Permission Issues
Exception: Access denied to metrics
Solution: Verify permissions for accessing metrics and flow information.
Invalid Flow State
Exception: Unable to parse flow state
Solution: Check if the flow is properly initialized and processors are configured.
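Transient errors, such as a metrics API that is briefly unavailable, can often be handled by retrying before raising an alert. The following is a minimal sketch, assuming the command exits with a non-zero status on the errors listed above (the scripts in this document rely on the same assumption); `retry_command` is a hypothetical helper, not part of TrustGraph.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry_command ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 on the first success, 1 if every attempt failed.
retry_command() {
    local attempts="$1" delay="$2" n=1
    shift 2
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "Giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Usage (commented out so the block has no side effects):
# retry_command 3 5 tg-show-flow-state -f "production-flow"
```

A small fixed number of attempts with a short delay is usually enough to ride out a service restart without hiding a real outage.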
Environment Variables
TRUSTGRAPH_URL
: Default API URL
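Wrapper scripts can mirror this default when they need the API URL themselves. The documented default for `--metrics-url` does not mention `TRUSTGRAPH_URL`, so scripts that target another host may need to pass `-m` explicitly. The following is a minimal sketch under those assumptions; `resolve_api_url` and `metrics_url_for` are hypothetical helper names.

```shell
# Hypothetical helpers; not part of TrustGraph.

# Mirror the documented --api-url default: $TRUSTGRAPH_URL, else localhost.
# Assumption: an empty TRUSTGRAPH_URL is treated the same as an unset one.
resolve_api_url() {
    printf '%s\n' "${TRUSTGRAPH_URL:-http://localhost:8088/}"
}

# Build a metrics URL from an API URL, tolerating a trailing slash.
# Assumption: metrics are served under /api/metrics on the same host,
# as in the examples in this document.
metrics_url_for() {
    printf '%s/api/metrics\n' "${1%/}"
}

# Usage (commented out so the block has no side effects):
# api_url=$(resolve_api_url)
# tg-show-flow-state -f "staging-flow" -u "$api_url" -m "$(metrics_url_for "$api_url")"
```

Stripping the trailing slash before appending the path avoids URLs such as `http://staging:8088//api/metrics`.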
Related Commands
- tg-show-flows - List all flows
- tg-show-processor-state - Show all processor states
- tg-start-flow - Start flow instances
- tg-stop-flow - Stop flow instances
API Integration
This command integrates with:
- TrustGraph Flow API for flow information
- Prometheus/Metrics API for processor state information
Best Practices
- Regular Monitoring: Check flow states regularly in production
- Automated Alerts: Set up automated health checks with alerting
- Historical Tracking: Maintain historical flow state data
- Integration: Integrate with monitoring systems like Prometheus/Grafana
- Documentation: Document expected processor configurations
- Correlation: Correlate flow state with performance metrics
- Recovery Procedures: Develop automated recovery procedures for common failures
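One way to act on the documentation practice is to keep the expected processor names in a script and compare them with the command output. The following is a minimal sketch, assuming the `- name indicator` line format of the sample output; `check_expected_processors` is a hypothetical helper, and the processor names in the usage line are taken from the sample output, not from a real deployment.

```shell
# Hypothetical helper: report expected processors that are missing or unhealthy.
# Usage: check_expected_processors NAME... < state_output
# Prints one line per problem; returns 1 if any problem was found.
# Assumption: each processor appears on a line starting with "- NAME ".
check_expected_processors() {
    local state name line status=0
    state=$(cat)
    for name in "$@"; do
        # First line that starts with "- NAME " (so "pdf-processor" does not
        # match "base-pdf-processor").
        line=$(printf '%s\n' "$state" |
            awk -v pat="- $name " 'index($0, pat) == 1 && !found { print; found = 1 }')
        if [ -z "$line" ]; then
            echo "MISSING: $name"
            status=1
        else
            case "$line" in
                *"💚"*) ;;                                   # healthy, nothing to report
                *) echo "UNHEALTHY: $name"; status=1 ;;
            esac
        fi
    done
    return "$status"
}

# Usage (commented out so the block has no side effects):
# tg-show-flow-state -f "production-flow" | check_expected_processors pdf-processor text-extractor
```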
Troubleshooting
No Processors Shown
# Check if flow exists (replace flow-id with the ID of your flow)
tg-show-flows | grep "flow-id"
# Verify metrics service (URL quoted so the shell does not interpret "?")
curl -s "http://localhost:8088/api/metrics/query?query=processor_info"
Inconsistent States
# Check metrics service health
curl -s http://localhost:8088/api/metrics/health
# Restart metrics collection if needed
Connection Issues
# Test API connectivity
curl -s http://localhost:8088/api/v1/flows
# Test metrics connectivity
curl -s "http://localhost:8088/api/metrics/query?query=up"