tg-show-library-processing

Displays all active library document processing records and their details.

Synopsis

tg-show-library-processing [options]

Description

The tg-show-library-processing command lists all library document processing records, showing the status and details of document processing jobs that have been initiated through the library system. This provides visibility into which documents are being processed, their associated flows, and processing metadata.

Options

Optional Arguments

  • -u, --api-url URL: TrustGraph API URL (default: $TRUSTGRAPH_URL or http://localhost:8088/)
  • -U, --user USER: User ID to filter processing records (default: trustgraph)

Examples

Show All Processing Records

tg-show-library-processing

Show Processing for Specific User

tg-show-library-processing -U "research-team"

Use Custom API URL

tg-show-library-processing -u http://production:8088/

Output Format

The command displays processing records in formatted tables:

+----------------+----------------------------------+
| id             | proc_research_001                |
| document-id    | doc_123456789                    |
| time           | 2023-12-15 14:30:22             |
| flow           | research-processing              |
| collection     | research-docs                    |
| tags           | nlp, research, automated         |
+----------------+----------------------------------+

+----------------+----------------------------------+
| id             | proc_batch_002                   |
| document-id    | doc_987654321                    |
| time           | 2023-12-15 14:25:18             |
| flow           | document-analysis                |
| collection     | batch-processed                  |
| tags           | batch, analysis                  |
+----------------+----------------------------------+

Field Details

  • id: Unique processing record identifier
  • document-id: ID of the document being processed
  • time: Timestamp when processing was initiated
  • flow: Flow instance used for processing
  • collection: Target collection for processed data
  • tags: Associated tags for categorization

Empty Results

If no processing records exist:

No processing objects.

Use Cases

Processing Status Monitoring

# Monitor active processing jobs
monitor_processing_status() {
  local interval="${1:-30}"  # Default 30 seconds
  
  echo "Monitoring library processing status..."
  echo "Refresh interval: ${interval}s"
  echo "Press Ctrl+C to stop"
  
  while true; do
    clear
    echo "Library Processing Monitor - $(date)"
    echo "===================================="
    
    tg-show-library-processing
    
    echo -e "\nProcessing Summary:"
    processing_count=$(tg-show-library-processing 2>/dev/null | grep -c "| id" || echo "0")
    echo "Active processing jobs: $processing_count"
    
    sleep "$interval"
  done
}

# Start monitoring
monitor_processing_status 15

User Activity Analysis

# Analyze processing activity by user
analyze_user_processing() {
  local users=("user1" "user2" "user3" "research-team")
  
  echo "Processing Activity Analysis"
  echo "==========================="
  
  for user in "${users[@]}"; do
    echo -e "\n--- User: $user ---"
    
    processing_output=$(tg-show-library-processing -U "$user" 2>/dev/null)
    
    if echo "$processing_output" | grep -q "No processing objects"; then
      echo "No active processing"
    else
      count=$(echo "$processing_output" | grep -c "| id" || echo "0")
      echo "Active processing jobs: $count"
      
      # Show recent jobs
      echo "Recent processing:"
      echo "$processing_output" | grep -E "(id|time|flow)" | head -9
    fi
  done
}

# Run analysis
analyze_user_processing

Processing Queue Management

# Manage processing queue
manage_processing_queue() {
  echo "Processing Queue Management"
  echo "=========================="
  
  # Show current queue
  echo "Current processing queue:"
  tg-show-library-processing
  
  # Count by flow
  echo -e "\nProcessing jobs by flow:"
  tg-show-library-processing | \
    grep "| flow" | \
    awk '{print $3}' | \
    sort | uniq -c | sort -nr
  
  # Count by collection
  echo -e "\nProcessing jobs by collection:"
  tg-show-library-processing | \
    grep "| collection" | \
    awk '{print $3}' | \
    sort | uniq -c | sort -nr
  
  # Find long-running jobs (would need timestamps comparison)
  echo -e "\nNote: Check timestamps for long-running jobs"
}

# Run queue management
manage_processing_queue

Cleanup and Maintenance

# Clean up completed processing records
cleanup_processing_records() {
  local user="$1"
  local max_age_days="${2:-7}"  # Default 7 days
  
  echo "Cleaning up processing records older than $max_age_days days for user: $user"
  
  # Get processing records
  processing_output=$(tg-show-library-processing -U "$user")
  
  if echo "$processing_output" | grep -q "No processing objects"; then
    echo "No processing records to clean up"
    return
  fi
  
  # Parse processing records (this is a simplified example)
  echo "$processing_output" | \
    grep "| id" | \
    awk '{print $3}' | \
    while read proc_id; do
      echo "Checking processing record: $proc_id"
      
      # Get the time for this processing record
      proc_time=$(echo "$processing_output" | \
        grep -A10 "$proc_id" | \
        grep "| time" | \
        awk '{print $3 " " $4}')
      
      if [ -n "$proc_time" ]; then
        # Calculate age (this would need proper date comparison)
        echo "Processing record $proc_id from: $proc_time"
        
        # Check if document processing is complete
        if tg-invoke-document-rag -q "test" -U "$user" 2>/dev/null | grep -q "answer"; then
          echo "Document appears to be processed, considering cleanup..."
          # tg-stop-library-processing --id "$proc_id" -U "$user"
        fi
      fi
    done
}

# Clean up old records
cleanup_processing_records "test-user" 3

Advanced Usage

Processing Performance Analysis

# Analyze processing performance
analyze_processing_performance() {
  echo "Processing Performance Analysis"
  echo "=============================="
  
  # Get all processing records
  processing_data=$(tg-show-library-processing)
  
  if echo "$processing_data" | grep -q "No processing objects"; then
    echo "No processing data available"
    return
  fi
  
  # Count total processing jobs
  total_jobs=$(echo "$processing_data" | grep -c "| id")
  echo "Total active processing jobs: $total_jobs"
  
  # Analyze by flow type
  echo -e "\nJobs by flow type:"
  echo "$processing_data" | \
    grep "| flow" | \
    awk '{print $3}' | \
    sort | uniq -c | sort -nr | \
    while read count flow; do
      echo "  $flow: $count jobs"
    done
  
  # Analyze by time patterns
  echo -e "\nJobs by hour (last 24h):"
  echo "$processing_data" | \
    grep "| time" | \
    awk '{print $4}' | \
    cut -d: -f1 | \
    sort | uniq -c | sort -k2n | \
    while read count hour; do
      echo "  ${hour}:00: $count jobs"
    done
}

# Run performance analysis
analyze_processing_performance

Cross-User Processing Comparison

# Compare processing across users
compare_user_processing() {
  local users=("$@")
  
  echo "Cross-User Processing Comparison"
  echo "==============================="
  
  for user in "${users[@]}"; do
    echo -e "\n--- User: $user ---"
    
    processing_data=$(tg-show-library-processing -U "$user" 2>/dev/null)
    
    if echo "$processing_data" | grep -q "No processing objects"; then
      echo "Active jobs: 0"
      echo "Collections: none"
      echo "Flows: none"
    else
      # Count jobs
      job_count=$(echo "$processing_data" | grep -c "| id")
      echo "Active jobs: $job_count"
      
      # List collections
      collections=$(echo "$processing_data" | \
        grep "| collection" | \
        awk '{print $3}' | \
        sort | uniq | tr '\n' ', ' | sed 's/,$//')
      echo "Collections: $collections"
      
      # List flows
      flows=$(echo "$processing_data" | \
        grep "| flow" | \
        awk '{print $3}' | \
        sort | uniq | tr '\n' ', ' | sed 's/,$//')
      echo "Flows: $flows"
    fi
  done
}

# Compare processing for multiple users
compare_user_processing "user1" "user2" "research-team" "admin"

Processing Health Check

# Health check for processing system
processing_health_check() {
  echo "Library Processing Health Check"
  echo "=============================="
  
  # Check if processing service is responsive
  if tg-show-library-processing > /dev/null 2>&1; then
    echo "✓ Processing service is responsive"
  else
    echo "✗ Processing service is not responsive"
    return 1
  fi
  
  # Get processing statistics
  processing_data=$(tg-show-library-processing 2>/dev/null)
  
  if echo "$processing_data" | grep -q "No processing objects"; then
    echo "ℹ No active processing jobs"
  else
    active_jobs=$(echo "$processing_data" | grep -c "| id")
    echo "ℹ Active processing jobs: $active_jobs"
    
    # Check for stuck jobs (simplified check)
    echo "Recent job timestamps:"
    echo "$processing_data" | \
      grep "| time" | \
      awk '{print $3 " " $4}' | \
      head -5
  fi
  
  # Check flow availability
  echo -e "\nFlow availability check:"
  flows=$(echo "$processing_data" | grep "| flow" | awk '{print $3}' | sort | uniq)
  
  for flow in $flows; do
    if tg-show-flows | grep -q "$flow"; then
      echo "✓ Flow '$flow' is available"
    else
      echo "⚠ Flow '$flow' may not be available"
    fi
  done
  
  echo "Health check completed"
}

# Run health check
processing_health_check

Processing Report Generation

# Generate comprehensive processing report
generate_processing_report() {
  local output_file="processing_report_$(date +%Y%m%d_%H%M%S).txt"
  
  echo "Generating processing report: $output_file"
  
  cat > "$output_file" << EOF
TrustGraph Library Processing Report
Generated: $(date)
====================================

EOF
  
  # Overall statistics
  echo "OVERVIEW" >> "$output_file"
  echo "--------" >> "$output_file"
  
  processing_data=$(tg-show-library-processing 2>/dev/null)
  
  if echo "$processing_data" | grep -q "No processing objects"; then
    echo "No active processing jobs" >> "$output_file"
  else
    total_jobs=$(echo "$processing_data" | grep -c "| id")
    echo "Total active jobs: $total_jobs" >> "$output_file"
    
    # Flow distribution
    echo -e "\nFLOW DISTRIBUTION" >> "$output_file"
    echo "-----------------" >> "$output_file"
    echo "$processing_data" | \
      grep "| flow" | \
      awk '{print $3}' | \
      sort | uniq -c | sort -nr >> "$output_file"
    
    # Collection distribution
    echo -e "\nCOLLECTION DISTRIBUTION" >> "$output_file"
    echo "-----------------------" >> "$output_file"
    echo "$processing_data" | \
      grep "| collection" | \
      awk '{print $3}' | \
      sort | uniq -c | sort -nr >> "$output_file"
    
    # Recent activity
    echo -e "\nRECENT PROCESSING JOBS" >> "$output_file"
    echo "----------------------" >> "$output_file"
    echo "$processing_data" | head -50 >> "$output_file"
  fi
  
  echo "Report generated: $output_file"
}

# Generate report
generate_processing_report

Integration with Other Commands

Processing Workflow Management

# Complete processing workflow
manage_processing_workflow() {
  local user="$1"
  local action="$2"
  
  case "$action" in
    "status")
      echo "Processing status for user: $user"
      tg-show-library-processing -U "$user"
      ;;
    "start-batch")
      echo "Starting batch processing for user: $user"
      tg-show-library-documents -U "$user" | \
        grep "| id" | \
        awk '{print $3}' | \
        while read doc_id; do
          proc_id="batch_$(date +%s)_${doc_id}"
          tg-start-library-processing -d "$doc_id" --id "$proc_id" -U "$user"
        done
      ;;
    "cleanup")
      echo "Cleaning up completed processing for user: $user"
      cleanup_processing_records "$user"
      ;;
    *)
      echo "Usage: manage_processing_workflow <user> <status|start-batch|cleanup>"
      ;;
  esac
}

# Manage workflow for user
manage_processing_workflow "research-team" "status"

Monitoring Integration

# Integration with system monitoring
processing_metrics_export() {
  local metrics_file="processing_metrics.txt"
  
  # Get processing data
  processing_data=$(tg-show-library-processing 2>/dev/null)
  
  if echo "$processing_data" | grep -q "No processing objects"; then
    active_jobs=0
  else
    active_jobs=$(echo "$processing_data" | grep -c "| id")
  fi
  
  # Export metrics
  echo "trustgraph_library_processing_active_jobs $active_jobs" > "$metrics_file"
  echo "trustgraph_library_processing_timestamp $(date +%s)" >> "$metrics_file"
  
  # Export by flow
  if [ "$active_jobs" -gt 0 ]; then
    echo "$processing_data" | \
      grep "| flow" | \
      awk '{print $3}' | \
      sort | uniq -c | \
      while read count flow; do
        echo "trustgraph_library_processing_jobs_by_flow{flow=\"$flow\"} $count" >> "$metrics_file"
      done
  fi
  
  echo "Metrics exported to: $metrics_file"
}

processing_metrics_export

Error Handling

API Connection Issues

Exception: Connection refused

Solution: Check API URL and ensure TrustGraph is running.

Permission Errors

Exception: Access denied

Solution: Verify user permissions for library access.

User Not Found

Exception: User not found

Solution: Check user ID and ensure user exists in the system.

Service Unavailable

Exception: Service temporarily unavailable

Solution: Check TrustGraph service status and try again.

Environment Variables

  • TRUSTGRAPH_URL: Default API URL

API Integration

This command uses the Library API to retrieve processing record information.

Best Practices

  1. Regular Monitoring: Check processing status regularly
  2. User Filtering: Use user filtering to focus on relevant processing
  3. Cleanup: Regularly clean up completed processing records
  4. Performance Tracking: Monitor processing patterns and performance
  5. Integration: Integrate with monitoring and alerting systems
  6. Documentation: Document processing workflows and procedures
  7. Troubleshooting: Use processing information for issue diagnosis

Troubleshooting

No Processing Records

# Check if library service is running
curl -s http://localhost:8088/api/v1/library/processing

# Verify documents exist
tg-show-library-documents

Stale Processing Records

# Check for long-running processes
tg-show-library-processing | grep "$(date -d '1 hour ago' '+%Y-%m-%d')"

# Check flow status
tg-show-flows

Performance Issues

# Check system resources
free -h
df -h

# Monitor API response times
time tg-show-library-processing