tg-show-token-rate
Synopsis
tg-show-token-rate [OPTIONS]
Description
The tg-show-token-rate command displays a live stream of token usage rates from TrustGraph processors. It monitors both input and output tokens, showing per-second rates averaged over the time the command has been running. This command is essential for monitoring LLM token consumption and understanding processing throughput.
The command queries the metrics endpoint for token usage data and displays:
- Input token rates (tokens per second)
- Output token rates (tokens per second)
- Total token rates (combined input + output)
All rates are calculated as averages since the command started running.
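The underlying data can also be inspected by querying the metrics endpoint directly, which is useful for verifying that token metrics are being exported at all. This is only a sketch: the exact response format and metric names depend on your TrustGraph deployment.
# Fetch the raw metrics document and filter for token-related entries.
# The grep pattern is illustrative; adjust it to the metric names your
# deployment actually exports.
curl -s http://localhost:8088/api/metrics | grep -i token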
Options
-m, --metrics-url URL
- Metrics endpoint URL to query for token information
- Default: http://localhost:8088/api/metrics
- Should point to a Prometheus-compatible metrics endpoint
-p, --period SECONDS
- Sampling period in seconds between measurements
- Default: 1
- Controls how frequently token rates are updated
-n, --number-samples COUNT
- Number of samples to collect before stopping
- Default: 100
- Set to a large value for continuous monitoring
-h, --help
- Show help message and exit
Examples
Basic Usage
Monitor token rates with default settings (1-second intervals, 100 samples):
tg-show-token-rate
Custom Sampling Period
Monitor token rates with 5-second intervals:
tg-show-token-rate --period 5
Continuous Monitoring
Monitor token rates continuously (1000 samples):
tg-show-token-rate -n 1000
Remote Monitoring
Monitor token rates from a remote TrustGraph instance:
tg-show-token-rate -m http://10.0.1.100:8088/api/metrics
High-Frequency Monitoring
Monitor token rates with sub-second precision:
tg-show-token-rate --period 0.5 --number-samples 200
Output Format
The command displays a table with continuously updated token rates:
Input Output Total
----- ------ -----
12.3 8.7 21.0
15.2 10.1 25.3
18.7 12.4 31.1
...
Each row shows:
- Input: Average input tokens per second since monitoring started
- Output: Average output tokens per second since monitoring started
- Total: Combined input + output tokens per second
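The rows themselves carry no timestamps, so when saving output for later analysis it can help to add one per sample. A minimal sketch:
#!/bin/bash
# Prefix each output row with an ISO-8601 timestamp (GNU date) so the log
# can be correlated with other events afterwards.
tg-show-token-rate --period 5 --number-samples 60 | \
  while IFS= read -r line; do
    echo "$(date -Iseconds) $line"
  done > token_rates_timestamped.log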
Advanced Usage
Token Rate Analysis
Create a script to analyze token usage patterns:
#!/bin/bash
echo "Starting token rate analysis..."
tg-show-token-rate --period 2 --number-samples 60 > token_rates.txt
echo "Analysis complete. Data saved to token_rates.txt"
Performance Monitoring
Monitor token rates during load testing:
#!/bin/bash
echo "Starting load test monitoring..."
tg-show-token-rate --period 1 --number-samples 300 | tee load_test_tokens.log
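Alternatively, run the monitor in the background for the duration of the load test itself. The load-generation command below is a placeholder; substitute whatever tool you actually use:
#!/bin/bash
# Start token-rate monitoring in the background, run the load test,
# then wait for the monitor to finish collecting its samples.
tg-show-token-rate --period 1 --number-samples 300 > load_test_tokens.log &
MONITOR_PID=$!

./run-load-test.sh   # placeholder for your load-generation command

wait "$MONITOR_PID"
echo "Load test finished; token rates saved to load_test_tokens.log"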
Alert on High Token Usage
Create an alert script for excessive token consumption:
#!/bin/bash
# Take 10 samples at 5-second intervals, keep the final row, and warn
# if the Total column (field 3) exceeds 100 tokens/sec.
tg-show-token-rate -n 10 -p 5 | tail -n 1 | awk '{
if ($3 > 100) {
print "WARNING: High token rate detected:", $3, "tokens/sec"
exit 1
}
}'
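Because the pipeline exits non-zero when the threshold is exceeded, the script can be chained with a notification step. The sketch below assumes the snippet above is saved as check_token_rate.sh (the filename is arbitrary) and uses mail purely as an example notifier:
#!/bin/bash
# Run the threshold check and send a notification if it fails.
# check_token_rate.sh and the mail recipient are illustrative names.
if ! ./check_token_rate.sh; then
  echo "Token rate threshold exceeded on $(hostname)" | \
    mail -s "TrustGraph token rate alert" ops@example.com
fi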
Cost Estimation
Estimate token costs during processing:
#!/bin/bash
echo "Monitoring token usage for cost estimation..."
tg-show-token-rate --period 10 --number-samples 36 | \
awk 'NR>2 {total+=$3; n++} END {if (n > 0) print "Average tokens/sec:", total/n}'
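The measured average can then be turned into a rough cost figure. The per-1K-token price below is purely illustrative; substitute your provider's actual pricing:
#!/bin/bash
# Convert an average token rate into an hourly cost estimate.
# AVG_TOKENS_PER_SEC is an example figure taken from the averaging step above;
# COST_PER_1K is illustrative only, so use your provider's real pricing.
AVG_TOKENS_PER_SEC=25.3
COST_PER_1K=0.002

awk -v rate="$AVG_TOKENS_PER_SEC" -v price="$COST_PER_1K" \
  'BEGIN { printf "Estimated cost per hour: $%.2f\n", rate * 3600 / 1000 * price }'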
Error Handling
The command handles various error conditions:
- Connection errors: If the metrics endpoint is unavailable
- Invalid JSON: If the metrics response is malformed
- Missing metrics: If token metrics are not found
- Network timeouts: If requests to the metrics endpoint time out
Common error scenarios:
# Metrics endpoint not available
tg-show-token-rate -m http://invalid-host:8088/api/metrics
# Output: Exception: [Connection error details]
# Invalid period value
tg-show-token-rate --period 0
# Output: Exception: [Invalid period error]
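In scripts, a simple pre-flight check makes these failures easier to diagnose: verify that the endpoint answers before starting a long monitoring run. A minimal sketch:
#!/bin/bash
# Verify the metrics endpoint responds before starting a long monitoring run.
METRICS_URL="http://localhost:8088/api/metrics"

if ! curl -sf --max-time 5 "$METRICS_URL" > /dev/null; then
  echo "Metrics endpoint $METRICS_URL is not reachable" >&2
  exit 1
fi

tg-show-token-rate -m "$METRICS_URL" --period 2 --number-samples 60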
Integration with Other Commands
With Cost Monitoring
Combine with token cost analysis:
echo "=== Token Rates ==="
tg-show-token-rate -n 5 -p 2
echo
echo "=== Token Costs ==="
tg-show-token-costs
With Processor State
Monitor tokens alongside processor health:
echo "=== Processor States ==="
tg-show-processor-state
echo
echo "=== Token Rates ==="
tg-show-token-rate -n 10 -p 1
With Flow Monitoring
Track token usage per flow:
#!/bin/bash
echo "=== Active Flows ==="
tg-show-flows
echo
echo "=== Token Usage ==="
tg-show-token-rate -n 20 -p 3
Best Practices
- Baseline Monitoring: Establish baseline token rates for normal operation (see the sketch after this list)
- Alert Thresholds: Set up alerts for unusually high token consumption
- Cost Tracking: Monitor token rates to estimate operational costs
- Load Testing: Use during load testing to understand capacity limits
- Historical Analysis: Save token rate data for trend analysis
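As a sketch of the baseline and alert-threshold practices above, the following compares a freshly measured total rate against a previously recorded baseline. The baseline file and the 50% tolerance are illustrative assumptions:
#!/bin/bash
# Compare the latest measured Total rate (field 3) against a stored baseline.
# baseline_rate.txt and the 50% tolerance are illustrative choices.
BASELINE=$(cat baseline_rate.txt)
CURRENT=$(tg-show-token-rate -n 10 -p 2 | tail -n 1 | awk '{print $3}')

awk -v base="$BASELINE" -v cur="$CURRENT" 'BEGIN {
  if (cur > base * 1.5) {
    printf "WARNING: token rate %.1f exceeds baseline %.1f by more than 50%%\n", cur, base
    exit 1
  }
}'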
Troubleshooting
No Token Data
If no token rates are displayed:
- Verify that TrustGraph processors are actively processing requests
- Check that token metrics are being exported properly
- Ensure the metrics endpoint is accessible
- Verify that LLM services are receiving requests
Inconsistent Rates
For inconsistent or erratic token rates:
- Check for network issues affecting metrics collection
- Verify that the sampling period is appropriate for your workload
- Ensure multiple processors aren’t conflicting
- Check system resources (CPU, memory) on the TrustGraph instance
High Token Rates
If token rates are unexpectedly high:
- Investigate the types of queries being processed
- Check for inefficient prompts or large document processing
- Verify that caching is working properly
- Consider if the workload justifies the token usage
Performance Considerations
- Sampling Frequency: Higher frequencies provide more granular data but consume more resources
- Network Latency: Consider network latency when setting sampling periods
- Metrics Storage: Long monitoring sessions generate significant data
- Resource Usage: The command itself uses minimal resources
Related Commands
tg-show-token-costs
- Display token usage costs
tg-show-processor-state
- Show processor states
tg-show-flow-state
- Display flow processor states
tg-show-config
- Show TrustGraph configuration
See Also
- TrustGraph Token Management Documentation
- Prometheus Metrics Configuration
- LLM Cost Optimization Guide