tg-save-doc-embeds
Saves document embeddings from TrustGraph processing streams to MessagePack format files.
Synopsis
tg-save-doc-embeds -o OUTPUT_FILE [options]
Description
The tg-save-doc-embeds command connects to TrustGraph’s document embeddings export stream and saves the embeddings to a file in MessagePack format. This is useful for creating backups, exporting data for analysis, or preparing data for migration between systems.
The command should typically be started before document processing begins to capture all embeddings as they are generated.
Options
Required Arguments
| Option | Description |
|---|---|
-o, --output-file FILE | Output file for saved embeddings |
Optional Arguments
| Option | Default | Description |
|---|---|---|
-u, --url URL | $TRUSTGRAPH_API or http://localhost:8088/ | TrustGraph API URL |
-t, --token TOKEN | $TRUSTGRAPH_TOKEN | Authentication token |
-f, --flow-id ID | default | Flow instance ID to monitor |
--format FORMAT | msgpack | Output format - msgpack or json |
--user USER | (none) | Filter by user ID |
--collection COLLECTION | (none) | Filter by collection ID |
Examples
Save Document Embeddings
tg-save-doc-embeds -o document-embeddings.msgpack
Save from Specific Flow
tg-save-doc-embeds \
-o research-embeddings.msgpack \
-f "research-processing-flow"
Filter by Collection
tg-save-doc-embeds \
-o filtered-embeddings.msgpack \
--collection "research-docs"
Export to JSON Format
tg-save-doc-embeds \
-o embeddings.json \
--format json
Output Format
MessagePack Structure
Document embeddings are saved as MessagePack records:
["de", {
"m": {
"i": "document-id",
"u": "user-id",
"c": "collection-id"
},
"c": [{
"c": "text chunk content",
"v": [0.1, 0.2, 0.3, ...]
}]
}]
Components:
- Record Type:
"de"indicates document embeddings - Metadata (
m): Document information and context - Chunks (
c): Text chunks with their vector embeddings
Environment Variables
TRUSTGRAPH_API: Default API URLTRUSTGRAPH_TOKEN: Default authentication token
Related Commands
tg-load-doc-embeds- Load document embeddingstg-dump-msgpack- Inspect MessagePack filestg-save-graph-embeds- Save graph embeddings
API Integration
This command uses the Document Embeddings Export API to stream embeddings data.