tg-load-doc-embeds
Loads document embeddings from MessagePack format into TrustGraph processing pipelines.
Synopsis
tg-load-doc-embeds -i INPUT_FILE [options]
Description
The tg-load-doc-embeds command loads document embeddings from MessagePack files into a running TrustGraph system. This is typically used to restore previously saved document embeddings or to load embeddings generated by external systems.
The command reads document embedding data and streams it to TrustGraph’s document embeddings import API via WebSocket connections.
Options
Required Arguments
| Option | Description |
|---|---|
| -i, --input-file FILE | Input MessagePack file containing document embeddings |
Optional Arguments
| Option | Default | Description |
|---|---|---|
| -u, --url URL | $TRUSTGRAPH_API or http://localhost:8088/ | TrustGraph API URL |
| -t, --token TOKEN | $TRUSTGRAPH_TOKEN | Authentication token |
| -f, --flow-id ID | default | Flow instance ID to use |
| --format FORMAT | msgpack | Input format: msgpack or json |
| --user USER | (from input) | Override user ID from input data |
| --collection COLLECTION | (from input) | Override collection ID from input data |
Examples
Load Document Embeddings
tg-load-doc-embeds -i document-embeddings.msgpack
Load with Custom Flow
tg-load-doc-embeds \
-i embeddings.msgpack \
-f "document-processing-flow"
Override Collection
tg-load-doc-embeds \
-i embeddings.msgpack \
--collection "research-docs"
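The effect of --user and --collection can be pictured as rewriting the metadata of each record before it is sent. The Python sketch below illustrates that idea on a decoded record, using the record layout described under Input Data Format. It is an assumption about the behaviour implied by the option descriptions, not the command's actual implementation, and `apply_overrides` is a hypothetical helper.

```python
import copy


def apply_overrides(record, user=None, collection=None):
    """Return a copy of record with metadata user/collection replaced if given.

    Hypothetical helper: it mirrors what the --user and --collection options
    are described as doing, and is not taken from the TrustGraph source.
    """
    tag, payload = record
    updated = copy.deepcopy(payload)  # leave the caller's record untouched
    if user is not None:
        updated["m"]["u"] = user
    if collection is not None:
        updated["m"]["c"] = collection
    return [tag, updated]


record = ["de", {
    "m": {"i": "document-id", "u": "user-id", "c": "collection-id"},
    "c": [{"c": "text chunk content", "v": [0.1, 0.2, 0.3]}],
}]

overridden = apply_overrides(record, collection="research-docs")
print(overridden[1]["m"])
```

Values that are not overridden, such as the user ID here, are carried over from the input unchanged.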
Load from JSON Format
tg-load-doc-embeds \
-i embeddings.json \
--format json
Input Data Format
MessagePack Structure
Document embeddings are stored as MessagePack records:
["de", {
"m": {
"i": "document-id",
"u": "user-id",
"c": "collection-id"
},
"c": [{
"c": "text chunk content",
"v": [0.1, 0.2, 0.3, ...]
}]
}]
Components:
- Metadata (m): Document ID, user, and collection
- Chunks (c): Text chunks with their vector embeddings
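To make the layout above concrete, the following Python sketch builds one record in its decoded form and checks it against the structure shown. It uses only the standard library and assumes the file has already been decoded into Python lists, dicts, and strings (for example with a MessagePack library). The helper name `validate_record` is hypothetical and not part of TrustGraph, and the strict type checks are an assumption drawn from the example record rather than a documented schema.

```python
from numbers import Real


def validate_record(record):
    """Return True if record matches the ["de", {"m": ..., "c": ...}] layout.

    Hypothetical helper for illustration only.
    """
    # A record is a two-element sequence: a type tag and a payload.
    if not isinstance(record, (list, tuple)) or len(record) != 2:
        return False
    tag, payload = record
    if tag != "de" or not isinstance(payload, dict):
        return False

    # Metadata: document ID ("i"), user ("u"), and collection ("c").
    meta = payload.get("m")
    if not isinstance(meta, dict):
        return False
    if not all(isinstance(meta.get(key), str) for key in ("i", "u", "c")):
        return False

    # Chunks: each one carries its text ("c") and its embedding vector ("v").
    chunks = payload.get("c")
    if not isinstance(chunks, list):
        return False
    for chunk in chunks:
        if not isinstance(chunk, dict) or not isinstance(chunk.get("c"), str):
            return False
        vector = chunk.get("v")
        if not isinstance(vector, list):
            return False
        # Vector entries must be real numbers; booleans are rejected.
        if not all(isinstance(x, Real) and not isinstance(x, bool) for x in vector):
            return False
    return True


example = ["de", {
    "m": {"i": "document-id", "u": "user-id", "c": "collection-id"},
    "c": [{"c": "text chunk content", "v": [0.1, 0.2, 0.3]}],
}]

print(validate_record(example))
print(validate_record(["ge", {}]))
```

A check like this can be useful before loading embeddings produced by an external system, since a malformed record is easier to diagnose locally than after it has been streamed to the API.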
Environment Variables
- TRUSTGRAPH_API: Default API URL
- TRUSTGRAPH_TOKEN: Default authentication token
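The defaults listed in the options table suggest a simple precedence: an explicit command-line value first, then the environment variable, then the built-in default. The Python sketch below illustrates that order. The function names `resolve_url` and `resolve_token` are hypothetical, and treating an empty environment variable as unset is an assumption rather than documented behaviour.

```python
import os

# Built-in fallback taken from the options table.
DEFAULT_URL = "http://localhost:8088/"


def resolve_url(cli_value=None, environ=None):
    """Pick the API URL: CLI value, then TRUSTGRAPH_API, then the default."""
    env = os.environ if environ is None else environ
    if cli_value:
        return cli_value
    return env.get("TRUSTGRAPH_API") or DEFAULT_URL


def resolve_token(cli_value=None, environ=None):
    """Pick the token: CLI value, then TRUSTGRAPH_TOKEN, else None."""
    env = os.environ if environ is None else environ
    if cli_value:
        return cli_value
    return env.get("TRUSTGRAPH_TOKEN") or None


# Passing a plain dict instead of os.environ keeps the example deterministic.
print(resolve_url(None, {"TRUSTGRAPH_API": "http://trustgraph.example:8088/"}))
print(resolve_url(None, {}))
```

In practice this means the variables can be exported once per shell session and the -u and -t options omitted from each invocation.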
Related Commands
- tg-save-doc-embeds - Save document embeddings
- tg-dump-msgpack - Inspect MessagePack files
- tg-load-graph-embeds - Load graph embeddings
API Integration
This command uses the Document Embeddings Import API via WebSocket for efficient streaming.