Trigo Documentation

Trigo Architecture

Overview

Trigo is an RDF triplestore inspired by Oxigraph's architecture, implemented in Go. It provides efficient storage and querying of RDF data with SPARQL support.

Core Components

1. RDF Data Model (pkg/rdf)

Defines the fundamental RDF types:

2. Encoding Layer (internal/encoding)

Handles efficient encoding and decoding of RDF terms using xxHash3 128-bit hashing.

Encoding Strategy

Each term is encoded as:

Term Types:

Hash Function:

Key Encoding

3. Storage Layer (internal/storage)

Provides an abstraction over key-value stores with BadgerDB implementation.

Storage Interface

type Storage interface {
    Begin(writable bool) (Transaction, error)
    Close() error
    Sync() error
}

BadgerDB Implementation

Table Structure

11 logical tables (column families):

Metadata:

Default Graph (3 indexes):

Named Graphs (6 indexes):

Graph Metadata:

4. Store Layer (pkg/store)

Manages the triplestore with automatic index maintenance.

Operations

Index Selection Algorithm

Selects the optimal index based on bound positions in the pattern:

Pattern: (?s, bound_p, bound_o, ?g)
Selected: POS index (predicate and object are bound)

Priority:

  1. Most bound terms
  2. Subject/Predicate bindings preferred over Object
  3. Graph-specific indexes when graph is bound

5. SPARQL Parser (pkg/sparql/parser)

Converts SPARQL query text to an Abstract Syntax Tree (AST).

Supported Query Types

AST Structure

Parser Design

6. Query Optimizer (pkg/sparql/optimizer)

Optimizes query execution plans using heuristic-based optimization.

Optimization Techniques

1. Join Reordering (Greedy)

Selectivity Heuristics:

- Bound subject:   selectivity × 0.01
- Bound predicate: selectivity × 0.1
- Bound object:    selectivity × 0.1

2. Filter Push-Down

3. Join Type Selection

Query Plans

Operators (following Volcano model):

7. Query Executor (pkg/sparql/executor)

Executes optimized query plans using the Volcano iterator model.

Volcano Iterator Model

Each operator is an iterator with:

Lazy Evaluation:

Iterators

ScanIterator:

NestedLoopJoinIterator:

FilterIterator:

ProjectionIterator:

LimitIterator:

OffsetIterator:

DistinctIterator:

Query Execution Flow

SPARQL Query Text
       ↓
   [Parser]
       ↓
      AST
       ↓
  [Optimizer]
       ↓
   Query Plan
       ↓
  [Executor]
       ↓
   Iterators
       ↓
    Results

Example Query Execution

Query:

SELECT ?person ?name
WHERE {
  ?person foaf:name ?name .
  ?person foaf:age ?age .
}
LIMIT 10

Execution Plan:

LimitPlan(10)
  └─ ProjectionPlan([?person, ?name])
      └─ JoinPlan(NestedLoop)
          ├─ ScanPlan(?person foaf:age ?age)    [more selective]
          └─ ScanPlan(?person foaf:name ?name)

Why this order?

Transaction Model

Snapshot Isolation

No Garbage Collection

Performance Characteristics

Strengths

Current Limitations

Future Enhancements

Short-term

  1. Complete filter expression evaluation
  2. Implement hash join and merge join
  3. Add ORDER BY execution
  4. Support OPTIONAL and UNION patterns

Medium-term

  1. Collect statistics for better selectivity estimates
  2. Parallel query execution
  3. SPARQL UPDATE (INSERT/DELETE DATA)
  4. RDF data format parsers (Turtle, N-Triples)

Long-term

  1. RDF-star support (quoted triples)
  2. Property paths
  3. Aggregation functions
  4. Full-text search integration
  5. Federated query support

Comparison with Oxigraph

Similarities

Differences

Testing Strategy

Unit Tests

Integration Tests

W3C SPARQL Test Suite

References