RunFullBenchmark

Validates Ma Core performance claims: GRPH-009 CTE vs N+1, search latency, sync timing, complex traversal bottlenecks.

Parameters

  • iterations (int, default: 5): Test iterations per benchmark
  • maxTestFiles (int, default: 1000): Max files for benchmarks
  • includeDeepTraversal (bool, default: true): Run expensive deep traversal tests (may time out on large graphs)

Returns

Example

Constraints

  • Minimum dataset: 100 files required for meaningful search benchmarks
  • Sync iterations: Capped at 3 regardless of the iterations parameter (sync is expensive)
  • Graph data required: Run Sync before benchmarks for accurate graph metrics
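
As a minimal sketch of how the first two constraints interact with the iterations parameter, the helper below applies the 100-file minimum and the sync cap of 3 to a requested run. The names plan_benchmark and BenchmarkPlan are hypothetical and are not part of the tool; only the two thresholds come from the list above.

```python
from dataclasses import dataclass

MIN_FILES = 100          # minimum dataset stated under Constraints
MAX_SYNC_ITERATIONS = 3  # sync cap stated under Constraints


@dataclass
class BenchmarkPlan:
    search_iterations: int
    sync_iterations: int
    enough_data: bool


def plan_benchmark(iterations: int, file_count: int) -> BenchmarkPlan:
    """Apply the documented constraints to a requested run (hypothetical helper)."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    return BenchmarkPlan(
        search_iterations=iterations,
        # Sync never runs more than 3 times, whatever was requested.
        sync_iterations=min(iterations, MAX_SYNC_ITERATIONS),
        # Below 100 files the search numbers are not meaningful.
        enough_data=file_count >= MIN_FILES,
    )
```

Under these assumptions, requesting 10 iterations on an 80-file dataset would yield 10 search iterations, 3 sync iterations, and a flag indicating the dataset is too small.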

Benchmarks Executed

  1. GRPH-009 CTE vs N+1: Tests concepts with >5 relations, depth 3, 50 nodes max. Validates 34% performance claim.
  2. Search Performance: 5 test queries (ML, optimization, graph, vectors, concepts) run against Hybrid, Semantic, and Full-text modes. Claimed latencies: 33ms (Hybrid), 116ms (Semantic), 47ms (Full-text).
  3. Sync Performance: Measures sync timing, projects to 2,500 files. Claim: 27s target.
  4. Complex Traversal (optional): Hub concepts with most connections, depth 5, 200 nodes. Identifies 5-8s bottleneck scenarios.
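
To illustrate what the first benchmark compares, here is a small sketch of the two traversal strategies using Python's sqlite3 module. The edges table, its columns, and both function names are assumptions made for this example; the tool's real schema and queries are not shown in this document. Only the depth of 3 and the 50-node cap are taken from the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")  # hypothetical schema
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e"), ("e", "f")],
)


def traverse_n_plus_1(start, max_depth=3, max_nodes=50):
    """N+1 pattern: one SELECT per expanded node."""
    seen, frontier, queries = {start}, [start], 0
    for _ in range(max_depth):
        next_frontier = []
        for node in frontier:
            queries += 1
            rows = conn.execute("SELECT dst FROM edges WHERE src = ?", (node,))
            for (dst,) in rows:
                if dst not in seen and len(seen) < max_nodes:
                    seen.add(dst)
                    next_frontier.append(dst)
        frontier = next_frontier
    return seen, queries


def traverse_cte(start, max_depth=3, max_nodes=50):
    """Single recursive CTE that walks the same neighbourhood in one query."""
    rows = conn.execute(
        """
        WITH RECURSIVE walk(node, depth) AS (
            SELECT ?, 0
            UNION
            SELECT e.dst, w.depth + 1
            FROM walk AS w JOIN edges AS e ON e.src = w.node
            WHERE w.depth < ?
        )
        SELECT DISTINCT node FROM walk LIMIT ?
        """,
        (start, max_depth, max_nodes),
    ).fetchall()
    return {row[0] for row in rows}
```

On this toy graph both functions reach the same nodes from "a", but the N+1 version issues a separate query for every node it expands, which is the overhead the CTE approach avoids.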

Interpreting Results

✓ Passing (within ±20% of claims)

  • Hybrid: 26-40ms | Semantic: 93-140ms | Full-text: 38-56ms
  • Sync: 22-32s for 2,500 files

⚠️ Warning (20-50% slower)

Likely causes: database fragmentation, resource constraints, or disk I/O bottlenecks. Run VACUUM and check system resources.

🚨 Failure (>50% slower)

Likely causes: corrupted indices (rebuild the database), resource exhaustion, incorrect configuration, or locking contention.
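
The three bands above can be sketched as a simple classifier. The claim values (33ms, 116ms, 47ms, and 27s for 2,500 files) come from this page; the function names, the linear extrapolation of sync time, and the choice to count faster-than-claimed results as passing are assumptions for illustration, not the tool's confirmed behavior.

```python
CLAIMS_MS = {"hybrid": 33.0, "semantic": 116.0, "fulltext": 47.0}
SYNC_CLAIM_S = 27.0
SYNC_TARGET_FILES = 2500


def project_sync_seconds(measured_s: float, files_synced: int) -> float:
    """Scale a measured sync time to 2,500 files (assumes linear scaling)."""
    if files_synced <= 0:
        raise ValueError("files_synced must be positive")
    return measured_s * SYNC_TARGET_FILES / files_synced


def classify(measured: float, claim: float) -> str:
    """Map a measurement to pass / warning / failure relative to its claim."""
    slowdown = (measured - claim) / claim
    if slowdown <= 0.20:   # at most 20% slower (faster results also pass here)
        return "pass"
    if slowdown <= 0.50:   # 20-50% slower
        return "warning"
    return "failure"       # more than 50% slower


def classify_search(mode: str, measured_ms: float) -> str:
    return classify(measured_ms, CLAIMS_MS[mode])
```

For example, a 12s sync over 1,000 files projects to 30s for 2,500 files under the linear assumption, which falls inside the passing band for the 27s target.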

Common Use Cases

Pre-release validation:

Quick health check (~1-2 min):

Post-optimization comparison:
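
As a rough illustration, the three scenarios above could map to parameter sets like the ones below. The parameter names and defaults are taken from the Parameters section; the specific values chosen for each scenario and the make_payload helper are assumptions, not recommended settings from the tool's authors.

```python
# Defaults as listed in the Parameters section.
DEFAULTS = {"iterations": 5, "maxTestFiles": 1000, "includeDeepTraversal": True}


def make_payload(**overrides):
    """Merge overrides into the documented defaults (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


# Illustrative values only: more iterations for a steadier pre-release reading.
pre_release = make_payload(iterations=10)

# Illustrative values only: a small, shallow run for a fast health check.
quick_check = make_payload(iterations=1, maxTestFiles=100, includeDeepTraversal=False)

# For a before/after comparison, keep the settings identical across both runs.
post_optimization = make_payload()
```

Whatever values are used, keeping them the same before and after a change is what makes the post-optimization comparison meaningful.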

Integration

  • Sync: Required before running; it rebuilds graph indices
  • MemoryStatus: Compare dataset size with benchmark report
  • SearchMemories/BuildContext/Visualize: The tools being benchmarked; their measured latency directly affects user experience

Troubleshooting

"Insufficient test data (X files)": At least 100 files are needed. Add test data or accept the limitations of a small dataset.

Benchmark timeout: Complex traversal can exceed 5 min on large graphs. Set includeDeepTraversal=false to skip it.

Results highly variable: Increase iterations (10-20) or close other applications during benchmark.
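
One way to judge whether results are "highly variable" is the coefficient of variation (standard deviation divided by mean) of the per-iteration timings. This is a generic statistical sketch, not something the tool is known to report; the function names and the 10% threshold are assumptions.

```python
import statistics


def coefficient_of_variation(samples_ms):
    """Sample standard deviation divided by the mean of the timings."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples_ms) / statistics.mean(samples_ms)


def is_noisy(samples_ms, threshold=0.10):
    """Flag a run as noisy when variation exceeds the (assumed) threshold."""
    return coefficient_of_variation(samples_ms) > threshold
```

If a run is flagged as noisy, raising iterations gives the mean more samples to settle on.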

Results much slower than claims: Run Sync, VACUUM the database, and check system resources (4GB+ RAM, SSD recommended).
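
The VACUUM step might look like the sketch below, assuming the database is a SQLite file. That is an assumption: this page does not name the storage engine or the file location, so vacuum_database and its db_path argument are placeholders.

```python
import sqlite3


def vacuum_database(db_path):
    """Run VACUUM on a SQLite file and return (pages_before, pages_after)."""
    # isolation_level=None puts the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        before = conn.execute("PRAGMA page_count").fetchone()[0]
        conn.execute("VACUUM")
        after = conn.execute("PRAGMA page_count").fetchone()[0]
        return before, after
    finally:
        conn.close()
```

A large drop in page count suggests the file was fragmented, in which case re-running the benchmark afterwards gives a fairer comparison against the claims.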