RunFullBenchmark
Validates Ma Core performance claims: GRPH-009 CTE vs N+1, search latency, sync timing, complex traversal bottlenecks.
Parameters
- iterations (int, default: 5): Test iterations per benchmark
- maxTestFiles (int, default: 1000): Max files for benchmarks
- includeDeepTraversal (bool, default: true): Run expensive deep traversal tests (may time out on large graphs)
Returns
Example
Constraints
- Minimum dataset: 100 files required for meaningful search benchmarks
- Sync iterations: Limited to 3 regardless of the iterations parameter (sync is expensive)
- Graph data required: Run Sync before benchmarks for accurate graph metrics
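The two numeric constraints above can be expressed as a small pre-flight check. This is a minimal sketch, not the tool's own code: the helper names are hypothetical, and rejecting values below 1 is an assumption. Only the numbers 100 and 3 come from the constraints listed here.

```python
# Hypothetical helpers; only the limits 100 and 3 come from the documentation.
MIN_FILES_FOR_SEARCH = 100  # minimum dataset for meaningful search benchmarks
MAX_SYNC_ITERATIONS = 3     # sync runs are capped regardless of `iterations`


def effective_sync_iterations(iterations: int) -> int:
    """Return how many sync runs a request for `iterations` would produce."""
    if iterations < 1:
        # Assumption: the tool's behavior for values below 1 is not documented.
        raise ValueError("iterations must be at least 1")
    return min(iterations, MAX_SYNC_ITERATIONS)


def has_enough_files(file_count: int) -> bool:
    """True when the dataset meets the 100-file minimum for search benchmarks."""
    return file_count >= MIN_FILES_FOR_SEARCH
```

With the default iterations of 5, search benchmarks would run five times while sync would run only three.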
Benchmarks Executed
- GRPH-009 CTE vs N+1: Tests concepts with >5 relations, depth 3, 50 nodes max. Validates 34% performance claim.
- Search Performance: 5 test queries (ML, optimization, graph, vectors, concepts) against Hybrid/Semantic/Full-text modes. Claims: 33ms/116ms/47ms.
- Sync Performance: Measures sync timing, projects to 2,500 files. Claim: 27s target.
- Complex Traversal (optional): Hub concepts with most connections, depth 5, 200 nodes. Identifies 5-8s bottleneck scenarios.
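To illustrate what the CTE vs N+1 comparison measures, the sketch below walks a tiny graph both ways using an in-memory SQLite database. Everything here is an assumption made for illustration: the `relations` table, its columns, the function names, and the use of SQLite are hypothetical and do not reflect how GRPH-009 is implemented. The depth and node limits mirror the depth 3 and 50-node figures above.

```python
import sqlite3


def build_demo_graph() -> sqlite3.Connection:
    """Create a hypothetical edge table with a handful of relations."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE relations (source TEXT, target TEXT)")
    edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e"), ("e", "f")]
    conn.executemany("INSERT INTO relations VALUES (?, ?)", edges)
    return conn


def traverse_n_plus_one(conn, start, max_depth=3, max_nodes=50):
    """Breadth-first walk that issues one query per expanded node."""
    seen = {start}
    frontier = [start]
    queries = 0
    for _ in range(max_depth):
        next_frontier = []
        for node in frontier:
            rows = conn.execute(
                "SELECT target FROM relations WHERE source = ?", (node,)
            ).fetchall()
            queries += 1
            for (target,) in rows:
                if target not in seen and len(seen) < max_nodes:
                    seen.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return seen, queries


def traverse_cte(conn, start, max_depth=3, max_nodes=50):
    """Same walk expressed as a single recursive CTE query."""
    sql = """
        WITH RECURSIVE walk(node, depth) AS (
            SELECT ?, 0
            UNION
            SELECT r.target, w.depth + 1
            FROM relations AS r
            JOIN walk AS w ON r.source = w.node
            WHERE w.depth < ?
        )
        SELECT node FROM walk
        GROUP BY node
        ORDER BY MIN(depth), node
        LIMIT ?
    """
    rows = conn.execute(sql, (start, max_depth, max_nodes)).fetchall()
    return {node for (node,) in rows}, 1
```

On this demo graph both functions reach the same nodes, but the N+1 version needs one round trip per expanded node while the CTE version needs a single query. The two node caps are not applied in an identical order, so results may differ once a graph exceeds the node limit.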
Interpreting Results
✓ Passing (within ±20% of claims)
- Hybrid: 26-40ms | Semantic: 93-140ms | Full-text: 38-56ms
- Sync: 22-32s for 2,500 files
⚠️ Warning (20-50% slower)
Likely causes: database fragmentation, resource constraints, or disk I/O bottlenecks. Run VACUUM and check system resources.
🚨 Failure (>50% slower)
Likely causes: corrupted indices (rebuild the database), resource exhaustion, incorrect configuration, or lock contention.
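The bands above can be applied mechanically to a measured value. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the sync projection assumes time scales linearly with file count, and results more than 20% faster than a claim are treated as passing because the documentation does not say how to classify them.

```python
# Claimed values taken from the benchmark descriptions in this document.
SEARCH_CLAIMS_MS = {"hybrid": 33, "semantic": 116, "fulltext": 47}
SYNC_CLAIM_SECONDS = 27


def project_sync_seconds(measured_seconds, measured_files, target_files=2500):
    """Scale a measured sync time to the 2,500-file reference size.

    Assumption: sync time grows linearly with the number of files.
    """
    if measured_files <= 0:
        raise ValueError("measured_files must be positive")
    return measured_seconds * target_files / measured_files


def classify(measured, claim):
    """Map a measurement to 'pass', 'warning', or 'failure'."""
    if claim <= 0:
        raise ValueError("claim must be positive")
    slowdown = (measured - claim) / claim  # 0.25 means 25% slower than claimed
    if slowdown <= 0.20:
        # Assumption: faster-than-claimed results also count as passing.
        return "pass"
    if slowdown <= 0.50:
        return "warning"
    return "failure"
```

For example, a sync that takes 12 seconds on 1,000 files projects to 30 seconds at 2,500 files, which is about 11% over the 27-second claim and therefore still passes.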
Common Use Cases
Pre-release validation:
Quick health check (~1-2 min):
Post-optimization comparison:
Integration
- Sync: Required before running - rebuilds graph indices
- MemoryStatus: Compare dataset size with benchmark report
- SearchMemories/BuildContext/Visualize: Tools being benchmarked - results affect UX
Troubleshooting
"Insufficient test data (X files)": Need 100+ files. Add test data or accept small dataset limitations.
Benchmark timeout: Complex traversal can exceed 5 min on large graphs. Use includeDeepTraversal=false.
Results highly variable: Increase iterations (10-20) or close other applications during benchmark.
Results much slower than claims: Run Sync, VACUUM database, check system resources (4GB+ RAM, SSD recommended).
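One way to judge whether results are "highly variable" is the coefficient of variation (standard deviation divided by mean) across iterations. This is a sketch only: the helper names and the 0.15 threshold are assumptions, since the documentation does not define a variability cutoff.

```python
import statistics

# Assumption: the 0.15 cutoff is illustrative, not a documented value.
VARIABILITY_THRESHOLD = 0.15


def coefficient_of_variation(timings_ms):
    """Return sample standard deviation divided by mean for a list of timings."""
    if len(timings_ms) < 2:
        raise ValueError("need at least two timings")
    mean = statistics.fmean(timings_ms)
    if mean <= 0:
        raise ValueError("timings must have a positive mean")
    return statistics.stdev(timings_ms) / mean


def needs_more_iterations(timings_ms, threshold=VARIABILITY_THRESHOLD):
    """True when the spread suggests rerunning with a higher iterations value."""
    return coefficient_of_variation(timings_ms) > threshold
```

If the check returns true, rerunning with iterations in the 10-20 range, as suggested above, gives a steadier average.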