Wesichain

Benchmarks

Transparent, reproducible performance comparisons

Recursive Text Splitting

Document chunking throughput with UTF-8 safe separators

Input Size | Time (avg) | Throughput | Notes
16 KB      | 70.8 µs    | 221 MiB/s  | Baseline
128 KB     | 572 µs     | 218 MiB/s  | Consistent scaling
1 MB       | 4.97 ms    | 201 MiB/s  | Large input overhead
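The throughput column follows from input size divided by average time. As a minimal sketch, the helper below (a hypothetical function, not part of the Wesichain API) reproduces the figures to within rounding, assuming the input sizes are binary units (16 KB = 16 × 1024 bytes). The times in the table are themselves rounded, so the recomputed values can differ from the table by a fraction of a MiB/s.

```rust
use std::time::Duration;

/// Throughput in MiB/s for `bytes` processed in `elapsed`.
/// Hypothetical helper for checking the table, not a Wesichain function.
fn throughput_mib_per_s(bytes: u64, elapsed: Duration) -> f64 {
    const MIB: f64 = 1024.0 * 1024.0;
    (bytes as f64 / MIB) / elapsed.as_secs_f64()
}

/// Recomputes the three rows of the table from their size and average time.
fn table_rows() -> [f64; 3] {
    [
        throughput_mib_per_s(16 * 1024, Duration::from_nanos(70_800)), // 16 KB, 70.8 µs
        throughput_mib_per_s(128 * 1024, Duration::from_micros(572)),  // 128 KB, 572 µs
        throughput_mib_per_s(1024 * 1024, Duration::from_micros(4_970)), // 1 MB, 4.97 ms
    ]
}
```

Each recomputed value lands within 1 MiB/s of the corresponding table entry, which is consistent with the sizes being powers of two rather than decimal kilobytes.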

vs Python: ~2-4x faster

Typical Python recursive text splitters achieve roughly 50-100 MiB/s. Wesichain's Rust implementation measures 201-221 MiB/s in the current artifact snapshot, peaking at 221 MiB/s on the 16 KB input.

Methodology: Measured with Criterion on macOS. Chunk size: 1000 chars, overlap: 200 chars. Run: cargo bench -p wesichain-retrieval --bench recursive_splitter

Connector Payload Microbenchmarks

Local payload construction snapshots for Qdrant and Weaviate connectors

Benchmark                     | Wesichain mean | Baseline mean | Delta
Qdrant payload construction   | 0.801 ms       | 0.998 ms      | 1.25x faster
Weaviate payload construction | 1.099 ms       | 1.450 ms      | 1.32x faster
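The Delta column is the ratio of the baseline mean to the Wesichain mean. A minimal sketch of that arithmetic follows; `format_delta` is a hypothetical helper introduced here for illustration, and the "slower" branch is an assumption about how a regression would be reported.

```rust
/// Formats the speedup of a candidate over a baseline, given mean times
/// in the same unit. Hypothetical helper, not part of the benchmark suite.
fn format_delta(baseline_ms: f64, candidate_ms: f64) -> String {
    let ratio = baseline_ms / candidate_ms;
    if ratio >= 1.0 {
        // Candidate takes less time than the baseline.
        format!("{:.2}x faster", ratio)
    } else {
        // Candidate takes more time; report the inverse ratio.
        format!("{:.2}x slower", 1.0 / ratio)
    }
}
```

Applied to the rows above, `format_delta(0.998, 0.801)` and `format_delta(1.450, 1.099)` give the two Delta entries.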

Caveat: These snapshots are single local runs on synthetic datasets. They measure local payload construction only and include no live network call timing. Source files: wesichain/docs/benchmarks/data/qdrant-2026-02-16.json, wesichain/docs/benchmarks/data/weaviate-2026-02-16.json.

Test Parameters

Text Splitter

  • Default separators: ["\n\n", "\n", " ", ""]
  • Chunk size: 1000 characters
  • Overlap: 200 characters
  • Character-based: UTF-8 safe
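To illustrate what "character-based, UTF-8 safe" means for the chunk size and overlap above, here is a minimal sketch of overlapping windows measured in characters rather than bytes. It covers only the final fallback case (the empty separator ""), not the full recursive algorithm, and `char_windows` is a hypothetical name rather than Wesichain's implementation.

```rust
/// Splits `text` into windows of at most `chunk_size` characters. Each window
/// starts `chunk_size - overlap` characters after the previous one, so
/// consecutive windows share `overlap` characters.
/// Hypothetical helper for illustration, not part of the Wesichain API.
fn char_windows(text: &str, chunk_size: usize, overlap: usize) -> Vec<&str> {
    assert!(
        chunk_size > 0 && overlap < chunk_size,
        "overlap must be smaller than chunk_size"
    );
    // Byte offset of every character start, plus the end of the string.
    // Slicing only at these offsets can never cut a multi-byte character.
    let mut bounds: Vec<usize> = text.char_indices().map(|(i, _)| i).collect();
    bounds.push(text.len());
    let n_chars = bounds.len() - 1;

    let step = chunk_size - overlap;
    let mut out = Vec::new();
    let mut start = 0;
    while start < n_chars {
        let end = (start + chunk_size).min(n_chars);
        out.push(&text[bounds[start]..bounds[end]]);
        if end == n_chars {
            break; // The last window reached the end of the text.
        }
        start += step;
    }
    out
}
```

With the parameters listed above, `char_windows(text, 1000, 200)` advances 800 characters per window. Indexing by character offsets instead of byte offsets is what keeps the slices valid when the input contains multi-byte characters such as "é".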

Environment

  • Platform: macOS (Darwin)
  • Rust version: 1.92.0 in latest connector snapshots
  • Optimization: --release
  • CPU: Apple Silicon M-series

Reproducing These Results

All benchmarks are open source and reproducible. Run them yourself:

git clone https://github.com/wesichain/wesichain.git
cd wesichain
cargo bench -p wesichain-retrieval --bench recursive_splitter
cargo bench -p wesichain-qdrant --bench vs_langchain -- --sample-size 10
cargo bench -p wesichain-weaviate --bench vs_langchain -- --sample-size 10

Results are saved to target/criterion/ for detailed analysis, including confidence intervals.