Benchmarks

Transparent, reproducible performance comparisons

Recursive Text Splitting

Document chunking throughput with UTF-8-safe splitting

Input Size	Time (avg)	Throughput	Notes
16 KB	70.8 us	221 MiB/s	Baseline
128 KB	572 us	218 MiB/s	Consistent scaling
1 MB	4.97 ms	201 MiB/s	Large input overhead

vs Python: ~2-4x faster

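Typical Python recursive text splitters achieve roughly 50-100 MiB/s. In the current artifact snapshot, Wesichain's Rust implementation peaks at 221 MiB/s on the 16 KB input and holds 201 MiB/s on the 1 MB input, which puts it at about 2-4x the Python range.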
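The throughput and speedup figures above can be rechecked by hand: throughput is input size divided by mean time. The sketch below assumes the input sizes are binary multiples (16 KB read as 16 × 1024 bytes), which is consistent with the reported MiB/s values. The function names are illustrative and are not part of Wesichain.

```rust
/// Throughput in MiB/s for `bytes` of input processed in `seconds`.
fn throughput_mib_per_s(bytes: u64, seconds: f64) -> f64 {
    let mib = bytes as f64 / (1024.0 * 1024.0);
    mib / seconds
}

/// Speedup of a measured throughput against a (low, high) baseline band.
/// Returns (minimum speedup, maximum speedup).
fn speedup_range(measured: f64, baseline_low: f64, baseline_high: f64) -> (f64, f64) {
    // The smallest speedup is against the fastest baseline, and vice versa.
    (measured / baseline_high, measured / baseline_low)
}
```

With the table's figures, throughput_mib_per_s(16 * 1024, 70.8e-6) comes out at about 220.7 MiB/s, and speedup_range(221.0, 50.0, 100.0) gives roughly (2.2, 4.4). The 128 KB row recomputes to about 218.5 MiB/s rather than exactly 218, most likely because the published mean time is rounded.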

Methodology: measured with Criterion on macOS, using a chunk size of 1000 characters and an overlap of 200 characters. To run: cargo bench -p wesichain-retrieval --bench recursive_splitter

Connector Payload Microbenchmarks

Local payload construction snapshots for Qdrant and Weaviate connectors

Benchmark	Wesichain mean	Baseline mean	Delta
Qdrant payload construction	0.801 ms	0.998 ms	1.25x faster
Weaviate payload construction	1.099 ms	1.450 ms	1.32x faster

Caveat: these snapshots are single local runs on synthetic datasets and benchmark local payload construction only (no live network call timing). Source files: wesichain/docs/benchmarks/data/qdrant-2026-02-16.json, wesichain/docs/benchmarks/data/weaviate-2026-02-16.json.
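For readers who want to recompute the Delta column from the two means, here is a minimal sketch. It assumes the delta is the baseline mean divided by the Wesichain mean, which matches the values in the table. The helper name format_delta is hypothetical and does not come from the Wesichain codebase.

```rust
/// Format a "Delta" cell from two mean timings given in milliseconds.
fn format_delta(wesichain_ms: f64, baseline_ms: f64) -> String {
    if wesichain_ms <= baseline_ms {
        // Lower time is better, so the ratio is baseline over candidate.
        format!("{:.2}x faster", baseline_ms / wesichain_ms)
    } else {
        format!("{:.2}x slower", wesichain_ms / baseline_ms)
    }
}
```

Because each snapshot is a single local run, ratios computed this way carry no confidence interval and should be read as indicative only.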

Test Parameters

Text Splitter

• Default separators: ["\n\n", "\n", " ", ""]
• Chunk size: 1000 characters
• Overlap: 200 characters
• Character-based splitting: UTF-8 safe (chunk boundaries never fall inside a multi-byte character)
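To illustrate what character-based chunk size and overlap mean in practice, the sketch below cuts a string into fixed-size character windows with overlap. This corresponds only to the last-resort "" separator case and is not Wesichain's recursive splitter, which would try the earlier separators first. The function name char_windows is an assumption made for this example.

```rust
/// Split `text` into windows of at most `chunk_size` characters, where
/// consecutive windows share `overlap` characters. Slices always start and
/// end on character boundaries, so multi-byte UTF-8 characters stay intact.
fn char_windows(text: &str, chunk_size: usize, overlap: usize) -> Vec<&str> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    assert!(overlap < chunk_size, "overlap must be smaller than chunk_size");

    // Byte offset of every character start, plus the end of the string.
    let mut bounds: Vec<usize> = text.char_indices().map(|(i, _)| i).collect();
    bounds.push(text.len());

    let total_chars = bounds.len() - 1;
    let step = chunk_size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < total_chars {
        let end = (start + chunk_size).min(total_chars);
        chunks.push(&text[bounds[start]..bounds[end]]);
        if end == total_chars {
            break; // the final window reached the end of the text
        }
        start += step;
    }
    chunks
}
```

With the benchmark parameters (chunk size 1000, overlap 200), each window starts 800 characters after the previous one, so a 2500-character input yields three chunks.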

Environment

• Platform: macOS (Darwin)
• Rust version: 1.92.0 (as recorded in the latest connector snapshots)
• Optimization: --release
• CPU: Apple Silicon M-series

Reproducing These Results

All benchmarks are open source and reproducible. Run them yourself:

git clone https://github.com/wesichain/wesichain.git
cd wesichain
cargo bench -p wesichain-retrieval --bench recursive_splitter
cargo bench -p wesichain-qdrant --bench vs_langchain -- --sample-size 10
cargo bench -p wesichain-weaviate --bench vs_langchain -- --sample-size 10

Criterion saves results to target/criterion/, including confidence intervals, for more detailed analysis.