1 point | by ajainvivek 8 hours ago
Benchmark
We ran a small benchmark on a real-world insurance corpus:
• 4 policy documents
• ~1,900 hierarchical nodes
• 100 queries across 6 complexity tiers
Comparing ReasonDB to a typical RAG pipeline (LangChain / LlamaIndex defaults):
• Pass rate: ReasonDB 100% (12/12) vs. typical RAG 55–70%
• Context recall: ReasonDB 90% avg vs. typical RAG 60–75%
• Median latency: ReasonDB 6.1 s vs. typical RAG 15–45 s
The key difference is that ReasonDB performs BM25 candidate selection followed by LLM-guided traversal of the document hierarchy, rather than flat chunk-similarity search.
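A minimal sketch of what that two-stage retrieval might look like. The node ids, corpus text, and the `llm_pick` stub are illustrative assumptions, not ReasonDB's actual API; in particular, the traversal step here just keeps the top candidate, where a real system would ask an LLM which nodes (and their children or cross-references) to expand next.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf, dl, s = Counter(d), len(d), 0.0
        for t in query_tokens:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * dl / avgdl))
        scores.append(s)
    return scores

# Hypothetical hierarchical nodes (ids and text are made up for illustration).
nodes = {
    "def/recurrent": "recurrent disability means a disability that recurs after recovery",
    "schedule/limits": "policy schedule benefit limits and waiting period",
    "general/premiums": "premiums are payable monthly in advance",
}

def llm_pick(query_tokens, candidates):
    # Stub for the LLM-guided traversal step: a real system would prompt an
    # LLM to decide which candidates to keep and expand. We keep the top one.
    return candidates[:1]

ids = list(nodes)
docs = [nodes[i].split() for i in ids]
query = "recurrent disability conditions".split()
ranked = sorted(zip(bm25_scores(query, docs), ids), reverse=True)
candidates = [i for _, i in ranked[:2]]      # stage 1: BM25 candidate selection
selected = llm_pick(query, candidates)       # stage 2: LLM-guided traversal (stubbed)
print(selected)
```

The point of the two stages is that BM25 cheaply narrows ~1,900 nodes to a handful, so the (expensive) LLM only reasons over a small candidate set.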
⸻
Example reasoning case
One query asked:
“What conditions define recurrent disability?”
The answer was split across two sections:
• the disability definition clause
• the policy schedule clause
Flat chunk retrieval returned only the first section.
ReasonDB followed the cross-reference extracted during ingestion, which raised recall on this query from 67% to 100%.
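The cross-reference step above can be sketched as a graph expansion over the retrieved hits. The node structure and ids below are hypothetical; the only assumption from the post is that cross-references between sections are extracted at ingestion time and followed at query time.

```python
# Hypothetical node store: each node carries cross-references extracted at ingestion.
nodes = {
    "definitions/recurrent_disability": {
        "text": "Recurrent disability means ... as set out in the Policy Schedule.",
        "cross_refs": ["schedule/disability"],
    },
    "schedule/disability": {
        "text": "Policy Schedule: waiting period and benefit limits for disability.",
        "cross_refs": [],
    },
}

def expand_with_cross_refs(hits, nodes):
    """Follow ingestion-time cross-references transitively, so an answer
    split across sections is retrieved together. Flat chunk retrieval
    would stop at `hits` and miss the referenced sections."""
    seen, stack = set(), list(hits)
    while stack:
        node_id = stack.pop()
        if node_id in seen:
            continue
        seen.add(node_id)
        stack.extend(nodes[node_id]["cross_refs"])
    return seen

print(sorted(expand_with_cross_refs(["definitions/recurrent_disability"], nodes)))
```

With only the definition clause as the initial hit, the expansion also pulls in the policy schedule clause, which mirrors the 67% to 100% recall jump described above.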