Enterprise RAG Copilot for a 12k-doc Knowledge Base
A copilot for 900 internal analysts that grounds answers in 12,000 policy, legal and commercial documents, with citations they can defend to clients.
Analysts spent 40% of the week searching proprietary PDFs. Compliance refused to let any content leave the VPC.
A previous vendor shipped a chatbot that hallucinated clause numbers — one lawsuit scare later, it was torn out.
The ask was strict: grounded answers, citable page spans, zero data egress, and sub-second p95 latency.
Ran a 10-day readiness sprint covering a document taxonomy, a per-doc-type chunking strategy, and an eval set of 320 expert-written questions.
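The eval harness can be sketched as follows — a minimal, illustrative version (class names, the recall metric, and the 0.8 pass threshold are assumptions, not the production code): each expert question carries the gold page spans an acceptable answer must cite, and a run is scored on citation recall against those spans.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Span:
    """A citable location: a page within a document."""
    doc_id: str
    page: int

@dataclass
class EvalCase:
    question: str
    gold_spans: set  # page spans an acceptable answer must cite from

def citation_recall(cited: set, gold: set) -> float:
    """Fraction of gold page spans covered by the answer's citations."""
    return len(cited & gold) / len(gold) if gold else 1.0

def run_eval(cases, answer_fn, min_recall=0.8) -> float:
    """answer_fn(question) -> set of Span cited by the system under test.

    Returns the fraction of cases whose citation recall clears the bar.
    """
    passed = 0
    for case in cases:
        cited = answer_fn(case.question)
        if citation_recall(cited, case.gold_spans) >= min_recall:
            passed += 1
    return passed / len(cases)
```

Scoring on cited page spans rather than answer text is what makes "0 hallucinated citations" an auditable claim instead of a vibe.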
Deployed Pinecone in the client VPC with hybrid search (BM25 + dense), page-span citations and strict answer-grounding rules.
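Hybrid search of this kind is commonly merged with reciprocal rank fusion. A minimal sketch of the merge step (standalone, not the Pinecone API — the function name and k constant are illustrative):

```python
def rrf_fuse(bm25_ranked: list, dense_ranked: list, k: int = 60) -> list:
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank).

    Documents ranked highly by either BM25 or the dense retriever
    surface near the top of the fused list; k damps the head of each list.
    """
    scores: dict = {}
    for ranked in (bm25_ranked, dense_ranked):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs only rank positions, so it sidesteps calibrating BM25 scores against cosine similarities — useful when the two retrievers score on incompatible scales.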
Built a LangGraph supervisor: router → retriever → verifier → answerer, with refusal when grounding confidence is low.
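The supervisor flow can be sketched in plain Python (LangGraph wires these stages as graph nodes with conditional edges; the helper signatures and the 0.75 confidence threshold below are illustrative assumptions, not the shipped code):

```python
REFUSAL = "I can't answer that from the indexed documents."

def supervise(question, route, retrieve, verify, answer, min_confidence=0.75):
    """Router -> retriever -> verifier -> answerer, refusing on weak grounding.

    route(question)            -> index/namespace to search
    retrieve(index, question)  -> list of (chunk, page_span) pairs
    verify(question, chunks)   -> grounding confidence in [0, 1]
    answer(question, chunks)   -> (answer_text, citations)
    """
    index = route(question)
    chunks = retrieve(index, question)
    confidence = verify(question, chunks)
    if not chunks or confidence < min_confidence:
        return REFUSAL, []  # refuse rather than risk a fabricated citation
    return answer(question, chunks)
```

The key design choice is that refusal sits between retrieval and generation: the answerer never runs on context the verifier couldn't ground, which is what kept hallucinated clause numbers out.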
Shipped a Next.js workspace with analyst-side tools: quoting, redlining, export to their brief templates.
3.1x analyst throughput on briefing tasks, measured by tracked time-to-first-draft.
p95 answer latency under 800ms even on complex multi-doc questions.
0 documented hallucinated citations in the first 8 weeks of rollout.
Passed internal security and legal review on first submission.
Owned end-to-end:
- Retrieval architecture and eval harness
- LangGraph agent design and refusal logic
- Front-end analyst workflow with shadcn/ui
- Vendor and cost model for OpenAI + Pinecone