RAG · Enterprise · 2024

Enterprise RAG Copilot for a 12k-doc Knowledge Base

A copilot for 900 internal analysts that grounds answers in 12,400 policy, legal, and commercial documents, with citations they can defend to clients.

3.1× · Analyst throughput
780ms · p95 answer latency
0 · Hallucinated citations
12,400 · Docs indexed
The problem

Analysts spent 40% of the week searching proprietary PDFs. Compliance refused to let any content leave the VPC.

A previous vendor shipped a chatbot that hallucinated clause numbers; after one lawsuit scare, it was torn out.

The ask was strict: grounded answers, citable page spans, zero data egress, and p95 latency under one second.

The approach
  1. Ran a 10-day readiness sprint: document taxonomy, chunking strategy per doc type, eval set of 320 expert questions.

  2. Deployed Pinecone in the client VPC with hybrid search (BM25 + dense), page-span citations and strict answer-grounding rules.

  3. Built a LangGraph supervisor: router → retriever → verifier → answerer, with refusal when grounding confidence is low.

  4. Shipped a Next.js workspace with analyst-side tools: quoting, redlining, export to their brief templates.
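Step 1's per-doc-type chunking can be sketched as a simple dispatch table. The rules below (split contracts on numbered clause headings so citations can point at exact clauses, split policy memos on paragraphs) are illustrative assumptions; the actual taxonomy isn't detailed here.

```python
import re

def chunk_by_clause(text):
    """Contracts: split on numbered clause headings like '3.1' so a
    citation can later point at an exact clause. Illustrative only."""
    parts = re.split(r"(?m)^(?=\d+(?:\.\d+)*\s)", text)
    return [p.strip() for p in parts if p.strip()]

def chunk_by_paragraph(text):
    """Policy memos: plain blank-line paragraph splits."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

# Hypothetical doc-type names; the real taxonomy came out of the
# 10-day readiness sprint and is not reproduced here.
CHUNKERS = {"contract": chunk_by_clause, "policy": chunk_by_paragraph}

def chunk(doc_type, text):
    return CHUNKERS.get(doc_type, chunk_by_paragraph)(text)
```

The point of the dispatch is that retrieval quality and citation granularity are set at ingest time, per document type, not by one global chunk size.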
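Step 2's hybrid search needs a way to merge the BM25 and dense rankings. The case study doesn't say which fusion method was used; reciprocal rank fusion (RRF) is one common choice and makes a minimal sketch:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: merge several ranked lists of doc IDs.

    Each doc scores sum(1 / (k + rank)) over the lists it appears in;
    k=60 is the conventional constant. Docs ranked well by both the
    lexical and the dense retriever float to the top.
    """
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# BM25 and dense retrieval disagree on order; fusion rewards overlap.
bm25 = ["doc_a", "doc_b", "doc_c"]
dense = ["doc_b", "doc_d", "doc_a"]
print(rrf_fuse([bm25, dense]))  # → ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

RRF is attractive here because it needs no score normalization between BM25 and cosine similarity, which live on incompatible scales.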
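Step 3's verifier-then-refuse behavior can be illustrated with a toy grounding gate. The 0.8 threshold and the verbatim-match heuristic below are assumptions for the sketch, not the production verifier:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    doc_id: str
    page: int
    text: str

# Hypothetical cutoff; in practice it would be tuned against the
# 320-question expert eval set.
GROUNDING_THRESHOLD = 0.8

def grounding_confidence(sentences, chunks):
    """Fraction of answer sentences found verbatim in some retrieved
    chunk -- a crude stand-in for the real verifier."""
    if not sentences:
        return 0.0
    supported = sum(
        1 for s in sentences
        if any(s.lower() in c.text.lower() for c in chunks)
    )
    return supported / len(sentences)

def answer_or_refuse(sentences, chunks):
    """Refuse rather than answer when grounding is weak; otherwise
    attach (doc_id, page) citations for the supporting chunks."""
    if grounding_confidence(sentences, chunks) < GROUNDING_THRESHOLD:
        return None, []  # refusal: no grounded answer available
    citations = sorted({(c.doc_id, c.page) for c in chunks})
    return " ".join(sentences), citations
```

The design choice worth noting is that refusal is a first-class output of the pipeline, which is what makes a "zero hallucinated citations" target achievable at all.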

Architecture
[Architecture diagram: router → retriever → verifier → answerer pipeline, deployed inside the client VPC]
Outcome
  • 3.1× analyst throughput on briefing tasks, measured by tracked time-to-first-draft.

  • p95 answer latency under 800ms even on complex multi-doc questions.

  • 0 documented hallucinated citations in the first 8 weeks of rollout.

  • Passed internal security and legal review on first submission.

What I owned end-to-end
  • Retrieval architecture and eval harness
  • LangGraph agent design and refusal logic
  • Front-end analyst workflow with shadcn/ui
  • Vendor and cost model for OpenAI + Pinecone