Tags: rag · saas · eu-ai-act · gdpr · generative-ai · knowledge-management · pgvector

RAG in Production: A Stack EU SaaS Teams Can Audit

By DataDiwan · 2026-06-18 · 9 min read

Short answer: Production RAG is not "embed PDFs and pray." It is chunking, locale-aware ingest, hybrid retrieval, cited answers, and retention policies your DPO can read — especially if you serve EU customers from a SaaS product.


When RAG beats fine-tuning for SaaS

| Fine-tuning | RAG |
| --- | --- |
| Expensive retraining per doc update | Update documents, re-ingest |
| Hard to cite sources | Natural citations from chunks |
| Risk of memorising PII | Retrieve only what you index |
| Slow compliance reviews | Auditors see retrieval logs |

For policy-heavy SaaS (legal, health, fintech, HR), grounded retrieval is usually the right default. Fine-tune later for tone, not facts.


Reference stack (vendor-agnostic)

  1. Ingest — markdown, PDF, tickets; detect locale (doc.ar.md, frontmatter)
  2. Chunk — ~500–800 tokens with overlap; keep title + source metadata
  3. Embed — Voyage, open-source, or hosted models; record the embedding dimension and keep it consistent across the index
  4. Store — Postgres + pgvector (or equivalent) with tenant_id column
  5. Retrieve — vector + full-text fallback; filter by locale and tenant
  6. Generate — system prompt that forbids uncited claims
  7. Log — optional query log with retention cap (GDPR)
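
The first two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used on any particular deployment: the names `detect_locale`, `chunk_words`, and `chunk_document` are hypothetical, whitespace-separated words stand in for tokens (a real tokenizer will count differently), and only the filename convention (`doc.ar.md`) is handled, not frontmatter.

```python
import re
from dataclasses import dataclass

# Assumed naming convention: a two-letter locale before the extension, e.g. doc.ar.md
LOCALE_IN_NAME = re.compile(r"\.([a-z]{2})\.md$")


def detect_locale(filename: str, default: str = "en") -> str:
    """Read a two-letter locale from a name like doc.ar.md, else use the default."""
    match = LOCALE_IN_NAME.search(filename)
    return match.group(1) if match else default


@dataclass
class Chunk:
    text: str
    title: str    # kept so the generator can cite the document
    source: str   # original filename or URL
    locale: str


def chunk_words(words, size=600, overlap=80):
    """Yield overlapping windows of `size` words, stepping by size - overlap."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    for start in range(0, len(words), step):
        yield words[start:start + size]
        if start + size >= len(words):
            break  # the last window already reached the end of the document


def chunk_document(text, title, source, size=600, overlap=80):
    """Split one document into chunks that carry title, source, and locale."""
    locale = detect_locale(source)
    return [
        Chunk(" ".join(window), title, source, locale)
        for window in chunk_words(text.split(), size, overlap)
    ]
```

Carrying `title`, `source`, and `locale` on every chunk is what makes the later retrieval filters and citations possible; the chunk size and overlap defaults here are illustrative values inside the range given above.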

We use this pattern on datadiwan.com and in client deployments.


EU compliance hooks that matter

  • Data minimisation — do not embed what you do not need in answers
  • Retention — TTL on query logs; document in DPIA
  • Sub-processors — list embedding and LLM providers in privacy policy
  • Arabic + Finnish — tag each document's locale via filename or frontmatter instead of defaulting to an "English-only index"
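
As one way to make the retention bullet concrete, the sketch below purges query-log rows older than a fixed window. It uses Python's standard `sqlite3` module purely so the example is self-contained; the table name `query_log`, its columns, the helper names, and the 30-day window are all assumptions for illustration, and the real cap should come from your DPIA.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 30  # assumed value; take the real cap from your DPIA


def create_query_log(conn):
    """Hypothetical minimal log table; created_at is a Unix timestamp."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS query_log ("
        "tenant_id TEXT NOT NULL, query_text TEXT NOT NULL, created_at REAL NOT NULL)"
    )


def log_query(conn, tenant_id, query_text, when):
    conn.execute(
        "INSERT INTO query_log (tenant_id, query_text, created_at) VALUES (?, ?, ?)",
        (tenant_id, query_text, when.timestamp()),
    )


def purge_expired_queries(conn, now, retention_days=RETENTION_DAYS):
    """Delete rows older than the retention window and return how many were removed."""
    cutoff = (now - timedelta(days=retention_days)).timestamp()
    cur = conn.execute("DELETE FROM query_log WHERE created_at < ?", (cutoff,))
    conn.commit()
    return cur.rowcount
```

Running `purge_expired_queries` from a scheduled job, and recording the returned count, gives you a simple artefact to reference when documenting retention.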

Our EU AI Act scorecard includes documentation prompts for RAG deployments.


Evaluation: the step most teams skip

Before launch, build 30–50 real questions from:

  • Customer support macros
  • Onboarding docs
  • Compliance FAQs

Score:

| Metric | Target |
| --- | --- |
| Answer uses correct doc | >85% |
| Citation matches source | >90% |
| "I don't know" when missing | allowed and encouraged |
| Latency p95 | <5s for internal tools |

Regression-test after every ingest change.
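
A scoring harness for these targets can be small. The sketch below is one possible shape, assuming each evaluated question has already been reduced to an `EvalResult` record; that record, the `score` and `passes` functions, and the nearest-rank percentile method are illustrative choices rather than a prescribed method.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvalResult:
    expected_doc: Optional[str]   # None: the corpus has no answer for this question
    retrieved_doc: Optional[str]  # None: the system answered "I don't know"
    citation_ok: bool             # did the cited passage match the source?
    latency_s: float


def ratio(hits, total):
    return hits / total if total else 0.0


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


def score(results):
    answerable = [r for r in results if r.expected_doc is not None]
    answered = [r for r in answerable if r.retrieved_doc is not None]
    missing = [r for r in results if r.expected_doc is None]
    return {
        "correct_doc": ratio(sum(r.retrieved_doc == r.expected_doc for r in answerable), len(answerable)),
        "citation_match": ratio(sum(r.citation_ok for r in answered), len(answered)),
        "abstained_when_missing": ratio(sum(r.retrieved_doc is None for r in missing), len(missing)),
        "latency_p95_s": p95([r.latency_s for r in results]),
    }


def passes(report):
    """Apply the thresholds from the table above."""
    return (
        report["correct_doc"] > 0.85
        and report["citation_match"] > 0.90
        and report["latency_p95_s"] < 5.0
    )
```

Running `passes(score(results))` in CI after each ingest change turns the regression test into a gate rather than a manual check.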


Common failures we fix

  1. English-only ingest for trilingual customers
  2. Giant chunks that dilute retrieval precision
  3. No full-text search (FTS) fallback when embeddings drift
  4. Blog posts outranking policy docs on compliance questions
  5. Shared index across tenants in multi-tenant SaaS
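
Failures 3 and 5 are easiest to see in code. The following in-memory sketch filters by tenant and locale before ranking, and falls back to keyword matching when no embedding clears a similarity threshold. Everything here is a simplified stand-in: the `Doc` record, the `retrieve` function, the `min_sim` threshold, and the keyword-overlap score are assumptions for illustration, and a production stack would push the same filters and ranking into the database's vector and full-text queries.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    tenant_id: str
    locale: str
    text: str
    embedding: list


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Fraction of query terms present in the text (a stand-in for real FTS)."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def retrieve(docs, query, query_embedding, tenant_id, locale, k=3, min_sim=0.3):
    """Return (documents, mode) where mode is "vector" or "fts"."""
    # Tenant and locale filters run before any ranking, so other tenants'
    # documents can never appear in the candidate set.
    scoped = [d for d in docs if d.tenant_id == tenant_id and d.locale == locale]

    by_vector = sorted(
        ((cosine(query_embedding, d.embedding), d) for d in scoped),
        key=lambda pair: pair[0],
        reverse=True,
    )
    hits = [d for sim, d in by_vector if sim >= min_sim][:k]
    if hits:
        return hits, "vector"

    # Fallback: keyword overlap when no embedding clears the threshold.
    by_keyword = sorted(
        ((keyword_score(query, d.text), d) for d in scoped),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [d for s, d in by_keyword if s > 0][:k], "fts"
```

Returning the mode alongside the results lets the retrieval log record how often the fallback fires, which is a useful early signal of embedding drift.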

From prototype to product

| Phase | Duration | Output |
| --- | --- | --- |
| Readiness sprint | 1–3 weeks | Architecture, risk tier, eval set |
| Build & deploy | 4–12 weeks | Ingest CLI, API, admin UI, monitoring |
| Handover | | Runbooks, DPIA inputs, model cards |

Next step

Book a free AI readiness call or download the EU AI Act scorecard.

DataDiwan — data science, generative AI, RAG, and automation for teams shipping SaaS in Europe and the Arab world.