Document parsing: Unstructured, LlamaParse, Reducto, Docling, Azure Document Intelligence, AWS Textract
Chunking strategies: Semantic chunking, late chunking, layout-aware, context-enriched (Anthropic contextual retrieval pattern)
Embedding models: OpenAI text-embedding-3-large, Voyage voyage-3, Cohere Embed v3, BGE, Nomic, Jina v3, BM25 hybrid
Vector databases: Pinecone, Weaviate, Qdrant, Chroma, pgvector, MongoDB Atlas Vector Search, Turbopuffer
Hybrid retrieval: BM25 + dense + reciprocal rank fusion, query expansion, HyDE, multi-query, parent-document retrieval
Rerankers: Cohere Rerank 3, Voyage rerank-2, BGE reranker, Jina reranker, ColBERT v2 for high-precision retrieval
Generation: OpenAI (GPT-5 family), Anthropic Claude (Sonnet 4.5, Opus), Gemini, open-source via Together / Groq / vLLM
Permission-aware retrieval: Per-document ACL filters in vector DB, post-retrieval permission checks, OAuth-based source access
Evals & quality: Ragas, TruLens, LangSmith, LangFuse, Promptfoo, Phoenix — automated retrieval and answer-quality scoring
Observability: LangFuse, Helicone, Sentry, OpenTelemetry — full prompt and retrieval trace per request
Deployment: AWS, Modal, Vercel, Cloudflare Workers, on-prem (air-gapped), Kubernetes